verify and fix issue 1668 (#1673)

* Adding test.

* Verifies and fixes issue 1668. This commit updates the previous behavior of the
On Demand stream support by returning a value type (document_reference) instead
of a reference to a document. This allows us to bridge with the usual simdjson
error system, with its simdjson_result types.

* Minor reformat.

* Adds a test with initial tests passing.

* Adding an example.
Daniel Lemire 2021-07-27 08:51:07 -04:00 committed by GitHub
parent 7d887fdc1e
commit eb93b98d6a
14 changed files with 495 additions and 76 deletions
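
The change described in the commit message can be pictured with a small standalone sketch. It does not use simdjson: `Doc`, `DocRef`, `Result`, `Stream`, and `collect` are hypothetical stand-ins chosen for illustration. The idea it tries to show is that an iterator returning a plain reference has no room for an error code, whereas an iterator returning a small pointer-backed value can hand back the value and the error together.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are simdjson types.
enum class Error { Success, Capacity };

struct Doc { std::string text; };

// Thin, copyable handle: copying it never copies the document itself.
class DocRef {
public:
  DocRef() = default;
  explicit DocRef(Doc &d) : doc(&d) {}
  const std::string &text() const { assert(doc != nullptr); return doc->text; }
private:
  Doc *doc{nullptr};
};

// A value paired with an error code, in the spirit of a result type.
struct Result {
  Result(DocRef v, Error e) : value(v), error(e) {}
  DocRef value;
  Error error;
};

class Stream {
private:
  std::vector<std::pair<std::string, Error>> items_;
  Doc current;                   // the single document that gets reused
  Error error{Error::Success};   // status of the current document
public:
  explicit Stream(std::vector<std::pair<std::string, Error>> items)
      : items_(std::move(items)) {}

  class iterator {
  public:
    iterator(Stream *s, std::size_t i) : stream(s), index(i) { load(); }
    // Returned by value: a handle to the shared document plus its status.
    Result operator*() { return Result(DocRef(stream->current), stream->error); }
    iterator &operator++() { ++index; load(); return *this; }
    bool operator!=(const iterator &o) const { return index != o.index; }
  private:
    void load() {
      if (index < stream->items_.size()) {
        stream->current.text = stream->items_[index].first;
        stream->error = stream->items_[index].second;
      }
    }
    Stream *stream;
    std::size_t index;
  };

  iterator begin() { return iterator(this, 0); }
  iterator end() { return iterator(this, items_.size()); }
};

// Walks the stream with `auto doc`, stopping at the first error.
std::vector<std::string> collect(Stream &stream) {
  std::vector<std::string> out;
  for (auto doc : stream) {
    if (doc.error != Error::Success) { out.push_back("error"); break; }
    out.push_back(doc.value.text());
  }
  return out;
}
```

In this sketch the loop variable is a cheap copy of `Result`, which is why `auto doc` is enough and `auto & doc` is no longer required.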

View File

@ -18,7 +18,7 @@ struct simdjson_ondemand {
ondemand::document_stream::iterator i = stream.begin();
++i; // Skip first line
for (;i != stream.end(); ++i) {
auto & doc = *i;
auto doc = *i;
size_t index{0};
StringType copy;
double rating;

View File

@ -18,7 +18,7 @@ struct simdjson_ondemand {
ondemand::document_stream::iterator i = stream.begin();
++i; // Skip first line
for (;i != stream.end(); ++i) {
auto & doc = *i;
auto doc = *i;
size_t index{0};
StringType copy;
double rating;

View File

@ -991,28 +991,68 @@ format. If your JSON documents all contain arrays or objects, we even support di
concatenation without whitespace. The concatenated file has no size restrictions (including larger
than 4GB), though each individual document must be no larger than 4 GB.
Here is a simple example:
Here is an example:
```c++
auto json = R"({ "foo": 1 } { "foo": 2 } { "foo": 3 } )"_padded;
ondemand::parser parser;
ondemand::document_stream docs = parser.iterate_many(json);
for (auto & doc : docs) {
for (auto doc : docs) {
std::cout << doc["foo"] << std::endl;
}
// Prints 1 2 3
```
It is important to note that the iteration returns a `document` reference, and hence why the `&` is needed.
Unlike `parser.iterate`, `parser.iterate_many` may parse "on demand" (lazily). That is, no parsing may have been done before you enter the loop
`for (auto & doc : docs) {` and you should expect the parser to only ever fully parse one JSON document at a time.
`for (auto doc : docs) {` and you should expect the parser to only ever fully parse one JSON document at a time.
As with `parser.iterate`, when calling `parser.iterate_many(string)`, no copy is made of the provided string input. The provided memory buffer may be accessed each time a JSON document is parsed. Calling `parser.iterate_many(string)` on a temporary string buffer (e.g., `docs = parser.iterate_many("[1,2,3]"_padded)`) is unsafe (and will not compile) because the `document_stream` instance needs access to the buffer to return the JSON documents.
The `iterate_many` function can also take an optional parameter `size_t batch_size` which defines the window processing size. It is set by default to a large value (`1000000`, corresponding to 1 MB). None of your JSON documents should exceed this window size, or else you will get the error `simdjson::CAPACITY`. You cannot set this window size larger than 4 GB: you will get the error `simdjson::CAPACITY`. The smaller the window size is, the less memory the function will use. However, setting the window size too small (e.g., less than 100 kB) may impact performance negatively. Leaving it at 1 MB is expected to be a good choice, unless you have some larger documents.
The following toy example illustrates how to get capacity errors. It is an artificial example since you should never use a `batch_size` of 50 bytes (it is far too small).
```c++
// We are going to set the capacity to 50 bytes, which means that we cannot
// load a document longer than 50 bytes. The first few documents are small,
// but the last one is large. We will get an error at the last document.
auto json = R"([1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100])"_padded;
ondemand::parser parser;
ondemand::document_stream stream;
size_t counter{0};
auto error = parser.iterate_many(json, 50).get(stream);
if( error ) { /* handle the error */ }
for (auto doc: stream) {
if(counter < 6) {
int64_t val;
error = doc.at_pointer("/4").get(val);
if( error ) { /* handle the error */ }
std::cout << "5 = " << val << std::endl;
} else {
ondemand::value val;
error = doc.at_pointer("/4").get(val);
// error == simdjson::CAPACITY
if(error) { std::cerr << error << std::endl; break; }
}
counter++;
}
```
This example should print out:
```
5 = 5
5 = 5
5 = 5
5 = 5
5 = 5
5 = 5
This parser can't support a document that big
```
If your documents are large (e.g., larger than a megabyte), then the `iterate_many` function may be ill-suited. It is really meant to support efficiently reading streams of relatively small documents (e.g., a few kilobytes each). If you have larger documents, you should use other functions like `iterate`.
See [iterate_many.md](iterate_many.md) for detailed information and design.
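
The window rule described above (no single document may exceed `batch_size`) can be illustrated with a toy model that does not depend on simdjson. Everything in it is an assumption made for illustration: documents are separated by a newline, and `Status`, `Item`, and `split_with_window` are hypothetical names. It is not how the library splits documents internally.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model: documents are separated by '\n'. Not simdjson code.
enum class Status { Ok, Capacity };

struct Item {
  Status status;
  std::string text; // empty when status is Capacity
};

// Splits `input` into newline-delimited documents. A document longer than
// `batch_size` bytes is reported as a capacity error, and processing stops.
std::vector<Item> split_with_window(const std::string &input, std::size_t batch_size) {
  std::vector<Item> out;
  std::size_t start = 0;
  while (start < input.size()) {
    std::size_t end = input.find('\n', start);
    if (end == std::string::npos) { end = input.size(); }
    const std::size_t length = end - start;
    if (length > batch_size) {
      out.push_back({Status::Capacity, ""});
      break;
    }
    // Skip empty lines; keep every document that fits in the window.
    if (length > 0) { out.push_back({Status::Ok, input.substr(start, length)}); }
    start = end + 1;
  }
  return out;
}
```

With a window of 8 bytes, two 5-byte documents followed by a 22-byte document yield two `Ok` items and then one `Capacity` item, which mirrors the shape of the output in the toy example above.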

View File

@ -175,7 +175,7 @@ Let us illustrate the idea with code:
auto i = stream.begin();
size_t count{0};
for(; i != stream.end(); ++i) {
auto & doc = *i;
auto doc = *i;
if(!i.error()) {
std::cout << "got full document at " << i.current_index() << std::endl;
std::cout << i.source() << std::endl;

View File

@ -49,18 +49,23 @@ simdjson_really_inline implementation_simdjson_result_base<T>::operator T&&() &&
return std::forward<implementation_simdjson_result_base<T>>(*this).take_value();
}
#endif // SIMDJSON_EXCEPTIONS
template<typename T>
simdjson_really_inline const T& implementation_simdjson_result_base<T>::value_unsafe() const& noexcept {
return this->first;
}
template<typename T>
simdjson_really_inline T& implementation_simdjson_result_base<T>::value_unsafe() & noexcept {
return this->first;
}
template<typename T>
simdjson_really_inline T&& implementation_simdjson_result_base<T>::value_unsafe() && noexcept {
return std::forward<T>(this->first);
}
#endif // SIMDJSON_EXCEPTIONS
template<typename T>
simdjson_really_inline implementation_simdjson_result_base<T>::implementation_simdjson_result_base(T &&value, error_code error) noexcept
: first{std::forward<T>(value)}, second{error} {}

View File

@ -97,20 +97,25 @@ struct implementation_simdjson_result_base {
*/
simdjson_really_inline operator T&&() && noexcept(false);
#endif // SIMDJSON_EXCEPTIONS
/**
* Get the result value. This function is safe if and only if
* the error() method returns a value that evaluates to false.
*/
simdjson_really_inline const T& value_unsafe() const& noexcept;
/**
* Get the result value. This function is safe if and only if
* the error() method returns a value that evaluates to false.
*/
simdjson_really_inline T& value_unsafe() & noexcept;
/**
* Take the result value (move it). This function is safe if and only if
* the error() method returns a value that evaluates to false.
*/
simdjson_really_inline T&& value_unsafe() && noexcept;
#endif // SIMDJSON_EXCEPTIONS
T first{};
error_code second{UNINITIALIZED};
}; // struct implementation_simdjson_result_base
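
The three `value_unsafe()` overloads differ only in their ref-qualifiers. The following standalone sketch shows how such overloads are selected; `MiniResult` and `take_or` are hypothetical names, and the type only mirrors the shape of the declarations in this hunk rather than reproducing the library's class.

```cpp
#include <string>
#include <utility>

// Hypothetical miniature result type, for illustration only.
enum class Error { Success, Failure };

template <typename T>
struct MiniResult {
  MiniResult() = default;
  MiniResult(T value, Error e) : first(std::move(value)), second(e) {}

  Error error() const noexcept { return second; }

  // Called on a const lvalue: read-only access.
  const T &value_unsafe() const & noexcept { return first; }
  // Called on a mutable lvalue: in-place access.
  T &value_unsafe() & noexcept { return first; }
  // Called on an rvalue: the caller may move the value out.
  T &&value_unsafe() && noexcept { return std::move(first); }

  T first{};
  Error second{Error::Failure};
};

// Returns the stored string when there is no error, otherwise `fallback`.
// The error check comes first, so the unchecked accessor is only reached
// when it is safe to call.
std::string take_or(MiniResult<std::string> r, const std::string &fallback) {
  if (r.error() != Error::Success) { return fallback; }
  return std::move(r).value_unsafe(); // selects the && overload
}
```

As in the comments above, the accessor performs no check of its own, so callers are expected to test `error()` before using it.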

View File

@ -385,4 +385,189 @@ simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value>
return first.at_pointer(json_pointer);
}
} // namespace simdjson
namespace simdjson {
namespace SIMDJSON_IMPLEMENTATION {
namespace ondemand {
simdjson_really_inline document_reference::document_reference() noexcept : doc{nullptr} {}
simdjson_really_inline document_reference::document_reference(document &d) noexcept : doc(&d) {}
simdjson_really_inline void document_reference::rewind() noexcept { doc->rewind(); }
simdjson_really_inline simdjson_result<array> document_reference::get_array() & noexcept { return doc->get_array(); }
simdjson_really_inline simdjson_result<object> document_reference::get_object() & noexcept { return doc->get_object(); }
simdjson_really_inline simdjson_result<uint64_t> document_reference::get_uint64() noexcept { return doc->get_uint64(); }
simdjson_really_inline simdjson_result<int64_t> document_reference::get_int64() noexcept { return doc->get_int64(); }
simdjson_really_inline simdjson_result<double> document_reference::get_double() noexcept { return doc->get_double(); }
simdjson_really_inline simdjson_result<std::string_view> document_reference::get_string() noexcept { return doc->get_string(); }
simdjson_really_inline simdjson_result<raw_json_string> document_reference::get_raw_json_string() noexcept { return doc->get_raw_json_string(); }
simdjson_really_inline simdjson_result<bool> document_reference::get_bool() noexcept { return doc->get_bool(); }
simdjson_really_inline bool document_reference::is_null() noexcept { return doc->is_null(); }
#if SIMDJSON_EXCEPTIONS
simdjson_really_inline document_reference::operator array() & noexcept(false) { return array(*doc); }
simdjson_really_inline document_reference::operator object() & noexcept(false) { return object(*doc); }
simdjson_really_inline document_reference::operator uint64_t() noexcept(false) { return uint64_t(*doc); }
simdjson_really_inline document_reference::operator int64_t() noexcept(false) { return int64_t(*doc); }
simdjson_really_inline document_reference::operator double() noexcept(false) { return double(*doc); }
simdjson_really_inline document_reference::operator std::string_view() noexcept(false) { return std::string_view(*doc); }
simdjson_really_inline document_reference::operator raw_json_string() noexcept(false) { return raw_json_string(*doc); }
simdjson_really_inline document_reference::operator bool() noexcept(false) { return bool(*doc); }
#endif
simdjson_really_inline simdjson_result<size_t> document_reference::count_elements() & noexcept { return doc->count_elements(); }
simdjson_really_inline simdjson_result<array_iterator> document_reference::begin() & noexcept { return doc->begin(); }
simdjson_really_inline simdjson_result<array_iterator> document_reference::end() & noexcept { return doc->end(); }
simdjson_really_inline simdjson_result<value> document_reference::find_field(std::string_view key) & noexcept { return doc->find_field(key); }
simdjson_really_inline simdjson_result<value> document_reference::find_field(const char *key) & noexcept { return doc->find_field(key); }
simdjson_really_inline simdjson_result<value> document_reference::operator[](std::string_view key) & noexcept { return (*doc)[key]; }
simdjson_really_inline simdjson_result<value> document_reference::operator[](const char *key) & noexcept { return (*doc)[key]; }
simdjson_really_inline simdjson_result<value> document_reference::find_field_unordered(std::string_view key) & noexcept { return doc->find_field_unordered(key); }
simdjson_really_inline simdjson_result<value> document_reference::find_field_unordered(const char *key) & noexcept { return doc->find_field_unordered(key); }
simdjson_really_inline simdjson_result<json_type> document_reference::type() noexcept { return doc->type(); }
simdjson_really_inline simdjson_result<std::string_view> document_reference::raw_json_token() noexcept { return doc->raw_json_token(); }
simdjson_really_inline simdjson_result<value> document_reference::at_pointer(std::string_view json_pointer) noexcept { return doc->at_pointer(json_pointer); }
simdjson_really_inline simdjson_result<std::string_view> document_reference::raw_json() noexcept { return doc->raw_json();}
simdjson_really_inline document_reference::operator document&() const noexcept { return *doc; }
} // namespace ondemand
} // namespace SIMDJSON_IMPLEMENTATION
} // namespace simdjson
namespace simdjson {
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::simdjson_result(SIMDJSON_IMPLEMENTATION::ondemand::document_reference value, error_code error)
noexcept : implementation_simdjson_result_base<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>(std::forward<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>(value), error) {}
simdjson_really_inline simdjson_result<size_t> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::count_elements() & noexcept {
if (error()) { return error(); }
return first.count_elements();
}
simdjson_really_inline error_code simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::rewind() noexcept {
if (error()) { return error(); }
first.rewind();
return SUCCESS;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::begin() & noexcept {
if (error()) { return error(); }
return first.begin();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::end() & noexcept {
return {};
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::find_field_unordered(std::string_view key) & noexcept {
if (error()) { return error(); }
return first.find_field_unordered(key);
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::find_field_unordered(const char *key) & noexcept {
if (error()) { return error(); }
return first.find_field_unordered(key);
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator[](std::string_view key) & noexcept {
if (error()) { return error(); }
return first[key];
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator[](const char *key) & noexcept {
if (error()) { return error(); }
return first[key];
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::find_field(std::string_view key) & noexcept {
if (error()) { return error(); }
return first.find_field(key);
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::find_field(const char *key) & noexcept {
if (error()) { return error(); }
return first.find_field(key);
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::get_array() & noexcept {
if (error()) { return error(); }
return first.get_array();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::object> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::get_object() & noexcept {
if (error()) { return error(); }
return first.get_object();
}
simdjson_really_inline simdjson_result<uint64_t> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::get_uint64() noexcept {
if (error()) { return error(); }
return first.get_uint64();
}
simdjson_really_inline simdjson_result<int64_t> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::get_int64() noexcept {
if (error()) { return error(); }
return first.get_int64();
}
simdjson_really_inline simdjson_result<double> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::get_double() noexcept {
if (error()) { return error(); }
return first.get_double();
}
simdjson_really_inline simdjson_result<std::string_view> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::get_string() noexcept {
if (error()) { return error(); }
return first.get_string();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::raw_json_string> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::get_raw_json_string() noexcept {
if (error()) { return error(); }
return first.get_raw_json_string();
}
simdjson_really_inline simdjson_result<bool> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::get_bool() noexcept {
if (error()) { return error(); }
return first.get_bool();
}
simdjson_really_inline bool simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::is_null() noexcept {
if (error()) { return error(); }
return first.is_null();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::json_type> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::type() noexcept {
if (error()) { return error(); }
return first.type();
}
#if SIMDJSON_EXCEPTIONS
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator SIMDJSON_IMPLEMENTATION::ondemand::array() & noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator SIMDJSON_IMPLEMENTATION::ondemand::object() & noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator uint64_t() noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator int64_t() noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator double() noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator std::string_view() noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator SIMDJSON_IMPLEMENTATION::ondemand::raw_json_string() noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator bool() noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
#endif
simdjson_really_inline simdjson_result<std::string_view> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::raw_json_token() noexcept {
if (error()) { return error(); }
return first.raw_json_token();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::at_pointer(std::string_view json_pointer) noexcept {
if (error()) { return error(); }
return first.at_pointer(json_pointer);
}
} // namespace simdjson
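
Most of the functions added in this hunk follow one pattern: return the stored error if there is one, otherwise delegate to the wrapped value. A minimal standalone sketch of that pattern follows; `Handle` and `HandleResult` are hypothetical types, and a `std::pair` stands in for a result type to keep the example short.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical error codes, for illustration only.
enum class Error { Success, Capacity, IncorrectType };

// Hypothetical wrapped value with one getter that can itself fail.
struct Handle {
  bool is_number{false};
  std::int64_t number{0};
  std::pair<std::int64_t, Error> get_int64() const {
    if (!is_number) { return {0, Error::IncorrectType}; }
    return {number, Error::Success};
  }
};

// Hypothetical result wrapper holding a value and an error code.
struct HandleResult {
  Handle first{};
  Error second{Error::Success};

  Error error() const { return second; }

  // Same shape as the forwarding functions in the diff:
  // report the stored error if any, otherwise delegate.
  std::pair<std::int64_t, Error> get_int64() const {
    if (error() != Error::Success) { return {0, error()}; }
    return first.get_int64();
  }
};
```

The benefit of this arrangement is that a caller can chain a lookup and a getter on the result and check for an error once, at the end.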

View File

@ -13,7 +13,7 @@ class array_iterator;
class document_stream;
/**
* A JSON document iteration.
* A JSON document. It holds a json_iterator instance.
*
* Used by tokens to get text, and string buffer location.
*
@ -411,6 +411,54 @@ protected:
friend class document_stream;
};
/**
* A document_reference is a thin wrapper around a document reference instance.
*/
class document_reference {
public:
simdjson_really_inline document_reference() noexcept;
simdjson_really_inline document_reference(document &d) noexcept;
simdjson_really_inline document_reference(const document_reference &other) noexcept = default;
simdjson_really_inline void rewind() noexcept;
simdjson_really_inline simdjson_result<array> get_array() & noexcept;
simdjson_really_inline simdjson_result<object> get_object() & noexcept;
simdjson_really_inline simdjson_result<uint64_t> get_uint64() noexcept;
simdjson_really_inline simdjson_result<int64_t> get_int64() noexcept;
simdjson_really_inline simdjson_result<double> get_double() noexcept;
simdjson_really_inline simdjson_result<std::string_view> get_string() noexcept;
simdjson_really_inline simdjson_result<raw_json_string> get_raw_json_string() noexcept;
simdjson_really_inline simdjson_result<bool> get_bool() noexcept;
simdjson_really_inline bool is_null() noexcept;
simdjson_really_inline simdjson_result<std::string_view> raw_json() noexcept;
simdjson_really_inline operator document&() const noexcept;
#if SIMDJSON_EXCEPTIONS
simdjson_really_inline operator array() & noexcept(false);
simdjson_really_inline operator object() & noexcept(false);
simdjson_really_inline operator uint64_t() noexcept(false);
simdjson_really_inline operator int64_t() noexcept(false);
simdjson_really_inline operator double() noexcept(false);
simdjson_really_inline operator std::string_view() noexcept(false);
simdjson_really_inline operator raw_json_string() noexcept(false);
simdjson_really_inline operator bool() noexcept(false);
#endif
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
simdjson_really_inline simdjson_result<array_iterator> begin() & noexcept;
simdjson_really_inline simdjson_result<array_iterator> end() & noexcept;
simdjson_really_inline simdjson_result<value> find_field(std::string_view key) & noexcept;
simdjson_really_inline simdjson_result<value> find_field(const char *key) & noexcept;
simdjson_really_inline simdjson_result<value> operator[](std::string_view key) & noexcept;
simdjson_really_inline simdjson_result<value> operator[](const char *key) & noexcept;
simdjson_really_inline simdjson_result<value> find_field_unordered(std::string_view key) & noexcept;
simdjson_really_inline simdjson_result<value> find_field_unordered(const char *key) & noexcept;
simdjson_really_inline simdjson_result<json_type> type() noexcept;
simdjson_really_inline simdjson_result<std::string_view> raw_json_token() noexcept;
simdjson_really_inline simdjson_result<value> at_pointer(std::string_view json_pointer) noexcept;
private:
document *doc{nullptr};
};
} // namespace ondemand
} // namespace SIMDJSON_IMPLEMENTATION
} // namespace simdjson
@ -470,4 +518,57 @@ public:
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> at_pointer(std::string_view json_pointer) noexcept;
};
} // namespace simdjson
namespace simdjson {
template<>
struct simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference> : public SIMDJSON_IMPLEMENTATION::implementation_simdjson_result_base<SIMDJSON_IMPLEMENTATION::ondemand::document_reference> {
public:
simdjson_really_inline simdjson_result(SIMDJSON_IMPLEMENTATION::ondemand::document_reference value, error_code error) noexcept;
simdjson_really_inline simdjson_result() noexcept = default;
simdjson_really_inline error_code rewind() noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array> get_array() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::object> get_object() & noexcept;
simdjson_really_inline simdjson_result<uint64_t> get_uint64() noexcept;
simdjson_really_inline simdjson_result<int64_t> get_int64() noexcept;
simdjson_really_inline simdjson_result<double> get_double() noexcept;
simdjson_really_inline simdjson_result<std::string_view> get_string() noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::raw_json_string> get_raw_json_string() noexcept;
simdjson_really_inline simdjson_result<bool> get_bool() noexcept;
simdjson_really_inline bool is_null() noexcept;
#if SIMDJSON_EXCEPTIONS
simdjson_really_inline operator SIMDJSON_IMPLEMENTATION::ondemand::array() & noexcept(false);
simdjson_really_inline operator SIMDJSON_IMPLEMENTATION::ondemand::object() & noexcept(false);
simdjson_really_inline operator uint64_t() noexcept(false);
simdjson_really_inline operator int64_t() noexcept(false);
simdjson_really_inline operator double() noexcept(false);
simdjson_really_inline operator std::string_view() noexcept(false);
simdjson_really_inline operator SIMDJSON_IMPLEMENTATION::ondemand::raw_json_string() noexcept(false);
simdjson_really_inline operator bool() noexcept(false);
#endif
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> begin() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> end() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field(std::string_view key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field(const char *key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> operator[](std::string_view key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> operator[](const char *key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field_unordered(std::string_view key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field_unordered(const char *key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::json_type> type() noexcept;
/** @copydoc simdjson_really_inline std::string_view document_reference::raw_json_token() const noexcept */
simdjson_really_inline simdjson_result<std::string_view> raw_json_token() noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> at_pointer(std::string_view json_pointer) noexcept;
};
} // namespace simdjson

View File

@ -137,8 +137,9 @@ simdjson_really_inline document_stream::iterator::iterator(document_stream* _str
: stream{_stream}, finished{is_end} {
}
simdjson_really_inline ondemand::document& document_stream::iterator::operator*() noexcept {
return stream->doc;
simdjson_really_inline simdjson_result<ondemand::document_reference> document_stream::iterator::operator*() noexcept {
//if(stream->error) { return stream->error; }
return simdjson_result<ondemand::document_reference>(stream->doc, stream->error);
}
simdjson_really_inline document_stream::iterator& document_stream::iterator::operator++() noexcept {

View File

@ -130,7 +130,7 @@ public:
/**
* Get the current document (or error).
*/
simdjson_really_inline ondemand::document& operator*() noexcept;
simdjson_really_inline simdjson_result<ondemand::document_reference> operator*() noexcept;
/**
* Advance to the next document (prefix).
*/

View File

@ -20,6 +20,13 @@ inline simdjson_result<std::string_view> to_json_string(SIMDJSON_IMPLEMENTATION:
return trim(v);
}
inline simdjson_result<std::string_view> to_json_string(SIMDJSON_IMPLEMENTATION::ondemand::document_reference& x) noexcept {
std::string_view v;
auto error = x.raw_json().get(v);
if(error) {return error; }
return trim(v);
}
inline simdjson_result<std::string_view> to_json_string(SIMDJSON_IMPLEMENTATION::ondemand::value& x) noexcept {
/**
* If we somehow receive a value that has already been consumed,
@ -66,28 +73,30 @@ inline simdjson_result<std::string_view> to_json_string(SIMDJSON_IMPLEMENTATION:
return trim(v);
}
#if SIMDJSON_EXCEPTIONS
inline simdjson_result<std::string_view> to_json_string(simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document> x) {
if (x.error()) { return x.error(); }
return to_json_string(x.value());
return to_json_string(x.value_unsafe());
}
inline simdjson_result<std::string_view> to_json_string(simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference> x) {
if (x.error()) { return x.error(); }
return to_json_string(x.value_unsafe());
}
inline simdjson_result<std::string_view> to_json_string(simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> x) {
if (x.error()) { return x.error(); }
return to_json_string(x.value());
return to_json_string(x.value_unsafe());
}
inline simdjson_result<std::string_view> to_json_string(simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::object> x) {
if (x.error()) { return x.error(); }
return to_json_string(x.value());
return to_json_string(x.value_unsafe());
}
inline simdjson_result<std::string_view> to_json_string(simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array> x) {
if (x.error()) { return x.error(); }
return to_json_string(x.value());
return to_json_string(x.value_unsafe());
}
#endif
} // namespace simdjson
@ -153,7 +162,20 @@ inline std::ostream& operator<<(std::ostream& out, simdjson::SIMDJSON_IMPLEMENTA
throw simdjson::simdjson_error(error);
}
}
inline std::ostream& operator<<(std::ostream& out, simdjson::simdjson_result<simdjson::SIMDJSON_IMPLEMENTATION::ondemand::document> x) {
inline std::ostream& operator<<(std::ostream& out, simdjson::SIMDJSON_IMPLEMENTATION::ondemand::document_reference& value) {
std::string_view v;
auto error = simdjson::to_json_string(value).get(v);
if(error == simdjson::SUCCESS) {
return (out << v);
} else {
throw simdjson::simdjson_error(error);
}
}
inline std::ostream& operator<<(std::ostream& out, simdjson::simdjson_result<simdjson::SIMDJSON_IMPLEMENTATION::ondemand::document>&& x) {
if (x.error()) { throw simdjson::simdjson_error(x.error()); }
return (out << x.value());
}
inline std::ostream& operator<<(std::ostream& out, simdjson::simdjson_result<simdjson::SIMDJSON_IMPLEMENTATION::ondemand::document_reference>&& x) {
if (x.error()) { throw simdjson::simdjson_error(x.error()); }
return (out << x.value());
}

View File

@ -23,12 +23,10 @@ inline simdjson_result<std::string_view> to_json_string(SIMDJSON_IMPLEMENTATION:
* contains JSON text that is suitable to be parsed as JSON again.
*/
inline simdjson_result<std::string_view> to_json_string(SIMDJSON_IMPLEMENTATION::ondemand::array& x) noexcept;
#if SIMDJSON_EXCEPTIONS
inline simdjson_result<std::string_view> to_json_string(simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document> x);
inline simdjson_result<std::string_view> to_json_string(simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> x);
inline simdjson_result<std::string_view> to_json_string(simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::object> x);
inline simdjson_result<std::string_view> to_json_string(simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array> x);
#endif
} // namespace simdjson
@ -63,7 +61,11 @@ inline std::ostream& operator<<(std::ostream& out, simdjson::simdjson_result<sim
*/
inline std::ostream& operator<<(std::ostream& out, simdjson::SIMDJSON_IMPLEMENTATION::ondemand::document& value);
#if SIMDJSON_EXCEPTIONS
inline std::ostream& operator<<(std::ostream& out, simdjson::simdjson_result<simdjson::SIMDJSON_IMPLEMENTATION::ondemand::document> x);
inline std::ostream& operator<<(std::ostream& out, simdjson::simdjson_result<simdjson::SIMDJSON_IMPLEMENTATION::ondemand::document>&& x);
#endif
inline std::ostream& operator<<(std::ostream& out, simdjson::SIMDJSON_IMPLEMENTATION::ondemand::document_reference& value);
#if SIMDJSON_EXCEPTIONS
inline std::ostream& operator<<(std::ostream& out, simdjson::simdjson_result<simdjson::SIMDJSON_IMPLEMENTATION::ondemand::document_reference>&& x);
#endif
/**
* Print JSON to an output stream.

View File

@ -5,12 +5,6 @@ using namespace simdjson;
namespace document_stream_tests {
std::string my_string(ondemand::document& doc) {
std::stringstream ss;
ss << doc;
return ss.str();
}
bool simple_document_iteration() {
TEST_START();
auto json = R"([1,[1,2]] {"a":1,"b":2} {"o":{"1":1,"2":2}} [1,2,3])"_padded;
@ -19,9 +13,11 @@ namespace document_stream_tests {
ASSERT_SUCCESS(parser.iterate_many(json).get(stream));
std::string_view expected[4] = {"[1,[1,2]]", "{\"a\":1,\"b\":2}", "{\"o\":{\"1\":1,\"2\":2}}", "[1,2,3]"};
size_t counter{0};
for(auto & doc : stream) {
for(auto doc : stream) {
ASSERT_TRUE(counter < 4);
ASSERT_EQUAL(my_string(doc), expected[counter++]);
std::string_view view;
ASSERT_SUCCESS(to_json_string(doc).get(view));
ASSERT_EQUAL(view, expected[counter++]);
}
ASSERT_EQUAL(counter, 4);
TEST_SUCCEED();
@ -60,7 +56,8 @@ namespace document_stream_tests {
++i;
ASSERT_EQUAL(i.source(),expected[counter++]);
ASSERT_SUCCESS( (*i).find_field("a").get(x) );
simdjson_result<ondemand::document_reference> xxx = *i;
ASSERT_SUCCESS( xxx.find_field("a").get(x) );
ASSERT_EQUAL(x,1);
++i;
@ -289,7 +286,7 @@ namespace document_stream_tests {
ondemand::document_stream stream;
ASSERT_SUCCESS(parser.iterate_many(json).get(stream));
size_t count{0};
for (auto & doc : stream) {
for (auto doc : stream) {
(void)doc;
count++;
}
@ -304,7 +301,7 @@ namespace document_stream_tests {
ondemand::document_stream stream;
ASSERT_SUCCESS(parser.iterate_many(json).get(stream));
size_t count{0};
for (auto & doc : stream) {
for (auto doc : stream) {
(void)doc;
count++;
}
@ -336,7 +333,7 @@ namespace document_stream_tests {
ondemand::document_stream stream;
size_t count{0};
ASSERT_SUCCESS( parser.iterate_many(str, batch_size).get(stream) );
for (auto & doc : stream) {
for (auto doc : stream) {
int64_t keyid;
ASSERT_SUCCESS( doc["id"].get(keyid) );
ASSERT_EQUAL( keyid, int64_t(count) );
@ -349,6 +346,42 @@ namespace document_stream_tests {
}
bool issue1668() {
TEST_START();
auto json = R"([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100])"_padded;
ondemand::parser odparser;
ondemand::document_stream odstream;
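// The batch size (50 bytes) is too small for this single document,
// so any access to the document should report CAPACITY.
ASSERT_SUCCESS( odparser.iterate_many(json.data(), json.length(), 50).get(odstream) );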
for (auto doc: odstream) {
ondemand::value val;
ASSERT_ERROR(doc.at_pointer("/40").get(val), CAPACITY);
}
TEST_SUCCEED();
}
bool issue1668_long() {
TEST_START();
auto json = R"([1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100])"_padded;
ondemand::parser odparser;
ondemand::document_stream odstream;
size_t counter{0};
// The first six documents fit within the 50-byte batch size; the seventh does not.
ASSERT_SUCCESS( odparser.iterate_many(json.data(), json.length(), 50).get(odstream) );
for (auto doc: odstream) {
if(counter < 6) {
int64_t val;
ASSERT_SUCCESS(doc.at_pointer("/4").get(val));
ASSERT_EQUAL(val, 5);
} else {
ondemand::value val;
ASSERT_ERROR(doc.at_pointer("/4").get(val), CAPACITY);
}
counter++;
}
TEST_SUCCEED();
}
bool document_stream_utf8_test() {
TEST_START();
fflush(NULL);
@ -373,7 +406,7 @@ namespace document_stream_tests {
ondemand::document_stream stream;
size_t count{0};
ASSERT_SUCCESS( parser.iterate_many(str, batch_size).get(stream) );
for (auto & doc : stream) {
for (auto doc : stream) {
int64_t keyid;
ASSERT_SUCCESS( doc["id"].get(keyid) );
ASSERT_EQUAL( keyid, int64_t(count) );
@ -386,44 +419,46 @@ namespace document_stream_tests {
}
bool stress_data_race() {
TEST_START();
// Correct JSON.
auto input = R"([1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] )"_padded;;
ondemand::parser parser;
ondemand::document_stream stream;
ASSERT_SUCCESS(parser.iterate_many(input, 32).get(stream));
for(auto i = stream.begin(); i != stream.end(); ++i) {
ASSERT_SUCCESS(i.error());
}
TEST_SUCCEED();
}
bool stress_data_race_with_error() {
TEST_START();
#if SIMDJSON_THREAD_ENABLED
std::cout << "ENABLED" << std::endl;
#endif
// Intentionally broken
auto input = R"([1,23] [1,23] [1,23] [1,23 [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] )"_padded;
ondemand::parser parser;
ondemand::document_stream stream;
ASSERT_SUCCESS(parser.iterate_many(input, 32).get(stream));
size_t count{0};
for(auto i = stream.begin(); i != stream.end(); ++i) {
auto error = i.error();
if(count <= 3) {
ASSERT_SUCCESS(error);
} else {
ASSERT_ERROR(error,TAPE_ERROR);
break;
TEST_START();
// Correct JSON.
auto input = R"([1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] )"_padded;
ondemand::parser parser;
ondemand::document_stream stream;
ASSERT_SUCCESS(parser.iterate_many(input, 32).get(stream));
for(auto i = stream.begin(); i != stream.end(); ++i) {
ASSERT_SUCCESS(i.error());
}
count++;
TEST_SUCCEED();
}
bool stress_data_race_with_error() {
TEST_START();
#if SIMDJSON_THREAD_ENABLED
std::cout << "ENABLED" << std::endl;
#endif
// Intentionally broken
auto input = R"([1,23] [1,23] [1,23] [1,23 [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] [1,23] )"_padded;
ondemand::parser parser;
ondemand::document_stream stream;
ASSERT_SUCCESS(parser.iterate_many(input, 32).get(stream));
size_t count{0};
for(auto i = stream.begin(); i != stream.end(); ++i) {
auto error = i.error();
if(count <= 3) {
ASSERT_SUCCESS(error);
} else {
ASSERT_ERROR(error,TAPE_ERROR);
break;
}
count++;
}
TEST_SUCCEED();
}
TEST_SUCCEED();
}
bool run() {
return
issue1668() &&
issue1668_long() &&
simple_document_iteration() &&
simple_document_iteration_multiple_batches() &&
simple_document_iteration_with_parsing() &&

@ -360,7 +360,7 @@ bool iterate_many_example() {
size_t expected_indexes[3] = {0,9,29};
std::string_view expected_doc[3] = {"[1,2,3]", R"({"1":1,"2":3,"4":4})", "[1,2,3]"};
for(; i != stream.end(); ++i) {
auto & doc = *i;
auto doc = *i;
ASSERT_SUCCESS(doc.type());
ASSERT_SUCCESS(i.error());
ASSERT_EQUAL(i.current_index(),expected_indexes[count]);
@ -400,14 +400,36 @@ bool ndjson_basics_example() {
ASSERT_SUCCESS( parser.iterate_many(json).get(docs) );
size_t count{0};
int64_t expected[3] = {1,2,3};
for (auto & doc : docs) {
for (auto doc : docs) {
int64_t actual;
ASSERT_SUCCESS( doc["foo"].get(actual) );
ASSERT_EQUAL( actual,expected[count++] );
}
TEST_SUCCEED();
}
bool stream_capacity_example() {
auto json = R"([1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5] [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100])"_padded;
ondemand::parser parser;
ondemand::document_stream stream;
size_t counter{0};
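// A batch size of 50 bytes is enough for the six short documents,
// but not for the final, longer one.
auto error = parser.iterate_many(json, 50).get(stream);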
if( error ) { /* handle the error */ }
for (auto doc: stream) {
if(counter < 6) {
int64_t val;
error = doc.at_pointer("/4").get(val);
if( error ) { /* handle the error */ }
std::cout << "5 = " << val << std::endl;
} else {
ondemand::value val;
error = doc.at_pointer("/4").get(val);
// error == simdjson::CAPACITY
if(error) { std::cerr << error << std::endl; break; }
}
counter++;
}
return true;
}
int main() {
if (
true
@ -434,6 +456,7 @@ int main() {
&& iterate_many_example()
&& iterate_many_truncated_example()
&& ndjson_basics_example()
&& stream_capacity_example()
) {
return 0;
} else {