Additional tests and document tuning (#1684)

* Additional example.

* Adds more tests.

* Actually using the variable.
Daniel Lemire 2021-08-02 16:35:02 -04:00 committed by GitHub
parent 0fa68d8930
commit 06643fc9f5
8 changed files with 312 additions and 158 deletions

View File

@ -12,7 +12,6 @@ An overview of what you need to know to use simdjson, with examples.
* [Documents Are Iterators](#documents-are-iterators)
* [C++11 Support and string_view](#c11-support-and-string_view)
* [Using the Parsed JSON](#using-the-parsed-json)
* [C++17 Support](#c17-support)
* [Minifying JSON strings without parsing](#minifying-json-strings-without-parsing)
* [UTF-8 validation (alone)](#utf-8-validation-alone)
* [JSON Pointer](#json-pointer)
@ -152,6 +151,9 @@ strcpy(json, "[1]");
ondemand::document doc = parser.iterate(json, strlen(json), sizeof(json));
```
The simdjson library will also accept `std::string` instances, as long as the `capacity()` of
the string exceeds the `size()` by at least `SIMDJSON_PADDING`. You can increase the `capacity()` with the `reserve()` function of your strings.
We recommend against creating many `std::string` or many `simdjson::padded_string` instances in your application to store your JSON data.
Consider reusing the same buffers and limiting memory allocations.
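The capacity requirement on `std::string` can be sketched with the standard library alone. In the minimal sketch below, `kAssumedPadding`, `has_enough_padding` and `ensure_padding` are hypothetical names introduced for illustration, and the value 64 is an assumption: real code should use the `SIMDJSON_PADDING` constant from `simdjson.h`.

```c++
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for SIMDJSON_PADDING (assumed value, for illustration only).
constexpr std::size_t kAssumedPadding = 64;

// Returns true when the spare capacity beyond size() is at least the padding.
bool has_enough_padding(const std::string &s) {
  return s.capacity() - s.size() >= kAssumedPadding;
}

// Grows the capacity (never the size or the content) until the padding fits.
void ensure_padding(std::string &s) {
  if (!has_enough_padding(s)) {
    // reserve(n) guarantees capacity() >= n afterwards.
    s.reserve(s.size() + kAssumedPadding);
  }
  assert(has_enough_padding(s));
}
```

Calling `ensure_padding(json)` once on a buffer that is reused across documents avoids repeated allocations, which is in line with the recommendation to limit memory allocations.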
@ -168,11 +170,17 @@ you get where you are going. This is the key to On Demand's performance: since i
it lets you parse values as you use them. And particularly, it lets you *skip* values you do not want
to use.
We refer to "On Demand" as a front-end component since it is an interface between the
low-level parsing functions and the user. It hides much of the complexity of parsing JSON
documents.
### Parser, Document and JSON Scope
Because a document is an iterator over the JSON text, both the JSON text and the parser must
remain alive (in scope) while you are using it. Further, a `parser` may have at most
one document open at a time, since it holds allocated memory used for the parsing.
In particular, if you must pass a document instance to a function, you should avoid
passing it by value: pass it by reference instead to avoid the copy.
During the `iterate` call, the original JSON text is never modified--only read. After you are done
with the document, the source (whether file or string) can be safely discarded.
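The advice to pass documents by reference can be illustrated without simdjson. The sketch below uses a hypothetical move-only type, `cursor`, as a stand-in for a document whose copy constructor is deleted; `advance`, `consume` and `demo_pass_by_reference` are likewise illustrative names, not part of any library.

```c++
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical move-only type standing in for a document (illustration only).
struct cursor {
  std::size_t position = 0;
  cursor() = default;
  cursor(const cursor &) = delete;             // copying is disabled
  cursor &operator=(const cursor &) = delete;
  cursor(cursor &&) = default;                 // moving is allowed
  cursor &operator=(cursor &&) = default;
};

static_assert(!std::is_copy_constructible<cursor>::value,
              "passing a cursor by value without std::move does not compile");

// Takes the cursor by reference: the caller's instance is advanced in place.
std::size_t advance(cursor &c, std::size_t steps) {
  c.position += steps;
  return c.position;
}

// Takes ownership of the cursor: the caller must hand it over with std::move.
std::size_t consume(cursor c) {
  return c.position;
}

// Small usage example exercising both calling conventions.
void demo_pass_by_reference() {
  cursor c;
  assert(advance(c, 3) == 3);   // same instance is modified
  assert(advance(c, 2) == 5);   // state carries over between calls
  assert(consume(std::move(c)) == 5);
}
```

With a type like this, an accidental pass by value is a compile-time error rather than a silent copy of iterator state.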
@ -352,6 +360,7 @@ support for users who avoid exceptions. See [the simdjson error handling documen
if (error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
cout << value << endl; // Prints 3.14
```
This example also shows how we can string several operations together and check for the error only once, a strategy we call *error chaining*.
* **Counting elements in arrays:** Sometimes it is useful to scan an array to determine its length prior to parsing it.
For this purpose, `array` instances have a `count_elements` method. Users should be
aware that the `count_elements` method can be costly since it requires scanning the
@ -520,36 +529,6 @@ for (ondemand::object points : parser.iterate(points_json)) {
}
```
C++17 Support
-------------
While the simdjson library can be used in any project using C++ 11 and above, field iteration has special support C++ 17's destructuring syntax. For example:
```c++
padded_string json = R"( { "foo": 1, "bar": 2 } )"_padded;
dom::parser parser;
dom::object object;
auto error = parser.parse(json).get(object);
if (error) { cerr << error << endl; return; }
for (auto [key, value] : object) {
cout << key << " = " << value << endl;
}
```
For comparison, here is the C++ 11 version of the same code:
```c++
// C++ 11 version for comparison
padded_string json = R"( { "foo": 1, "bar": 2 } )"_padded;
dom::parser parser;
dom::object object;
auto error = parser.parse(json).get(object);
if (error) { cerr << error << endl; return; }
for (dom::key_value_pair field : object) {
cout << field.key << " = " << field.value << endl;
}
```
Minifying JSON strings without parsing
----------------------
@ -585,6 +564,8 @@ The UTF-8 validation function merely checks that the input is valid UTF-8: it wo
Your input string does not need any padding. Any string will do. The `validate_utf8` function does not do any memory allocation on the heap, and it does not throw exceptions.
If you find yourself needing only fast Unicode functions, consider using the simdutf library instead: https://github.com/simdutf/simdutf
JSON Pointer
------------
@ -704,13 +685,14 @@ The entire simdjson API is usable with and without exceptions. All simdjson APIs
pair. You can retrieve the value with .get() without generating an exception, like so:
```c++
dom::element doc;
auto error = parser.parse(json).get(doc);
ondemand::document doc;
auto error = parser.iterate(json).get(doc);
if (error) { cerr << error << endl; exit(1); }
```
When you use the code this way, it is your responsibility to check for error before using the
result: if there is an error, the result value will not be valid and using it will cause undefined behavior.
result: if there is an error, the result value will not be valid and using it will cause undefined behavior. Most compilers should be able to help you if you activate the right
set of warnings: they can identify variables that are written to but never otherwise accessed.
Let us illustrate with an example where we try to access a number that is not valid (`3.14.1`).
If we want to proceed without throwing and catching exceptions, we can do so as follows:
@ -775,24 +757,29 @@ We can write a "quick start" example where we attempt to parse the following JSO
}
```
Our program loads the file, selects the value corresponding to the key "search_metadata", which is expected to be an object, and then
it selects the key "count" within that object.
Our program loads the file, selects the value corresponding to the key `"search_metadata"`, which is expected to be an object, and then
it selects the key `"count"` within that object.
```C++
#include "simdjson.h"
#include <iostream>
int main(void) {
simdjson::dom::parser parser;
simdjson::dom::element tweets;
auto error = parser.load("twitter.json").get(tweets);
if (error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
simdjson::dom::element res;
if ((error = tweets["search_metadata"]["count"].get(res))) {
std::cerr << "could not access keys" << std::endl;
simdjson::ondemand::parser parser;
simdjson::padded_string json;
auto error = simdjson::padded_string::load("twitter.json").get(json);
if(error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
simdjson::ondemand::document tweets;
error = parser.iterate(json).get(tweets);
if( error ) { std::cerr << error << std::endl; return EXIT_FAILURE; }
simdjson::ondemand::value res;
error = tweets["search_metadata"]["count"].get(res);
if (error != simdjson::SUCCESS) {
std::cerr << "could not access keys : " << error << std::endl;
return EXIT_FAILURE;
}
std::cout << res << " results." << std::endl;
return EXIT_SUCCESS;
}
```
@ -805,19 +792,23 @@ triggering exceptions. To do this, we use `["statuses"].at(0)["id"]`. We break t
Observe how we use the `at` method when querying an index into an array, and not the bracket operator.
```C++
#include "simdjson.h"
#include <iostream>
int main(void) {
simdjson::dom::parser parser;
simdjson::dom::element tweets;
auto error = parser.load("twitter.json").get(tweets);
simdjson::ondemand::parser parser;
simdjson::ondemand::document tweets;
padded_string json;
auto error = padded_string::load("twitter.json").get(json);
if(error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
error = parser.iterate(json).get(tweets);
if(error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
uint64_t identifier;
error = tweets["statuses"].at(0)["id"].get(identifier);
if(error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
std::cout << identifier << std::endl;
return EXIT_SUCCESS;
}
```
@ -826,121 +817,60 @@ int main(void) {
This is how the example in "Using the Parsed JSON" could be written using only error code checking (without exceptions):
```c++
auto cars_json = R"( [
{ "make": "Toyota", "model": "Camry", "year": 2018, "tire_pressure": [ 40.1, 39.9, 37.7, 40.4 ] },
{ "make": "Kia", "model": "Soul", "year": 2012, "tire_pressure": [ 30.1, 31.0, 28.6, 28.7 ] },
{ "make": "Toyota", "model": "Tercel", "year": 1999, "tire_pressure": [ 29.8, 30.0, 30.2, 30.5 ] }
] )"_padded;
dom::parser parser;
dom::array cars;
auto error = parser.parse(cars_json).get(cars);
if (error) { cerr << error << endl; exit(1); }
bool parse() {
ondemand::parser parser;
auto cars_json = R"( [
{ "make": "Toyota", "model": "Camry", "year": 2018, "tire_pressure": [ 40.1, 39.9, 37.7, 40.4 ] },
{ "make": "Kia", "model": "Soul", "year": 2012, "tire_pressure": [ 30.1, 31.0, 28.6, 28.7 ] },
{ "make": "Toyota", "model": "Tercel", "year": 1999, "tire_pressure": [ 29.8, 30.0, 30.2, 30.5 ] }
] )"_padded;
ondemand::document doc;
// Iterating through an array of objects
for (dom::element car_element : cars) {
dom::object car;
if ((error = car_element.get(car))) { cerr << error << endl; exit(1); }
// Iterating through an array of objects
auto error = parser.iterate(cars_json).get(doc);
if(error) { std::cerr << error << std::endl; return false; }
ondemand::array cars;
error = doc.get_array().get(cars);
for (auto car_value : cars) {
ondemand::object car;
error = car_value.get_object().get(car);
if(error) { std::cerr << error << std::endl; return false; }
// Accessing a field by name
std::string_view make, model;
if ((error = car["make"].get(make))) { cerr << error << endl; exit(1); }
if ((error = car["model"].get(model))) { cerr << error << endl; exit(1); }
std::string_view make;
std::string_view model;
error = car["make"].get(make);
if(error) { std::cerr << error << std::endl; return false; }
error = car["model"].get(model);
if(error) { std::cerr << error << std::endl; return false; }
cout << "Make/Model: " << make << "/" << model << endl;
// Casting a JSON element to an integer
uint64_t year;
if ((error = car["year"].get(year))) { cerr << error << endl; exit(1); }
cout << "- This car is " << 2020 - year << "years old." << endl;
error = car["year"].get(year);
if(error) { std::cerr << error << std::endl; return false; }
cout << "- This car is " << 2020 - year << " years old." << endl;
// Iterating through an array of floats
double total_tire_pressure = 0;
dom::array tire_pressure_array;
if ((error = car["tire_pressure"].get(tire_pressure_array))) { cerr << error << endl; exit(1); }
for (dom::element tire_pressure_element : tire_pressure_array) {
double tire_pressure;
if ((error = tire_pressure_element.get(tire_pressure))) { cerr << error << endl; exit(1); }
total_tire_pressure += tire_pressure;
ondemand::array pressures;
error = car["tire_pressure"].get_array().get(pressures);
if(error) { std::cerr << error << std::endl; return false; }
for (auto tire_pressure_value : pressures) {
double tire_pressure;
error = tire_pressure_value.get_double().get(tire_pressure);
if(error) { std::cerr << error << std::endl; return false; }
total_tire_pressure += tire_pressure;
}
cout << "- Average tire pressure: " << (total_tire_pressure / 4) << endl;
// Writing out all the information about the car
for (auto field : car) {
cout << "- " << field.key << ": " << field.value << endl;
}
}
```
Here is another example:
```C++
auto abstract_json = R"( [
{ "12345" : {"a":12.34, "b":56.78, "c": 9998877} },
{ "12545" : {"a":11.44, "b":12.78, "c": 11111111} }
] )"_padded;
dom::parser parser;
dom::array array;
auto error = parser.parse(abstract_json).get(array);
if (error) { cerr << error << endl; exit(1); }
// Iterate through an array of objects
for (dom::element elem : array) {
dom::object obj;
if ((error = elem.get(obj))) { cerr << error << endl; exit(1); }
for (auto & key_value : obj) {
cout << "key: " << key_value.key << " : ";
dom::object innerobj;
if ((error = key_value.value.get(innerobj))) { cerr << error << endl; exit(1); }
double va, vb;
if ((error = innerobj["a"].get(va))) { cerr << error << endl; exit(1); }
cout << "a: " << va << ", ";
if ((error = innerobj["b"].get(vc))) { cerr << error << endl; exit(1); }
cout << "b: " << vb << ", ";
int64_t vc;
if ((error = innerobj["c"].get(vc))) { cerr << error << endl; exit(1); }
cout << "c: " << vc << endl;
}
}
```
And another one:
```C++
auto abstract_json = R"(
{ "str" : { "123" : {"abc" : 3.14 } } } )"_padded;
dom::parser parser;
double v;
auto error = parser.parse(abstract_json)["str"]["123"]["abc"].get(v);
if (error) { cerr << error << endl; exit(1); }
cout << "number: " << v << endl;
```
Notice how we can string several operations (`parser.parse(abstract_json)["str"]["123"]["abc"].get(v)`) and only check for the error once, a strategy we call *error chaining*.
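The mechanism behind error chaining can be sketched with a small stand-alone type. Everything below (`err`, `node`, `result`) is hypothetical and greatly simplified; it is not simdjson's implementation, only an illustration of how an error raised early in a chain can be forwarded to the final `get` call.

```c++
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical error codes and tree node (illustration only, not simdjson's API).
enum class err { ok, no_such_field, wrong_type };

struct node {
  bool is_number = false;
  double number = 0;
  std::vector<std::string> keys;  // field names
  std::vector<node> children;     // field values, parallel to keys
};

// Carries either a pointer to a node or the first error encountered.
struct result {
  const node *value;
  err error;

  // A lookup on a failed result forwards the error untouched: this is the chaining.
  result operator[](const std::string &key) const {
    if (error != err::ok) { return *this; }
    for (std::size_t i = 0; i < value->keys.size(); i++) {
      if (value->keys[i] == key) { return {&value->children[i], err::ok}; }
    }
    return {nullptr, err::no_such_field};
  }

  // Final extraction: the single place where the caller checks for an error.
  err get(double &out) const {
    if (error != err::ok) { return error; }
    if (!value->is_number) { return err::wrong_type; }
    out = value->number;
    return err::ok;
  }
};
```

A chain such as `root["str"]["abc"].get(v)` then needs a single check: a missing `"str"` key surfaces as the error returned by `get`.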
The next two functions will take as input a JSON document containing an array with a single element, either a string or a number. They return true upon success.
```C++
simdjson::dom::parser parser{};
bool parse_double(const char *j, double &d) {
auto error = parser.parse(j, std::strlen(j))
.at(0)
.get(d, error);
if (error) { return false; }
return true;
}
bool parse_string(const char *j, std::string &s) {
std::string_view answer;
auto error = parser.parse(j,strlen(j))
.at(0)
.get(answer, error);
if (error) { return false; }
s.assign(answer.data(), answer.size());
}
return true;
}
```
### Disabling Exceptions
The simdjson library can be built with exceptions entirely disabled. It checks the `__cpp_exceptions` macro at compile time. Even if exceptions are enabled in your compiler, you may still disable exceptions specifically for simdjson by setting `SIMDJSON_EXCEPTIONS` to `0` (false) at compile time when building the simdjson library. If you are building with CMake, you can compile with `SIMDJSON_EXCEPTIONS=OFF` to ensure you don't write any code that uses exceptions. For example, if including the project via CMake:
@ -954,7 +884,7 @@ target_compile_definitions(simdjson PUBLIC SIMDJSON_EXCEPTIONS=OFF)
Users more comfortable with an exception flow may choose to directly cast the `simdjson_result<T>` to the desired type:
```c++
dom::element doc = parser.parse(json); // Throws an exception if there was an error!
simdjson::ondemand::document doc = parser.iterate(json); // Throws an exception if there was an error!
```
When used this way, a `simdjson_error` exception will be thrown if an error occurs, preventing the
@ -965,16 +895,18 @@ If one is willing to trigger exceptions, it is possible to write simpler code:
```C++
#include "simdjson.h"
#include <iostream>
int main(void) {
simdjson::dom::parser parser;
simdjson::dom::element tweets = parser.load("twitter.json");
std::cout << "ID: " << tweets["statuses"].at(0)["id"] << std::endl;
simdjson::ondemand::parser parser;
padded_string json = padded_string::load("twitter.json");
simdjson::ondemand::document tweets = parser.iterate(json);
uint64_t identifier = tweets["statuses"].at(0)["id"];
std::cout << identifier << std::endl;
return EXIT_SUCCESS;
}
```
Rewinding
----------
@ -1231,9 +1163,8 @@ We built simdjson with thread safety in mind.
The simdjson library is single-threaded except for [`iterate_many`](iterate_many.md) and [`parse_many`](parse_many.md) which may use secondary threads under their control when the library is compiled with thread support.
We recommend using one `dom::parser` object per thread in which case the library is thread-safe.
It is unsafe to reuse a `dom::parser` object between different threads.
The parsed results (`dom::document`, `dom::element`, `array`, `object`) depend on the `dom::parser`, etc. therefore it is also potentially unsafe to use the result of the parsing between different threads.
We recommend using one `parser` object per thread. When using the On Demand front-end (our default), you should access each `document` instance in a single-threaded manner since it
acts as an iterator (and is therefore not thread-safe).
The CPU detection, which runs the first time parsing is attempted and switches to the fastest
parser for your CPU, is transparent and thread-safe.

View File

@ -132,6 +132,10 @@ simdjson_really_inline simdjson_result<size_t> document::count_elements() & noex
if(answer.error() == SUCCESS) { iter._depth -= 1 ; /* undoing the increment so we go back at the doc depth.*/ }
return answer;
}
simdjson_really_inline simdjson_result<value> document::at(size_t index) & noexcept {
auto a = get_array();
return a.at(index);
}
simdjson_really_inline simdjson_result<array_iterator> document::begin() & noexcept {
return get_array().begin();
}
@ -229,6 +233,10 @@ simdjson_really_inline simdjson_result<size_t> simdjson_result<SIMDJSON_IMPLEMEN
if (error()) { return error(); }
return first.count_elements();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document>::at(size_t index) & noexcept {
if (error()) { return error(); }
return first.at(index);
}
simdjson_really_inline error_code simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document>::rewind() noexcept {
if (error()) { return error(); }
first.rewind();
@ -417,6 +425,7 @@ simdjson_really_inline document_reference::operator raw_json_string() noexcept(f
simdjson_really_inline document_reference::operator bool() noexcept(false) { return bool(*doc); }
#endif
simdjson_really_inline simdjson_result<size_t> document_reference::count_elements() & noexcept { return doc->count_elements(); }
simdjson_really_inline simdjson_result<value> document_reference::at(size_t index) & noexcept { return doc->at(index); }
simdjson_really_inline simdjson_result<array_iterator> document_reference::begin() & noexcept { return doc->begin(); }
simdjson_really_inline simdjson_result<array_iterator> document_reference::end() & noexcept { return doc->end(); }
simdjson_really_inline simdjson_result<value> document_reference::find_field(std::string_view key) & noexcept { return doc->find_field(key); }
@ -447,6 +456,10 @@ simdjson_really_inline simdjson_result<size_t> simdjson_result<SIMDJSON_IMPLEMEN
if (error()) { return error(); }
return first.count_elements();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::at(size_t index) & noexcept {
if (error()) { return error(); }
return first.at(index);
}
simdjson_really_inline error_code simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::rewind() noexcept {
if (error()) { return error(); }
first.rewind();

View File

@ -27,7 +27,7 @@ public:
* Exists so you can declare a variable and later assign to it before use.
*/
simdjson_really_inline document() noexcept = default;
simdjson_really_inline document(const document &other) noexcept = delete;
simdjson_really_inline document(const document &other) noexcept = delete; // pass your documents by reference, not by copy
simdjson_really_inline document(document &&other) noexcept = default;
simdjson_really_inline document &operator=(const document &other) noexcept = delete;
simdjson_really_inline document &operator=(document &&other) noexcept = default;
@ -233,6 +233,14 @@ public:
* safe to continue.
*/
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
/**
* Get the value at the given index in the array. This function has linear-time complexity.
* This function should only be called once as the array iterator is not reset between each call.
*
* @return The value at the given index, or:
* - INDEX_OUT_OF_BOUNDS if the array index is larger than an array length
*/
simdjson_really_inline simdjson_result<value> at(size_t index) & noexcept;
/**
* Begin array iteration.
*
@ -444,6 +452,7 @@ public:
simdjson_really_inline operator bool() noexcept(false);
#endif
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
simdjson_really_inline simdjson_result<value> at(size_t index) & noexcept;
simdjson_really_inline simdjson_result<array_iterator> begin() & noexcept;
simdjson_really_inline simdjson_result<array_iterator> end() & noexcept;
simdjson_really_inline simdjson_result<value> find_field(std::string_view key) & noexcept;
@ -501,6 +510,7 @@ public:
simdjson_really_inline operator bool() noexcept(false);
#endif
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> at(size_t index) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> begin() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> end() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field(std::string_view key) & noexcept;
@ -553,6 +563,7 @@ public:
simdjson_really_inline operator bool() noexcept(false);
#endif
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> at(size_t index) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> begin() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> end() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field(std::string_view key) & noexcept;

View File

@ -114,6 +114,10 @@ simdjson_really_inline simdjson_result<size_t> value::count_elements() & noexcep
iter.move_at_start();
return answer;
}
simdjson_really_inline simdjson_result<value> value::at(size_t index) noexcept {
auto a = get_array();
return a.at(index);
}
simdjson_really_inline simdjson_result<value> value::find_field(std::string_view key) noexcept {
return start_or_resume_object().find_field(key);
@ -182,6 +186,10 @@ simdjson_really_inline simdjson_result<size_t> simdjson_result<SIMDJSON_IMPLEMEN
if (error()) { return error(); }
return first.count_elements();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value>::at(size_t index) noexcept {
if (error()) { return error(); }
return first.at(index);
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value>::begin() & noexcept {
if (error()) { return error(); }
return first.begin();

View File

@ -244,6 +244,14 @@ public:
* safe to continue.
*/
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
/**
* Get the value at the given index in the array. This function has linear-time complexity.
* This function should only be called once as the array iterator is not reset between each call.
*
* @return The value at the given index, or:
* - INDEX_OUT_OF_BOUNDS if the array index is larger than an array length
*/
simdjson_really_inline simdjson_result<value> at(size_t index) noexcept;
/**
* Look up a field by name on an object (order-sensitive).
*
@ -465,6 +473,7 @@ public:
simdjson_really_inline operator bool() noexcept(false);
#endif
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> at(size_t index) noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> begin() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::array_iterator> end() & noexcept;

View File

@ -30,7 +30,7 @@ namespace internal {
{ UNEXPECTED_ERROR, "Unexpected error, consider reporting this problem as you may have found a bug in simdjson" },
{ PARSER_IN_USE, "Cannot parse a new document while a document is still in use." },
{ OUT_OF_ORDER_ITERATION, "Objects and arrays can only be iterated when they are first encountered." },
{ INSUFFICIENT_PADDING, "simdjson requires the input JSON string to have at least SIMDJSON_PADDING extra bytes allocated, beyond the string's length." },
{ INSUFFICIENT_PADDING, "simdjson requires the input JSON string to have at least SIMDJSON_PADDING extra bytes allocated, beyond the string's length. Consider using the simdjson::padded_string class if needed." },
{ INCOMPLETE_ARRAY_OR_OBJECT, "JSON document ended early in the middle of an object or array." }
}; // error_messages[]

View File

@ -5,6 +5,58 @@ using namespace simdjson;
namespace document_stream_tests {
template <typename T>
bool process_doc(T &docref) {
int64_t val;
ASSERT_SUCCESS(docref.at_pointer("/4").get(val));
//ASSERT_SUCCESS(err);
ASSERT_EQUAL(val, 5);
return true;
}
bool issue1683() {
TEST_START();
std::string json = R"([1,2,3,4,5]
[1,2,3,4,5]
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100]
[1,2,3,4,5])";
ondemand::parser odparser;
ondemand::document_stream odstream;
// iterate_many all at once
auto oderror = odparser.iterate_many(json, 50).get(odstream);
if (oderror) { std::cerr << "ondemand iterate_many error: " << oderror << std::endl; return false; }
size_t currindex = 0;
auto i = odstream.begin();
for (; i != odstream.end(); ++i) {
ondemand::document_reference doc;
auto err = (*i).get(doc);
if(err == SUCCESS) { if(!process_doc(doc)) {return false; } }
currindex = i.current_index();
if (err == simdjson::CAPACITY) {
ASSERT_EQUAL(currindex, 24);
ASSERT_EQUAL(odstream.truncated_bytes(), 305);
break;
} else if (err) {
TEST_FAIL(std::string("ondemand: error accessing jsonpointer: ") + simdjson::error_message(err));
}
}
ASSERT_EQUAL(odstream.truncated_bytes(), 305);
// iterate line-by-line
std::stringstream ss(json);
std::string oneline;
oneline.reserve(json.size() + SIMDJSON_PADDING);
while (getline(ss, oneline)) {
ondemand::document doc;
ASSERT_SUCCESS(odparser.iterate(oneline).get(doc));
if( ! process_doc(doc) ) { return false; }
}
TEST_SUCCEED();
}
bool simple_document_iteration() {
TEST_START();
auto json = R"([1,[1,2]] {"a":1,"b":2} {"o":{"1":1,"2":2}} [1,2,3])"_padded;
@ -461,6 +513,7 @@ namespace document_stream_tests {
bool run() {
return
issue1683() &&
issue1668() &&
issue1668_long() &&
simple_document_iteration() &&

View File

@ -198,7 +198,7 @@ bool using_the_parsed_json_3() {
// Casting a JSON element to an integer
uint64_t year = car["year"];
cout << "- This car is " << 2020 - year << "years old." << endl;
cout << "- This car is " << 2020 - year << " years old." << endl;
// Iterating through an array of floats
double total_tire_pressure = 0;
@ -276,6 +276,61 @@ bool using_the_parsed_json_5() {
#endif // SIMDJSON_EXCEPTIONS
bool using_the_parsed_json_no_exceptions() {
TEST_START();
ondemand::parser parser;
auto cars_json = R"( [
{ "make": "Toyota", "model": "Camry", "year": 2018, "tire_pressure": [ 40.1, 39.9, 37.7, 40.4 ] },
{ "make": "Kia", "model": "Soul", "year": 2012, "tire_pressure": [ 30.1, 31.0, 28.6, 28.7 ] },
{ "make": "Toyota", "model": "Tercel", "year": 1999, "tire_pressure": [ 29.8, 30.0, 30.2, 30.5 ] }
] )"_padded;
ondemand::document doc;
// Iterating through an array of objects
auto error = parser.iterate(cars_json).get(doc);
if(error) { std::cerr << error << std::endl; return false; }
ondemand::array cars;
error = doc.get_array().get(cars);
for (auto car_value : cars) {
ondemand::object car;
error = car_value.get_object().get(car);
if(error) { std::cerr << error << std::endl; return false; }
// Accessing a field by name
std::string_view make;
std::string_view model;
error = car["make"].get(make);
if(error) { std::cerr << error << std::endl; return false; }
error = car["model"].get(model);
if(error) { std::cerr << error << std::endl; return false; }
cout << "Make/Model: " << make << "/" << model << endl;
// Casting a JSON element to an integer
uint64_t year;
error = car["year"].get(year);
if(error) { std::cerr << error << std::endl; return false; }
cout << "- This car is " << 2020 - year << " years old." << endl;
// Iterating through an array of floats
double total_tire_pressure = 0;
ondemand::array pressures;
error = car["tire_pressure"].get_array().get(pressures);
if(error) { std::cerr << error << std::endl; return false; }
for (auto tire_pressure_value : pressures) {
double tire_pressure;
error = tire_pressure_value.get_double().get(tire_pressure);
if(error) { std::cerr << error << std::endl; return false; }
total_tire_pressure += tire_pressure;
}
cout << "- Average tire pressure: " << (total_tire_pressure / 4) << endl;
}
TEST_SUCCEED();
}
int using_the_parsed_json_6_process() {
auto abstract_json = R"(
{ "str" : { "123" : {"abc" : 3.14 } } }
@ -458,6 +513,7 @@ bool simple_error_example() {
return true;
}
#if SIMDJSON_EXCEPTIONS
bool simple_error_example_except() {
TEST_START();
@ -476,6 +532,76 @@ bool simple_error_example() {
}
#endif
int load_example() {
simdjson::ondemand::parser parser;
simdjson::ondemand::document tweets;
padded_string json;
auto error = padded_string::load("twitter.json").get(json);
if(error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
error = parser.iterate(json).get(tweets);
if(error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
uint64_t identifier;
error = tweets["statuses"].at(0)["id"].get(identifier);
if(error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
std::cout << identifier << std::endl;
return EXIT_SUCCESS;
}
int example_1() {
TEST_START();
simdjson::ondemand::parser parser;
//auto error = padded_string::load("twitter.json").get(json);
// if(error) { std::cerr << error << std::endl; return EXIT_FAILURE; }
padded_string json = R"( {
"statuses": [
{
"id": 505874924095815700
},
{
"id": 505874922023837700
}
],
"search_metadata": {
"count": 100
}
} )"_padded;
simdjson::ondemand::document tweets;
auto error = parser.iterate(json).get(tweets);
if( error ) { return EXIT_FAILURE; }
simdjson::ondemand::value res;
error = tweets["search_metadata"]["count"].get(res);
if (error != SUCCESS) {
std::cerr << "could not access keys" << std::endl;
return EXIT_FAILURE;
}
std::cout << res << " results." << std::endl;
return true;
}
#if SIMDJSON_EXCEPTIONS
int load_example_except() {
simdjson::ondemand::parser parser;
padded_string json = padded_string::load("twitter.json");
simdjson::ondemand::document tweets = parser.iterate(json);
uint64_t identifier = tweets["statuses"].at(0)["id"];
std::cout << identifier << std::endl;
return EXIT_SUCCESS;
}
#endif
bool test_load_example() {
TEST_START();
simdjson::ondemand::parser parser;
simdjson::ondemand::document tweets;
padded_string json = R"( {"statuses":[{"id":1234}]} )"_padded;
auto error = parser.iterate(json).get(tweets);
if(error) { std::cerr << error << std::endl; return false; }
uint64_t identifier;
error = tweets["statuses"].at(0)["id"].get(identifier);
if(error) { std::cerr << error << std::endl; return false; }
std::cout << identifier << std::endl;
return identifier == 1234;
}
int main() {
if (
true
@ -503,6 +629,9 @@ int main() {
&& iterate_many_truncated_example()
&& ndjson_basics_example()
&& stream_capacity_example()
&& test_load_example()
&& example_1()
&& using_the_parsed_json_no_exceptions()
) {
return 0;
} else {