Makes it possible to cast a document to a value. (#1690)

* Makes it possible to cast a document to a value.
Daniel Lemire 2021-08-11 20:02:30 -04:00 committed by GitHub
parent ba46616cbc
commit de4deb8c4e
16 changed files with 380 additions and 48 deletions

View File

@ -17,7 +17,6 @@
"__errc": "cpp",
"__functional_base": "cpp",
"__hash_table": "cpp",
"__locale": "cpp",
"__mutex_base": "cpp",
"__node_handle": "cpp",
"__nullptr": "cpp",
@ -85,6 +84,8 @@
"utility": "cpp",
"valarray": "cpp",
"vector": "cpp",
"*.ipp": "cpp"
"*.ipp": "cpp",
"__functional_base_03": "cpp",
"filesystem": "cpp"
}
}

View File

@ -191,8 +191,6 @@ For best performance, a `parser` instance should be reused over several files: o
needlessly reallocate memory, an expensive process. It is also possible to avoid entirely memory
allocations during parsing when using simdjson. [See our performance notes for details](performance.md).
C++11 Support and string_view
-------------
@ -233,11 +231,14 @@ Using the Parsed JSON
---------------------
Once you have a document (`simdjson::ondemand::document`), you can navigate it with idiomatic C++ iterators, operators and casts.
Besides native types (`double`, `uint64_t`, `int64_t`, `bool`), we also have access Unicode (UTF-8) strings (`std::string_view`),
Besides the document instances and native types (`double`, `uint64_t`, `int64_t`, `bool`), we can also access Unicode (UTF-8) strings (`std::string_view`),
objects (`simdjson::ondemand::object`) and arrays (`simdjson::ondemand::array`). We also have a generic type (`simdjson::ondemand::value`)
which represents a potential array or object, or a scalar type (`double`, `uint64_t`, `int64_t`, `bool`, `null`, string) inside an array
or an object. Both generic types (`simdjson::ondemand::document` and `simdjson::ondemand::value`) have a `type()` method returning
a `json_type` value describing the value (`json_type::array`, `json_type::object`, `json_type::number`, `json_type::string`, `json_type::boolean`, `json_type::null`).
While you are accessing the document, the `document` instance should remain in scope: it is your "iterator" which keeps track
of where you are in the JSON document. By design, there is one and only one `document` instance per JSON document.
The following specific instructions indicate how to use the JSON when exceptions are enabled, but simdjson has full, idiomatic
support for users who avoid exceptions. See [the simdjson error handling documentation](basics.md#error-handling) for more.
@ -310,8 +311,8 @@ support for users who avoid exceptions. See [the simdjson error handling documen
- `field.unescaped_key()` will get you the unescaped key string.
- `field.value()` will get you the value, which you can then use all these other methods on.
* **Array Index:** Because it is forward-only, you cannot look up an array element by index. Instead,
you should to iterate through the array and keep an index yourself.
* **Output to strings (simdjson 1.0 or better):** Given a document, a value, an array or an object in a JSON document, you can output a JSON string version suitable to be parsed again as JSON content: `simdjson::to_json_string(element)`. A call to `to_json_string` consumes fully the element: if you apply it on a document, the JSON pointer is advanced to the end of the document. The `simdjson::to_json_string` does not allocate memory. The `to_json_string` function should not be confused with retrieving the value of a string instance which are escaped and represented using a lightweight `std::string_view` instance pointing at an internal string buffer inside the parser instance. To illustrate, the first of the following two code segments will print the unescaped string `"test"` complete with the quote whereas the second one will print the escaped content of the string (without the quotes).
you should iterate through the array and keep an index yourself.
* **Output to strings:** Given a document, a value, an array or an object in a JSON document, you can output a JSON string version suitable to be parsed again as JSON content: `simdjson::to_json_string(element)`. A call to `to_json_string` consumes fully the element: if you apply it on a document, the JSON pointer is advanced to the end of the document. The `simdjson::to_json_string` does not allocate memory. The `to_json_string` function should not be confused with retrieving the value of a string instance which are escaped and represented using a lightweight `std::string_view` instance pointing at an internal string buffer inside the parser instance. To illustrate, the first of the following two code segments will print the unescaped string `"test"` complete with the quote whereas the second one will print the escaped content of the string (without the quotes).
> ```C++
> // serialize a JSON to an escaped std::string instance so that it can be parsed again as JSON
> auto silly_json = R"( { "test": "result" } )"_padded;
@ -398,14 +399,11 @@ support for users who avoid exceptions. See [the simdjson error handling documen
}
```
* **Tree Walking and JSON Element Types:** Sometimes you don't necessarily have a document
with a known type, and are trying to generically inspect or walk over JSON elements. To do that, you can use iterators and the type() method.
For example, here's a quick and dirty recursive function that verbosely prints the JSON document as JSON:
with a known type, and are trying to generically inspect or walk over JSON elements. To do that, you can use iterators and the `type()` method. You can also represent arbitrary JSON values with
`ondemand::value` instances: it can represent anything except a scalar document (lone number, string, null or Boolean). You can check for scalar documents with the method `scalar()`.
For example, the following is a quick and dirty recursive function that verbosely prints the JSON document as JSON. This example also illustrates lifecycle requirements: the `document` instance holds the iterator. The document must remain in scope while you are accessing instances of `value`, `object` and `array`.
```c++
// We use a template function because we need to
// support both ondemand::value and ondemand::document
// as a parameter type. Note that we move the values.
template <class T>
void recursive_print_json(T&& element) {
void recursive_print_json(ondemand::value element) {
bool add_comma;
switch (element.type()) {
case ondemand::json_type::array:
@ -436,7 +434,7 @@ support for users who avoid exceptions. See [the simdjson error handling documen
recursive_print_json(field.value());
add_comma = true;
}
cout << "}";
cout << "}\n";
break;
case ondemand::json_type::number:
// assume it fits in a double
@ -456,9 +454,16 @@ support for users who avoid exceptions. See [the simdjson error handling documen
}
}
void basics_treewalk() {
padded_string json = R"( [
{ "make": "Toyota", "model": "Camry", "year": 2018, "tire_pressure": [ 40.1, 39.9, 37.7, 40.4 ] },
{ "make": "Kia", "model": "Soul", "year": 2012, "tire_pressure": [ 30.1, 31.0, 28.6, 28.7 ] },
{ "make": "Toyota", "model": "Tercel", "year": 1999, "tire_pressure": [ 29.8, 30.0, 30.2, 30.5 ] }
] )"_padded;
ondemand::parser parser;
auto json = padded_string::load("twitter.json");
recursive_print_json(parser.iterate(json));
ondemand::document doc = parser.iterate(json);
ondemand::value val = doc;
recursive_print_json(val);
std::cout << std::endl;
}
```
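The recursive shape of the tree walk above can be sketched without simdjson. The following is a minimal, self-contained illustration that assumes a hypothetical in-memory `node` type with helper functions `make_scalar`, `make_array`, `make_object` and `to_json`; none of these names exist in simdjson, and the On Demand API does not build such a tree. It only shows the dispatch-on-type recursion and the comma handling.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory node, for illustration only (not a simdjson type).
struct node {
  enum class kind { array, object, scalar };
  kind type;
  std::string text;               // raw token for scalars, e.g. 2012 or "Kia"
  std::vector<std::string> keys;  // keys when type == object (parallel to children)
  std::vector<node> children;     // elements (array) or field values (object)
};

inline node make_scalar(std::string text) {
  return node{node::kind::scalar, std::move(text), {}, {}};
}
inline node make_array(std::vector<node> items) {
  return node{node::kind::array, "", {}, std::move(items)};
}
inline node make_object(std::vector<std::string> keys, std::vector<node> values) {
  return node{node::kind::object, "", std::move(keys), std::move(values)};
}

// Recursively serialize a node, dispatching on its type.
inline std::string to_json(const node &n) {
  switch (n.type) {
    case node::kind::scalar:
      return n.text;
    case node::kind::array: {
      std::string out = "[";
      for (std::size_t i = 0; i < n.children.size(); i++) {
        if (i > 0) { out += ","; }  // separator before every element but the first
        out += to_json(n.children[i]);
      }
      return out + "]";
    }
    case node::kind::object: {
      std::string out = "{";
      for (std::size_t i = 0; i < n.children.size(); i++) {
        if (i > 0) { out += ","; }
        out += "\"" + n.keys[i] + "\":" + to_json(n.children[i]);
      }
      return out + "}";
    }
  }
  return "";  // unreachable for valid nodes
}
```

Unlike this sketch, the On Demand version consumes the input as it goes, which is why the `document` must stay in scope for the whole walk.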
@ -686,7 +691,8 @@ doc.rewind(); // Need to manually rewind to be able to use find_field properly f
std::cout << doc.find_field("k0") << std::endl; // Prints 27
```
When the JSON path is the empty string (`""`) applied to a scalar document (lone string, number, Boolean or null), a SCALAR_DOCUMENT_AS_VALUE error is returned because scalar documents cannot
be represented as `value` instances. You can check that a document is a scalar with the method `scalar()`.
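The rule described above can be summarized in a few lines. This is a simplified model, not simdjson's implementation: the `json_type` and `error_code` enums below are hypothetical stand-ins that merely reuse the names from the text, and `is_scalar` and `document_to_value` are illustrative helpers.

```cpp
// Hypothetical stand-ins for illustration; these are not simdjson's own types.
enum class json_type { array, object, number, string, boolean, null };
enum class error_code { SUCCESS, SCALAR_DOCUMENT_AS_VALUE };

// A type is a scalar when it is neither an array nor an object.
inline bool is_scalar(json_type t) {
  return !(t == json_type::array || t == json_type::object);
}

// Models the outcome of converting a whole document to a value:
// arrays and objects succeed, scalar documents report an error.
inline error_code document_to_value(json_type root) {
  return is_scalar(root) ? error_code::SCALAR_DOCUMENT_AS_VALUE
                         : error_code::SUCCESS;
}
```

In practice this suggests checking `scalar()` on the document first and, for scalar documents, using the typed getters on the document itself (`get_bool()`, `get_double()`, and so on).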
Error Handling
--------------

View File

@ -38,6 +38,7 @@ enum error_code {
OUT_OF_ORDER_ITERATION, ///< tried to iterate an array or object out of order
INSUFFICIENT_PADDING, ///< The JSON doesn't have enough padding for simdjson to safely parse it.
INCOMPLETE_ARRAY_OR_OBJECT, ///< The document ends early.
SCALAR_DOCUMENT_AS_VALUE, ///< A scalar document is treated as a value.
NUM_ERROR_CODES
};

View File

@ -33,24 +33,30 @@ simdjson_really_inline simdjson_result<object> document::start_or_resume_object(
return object::resume(resume_value_iterator());
}
}
simdjson_really_inline simdjson_result<value> document::get_value_unsafe() noexcept {
simdjson_really_inline simdjson_result<value> document::get_value() noexcept {
// Make sure we start any arrays or objects before returning, so that start_root_<object/array>()
// gets called.
iter.assert_at_document_depth();
switch (*iter.peek()) {
case '[': {
array result;
SIMDJSON_TRY( get_array().get(result) );
iter._depth = 1 ; /* undoing the potential increment so we go back at the doc depth.*/
iter.assert_at_document_depth();
return value(result.iter);
}
case '{': {
object result;
SIMDJSON_TRY( get_object().get(result) );
iter._depth = 1 ; /* undoing the potential increment so we go back at the doc depth.*/
iter.assert_at_document_depth();
return value(result.iter);
}
default:
// TODO it is still wrong to convert this to a value! get_root_bool / etc. will not be
// called if you do this.
return value(get_root_value_iterator());
// Unfortunately, scalar documents are a special case in simdjson and they cannot
// be safely converted to value instances.
return SCALAR_DOCUMENT_AS_VALUE;
// return value(get_root_value_iterator());
}
}
simdjson_really_inline simdjson_result<array> document::get_array() & noexcept {
@ -100,6 +106,7 @@ template<> simdjson_really_inline simdjson_result<double> document::get() & noex
template<> simdjson_really_inline simdjson_result<uint64_t> document::get() & noexcept { return get_uint64(); }
template<> simdjson_really_inline simdjson_result<int64_t> document::get() & noexcept { return get_int64(); }
template<> simdjson_really_inline simdjson_result<bool> document::get() & noexcept { return get_bool(); }
template<> simdjson_really_inline simdjson_result<value> document::get() & noexcept { return get_value(); }
template<> simdjson_really_inline simdjson_result<raw_json_string> document::get() && noexcept { return get_raw_json_string(); }
template<> simdjson_really_inline simdjson_result<std::string_view> document::get() && noexcept { return get_string(); }
@ -107,6 +114,7 @@ template<> simdjson_really_inline simdjson_result<double> document::get() && noe
template<> simdjson_really_inline simdjson_result<uint64_t> document::get() && noexcept { return std::forward<document>(*this).get_uint64(); }
template<> simdjson_really_inline simdjson_result<int64_t> document::get() && noexcept { return std::forward<document>(*this).get_int64(); }
template<> simdjson_really_inline simdjson_result<bool> document::get() && noexcept { return std::forward<document>(*this).get_bool(); }
template<> simdjson_really_inline simdjson_result<value> document::get() && noexcept { return get_value(); }
template<typename T> simdjson_really_inline error_code document::get(T &out) & noexcept {
return get<T>().get(out);
@ -124,12 +132,17 @@ simdjson_really_inline document::operator double() noexcept(false) { return get_
simdjson_really_inline document::operator std::string_view() noexcept(false) { return get_string(); }
simdjson_really_inline document::operator raw_json_string() noexcept(false) { return get_raw_json_string(); }
simdjson_really_inline document::operator bool() noexcept(false) { return get_bool(); }
simdjson_really_inline document::operator value() noexcept(false) { return get_value(); }
#endif
simdjson_really_inline simdjson_result<size_t> document::count_elements() & noexcept {
auto a = get_array();
simdjson_result<size_t> answer = a.count_elements();
/* If there was an array, we are now left pointing at its first element. */
if(answer.error() == SUCCESS) { iter._depth -= 1 ; /* undoing the increment so we go back at the doc depth.*/ }
if(answer.error() == SUCCESS) {
iter._depth = 1 ; /* undoing the increment so we go back at the doc depth.*/
iter.assert_at_document_depth();
}
return answer;
}
simdjson_really_inline simdjson_result<value> document::at(size_t index) & noexcept {
@ -184,6 +197,13 @@ simdjson_really_inline simdjson_result<json_type> document::type() noexcept {
return get_root_value_iterator().type();
}
simdjson_really_inline simdjson_result<bool> document::scalar() noexcept {
json_type this_type;
auto error = type().get(this_type);
if(error) { return error; }
return ! ((this_type == json_type::array) || (this_type == json_type::object));
}
simdjson_really_inline simdjson_result<std::string_view> document::raw_json_token() noexcept {
auto _iter = get_root_value_iterator();
return std::string_view(reinterpret_cast<const char*>(_iter.peek_start()), _iter.peek_start_length());
@ -192,7 +212,7 @@ simdjson_really_inline simdjson_result<std::string_view> document::raw_json_toke
simdjson_really_inline simdjson_result<value> document::at_pointer(std::string_view json_pointer) noexcept {
rewind(); // Rewind the document each time at_pointer is called
if (json_pointer.empty()) {
return this->get_value_unsafe();
return this->get_value();
}
json_type t;
SIMDJSON_TRY(type().get(t));
@ -305,6 +325,10 @@ simdjson_really_inline simdjson_result<bool> simdjson_result<SIMDJSON_IMPLEMENTA
if (error()) { return error(); }
return first.get_bool();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document>::get_value() noexcept {
if (error()) { return error(); }
return first.get_value();
}
simdjson_really_inline bool simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document>::is_null() noexcept {
if (error()) { return error(); }
return first.is_null();
@ -348,6 +372,11 @@ simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::json_t
return first.type();
}
simdjson_really_inline simdjson_result<bool> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document>::scalar() noexcept {
if (error()) { return error(); }
return first.scalar();
}
#if SIMDJSON_EXCEPTIONS
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document>::operator SIMDJSON_IMPLEMENTATION::ondemand::array() & noexcept(false) {
if (error()) { throw simdjson_error(error()); }
@ -381,6 +410,10 @@ simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::docume
if (error()) { throw simdjson_error(error()); }
return first;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document>::operator SIMDJSON_IMPLEMENTATION::ondemand::value() noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
#endif
simdjson_really_inline simdjson_result<std::string_view> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document>::raw_json_token() noexcept {
@ -412,6 +445,7 @@ simdjson_really_inline simdjson_result<double> document_reference::get_double()
simdjson_really_inline simdjson_result<std::string_view> document_reference::get_string() noexcept { return doc->get_string(); }
simdjson_really_inline simdjson_result<raw_json_string> document_reference::get_raw_json_string() noexcept { return doc->get_raw_json_string(); }
simdjson_really_inline simdjson_result<bool> document_reference::get_bool() noexcept { return doc->get_bool(); }
simdjson_really_inline simdjson_result<value> document_reference::get_value() noexcept { return doc->get_value(); }
simdjson_really_inline bool document_reference::is_null() noexcept { return doc->is_null(); }
#if SIMDJSON_EXCEPTIONS
@ -423,6 +457,7 @@ simdjson_really_inline document_reference::operator double() noexcept(false) { r
simdjson_really_inline document_reference::operator std::string_view() noexcept(false) { return std::string_view(*doc); }
simdjson_really_inline document_reference::operator raw_json_string() noexcept(false) { return raw_json_string(*doc); }
simdjson_really_inline document_reference::operator bool() noexcept(false) { return bool(*doc); }
simdjson_really_inline document_reference::operator value() noexcept(false) { return value(*doc); }
#endif
simdjson_really_inline simdjson_result<size_t> document_reference::count_elements() & noexcept { return doc->count_elements(); }
simdjson_really_inline simdjson_result<value> document_reference::at(size_t index) & noexcept { return doc->at(index); }
@ -434,8 +469,8 @@ simdjson_really_inline simdjson_result<value> document_reference::operator[](std
simdjson_really_inline simdjson_result<value> document_reference::operator[](const char *key) & noexcept { return (*doc)[key]; }
simdjson_really_inline simdjson_result<value> document_reference::find_field_unordered(std::string_view key) & noexcept { return doc->find_field_unordered(key); }
simdjson_really_inline simdjson_result<value> document_reference::find_field_unordered(const char *key) & noexcept { return doc->find_field_unordered(key); }
simdjson_really_inline simdjson_result<json_type> document_reference::type() noexcept { return doc->type(); }
simdjson_really_inline simdjson_result<bool> document_reference::scalar() noexcept { return doc->scalar(); }
simdjson_really_inline simdjson_result<std::string_view> document_reference::raw_json_token() noexcept { return doc->raw_json_token(); }
simdjson_really_inline simdjson_result<value> document_reference::at_pointer(std::string_view json_pointer) noexcept { return doc->at_pointer(json_pointer); }
simdjson_really_inline simdjson_result<std::string_view> document_reference::raw_json() noexcept { return doc->raw_json();}
@ -528,6 +563,10 @@ simdjson_really_inline simdjson_result<bool> simdjson_result<SIMDJSON_IMPLEMENTA
if (error()) { return error(); }
return first.get_bool();
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::get_value() noexcept {
if (error()) { return error(); }
return first.get_value();
}
simdjson_really_inline bool simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::is_null() noexcept {
if (error()) { return error(); }
return first.is_null();
@ -536,7 +575,10 @@ simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::json_t
if (error()) { return error(); }
return first.type();
}
simdjson_really_inline simdjson_result<bool> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::scalar() noexcept {
if (error()) { return error(); }
return first.scalar();
}
#if SIMDJSON_EXCEPTIONS
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator SIMDJSON_IMPLEMENTATION::ondemand::array() & noexcept(false) {
if (error()) { throw simdjson_error(error()); }
@ -570,6 +612,10 @@ simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::docume
if (error()) { throw simdjson_error(error()); }
return first;
}
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::operator SIMDJSON_IMPLEMENTATION::ondemand::value() noexcept(false) {
if (error()) { throw simdjson_error(error()); }
return first;
}
#endif
simdjson_really_inline simdjson_result<std::string_view> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::document_reference>::raw_json_token() noexcept {

View File

@ -115,6 +115,14 @@ public:
* @returns INCORRECT_TYPE if the JSON value is not true or false.
*/
simdjson_really_inline simdjson_result<bool> get_bool() noexcept;
/**
* Cast this JSON value to a value when the document is an object or an array.
*
* @returns A value if the document is a JSON array or object.
* @returns SCALAR_DOCUMENT_AS_VALUE error if the document is a scalar (see scalar() function).
*/
simdjson_really_inline simdjson_result<value> get_value() noexcept;
/**
* Checks if this JSON value is null.
*
@ -148,7 +156,9 @@ public:
/**
* Get this value as the given type.
*
* Supported types: object, array, raw_json_string, string_view, uint64_t, int64_t, double, bool
* Supported types: object, array, raw_json_string, string_view, uint64_t, int64_t, double, bool, value
*
* Be mindful that the document instance must remain in scope while you are accessing object, array and value instances.
*
* @param out This is set to a value of the given type, parsed from the JSON. If there is an error, this may not be initialized.
* @returns INCORRECT_TYPE If the JSON value is not an object.
@ -220,6 +230,13 @@ public:
* @exception simdjson_error(INCORRECT_TYPE) if the JSON value is not true or false.
*/
simdjson_really_inline operator bool() noexcept(false);
/**
* Cast this JSON value to a value.
*
* @returns A value instance.
* @exception simdjson_error(SCALAR_DOCUMENT_AS_VALUE) if the document is a scalar (see scalar() function).
*/
simdjson_really_inline operator value() noexcept(false);
#endif
/**
* This method scans the array and counts the number of elements.
@ -316,6 +333,15 @@ public:
*/
simdjson_really_inline simdjson_result<json_type> type() noexcept;
/**
* Checks whether the document is a scalar (string, number, null, Boolean).
* Returns false when it is an array or object.
*
* @returns true if the type is string, number, null, Boolean
* @error TAPE_ERROR when the JSON value is a bad token like "}" "," or "alse".
*/
simdjson_really_inline simdjson_result<bool> scalar() noexcept;
/**
* Get the raw JSON for this token.
*
@ -380,6 +406,7 @@ public:
* - INDEX_OUT_OF_BOUNDS if an array index is larger than an array length
* - INCORRECT_TYPE if a non-integer is used to access an array
* - INVALID_JSON_POINTER if the JSON pointer is invalid and cannot be parsed
* - SCALAR_DOCUMENT_AS_VALUE if the json_pointer is empty and the document is a scalar (see scalar() function).
*/
simdjson_really_inline simdjson_result<value> at_pointer(std::string_view json_pointer) noexcept;
/**
@ -399,7 +426,6 @@ protected:
simdjson_really_inline value_iterator resume_value_iterator() noexcept;
simdjson_really_inline value_iterator get_root_value_iterator() noexcept;
simdjson_really_inline simdjson_result<value> get_value_unsafe() noexcept;
simdjson_really_inline simdjson_result<object> start_or_resume_object() noexcept;
static simdjson_really_inline document start(ondemand::json_iterator &&iter) noexcept;
@ -437,6 +463,8 @@ public:
simdjson_really_inline simdjson_result<std::string_view> get_string() noexcept;
simdjson_really_inline simdjson_result<raw_json_string> get_raw_json_string() noexcept;
simdjson_really_inline simdjson_result<bool> get_bool() noexcept;
simdjson_really_inline simdjson_result<value> get_value() noexcept;
simdjson_really_inline bool is_null() noexcept;
simdjson_really_inline simdjson_result<std::string_view> raw_json() noexcept;
simdjson_really_inline operator document&() const noexcept;
@ -450,6 +478,7 @@ public:
simdjson_really_inline operator std::string_view() noexcept(false);
simdjson_really_inline operator raw_json_string() noexcept(false);
simdjson_really_inline operator bool() noexcept(false);
simdjson_really_inline operator value() noexcept(false);
#endif
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
simdjson_really_inline simdjson_result<value> at(size_t index) & noexcept;
@ -463,6 +492,8 @@ public:
simdjson_really_inline simdjson_result<value> find_field_unordered(const char *key) & noexcept;
simdjson_really_inline simdjson_result<json_type> type() noexcept;
simdjson_really_inline simdjson_result<bool> scalar() noexcept;
simdjson_really_inline simdjson_result<std::string_view> raw_json_token() noexcept;
simdjson_really_inline simdjson_result<value> at_pointer(std::string_view json_pointer) noexcept;
private:
@ -491,6 +522,7 @@ public:
simdjson_really_inline simdjson_result<std::string_view> get_string() noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::raw_json_string> get_raw_json_string() noexcept;
simdjson_really_inline simdjson_result<bool> get_bool() noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> get_value() noexcept;
simdjson_really_inline bool is_null() noexcept;
template<typename T> simdjson_really_inline simdjson_result<T> get() & noexcept;
@ -508,6 +540,7 @@ public:
simdjson_really_inline operator std::string_view() noexcept(false);
simdjson_really_inline operator SIMDJSON_IMPLEMENTATION::ondemand::raw_json_string() noexcept(false);
simdjson_really_inline operator bool() noexcept(false);
simdjson_really_inline operator SIMDJSON_IMPLEMENTATION::ondemand::value() noexcept(false);
#endif
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> at(size_t index) & noexcept;
@ -519,9 +552,8 @@ public:
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> operator[](const char *key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field_unordered(std::string_view key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field_unordered(const char *key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::json_type> type() noexcept;
simdjson_really_inline simdjson_result<bool> scalar() noexcept;
/** @copydoc simdjson_really_inline std::string_view document::raw_json_token() const noexcept */
simdjson_really_inline simdjson_result<std::string_view> raw_json_token() noexcept;
@ -550,6 +582,7 @@ public:
simdjson_really_inline simdjson_result<std::string_view> get_string() noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::raw_json_string> get_raw_json_string() noexcept;
simdjson_really_inline simdjson_result<bool> get_bool() noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> get_value() noexcept;
simdjson_really_inline bool is_null() noexcept;
#if SIMDJSON_EXCEPTIONS
@ -561,6 +594,7 @@ public:
simdjson_really_inline operator std::string_view() noexcept(false);
simdjson_really_inline operator SIMDJSON_IMPLEMENTATION::ondemand::raw_json_string() noexcept(false);
simdjson_really_inline operator bool() noexcept(false);
simdjson_really_inline operator SIMDJSON_IMPLEMENTATION::ondemand::value() noexcept(false);
#endif
simdjson_really_inline simdjson_result<size_t> count_elements() & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> at(size_t index) & noexcept;
@ -572,8 +606,8 @@ public:
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> operator[](const char *key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field_unordered(std::string_view key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value> find_field_unordered(const char *key) & noexcept;
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::json_type> type() noexcept;
simdjson_really_inline simdjson_result<bool> scalar() noexcept;
/** @copydoc simdjson_really_inline std::string_view document_reference::raw_json_token() const noexcept */
simdjson_really_inline simdjson_result<std::string_view> raw_json_token() noexcept;

View File

@ -153,6 +153,10 @@ simdjson_really_inline token_position json_iterator::root_position() const noexc
return _root;
}
simdjson_really_inline void json_iterator::assert_at_document_depth() const noexcept {
SIMDJSON_ASSUME( _depth == 1 );
}
simdjson_really_inline void json_iterator::assert_at_root() const noexcept {
SIMDJSON_ASSUME( _depth == 1 );
#ifndef SIMDJSON_CLANG_VISUAL_STUDIO

View File

@ -88,9 +88,12 @@ public:
* Get the root value iterator
*/
simdjson_really_inline token_position root_position() const noexcept;
/**
* Assert if the iterator is not at the start
* Assert that we are at the document depth (== 1)
*/
simdjson_really_inline void assert_at_document_depth() const noexcept;
/**
* Assert that we are at the root of the document
*/
simdjson_really_inline void assert_at_root() const noexcept;

View File

@ -123,6 +123,9 @@ public:
* iteration to ensure intermediate buffers can be accessed. Any document must be destroyed before
* you call parse() again or destroy the parser.
*
* The ondemand::document instance holds the iterator. The document must remain in scope
* while you are accessing instances of ondemand::value, ondemand::object, ondemand::array.
*
* ### REQUIRED: Buffer Padding
*
* The buffer must have at least SIMDJSON_PADDING extra allocated bytes. It does not matter what

View File

@ -144,6 +144,13 @@ simdjson_really_inline simdjson_result<json_type> value::type() noexcept {
return iter.type();
}
simdjson_really_inline simdjson_result<bool> value::scalar() noexcept {
json_type this_type;
auto error = type().get(this_type);
if(error) { return error; }
return ! ((this_type == json_type::array) || (this_type == json_type::object));
}
simdjson_really_inline std::string_view value::raw_json_token() noexcept {
return std::string_view(reinterpret_cast<const char*>(iter.peek_start()), iter.peek_start_length());
}
@ -298,7 +305,10 @@ simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::json_t
if (error()) { return error(); }
return first.type();
}
simdjson_really_inline simdjson_result<bool> simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value>::scalar() noexcept {
if (error()) { return error(); }
return first.scalar();
}
#if SIMDJSON_EXCEPTIONS
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::value>::operator SIMDJSON_IMPLEMENTATION::ondemand::array() noexcept(false) {
if (error()) { throw simdjson_error(error()); }

View File

@ -321,6 +321,14 @@ public:
*/
simdjson_really_inline simdjson_result<json_type> type() noexcept;
/**
* Checks whether the value is a scalar (string, number, null, Boolean).
* Returns false when it is an array or object.
*
* @returns true if the type is string, number, null, Boolean
* @error TAPE_ERROR when the JSON value is a bad token like "}" "," or "alse".
*/
simdjson_really_inline simdjson_result<bool> scalar() noexcept;
/**
* Get the raw JSON for this token.
*
@ -536,6 +544,7 @@ public:
* let it throw an exception).
*/
simdjson_really_inline simdjson_result<SIMDJSON_IMPLEMENTATION::ondemand::json_type> type() noexcept;
simdjson_really_inline simdjson_result<bool> scalar() noexcept;
/** @copydoc simdjson_really_inline std::string_view value::raw_json_token() const noexcept */
simdjson_really_inline simdjson_result<std::string_view> raw_json_token() noexcept;

View File

@ -31,7 +31,8 @@ namespace internal {
{ PARSER_IN_USE, "Cannot parse a new document while a document is still in use." },
{ OUT_OF_ORDER_ITERATION, "Objects and arrays can only be iterated when they are first encountered." },
{ INSUFFICIENT_PADDING, "simdjson requires the input JSON string to have at least SIMDJSON_PADDING extra bytes allocated, beyond the string's length. Consider using the simdjson::padded_string class if needed." },
{ INCOMPLETE_ARRAY_OR_OBJECT, "JSON document ended early in the middle of an object or array." }
{ INCOMPLETE_ARRAY_OR_OBJECT, "JSON document ended early in the middle of an object or array." },
{ SCALAR_DOCUMENT_AS_VALUE, "A JSON document made of a scalar (number, Boolean, null or string) is treated as a value. Use get_bool(), get_double(), etc. on the document instead. "}
}; // error_messages[]
} // namespace internal

View File

@ -174,13 +174,12 @@ namespace array_tests {
bool iterate_array_count() {
TEST_START();
const auto json = R"([ 1, 10, 100 ])"_padded;
const auto badjson = R"([ 1, 10 100 ])"_padded;
const vector<uint64_t> expected_value = { 1, 10, 100 };
SUBTEST("ondemand::count_elements", test_ondemand_doc(json, [&](auto doc_result) {
ondemand::array array;
ASSERT_RESULT( doc_result.type(), json_type::array );
ASSERT_SUCCESS( doc_result.get(array) );
ASSERT_SUCCESS( doc_result.get_array().get(array) );
size_t count;
ASSERT_SUCCESS( array.count_elements().get(count) );
ASSERT_EQUAL(count, expected_value.size());
@ -207,6 +206,40 @@ namespace array_tests {
TEST_SUCCEED();
}
bool iterate_empty_array_count() {
TEST_START();
const auto json = R"([])"_padded;
SUBTEST("ondemand::count_elements", test_ondemand_doc(json, [&](auto doc_result) {
ondemand::array array;
ASSERT_RESULT( doc_result.type(), json_type::array );
ASSERT_SUCCESS( doc_result.get_array().get(array) );
size_t count;
ASSERT_SUCCESS( array.count_elements().get(count) );
ASSERT_EQUAL(count, 0);
return true;
}));
SUBTEST("ondemand::count_elements_and_decode", test_ondemand_doc(json, [&](auto doc_result) {
ondemand::array array;
ASSERT_RESULT( doc_result.type(), json_type::array );
ASSERT_SUCCESS( doc_result.get(array) );
size_t count;
ASSERT_SUCCESS( array.count_elements().get(count) );
ASSERT_EQUAL(count, 0);
size_t i = 0;
std::vector<uint64_t> receiver(count);
for (auto value : array) {
uint64_t actual;
ASSERT_SUCCESS( value.get(actual) );
i++;
}
ASSERT_EQUAL(i, 0);
return true;
}));
TEST_SUCCEED();
}
bool iterate_bad_array_count() {
TEST_START();
const auto badjson = R"([ 1, 10 100 ])"_padded;
@ -606,6 +639,7 @@ namespace array_tests {
bool run() {
return
iterate_empty_array_count() &&
iterate_sub_array_count() &&
iterate_complex_array_count() &&
iterate_bad_array_count() &&

@ -40,6 +40,7 @@ namespace json_pointer_tests {
bool run_success_test(const padded_string & json,std::string_view json_pointer,std::string expected) {
TEST_START();
std::cout <<":"<< json_pointer<<std::endl;
ondemand::parser parser;
ondemand::document doc;
ondemand::value val;
@ -48,6 +49,33 @@ namespace json_pointer_tests {
ASSERT_SUCCESS(doc.at_pointer(json_pointer).get(val));
ASSERT_SUCCESS(simdjson::to_json_string(val).get(actual));
ASSERT_EQUAL(actual,expected);
// We want to see if the value is usable besides to_json_string.
ASSERT_SUCCESS(parser.iterate(json).get(doc));
ASSERT_SUCCESS(doc.at_pointer(json_pointer).get(val));
ondemand::json_type type;
ASSERT_SUCCESS(val.type().get(type));
switch (type) {
case ondemand::json_type::array:
ASSERT_SUCCESS(val.get_array().error());
break;
case ondemand::json_type::object:
ASSERT_SUCCESS(val.get_object().error());
break;
case ondemand::json_type::number:
ASSERT_SUCCESS(val.get_double().error());
break;
case ondemand::json_type::string:
ASSERT_SUCCESS(val.get_string().error());
break;
case ondemand::json_type::boolean:
ASSERT_SUCCESS(val.get_bool().error());
break;
case ondemand::json_type::null:
ASSERT_TRUE(val.is_null());
break;
default:
TEST_FAIL("unexpected type");
}
TEST_SUCCEED();
}
@ -101,6 +129,58 @@ namespace json_pointer_tests {
TEST_SUCCEED();
}
bool document_as_scalar() {
TEST_START();
auto number_json = R"( 1 )"_padded;
auto null_json = R"( null )"_padded;
auto string_json = R"( "a" )"_padded;
auto true_json = R"( true )"_padded;
auto false_json = R"( false )"_padded;
auto object_json = R"( {} )"_padded;
auto array_json = R"( [] )"_padded;
ondemand::parser parser;
ondemand::document doc;
ondemand::value val;
bool is_scalar;
std::cout << " checking number"<< std::endl;
ASSERT_SUCCESS(parser.iterate(number_json).get(doc));
ASSERT_SUCCESS(doc.scalar().get(is_scalar));
ASSERT_TRUE(is_scalar);
ASSERT_ERROR(doc.at_pointer("").get(val), simdjson::SCALAR_DOCUMENT_AS_VALUE);
std::cout << " checking null"<< std::endl;
ASSERT_SUCCESS(parser.iterate(null_json).get(doc));
ASSERT_SUCCESS(doc.scalar().get(is_scalar));
ASSERT_TRUE(is_scalar);
ASSERT_ERROR(doc.at_pointer("").get(val), simdjson::SCALAR_DOCUMENT_AS_VALUE);
std::cout << " checking string"<< std::endl;
ASSERT_SUCCESS(parser.iterate(string_json).get(doc));
ASSERT_SUCCESS(doc.scalar().get(is_scalar));
ASSERT_TRUE(is_scalar);
ASSERT_ERROR(doc.at_pointer("").get(val), simdjson::SCALAR_DOCUMENT_AS_VALUE);
std::cout << " checking false"<< std::endl;
ASSERT_SUCCESS(parser.iterate(false_json).get(doc));
ASSERT_SUCCESS(doc.scalar().get(is_scalar));
ASSERT_TRUE(is_scalar);
ASSERT_ERROR(doc.at_pointer("").get(val), simdjson::SCALAR_DOCUMENT_AS_VALUE);
std::cout << " checking true"<< std::endl;
ASSERT_SUCCESS(parser.iterate(true_json).get(doc));
ASSERT_SUCCESS(doc.scalar().get(is_scalar));
ASSERT_TRUE(is_scalar);
ASSERT_ERROR(doc.at_pointer("").get(val), simdjson::SCALAR_DOCUMENT_AS_VALUE);
std::cout << " checking object"<< std::endl;
ASSERT_SUCCESS(parser.iterate(object_json).get(doc));
ASSERT_SUCCESS(doc.scalar().get(is_scalar));
ASSERT_FALSE(is_scalar);
ASSERT_SUCCESS(doc.at_pointer("").get(val));
std::cout << " checking array"<< std::endl;
ASSERT_SUCCESS(parser.iterate(array_json).get(doc));
ASSERT_SUCCESS(doc.scalar().get(is_scalar));
ASSERT_FALSE(is_scalar);
ASSERT_SUCCESS(doc.at_pointer("").get(val));
TEST_SUCCEED();
}
bool many_json_pointers() {
TEST_START();
auto cars_json = R"( [
@ -275,6 +355,17 @@ namespace json_pointer_tests {
json_pointer_invalidation() &&
demo_test() &&
demo_relative_path() &&
run_success_test(TEST_RFC_JSON,"/foo",R"(["bar", "baz"])") &&
run_success_test(TEST_RFC_JSON,"/foo/0",R"("bar")") &&
run_success_test(TEST_RFC_JSON,"/",R"(0)") &&
run_success_test(TEST_RFC_JSON,"/a~1b",R"(1)") &&
run_success_test(TEST_RFC_JSON,"/c%d",R"(2)") &&
run_success_test(TEST_RFC_JSON,"/e^f",R"(3)") &&
run_success_test(TEST_RFC_JSON,"/g|h",R"(4)") &&
run_success_test(TEST_RFC_JSON,R"(/i\\j)",R"(5)") &&
run_success_test(TEST_RFC_JSON,R"(/k\"l)",R"(6)") &&
run_success_test(TEST_RFC_JSON,"/ ",R"(7)") &&
run_success_test(TEST_RFC_JSON,"/m~0n",R"(8)") &&
run_success_test(TEST_RFC_JSON,"",R"({
"foo": ["bar", "baz"],
"": 0,
@ -287,17 +378,6 @@ namespace json_pointer_tests {
" ": 7,
"m~n": 8
})") &&
run_success_test(TEST_RFC_JSON,"/foo",R"(["bar", "baz"])") &&
run_success_test(TEST_RFC_JSON,"/foo/0",R"("bar")") &&
run_success_test(TEST_RFC_JSON,"/",R"(0)") &&
run_success_test(TEST_RFC_JSON,"/a~1b",R"(1)") &&
run_success_test(TEST_RFC_JSON,"/c%d",R"(2)") &&
run_success_test(TEST_RFC_JSON,"/e^f",R"(3)") &&
run_success_test(TEST_RFC_JSON,"/g|h",R"(4)") &&
run_success_test(TEST_RFC_JSON,R"(/i\\j)",R"(5)") &&
run_success_test(TEST_RFC_JSON,R"(/k\"l)",R"(6)") &&
run_success_test(TEST_RFC_JSON,"/ ",R"(7)") &&
run_success_test(TEST_RFC_JSON,"/m~0n",R"(8)") &&
run_success_test(TEST_JSON, "", R"({
"/~01abc": [
0,
@ -360,6 +440,7 @@ namespace json_pointer_tests {
run_failure_test(TEST_JSON, "/~1~001abc/", INVALID_JSON_POINTER) &&
run_failure_test(TEST_JSON, "/~1~001abc/-", INDEX_OUT_OF_BOUNDS) &&
many_json_pointers() &&
document_as_scalar() &&
true;
}
} // json_pointer_tests

@ -5,6 +5,23 @@ using namespace simdjson;
namespace misc_tests {
using namespace std;
bool test_get_value() {
TEST_START();
ondemand::parser parser;
padded_string json = R"({"a":[[1,null,3.0],["a","b",true],[10000000000,2,3]]})"_padded;
ondemand::document doc;
ASSERT_SUCCESS(parser.iterate(json).get(doc));
ondemand::value val;
ASSERT_SUCCESS(doc.get_value().get(val));
ondemand::object obj;
ASSERT_SUCCESS(val.get_object().get(obj));
ondemand::array arr;
ASSERT_SUCCESS(obj["a"].get_array().get(arr));
size_t count;
ASSERT_SUCCESS(arr.count_elements().get(count));
ASSERT_EQUAL(3,count);
TEST_SUCCEED();
}
bool issue1661a() {
TEST_START();
@ -348,6 +365,7 @@ namespace misc_tests {
bool run() {
return
test_get_value() &&
issue1660_with_uint64() &&
issue1660_with_int64() &&
issue1660_with_double() &&

@ -5,8 +5,78 @@ using namespace std;
using namespace simdjson;
using error_code=simdjson::error_code;
#if SIMDJSON_EXCEPTIONS
void recursive_print_json(ondemand::value element) {
bool add_comma;
switch (element.type()) {
case ondemand::json_type::array:
cout << "[";
add_comma = false;
for (auto child : element.get_array()) {
if (add_comma) {
cout << ",";
}
// We need the call to value() to get
// an ondemand::value type.
recursive_print_json(child.value());
add_comma = true;
}
cout << "]";
break;
case ondemand::json_type::object:
cout << "{";
add_comma = false;
for (auto field : element.get_object()) {
if (add_comma) {
cout << ",";
}
// key() returns the key as it appears in the raw
// JSON document. If we want the unescaped key,
// we should call field.unescaped_key().
cout << "\"" << field.key() << "\": ";
recursive_print_json(field.value());
add_comma = true;
}
cout << "}\n";
break;
case ondemand::json_type::number:
// assume it fits in a double
cout << element.get_double();
break;
case ondemand::json_type::string:
// get_string() would return the unescaped string, but
// we are happy with the raw (still escaped) string.
cout << "\"" << element.get_raw_json_string() << "\"";
break;
case ondemand::json_type::boolean:
cout << element.get_bool();
break;
case ondemand::json_type::null:
cout << "null";
break;
}
}
bool basics_treewalk() {
padded_string json[3] = {R"( [
{ "make": "Toyota", "model": "Camry", "year": 2018, "tire_pressure": [ 40.1, 39.9, 37.7, 40.4 ] },
{ "make": "Kia", "model": "Soul", "year": 2012, "tire_pressure": [ 30.1, 31.0, 28.6, 28.7 ] },
{ "make": "Toyota", "model": "Tercel", "year": 1999, "tire_pressure": [ 29.8, 30.0, 30.2, 30.5 ] }
] )"_padded, R"( {"key":"value"} )"_padded, "[12,3]"_padded};
ondemand::parser parser;
for(size_t i = 0 ; i < 3; i++) {
ondemand::document doc = parser.iterate(json[i]);
ondemand::value val = doc;
recursive_print_json(val);
std::cout << std::endl;
}
return true;
}
bool basics_1() {
TEST_START();
@ -656,6 +726,9 @@ bool test_load_example() {
return identifier == 1234;
}
int main() {
#if SIMDJSON_EXCEPTIONS
basics_treewalk();
#endif
if (
true
#if SIMDJSON_EXCEPTIONS

@ -89,6 +89,13 @@ simdjson_really_inline bool assert_true(bool value, const char *operation = "res
}
return true;
}
simdjson_really_inline bool assert_false(bool value, const char *operation = "result") {
if (value) {
std::cerr << "FAIL: " << operation << " was true!" << std::endl;
return false;
}
return true;
}
template<typename T>
simdjson_really_inline bool assert_iterate_error(T &arr, simdjson::error_code expected, const char *operation = "result") {
int count = 0;
@ -107,6 +114,7 @@ simdjson_really_inline bool assert_iterate_error(T &arr, simdjson::error_code ex
#define ASSERT_SUCCESS(ACTUAL) do { if (!::assert_success((ACTUAL), #ACTUAL)) { return false; } } while (0);
#define ASSERT_ERROR(ACTUAL, EXPECTED) do { if (!::assert_error ((ACTUAL), (EXPECTED), #ACTUAL)) { return false; } } while (0);
#define ASSERT_TRUE(ACTUAL) do { if (!::assert_true ((ACTUAL), #ACTUAL)) { return false; } } while (0);
#define ASSERT_FALSE(ACTUAL) do { if (!::assert_false ((ACTUAL), #ACTUAL)) { return false; } } while (0);
#define ASSERT(ACTUAL, MESSAGE) do { if (!::assert_true ((ACTUAL), (MESSAGE))) { return false; } } while (0);
#define ASSERT_ITERATE_ERROR(ACTUAL, EXPECTED) do { if (!::assert_iterate_error((ACTUAL), (EXPECTED), #ACTUAL)) { return false; } } while (0);
#define RUN_TEST(ACTUAL) do { if (!(ACTUAL)) { return false; } } while (0);
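
The new `ASSERT_FALSE` macro follows the same early-return pattern as the other test macros: the helper reports the failure, and the macro makes the enclosing test function return false. Below is a minimal standalone sketch of that pattern. It differs from the diff in two ways: it drops the `::` qualifier and the trailing semicolon after `while (0)`, so the macro expands to a single statement. The test function `expect_not_scalar` is hypothetical.

```cpp
#include <iostream>

// Report the failure and return false when the value is unexpectedly true.
inline bool assert_false(bool value, const char *operation = "result") {
  if (value) {
    std::cerr << "FAIL: " << operation << " was true!" << std::endl;
    return false;
  }
  return true;
}

// Early-return macro. No trailing semicolon after while (0), so the caller's
// own semicolon terminates the statement (safe inside if/else).
#define ASSERT_FALSE(ACTUAL) do { if (!assert_false((ACTUAL), #ACTUAL)) { return false; } } while (0)

// Hypothetical test body: succeeds only when the flag is false.
inline bool expect_not_scalar(bool is_scalar) {
  ASSERT_FALSE(is_scalar);
  return true;
}
```

Calling `expect_not_scalar(true)` prints a failure message to standard error and returns false; `expect_not_scalar(false)` returns true without printing anything.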