Fix for issue 1246. We document the relationship between parser instances and elements (#1250)
* Fix for issue 1246. * Adopting John's wording.
This commit is contained in:
parent
6a86ef5a7d
commit
a75c07065f
|
@ -112,6 +112,9 @@ dom::element doc = parser.parse("[1,2,3]"_padded); // parse a string, the _padde
|
|||
```
|
||||
|
||||
The parsed document resulting from the `parser.load` and `parser.parse` calls depends on the `parser` instance. Thus the `parser` instance must remain in scope. Furthermore, you must have at most one parsed document in play per `parser` instance.
|
||||
You cannot copy a `parser` instance, you may only move it.
|
||||
|
||||
If you need to keep a document around long term, you can keep or move the parser instance. Note that moving a parser instance, or keeping one in a movable data structure like vector or map, can cause any outstanding `element`, `object` or `array` instances to be invalidated. If you need to store a parser in a movable data structure, you should use a `std::unique_ptr` to avoid this invalidation(e.g., `std::unique_ptr<dom::parser> parser(new dom::parser{})`).
|
||||
|
||||
During the`load` or `parse` calls, neither the input file nor the input string are ever modified. After calling `load` or `parse`, the source (either a file or a string) can be safely discarded. All of the JSON data is stored in the `parser` instance. The parsed document is also immutable in simdjson: you do not modify it by accessing it.
|
||||
|
||||
|
|
|
@ -93,6 +93,9 @@ dom::element doc = parser.parse("[1,2,3]"_padded); // parse a string, the _padde
|
|||
```
|
||||
|
||||
The parsed document resulting from the `parser.load` and `parser.parse` calls depends on the `parser` instance. Thus the `parser` instance must remain in scope. Furthermore, you must have at most one parsed document in play per `parser` instance.
|
||||
You cannot copy a `parser` instance, you may only move it.
|
||||
|
||||
If you need to keep a document around long term, you can keep or move the parser instance. Note that moving a parser instance, or keeping one in a movable data structure like vector or map, can cause any outstanding `element`, `object` or `array` instances to be invalidated. If you need to store a parser in a movable data structure, you should use a `std::unique_ptr` to avoid this invalidation(e.g., `std::unique_ptr<dom::parser> parser(new dom::parser{})`).
|
||||
|
||||
During the`load` or `parse` calls, neither the input file nor the input string are ever modified. After calling `load` or `parse`, the source (either a file or a string) can be safely discarded. All of the JSON data is stored in the `parser` instance. The parsed document is also immutable in simdjson: you do not modify it by accessing it.
|
||||
|
||||
|
|
|
@ -29,6 +29,15 @@ static constexpr size_t DEFAULT_BATCH_SIZE = 1000000;
|
|||
* as well as memory for a single document. The parsed document is overwritten on each parse.
|
||||
*
|
||||
* This class cannot be copied, only moved, to avoid unintended allocations.
|
||||
*
|
||||
* @note Moving a parser instance may invalidate "dom::element" instances. If you need to
|
||||
* preserve both the "dom::element" instances and the parser, consider wrapping the parser
|
||||
* instance in a std::unique_ptr instance:
|
||||
*
|
||||
* std::unique_ptr<dom::parser> parser(new dom::parser{});
|
||||
* auto error = parser->load(f).get(root);
|
||||
*
|
||||
* You can then move std::unique_ptr safely.
|
||||
*
|
||||
* @note This is not thread safe: one parser cannot produce two documents at the same time!
|
||||
*/
|
||||
|
@ -79,6 +88,10 @@ public:
|
|||
* documents because it reuses the same buffers, but you *must* use the document before you
|
||||
* destroy the parser or call parse() again.
|
||||
*
|
||||
* Moving the parser instance is safe, but it invalidates the element instances. You may store
|
||||
* the parser instance without moving it by wrapping it inside an `unique_ptr` instance like
|
||||
* so: `std::unique_ptr<dom::parser> parser(new dom::parser{});`.
|
||||
*
|
||||
* ### Parser Capacity
|
||||
*
|
||||
* If the parser's current capacity is less than the file length, it will allocate enough capacity
|
||||
|
@ -108,6 +121,10 @@ public:
|
|||
* The JSON document still lives in the parser: this is the most efficient way to parse JSON
|
||||
* documents because it reuses the same buffers, but you *must* use the document before you
|
||||
* destroy the parser or call parse() again.
|
||||
*
|
||||
* Moving the parser instance is safe, but it invalidates the element instances. You may store
|
||||
* the parser instance without moving it by wrapping it inside an `unique_ptr` instance like
|
||||
* so: `std::unique_ptr<dom::parser> parser(new dom::parser{});`.
|
||||
*
|
||||
* ### REQUIRED: Buffer Padding
|
||||
*
|
||||
|
|
|
@ -334,7 +334,23 @@ namespace parse_api_tests {
|
|||
const padded_string BASIC_JSON = "[1,2,3]"_padded;
|
||||
const padded_string BASIC_NDJSON = "[1,2,3]\n[4,5,6]"_padded;
|
||||
const padded_string EMPTY_NDJSON = ""_padded;
|
||||
bool parser_moving_parser() {
|
||||
std::cout << "Running " << __func__ << std::endl;
|
||||
typedef std::tuple<std::string, std::unique_ptr<parser>,element> simdjson_tuple;
|
||||
std::vector<simdjson_tuple> results;
|
||||
std::vector<std::string> my_data = {"[1,2,3]", "[1,2,3]", "[1,2,3]"};
|
||||
|
||||
for (std::string s : my_data) {
|
||||
std::unique_ptr<dom::parser> parser(new dom::parser{});
|
||||
element root;
|
||||
ASSERT_SUCCESS( parser->parse(s).get(root) );
|
||||
results.emplace_back(s, std::move(parser), root);
|
||||
}
|
||||
for (auto &t : results) {
|
||||
std::cout << "reserialized: " << simdjson::to_string(std::get<2>(t)) << " ...\n";
|
||||
}
|
||||
return true;
|
||||
}
|
||||
bool parser_parse() {
|
||||
std::cout << "Running " << __func__ << std::endl;
|
||||
dom::parser parser;
|
||||
|
@ -502,7 +518,8 @@ namespace parse_api_tests {
|
|||
#endif
|
||||
|
||||
bool run() {
|
||||
return parser_parse() &&
|
||||
return parser_moving_parser() &&
|
||||
parser_parse() &&
|
||||
parser_parse_many() &&
|
||||
parser_parse_many_deprecated() &&
|
||||
parser_parse_many_empty() &&
|
||||
|
|
Loading…
Reference in New Issue