2246 Commits

Author SHA1 Message Date
Daniel Lemire e607958a7b
Update README.md 2021-06-24 14:13:26 -04:00
Daniel Lemire 0f068fb7c4
Update README.md 2021-06-24 14:13:04 -04:00
Daniel Lemire b991a4c7f3
We should be more generously testing in debug mode. (#1635) 2021-06-24 12:44:28 -04:00
Daniel Lemire 5b99a75ae1
count_elements did not like empty arrays. (#1631)
* count_elements did not like empty arrays.

* Minor cleaning.

* I don't understand.

* More cleaning.
2021-06-24 11:08:13 -04:00
John Keiser 1ba73b9e6b
Merge pull request #1629 from simdjson/jkeiser/vscode-config
Add .vscode workspace settings
2021-06-23 19:29:21 -06:00
Daniel Lemire cfe3adb599
Added tests over invalid documents. (#1626)
* Added tests over invalid documents.

* Tweaking.
2021-06-23 18:02:00 -04:00
John Keiser be6052bcdc Diff json using text diff 2021-06-23 12:28:03 -06:00
John Keiser ca8e21583c Add basic workspace configuration for vscode 2021-06-23 12:28:00 -06:00
Daniel Lemire 1c01fc35eb
This better documents invalidation. (#1625)
* This better documents invalidation.

* Tweak.
2021-06-22 11:33:25 -04:00
Nicolas Boyer ce38fe7bea
Add automatic rewind for at_pointer (#1624) 2021-06-21 15:17:24 -04:00
Daniel Lemire 6cd04aa858
Improving the documentation: escaping keys and "validate what you use" (#1621)
* Improving the documentation.

* Removing trailing spaces.
2021-06-18 09:59:20 -04:00
Nicolas Boyer 03f7396d50
Fix branches. (#1619) 2021-06-17 18:31:40 -04:00
Nicolas Boyer a4803d50c5
Add JSON Pointer for On Demand (#1615)
* Add working JSON pointer for array of atoms.

* Add working JSON pointer for object with key-atom pairs.

* Add first version of JSON pointer.

* Update tests (2 tests).

* Make tests exceptionless.

* Fix building issues.

* Add more tests. Add json_pointer validation in array-inl.h and object-inl.h and empty json_pointer in document-inl.h.

* Fix errors in tests.

* Review.

* Add missing comment.
2021-06-11 14:20:05 -04:00
Daniel Lemire 40cba172ed
Adds compile-test for Visual Studio + ARM and turns on developer mode throughout CI. (#1609)
* Adds compile-test for Visual Studio + ARM and turns on developer mode throughout CI.

* Correcting YAML error.

* Disabling google benchmarks under Windows ARM.

* Turning off exceptions under ARM.
2021-06-09 16:42:37 -04:00
Nicolas Boyer 3ba221eb8e
Add max_capacity setting for On Demand (#1610)
* First try at implementing max_capacity for simdjson_ondemand.

* Add max_capacity check.

* Update doc.

* Add one more example in doc for fixed capacity.

* Make allocate() public.

* Remove whitespace

* Found culprit whitespace.

* Duplicating variable.
2021-06-08 14:42:42 -04:00
Daniel Lemire 8bc12fe7cb
Update basics.md 2021-06-07 14:54:18 -04:00
Daniel Lemire 34bb2079e7
Adding documentation regarding versions. (#1611)
* Adding documentation regarding versions.

* Minor tweaks.
2021-06-07 14:19:23 -04:00
Daniel Lemire 7ca016652e
Update README.md 2021-06-07 11:27:48 -04:00
Daniel Lemire 13ab123daf
Testing issue 1607. (#1608) 2021-06-07 10:50:48 -04:00
Daniel Lemire f54bd69b5b
Update bug_report.md 2021-06-07 09:57:43 -04:00
Daniel Lemire 16e8db1f17
Adding 'count_elements' method. (#1577)
* Adding 'count_elements' method.

* Actually reporting errors.

* removing white space.

* Removing white space again.

* Adding an extra example.

* Prettier.

* Making the functionality more error-proof.

* Avoiding exceptions.

* Various fixes including extending count_elements to value types.

* Various fixes.

* Minor fixes.

* Correcting comment.

* Trimming white spaces.
2021-06-06 17:56:00 -04:00
Daniel Lemire eb0ae041e3
Verification and bug fix of issue 1511 (#1602)
* Verification and bug fix.

* Removing comment.

* Removing spaces.

* Guarding exceptions.

* Tweaking the test
2021-06-06 17:55:33 -04:00
John Keiser 893e613faa
Don't #include "simdjson.cpp" in tests (#1605) 2021-06-06 14:44:04 -04:00
Daniel Lemire 714f0ba222
This deletes most of our data files making the repository much smaller (#1582)
* This deletes most of our data files making the repository much smaller.

* Removing dead code.

* Various minor fixes.
2021-06-04 09:24:03 -04:00
Daniel Lemire 19c3b1315a
Rewind functionality. (#1539)
* Rewind functionality.

* Keeping just the document rewind.
2021-06-04 09:22:33 -04:00
Daniel Lemire f44a53271d
Documentation for issue 1562 (Accessing escaped key with on-demand API) (#1563)
* Documentation for issue 1562.

* Making exception-free.

* Improving wording.
2021-06-04 09:21:52 -04:00
Nicolas Boyer d90714e8df
Add RapidJSON and nlohmann_json SAX to partial_tweets benchmark (#1597)
* Add first working version of rapidjson_sax for partial tweets.

* Add cleaner and faster rapidjson_sax

* Add nlohmann_json_sax.

* Replace array of bool by bitsets.

* Replace strdup to copy string in rapidjson_sax.

* Change std::string_view assignment in rapidjson_sax.
2021-06-03 16:41:20 -04:00
Nicolas Boyer c7fd7353a8
Add RapidJSON and nlohmann_json SAX to top_tweet benchmark (#1599)
* Add rapidjson_sax.h and fix typo in rapidjson.h

* Add nlohmann_json_sax.h and add user key check for screen_name in rapidjson_sax

* Change std::string_view assignment for text and screen_name.
2021-06-03 16:41:00 -04:00
Nicolas Boyer 05f15d88b6
Add large_random/rapidjson_sax.h and large_random/nlohmann_json_sax.h. Clean up kostya/rapidjson_sax.h (add flags also) and kostya/nlohmann_json_sax.h (#1600) 2021-06-03 16:40:39 -04:00
Nicolas Boyer d7d81c7152
Add RapidJSON and nlohmann_json SAX to find_tweet benchmark (#1598)
* Add rapidjson_sax.h .

* Add nlohmann_json_sax.h . Fix typos distinct_user_id/nlohmann_json_sax.h, find_tweet/rapidjson.h and find_tweet/rapidjson_sax.h .

* Add extra check for id key when looking for find_id.
2021-06-03 12:43:54 -04:00
Nicolas Boyer 73b510225f
Add RapidJSON and nlohmann_json SAX to distinct_user_id benchmark (#1593)
* Add rapidjson_sax for distinct_user_id

* Add nlohmann_json_sax.h for distinct_user_id

* Add flags for RapidJSON.

* Fix revisions.

* Fix revisions again.

* Replace strcpy with memcpy. Increase performance fix.
2021-06-01 14:51:27 -04:00
Daniel Lemire 5d2eca2363 Correcting a couple of typographic errors. 2021-06-01 13:59:32 -04:00
Daniel Lemire 939b6b854a
This adds /permissive- to recent visual studio builds (#1596)
* This adds /permissive-.

* Typo.

* Trying this simple fix.
2021-06-01 10:57:37 -04:00
Daniel Lemire 4f8bdf517a
Adds a warning message when SIMDJSON_DEVELOPER_MODE is OFF. (#1594) 2021-06-01 10:29:11 -04:00
Nicolas Boyer 369f66be35
Add RapidJSON and nlohmann_json SAX to kostya benchmark (#1592)
* Add RapidJSON and nlohmann_json SAX to kostya benchmark

* Remove trailing whitespaces

* Fix typo
2021-05-31 10:15:50 -04:00
Daniel Lemire 8a75dbf719
Update README.md 2021-05-28 09:05:56 -04:00
Daniel Lemire 1032f70ddf
Verifies and fixes issue 1588 (#1589)
* Verifies and fixes issue 1588

* Removing a trailing space.
2021-05-27 19:35:42 -04:00
strager 16e2323153
Fix UB in dev checks when iterating empty object (#1587)
When find_field_unordered is used on an empty object, it calls
json_iterator::reenter_child. reenter_child asserts that it doesn't
rewind too far back by consulting parser->start_positions.

When the On Demand parser sees an empty object, it fails to update
parser->start_positions. This means that the assertion in
json_iterator::reenter_child reads stale data, or potentially
uninitialized memory. Reading uninitialized memory can cause spurious
assertion failures and Valgrind memcheck reports:

    Running missing_keys_for_empty_top_level_object ...
    ==170679== Conditional jump or move depends on uninitialised value(s)
    ==170679==    at 0x4943D7: reenter_child (json_iterator-inl.h:208)
    ==170679==    by 0x4943D7: find_field_unordered_raw (value_iterator-inl.h:197)
    ==170679==    by 0x4943D7: find_field_unordered (object-inl.h:13)
    ==170679==    by 0x4943D7: find_field_unordered (object-inl.h:96)
    ==170679==    by 0x4943D7: find_field_unordered (value-inl.h:110)
    ==170679==    by 0x4943D7: find_field_unordered (document-inl.h:105)
    ==170679==    by 0x4943D7: object_tests::missing_keys_for_empty_top_level_object() (ondemand_object_tests.cpp:117)
    ==170679==    by 0x4CA761: object_tests::run() (ondemand_object_tests.cpp:1085)
    ==170679==    by 0x8BA314: int test_main<bool ()>(int, char**, bool ( const&)()) (test_ondemand.h:81)
    ==170679==    by 0x4CA9C8: main (ondemand_object_tests.cpp:1119)
    ==170679==

Fix the read of uninitialized or stale memory by updating
parser->start_positions regardless of whether we see an empty object or
an object with some keys.

This commit only affects builds where development checks
(SIMDJSON_DEVELOPMENT_CHECKS) are enabled. Builds where development
checks are disabled are unaffected by this bug.
2021-05-27 08:34:28 -04:00
Pavel Novikov 2ec23bdf37
fixed some typos (#1585) 2021-05-24 09:21:00 -04:00
Daniel Lemire 4fb09824bf
Restricting how we can end key searches (#1575)
* Verifies bug with missing keys.

* Allowing search from any key.

* Workaround for buggy msys

* Restricting how we can end key searches.

* Adding a few tests.
2021-05-20 16:23:38 -04:00
Daniel Lemire ad1cd6a2ce
Documenting raw string access. (#1566)
* Documenting raw string access.

* Removing trailing space.
2021-05-20 13:57:48 -04:00
Daniel Lemire a27367210a
Improving how to_string is explained. (#1583) 2021-05-20 11:22:31 -04:00
Daniel Lemire efe9761f80
Fixing issue 1579. (#1580) 2021-05-19 12:23:17 -04:00
Ivan Volnov 0b75de12ef
Don't allocate std::string just for padded_string::load() (#1578)
* Don't allocate std::string just for padded_string::load()

Use std::string_view

* Remove reference from string_view
2021-05-18 12:32:51 -04:00
Daniel Lemire af5c8175b4
By default, we should not do the DOM checkperf… (#1571)
* By default, we should not do the DOM checkperf. These targets assume that main branch remains
compatible, an assumption that will break over time.
2021-05-15 15:28:59 -04:00
Amos Bird 8df32cea33
Return err when alloc failure (#1567) 2021-05-14 22:51:07 -04:00
Luigi Pinca e4150443ca
Update journal reference (#1565)
Update the journal reference of the "Validating UTF-8 In Less Than One
Instruction Per Byte" paper.
2021-05-10 08:16:37 -04:00
Daniel Lemire d539781cf3
This attempts to fix the fuzzers. (#1564)
* This attempts to fix the fuzzers.

* Retiring bintray.

* Disabling ARM fuzzing.
2021-05-07 22:59:26 -04:00
PavelP 2bbab7d892
Update CONTRIBUTORS (#1560) 2021-05-02 12:30:25 -04:00
Daniel Lemire 729c35c0f8 Removes docker file which is unused and untested, and updates the path to dom/parse. 2021-05-01 10:31:00 -04:00