Commit Graph

410 Commits

Author SHA1 Message Date
Daniel Lemire 8b5a89c136
Parsing floats with 19 significant digits should be fine. (#1191)
* Parsing floats with 19 significant digits should be fine.

* Adding more tests with very long mantissa.
2020-09-29 19:42:43 -04:00
Daniel Lemire da093c1982
Fixing "undefined behavior" issue in new fast_itoa functions (#1186)
* Fixing "undefined behavior" issue.

* Simplifying our custom atoi

* Fixing minor bug
2020-09-29 19:17:03 -04:00
Daniel Lemire 048fb6278a
This adds two tests to verify a new fuzzer issue. (So far I could not verify.) (#1194) 2020-09-29 11:45:41 -04:00
Daniel Lemire 0e584fa4a5
Attempt to fix issue 1187. (#1192) 2020-09-27 12:04:47 -04:00
Daniel Lemire 60c139a844
Faster and more correct serialization (#1168)
* Adding new files.

* Better.

* Fixing minifier and adding tests.

* Adding benchmarks.

* Including the array header.

* Replacing old stream-based code by the new code.

* Doubling up the itoa.

* Hidden away to_chars in internal namespace.

* Removing the repetitions.

* Documented the atoi functions.

* Tuning the escape sequences.

* Moving the operators off the main namespace.

* Added more tests.

* Tweaking the implementation so that it works with and without exp.

* The string_builder template and mini_formatter class
 are not part of  our public API and are subject to change
 at any time!

* Adding a benchmark and some optimization.

* Cleaning.

* Strictly speaking, this header is needed.
2020-09-23 10:00:39 -04:00
Daniel Lemire f410213003
Improve documentation on padding
- Improves and clarifies the documentation on padding.
 - Use std:: prefix for memcpy, strlen etc.

Related to issues #1175 and #1178
2020-09-23 09:07:14 +02:00
Daniel Lemire 19cb5d57db
Some minor documentation fixes. (#1177) 2020-09-17 13:17:35 -04:00
Daniel Lemire 72c83d9430
This avoids locale-dependent number parsing at the standard library level (#1157)
* This avoids locale-dependent number parsing at the standard library level.

* Adding missing cast.

* Inserting the missing "endif"

* Trial and error.

* Another attempt.

* Another tweak.

* Another fix.

* Restricting it even more.

* Tweaking our symbol checks.

* Somewhat smarter tests.

* Nice comments.

* Minor simplification.

* Adding cerr.
2020-09-15 11:36:18 -04:00
Daniel Lemire bfbac12f76
We were forgetting to check the end bytes at the end of the UTF8 validation. (#1173)
* We were forgetting to check the end bytes at the end of the UTF8 validation.

* Silencing the sanitizer

* Better explanation.
2020-09-15 11:33:09 -04:00
Daniel Lemire 3e5497e2f9
Fixes issue 1170 and makes the usage of minify easier. (#1171)
* Fixes issue 1170 and makes the usage of minify easier.

* This should get the fallback implementation to detect unclosed strings.
2020-09-12 16:20:20 -04:00
Daniel Lemire 0552335ec1
Fixing the issue. (#1151) 2020-09-02 18:41:59 -04:00
Daniel Lemire 7aea774b21
Adding tests and a fix for empty strings in at_pointer (#1148)
* Adding a test.

* More tests.
2020-09-02 17:04:56 -04:00
Daniel Lemire 5b10c38e43
Make parse_many safer. (#1137) 2020-08-20 22:22:46 -04:00
Daniel Lemire 3316df9195
Adding test for issue 1133 and improving documentation (#1134)
* Adding test.

* Saving.

* With exceptions.

* Added extensive tests.

* Better documentation.

* Tweaking CI

* Cleaning.

* Do not assume make.

* Let us make the build verbose

* Reorg

* I do not understand how circle ci works.

* Breaking it up.

* Better syntax.
2020-08-20 14:03:14 -04:00
Daniel Lemire 8a8eea53a2
Prefixing macros (issue 1035) (#1124)
* Renaming partially done.

* More prefixing.

* I thought that this was fixed.

* Missed one.

* Missed a few.

* Missed another one.

* Minor fixes.
2020-08-18 18:25:36 -04:00
Daniel Lemire 09bd7e8ef8
Verification and fix for issue 1063 (JSON Pointers) (#1064)
* Specification is not followed.

* Fixes.

* Do not pass string_view by reference.

* Better documentation.

* The example is written for exceptions.

* Better documentation.

* Updating with deprecation.

* Updating example.

* Updating example.
2020-08-18 17:23:18 -04:00
John Keiser 9356619380
Merge pull request #1110 from simdjson/jkeiser/number-corruption
Fix potential buffer overrun with heavily customized input and padding
2020-08-18 14:17:25 -07:00
Daniel Lemire fc15147cf5
This allows the users to disable threading. (#1122)
* This allows the users to disable threading.

* This would disable bash scripts under FreeBSD. (#1118)

* This would disable bash scripts under FreeBSD.

* Let us also disable GIT.

* Let us try to just disable GIT

* Nope. We must have both bash and git disabled.

* This allows the users to disable threading.
2020-08-18 16:43:08 -04:00
John Keiser fa355603fb Add test for corruption while parsing a number 2020-08-18 10:10:01 -07:00
Daniel Lemire 4a6eebc0e4
This corrects a small typo in the documentation. (#1121)
* This corrects a small typo in the documentation.

* Modifying the test as well.
2020-08-18 08:36:15 -04:00
Daniel Lemire 501fed6c4f
This would disable bash scripts under FreeBSD. (#1118)
* This would disable bash scripts under FreeBSD.

* Let us also disable GIT.

* Let us try to just disable GIT

* Nope. We must have both bash and git disabled.
2020-08-17 11:50:57 -04:00
Daniel Lemire daeca1bb18
Basics. (#1116) 2020-08-14 17:28:09 -04:00
Daniel Lemire 83615ff351
Fixes issue 1088 (#1096) 2020-08-06 11:42:13 -04:00
Daniel Lemire 039d82ff1b
Returning basictests to its original function: basic tests (only) (#1010)
* The initial motivation behind basictests was for a quick set of sanity tests to check whether your code made sense. It
was not meant for thorough testing to find corner cases. However, over time, it grew to include such expensive tests.
This PR takes them out. It also allows us to bring back basictests to MinGW tests, since it is now cheap.

This is not an exercise in software engineering and making things prettier. This is a pragmatic change to improve our
test coverage and quality of life.

* Adds many more cheap tests.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-07-13 09:39:35 -04:00
Daniel Lemire 74870a8189
Fixing issue 1013. (#1016)
* Fixing issue 1013.

* Bumping to 0.4.6

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-07-01 14:14:51 -04:00
Daniel Lemire 0ef4d90ad0
Fix for issue 1014. (#1015)
* Fix for issue 1014.

* Explanation.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-30 19:36:26 -04:00
Daniel Lemire ccc94c9b05
Mingw tests (32-bit and 64-bit) (#1004) 2020-06-29 21:10:54 -04:00
Daniel Lemire cb8a9ef2c0 This removes git as a dependency 2020-06-24 15:13:47 -04:00
John Keiser 187084ce46
Merge pull request #970 from simdjson/jkeiser/singleheader-tests
Make singleheader tests be test-only
2020-06-23 17:07:03 -07:00
Daniel Lemire 544fa57641 Damn merge conflicts. 2020-06-23 19:15:47 -04:00
John Keiser 843b73dedb Make singleheader tests be test-only 2020-06-23 13:35:27 -07:00
Daniel Lemire b84a3a0230
Merge branch 'master' into issue961 2020-06-23 14:33:06 -04:00
John Keiser 257089884f
Merge pull request #958 from simdjson/jkeiser/is
Make simdjson_result<element>.is() return bool
2020-06-23 09:51:37 -07:00
John Keiser c650ea9765
Merge pull request #960 from simdjson/jkeiser/idiomatic-get
Convert simdjson to use .get()
2020-06-23 09:49:41 -07:00
John Keiser 2d84b6f6d9 Make simdjson_result<element>.is() return bool 2020-06-23 09:09:24 -07:00
John Keiser eef1171944
Merge pull request #954 from simdjson/jkeiser/parse-many-result
Return error from parse_many
2020-06-23 09:06:20 -07:00
Daniel Lemire 696b0e29e4 Fixing issue 961 2020-06-23 10:47:32 -04:00
Daniel Lemire dada5090b0 These compilers are insane. 2020-06-22 20:25:55 -04:00
Daniel Lemire 1c4593c648 These compilers are really pedantic. 2020-06-22 20:04:37 -04:00
Daniel Lemire e7004cef76 Removing a test so that it is all ASCII. 2020-06-22 16:55:16 -04:00
Daniel Lemire 2bb101bd19 Code reformatting. 2020-06-22 16:50:57 -04:00
Daniel Lemire 26baf70912 Pedantic compiler 2020-06-22 16:45:32 -04:00
Daniel Lemire 69a247d500 Adding tests. 2020-06-22 16:12:37 -04:00
Daniel Lemire a76c67c19f Fixing... 2020-06-22 15:57:54 -04:00
John Keiser 0c9dc11550 Use really_inline to help g++ detect initialized variable 2020-06-21 16:27:05 -07:00
John Keiser 1ff55c2729 Replace auto [x,error] with .get() everywhere 2020-06-21 16:26:59 -07:00
Daniel Lemire 38bb08778a With an example. 2020-06-21 17:57:22 -04:00
Daniel Lemire 5dbcdf1484 Ok 2020-06-21 17:52:30 -04:00
John Keiser 6fa5abcd7e Replace x.get<T>() with x.get(v) or T(x) 2020-06-21 14:36:38 -07:00
John Keiser 1b1a122b1f Fix copy constructor issue on older gcc 2020-06-21 12:06:14 -07:00
John Keiser ae1bd891e7 Remove deprecated uses of parse_many 2020-06-21 11:19:06 -07:00
John Keiser 9899e5021d Allow use of document_stream with tie() 2020-06-20 21:15:05 -07:00
John Keiser a7fc7d4ffb Switch from get(v,e) to e = get(v) 2020-06-20 17:57:09 -07:00
John Keiser f336103f63 Convert tools/docs/benchmarks to bool get() idiom 2020-06-20 17:55:46 -07:00
John Keiser 56e2b38048 Add bool result from tie()/get(), get<T>(T&,error_code&) 2020-06-20 17:55:46 -07:00
John Keiser 0b8c357eff Add get_X and is_X methods 2020-06-19 13:27:33 -07:00
John Keiser efc168f473 Make test changes only 2020-06-19 13:27:33 -07:00
John Keiser d8428f98d9 Add cast_tester.h 2020-06-19 13:27:33 -07:00
John Keiser 60f17d26a3 Move test macros to a header 2020-06-19 13:27:00 -07:00
Daniel Lemire 5ccdbef7d5
Merge pull request #936 from simdjson/dlemire/new_examples
New examples.
2020-06-18 18:29:06 -04:00
Daniel Lemire c13c2650a2
Merge pull request #940 from simdjson/issue938
Verifying (and fixing) issue 938
2020-06-18 18:25:31 -04:00
John Keiser f632e7c043 Put C++11 capable version back, change name to readme style 2020-06-18 12:50:49 -07:00
Daniel Lemire 04a19f9813 Fixes https://github.com/simdjson/simdjson/issues/937 2020-06-17 18:06:13 -04:00
Daniel Lemire 0655a135e6 Reverting. 2020-06-17 17:52:07 +00:00
Daniel Lemire 4474f8ef18 Cleaning a bit the examples. 2020-06-17 16:24:55 +00:00
Daniel Lemire 6537d0dc76 Avoiding the unused errors. 2020-06-17 14:19:58 +00:00
Daniel Lemire 8d609607e2 Verifying the bug. 2020-06-16 20:04:09 -04:00
Daniel Lemire 27a75a9085 Tweaking. 2020-06-15 17:54:34 -04:00
Daniel Lemire 954d6c326d New examples. 2020-06-15 17:45:15 -04:00
John Keiser fd44c2a2ff
Merge pull request #927 from simdjson/dlemire/exposingthestringminifier
Exposing the string minifier.
2020-06-13 07:47:20 -07:00
John Keiser a86a82b39c Rename minify class to minifier so the minify() method is cleared up 2020-06-12 17:05:25 -07:00
Daniel Lemire 89b059b1ea
Testing with GCC 10 and clang 10 (#926)
* Testing with GCC 10 and clang 10

* Fixing spurious space

* gcc10 does not need the cmake installation.

* We don't want to run the perf test on ARM. I ignore them systematically. ARM performance
should be assessed manually.

* Switching to GCC 10 and Clang 10

* Disabling some tests under sanitizers when they involve rapidjson or other parsers.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-12 17:58:53 -04:00
Daniel Lemire 4dfbf98e4e
Using a worker instead of a thread per batch (#920)
In the parse_many function, we have one thread doing the stage 1, while the main thread does stage 2. So if stage 1 and stage 2 take half the time, the parse_many could run at twice the speed. It is unlikely to do so. Still, we see benefits of about 40% due to threading.

To achieve this interleaving, we load the data in batches (blocks) of some size. In the current code (master), we create a new thread for each batch. Thread creation is expensive so our approach only works over sizeable batches. This PR improves things and makes parse_many faster when using small batches.

This fixes our parse_stream benchmark, which is just busted.
This replaces the one-thread-per-batch routine with a worker object that reuses the same thread. In benchmarks, this allows us to get the same maximal speed, but with smaller processing blocks. It does not help much with larger blocks because the cost of thread creation gets amortized efficiently.
This PR makes parse_many beneficial over small datasets. It also makes us less dependent on the thread creation time.

Unfortunately, it is going to be difficult to say anything definitive in general. The cost of creating a thread varies widely depending on the OS. On some systems, it might be cheap, on others very expensive. The new code should depend less drastically on the performance of the underlying system, since we create just one thread.
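
The idea of reusing one thread across batches can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `batch_worker`, `run()`, `finish()` and `sum_with_worker()` are hypothetical names, and the lambdas merely stand in for stage 1 and stage 2.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical sketch: one long-lived thread that runs one job at a time,
// so the thread creation cost is paid once instead of once per batch.
class batch_worker {
public:
  batch_worker() : thread_([this] { loop(); }) {}
  ~batch_worker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }
  batch_worker(const batch_worker &) = delete;
  batch_worker &operator=(const batch_worker &) = delete;

  // Hand a job to the worker thread. Call finish() before the next run().
  void run(std::function<void()> job) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      job_ = std::move(job);
      busy_ = true;
    }
    cv_.notify_all();
  }

  // Block until the job handed to run() is done.
  void finish() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !busy_; });
  }

private:
  void loop() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      cv_.wait(lock, [this] { return busy_ || stop_; });
      if (stop_ && !busy_) { return; }
      std::function<void()> job = std::move(job_);
      lock.unlock();
      job(); // run the job outside the lock
      lock.lock();
      busy_ = false;
      cv_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::function<void()> job_;
  bool busy_ = false;
  bool stop_ = false;
  std::thread thread_; // declared last: the other members exist before it starts
};

// Overlap "stage 1" of the next batch with "stage 2" of the current one.
int sum_with_worker(int batches) {
  if (batches <= 0) { return 0; }
  batch_worker worker;
  int next = 0, total = 0;
  worker.run([&next] { next = 1; }); // stage 1 of the first batch
  worker.finish();
  for (int i = 0; i < batches; i++) {
    const int current = next;
    const bool more = i + 1 < batches;
    if (more) { worker.run([&next, i] { next = i + 2; }); } // stage 1, next batch
    total += current;                                       // stage 2, this batch
    if (more) { worker.finish(); }
  }
  return total;
}
```

The worker is created once per call to `sum_with_worker`, so small batches no longer pay for a thread each.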

Co-authored-by: John Keiser <john@johnkeiser.com>
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-12 16:51:18 -04:00
Daniel Lemire 45e2178ada Duh. 2020-06-11 17:20:28 +00:00
Daniel Lemire a6e4933d93 Exposing the string minifier. 2020-06-11 13:07:18 -04:00
John Keiser fe01da077e Make threaded version work again 2020-06-07 16:21:00 -07:00
John Keiser 3e226795f0 Run all passing json against parse_many. Empty documents pass, too. 2020-06-07 16:20:51 -07:00
John Keiser c4a0fe1606 Add tests for parse_many() errors 2020-06-07 16:20:46 -07:00
John Keiser ef63a84a3e Move document stream state to implementation 2020-06-07 16:20:44 -07:00
Daniel Lemire 7a69da16e4
Fixing issue 906 (#912)
* Fixing issue 906

* Safe patching.

* Now with explanations.

* Bumping up memory allocation.

* Putting the patch back.

* fallback fixes.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-05 15:37:09 -04:00
Daniel Lemire 12150baa5e
Using just ASCII. (#899)
* Using just ASCII.

* Let us prune checkperf.

* Moving the description of lookup2 to the HACKING.md file.
2020-05-21 21:59:06 -04:00
Daniel Lemire d2c9ea8a9a
Detect bash instead of relying on MSVC detection. (#894) 2020-05-20 12:13:14 -04:00
John Keiser 5312fd30e5 Fix CRT_SECURE warnings in clang 2020-05-04 11:36:00 -07:00
John Keiser 1d06624d38 Unset /D_CRT_SECURE_NO_WARNINGS
- Also localize DISABLE_DEPRECATED_WARNING so that we catch other
  deprecations
2020-05-04 11:35:05 -07:00
Furkan Usta 064eb0b24f CMake: Make simdjson-internal-flags subsume simdjson-flags 2020-05-03 02:48:29 +03:00
Furkan Usta af968c5b44 Merge branch 'master' of github.com:simdjson/simdjson into cmake-flags 2020-05-03 02:12:23 +03:00
Furkan Usta 1e9488d4a6 Remove Microsoft comment regarding dirent in parsingchecks 2020-05-02 16:01:30 +03:00
Furkan Usta ff1d77ead9 Add NOMINMAX to parsingchecks 2020-05-02 15:33:53 +03:00
Furkan Usta 977e1a94b2 Use dirent_portable.h only in MSVC 2020-05-02 15:16:50 +03:00
Furkan Usta 60ee5fc844 Enable numberparsingcheck and stringparsingcheck on MSVC 2020-05-02 15:12:30 +03:00
Furkan Usta 293c104cc4 CMake: Separate public and private compilation flags
simdjson-internal-flags for macros and warnings
simdjson-flags for pthread, sanitizer, and libcpp
2020-05-02 04:08:47 +03:00
Daniel Lemire fa4ce6a8bc
There is confusion between gigabytes and gibibytes. Let us standardize throughout. (#838)
* There is confusion between gigabytes and gibibytes.

* Trying to be consistent.
2020-05-01 12:16:18 -04:00
John Keiser 0e6ea76e88
Make checkperf work on Windows (#799)
* Make command line arguments work for Windows

* Run checkperf on Windows
2020-04-27 14:20:05 -04:00
Daniel Lemire f397b6fedf
Another example. (#790)
* Another example.

* Adding a reference to error chaining.
2020-04-23 21:48:41 -04:00
Daniel Lemire 4f72d5cfac
This adds another example (#785) 2020-04-23 18:29:28 -04:00
Daniel Lemire e030f02776 Merge branch 'master' into jkeiser/wconversion 2020-04-22 22:03:34 -04:00
Daniel Lemire f0ac55ec0c
testing on freebsd (#768)
* Adding cirrus tests
* Adding cirrus badge.
2020-04-22 21:22:09 -04:00
John Keiser d4a37f6ef5 Enable conversion warnings on Linux and Windows 2020-04-22 14:21:30 -07:00
John Keiser d3e44b1108 Add amalgamation support to cmake 2020-04-20 19:50:51 -07:00
John Keiser 53d28a713c Fix cmake error when SIMDJSON_COMPETITION=OFF 2020-04-20 10:49:40 -07:00
John Keiser e5e6a46c37 Consolidate multi-implementation tests
Uses SIMDJSON_FORCE_IMPLEMENTATION to switch the implementation at test
time.
2020-04-19 09:59:49 -07:00
John Keiser 22b9a53bef Add SIMDJSON_FORCE_IMPLEMENTATION 2020-04-18 18:21:56 -07:00
John Keiser ff09b6c824 Run fewer redundant steps and configs in CI 2020-04-17 12:23:05 -07:00
John Keiser 289cc3e7a0 Treat warnings as errors during compilation 2020-04-15 19:59:38 -07:00
John Keiser fd418f568c Fix c++11 warnings on clang
- namespace x::y is C++17
- static_assert requires message in C++11
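
For illustration, the two C++17-only forms named above and their C++11-compatible equivalents might look like this (the `toy` namespace and `answer()` function are made up for the example):

```cpp
// C++17 only (clang warns about these when compiling as C++11):
//   namespace toy::detail { ... }
//   static_assert(sizeof(int) >= 2);

// C++11-compatible equivalents:
namespace toy {
namespace detail {
inline int answer() { return 42; }
} // namespace detail
} // namespace toy

static_assert(sizeof(int) >= 2, "int must be at least 16 bits");
```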
2020-04-15 17:27:48 -07:00
John Keiser 09cf18a646 Add C++11 tests to cmake
- Add simdjson-flags target so callers don't have flags forced on them
2020-04-15 17:26:25 -07:00
Daniel Lemire 6d7c77ddc1
Let us try to check with the exceptions disabled. (#707)
* Tweaking code so that we can run all tests with exceptions off.
* Removing SIMDJSON_DISABLE_EXCEPTIONS
2020-04-15 16:45:36 -04:00
Daniel Lemire efd706528b Minor tweaks to the CMake. 2020-04-15 10:19:05 -04:00
Daniel Lemire b523c43927
Can we provide a size() function to arrays and objects? (eager approach) [TO BE MERGED] (#690)
* This is an implementation of "size()" for arrays and objects.
* Adding benchmark
* Adding a size() remark in the documentation.
* Extending size() to result types.
2020-04-15 10:15:48 -04:00
Paul Dreik 75545ff70d
ref qualify parser methods to avoid use of dangling objects (#703)
To avoid using data belonging to a temporary, the parse functions are ref qualified to get a compile error if used on an rvalue. See https://github.com/simdjson/simdjson/issues/696

Compilation tests are also added, to make sure bad usage fails to compile.
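
A minimal sketch of the technique, assuming a hypothetical `toy_parser` that owns the buffer its result refers to (the names and signatures are illustrative, not the library's actual ones):

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical parser that owns the buffer its results point into.
class toy_parser {
public:
  // Callable only on an lvalue parser, which outlives the call.
  const std::string &parse(const std::string &json) & {
    buffer_ = json;
    return buffer_;
  }
  // On a temporary, the returned reference would dangle, so reject the call.
  const std::string &parse(const std::string &json) && = delete;

private:
  std::string buffer_;
};

// Detection helper: does parse() compile for a parser of value category P?
template <typename P, typename = void>
struct can_parse : std::false_type {};
template <typename P>
struct can_parse<P, decltype(void(std::declval<P>().parse(
                        std::declval<const std::string &>())))>
    : std::true_type {};

static_assert(can_parse<toy_parser &>::value, "lvalue parser: allowed");
static_assert(!can_parse<toy_parser>::value, "temporary parser: rejected");

std::size_t parsed_length() {
  toy_parser parser; // named object: its buffer stays alive
  return parser.parse("[1,2,3]").size();
}
```

With this in place, `toy_parser().parse("[1,2,3]")` fails to compile instead of returning a reference into a destroyed temporary.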

Reviewed by jkeiser.
2020-04-15 09:57:52 +02:00
Daniel Lemire 3c6ef83046
Trying to correct the documentation so that it actually describes how the code behaves. (Attempt two) (#712)
* Trying to correct the documentation so that it actually describes how the code behaves.

* tweaking the wording.

* Improving.

* Removing confusing sentence.

* Fixing formatting.

* Now with working example, tested.

* Added a smaller piece of code
2020-04-14 22:31:21 -04:00
John Keiser b9ac0a79f1
Merge pull request #715 from simdjson/jkeiser/thorough-type-tests
Test more variants of cast, get, etc.
2020-04-14 16:08:36 -07:00
Daniel Lemire 8539896f3d
It is inconvenient to be unable to print a padded_string. (#713)
* It is inconvenient to be unable to print a padded_string.

* Allows us to print the padded_string even when it is embedded in result object when exceptions are enabled.
2020-04-14 19:07:32 -04:00
John Keiser a3b508ceff Test get<>(), exception vs. no exception, explicit vs. implicit cast 2020-04-14 13:18:42 -07:00
John Keiser 1ff22c78b3 Add quickstart to cmake 2020-04-09 14:56:54 -07:00
John Keiser ceb1def55c Add quicktests, slowtests to cmake
- Also add testjson2json.sh
- Move test scripts to tests directory to consolidate concerns
2020-04-09 14:21:45 -07:00
John Keiser 7317fe1440 Don't reinitialize submodules
Add ability to turn competitive benchmarks off (no need for submodules)
2020-04-09 08:52:29 -07:00
John Keiser 6dabfa176a Add competition libraries 2020-04-09 08:52:29 -07:00
John Keiser 218c867f46 Disable failing VS2017 tests in cmake 2020-04-08 14:58:28 -07:00
John Keiser beaa6a9a7a Create simdjson-windows-headers interface library 2020-04-08 14:52:56 -07:00
John Keiser a9c8224f40 Add numberparsingcheck and stringparsingcheck tests 2020-04-08 14:52:56 -07:00
John Keiser 3dcc188d93 Add more tests to cmake 2020-04-08 14:52:56 -07:00
John Keiser 10b7556a37 Specify cmake tests, benchmarks and tools idiomatically 2020-04-08 14:52:56 -07:00
John Keiser 54b7291c34 Reference simdjson by name, don't specify include files individually 2020-04-08 14:52:55 -07:00
John Keiser 1e30b6e334 Compile under C++ 11 2020-04-08 14:00:13 -07:00
John Keiser 406240bae3 Support C++ 14 2020-04-08 14:00:13 -07:00
John Keiser 6eec2d6b4f Simplify cars example 2020-04-05 09:15:20 -07:00
Daniel Lemire 5731c5437a
Sanity test. (#675) 2020-04-04 16:39:37 -04:00
Daniel Lemire 04f14ec026
This adds a test for std::ignore (#674) 2020-04-04 11:53:03 -04:00
John Keiser 13aee51011 Add element.type() for type switching 2020-04-02 14:07:19 -07:00
John Keiser d93af1161d Remove set_capacity, replace with allocate
Makes allocation point more predictable
2020-03-30 13:49:54 -07:00
John Keiser 434776db1a Deprecate more things 2020-03-30 13:48:43 -07:00
John Keiser 2115596ed3 Compile performance.md examples in tests 2020-03-29 16:28:34 -07:00
John Keiser 0e3453f7c2 Compile examples from implementation-selection.md 2020-03-29 16:28:34 -07:00
John Keiser 7ed65e42d7 Add actual examples from basics.md to readme_examples 2020-03-29 16:28:29 -07:00
John Keiser ea8a5020e2 Remove array indexer, make object indexer key lookup 2020-03-28 15:56:43 -07:00
John Keiser 622d9c9480 Replace as_X and is_X with get<T> and is<T> 2020-03-28 15:29:53 -07:00
John Keiser 62da98aef6 Rename dom::stream to dom::document_stream 2020-03-28 13:42:24 -07:00
John Keiser 03746b966b Move document/element/etc. under dom 2020-03-28 13:42:21 -07:00
John Keiser e836c28008 Deprecate parser error code methods
- Also make competitions compile without warnings
2020-03-28 10:13:20 -07:00
John Keiser 5ad405006c Return document::element from parse, load, parse_many, load_many 2020-03-27 12:24:41 -07:00
John Keiser 90a7503181 Rename pj -> doc, fix a few other idioms 2020-03-27 09:22:46 -07:00
John Keiser c14b2fb36c Remove const char* variants for at_key()
- Remove const char * variants for at_key(), string_view covers them
- Add at_key_case_insensitive variants on *_result
- Add at(), at_key(), at_key_case_insensitive() tests
2020-03-27 09:09:08 -07:00
John Keiser f0f111b387 Make ParsedJson::Iterator backcompat test 2020-03-27 09:07:39 -07:00
Daniel Lemire 6a8ec95a46 Various fixes. 2020-03-26 20:08:54 -04:00
Daniel Lemire b6c6680add Ported jsoncheck. 2020-03-26 19:56:04 -04:00
Daniel Lemire 5fb149f833 Converted inter_tests... 2020-03-26 19:52:17 -04:00
Daniel Lemire abb0bf9247 Fixed basictests 2020-03-26 19:40:29 -04:00
Daniel Lemire 8f3ddd3a73 Updating allparserscheckfile 2020-03-26 17:15:33 -04:00
John Keiser 2e420169c3 Remove document::parse and document::load 2020-03-26 10:13:09 -07:00
John Keiser 5aec2671ea Remove JsonStream. Use parse_many() instead. 2020-03-26 09:25:07 -07:00
John Keiser a0bce440a6 Remove document_iterator, document::iterator, ParsedJsonIterator
Keep ParsedJson::Iterator only, without template, in same form as
it was in 0.2
2020-03-25 18:26:51 -07:00
John Keiser e1b1500e3b Make _padded available without using namespace simdjson 2020-03-25 09:37:18 -07:00
John Keiser b28cafc1d1 Remove backslash unescaping from JSON pointer impl
Also speed up non-escaped key lookup
2020-03-25 08:56:40 -07:00
John Keiser 0bcda5e384 Support JSON pointer in DOM navigation model 2020-03-23 15:05:20 -07:00
John Keiser c34b1a1b2a Organize basic tests to make easier to turn on/off 2020-03-21 18:12:16 -07:00
John Keiser e4df0ca368 Add parse, parse_many, load, load_many tests 2020-03-21 18:12:16 -07:00
Daniel Lemire 8a91cecf41 testing only with ok documents. 2020-03-21 18:12:16 -07:00
Daniel Lemire 04e8710cf5 Testing issue 570 2020-03-21 18:12:16 -07:00
John Keiser e8b3f9eaad Support document::parse("[1,2,3]"_padded) 2020-03-21 11:15:20 -07:00
Daniel Lemire 5d1e3efce8
faster minifier (#568)
* Fallback should use our scalar code.
* parse should have a nicer error message.
* Making it so that "minify" can use different architectures.
* Let us change the minifier competition so that it tests all implementations.
* Documenting the untaken optimization opportunity.

Co-authored-by: John Keiser <john@johnkeiser.com>
2020-03-20 16:14:47 -04:00
John Keiser 7cf3a7511b Add fallback implementation to CI
- Also add SIMDJSON_IMPLEMENTATION_HASWELL/WESTMERE/ARM64/FALLBACK=1/0 to
enable/disable various implementations
2020-03-17 14:59:47 -07:00
John Keiser af203aaf86 Add fallback parser for pre-SSE4.2 machines 2020-03-17 14:59:47 -07:00
John Keiser 8e2c06cb0e Compile with -fno-exceptions 2020-03-17 13:54:37 -07:00
John Keiser 1a5d8f1957 Add tests for SIMDJSON_EXCEPTIONS=0, add `tie()` support 2020-03-17 13:54:37 -07:00
Daniel Lemire 317fc6ba0e
accurate number parsing (#558) 2020-03-15 22:30:21 -04:00
Daniel Lemire d9a9fd387d Adding a stress test. 2020-03-13 18:59:15 -07:00
John Keiser acc7bd79b0 Support cout << json, cout << minify(json) 2020-03-13 18:59:15 -07:00
Daniel Lemire 12e6611ba4 Fix for printf. 2020-03-13 14:44:21 -04:00
Daniel Lemire 06c1dc3a29
Adding NDEBUG to release (#557)
* Adding NDEBUG to release

* Asserts are deleted with NDEBUG. We want hard asserts.
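
As a sketch of what a "hard assert" could look like: the macro below is hypothetical (it is not taken from this commit) and keeps checking even when `NDEBUG` is defined, unlike the standard `assert`.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical macro: unlike assert(), it is not compiled out by -DNDEBUG.
#define TOY_HARD_ASSERT(cond)                                        \
  do {                                                               \
    if (!(cond)) {                                                   \
      std::fprintf(stderr, "%s:%d: check failed: %s\n", __FILE__,    \
                   __LINE__, #cond);                                 \
      std::abort();                                                  \
    }                                                                \
  } while (0)

int checked_divide(int a, int b) {
  TOY_HARD_ASSERT(b != 0); // still enforced in release builds
  return a / b;
}
```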
2020-03-13 14:37:02 -04:00
Daniel Lemire 89d9de2353
Adding a check to see whether document::stream copy constructor and assignment actually compile (#556)
* Currently, document::stream contains an attribute that is a reference:

```
      document::parser &parser;
```

Yet we try to have it default on the move operator:

```
  stream &operator=(document::stream &&other) = default;
  stream &operator=(const document::stream &) = delete; // Disallow copying
```

```
  stream(document::stream &&other) = default;
  stream(const document::stream &) = delete; // Disallow copying
```

I am not sure what the move is supposed to do with the reference.

I cannot find where we test the copy constructor and assignment. I am concerned that this is either dead code or buggy code.
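
The concern can be checked with a small stand-alone sketch. `parser_stub` and `stream_stub` are stand-ins invented for this example: with a reference member, the defaulted move constructor simply copies the reference, while the defaulted move assignment is defined as deleted (some compilers warn about this).

```cpp
#include <type_traits>
#include <utility>

struct parser_stub {}; // stand-in for document::parser

struct stream_stub { // stand-in for document::stream
  explicit stream_stub(parser_stub &p) : parser(p) {}
  stream_stub(stream_stub &&other) = default;            // copies the reference
  stream_stub &operator=(stream_stub &&other) = default; // defined as deleted
  parser_stub &parser;
};

static_assert(std::is_move_constructible<stream_stub>::value,
              "move construction compiles");
static_assert(!std::is_move_assignable<stream_stub>::value,
              "a reference member cannot be reseated, so assignment is deleted");

// After a move, both objects still refer to the same parser.
bool moved_stream_shares_parser() {
  parser_stub p;
  stream_stub a(p);
  stream_stub b(std::move(a));
  return &a.parser == &b.parser;
}
```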

* Remove non-working, unnecessary move constructors

* We still want to disallow copies.

Co-authored-by: John Keiser <john@johnkeiser.com>
2020-03-13 12:53:42 -04:00
John Keiser ac0899c043 Add error tests, doc_ref_result[] chaining 2020-03-11 17:19:41 -07:00
John Keiser 40c6213d7e Add parser.load() and load_many() to load files 2020-03-11 17:19:41 -07:00
John Keiser d140bc23f5 Automatically allocate memory as needed in parse 2020-03-11 16:14:54 -07:00
John Keiser 3bdfe167de Support cout << error 2020-03-06 15:41:51 -08:00
John Keiser 31e8a12e88 Make error_message(error_code) return C string
- Also move all error message logic to include inline
2020-03-06 15:41:51 -08:00
John Keiser 9a7c8fb5be Use parse_many in examples/tests/docs 2020-03-05 12:04:45 -08:00
John Keiser cfef4ff2ad Create parser.parse_many() API 2020-03-05 12:04:45 -08:00
John Keiser b3ea8c406e Add simdjson.cpp for unified use (#515) 2020-03-04 10:12:27 -08:00
John Keiser 99667f7c55 Create top level simdjson.h (#515)
- Allows everyone to #include the same way, singleheader or not.
2020-03-04 10:12:27 -08:00
John Keiser 0b21203141 Document navigation API 2020-03-02 14:49:03 -08:00
Daniel Lemire 68670301e3
Adding instructions regarding how to check for an unsupported CPU (#508)
* Adding instructions.

* Slightly more documentation.
2020-02-25 11:09:51 -05:00
John Keiser 910f272467
Add parser implementation interface and selection API (#501)
* Make architecture implementations virtual functions

- Easier to add new architectures (add implementation to implementation.cpp)
- Easier to add new algorithms / functions to architecture selection
(add to implementation.h, implement)
- Automatically select best implementation in static initialization
- Allow user to explicitly select implementation with a string (i.e.
parameter)
- Allow user to inspect current implementation name/description
- Allow user to list available implementations
- Eliminate architecture enum and architecture-based templating
- Add noexcept in non-inline functions

* Move implementation static methods to their own classes

* Detect best supported implementation on first use

* available_implementationsI() -> available_implementations
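
The design described above (implementations as virtual functions, a best-first list, selection by name, detection on first use) can be sketched roughly as follows. All names here are hypothetical and chosen for illustration; they are not the library's actual classes, and `supported()` would query the CPU in real code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct toy_implementation {
  virtual ~toy_implementation() = default;
  virtual const char *name() const noexcept = 0;
  virtual bool supported() const noexcept = 0;
  virtual std::size_t count_commas(const std::string &json) const noexcept = 0;
};

struct fallback_impl final : toy_implementation {
  const char *name() const noexcept override { return "fallback"; }
  bool supported() const noexcept override { return true; }
  std::size_t count_commas(const std::string &json) const noexcept override {
    std::size_t n = 0;
    for (char c : json) { n += (c == ',') ? 1 : 0; }
    return n;
  }
};

struct pretend_simd_impl final : toy_implementation {
  const char *name() const noexcept override { return "pretend_simd"; }
  bool supported() const noexcept override { return false; } // pretend the CPU lacks it
  std::size_t count_commas(const std::string &json) const noexcept override {
    return static_cast<std::size_t>(std::count(json.begin(), json.end(), ','));
  }
};

// Everything compiled in, ordered best-first.
const std::vector<const toy_implementation *> &available() {
  static const pretend_simd_impl simd{};
  static const fallback_impl fallback{};
  static const std::vector<const toy_implementation *> list{&simd, &fallback};
  return list;
}

// First supported entry wins; the fallback is always supported.
const toy_implementation *detect_best() {
  for (const toy_implementation *impl : available()) {
    if (impl->supported()) { return impl; }
  }
  return nullptr;
}

// Explicit selection with a string; nullptr if the name is unknown.
const toy_implementation *find_by_name(const std::string &wanted) {
  for (const toy_implementation *impl : available()) {
    if (wanted == impl->name()) { return impl; }
  }
  return nullptr;
}

// Detected on first use, then cached.
const toy_implementation &active() {
  static const toy_implementation *chosen = detect_best();
  return *chosen;
}
```

Adding a new architecture in this sketch means adding one subclass and one entry in `available()`; callers only ever see the `toy_implementation` interface.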
2020-02-21 16:34:27 -05:00
John Keiser 4dc2adf7f8 Update README, add README examples 2020-02-18 08:37:07 -08:00
John Keiser 8e7d1a5f09
Separate document state from ParsedJson
This creates a "document" class with only user-facing document state (no parser internals).

- document: user-facing document state
- document::iterator: iterator (equivalent of ParsedJsonIterator)
- document::parser: parser state plus a "docked" document we parse into (equivalent of ParsedJson)

Usage:

```c++
auto doc = simdjson::document::parse(buf, len); // less efficient but simplest
```

```c++
simdjson::document::parser parser; // reusable parser
parser.allocate_capacity(len);
simdjson::document* doc = parser.parse(buf, len); // pointer to doc inside parser
doc = parser.parse(buf2, len); // reuses all buffers and overwrites doc; more efficient
```
2020-02-07 10:02:36 -08:00
Daniel Lemire c924aaede9
Fix issue472: make JsonStream a template. (#473)
* Fix issue472: make JsonStream a template.

* Adding missing include.

* Tweaking headers and some minor formatting.

* Removing file from aggregation.

* Moving jsoncharutils

* Adding new header.

* Trying another header.

* Let us try to route around Visual Studio's nonsense.
2020-01-30 17:16:41 -05:00
Daniel Lemire 28710f8ad5
fix for Issue 467 (#469)
* Fix for issue467

* Updating single-header

* Let us make it so that JsonStream is constructed from a padded_string which will avoid dangerous overruns.

* Fixing parse_stream

* Updating documentation.
2020-01-29 19:00:18 -05:00
Daniel Lemire 33060738b6
Making the project tag in simdjson more explicit and disabling LTO (#452)
* Making the project tag in simdjson more explicit

* Let us disable deliberately LTO.
2020-01-20 10:18:58 -05:00
Daniel Lemire 6e5e0278c2 Exposing bug #420 2020-01-09 09:55:54 -05:00
Daniel Lemire 7bde23590a
Debugging jsonstream (#432)
Fixes #424 (and provide tests for it), as well as #401
2020-01-03 22:22:47 -05:00
Daniel Lemire ba9dc12164 Adding tests motivated by https://github.com/lemire/simdjson/pull/430 2020-01-02 14:20:51 -05:00
Daniel Lemire 1d621bba37 Being more explicit about EMPTY errors. 2019-12-18 14:39:48 +00:00
Daniel Lemire fc6133b58f
Fixes issue 388 (#394) 2019-12-11 08:13:29 -05:00
Paul Dreik 6d14afd80e
Make threads optional in the cmake build (#376)
Only the simdjson library should optionally depend on threads,
the executables that link to simdjson will get the dependency
indirectly.

* add option for controlling threads (default is on)
* add CI testing with threading on/off for msvc, gcc and clang
* fix an unrelated copy-paste comment error in the circle ci build conf
2019-11-22 21:51:46 +01:00
Jeremie Piotte 29fc51522a
Introducing concurrency mode in JsonStream. (#373)
* JsonStream threaded prototype

* JsonStream Threaded version working. Still supporting non-threaded version.

* Fix where invalid files would enter infinite loop.

* SingleHeader update

* I will remove -pthread in cmake for now.

* Attempt at resolving the -pthread issue
2019-11-21 11:22:06 -05:00
Jeremie Piotte bdc2b07339
Streams of JSON documents + Large files (>4GB) (#350) (#364)
* rough prototype working. Needs more tests and fine-tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* type

* minor fixes and cleaning.

* minor fixes and cleaning.

* removing warnings

* removing some copies

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* making jsonstream portable

* cleaning imports

* including <algorithm> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* merged main into branch

* bug fix where JsonStream would bug on rare cases.

* Adding a JsonStream Demo to Amalgamation

* Fix for https://github.com/lemire/simdjson/issues/345

* Follow up test and fix for https://github.com/lemire/simdjson/issues/345 (#347)

* Final (?) fix for https://github.com/lemire/simdjson/issues/345

* Verbose basictest

* Being more forgiving of powers of ten.

* Let us zero the tail end.

* add basic fuzzers (#348)

* add basic fuzzing using libFuzzer

* let cmake respect cflags, otherwise the fuzzer flags go unnoticed

also, integrates badly with oss-fuzz

* add new fuzzer for minification, simplify the old one

* add fuzzer for the dump example

* clang format

* adding Paul Dreik

* rough prototype working. Needs more tests and fine-tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* type

* minor fixes and cleaning.

* Fixing issue 351 (#352)

* Fixing issues 351 and 353

* minor fixes and cleaning.

* removing warnings

* removing some copies

* Fix ARM compile errors on g++ 7.4 (#354)

* Fix ARM compilation errors

* Update singleheader

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* fix integer overflow in subnormal_power10 (#355)

detected by oss-fuzz

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714

* Adding new test file, following https://github.com/lemire/simdjson/pull/355

* making jsonstream portable

* cleaning imports

* including <algorithm> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* merged main into branch

* bug fix where JsonStream would bug on rare cases.

* Adding a JsonStream Demo to Amalgamation

* merging main

* make file fix
2019-11-08 17:39:45 -05:00
Daniel Lemire 3484dda45e Being more forgiving of powers of ten. 2019-10-24 18:27:24 -04:00
Daniel Lemire 1ece6c0e2f Verbose basictest 2019-10-24 16:40:40 -04:00
Daniel Lemire c469aed047
Follow up test and fix for https://github.com/lemire/simdjson/issues/345 (#347) 2019-10-24 16:06:29 -04:00
Daniel Lemire d0c0e31220 Increasing the ULP bound. 2019-10-18 17:30:29 -04:00