* If we are going to have a Google Benchmark flag, we had better test it at least minimally (it should build).
* Fix bench_dom_api
Co-authored-by: John Keiser <john@johnkeiser.com>
* Make architecture implementations virtual functions
- Easier to add new architectures (add implementation to implementation.cpp)
- Easier to add new algorithms / functions to architecture selection
(add to implementation.h, implement)
- Automatically select best implementation in static initialization
- Allow user to explicitly select implementation with a string (i.e.
parameter)
- Allow user to inspect current implementation name/description
- Allow user to list available implementations
- Eliminate architecture enum and architecture-based templating
- Add noexcept in non-inline functions
* Move implementation static methods to their own classes
* Detect best supported implementation on first use
* available_implementationsI() -> available_implementations
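The virtual-function approach described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the `sketch` namespace, the class names, `supported_by_cpu`, `select_implementation` and the toy `count_open_braces` algorithm are all hypothetical, and CPU detection is stubbed out.

```c++
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical interface: one subclass per architecture.
class implementation {
public:
  virtual ~implementation() = default;
  virtual std::string name() const = 0;
  virtual std::string description() const = 0;
  virtual bool supported_by_cpu() const noexcept = 0;
  // Stand-in for a real algorithm: counts '{' bytes in a buffer.
  virtual std::size_t count_open_braces(const char *buf, std::size_t len) const noexcept = 0;
};

class fallback_implementation : public implementation {
public:
  std::string name() const override { return "fallback"; }
  std::string description() const override { return "Generic scalar code"; }
  bool supported_by_cpu() const noexcept override { return true; }
  std::size_t count_open_braces(const char *buf, std::size_t len) const noexcept override {
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; i++) { count += (buf[i] == '{'); }
    return count;
  }
};

// Pretends to be a SIMD build; detection is stubbed to "unsupported" here.
class pretend_simd_implementation : public fallback_implementation {
public:
  std::string name() const override { return "pretend_simd"; }
  std::string description() const override { return "Placeholder for a SIMD kernel"; }
  bool supported_by_cpu() const noexcept override { return false; }
};

// Lists every compiled-in implementation, best first.
inline const std::vector<const implementation *> &available_implementations() {
  static const pretend_simd_implementation simd;
  static const fallback_implementation fallback;
  static const std::vector<const implementation *> list{&simd, &fallback};
  return list;
}

// The function-local static runs the detection on first use.
inline const implementation *&active_implementation() {
  static const implementation *active = [] {
    for (const implementation *impl : available_implementations()) {
      if (impl->supported_by_cpu()) { return impl; }
    }
    return static_cast<const implementation *>(nullptr);
  }();
  return active;
}

// Explicit selection by name; refuses unknown or unsupported names.
inline bool select_implementation(const std::string &name) {
  for (const implementation *impl : available_implementations()) {
    if (impl->name() == name && impl->supported_by_cpu()) {
      active_implementation() = impl;
      return true;
    }
  }
  return false;
}

} // namespace sketch
```

Adding an architecture in this sketch means adding one subclass and one entry in `available_implementations()`; callers only ever go through `active_implementation()`.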
This creates a "document" class with only user-facing document state (no parser internals).
- document: user-facing document state
- document::iterator: iterator (equivalent of ParsedJsonIterator)
- document::parser: parser state plus a "docked" document we parse into (equivalent of ParsedJson)
Usage:
```c++
auto doc = simdjson::document::parse(buf, len); // less efficient but simplest
```
```c++
simdjson::document::parser parser; // reusable parser
parser.allocate_capacity(len);
simdjson::document* doc = parser.parse(buf, len); // pointer to doc inside parser
doc = parser.parse(buf2, len); // reuses all buffers and overwrites doc; more efficient
```
* Fix for issue467
* Updating single-header
* Construct JsonStream from a padded_string, which avoids dangerous overruns.
* Fixing parse_stream
* Updating documentation.
* This reverts the code to how it was prior to the silly "run two stages" routine and instead
adds an option to benchmark the code over hot buffers. It turns out that allocating the pages
can be expensive when the files are large.
* Instead of emulating the whole parsing as stage 1 + stage 2, let us
benchmark the real thing.
* Adding explicit constructor.
* Adding warning to the benchmark user.
* Making re-running optional.
* string literal + integer is unintended and incorrect pointer arithmetic
This fixes a clang warning. The bug could not be triggered in practice, because it can
only be triggered if an option in the string given to getopt is not covered by the
cases in the switch.
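For readers unfamiliar with this class of warning, here is a minimal sketch of the pattern and one common fix. The function name and message text are hypothetical; the actual code touched by this commit is not shown in this log.

```c++
#include <string>

// Buggy pattern (kept as a comment so the sketch compiles cleanly):
//
//   int c = ...;                              // option character from getopt
//   std::string msg = "unknown option: " + c; // pointer arithmetic!
//
// "literal" + c does not append a character. The literal decays to a
// const char*, and adding c moves that pointer c bytes forward, usually
// past the end of the literal.

// One fix: make the left operand a std::string so that operator+ appends.
inline std::string unknown_option_message(int c) {
  return std::string("unknown option: ") + static_cast<char>(c);
}
```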
* handle review comment
* I think we can align the numbers better (so it is prettier).
* Remove space before %, align third line better
Co-authored-by: John Keiser <john@johnkeiser.com>
Only the simdjson library should optionally depend on threads,
the executables that link to simdjson will get the dependency
indirectly.
* add option for controlling threads (default is on)
* add CI testing with threading on/off for msvc, gcc and clang
* fix an unrelated copy-paste comment error in the CircleCI build conf
* JsonStream threaded prototype
* JsonStream Threaded version working. Still supporting non-threaded version.
* Fix where invalid files would enter infinite loop.
* SingleHeader update
* I will remove -pthread in cmake for now.
* Attempt at resolving the -pthread issue
* rough prototype working. Needs more tests and fine-tuning.
* prototype working on large files.
* Adding benchmarks
* jsonstream API adjustment
* type
* minor fixes and cleaning.
* removing warnings
* removing some copies
* runtime dispatch error fix
* makefile linking src/jsonstream.cpp
* fixing arm stage 1 headers
* fixing stage 2 headers
* fixing stage 1 arm header
* making jsonstream portable
* cleaning imports
* including <algorithm> for windows compiler
* cleaning benchmark imports
* adding jsonstream to amalgamation
* merged main into branch
* fix a bug where JsonStream would fail in rare cases.
* Adding a JsonStream Demo to Amalgamation
* Fix for https://github.com/lemire/simdjson/issues/345
* Follow up test and fix for https://github.com/lemire/simdjson/issues/345 (#347)
* Final (?) fix for https://github.com/lemire/simdjson/issues/345
* Verbose basictest
* Being more forgiving of powers of ten.
* Let us zero the tail end.
* add basic fuzzers (#348)
* add basic fuzzing using libFuzzer
* let cmake respect cflags, otherwise the fuzzer flags go unnoticed
also, integrates badly with oss-fuzz
* add new fuzzer for minification, simplify the old one
* add fuzzer for the dump example
* clang format
* adding Paul Dreik
* Fixing issue 351 (#352)
* Fixing issues 351 and 353
* Fix ARM compile errors on g++ 7.4 (#354)
* Fix ARM compilation errors
* Update singleheader
* fix integer overflow in subnormal_power10 (#355)
detected by oss-fuzz
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714
* Adding new test file, following https://github.com/lemire/simdjson/pull/355
* merging main
* make file fix
* Allow -f
* Support parse -s (force sse)
* Simplify flatten_bits
- Add directly to base instead of storing variable
- Don't modify base_ptr after beginning of function
- Eliminate base variable and increment base_ptr instead
* De-unroll the flatten_bits loops
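As a rough illustration of what a de-unrolled flatten_bits loop does, the sketch below writes the position of every set bit in a 64-bit mask to an output array. The signature and the `trailing_zeroes` helper are assumptions made for this example; production code would use a compiler intrinsic for the bit scan.

```c++
#include <cstdint>

// Portable count-trailing-zeros. Must only be called with bits != 0.
inline uint32_t trailing_zeroes(uint64_t bits) {
  uint32_t n = 0;
  while ((bits & 1) == 0) {
    bits >>= 1;
    ++n;
  }
  return n;
}

// For every set bit in `bits`, writes (idx + bit position) through base_ptr.
// Returns the pointer one past the last index written.
inline uint32_t *flatten_bits(uint32_t *base_ptr, uint32_t idx, uint64_t bits) {
  while (bits != 0) {
    *base_ptr++ = idx + trailing_zeroes(bits);
    bits &= bits - 1; // clear the lowest set bit
  }
  return base_ptr;
}
```

The caller keeps the returned pointer and passes it back in for the next 64-byte block, so no separate index variable is needed.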
* Decrease dependencies in stage 1
- Do all finalize_structurals work before computing the quote mask; mask
out the quote mask later
- Join find_whitespace_and_structurals and finalize_structurals into
single find_structurals call, to reduce variable leakage
- Rework pseudo_pred algorithm to refer to "primitive" for clarity and some
dependency reduction
- Rename quote_mask to in_string to describe what we're trying to
achieve ("mask" could mean many things)
- Break up find_quote_mask_and_bits into find_quote_mask and
invalid_string_bytes to reduce data leakage (i.e. don't expose quote bits
or odd_ends at all to find_structural_bits)
- Genericize overflow methods "follows" and "follows_odd_sequence" for
descriptiveness and possible lifting into a generic simd parsing library
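The in_string idea can be sketched with a portable prefix XOR over a bitmask of unescaped quotes. This is an illustration under stated assumptions: the function names and the carry convention (`prev_in_string` is all ones or all zeros) are hypothetical, and SIMD code would typically compute the prefix XOR with a carry-less multiply instead of shifts.

```c++
#include <cstdint>

// Bit i of the result is the XOR of bits 0..i of x.
inline uint64_t prefix_xor(uint64_t x) {
  x ^= x << 1;
  x ^= x << 2;
  x ^= x << 4;
  x ^= x << 8;
  x ^= x << 16;
  x ^= x << 32;
  return x;
}

// quote_bits: bit i is set if byte i of the block is an unescaped quote.
// prev_in_string: all ones if the previous block ended inside a string,
// zero otherwise; updated for the next block.
// Result: bit i is set if byte i is inside a string. The opening quote is
// included and the closing quote is not.
inline uint64_t in_string_mask(uint64_t quote_bits, uint64_t &prev_in_string) {
  const uint64_t in_string = prefix_xor(quote_bits) ^ prev_in_string;
  prev_in_string = uint64_t(0) - (in_string >> 63); // broadcast the top bit
  return in_string;
}
```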
* Mark branches as likely/unlikely
* Reorder and unroll+interleave stage 1 loop
* Nest the cnt > 16 branch inside cnt > 8
* handle uint64 value in JSON
* Add integer_tests
* Add get_unsigned_integer() on ParsedJson::BasicIterator
* Write 'u' to tape when the value seems unsigned
* Add handling of the 'u' element
* Brush up integer_tests.cpp
* Append tests/integer_tests in .gitignore
* Add comments to is_integer and is_unsigned_integer
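The signed/unsigned decision above can be sketched as follows. The function name is hypothetical, and the `'l'` marker for signed values is an assumption (only `'u'` is named in this log). The sketch checks digits and overflow only, not the full JSON number grammar.

```c++
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>

// Returns 'l' if the text fits in int64_t, 'u' if it only fits in uint64_t,
// and 0 if it is not a plain integer or does not fit in 64 bits.
inline char integer_tape_type(const std::string &text) {
  std::size_t i = 0;
  bool negative = false;
  if (i < text.size() && text[i] == '-') {
    negative = true;
    ++i;
  }
  if (i == text.size()) { return 0; } // empty, or a lone minus sign
  const uint64_t max = std::numeric_limits<uint64_t>::max();
  uint64_t value = 0;
  for (; i < text.size(); ++i) {
    if (text[i] < '0' || text[i] > '9') { return 0; }
    const uint64_t digit = static_cast<uint64_t>(text[i] - '0');
    if (value > (max - digit) / 10) { return 0; } // would overflow uint64_t
    value = value * 10 + digit;
  }
  const uint64_t int64_max =
      static_cast<uint64_t>(std::numeric_limits<int64_t>::max());
  if (negative) { return value <= int64_max + 1 ? 'l' : 0; }
  return value <= int64_max ? 'l' : 'u';
}
```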
* Add -n and -w arguments
* Add Dockerfile that compares perf against master
* Add checkperf to .drone.yml
* Clone from github instead of .git since CI doesn't have .git