Commit Graph

183 Commits

Author SHA1 Message Date
John Keiser cfef4ff2ad Create parser.parse_many() API 2020-03-05 12:04:45 -08:00
John Keiser b3ea8c406e Add simdjson.cpp for unified use (#515) 2020-03-04 10:12:27 -08:00
John Keiser 99667f7c55 Create top level simdjson.h (#515)
- Allows everyone to #include the same way, singleheader or not.
2020-03-04 10:12:27 -08:00
John Keiser 0b21203141 Document navigation API 2020-03-02 14:49:03 -08:00
Daniel Lemire 68670301e3
Adding instructions regarding how to check for an unsupported CPU (#508)
* Adding instructions.

* Slighty more documentation.
2020-02-25 11:09:51 -05:00
John Keiser 910f272467
Add parser implementation interface and selection API (#501)
* Make architecture implementations virtual functions

- Easier to add new architectures (add implementation to implementation.cpp)
- Easier to add new algorithms / functions to architecture selection
(add to implementation.h, implement)
- Automatically select best implementation in static initialization
- Allow user to explicitly select implementation with a string (i.e.
parameter)
- Allow user to inspect current implementation name/description
- Allow user to list available implementations
- Eliminate architecture enum and architecture-based templating
- Add noexcept in non-inline functions

* Move implementation static methods to their own classes

* Detect best supported implementation on first use

* available_implementationsI() -> available_implementations
2020-02-21 16:34:27 -05:00
John Keiser 4dc2adf7f8 Update README, add README examples 2020-02-18 08:37:07 -08:00
John Keiser 8e7d1a5f09
Separate document state from ParsedJson
This creates a "document" class with only user-facing document state (no parser internals).

- document: user-facing document state
- document::iterator: iterator (equivalent of ParsedJsonIterator)
- document::parser: parser state plus a "docked" document we parse into (equivalent of ParsedJson)

Usage:

```c++
auto doc = simdjson::document::parse(buf, len); // less efficient but simplest
```

```c++
simdjson::document::parser parser; // reusable parser
parser.allocate_capacity(len);
simdjson::document* doc = parser.parse(buf, len); // pointer to doc inside parser
doc = parser.parse(buf2, len); // reuses all buffers and overwrites doc; more efficient
```
2020-02-07 10:02:36 -08:00
Daniel Lemire c924aaede9
Fix issue472: make JsonStream a template. (#473)
* Fix issue472: make JsonStream a template.

* Adding missing include.

* Tweaking headers and some minor formatting.

* Removing file from aggregation.

* Moving jsoncharutils

* Adding new header.

* Trying another header.

* Let us try to route around Visual Studio's nonesense.
2020-01-30 17:16:41 -05:00
Daniel Lemire 28710f8ad5
fix for Issue 467 (#469)
* Fix for issue467

* Updating single-header

* Let us make it so that JsonStream is constructed from a padded_string which will avoid dangerous overruns.

* Fixing parse_stream

* Updating documentation.
2020-01-29 19:00:18 -05:00
Daniel Lemire 33060738b6
Making the project tag in simdjson more explicit and disabling LTO (#452)
* Making the project tag in simdjson more explicit

* Let us disable deliberately LTO.
2020-01-20 10:18:58 -05:00
Daniel Lemire 6e5e0278c2 Exposing bug #420 2020-01-09 09:55:54 -05:00
Daniel Lemire 7bde23590a
Debugging jsonstream (#432)
Fixes #424 (and provide tests for it), as well as #401
2020-01-03 22:22:47 -05:00
Daniel Lemire ba9dc12164 Adding tests motivated by https://github.com/lemire/simdjson/pull/430 2020-01-02 14:20:51 -05:00
Daniel Lemire 1d621bba37 Being more explicit about EMPTY errors. 2019-12-18 14:39:48 +00:00
Daniel Lemire fc6133b58f
Fixes issue 388 (#394) 2019-12-11 08:13:29 -05:00
Paul Dreik 6d14afd80e
Make threads optional in the cmake build (#376)
Only the simdjson library should optionally depend on threads,
the executables that link to simdjson will get the dependency
indirectly.

* add option for controlling threads (default is on)
* add CI testing with threading on/off for msvc, gcc and clang
* fix an unrelated copy paste comment error in the cirlce ci build conf
2019-11-22 21:51:46 +01:00
Jeremie Piotte 29fc51522a
Introducing concurrency mode in JsonStream. (#373)
* JsonStream threaded prototype

* JsonStream Threaded version working. Still supporting non-threaded version.

* Fix where invalid files would enter infinite loop.

* SingleHeader update

* I will remove -pthread in cmake for now.

* Attempt at resolving the -pthread issue
2019-11-21 11:22:06 -05:00
Jeremie Piotte bdc2b07339
Streams of JSON documents + Large files (>4GB) (#350) (#364)
* rough prototype working.  Needs more test and fine tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* type

* minor fixes and cleaning.

* minor fixes and cleaning.

* removing warnings

* removing some copies

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* making jsonstream portable

* cleaning imports

* including <algorithms> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* merged main into branch

* bug fix where JsonStream would bug on rare cases.

* Addind a JsonStream Demo to Amalgamation

* Fix for https://github.com/lemire/simdjson/issues/345

* Follow up test and fix for https://github.com/lemire/simdjson/issues/345 (#347)

* Final (?) fix for https://github.com/lemire/simdjson/issues/345

* Verbose basictest

* Being more forgiving of powers of ten.

* Let us zero the tail end.

* add basic fuzzers (#348)

* add basic fuzzing using libFuzzer

* let cmake respect cflags, otherwise the fuzzer flags go unnoticed

also, integrates badly with oss-fuzz

* add new fuzzer for minification, simplify the old one

* add fuzzer for the dump example

* clang format

* adding Paul Dreik

* rough prototype working.  Needs more test and fine tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* type

* minor fixes and cleaning.

* Fixing issue 351 (#352)

* Fixing issues 351 and 353

* minor fixes and cleaning.

* removing warnings

* removing some copies

* Fix ARM compile errors on g++ 7.4 (#354)

* Fix ARM compilation errors

* Update singleheader

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* fix integer overflow in subnormal_power10 (#355)

detected by oss-fuzz

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714

* Adding new test file, following https://github.com/lemire/simdjson/pull/355

* making jsonstream portable

* cleaning imports

* including <algorithms> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* merged main into branch

* bug fix where JsonStream would bug on rare cases.

* Addind a JsonStream Demo to Amalgamation

* merging main

* rough prototype working.  Needs more test and fine tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* minor fixes and cleaning.

* minor fixes and cleaning.

* removing warnings

* removing some copies

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* making jsonstream portable

* cleaning imports

* including <algorithms> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* bug fix where JsonStream would bug on rare cases.

* Addind a JsonStream Demo to Amalgamation

* rough prototype working.  Needs more test and fine tuning.

* minor fixes and cleaning.

* adding jsonstream to amalgamation

* merged main into branch

* Addind a JsonStream Demo to Amalgamation

* merging main

* merging main

* make file fix
2019-11-08 17:39:45 -05:00
Daniel Lemire 3484dda45e Being more forgiving of powers of ten. 2019-10-24 18:27:24 -04:00
Daniel Lemire 1ece6c0e2f Verbose basictest 2019-10-24 16:40:40 -04:00
Daniel Lemire c469aed047
Follow up test and fix for https://github.com/lemire/simdjson/issues/345 (#347) 2019-10-24 16:06:29 -04:00
Daniel Lemire d0c0e31220 Increasing the ULP bound. 2019-10-18 17:30:29 -04:00
Daniel Lemire 9b7832c39a Adding another test (powers of two). 2019-10-16 17:47:52 -04:00
Daniel Lemire e3e29b720d Adding a check that powers of ten can be parsed (sanity check). 2019-10-16 16:27:50 -04:00
Daniel Lemire 92334a8e28 Better tests. 2019-09-02 12:32:44 -04:00
saka1 c1f27fb848 Accept large unsigned integers (#295)
* handle uint64 value in JSON
* Add integer_tests
* Add get_unsigned_integer() on  ParsedJson::BasicIterator
* Write 'u' to tape when the value seems unsigned
* Add to handle 'u' element
* Brush up integer_tests.cpp
* Append tests/integer_tests in .gitignore
* Add comments to is_integer and is_unsigned_integer
2019-09-02 10:50:24 -04:00
Daniel Lemire f667d4965d
This is a bug fix: our prev function was buggy. (#291) 2019-08-23 18:59:43 -04:00
Daniel Lemire 99a153d9e8
Hiding the pointer away... (#252)
* Hiding the runtime dispatch pointer in a source file so it is not an exported symbol
* Disabling hard failure on style check.
* Fixes https://github.com/lemire/simdjson/issues/250
2019-08-04 15:41:00 -04:00
John Keiser bf59ba76f5 Fix most warnings on VS2019 (#241) 2019-07-31 17:43:45 -04:00
ioioioio c2eea8abba Style uniformization (#238)
* massive clang-format -style=LLVM

* naming harmonization

* adding commentary about sysinfoapi.h
2019-07-30 17:18:10 -04:00
ioioioio bcabdfc1ae Json pointer (#220)
* json pointer support

* Addition of tests for the json pointer

* Adding a new tool for the JSON Pointer support, and some documentation.
2019-07-26 18:38:10 -04:00
Daniel Lemire e926b4b3c9
More accurate number parsing (#217)
* This drastically improves the accuracy (down to to a ULP of 1)

* More comments and documentation.
2019-07-15 22:17:49 -04:00
Daniel Lemire 14ee003907
Silencing warning in instances with char is unsigned. (#208)
* Silencing warning in instances with char is unsigned.

* Damn it.
2019-07-09 12:07:28 -04:00
Daniel Lemire 2b2d93b05f Various minor tweaks. 2019-07-04 17:19:05 -04:00
ioioioio 036f9d5a45 Merge branch 'master' of https://github.com/lemire/simdjson into Multiple_implementation_refactoring_stage2 2019-07-03 10:34:58 -04:00
ioioioio 3f24879157 Stage2 refactored to simplify multiple implementations 2019-07-02 17:12:00 -04:00
ioioioio b335af8507 Attemp to solve some singleheader's problems 2019-07-02 16:24:56 -04:00
ioioioio 9230588ce8 conflicts are solved 2019-07-02 15:21:00 -04:00
Daniel Lemire 471c71310b Various formatting issues in the tests directory. 2019-06-26 19:48:51 -04:00
Daniel Lemire f75280ac9c
Fix for issue 150 (#162)
* Checks for issue 150. We run through the test files with sanitizers on.

* Fix for issue 150: the remaining issues were an overrun on the depth capacity and an "off-by-1" overrun on tape capacity.

* Improving makefile.

* Safer git submodule command.

* Getting get 'git' on circleci
2019-05-09 20:51:33 -04:00
Daniel Lemire e370a65383
Fix for issues 32, 50, 131, 137
* Improving portability.

* Revisiting faulty logic regarding same-page overruns.

* Disabling same-page overruns under VS.

* Clarifying the documentation

* Fix for issue 131 + being more explicit regarding memory realloc.

* Fix for issue 137.

* removing "using namespace std" throughout. Fix for 50

* Introducing typed malloc/free.

* Introducing a custom class (padded_string) that solves several minor usability issues.

* Updating amalgamation for testing.
2019-05-09 17:59:51 -04:00
Daniel Lemire f0574d492c
Fix for issue 154 (#157)
* Changes necessary to reproduce

https://github.com/lemire/simdjson/issues/154

* Fixing issue 154.
2019-05-08 22:33:11 -04:00
Daniel Lemire 0d81fd287e
With this commit we can do all tests with full sanitizers on, and get no warning (#132)
* Making sure we can run with the sanitizers on.
* Minor code simplification in the number parsing.
* Following @EmilGedda 's recommendations regarding the makefile.
* Reference to blog post.
* Adding link to https://johnnylee-sde.github.io/Fast-numeric-string-to-int/
* Better hex parsing.
2019-04-24 17:31:47 -04:00
Emil Gedda 5eb3610a01 Fix parsing success check for simdjson's json_parse in allparserscheckfile.cpp and minor cleanup (#141)
* Fix parsing success check for simdjson's json_parse

* Remove trailing whitespace

* Remove redundant using declarations
2019-04-16 22:07:52 -04:00
Emil Gedda 9368d7c25c Fix syntax error introduced by 772919 (#138) 2019-04-13 09:43:28 -04:00
Nan Xiao 8fc77a4365 Fix build warning 2019-03-07 13:49:27 +08:00
Georgios Floros 0ca170b130 Fix crash in singleheader test (#105)
Use `aligned_free` instead of `free` to free memory allocated by aligned function.
2019-03-05 15:35:34 -05:00
myd7349 2851ea490c Export CMake targets (#96) 2019-03-04 16:07:06 -05:00
Thomas Navennec 352dd5e7fa Change parse_json return type from bool to int (#82)
* Added simdjerr namespace

* Updated jsonparser files

* updated stage1 and stage2

* removed stage2 inline function

* Added forgotten return statements

* Updated tools and benchmarks

* Corrected parenthesis

* Removed extra =

* Accidentally undid reinterpret_cast

* Better comments, undid a header name fuckup

* Added an errorMsg method, updated readme

* Removed useless header from stage2

* Updated single-header file

* added simdjerr.cpp contents to simdjson.cpp

* Made single header version work

* Updated singleheader test, fixed simdjson.cpp

* Renamed simdjerr namespace and files to simdjson

* Updating the amalgamation.
2019-03-02 17:18:45 -05:00
Thomas Navennec b0c4302887 Very basic test for single header (#91)
* Very basic test for single  header

* Changed the way the test is declared
2019-02-27 21:04:06 -05:00
Kai Wolf 772919ef11 Use unique_ptr instead of new/delete 2019-02-25 21:03:20 +01:00
Kai Wolf b521719b6f Fix old-style C-Casts 2019-02-23 17:31:38 +01:00
Kai Wolf ff22e75f95 Apply minor readability fixes 2019-02-23 17:28:20 +01:00
Daniel Lemire 1b115dbd3a Adding jsoncpp 2019-01-24 14:28:26 -05:00
Daniel Lemire 974babf69f Adding more competition. 2019-01-17 17:24:29 -05:00
Daniel Lemire 3ce1dd8087 Cleaning. 2018-12-31 17:13:32 -05:00
Daniel Lemire 58d41923fd
Porting to visual studio
Now builds on Visual Studio
2018-12-30 21:00:19 -05:00
Daniel Lemire 3b24ba9043 Adding cmake 2018-12-28 13:05:42 -05:00
Daniel Lemire bf4089b33b Removing custom types (more standard code). 2018-12-27 20:09:25 -05:00
Daniel Lemire f983703a2e A more robust testing program. 2018-12-11 18:01:26 -05:00
Daniel Lemire 0b48fb8bd7 Removing memory leaks. 2018-12-11 17:20:29 -05:00
Daniel Lemire a10e282eb4 Adding a missing free. 2018-12-11 17:03:36 -05:00
Daniel Lemire e8d3d784ab More fixing. 2018-12-10 22:21:03 -05:00
Daniel Lemire 7296d4d48b Fixing... 2018-12-10 17:39:19 -05:00
Daniel Lemire 176d2ccda4 Tweaking. 2018-12-10 14:25:49 -05:00
Daniel Lemire c8706c66ec Solving some build issues 2018-12-05 21:33:32 -05:00
Daniel Lemire c11eefca32 More cleaning. 2018-11-30 21:31:05 -05:00
Daniel Lemire a8b99984f2 Intermediate step. 2018-11-30 20:27:16 -05:00
Daniel Lemire e5707331e9 Some refactoring. 2018-11-30 09:37:57 -05:00
Daniel Lemire a43b0772e1 Lots and lots of cleaning. 2018-11-27 14:37:59 -05:00
Daniel Lemire 5fae7b2100 Still working 2018-11-27 10:10:39 -05:00
Daniel Lemire 5bdf19bb18 Removing parsers that are unfair. 2018-11-20 20:08:02 -05:00
Daniel Lemire 18633e02d2 Added more thorough testing. 2018-10-23 20:19:33 -04:00
Daniel Lemire 6cc5131f7a Adding an allparserscheckfile program. 2018-10-17 12:00:44 -04:00
Daniel Lemire 8f704fdb7c Making the test tougher. 2018-10-08 16:32:54 -04:00
Daniel Lemire 7eb7cd265a We can now parse crazy things like pi to 100 digits. 2018-10-08 15:24:16 -04:00
Daniel Lemire e2a3f751cf Counting numbers. 2018-10-04 09:48:00 -04:00
Daniel Lemire ecbe1158ed Added testing for number parsing. 2018-09-27 20:26:27 -04:00
Daniel Lemire e4094afe08 Moving toward having number-parsing testing. 2018-09-27 17:38:15 -04:00
Daniel Lemire 94ea7cefb0 Moving include files into a sensible subdirectory. 2018-08-20 17:51:38 -04:00
Daniel Lemire fb65be64bb Major surgery. 2018-08-20 17:27:25 -04:00
Daniel Lemire d204e54170 Moving tests to a separate file and directory. 2018-08-17 19:57:31 -04:00