Commit Graph

331 Commits

Author SHA1 Message Date
Daniel Lemire dcd0cb8080 Fix for https://github.com/lemire/simdjson/issues/58 (#168) 2019-05-19 12:25:27 -04:00
Daniel Lemire 47beaff152 Adding white-listing for memory sanitizer. 2019-05-19 11:18:54 -04:00
Dong Xie b98454d213 Add explicit conversion for leading and trailing zeros. (#161) 2019-05-09 20:56:13 -04:00
Daniel Lemire 954b89e762 New version (0.1.2). 2019-05-09 20:55:26 -04:00
Daniel Lemire f75280ac9c Fix for issue 150 (#162)
* Checks for issue 150. We run through the test files with sanitizers on.

* Fix for issue 150: the remaining issues were an overrun on the depth capacity and an "off-by-1" overrun on tape capacity.

* Improving makefile.

* Safer git submodule command.

* Getting 'git' on circleci
2019-05-09 20:51:33 -04:00
Daniel Lemire e370a65383 Fix for issues 32, 50, 131, 137
* Improving portability.

* Revisiting faulty logic regarding same-page overruns.

* Disabling same-page overruns under VS.

* Clarifying the documentation

* Fix for issue 131 + being more explicit regarding memory realloc.

* Fix for issue 137.

* Removing "using namespace std" throughout. Fix for issue 50.

* Introducing typed malloc/free.

* Introducing a custom class (padded_string) that solves several minor usability issues.

* Updating amalgamation for testing.
2019-05-09 17:59:51 -04:00
Heinz N. Gies c5a3f9ccd4 Add failing test for a json with content zero (#134)
* Add failing test for a json with content zero

* Mark 0 byte as false in structural_or_whitespace_or_exponent_or_decimal_negated
2019-05-09 12:24:22 -04:00
Daniel Lemire f0574d492c Fix for issue 154 (#157)
* Changes necessary to reproduce

https://github.com/lemire/simdjson/issues/154

* Fixing issue 154.
2019-05-08 22:33:11 -04:00
Daniel Lemire 39fcc62e85 Fixed typo 2019-05-08 13:42:30 -04:00
saka1 719dff1312 Add predicates to ParsedJson::iterator (#153) 2019-05-07 14:11:33 -04:00
Daniel Lemire 0d81fd287e With this commit we can do all tests with full sanitizers on, and get no warning (#132)
* Making sure we can run with the sanitizers on.
* Minor code simplification in the number parsing.
* Following @EmilGedda's recommendations regarding the makefile.
* Reference to blog post.
* Adding link to https://johnnylee-sde.github.io/Fast-numeric-string-to-int/
* Better hex parsing.
2019-04-24 17:31:47 -04:00
Daniel Lemire 681cd33698 Making the iterator a tad safer (tweaking the constructor so that it can throw). 2019-04-22 10:53:25 -04:00
Geoff Langdale 777b9c9a9e Unbreak x86. Durp. 2019-03-30 15:50:35 +11:00
Geoff Langdale 5ba29122fd First cut of ARM port. Needs hand-hacked Makefile. 2019-03-30 00:47:35 -04:00
saka1 ddc2867f94 Adjust format and comments on avxcheckOverlong (#129) 2019-03-25 10:06:27 -04:00
Geoff Langdale 2c23b375b2 Temporarily added a non-x86 definition of SIMDJSON_PADDING 2019-03-21 11:37:40 +11:00
Geoff Langdale 9b6d32346b Fixup portability.h to be more portable. 2019-03-21 11:25:51 +11:00
Daniel Lemire 374fe1af1e Updating comments regarding usage 2019-03-15 12:10:10 -04:00
Daniel Lemire bf9b1b1457 New version (mostly setting the singleheader version in sync). 2019-03-13 21:02:39 -04:00
Daniel Lemire d5a185b13e new release. 2019-03-13 20:02:44 -04:00
Daniel Lemire df8f792183 Store the string lengths on the string tape (#101)
* Store string length in the string-tape item.
* Files are now limited to 4GB.
* Moving detection of unescaped chars to stage 1 to reduce the burden due to string parsing.

Fixes https://github.com/lemire/simdjson/issues/114

Fixes https://github.com/lemire/simdjson/issues/87
2019-03-13 19:32:57 -04:00
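The commit above stores each string's length next to its bytes. The following is a minimal sketch of that general idea, assuming a hypothetical layout of a 4-byte length, then the raw bytes, then a terminating NUL. The names `append_string` and `read_string` are illustrative and are not taken from simdjson. A 32-bit length or offset field would be one plausible reason for a 4GB input limit, though the commit message does not spell out the cause.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical layout (an assumption, not simdjson's actual code):
//   [4-byte length][string bytes][NUL terminator]
// Appends one string and returns the offset at which its record starts.
inline std::size_t append_string(std::vector<std::uint8_t> &buf,
                                 const std::string &s) {
  const std::size_t offset = buf.size();
  // A 32-bit length field cannot describe more than 2^32 - 1 bytes.
  const std::uint32_t len = static_cast<std::uint32_t>(s.size());
  std::uint8_t header[sizeof(len)];
  std::memcpy(header, &len, sizeof(len));
  buf.insert(buf.end(), header, header + sizeof(len));
  buf.insert(buf.end(), s.begin(), s.end());
  buf.push_back(0);
  return offset;
}

// Reads the string back using the stored length rather than scanning for
// a NUL, so strings containing embedded NUL bytes survive intact.
inline std::string read_string(const std::vector<std::uint8_t> &buf,
                               std::size_t offset) {
  std::uint32_t len = 0;
  std::memcpy(&len, buf.data() + offset, sizeof(len));
  const char *start =
      reinterpret_cast<const char *>(buf.data() + offset + sizeof(len));
  return std::string(start, len);
}
```

Storing the length up front lets a reader recover a string without a second scan, and it keeps strings with embedded NUL characters unambiguous.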
Daniel Lemire 609e96b5d1 Fix for https://github.com/lemire/simdjson/issues/119 2019-03-13 11:01:31 -04:00
myd7349 d2fa086198 Fix C4146 build error on UWP with MSVC (#113)
* Fix C4146 build error on UWP with MSVC

* Regenerate single header version

* Fix typo in parsedjson.h

* Regenerate single header version
2019-03-09 08:46:06 -05:00
Thomas Navennec 352dd5e7fa Change parse_json return type from bool to int (#82)
* Added simdjerr namespace

* Updated jsonparser files

* updated stage1 and stage2

* removed stage2 inline function

* Added forgotten return statements

* Updated tools and benchmarks

* Corrected parenthesis

* Removed extra =

* Accidentally undid reinterpret_cast

* Better comments, undid a header name fuckup

* Added an errorMsg method, updated readme

* Removed useless header from stage2

* Updated single-header file

* added simdjerr.cpp contents to simdjson.cpp

* Made single header version work

* Updated singleheader test, fixed simdjson.cpp

* Renamed simdjerr namespace and files to simdjson

* Updating the amalgamation.
2019-03-02 17:18:45 -05:00
Daniel Lemire a24e701b4e First release (0.0.1) 2019-02-26 10:14:49 -05:00
geofflangdale bdc2bc693f Merge pull request #61 from NewProggie/fix_minor_problems
Fix minor problems
2019-02-26 20:50:03 +11:00
Kai Wolf 33341b60d8 Apply code review suggestions
- Undo explicit bool conversion
- Don't check for NULL before deleting pointer
2019-02-26 09:36:28 +01:00
Geoff Langdale 5289bf3eeb Fixing Utf8 validation question #72 2019-02-26 13:17:29 +11:00
Kai Wolf e7683820d5 Merge branch 'master' into fix_minor_problems 2019-02-25 21:05:29 +01:00
Kai Wolf 95e6fc2844 Fix CI errors 2019-02-25 20:55:07 +01:00
Wojciech Muła 7830b1be87 Use nothrow (#65)
* Use C++11 features

* Use std::nothrow

By default new throws std::bad_alloc, so no check code would be executed.
2019-02-25 14:36:45 -05:00
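The reasoning in the commit above can be illustrated with a short sketch. Plain `new` signals failure by throwing `std::bad_alloc`, so a null check placed after it never runs. `new (std::nothrow)` returns `nullptr` on failure, which makes the check meaningful. The helper names below are hypothetical and do not come from simdjson.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper: allocate a byte buffer and report failure through
// a null pointer instead of an exception.
inline std::uint8_t *allocate_buffer(std::size_t len) {
  // With plain `new`, a failed allocation throws std::bad_alloc, so any
  // `if (ptr == nullptr)` after it would be dead code. std::nothrow makes
  // `new` return nullptr on failure instead.
  return new (std::nothrow) std::uint8_t[len];
}

// Allocates, touches, and releases a buffer. Returns false if the
// allocation failed, which the caller can handle without try/catch.
inline bool try_allocate_and_release(std::size_t len) {
  assert(len > 0);
  std::uint8_t *buf = allocate_buffer(len);
  if (buf == nullptr) {
    return false; // reachable only because std::nothrow was used
  }
  buf[0] = 42;
  const bool ok = (buf[0] == 42);
  delete[] buf;
  return ok;
}
```

Calling `try_allocate_and_release(64)` returns `true` when the allocation succeeds. On failure it returns `false` rather than propagating an exception.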
Egor Bogatov 83ab72079f Add link to C# version (#66)
* fix noinline for MSVC

* Add SimdJsonSharp link to README.md
2019-02-25 14:17:43 -05:00
Kai Wolf b521719b6f Fix old-style C-Casts 2019-02-23 17:31:38 +01:00
Kai Wolf ff22e75f95 Apply minor readability fixes 2019-02-23 17:28:20 +01:00
Daniel Lemire 3640ab9dd3 Fixing the makefile build. 2019-02-22 15:34:35 -05:00
Thomas Navennec 9606343b2c ParsedJson & ParsedJson::iterator definitions in .cpp files (#47)
* Minor change to benchmark cmake

* Moved ParsedJson and its Iterator to separate .cpp files

* Uncommented functions that have nothing to do with this PR

* Removed really_inline comments

* Reinstated some inline functions to restore previous performance

* Re-merged iterator in ParsedJson

* Uncommented some WARN_UNUSED
2019-02-22 14:38:35 -05:00
Daniel Lemire 4d6ed2b2c1 Tape capacity increase by 32 bytes to allow for expected overflow. 2019-02-22 13:08:46 -05:00
Daniel Lemire 90c881a3de Invoking -mbmi 2019-01-16 13:26:24 -05:00
Daniel Lemire b5a2c41049 We need the Intel intrinsic. 2019-01-16 13:13:03 -05:00
Daniel Lemire 388f89f185 Working on improving portability 2019-01-16 12:33:06 -05:00
Daniel Lemire a00df9b992 Number parsing fix. 2019-01-04 17:36:52 -05:00
Daniel Lemire d94e8ee973 Fixing dead code. 2019-01-02 12:16:29 -05:00
Daniel Lemire 3ce1dd8087 Cleaning. 2018-12-31 17:13:32 -05:00
Daniel Lemire 58d41923fd Porting to visual studio
Now builds on Visual Studio
2018-12-30 21:00:19 -05:00
Daniel Lemire 46ef59c679 Cleaning. 2018-12-27 20:19:10 -05:00
Daniel Lemire bf4089b33b Removing custom types (more standard code). 2018-12-27 20:09:25 -05:00
Daniel Lemire 20133963bc Trying a detailed analysis. 2018-12-19 21:23:37 -05:00
Daniel Lemire 0a109508de Added documentation of the tape format. 2018-12-18 15:09:27 -05:00
Daniel Lemire 779ce184fb Getting ready to document the tape format. 2018-12-18 14:21:22 -05:00
Daniel Lemire 0769c39e27 Ok. Looks complete. 2018-12-14 21:32:42 -05:00
Daniel Lemire 05a2547829 Adding benchmark. 2018-12-12 22:42:19 -05:00
Daniel Lemire 751dce98f5 Getting there slowly. 2018-12-11 22:39:39 -05:00
Daniel Lemire e8d3d784ab More fixing. 2018-12-10 22:21:03 -05:00
Daniel Lemire 058eb917d1 Better doc. 2018-12-10 22:00:16 -05:00
Daniel Lemire e4703a383b Even safer. 2018-12-10 20:54:31 -05:00
Daniel Lemire 7296d4d48b Fixing... 2018-12-10 17:39:19 -05:00
Daniel Lemire 05636f3a1d Cleaning. 2018-12-10 16:47:02 -05:00
Daniel Lemire 7fda77d51a Mostly fixed performance regression. 2018-12-10 15:35:42 -05:00
Daniel Lemire 8615760331 Should now pass. 2018-12-10 15:16:31 -05:00
Daniel Lemire 176d2ccda4 Tweaking. 2018-12-10 14:25:49 -05:00
Daniel Lemire 52c4b65f1e Progress validating the API. 2018-12-09 20:47:02 -05:00
Daniel Lemire a56e92a571 API works now. 2018-12-09 13:08:41 -05:00
Daniel Lemire 747bb16919 Iteration API implemented but untested. 2018-12-07 23:35:53 -05:00
Daniel Lemire 9df22452af First API implementation. 2018-12-07 22:19:57 -05:00
Daniel Lemire 628e4e3522 Fix https://github.com/lemire/simdjson/issues/26 2018-12-06 22:51:55 -05:00
Daniel Lemire beb030fc16 Tweaking 2018-12-06 22:23:57 -05:00
Daniel Lemire c2913d5d69 Adding dynamic memory allocation. 2018-12-06 21:44:26 -05:00
Daniel Lemire 8589a0588b More clever parse function. 2018-12-06 17:40:32 -05:00
Daniel Lemire e2d2d2f8ff Adding more tests. 2018-12-06 17:22:22 -05:00
Daniel Lemire 196c41e3bc Fixed typo 2018-12-06 11:37:26 -05:00
Daniel Lemire 0f9a7a6b2f Slightly improving the number parsing using MMX. 2018-12-05 23:08:33 -05:00
Daniel Lemire c8706c66ec Solving some build issues 2018-12-05 21:33:32 -05:00
Daniel Lemire 4a4bf8d98d Fixed issue where the numbers don't appear properly after parsing. 2018-11-30 22:40:10 -05:00
Daniel Lemire e3a4b41c2e Cleaning. 2018-11-30 22:02:32 -05:00
Daniel Lemire c11eefca32 More cleaning. 2018-11-30 21:31:05 -05:00
Daniel Lemire a8b99984f2 Intermediate step. 2018-11-30 20:27:16 -05:00
Daniel Lemire e5707331e9 Some refactoring. 2018-11-30 09:37:57 -05:00
Daniel Lemire 12b518578d Ok, the new code seems quite fast. 2018-11-29 22:15:02 -05:00
Daniel Lemire ce85dd0c3a Still need to streamline number parsing. 2018-11-29 17:56:17 -05:00
Daniel Lemire c1de7662c1 Simplifying function call. 2018-11-28 11:12:28 -05:00
Daniel Lemire b858a404f7 Adding missing include 2018-11-28 10:29:57 -05:00
Daniel Lemire 8648c4108e More cleaning. 2018-11-27 20:42:35 -05:00
Daniel Lemire ba0f6fea51 Cleaning. 2018-11-27 17:38:53 -05:00
Daniel Lemire 58ac242770 Ok. Let us benchmark this thing. 2018-11-27 15:05:50 -05:00
Daniel Lemire a43b0772e1 Lots and lots of cleaning. 2018-11-27 14:37:59 -05:00
Daniel Lemire 5fae7b2100 Still working 2018-11-27 10:10:39 -05:00
Daniel Lemire 50defa510f Stupid work. 2018-11-26 16:55:24 -05:00
Daniel Lemire 08ae836aa1 Removing dead file. 2018-11-20 20:53:42 -05:00
Daniel Lemire 1fcd2688f8 Better documentation. 2018-11-20 12:59:06 -05:00
Daniel Lemire bbeb64a70b Cleaning documentation. 2018-11-20 12:54:06 -05:00
Daniel Lemire 78e75a8bae Even faster. 2018-11-20 11:56:10 -05:00
Daniel Lemire 7dd590c43c Saving faster version. 2018-11-20 11:02:39 -05:00
Daniel Lemire 47ae00895a Forgot to save... 2018-11-09 21:42:44 -05:00
Daniel Lemire 17f5d0517d Opting for a more common intrinsic. 2018-11-09 21:41:15 -05:00
Daniel Lemire 76074a821f Various cleaning steps. 2018-11-09 21:31:14 -05:00
Daniel Lemire 0e5b939568 Merge branch 'master' of github.com:lemire/simdjson 2018-11-09 15:16:25 -05:00
Daniel Lemire c1a7e79862 Lifting the mem limit. (Dirty commit.) 2018-11-09 15:16:05 -05:00
Daniel Lemire df65de4ae2 Tuning presentation and fixing a problem with minifier benchmark. 2018-10-23 21:36:32 -04:00
Daniel Lemire 18633e02d2 Added more thorough testing. 2018-10-23 20:19:33 -04:00
Daniel Lemire 9738af68c8 Fixing up the code point parsing. I think that what is there is now correct.
I believe that there was a case of early optimization.
2018-10-19 22:07:22 -04:00
Daniel Lemire 8315f4c888 Cleaning up the code. 2018-10-17 21:31:22 -04:00
Daniel Lemire 35381279c3 Maybe we can do away with the fast ASCII trick. 2018-10-17 21:05:38 -04:00
Daniel Lemire e517414080 We include character-encoding validation. 2018-10-17 19:22:09 -04:00
Daniel Lemire 355e5d2ed3 Checking for unescaped chars. 2018-10-17 15:08:49 -04:00
Daniel Lemire 7eb7cd265a We can now parse crazy things like pi to 100 digits. 2018-10-08 15:24:16 -04:00
Daniel Lemire 70c122074f Tests. 2018-10-08 14:41:36 -04:00
Daniel Lemire 37adea9387 Adding a comment. 2018-09-30 14:44:30 -04:00
Daniel Lemire 314356d561 We have faster number parsing...? 2018-09-28 18:26:27 -04:00
Daniel Lemire 4ee515fa4b The new number parsing code is faster. 2018-09-28 14:45:34 -04:00
Daniel Lemire 57b840327f Faster number parsing? 2018-09-28 14:38:40 -04:00
Geoff Langdale 1e5d8ece56 Update API a bit 2018-09-28 14:59:30 +10:00
Geoff Langdale 89fd074ec9 Draft API. No implementation yet. 2018-09-28 14:55:57 +10:00
Geoff Langdale ceb55cc8db Pick new number parser as winner; move string parsing to own header 2018-09-28 14:27:48 +10:00
Daniel Lemire ecbe1158ed Added testing for number parsing. 2018-09-27 20:26:27 -04:00
Daniel Lemire e4094afe08 Moving toward having number-parsing testing. 2018-09-27 17:38:15 -04:00
Daniel Lemire 7606a43aa9 Merge branch 'master' of github.com:lemire/simdjson 2018-09-26 23:36:19 -04:00
Daniel Lemire 1c8339297d With new number parser (faster!). Removing the dependency on the doubleconv library (which proves to be useless). 2018-09-26 23:35:33 -04:00
Geoff Langdale ccb3670c7c DEBUG mode fixes. 2018-09-27 13:10:33 +10:00
Geoff Langdale 9f91650e72 Remove old 4-stage path. 2018-09-26 15:22:55 +10:00
Geoff Langdale c4c51627d3 Fix compile - jsonparser needs to include unified header 2018-09-26 11:33:35 +10:00
Geoff Langdale 682c224d1a Merge branch 'master' of https://github.com/lemire/simdjson 2018-09-26 11:29:23 +10:00
Geoff Langdale b0c05c03cc Fix linkage between call sites and headers, add dump code, cleanup 2018-09-26 11:28:22 +10:00
Daniel Lemire dee1bbe54e Integrating the new 3-stage approach. 2018-09-25 17:26:58 -04:00
Geoff Langdale 555926849d Bug cleanup (many vestiges of old 32-bit tape still there) and more encapsulation of tapes. 2018-09-25 16:24:39 +10:00
Geoff Langdale 053f04b15d Crude first cut of "stage34", a unified code-based DFA with explicit stack for stages 3 and 4. 2018-09-24 10:42:30 +10:00
Daniel Lemire 2aa6b93a02 Using a naive strtoll 2018-08-28 22:37:11 -04:00
Daniel Lemire 6807abff96 Made the code safer (at the expense of the memory usage). 2018-08-24 13:20:20 -04:00
Daniel Lemire 94ea7cefb0 Moving include files into a sensible subdirectory. 2018-08-20 17:51:38 -04:00
Daniel Lemire ef0d14c35c Minor fixes + new scripts. 2018-08-20 17:40:50 -04:00
Daniel Lemire fb65be64bb Major surgery. 2018-08-20 17:27:25 -04:00
Daniel Lemire 726eb5a030 Moved the files into subdirectories. 2018-08-20 14:45:51 -04:00