Commit Graph

221 Commits

Author SHA1 Message Date
Daniel Lemire 642132920f Fixing performance regression caused by helpful code contributions
that moved inlineable functions into the source file combined with
helpful compilers which aren't smart enough to do the inlining in
any case.
2019-05-31 18:16:12 -04:00
Daniel Lemire 8526387acb
Improving error codes. (#176)
* This commit adds new error codes.
2019-05-24 17:28:56 -04:00
Daniel Lemire 17ac5c0525
This adds guards so that we can better detect the case where we have neither AVX2 nor ARM NEON. (#173) 2019-05-24 17:26:29 -04:00
Daniel Lemire 43dba8ac7f
A slightly better "flatten"? (#166)
* This seems beneficial.
2019-05-19 12:33:45 -04:00
Daniel Lemire dcd0cb8080
Fix for https://github.com/lemire/simdjson/issues/58 (#168) 2019-05-19 12:25:27 -04:00
Daniel Lemire 47beaff152 Adding white-listing for memory sanitizer. 2019-05-19 11:18:54 -04:00
Daniel Lemire f75280ac9c
Fix for issue 150 (#162)
* Checks for issue 150. We run through the test files with sanitizers on.

* Fix for issue 150: the remaining issues were an overrun on the depth capacity and an "off-by-1" overrun on tape capacity.

* Improving makefile.

* Safer git submodule command.

* Getting 'git' on circleci
2019-05-09 20:51:33 -04:00
Daniel Lemire e370a65383
Fix for issues 32, 50, 131, 137
* Improving portability.

* Revisiting faulty logic regarding same-page overruns.

* Disabling same-page overruns under VS.

* Clarifying the documentation

* Fix for issue 131 + being more explicit regarding memory realloc.

* Fix for issue 137.

* removing "using namespace std" throughout. Fix for 50

* Introducing typed malloc/free.

* Introducing a custom class (padded_string) that solves several minor usability issues.

* Updating amalgamation for testing.
2019-05-09 17:59:51 -04:00
Daniel Lemire 20cda07eef
Minor grammatical thing ("an integer" vs "a integer") 2019-05-09 10:48:31 -04:00
Heinz N. Gies c1975166a0 False atom fix (#156)
* Add failing test for falsy atom

* Fix false atom parsing
2019-05-09 10:45:42 -04:00
Daniel Lemire f0574d492c
Fix for issue 154 (#157)
* Changes necessary to reproduce

https://github.com/lemire/simdjson/issues/154

* Fixing issue 154.
2019-05-08 22:33:11 -04:00
technateNG 6f0d350f2c Fix to issue #148. (#151)
* Issue #148 fix.

* Test cases for issue #148.
2019-05-07 20:56:36 -04:00
saka1 719dff1312 Add predicates to ParsedJson::iterator (#153) 2019-05-07 14:11:33 -04:00
Daniel Lemire 681cd33698 Making the iterator a tad safer (tweaking the constructor so that it can throw). 2019-04-22 10:53:25 -04:00
Dong Xie 1153778f92 fix a bug in copy constructor of ParsedJson::iterator. (#146) 2019-04-22 10:37:02 -04:00
Geoff Langdale 0250352139 Merge branch 'master' of https://github.com/lemire/simdjson 2019-04-01 02:08:15 -04:00
Geoff Langdale 134ba8d1dd Ratty version of transposed ARM SIMD stuff. Needs cleanup. 2019-04-01 02:07:38 -04:00
Geoff Langdale 777b9c9a9e Unbreak x86. Durp. 2019-03-30 15:50:35 +11:00
Geoff Langdale 5ba29122fd First cut of ARM port. Needs hand-hacked Makefile. 2019-03-30 00:47:35 -04:00
Geoff Langdale b4c815a60c Concentrate and encapsulate SIMD use somewhat in preparation for ARM port. 2019-03-21 15:15:41 +11:00
Geoff Langdale 473ab12a0a Stage 2 doesn't need to know about intrinsics either (for itself) 2019-03-21 11:41:15 +11:00
Daniel Lemire df8f792183
Store the string lengths on the string tape (#101)
* Store string length in the string-tape item.
* Files are now limited to 4GB.
* Moving detection of unescaped chars to stage 1 to reduce the burden due to string parsing.

Fixes https://github.com/lemire/simdjson/issues/114

Fixes https://github.com/lemire/simdjson/issues/87
2019-03-13 19:32:57 -04:00
Tyler Kennedy 21eef55907 Changes to the behaviour of move_forward to make it suitable for iteration. (Closes #73) (#103) 2019-03-06 12:13:55 -05:00
Georgios Floros d873ee9983 Reuse aligned_malloc (#108) 2019-03-06 10:12:55 -05:00
Geoff Langdale 6628c365c9 Substantial refactor (and clang-format google style) of stage1_find_marks.cpp 2019-03-06 11:09:50 +11:00
myd7349 2851ea490c Export CMake targets (#96) 2019-03-04 16:07:06 -05:00
Thomas Navennec 352dd5e7fa Change parse_json return type from bool to int (#82)
* Added simdjerr namespace

* Updated jsonparser files

* updated stage1 and stage2

* removed stage2 inline function

* Added forgotten return statements

* Updated tools and benchmarks

* Corrected parenthesis

* Removed extra =

* Accidentally undid reinterpret_cast

* Better comments, undid a header name fuckup

* Added an errorMsg method, updated readme

* Removed useless header from stage2

* Updated single-header file

* added simdjerr.cpp contents to simdjson.cpp

* Made single header version work

* Updated singleheader test, fixed simdjson.cpp

* Renamed simdjerr namespace and files to simdjson

* Updating the amalgamation.
2019-03-02 17:18:45 -05:00
greedengineer 0c8ee105b4 fix memory free (#86) 2019-02-27 07:50:20 -05:00
Carmot a22c20fab0 Removed unnecessary check and objects release. (#79) 2019-02-26 19:08:13 -05:00
M. Zhou 3d48628e71 CMake: Add version and soversion to library target properties. (#76) 2019-02-26 16:39:59 -05:00
Kai Wolf 33341b60d8 Apply code review suggestions
- Undo explicit bool conversion
- Don't check for NULL before deleting pointer
2019-02-26 09:36:28 +01:00
Kai Wolf e7683820d5
Merge branch 'master' into fix_minor_problems 2019-02-25 21:05:29 +01:00
Wojciech Muła 7830b1be87 Use nothrow (#65)
* Use C++11 features

* Use std::nothrow

By default new throws std::bad_alloc, so no check code would be executed.
2019-02-25 14:36:45 -05:00
Kai Wolf b521719b6f Fix old-style C-Casts 2019-02-23 17:31:38 +01:00
Kai Wolf ff22e75f95 Apply minor readability fixes 2019-02-23 17:28:20 +01:00
Geoff Langdale 3d30fd5440 Fixed a stage number message and we now fail out if no structural chars from stage 1 2019-02-23 10:51:45 +11:00
Daniel Lemire 389f8b514e Porting recently introduced fix. 2019-02-22 14:39:21 -05:00
Thomas Navennec 9606343b2c ParsedJson & ParsedJson::iterator definitions in .cpp files (#47)
* Minor change to benchmark cmake

* Moved ParsedJson and its Iterator to separate .cpp files

* Uncommented functions that have nothing to do with this PR

* Removed really_inline comments

* Reinstated some inline functions to restore previous performance

* Re-merged iterator in ParsedJson

* Uncommented some WARN_UNUSED
2019-02-22 14:38:35 -05:00
Daniel Lemire 35eceaf1c4 Merge branch 'master' of github.com:lemire/simdjson 2019-01-24 14:29:02 -05:00
Daniel Lemire 2ca29edab3 Fixing bad spacing. 2019-01-23 20:36:23 -05:00
Daniel Lemire 85d8c9dbf0 Minor code cleaning 2019-01-18 15:21:54 -05:00
Daniel Lemire 084dfbc2b3 Fix for name change. 2019-01-16 11:29:52 -05:00
Daniel Lemire ca5f3349e6 Removing useless macro. 2019-01-08 10:34:21 -05:00
Daniel Lemire bad32be5f6 Merge branch 'stage12unified_attempt2' 2018-12-31 17:33:01 -05:00
Daniel Lemire 3ce1dd8087 Cleaning. 2018-12-31 17:13:32 -05:00
Daniel Lemire d7d568ee89 Trying something else. 2018-12-31 16:25:37 -05:00
Daniel Lemire 3a2f746602 Trying with just a unified stage 3. 2018-12-31 14:03:43 -05:00
Daniel Lemire 0317287071 Minor fix 2018-12-31 11:59:12 -05:00
Daniel Lemire 58d41923fd
Porting to visual studio
Now builds on Visual Studio
2018-12-30 21:00:19 -05:00
Daniel Lemire 386bebb33b adding support for cmake. 2018-12-28 13:13:10 -05:00
Daniel Lemire 3b24ba9043 Adding cmake 2018-12-28 13:05:42 -05:00
Daniel Lemire 46ef59c679 Cleaning. 2018-12-27 20:19:10 -05:00
Daniel Lemire bf4089b33b Removing custom types (more standard code). 2018-12-27 20:09:25 -05:00
Daniel Lemire 20133963bc Trying a detailed analysis. 2018-12-19 21:23:37 -05:00
Daniel Lemire 0a109508de Added documentation of the tape format. 2018-12-18 15:09:27 -05:00
Daniel Lemire 779ce184fb Getting ready to document the tape format. 2018-12-18 14:21:22 -05:00
Daniel Lemire e8d3d784ab More fixing. 2018-12-10 22:21:03 -05:00
Daniel Lemire 058eb917d1 Better doc. 2018-12-10 22:00:16 -05:00
Daniel Lemire e4703a383b Even safer. 2018-12-10 20:54:31 -05:00
Daniel Lemire 8a2281269c Cleaning memset and adding more tests. 2018-12-10 19:20:52 -05:00
Daniel Lemire 7296d4d48b Fixing... 2018-12-10 17:39:19 -05:00
Daniel Lemire 05636f3a1d Cleaning. 2018-12-10 16:47:02 -05:00
Daniel Lemire 176d2ccda4 Tweaking. 2018-12-10 14:25:49 -05:00
Daniel Lemire 9df22452af First API implementation. 2018-12-07 22:19:57 -05:00
Daniel Lemire c2913d5d69 Adding dynamic memory allocation. 2018-12-06 21:44:26 -05:00
Daniel Lemire e2d2d2f8ff Adding more tests. 2018-12-06 17:22:22 -05:00
Daniel Lemire e3a4b41c2e Cleaning. 2018-11-30 22:02:32 -05:00
Daniel Lemire c11eefca32 More cleaning. 2018-11-30 21:31:05 -05:00
Daniel Lemire 0e4804137c It still works. 2018-11-30 20:36:56 -05:00
Daniel Lemire a8b99984f2 Intermediate step. 2018-11-30 20:27:16 -05:00
Daniel Lemire e5707331e9 Some refactoring. 2018-11-30 09:37:57 -05:00
Daniel Lemire 12b518578d Ok, the new code seems quite fast. 2018-11-29 22:15:02 -05:00
Daniel Lemire ce85dd0c3a Still need to streamline number parsing. 2018-11-29 17:56:17 -05:00
Daniel Lemire c1de7662c1 Simplifying function call. 2018-11-28 11:12:28 -05:00
Daniel Lemire c1805783fc Tweaking performance. 2018-11-27 21:13:31 -05:00
Daniel Lemire 8648c4108e More cleaning. 2018-11-27 20:42:35 -05:00
Daniel Lemire 58ac242770 Ok. Let us benchmark this thing. 2018-11-27 15:05:50 -05:00
Daniel Lemire a43b0772e1 Lots and lots of cleaning. 2018-11-27 14:37:59 -05:00
Daniel Lemire 5fae7b2100 Still working 2018-11-27 10:10:39 -05:00
Daniel Lemire 50defa510f Stupid work. 2018-11-26 16:55:24 -05:00
Daniel Lemire 86a75462c5 Adding the ability of doing a dump. 2018-11-23 22:20:57 -05:00
Daniel Lemire 17f5d0517d Opting for a more common intrinsic. 2018-11-09 21:41:15 -05:00
Daniel Lemire 76074a821f Various cleaning steps. 2018-11-09 21:31:14 -05:00
Daniel Lemire 0e5b939568 Merge branch 'master' of github.com:lemire/simdjson 2018-11-09 15:16:25 -05:00
Daniel Lemire c1a7e79862 Lifting the mem limit. (Dirty commit.) 2018-11-09 15:16:05 -05:00
Daniel Lemire df65de4ae2 Tuning presentation and fixing a problem with minifier benchmark. 2018-10-23 21:36:32 -04:00
Daniel Lemire 8315f4c888 Cleaning up the code. 2018-10-17 21:31:22 -04:00
Daniel Lemire 35381279c3 Maybe we can do away with the fast ASCII trick. 2018-10-17 21:05:38 -04:00
Daniel Lemire e517414080 We include character-encoding validation. 2018-10-17 19:22:09 -04:00
Daniel Lemire 9fc8d8444b We want to allow more than just arrays and objects, as per the JSON spec. 2018-10-17 13:57:42 -04:00
Daniel Lemire 2ad9891b66 I think NO_PDEP_PLEASE should be defined by default. It seems
to be generally better/faster. More instructions, but also
more instructions per cycle, so it ends up being a net win.
2018-10-03 21:42:27 -04:00
Geoff Langdale ceb55cc8db Pick new number parser as winner; move string parsing to own header 2018-09-28 14:27:48 +10:00
Daniel Lemire e4094afe08 Moving toward having number-parsing testing. 2018-09-27 17:38:15 -04:00
Daniel Lemire 1c8339297d With new number parser (faster!). Removing the dependency on the doubleconv library (which proves to be useless). 2018-09-26 23:35:33 -04:00
Daniel Lemire 6239b9c13e Overallocation 2018-09-26 14:20:28 -04:00
Geoff Langdale 9f91650e72 Remove old 4-stage path. 2018-09-26 15:22:55 +10:00
Geoff Langdale b9706d462c Minor cleanups. 2018-09-26 15:09:54 +10:00
Geoff Langdale 36fadde3c7 Minor twiddles. 2018-09-26 13:52:05 +10:00
Geoff Langdale 0d5797a827 Wrap the tape dump in debug code. 2018-09-26 13:28:16 +10:00
Geoff Langdale e9586b6b4d Very first char is considered to follow "whitespace" for pseudo-structural character detection purposes 2018-09-26 13:27:39 +10:00
Geoff Langdale 35503f1d8f Oops noisy. 2018-09-26 13:21:05 +10:00
Geoff Langdale fa6c8990ff Added a terrifying hack to append a idx-to-0-char to stage 2 output. 2018-09-26 13:20:08 +10:00
Geoff Langdale 682c224d1a Merge branch 'master' of https://github.com/lemire/simdjson 2018-09-26 11:29:23 +10:00
Geoff Langdale b0c05c03cc Fix linkage between call sites and headers, add dump code, cleanup 2018-09-26 11:28:22 +10:00
Daniel Lemire dee1bbe54e Integrating the new 3-stage approach. 2018-09-25 17:26:58 -04:00
Geoff Langdale 555926849d Bug cleanup (many vestiges of old 32-bit tape still there) and more encapsulation of tapes. 2018-09-25 16:24:39 +10:00
Geoff Langdale 8b2d00a337 Bug fix for ,] issue and cleanup. 2018-09-25 15:35:17 +10:00
Geoff Langdale 64d07cd04c Fix bug where strings were not parsed on 2nd and subsequent key:value pairs. 2018-09-24 15:16:22 +10:00
Geoff Langdale 77bfe6c984 Fix some bad messages and the failure to parse key strings. 2018-09-24 10:54:29 +10:00
Geoff Langdale 2a46b40457 Adding new stage34, a more straightforward replacement for stage 3 and 4 using a DFA and explicit stack 2018-09-24 10:44:05 +10:00
Geoff Langdale 01f191e5eb Merge branch 'master' of https://github.com/lemire/simdjson 2018-09-24 10:43:10 +10:00
Geoff Langdale 053f04b15d Crude first cut of "stage34", a unified code-based DFA with explicit stack for stages 3 and 4. 2018-09-24 10:42:30 +10:00
Daniel Lemire 9d4f9e46f9 Some comments. 2018-09-16 16:40:59 -04:00
Daniel Lemire 2aa6b93a02 Using a naive strtoll 2018-08-28 22:37:11 -04:00
Daniel Lemire 0b2f9747f8 Check that numbers starting with 0 are followed by decimal, e, E or
they just end the number (0). Note that we allow -0. I guess.
2018-08-28 20:41:55 -04:00
Daniel Lemire e104c020ef Versions of the code that use Google DoubleConv. 2018-08-24 20:49:45 -04:00
Daniel Lemire 6807abff96 Made the code safer (at the expense of the memory usage). 2018-08-24 13:20:20 -04:00
Daniel Lemire 94ea7cefb0 Moving include files into a sensible subdirectory. 2018-08-20 17:51:38 -04:00
Daniel Lemire ef0d14c35c Minor fixes + new scripts. 2018-08-20 17:40:50 -04:00
Daniel Lemire fb65be64bb Major surgery. 2018-08-20 17:27:25 -04:00
Daniel Lemire 726eb5a030 Moved the files into subdirectories. 2018-08-20 14:45:51 -04:00