Commit Graph

221 Commits

Author SHA1 Message Date
Juho Lauri cf9dbe583d improved const correctness (#321) 2019-10-02 14:25:28 -04:00
John Keiser de8df0a05f Combined performance patch (5% overall, 15% stage 1) (#317)
* Allow -f

* Support parse -s (force sse)

* Simplify flatten_bits

- Add directly to base instead of storing variable
- Don't modify base_ptr after beginning of function
- Eliminate base variable and increment base_ptr instead

* De-unroll the flatten_bits loops

* Decrease dependencies in stage 1

- Do all finalize_structurals work before computing the quote mask; mask
  out the quote mask later
- Join find_whitespace_and_structurals and finalize_structurals into
  single find_structurals call, to reduce variable leakage
- Rework pseudo_pred algorithm to refer to "primitive" for clarity and some
  dependency reduction
- Rename quote_mask to in_string to describe what we're trying to
  achieve ("mask" could mean many things)
- Break up find_quote_mask_and_bits into find_quote_mask and
  invalid_string_bytes to reduce data leakage (i.e. don't expose quote bits
  or odd_ends at all to find_structural_bits)
- Genericize overflow methods "follows" and "follows_odd_sequence" for
  descriptiveness and possible lifting into a generic simd parsing library

* Mark branches as likely/unlikely

* Reorder and unroll+interleave stage 1 loop

* Nest the cnt > 16 branch inside cnt > 8
2019-10-01 12:01:08 -04:00
Daniel Lemire 53b6deaeae
Safer handling of error codes, fixes https://github.com/lemire/simdjson/issues/318 (#319) 2019-09-29 12:12:15 -04:00
Opemipo 462858efa3 Fix Typo (#311)
escapted -> escaped
2019-09-12 10:16:33 -04:00
John Keiser f7e893667d Use simd_input generic methods for utf8 checking (#301)
* Use generic each/reduce in simdutf8check

* Remove macros from generic simd_input uses

* Use array instead of members to store simd registers

* Default local checkperf to clone from .
2019-09-02 12:46:05 -04:00
saka1 c1f27fb848 Accept large unsigned integers (#295)
* handle uint64 value in JSON
* Add integer_tests
* Add get_unsigned_integer() on  ParsedJson::BasicIterator
* Write 'u' to tape when the value seems unsigned
* Add to handle 'u' element
* Brush up integer_tests.cpp
* Append tests/integer_tests in .gitignore
* Add comments to is_integer and is_unsigned_integer
2019-09-02 10:50:24 -04:00
John Keiser 7f249cd179 Use non-interleaved map() to make structurals clearer (#304) 2019-08-29 21:38:41 -04:00
John Keiser f4fa5b7340 Add MAP_CHUNKS2, make parameter name related to input 2019-08-26 09:46:49 -07:00
John Keiser 169568ca47 Use map() to interleave instructions for parallelism 2019-08-26 09:46:49 -07:00
John Keiser 9cc4ddfc88 Use map().to_bitmask() instead of build_bitmask() 2019-08-26 09:46:49 -07:00
John Keiser 441963c84c Add AMD64 build_bitmask 2019-08-26 09:46:49 -07:00
John Keiser da0f1cacea Remove static modifiers 2019-08-26 09:46:48 -07:00
John Keiser b01222518d Genericize bitmask building to make algorithms clearer 2019-08-26 09:46:48 -07:00
John Keiser 585f84a734 Move architecture-specific headers to src/ (#287)
* Use namespaces instead of templates for stage1 impls

* Move stage1 implementation into the src/ directory

* Move architecture-specific code to src/
2019-08-21 07:59:49 -04:00
Vitaly Baranov e9be643db5 Fix condition in ParsedJson::allocate_capacity(). (#283) 2019-08-16 08:38:59 -04:00
Vitaly Baranov 6a2728e730 No allocation in the iterator's constructor (#276)
* Get rid of dynamic allocation in ParsedJson::Iterator.

* Implement copy assignment operator for ParsedJson::Iterator.

* ParsedJson::Iterator is now a template class.
2019-08-15 19:42:15 -04:00
Daniel Lemire 3fb82502f7
This gets rid of the silly ALLOW_SAME_PAGE_BUFFER_OVERRUN (#268) 2019-08-09 17:36:32 -04:00
Vitaly Baranov 0b927f059c Make dynamic dispatch free of TSan warnings (#256) 2019-08-08 16:16:35 -04:00
John Keiser f3c3afd4cd Use direct call to templated flatten_bits instead of if (#262)
* Use direct call to templated flatten_bits instead of if

* Put really_inline back on find_structural_bits_64
2019-08-08 15:09:17 -04:00
John Keiser b1beacd1f3 Make headers show up in Header Files in VS2019 (#257) 2019-08-05 16:36:52 -04:00
John Keiser d9a0e2b8f4 Fix Intellisense errors opening .h files on VS2019 (#253) 2019-08-04 19:57:55 -04:00
ioioioio 2a24567370
Replace macros by include files (#236) (#248)
* stage1 compiles without macros

* cleaning

* amalgation is weird but works

* macros are removed from stringparsing

* amalgation fixed

* Huge macros are removed.

* clang-format
2019-08-04 15:58:35 -04:00
Daniel Lemire 99a153d9e8
Hiding the pointer away... (#252)
* Hiding the runtime dispatch pointer in a source file so it is not an exported symbol
* Disabling hard failure on style check.
* Fixes https://github.com/lemire/simdjson/issues/250
2019-08-04 15:41:00 -04:00
Daniel Lemire 038b18edf1
Adding style scripts. (#243)
* Adding style scripts.
2019-08-01 16:09:26 -04:00
Daniel Lemire d83aef4e86 This should fix a warning in Visual Studio. 2019-07-31 18:12:58 -04:00
John Keiser bf59ba76f5 Fix most warnings on VS2019 (#241) 2019-07-31 17:43:45 -04:00
ioioioio c2eea8abba Style uniformization (#238)
* massive clang-format -style=LLVM

* naming harmonization

* adding commentary about sysinfoapi.h
2019-07-30 17:18:10 -04:00
Daniel Lemire 771e9cd68a
Trying again... (#235) 2019-07-29 13:55:13 -04:00
Daniel Lemire c328afee57 This should fix master. 2019-07-29 13:44:25 -04:00
Daniel Lemire a53d95099c
Intrinsic-based flatten (#234)
* Providing a flatten function with intrinsics (for Visual Studio).
2019-07-29 13:28:02 -04:00
Daniel Lemire f76ee5e5ef Fixes issue 221 (#222)
https://github.com/lemire/simdjson/issues/221
2019-07-29 10:07:07 -04:00
Daniel Lemire eba02dc1b9 Runtime dispatch
* Attempt 1 - fn targeting

GCC won't work with templates with different targets, need to specialize all the way up the call stack.

* Compiles properly with cmake. Does not with the Makefile.

* Compilation works with Makefile

* instruction_set changes to architecture

* some aesthetic changes

* fix amalgation and tests + aesthetic changes

* This now compiles and passes tests under CLANG

* Minor correction.

* Trying to make it work on ARM

* Adding missing namespace

* Missing bracket

* Fixing minor compilation issues.

* Getting parse to use runtime dispatch

* Fixing amalgamation script.

* Making sure that NEON is supported.

* Fixing typo

* Merging https://github.com/lemire/simdjson/pull/229

* Manual merge of
https://github.com/lemire/simdjson/pull/229
by @jkeiser  (second part)

* Trying another way.

* Removing the paral.

* Fixing the make file

* Let us make the practice run long enough.

* Resolved the awful slowness.

* Cleaning the README.md

* With runtime dispatching, we should not need flags anymore.

* Changing isa detection file's name + fixing typos.
2019-07-28 22:46:33 -04:00
ioioioio bcabdfc1ae Json pointer (#220)
* json pointer support

* Addition of tests for the json pointer

* Adding a new tool for the JSON Pointer support, and some documentation.
2019-07-26 18:38:10 -04:00
Daniel Lemire be956654b2 Minor cleaning = annotating simdjson namespaces and making sure that we don't have headers all over. 2019-07-09 19:24:08 -04:00
Daniel Lemire fba27ef4b9 I missed a few. Building up VS support. 2019-07-04 17:45:45 -04:00
Daniel Lemire 19cdc09928 Improving support for VS 2019-07-04 17:36:26 -04:00
ioioioio 861a6a17e4 SSE implementation integrated 2019-07-03 17:15:21 -04:00
ioioioio 036f9d5a45 Merge branch 'master' of https://github.com/lemire/simdjson into Multiple_implementation_refactoring_stage2 2019-07-03 10:34:58 -04:00
ioioioio 3f24879157 Stage2 refactored to simplify multiple implementations 2019-07-02 17:12:00 -04:00
ioioioio 9230588ce8 conflicts are solved 2019-07-02 15:21:00 -04:00
Daniel Lemire aa78b70d69 Introducing a "native" instruction set so that you do not need to do #ifdef to select the right SIMD set all the time.
Fixing indentation.
Removing some obsolete WARN_UNUSED.
Fixing a weird warning with optind variable.
2019-07-01 14:18:30 -04:00
ioioioio de08df6a7e Correction of identation. 2019-06-28 15:33:30 -04:00
ioioioio 6723221a42 Refactoring stage1 to facilitate multiple implementations. 2019-06-28 15:14:42 -04:00
Daniel Lemire d7f7f1b200
Fixing issue. (#193) 2019-06-20 18:49:47 -04:00
Daniel Lemire b1e8990654
Moving iterator functions in the header file (#189)
We want the compiler to inline hot functions in the iterators. Let us leave them in the header file. Please.
2019-06-11 21:09:58 -04:00
Daniel Lemire b32c72f1fc Adding a new compile-time flag (SIMDJSON_NAIVE_STRUCTURAL) for research purposes. 2019-06-03 16:41:50 -04:00
Daniel Lemire e27a46973c Introducing a fallback for clmul. 2019-06-03 15:10:14 -04:00
Daniel Lemire 06461a465b Tweaking. 2019-06-03 14:25:22 -04:00
Daniel Lemire cf6f231be6 Allowing users to provide additional flags. 2019-06-03 14:17:42 -04:00
Daniel Lemire 9239f75123
Adding compile-time option to test the speed of the fast flatten (research-oriented). (#181) 2019-06-03 13:37:09 -04:00
Daniel Lemire 642132920f Fixing performance regression caused by helpful code contributions
that moved inlineable functions into the source file combined with
helpful compilers which aren't smart enough to do the inlinining in
any case.
2019-05-31 18:16:12 -04:00
Daniel Lemire 8526387acb
Improving error codes. (#176)
* This commit adds new error codes.
2019-05-24 17:28:56 -04:00
Daniel Lemire 17ac5c0525
This adds guards so that we can better detect the case where we have neither AVX2 nor ARM NEON. (#173) 2019-05-24 17:26:29 -04:00
Daniel Lemire 43dba8ac7f
A slightly better "flatten"? (#166)
* This seems beneficial.
2019-05-19 12:33:45 -04:00
Daniel Lemire dcd0cb8080
Fix for https://github.com/lemire/simdjson/issues/58 (#168) 2019-05-19 12:25:27 -04:00
Daniel Lemire 47beaff152 Adding white-listing for memory sanitizer. 2019-05-19 11:18:54 -04:00
Daniel Lemire f75280ac9c
Fix for issue 150 (#162)
* Checks for issue 150. We run through the test files with sanitizers on.

* Fix for issue 150: the remaining issues were an overrun on the depth capacity and an "off-by-1" overrun on tape capacity.

* Improving makefile.

* Safer git submodule command.

* Getting get 'git' on circleci
2019-05-09 20:51:33 -04:00
Daniel Lemire e370a65383
Fix for issues 32, 50, 131, 137
* Improving portability.

* Revisiting faulty logic regarding same-page overruns.

* Disabling same-page overruns under VS.

* Clarifying the documentation

* Fix for issue 131 + being more explicit regarding memory realloc.

* Fix for issue 137.

* removing "using namespace std" throughout. Fix for 50

* Introducing typed malloc/free.

* Introducing a custom class (padded_string) that solves several minor usability issues.

* Updating amalgamation for testing.
2019-05-09 17:59:51 -04:00
Daniel Lemire 20cda07eef
Minor grammatical thing ("an integer" vs "a integer") 2019-05-09 10:48:31 -04:00
Heinz N. Gies c1975166a0 False atom fix (#156)
* Add failing test for falsy atom

* Fix false atom parsing
2019-05-09 10:45:42 -04:00
Daniel Lemire f0574d492c
Fix for issue 154 (#157)
* Changes necessary to reproduce

https://github.com/lemire/simdjson/issues/154

* Fixing issue 154.
2019-05-08 22:33:11 -04:00
technateNG 6f0d350f2c Fix to issue #148. (#151)
* Issue #148 fix.

* Test cases for issue #148.
2019-05-07 20:56:36 -04:00
saka1 719dff1312 Add predicates to ParsedJson::iterator (#153) 2019-05-07 14:11:33 -04:00
Daniel Lemire 681cd33698 Making the iterator a tad safer (tweaking the constructor so that it can throw). 2019-04-22 10:53:25 -04:00
Dong Xie 1153778f92 fix a bug in copy constructor of ParsedJson::iterator. (#146) 2019-04-22 10:37:02 -04:00
Geoff Langdale 0250352139 Merge branch 'master' of https://github.com/lemire/simdjson 2019-04-01 02:08:15 -04:00
Geoff Langdale 134ba8d1dd Ratty version of transposed ARM SIMD stuff. Needs cleanup. 2019-04-01 02:07:38 -04:00
Geoff Langdale 777b9c9a9e Unbreak x86. Durp. 2019-03-30 15:50:35 +11:00
Geoff Langdale 5ba29122fd First cut of ARM port. Needs hand-hacked Makefile. 2019-03-30 00:47:35 -04:00
Geoff Langdale b4c815a60c Concentrate and encapsulate SIMD use somewhat in preparation for ARM port. 2019-03-21 15:15:41 +11:00
Geoff Langdale 473ab12a0a Stage 2 doesn't need to know about intrinsics either (for itself) 2019-03-21 11:41:15 +11:00
Daniel Lemire df8f792183
Store the string lengths on the string tape (#101)
* Store string length in the string-tape item.
* Files are now limited to 4GB.
* Moving detection of unescaped chars to stage 1 to reduce the burden due to string parsing.

Fixes https://github.com/lemire/simdjson/issues/114

Fixes https://github.com/lemire/simdjson/issues/87
2019-03-13 19:32:57 -04:00
Tyler Kennedy 21eef55907 Changes to the behaviour of move_forward to make it suitable for iteration. (Closes #73) (#103) 2019-03-06 12:13:55 -05:00
Georgios Floros d873ee9983 Reuse aligned_malloc (#108) 2019-03-06 10:12:55 -05:00
Geoff Langdale 6628c365c9 Substantial refactor (and clang-format google stype) of stage1_find_marks.cpp 2019-03-06 11:09:50 +11:00
myd7349 2851ea490c Export CMake targets (#96) 2019-03-04 16:07:06 -05:00
Thomas Navennec 352dd5e7fa Change parse_json return type from bool to int (#82)
* Added simdjerr namespace

* Updated jsonparser files

* updated stage1 and stage2

* removed stage2 inline function

* Added forgotten return statements

* Updated tools and benchmarks

* Corrected parenthesis

* Removed extra =

* Accidentally undid reinterpret_cast

* Better comments, undid a header name fuckup

* Added an errorMsg method, updated readme

* Removed useless header from stage2

* Updated single-header file

* added simdjerr.cpp contents to simdjson.cpp

* Made single header version work

* Updated singleheader test, fixed simdjson.cpp

* Renamed simdjerr namespace and files to simdjson

* Updating the amalgamation.
2019-03-02 17:18:45 -05:00
greedengineer 0c8ee105b4 fix memory free (#86) 2019-02-27 07:50:20 -05:00
Carmot a22c20fab0 Removed innecesary check and objects release. (#79) 2019-02-26 19:08:13 -05:00
M. Zhou 3d48628e71 CMake: Add version and soversion to library target properties. (#76) 2019-02-26 16:39:59 -05:00
Kai Wolf 33341b60d8 Apply code review suggestions
- Undo explicit bool conversion
 - Don't check for NULL before deleting pointer
2019-02-26 09:36:28 +01:00
Kai Wolf e7683820d5
Merge branch 'master' into fix_minor_problems 2019-02-25 21:05:29 +01:00
Wojciech Muła 7830b1be87 Use nothrow (#65)
* Use C++11 features

* Use std::nothrow

By default new throws std::bad_alloc, so no check code would be executed.
2019-02-25 14:36:45 -05:00
Kai Wolf b521719b6f Fix old-style C-Casts 2019-02-23 17:31:38 +01:00
Kai Wolf ff22e75f95 Apply minor readability fixes 2019-02-23 17:28:20 +01:00
Geoff Langdale 3d30fd5440 Fixed a stage number message and we now fail out if no structural chars from stage 1 2019-02-23 10:51:45 +11:00
Daniel Lemire 389f8b514e Porting recently introduced fix. 2019-02-22 14:39:21 -05:00
Thomas Navennec 9606343b2c ParsedJson & ParsedJson::iterator definitions in .cpp files (#47)
* Minor change to benchmark cmake

* Moved ParsedJson and its Iterator to separate .cpp files

* Uncommented functions, that has nothing to do with this pr

* Removed really_inline comments

* Reinstated some inline functions to restore previous performance

* Re-merged iterator in ParsedJson

* Uncommented some WARN_UNUSED
2019-02-22 14:38:35 -05:00
Daniel Lemire 35eceaf1c4 Merge branch 'master' of github.com:lemire/simdjson 2019-01-24 14:29:02 -05:00
Daniel Lemire 2ca29edab3 Fixing bad spacing. 2019-01-23 20:36:23 -05:00
Daniel Lemire 85d8c9dbf0 Minor code cleaning 2019-01-18 15:21:54 -05:00
Daniel Lemire 084dfbc2b3 Fix for name change. 2019-01-16 11:29:52 -05:00
Daniel Lemire ca5f3349e6 Removing useless macro. 2019-01-08 10:34:21 -05:00
Daniel Lemire bad32be5f6 Merge branch 'stage12unified_attempt2' 2018-12-31 17:33:01 -05:00
Daniel Lemire 3ce1dd8087 Cleaning. 2018-12-31 17:13:32 -05:00
Daniel Lemire d7d568ee89 Trying something else. 2018-12-31 16:25:37 -05:00
Daniel Lemire 3a2f746602 Trying with just a unified stage 3. 2018-12-31 14:03:43 -05:00
Daniel Lemire 0317287071 Minor fix 2018-12-31 11:59:12 -05:00
Daniel Lemire 58d41923fd
Porting to visual studio
Now builds on Visual Studio
2018-12-30 21:00:19 -05:00
Daniel Lemire 386bebb33b adding support for cmake. 2018-12-28 13:13:10 -05:00