Commit Graph

483 Commits

Author SHA1 Message Date
Daniel Lemire 27861f6358 SIMDJSON_PADDING is now an absolute constant. This is temporary since
padding should go away once  https://github.com/lemire/simdjson/issues/174
is resolved.
2020-01-15 15:49:50 -05:00
Daniel Lemire 1498b78342 Minor simplifications. 2020-01-10 14:07:57 -05:00
dbj 85e84fc1fa improved string padded (#440)
* dirent portable latest version

* improved

std::string argument passed by const reference
ctor added with std::string_view  argument
`allocate_padded_buffer()`  moved here with **optional** check on `length < 1`

* allocate_padded_buffer moved to padded_string.h
2020-01-10 10:15:48 -05:00
UKABUER 773883c486 Fix #420 (#421) 2020-01-09 09:56:43 -05:00
Daniel Lemire 951c4bedf8
Simpler jsonstream (#436)
* One simplification.

* Removing untested functions.
2020-01-07 19:10:02 -05:00
Daniel Lemire 0a874a5063 Some tuning 2020-01-06 11:41:07 -05:00
dbj 2caa6e3370 C++ language version detection (#418)
* added visual_studio folder where visual_studio cmake generated, local artefacts are

* C++ version detection
2020-01-06 11:38:09 -05:00
Daniel Lemire 7bde23590a
Debugging jsonstream (#432)
Fixes #424 (and provides tests for it), as well as #401
2020-01-03 22:22:47 -05:00
John Keiser 165e23773f Refactor stage 2 into structural_parser class 2020-01-02 13:12:22 -07:00
Paul Dreik 399d08c86c use unique_ptr in class parsedjson (#417)
* refactor parsedjson to use unique_ptr instead of owning raw pointer
* fix a potential undefined behavior
* output only first cpu in /proc/cpuinfo
2019-12-31 14:31:45 -05:00
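The first bullet of the commit above (replacing an owning raw pointer with `unique_ptr`) can be illustrated with a minimal sketch. The class `TapeBufferSketch` and its members are hypothetical names made up for this example; they are not the actual `parsedjson` class.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical class, not simdjson's parsedjson: the buffer is owned by a
// unique_ptr, so no hand-written destructor or delete[] is needed.
class TapeBufferSketch {
public:
  // Returns false if the allocation fails; any previous buffer is released.
  bool allocate(std::size_t capacity) {
    tape_.reset(new (std::nothrow) std::uint64_t[capacity]);
    capacity_ = tape_ ? capacity : 0;
    return tape_ != nullptr;
  }
  std::size_t capacity() const { return capacity_; }
  std::uint64_t *data() { return tape_.get(); }

private:
  std::unique_ptr<std::uint64_t[]> tape_;
  std::size_t capacity_ = 0;
};
```

With this layout, reallocating or destroying the object frees the old buffer automatically, which removes one common source of leaks and double frees.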
dbj 9c3828fefe STRINGIFY implemented (#402)
* STRINGIFY implemented

* SIMDJSON_THREADS_ENABLED def/undef
2019-12-20 07:57:00 -05:00
John Keiser e2f349e7bd Measure impact of utf-8 blocks and structurals per block directly 2019-12-17 11:41:13 -08:00
Daniel Lemire 102262c7ab
Fixing issue386 (#396)
* Creating arch-specific bitmanipulation.h files.
* Improving system and compiler portability.
* We want to allow trailing_zeroes on zero inputs.
2019-12-16 19:09:18 -05:00
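The last bullet of the commit above concerns counting trailing zero bits when the input is zero. Compiler builtins such as GCC's `__builtin_ctzll` leave that case undefined, so a portable version has to handle it explicitly. The following is only a sketch: the function name and the choice of returning 64 for a zero input are assumptions made for this example, not the library's implementation.

```cpp
#include <cstdint>

// Hypothetical portable sketch: counts trailing zero bits and defines the
// answer for a zero input as 64 (the width of the type), which is an
// assumption made for this example.
inline int trailing_zeroes_sketch(std::uint64_t input) {
  if (input == 0) {
    return 64;
  }
  int count = 0;
  while ((input & 1) == 0) {
    input >>= 1;
    ++count;
  }
  return count;
}
```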
mswilson d33208c7db Correct detection of NEON support (#392)
... as the test as it is currently implemented will always evaluate to true.

Fixes #389
2019-12-10 13:12:17 -05:00
Daniel Lemire 7c560fa137 Cleaning documentation. 2019-11-26 14:13:17 -05:00
Jeremie Piotte f163155929 JsonStream documentation (#381)
* adding Multiline JSON competition chart to doc
* Completing the comments for JsonStream
* Adding a page for JsonStream's documentation.
2019-11-25 18:11:55 -05:00
Jeremie Piotte 29fc51522a
Introducing concurrency mode in JsonStream. (#373)
* JsonStream threaded prototype

* JsonStream Threaded version working. Still supporting non-threaded version.

* Fix where invalid files would enter infinite loop.

* SingleHeader update

* I will remove -pthread in cmake for now.

* Attempt at resolving the -pthread issue
2019-11-21 11:22:06 -05:00
Daniel Lemire 58d249ca16
Introducing move assignments. (#363) 2019-11-09 10:34:32 -05:00
Jeremie Piotte bdc2b07339
Streams of JSON documents + Large files (>4GB) (#350) (#364)
* rough prototype working.  Needs more test and fine tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* type

* minor fixes and cleaning.

* minor fixes and cleaning.

* removing warnings

* removing some copies

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* making jsonstream portable

* cleaning imports

* including <algorithms> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* merged main into branch

* bug fix where JsonStream would bug on rare cases.

* Adding a JsonStream Demo to Amalgamation

* Fix for https://github.com/lemire/simdjson/issues/345

* Follow up test and fix for https://github.com/lemire/simdjson/issues/345 (#347)

* Final (?) fix for https://github.com/lemire/simdjson/issues/345

* Verbose basictest

* Being more forgiving of powers of ten.

* Let us zero the tail end.

* add basic fuzzers (#348)

* add basic fuzzing using libFuzzer

* let cmake respect cflags, otherwise the fuzzer flags go unnoticed

also, integrates badly with oss-fuzz

* add new fuzzer for minification, simplify the old one

* add fuzzer for the dump example

* clang format

* adding Paul Dreik

* rough prototype working.  Needs more test and fine tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* type

* minor fixes and cleaning.

* Fixing issue 351 (#352)

* Fixing issues 351 and 353

* minor fixes and cleaning.

* removing warnings

* removing some copies

* Fix ARM compile errors on g++ 7.4 (#354)

* Fix ARM compilation errors

* Update singleheader

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* fix integer overflow in subnormal_power10 (#355)

detected by oss-fuzz

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714

* Adding new test file, following https://github.com/lemire/simdjson/pull/355

* making jsonstream portable

* cleaning imports

* including <algorithms> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* merged main into branch

* bug fix where JsonStream would bug on rare cases.

* Adding a JsonStream Demo to Amalgamation

* merging main

* rough prototype working.  Needs more test and fine tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* minor fixes and cleaning.

* minor fixes and cleaning.

* removing warnings

* removing some copies

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* making jsonstream portable

* cleaning imports

* including <algorithms> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* bug fix where JsonStream would bug on rare cases.

* Adding a JsonStream Demo to Amalgamation

* rough prototype working.  Needs more test and fine tuning.

* minor fixes and cleaning.

* adding jsonstream to amalgamation

* merged main into branch

* Adding a JsonStream Demo to Amalgamation

* merging main

* merging main

* make file fix
2019-11-08 17:39:45 -05:00
Paul Dreik 8ae818e17c add ossfuzz support (#362)
* initial oss-fuzz friendly build

parts taken from libfmt, which I wrote and have the copyright to

* fix build error

* add script for building a corpus zip

see https://google.github.io/oss-fuzz/getting-started/new-project-guide/#seed-corpus

* fix zip command

* drop setting the C++ standard

* disable the minify fuzzer, does not pass oss-fuzz check-build test

* fix integer overflow in subnormal_power10

detected by oss-fuzz

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714

* invoke the build like oss fuzz does

* document what the scripts are for and how to use them

* add a page about fuzzing
2019-11-08 10:32:43 -05:00
Daniel Lemire 3439ce19c9
Adding a flag which allows us to disable AVX detection. This exposes a bug. (#356) 2019-11-06 10:39:26 -05:00
Daniel Lemire a065805b0f Fix for https://github.com/lemire/simdjson/issues/345 2019-10-24 15:34:30 -04:00
John Keiser 64872bddf4 Eliminate stage1_find_marks_flatten.h 2019-10-14 12:33:46 -07:00
John Keiser 9bbd6bd874 Move headers to implementation area
- jsoncharutils.h, numberparsing.h, simdprune_tables.h
2019-10-14 11:51:41 -07:00
Daniel Lemire 81f9aac13f Fixing minor perf. regression. 2019-10-07 16:31:44 -04:00
Juho Lauri b2eff3c90c case insensitive move_to_key (#324)
* case insensitive move_to_key
* portable strcmpi
2019-10-07 16:08:17 -04:00
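A portable case-insensitive comparison, as mentioned in the commit above, can be sketched without relying on platform-specific functions. The name `strcasecmp_sketch` is hypothetical, and the sketch only folds case through `std::tolower` one byte at a time, so it does not address Unicode equivalence.

```cpp
#include <cctype>

// Hypothetical portable sketch: compares two C strings ignoring case.
// Returns 0 when equal, a negative value when a sorts first, positive otherwise.
inline int strcasecmp_sketch(const char *a, const char *b) {
  for (;; ++a, ++b) {
    // Cast to unsigned char first: std::tolower is undefined for negative values.
    const int ca = std::tolower(static_cast<unsigned char>(*a));
    const int cb = std::tolower(static_cast<unsigned char>(*b));
    if (ca != cb) {
      return ca < cb ? -1 : 1;
    }
    if (ca == 0) {
      return 0; // both strings ended at the same position
    }
  }
}
```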
Juho Lauri cf9dbe583d improved const correctness (#321) 2019-10-02 14:25:28 -04:00
John Keiser de8df0a05f Combined performance patch (5% overall, 15% stage 1) (#317)
* Allow -f

* Support parse -s (force sse)

* Simplify flatten_bits

- Add directly to base instead of storing variable
- Don't modify base_ptr after beginning of function
- Eliminate base variable and increment base_ptr instead

* De-unroll the flatten_bits loops

* Decrease dependencies in stage 1

- Do all finalize_structurals work before computing the quote mask; mask
  out the quote mask later
- Join find_whitespace_and_structurals and finalize_structurals into
  single find_structurals call, to reduce variable leakage
- Rework pseudo_pred algorithm to refer to "primitive" for clarity and some
  dependency reduction
- Rename quote_mask to in_string to describe what we're trying to
  achieve ("mask" could mean many things)
- Break up find_quote_mask_and_bits into find_quote_mask and
  invalid_string_bytes to reduce data leakage (i.e. don't expose quote bits
  or odd_ends at all to find_structural_bits)
- Genericize overflow methods "follows" and "follows_odd_sequence" for
  descriptiveness and possible lifting into a generic simd parsing library

* Mark branches as likely/unlikely

* Reorder and unroll+interleave stage 1 loop

* Nest the cnt > 16 branch inside cnt > 8
2019-10-01 12:01:08 -04:00
Daniel Lemire 5765c81f66 Fixing number parsing of large ints 2019-09-02 12:40:39 -04:00
Daniel Lemire 92334a8e28 Better tests. 2019-09-02 12:32:44 -04:00
Daniel Lemire c4218c8e40
Accept large unsigned integers (#295) (#306)
* handle uint64 value in JSON
* Add integer_tests
* Add get_unsigned_integer() on  ParsedJson::BasicIterator
* Write 'u' to tape when the value seems unsigned
* Add to handle 'u' element
* Brush up integer_tests.cpp
* Append tests/integer_tests in .gitignore
* Add comments to is_integer and is_unsigned_integer
2019-09-02 11:56:26 -04:00
saka1 c1f27fb848 Accept large unsigned integers (#295)
* handle uint64 value in JSON
* Add integer_tests
* Add get_unsigned_integer() on  ParsedJson::BasicIterator
* Write 'u' to tape when the value seems unsigned
* Add to handle 'u' element
* Brush up integer_tests.cpp
* Append tests/integer_tests in .gitignore
* Add comments to is_integer and is_unsigned_integer
2019-09-02 10:50:24 -04:00
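The idea behind this commit (values above the signed 64-bit range get their own unsigned representation, marked `u` on the tape) can be sketched as a small classifier. Everything named here (`IntegerKind`, `classify_nonnegative`) is hypothetical and only illustrates the range check; it is not the parser's actual number-parsing code.

```cpp
#include <cstdint>
#include <limits>
#include <string>

enum class IntegerKind { Signed64, Unsigned64, Invalid };

// Hypothetical sketch: parses a non-negative decimal digit string and reports
// whether it fits in int64_t, only in uint64_t, or in neither.
inline IntegerKind classify_nonnegative(const std::string &digits,
                                        std::uint64_t &value) {
  const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
  value = 0;
  if (digits.empty()) {
    return IntegerKind::Invalid;
  }
  for (char c : digits) {
    if (c < '0' || c > '9') {
      return IntegerKind::Invalid;
    }
    const std::uint64_t digit = static_cast<std::uint64_t>(c - '0');
    // Reject before multiplying if value * 10 + digit would exceed uint64_t.
    if (value > (max - digit) / 10) {
      return IntegerKind::Invalid;
    }
    value = value * 10 + digit;
  }
  const std::uint64_t int64_max =
      static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
  return value <= int64_max ? IntegerKind::Signed64 : IntegerKind::Unsigned64;
}
```

In this sketch, a caller would store the value in the unsigned form only when the result is `IntegerKind::Unsigned64`.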
Daniel Lemire f667d4965d
This is a bug fix: our prev function was buggy. (#291) 2019-08-23 18:59:43 -04:00
John Keiser 585f84a734 Move architecture-specific headers to src/ (#287)
* Use namespaces instead of templates for stage1 impls

* Move stage1 implementation into the src/ directory

* Move architecture-specific code to src/
2019-08-21 07:59:49 -04:00
Daniel Lemire a1bff85263 Documenting the limits of move_to_key with respect to Unicode Equivalence. 2019-08-20 17:10:30 -04:00
John Keiser 94673bcdf2 Use methods for utf8 checker 2019-08-16 14:15:37 -07:00
John Keiser aa15917c9d Use methods instead of functions for simd_input 2019-08-16 14:07:30 -07:00
Vitaly Baranov 6a2728e730 No allocation in the iterator's constructor (#276)
* Get rid of dynamic allocation in ParsedJson::Iterator.

* Implement copy assignment operator for ParsedJson::Iterator.

* ParsedJson::Iterator is now a template class.
2019-08-15 19:42:15 -04:00
John Keiser 0042d9b406 Move UTF8 checking functions into their own file 2019-08-14 10:34:11 -07:00
John Keiser 237b8865f5 Correct header #define 2019-08-13 17:44:26 -07:00
John Keiser 8f01cece3a Move simd_input and associated functions to their own header 2019-08-13 17:44:06 -07:00
Daniel Lemire 2ca574d9e6
Removing windows.h (#273) 2019-08-12 19:40:21 -04:00
Daniel Lemire 3fb82502f7
This gets rid of the silly ALLOW_SAME_PAGE_BUFFER_OVERRUN (#268) 2019-08-09 17:36:32 -04:00
Vitaly Baranov 9dfab9d9a4 Disable UBSan error in trailing_zeroes(). (#266)
https://github.com/lemire/simdjson/issues/265
2019-08-09 14:37:22 -04:00
John Keiser f3c3afd4cd Use direct call to templated flatten_bits instead of if (#262)
* Use direct call to templated flatten_bits instead of if

* Put really_inline back on find_structural_bits_64
2019-08-08 15:09:17 -04:00
John Keiser b1beacd1f3 Make headers show up in Header Files in VS2019 (#257) 2019-08-05 16:36:52 -04:00
John Keiser d9a0e2b8f4 Fix Intellisense errors opening .h files on VS2019 (#253) 2019-08-04 19:57:55 -04:00
ioioioio 2a24567370
Replace macros by include files (#236) (#248)
* stage1 compiles without macros

* cleaning

* amalgamation is weird but works

* macros are removed from stringparsing

* amalgamation fixed

* Huge macros are removed.

* clang-format
2019-08-04 15:58:35 -04:00
Daniel Lemire bd9628df93 Producing a new release 2019-08-04 15:43:47 -04:00
Daniel Lemire 99a153d9e8
Hiding the pointer away... (#252)
* Hiding the runtime dispatch pointer in a source file so it is not an exported symbol
* Disabling hard failure on style check.
* Fixes https://github.com/lemire/simdjson/issues/250
2019-08-04 15:41:00 -04:00
Daniel Lemire 2a240e3fe2 Fixing style violation. 2019-08-01 16:38:51 -04:00
Daniel Lemire ee66fb1c60 Version 0.2.0. 2019-08-01 16:23:30 -04:00
Daniel Lemire 038b18edf1
Adding style scripts. (#243)
* Adding style scripts.
2019-08-01 16:09:26 -04:00
ioioioio 968117c940 preventing clang-format to move sysinfoapi.h (#244) 2019-08-01 15:06:50 -04:00
Daniel Lemire 6788b12d65 It is not beneficial to try to get clever with trailing zeroes. (Led to a major performance
regression under haswell+ for stage 1).
2019-08-01 14:44:04 -04:00
Daniel Lemire 66ffc1b2d6 Adding a remark. 2019-08-01 11:33:51 -04:00
John Keiser bf59ba76f5 Fix most warnings on VS2019 (#241) 2019-07-31 17:43:45 -04:00
Daniel Lemire 76da659977 Fixing amalgamate under ARM 2019-07-30 22:10:48 +00:00
ioioioio c2eea8abba Style uniformization (#238)
* massive clang-format -style=LLVM

* naming harmonization

* adding commentary about sysinfoapi.h
2019-07-30 17:18:10 -04:00
ioioioio 5f20d3eb34 Merging No duplicate tail (PR#223) (#232)
* Use __forceinline on Windows for really_inline

https://docs.microsoft.com/en-us/cpp/cpp/inline-functions-cpp?view=vs-2019#inline-__inline-and-__forceinline

* Don't duplicate find_structural_bits for final chunk

* writing coherent macro definitions
2019-07-29 14:11:42 -04:00
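The first bullet of the commit above selects a compiler-specific keyword for forced inlining. A minimal sketch of such a macro follows; the macro name `REALLY_INLINE_SKETCH` and the function `add_one` are hypothetical and chosen for this example, and the non-MSVC branch assumes a GCC- or Clang-compatible compiler.

```cpp
// Hypothetical macro for this example: picks the forced-inline spelling
// understood by the current compiler.
#if defined(_MSC_VER)
#define REALLY_INLINE_SKETCH __forceinline
#else
// Assumes a GCC- or Clang-compatible compiler.
#define REALLY_INLINE_SKETCH inline __attribute__((always_inline))
#endif

// A small hot function marked with the macro.
REALLY_INLINE_SKETCH int add_one(int x) { return x + 1; }
```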
Daniel Lemire 3c0f5a3fe4 Improving the documentation. 2019-07-29 14:10:49 -04:00
Daniel Lemire 771e9cd68a
Trying again... (#235) 2019-07-29 13:55:13 -04:00
Daniel Lemire c328afee57 This should fix master. 2019-07-29 13:44:25 -04:00
Daniel Lemire 3dae86223d Changing intrinsic name. 2019-07-29 13:39:54 -04:00
Daniel Lemire 85e31a5479
This fixes the "big exp" bug, although we need to assess the performance and maybe do some tuning. (#233) 2019-07-29 13:28:16 -04:00
Daniel Lemire a53d95099c
Intrinsic-based flatten (#234)
* Providing a flatten function with intrinsics (for Visual Studio).
2019-07-29 13:28:02 -04:00
Daniel Lemire eba02dc1b9 Runtime dispatch
* Attempt 1 - fn targeting

GCC won't work with templates with different targets, need to specialize all the way up the call stack.

* Compiles properly with cmake. Does not with the Makefile.

* Compilation works with Makefile

* instruction_set changes to architecture

* some aesthetic changes

* fix amalgamation and tests + aesthetic changes

* This now compiles and passes tests under CLANG

* Minor correction.

* Trying to make it work on ARM

* Adding missing namespace

* Missing bracket

* Fixing minor compilation issues.

* Getting parse to use runtime dispatch

* Fixing amalgamation script.

* Making sure that NEON is supported.

* Fixing typo

* Merging https://github.com/lemire/simdjson/pull/229

* Manual merge of
https://github.com/lemire/simdjson/pull/229
by @jkeiser  (second part)

* Trying another way.

* Removing the paral.

* Fixing the make file

* Let us make the practice run long enough.

* Resolved the awful slowness.

* Cleaning the README.md

* With runtime dispatching, we should not need flags anymore.

* Changing isa detection file's name + fixing typos.
2019-07-28 22:46:33 -04:00
ioioioio bcabdfc1ae Json pointer (#220)
* json pointer support

* Addition of tests for the json pointer

* Adding a new tool for the JSON Pointer support, and some documentation.
2019-07-26 18:38:10 -04:00
Daniel Lemire a3beac8d13
This simplifies back the number parsing code... The extra work introduced recently is seemingly unnecessary. (#218) 2019-07-18 11:50:26 -04:00
Daniel Lemire e926b4b3c9
More accurate number parsing (#217)
* This drastically improves the accuracy (down to a ULP of 1)

* More comments and documentation.
2019-07-15 22:17:49 -04:00
Daniel Lemire 6c168f046d
Optimizing stage1 (#216)
* Optimizing stage 1-- avx edition

* Optimizing sse.

* Saving 0.5% in instruction count (NEON).
2019-07-11 20:59:21 -04:00
Daniel Lemire 4b7e87ec7f
Removing garbage. (#213) 2019-07-09 21:51:16 -04:00
Daniel Lemire 98b387aac3
Fixing a messed up interleaved #ifdef/namespace. (#211) 2019-07-09 19:48:20 -04:00
Daniel Lemire be956654b2 Minor cleaning = annotating simdjson namespaces and making sure that we don't have headers all over. 2019-07-09 19:24:08 -04:00
Daniel Lemire 977f57fd37 We need to guard the simdutf8check files. 2019-07-09 16:53:28 -04:00
ioioioio 7369339c88 Neon utf8validation (#207)
* utf8 validation on neon works
2019-07-09 15:14:34 -04:00
Daniel Lemire 3f79385160
Removing some fprintf. (#209) 2019-07-09 13:04:44 -04:00
ioioioio b0d9c074e1 check_utf8_helper has a more meaningful name 2019-07-05 11:09:28 -04:00
Daniel Lemire fba27ef4b9 I missed a few. Building up VS support. 2019-07-04 17:45:45 -04:00
Daniel Lemire 19cdc09928 Improving support for VS 2019-07-04 17:36:26 -04:00
Daniel Lemire 2b2d93b05f Various minor tweaks. 2019-07-04 17:19:05 -04:00
ioioioio f7ea2629e4 Fixing warnings and Microsoft intrinsics. 2019-07-04 10:13:40 -04:00
ioioioio 861a6a17e4 SSE implementation integrated 2019-07-03 17:15:21 -04:00
ioioioio 0df6d83f08 deleting useless comments and namespace indications 2019-07-03 10:47:45 -04:00
ioioioio 036f9d5a45 Merge branch 'master' of https://github.com/lemire/simdjson into Multiple_implementation_refactoring_stage2 2019-07-03 10:34:58 -04:00
ioioioio 3f24879157 Stage2 refactored to simplify multiple implementations 2019-07-02 17:12:00 -04:00
ioioioio 9230588ce8 conflicts are solved 2019-07-02 15:21:00 -04:00
Daniel Lemire aa78b70d69 Introducing a "native" instruction set so that you do not need to do #ifdef to select the right SIMD set all the time.
Fixing indentation.
Removing some obsolete WARN_UNUSED.
Fixing a weird warning with optind variable.
2019-07-01 14:18:30 -04:00
Daniel Lemire 1b81e7c928 Correcting ugly indentation. 2019-07-01 11:55:39 -04:00
ioioioio de08df6a7e Correction of indentation. 2019-06-28 15:33:30 -04:00
ioioioio 6723221a42 Refactoring stage1 to facilitate multiple implementations. 2019-06-28 15:14:42 -04:00
Daniel Lemire d7f7f1b200
Fixing issue. (#193) 2019-06-20 18:49:47 -04:00
Daniel Lemire 3db8c5a0eb
Fixing the issue (#191) 2019-06-12 16:32:46 -04:00
Daniel Lemire b0e6bfa84c
Simpler iteration code (#190)
* Adding convenience method to simplify code.

* Simplifying the iteration code.
2019-06-12 16:29:24 -04:00
Daniel Lemire b1e8990654
Moving iterator functions in the header file (#189)
We want the compiler to inline hot functions in the iterators. Let us leave them in the header file. Please.
2019-06-11 21:09:58 -04:00
Daniel Lemire 14016743be
fixing typo in comment. 2019-06-05 21:29:46 -04:00
Daniel Lemire 59194dcf4d
Issue182: fixed (#183)
* Verifying issue 182.

* Fixing the corresponding bug.
2019-06-05 18:51:29 -04:00
Daniel Lemire 642132920f Fixing performance regression caused by helpful code contributions
that moved inlineable functions into the source file combined with
helpful compilers which aren't smart enough to do the inlining in
any case.
2019-05-31 18:16:12 -04:00
Daniel Lemire f00be30318 Being clearer as to what TAPE_ERROR means. 2019-05-28 19:32:56 -04:00
Daniel Lemire 8526387acb
Improving error codes. (#176)
* This commit adds new error codes.
2019-05-24 17:28:56 -04:00
Daniel Lemire bf82288ab1
Preventing implicit conversions for C strings to C++ strings (with evil results). (#172) 2019-05-21 13:32:23 -04:00
alexey-milovidov 576914ed54 Remove MMX code (#170)
#169
2019-05-20 14:41:58 -04:00
Daniel Lemire dcd0cb8080
Fix for https://github.com/lemire/simdjson/issues/58 (#168) 2019-05-19 12:25:27 -04:00
Daniel Lemire 47beaff152 Adding white-listing for memory sanitizer. 2019-05-19 11:18:54 -04:00
Dong Xie b98454d213 Add explicit conversion for leading and tailing zeros. (#161) 2019-05-09 20:56:13 -04:00
Daniel Lemire 954b89e762 New version (0.1.2). 2019-05-09 20:55:26 -04:00
Daniel Lemire f75280ac9c
Fix for issue 150 (#162)
* Checks for issue 150. We run through the test files with sanitizers on.

* Fix for issue 150: the remaining issues were an overrun on the depth capacity and an "off-by-1" overrun on tape capacity.

* Improving makefile.

* Safer git submodule command.

* Getting 'git' on circleci
2019-05-09 20:51:33 -04:00
Daniel Lemire e370a65383
Fix for issues 32, 50, 131, 137
* Improving portability.

* Revisiting faulty logic regarding same-page overruns.

* Disabling same-page overruns under VS.

* Clarifying the documentation

* Fix for issue 131 + being more explicit regarding memory realloc.

* Fix for issue 137.

* removing "using namespace std" throughout. Fix for 50

* Introducing typed malloc/free.

* Introducing a custom class (padded_string) that solves several minor usability issues.

* Updating amalgamation for testing.
2019-05-09 17:59:51 -04:00
Heinz N. Gies c5a3f9ccd4 Add failing test for a json with content zero (#134)
* Add failing test for a json with content zero

* Mark 0 byte as false in structural_or_whitespace_or_exponent_or_decimal_negated
2019-05-09 12:24:22 -04:00
Daniel Lemire f0574d492c
Fix for issue 154 (#157)
* Changes necessary to reproduce

https://github.com/lemire/simdjson/issues/154

* Fixing issue 154.
2019-05-08 22:33:11 -04:00
Daniel Lemire 39fcc62e85
Fixed typo 2019-05-08 13:42:30 -04:00
saka1 719dff1312 Add predicates to ParsedJson::iterator (#153) 2019-05-07 14:11:33 -04:00
Daniel Lemire 0d81fd287e
With this commit we can do all tests with full sanitizers on, and get no warning (#132)
* Making sure we can run with the sanitizers on.
* Minor code simplification in the number parsing.
* Following @EmilGedda 's recommendations regarding the makefile.
* Reference to blog post.
* Adding link to https://johnnylee-sde.github.io/Fast-numeric-string-to-int/
* Better hex parsing.
2019-04-24 17:31:47 -04:00
Daniel Lemire 681cd33698 Making the iterator a tad safer (tweaking the constructor so that it can throw). 2019-04-22 10:53:25 -04:00
Geoff Langdale 777b9c9a9e Unbreak x86. Durp. 2019-03-30 15:50:35 +11:00
Geoff Langdale 5ba29122fd First cut of ARM port. Needs hand-hacked Makefile. 2019-03-30 00:47:35 -04:00
saka1 ddc2867f94 Adjust format and comments on avxcheckOverlong (#129) 2019-03-25 10:06:27 -04:00
Geoff Langdale 2c23b375b2 Temporarily added a non-x86 definition of SIMDJSON_PADDING 2019-03-21 11:37:40 +11:00
Geoff Langdale 9b6d32346b Fixup portability.h to be more portable. 2019-03-21 11:25:51 +11:00
Daniel Lemire 374fe1af1e
Updating comments regarding usage 2019-03-15 12:10:10 -04:00
Daniel Lemire bf9b1b1457 New version (mostly setting the singleheader version in sync). 2019-03-13 21:02:39 -04:00
Daniel Lemire d5a185b13e new release. 2019-03-13 20:02:44 -04:00
Daniel Lemire df8f792183
Store the string lengths on the string tape (#101)
* Store string length in the string-tape item.
* Files are now limited to 4GB.
* Moving detection of unescaped chars to stage 1 to reduce the burden due to string parsing.

Fixes https://github.com/lemire/simdjson/issues/114

Fixes https://github.com/lemire/simdjson/issues/87
2019-03-13 19:32:57 -04:00
Daniel Lemire 609e96b5d1 Fix for https://github.com/lemire/simdjson/issues/119 2019-03-13 11:01:31 -04:00
myd7349 d2fa086198 Fix C4146 build error on UWP with MSVC (#113)
* Fix C4146 build error on UWP with MSVC

* Regenerate single header version

* Fix typo in parsedjson.h

* Regenerate single header version
2019-03-09 08:46:06 -05:00
Thomas Navennec 352dd5e7fa Change parse_json return type from bool to int (#82)
* Added simdjerr namespace

* Updated jsonparser files

* updated stage1 and stage2

* removed stage2 inline function

* Added forgotten return statements

* Updated tools and benchmarks

* Corrected parenthesis

* Removed extra =

* Accidentally undid reinterpret_cast

* Better comments, undid a header name fuckup

* Added an errorMsg method, updated readme

* Removed useless header from stage2

* Updated single-header file

* added simdjerr.cpp contents to simdjson.cpp

* Made single header version work

* Updated singleheader test, fixed simdjson.cpp

* Renamed simdjerr namespace and files to simdjson

* Updating the amalgamation.
2019-03-02 17:18:45 -05:00
Daniel Lemire a24e701b4e First release (0.0.1) 2019-02-26 10:14:49 -05:00
geofflangdale bdc2bc693f
Merge pull request #61 from NewProggie/fix_minor_problems
Fix minor problems
2019-02-26 20:50:03 +11:00
Kai Wolf 33341b60d8 Apply code review suggestions
- Undo explicit bool conversion
 - Don't check for NULL before deleting pointer
2019-02-26 09:36:28 +01:00
Geoff Langdale 5289bf3eeb Fixing Utf8 validation question #72 2019-02-26 13:17:29 +11:00
Kai Wolf e7683820d5
Merge branch 'master' into fix_minor_problems 2019-02-25 21:05:29 +01:00
Kai Wolf 95e6fc2844 Fix CI errors 2019-02-25 20:55:07 +01:00
Wojciech Muła 7830b1be87 Use nothrow (#65)
* Use C++11 features

* Use std::nothrow

By default new throws std::bad_alloc, so no check code would be executed.
2019-02-25 14:36:45 -05:00
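The reasoning in the commit above can be shown with a short sketch: plain `new` reports failure by throwing `std::bad_alloc`, so a null check placed after it never runs, whereas `new (std::nothrow)` returns `nullptr` and makes the check meaningful. The helper names `allocate_bytes` and `free_bytes` are hypothetical and not taken from the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper (not simdjson's API): returns nullptr on failure
// instead of throwing, so the caller's null check is actually reachable.
inline std::uint8_t *allocate_bytes(std::size_t length) {
  return new (std::nothrow) std::uint8_t[length];
}

// Releases a buffer obtained from allocate_bytes; deleting nullptr is a no-op.
inline void free_bytes(std::uint8_t *buffer) { delete[] buffer; }
```

A caller would test the returned pointer against `nullptr` and report an allocation error rather than relying on an exception handler.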
Egor Bogatov 83ab72079f Add link to C# version (#66)
* fix noinline for MSVC

* Add SimdJsonSharp link to README.md
2019-02-25 14:17:43 -05:00
Kai Wolf b521719b6f Fix old-style C-Casts 2019-02-23 17:31:38 +01:00
Kai Wolf ff22e75f95 Apply minor readability fixes 2019-02-23 17:28:20 +01:00
Daniel Lemire 3640ab9dd3 Fixing the makefile build. 2019-02-22 15:34:35 -05:00
Thomas Navennec 9606343b2c ParsedJson & ParsedJson::iterator definitions in .cpp files (#47)
* Minor change to benchmark cmake

* Moved ParsedJson and its Iterator to separate .cpp files

* Uncommented functions, that has nothing to do with this pr

* Removed really_inline comments

* Reinstated some inline functions to restore previous performance

* Re-merged iterator in ParsedJson

* Uncommented some WARN_UNUSED
2019-02-22 14:38:35 -05:00
Daniel Lemire 4d6ed2b2c1 Tape capacity increase by 32 bytes to allow for expected overflow. 2019-02-22 13:08:46 -05:00
Daniel Lemire 90c881a3de Invoking -mbmi 2019-01-16 13:26:24 -05:00
Daniel Lemire b5a2c41049 We need the Intel intrinsic. 2019-01-16 13:13:03 -05:00
Daniel Lemire 388f89f185 Working on improving portability 2019-01-16 12:33:06 -05:00
Daniel Lemire a00df9b992 Number parsing fix. 2019-01-04 17:36:52 -05:00
Daniel Lemire d94e8ee973 Fixing dead code. 2019-01-02 12:16:29 -05:00
Daniel Lemire 3ce1dd8087 Cleaning. 2018-12-31 17:13:32 -05:00
Daniel Lemire 58d41923fd
Porting to visual studio
Now builds on Visual Studio
2018-12-30 21:00:19 -05:00
Daniel Lemire 46ef59c679 Cleaning. 2018-12-27 20:19:10 -05:00
Daniel Lemire bf4089b33b Removing custom types (more standard code). 2018-12-27 20:09:25 -05:00
Daniel Lemire 20133963bc Trying a detailed analysis. 2018-12-19 21:23:37 -05:00
Daniel Lemire 0a109508de Added documentation of the tape format. 2018-12-18 15:09:27 -05:00
Daniel Lemire 779ce184fb Getting ready to document the tape format. 2018-12-18 14:21:22 -05:00
Daniel Lemire 0769c39e27 Ok. Looks complete. 2018-12-14 21:32:42 -05:00
Daniel Lemire 05a2547829 Adding benchmark. 2018-12-12 22:42:19 -05:00
Daniel Lemire 751dce98f5 Getting there slowly. 2018-12-11 22:39:39 -05:00
Daniel Lemire e8d3d784ab More fixing. 2018-12-10 22:21:03 -05:00
Daniel Lemire 058eb917d1 Better doc. 2018-12-10 22:00:16 -05:00
Daniel Lemire e4703a383b Even safer. 2018-12-10 20:54:31 -05:00
Daniel Lemire 7296d4d48b Fixing... 2018-12-10 17:39:19 -05:00
Daniel Lemire 05636f3a1d Cleaning. 2018-12-10 16:47:02 -05:00
Daniel Lemire 7fda77d51a Mostly fixed performance regression. 2018-12-10 15:35:42 -05:00
Daniel Lemire 8615760331 Should now pass. 2018-12-10 15:16:31 -05:00
Daniel Lemire 176d2ccda4 Tweaking. 2018-12-10 14:25:49 -05:00
Daniel Lemire 52c4b65f1e Progress validating the API. 2018-12-09 20:47:02 -05:00
Daniel Lemire a56e92a571 API works now. 2018-12-09 13:08:41 -05:00
Daniel Lemire 747bb16919 Iteration API implemented but untested. 2018-12-07 23:35:53 -05:00
Daniel Lemire 9df22452af First API implementation. 2018-12-07 22:19:57 -05:00
Daniel Lemire 628e4e3522 Fix https://github.com/lemire/simdjson/issues/26 2018-12-06 22:51:55 -05:00
Daniel Lemire beb030fc16 Tweaking 2018-12-06 22:23:57 -05:00
Daniel Lemire c2913d5d69 Adding dynamic memory allocation. 2018-12-06 21:44:26 -05:00
Daniel Lemire 8589a0588b More clever parse function. 2018-12-06 17:40:32 -05:00
Daniel Lemire e2d2d2f8ff Adding more tests. 2018-12-06 17:22:22 -05:00
Daniel Lemire 196c41e3bc Fixed typo 2018-12-06 11:37:26 -05:00
Daniel Lemire 0f9a7a6b2f Improving a bit the number parsing using MMX. 2018-12-05 23:08:33 -05:00
Daniel Lemire c8706c66ec Solving some build issues 2018-12-05 21:33:32 -05:00
Daniel Lemire 4a4bf8d98d Fixed issue where the numbers don't appear properly after parsing. 2018-11-30 22:40:10 -05:00
Daniel Lemire e3a4b41c2e Cleaning. 2018-11-30 22:02:32 -05:00
Daniel Lemire c11eefca32 More cleaning. 2018-11-30 21:31:05 -05:00
Daniel Lemire a8b99984f2 Intermediate step. 2018-11-30 20:27:16 -05:00
Daniel Lemire e5707331e9 Some refactoring. 2018-11-30 09:37:57 -05:00
Daniel Lemire 12b518578d Ok, the new code seems quite fast. 2018-11-29 22:15:02 -05:00
Daniel Lemire ce85dd0c3a Still need to streamline number parsing. 2018-11-29 17:56:17 -05:00
Daniel Lemire c1de7662c1 Simplifying function call. 2018-11-28 11:12:28 -05:00
Daniel Lemire b858a404f7 Adding missing include 2018-11-28 10:29:57 -05:00
Daniel Lemire 8648c4108e More cleaning. 2018-11-27 20:42:35 -05:00
Daniel Lemire ba0f6fea51 Cleaning. 2018-11-27 17:38:53 -05:00
Daniel Lemire 58ac242770 Ok. Let us benchmark this thing. 2018-11-27 15:05:50 -05:00
Daniel Lemire a43b0772e1 Lots and lots of cleaning. 2018-11-27 14:37:59 -05:00
Daniel Lemire 5fae7b2100 Still working 2018-11-27 10:10:39 -05:00
Daniel Lemire 50defa510f Stupid work. 2018-11-26 16:55:24 -05:00
Daniel Lemire 08ae836aa1 Removing dead file. 2018-11-20 20:53:42 -05:00
Daniel Lemire 1fcd2688f8 Better documentation. 2018-11-20 12:59:06 -05:00
Daniel Lemire bbeb64a70b Cleaning documentation. 2018-11-20 12:54:06 -05:00
Daniel Lemire 78e75a8bae Even faster. 2018-11-20 11:56:10 -05:00
Daniel Lemire 7dd590c43c Saving faster version. 2018-11-20 11:02:39 -05:00
Daniel Lemire 47ae00895a Forgot to save... 2018-11-09 21:42:44 -05:00
Daniel Lemire 17f5d0517d Opting for a more common intrinsic. 2018-11-09 21:41:15 -05:00
Daniel Lemire 76074a821f Various cleaning steps. 2018-11-09 21:31:14 -05:00
Daniel Lemire 0e5b939568 Merge branch 'master' of github.com:lemire/simdjson 2018-11-09 15:16:25 -05:00
Daniel Lemire c1a7e79862 Lifting the mem limit. (Dirty commit.) 2018-11-09 15:16:05 -05:00
Daniel Lemire df65de4ae2 Tuning presentation and fixing a problem with minifier benchmark. 2018-10-23 21:36:32 -04:00
Daniel Lemire 18633e02d2 Added more thorough testing. 2018-10-23 20:19:33 -04:00
Daniel Lemire 9738af68c8 Fixing up the code point parsing. I think that what is there is now correct.
I believe that there was a case of early optimization.
2018-10-19 22:07:22 -04:00
Daniel Lemire 8315f4c888 Cleaning up the code. 2018-10-17 21:31:22 -04:00
Daniel Lemire 35381279c3 Maybe we can do away with the fast ASCII trick. 2018-10-17 21:05:38 -04:00
Daniel Lemire e517414080 We include character-encoding validation. 2018-10-17 19:22:09 -04:00
Daniel Lemire 355e5d2ed3 Checking for unescaped chars. 2018-10-17 15:08:49 -04:00
Daniel Lemire 7eb7cd265a We can now parse crazy things like pi to 100 digits. 2018-10-08 15:24:16 -04:00
Daniel Lemire 70c122074f Tests. 2018-10-08 14:41:36 -04:00
Daniel Lemire 37adea9387 Adding a comment. 2018-09-30 14:44:30 -04:00
Daniel Lemire 314356d561 We have faster number parsing...? 2018-09-28 18:26:27 -04:00
Daniel Lemire 4ee515fa4b The new number parsing code is faster. 2018-09-28 14:45:34 -04:00
Daniel Lemire 57b840327f Faster number parsing? 2018-09-28 14:38:40 -04:00
Geoff Langdale 1e5d8ece56 Update API a bit 2018-09-28 14:59:30 +10:00
Geoff Langdale 89fd074ec9 Draft API. No implementation yet. 2018-09-28 14:55:57 +10:00
Geoff Langdale ceb55cc8db Pick new number parser as winner; move string parsing to own header 2018-09-28 14:27:48 +10:00
Daniel Lemire ecbe1158ed Added testing for number parsing. 2018-09-27 20:26:27 -04:00
Daniel Lemire e4094afe08 Moving toward having number-parsing testing. 2018-09-27 17:38:15 -04:00
Daniel Lemire 7606a43aa9 Merge branch 'master' of github.com:lemire/simdjson 2018-09-26 23:36:19 -04:00
Daniel Lemire 1c8339297d With new number parser (faster!). Removing the dependency on the doubleconv library (which proves to be useless). 2018-09-26 23:35:33 -04:00
Geoff Langdale ccb3670c7c DEBUG mode fixes. 2018-09-27 13:10:33 +10:00
Geoff Langdale 9f91650e72 Remove old 4-stage path. 2018-09-26 15:22:55 +10:00
Geoff Langdale c4c51627d3 Fix compile - jsonparser needs to include unified header 2018-09-26 11:33:35 +10:00
Geoff Langdale 682c224d1a Merge branch 'master' of https://github.com/lemire/simdjson 2018-09-26 11:29:23 +10:00
Geoff Langdale b0c05c03cc Fix linkage between call sites and headers, add dump code, cleanup 2018-09-26 11:28:22 +10:00
Daniel Lemire dee1bbe54e Integrating the new 3-stage approach. 2018-09-25 17:26:58 -04:00
Geoff Langdale 555926849d Bug cleanup (many vestiges of old 32-bit tape still there) and more encapsulation of tapes. 2018-09-25 16:24:39 +10:00
Geoff Langdale 053f04b15d Crude first cut of "stage34", a unified code-based DFA with explicit stack for stages 3 and 4. 2018-09-24 10:42:30 +10:00
Daniel Lemire 2aa6b93a02 Using a naive strtoll 2018-08-28 22:37:11 -04:00
Daniel Lemire 6807abff96 Made the code safer (at the expense of the memory usage). 2018-08-24 13:20:20 -04:00
Daniel Lemire 94ea7cefb0 Moving include files into a sensible subdirectory. 2018-08-20 17:51:38 -04:00
Daniel Lemire ef0d14c35c Minor fixes + new scripts. 2018-08-20 17:40:50 -04:00
Daniel Lemire fb65be64bb Major surgery. 2018-08-20 17:27:25 -04:00
Daniel Lemire 726eb5a030 Moved the files into subdirectories. 2018-08-20 14:45:51 -04:00