Commit Graph

876 Commits

Author SHA1 Message Date
piotte13 f345490cae Updating .gitignore for most popular IDEs 2019-11-26 10:59:18 -05:00
Jeremie Piotte db141e82c9
Specifying that RFC7464 is not supported 2019-11-26 10:33:33 -05:00
Jeremie Piotte f163155929 JsonStream documentation (#381)
* adding Multiline JSON competition chart to doc
* Completing the comments for JsonStream
* Adding a page for JsonStream's documentation.
2019-11-25 18:11:55 -05:00
John Keiser 9b6377fd80 Precalculate the ASCII path 2019-11-25 11:49:44 -08:00
John Keiser 7356b4532f Perform UTF-8 detection via flag lookup algorithm
- adds the alternative zwegner, range and lookup utf8 algorithms as well, for
ability to do "shootouts"
2019-11-25 11:49:44 -08:00
John Keiser 7d7bec856d Remove lookup_lower_4_bits
It's only a coincidence that it works in current uses: it doesn't do
what the name says. Particularly, if the high bit is 1 it will yield
0 even if the lower 4 bits would yield something else.
2019-11-25 11:49:44 -08:00
Paul Dreik c5504ef50b
run the oss fuzz initial seed corpus in CI (#378)
This makes sure the seed corpus keeps being healthy.
2019-11-23 22:49:41 +01:00
Daniel Lemire 3658ff650d
Delete Notes.md 2019-11-23 14:15:25 -05:00
Paul Dreik 6d14afd80e
Make threads optional in the cmake build (#376)
Only the simdjson library should optionally depend on threads,
the executables that link to simdjson will get the dependency
indirectly.

* add option for controlling threads (default is on)
* add CI testing with threading on/off for msvc, gcc and clang
* fix an unrelated copy paste comment error in the circle ci build conf
2019-11-22 21:51:46 +01:00
Jeremie Piotte 6e5178efc4
Update CONTRIBUTORS 2019-11-21 16:49:07 -05:00
Jeremie Piotte 29fc51522a
Introducing concurrency mode in JsonStream. (#373)
* JsonStream threaded prototype

* JsonStream Threaded version working. Still supporting non-threaded version.

* Fix where invalid files would enter infinite loop.

* SingleHeader update

* I will remove -pthread in cmake for now.

* Attempt at resolving the -pthread issue
2019-11-21 11:22:06 -05:00
Daniel Lemire 6cd8fb7982
Adding a getline benchmark (#344) 2019-11-20 20:33:16 -05:00
John Keiser ce824f8653 Decrease stage 1 step size to 64 bytes on Westmere/ARM
- Templatize scan_step() with STAGE1_STEP_SIZE
- Fix simd8::store()
- add NUM_CHUNKS to simd8
2019-11-18 21:58:07 -08:00
John Keiser 708f4a094d Move inline functions out of class definition for templating 2019-11-18 21:58:07 -08:00
Paul Dreik 2704b73399
Add fuzzer badge and improve fuzzer documentation (#367)
* Update Fuzzing.md

* add oss-fuzz badge
2019-11-13 16:57:20 +01:00
Paul Dreik 783ccd6c21
Add CI Fuzz job
This runs fuzzing for a short while, then executes the corpus through valgrind.
The extended corpus is uploaded to persistent storage on bintray.
2019-11-12 16:46:23 +01:00
Paul Dreik 3fd1c3b64a run short fuzzing and valgrind in github action 2019-11-11 22:17:32 +01:00
Daniel Lemire 58d249ca16
Introducing move assignments. (#363) 2019-11-09 10:34:32 -05:00
Jeremie Piotte bdc2b07339
Streams of JSON documents + Large files (>4GB) (#350) (#364)
* rough prototype working.  Needs more test and fine tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* type

* minor fixes and cleaning.

* minor fixes and cleaning.

* removing warnings

* removing some copies

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* making jsonstream portable

* cleaning imports

* including <algorithm> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* merged main into branch

* bug fix for rare cases where JsonStream would fail.

* Adding a JsonStream Demo to Amalgamation

* Fix for https://github.com/lemire/simdjson/issues/345

* Follow up test and fix for https://github.com/lemire/simdjson/issues/345 (#347)

* Final (?) fix for https://github.com/lemire/simdjson/issues/345

* Verbose basictest

* Being more forgiving of powers of ten.

* Let us zero the tail end.

* add basic fuzzers (#348)

* add basic fuzzing using libFuzzer

* let cmake respect cflags, otherwise the fuzzer flags go unnoticed

also, integrates badly with oss-fuzz

* add new fuzzer for minification, simplify the old one

* add fuzzer for the dump example

* clang format

* adding Paul Dreik

* rough prototype working.  Needs more test and fine tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* type

* minor fixes and cleaning.

* Fixing issue 351 (#352)

* Fixing issues 351 and 353

* minor fixes and cleaning.

* removing warnings

* removing some copies

* Fix ARM compile errors on g++ 7.4 (#354)

* Fix ARM compilation errors

* Update singleheader

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* fix integer overflow in subnormal_power10 (#355)

detected by oss-fuzz

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714

* Adding new test file, following https://github.com/lemire/simdjson/pull/355

* making jsonstream portable

* cleaning imports

* including <algorithm> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* merged main into branch

* bug fix for rare cases where JsonStream would fail.

* Adding a JsonStream Demo to Amalgamation

* merging main

* rough prototype working.  Needs more test and fine tuning.

* prototype working on large files.

* prototype working on large files.

* Adding benchmarks

* jsonstream API adjustment

* minor fixes and cleaning.

* minor fixes and cleaning.

* removing warnings

* removing some copies

* runtime dispatch error fix

* makefile linking src/jsonstream.cpp

* fixing arm stage 1 headers

* fixing stage 2 headers

* fixing stage 1 arm header

* making jsonstream portable

* cleaning imports

* including <algorithm> for windows compiler

* cleaning benchmark imports

* adding jsonstream to amalgamation

* bug fix for rare cases where JsonStream would fail.

* Adding a JsonStream Demo to Amalgamation

* rough prototype working.  Needs more test and fine tuning.

* minor fixes and cleaning.

* adding jsonstream to amalgamation

* merged main into branch

* Adding a JsonStream Demo to Amalgamation

* merging main

* merging main

* make file fix
2019-11-08 17:39:45 -05:00
Daniel Lemire 6888ca709d
Update README.md 2019-11-08 16:39:09 -05:00
Paul Dreik 8ae818e17c add ossfuzz support (#362)
* initial oss-fuzz friendly build

parts taken from libfmt, which I wrote and have the copyright to

* fix build error

* add script for building a corpus zip

see https://google.github.io/oss-fuzz/getting-started/new-project-guide/#seed-corpus

* fix zip command

* drop setting the C++ standard

* disable the minify fuzzer, does not pass oss-fuzz check-build test

* fix integer overflow in subnormal_power10

detected by oss-fuzz

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714

* invoke the build like oss fuzz does

* document what the scripts are for and how to use them

* add a page about fuzzing
2019-11-08 10:32:43 -05:00
Daniel Lemire c4f1baad31
Making get_corpus safer (#360) 2019-11-06 12:22:42 -05:00
Daniel Lemire 3439ce19c9
Adding a flag which allows us to disable AVX detection. This exposes a bug. (#356) 2019-11-06 10:39:26 -05:00
John Keiser b7c18df540
Merge pull request #346 from lemire/jkeiser/simd_u8
Genericize SIMD arch code with `simd8<T>`
2019-11-05 19:49:14 -08:00
John Keiser 74799134b1 Add cpuinfo to checkperf 2019-11-05 13:44:04 -08:00
John Keiser 3828e1e538 Fix performance issues:
1. Don't recast "int" result of movemask to uint32_t
2. Call max_epu8 with the mask first and the bytes second.
2019-11-05 13:44:04 -08:00
John Keiser d89046d515 Use simd8 helpers for find_bs_bits_and_quote_bits 2019-11-05 13:44:04 -08:00
John Keiser 4bc128f07e Move compute_quote_mask to generic bitmask library 2019-11-05 13:44:04 -08:00
John Keiser e383b7a6ab Use generic simd operators for find_whitespace_and_operators 2019-11-05 13:37:56 -08:00
John Keiser c89d6bf68b Genericize utf-8 check 2019-11-05 13:37:32 -08:00
Daniel Lemire 52640518d3 Adding new test file, following https://github.com/lemire/simdjson/pull/355 2019-11-05 08:58:41 -05:00
Paul Dreik cf493254b7 fix integer overflow in subnormal_power10 (#355)
detected by oss-fuzz

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18714
2019-11-04 16:54:03 -05:00
John Keiser c97eb41dc6 Fix ARM compile errors on g++ 7.4 (#354)
* Fix ARM compilation errors

* Update singleheader
2019-11-04 10:36:34 -05:00
Daniel Lemire b1224a77db
Fixing issue 351 (#352)
* Fixing issues 351 and 353
2019-11-01 16:05:28 -04:00
Daniel Lemire 17b777f751
adding Paul Dreik 2019-10-28 14:47:45 -04:00
Paul Dreik 9442c9e1f4 add basic fuzzers (#348)
* add basic fuzzing using libFuzzer

* let cmake respect cflags, otherwise the fuzzer flags go unnoticed

also, integrates badly with oss-fuzz

* add new fuzzer for minification, simplify the old one

* add fuzzer for the dump example

* clang format
2019-10-28 14:46:57 -04:00
Daniel Lemire 15740500af Let us zero the tail end. 2019-10-24 18:49:30 -04:00
Daniel Lemire 3484dda45e Being more forgiving of powers of ten. 2019-10-24 18:27:24 -04:00
Daniel Lemire 1ece6c0e2f Verbose basictest 2019-10-24 16:40:40 -04:00
Daniel Lemire 59cad23aeb Merge branch 'master' of github.com:lemire/simdjson 2019-10-24 16:34:10 -04:00
Daniel Lemire da1c35d04b Final (?) fix for https://github.com/lemire/simdjson/issues/345 2019-10-24 16:33:37 -04:00
Daniel Lemire c469aed047
Follow up test and fix for https://github.com/lemire/simdjson/issues/345 (#347) 2019-10-24 16:06:29 -04:00
Daniel Lemire a065805b0f Fix for https://github.com/lemire/simdjson/issues/345 2019-10-24 15:34:30 -04:00
Jérémie Galarneau f41a18b57d Fix typo in README.md: technical -> technique (#338) 2019-10-19 11:07:02 -04:00
Daniel Lemire 1257432df3 Adding list. 2019-10-18 17:32:54 -04:00
Daniel Lemire d0c0e31220 Increasing the ULP bound. 2019-10-18 17:30:29 -04:00
Daniel Lemire 9b7832c39a Adding another test (powers of two). 2019-10-16 17:47:52 -04:00
Daniel Lemire e3e29b720d Adding a check that powers of ten can be parsed (sanity check). 2019-10-16 16:27:50 -04:00
John Keiser 64872bddf4 Eliminate stage1_find_marks_flatten.h 2019-10-14 12:33:46 -07:00
John Keiser 81f2249575 Move stage1 into a class to pass fewer parameters 2019-10-14 12:33:46 -07:00