Commit Graph

660 Commits

Author SHA1 Message Date
Daniel Lemire 81609393f1
Fixing issue 1449. (#1451) 2021-02-21 16:33:05 -05:00
John Keiser 0634958329 Don't emit out of order iteration error for empty array 2021-02-21 11:43:36 -08:00
John Keiser b352b903e7 Fix bug where iterators didn't always report errors 2021-02-21 11:43:36 -08:00
John Keiser 3076de0405 Use SIMDJSON_DEVELOPMENT_CHECKS instead of SIMDJSON_PRODUCTION
Don't enable in retail
2021-02-20 11:46:01 -08:00
Daniel Lemire 610b3ad302
Adds Visual Studio 2017 to CI (for real) and adapt our build/tests (#1444) 2021-02-15 19:49:12 -05:00
Daniel Lemire 4c63a929bc
This makes it possible to a have document instance (DOM) that is separate from the parser if you would like. (#1430)
* This makes it possible to a have document instance that is separate from the parser if you would like.
2021-02-10 14:44:53 -05:00
John Keiser df7201ba42 Fix Windows assume error 2021-02-06 11:06:53 -08:00
John Keiser 14315ec5cd Default SIMDJSON_PRODUCTION to OFF for bare header usage 2021-02-06 11:06:37 -08:00
John Keiser 0f10fc9ad9 Fix Windows _assume warning 2021-02-05 18:53:39 -08:00
John Keiser ce678fd986 Fix GCC 7 strict-overflow warning 2021-02-05 18:53:31 -08:00
John Keiser 9d693da852 Only set container depth when a container iteration starts 2021-02-05 17:20:24 -08:00
John Keiser 22742b6bd6 Make max_depth() a simple check 2021-02-05 17:11:03 -08:00
John Keiser 3801ea7777 Disable all OUT_OF_ORDER_ITERATION checks when SIMDJSON_API_USAGE_CHECKS
is off
2021-02-05 16:39:44 -08:00
John Keiser c7935ceed1 Put parser capacity / max_depth back into parser 2021-02-05 16:39:36 -08:00
John Keiser ea119a5679 Start parsing at depth 1 instead of using descend_to for it 2021-02-05 16:39:34 -08:00
John Keiser 0d1c99a6ad Allow object lookup safety to be disabled
Use cmake -DSIMDJSON_API_USAGE_CHECKS=OFF ..
2021-02-05 10:18:01 -08:00
John Keiser e4626d233c Descend into fields at the value position, not the key 2021-02-05 10:18:01 -08:00
John Keiser 9934f65987 Store start index of each depth for safety 2021-02-05 10:17:28 -08:00
John Keiser b2de2dfd1b
Merge pull request #1416 from simdjson/jkeiser/safe-iterators-2
Add safety checks for out of order array/object iteration+indexing
2021-02-05 09:47:03 -08:00
John Keiser 3f2639a655
Merge pull request #1414 from simdjson/jkeiser/array-assert
Fix #1409 (assert when trying to get one value as multiple types)
2021-02-05 09:45:49 -08:00
Daniel Lemire 96536239c2
Deleting the function. (#1428) 2021-02-02 09:48:01 -05:00
Daniel Lemire 0e18453e34
Potential optimizations applied to jkeiser/array-assert (#1421)
* Some tuning.
* Using table lookups...
2021-02-01 12:39:10 -05:00
Daniel Lemire 777202e1f1
Why would you use a reference when looping? (#1422)
* Why would you use a reference?

* I missed a few cases.
2021-02-01 12:30:36 -05:00
Daniel Lemire 6f61ed1477
It appears that Qt uses macros for common terms like slots, signals and so forth. (#1425) 2021-02-01 11:31:29 -05:00
Daniel Lemire d6f33e4830
This adds a little test to see if we can compiler with very strict flags (conventional casts) (#1417)
* This adds a little test to see if we can compiler with very strict flags.

* Trimming a leftover old-style cast.

* More cleaning.

* A few more pedantic casts.
2021-01-27 18:37:30 -05:00
Tibbel 5613d30e97
partly replacement old-style-cast to c++ *_cast (#1403)
Co-authored-by: Tibbel <tibbel@ma-gi.de>
2021-01-27 13:33:48 -05:00
John Keiser 1bfbb6448a Check out-of-order error in object index 2021-01-26 20:49:14 -08:00
John Keiser c5b44f44f9 Add partial out-of-order check for field lookup 2021-01-26 20:00:39 -08:00
John Keiser 22b3ea93a8 Emit an error if user tries to iterate arrays out of order 2021-01-26 20:00:19 -08:00
John Keiser 18ecc0032d Reenable test that is now working 2021-01-26 15:15:09 -08:00
John Keiser 1a1532c8cc Return INCORRECT_TYPE when numbers fail to parse
Also add tests for trying to get multiple types in a row
2021-01-26 14:59:13 -08:00
John Keiser e6d2b7759a Fix assertion when getting array after failing to get a scalar
Also remove distinction between & and && for array start, acting like
other types
2021-01-26 14:09:54 -08:00
Daniel Lemire c96ff018fe Version 0.8. 2021-01-22 12:04:08 -05:00
Daniel Lemire 2a714f4e37
Hide the std::pair inheritance in our result instances (#1396)
* Fixing issue 1243

* The tie must go.

* Having std::pair be a protected inheritance breaks on demand.

* Putting it back.

* You really want to use emplace.

* Fixing one botched test.

* Prettier test.

* Using safer code.

* Fixing unsafe code.

* Simplifying the fuzzer.

* Trying another way.

* Ok. It should work without exceptions.

* Removing trailing spaces.
2021-01-18 12:00:02 -05:00
John Keiser 55faf4c5bc
Recommend simdjson::ondemand over simdjson::builtin::ondemand (#1380)
Co-authored-by: Daniel Lemire <lemire@gmail.com>
2021-01-14 17:33:49 -05:00
John Keiser 3279c2f15b Remove unnecessary "try_get_XXX" methods
Also don't distinguish between & and && (instead, advance the first time
you encounter a scalar no matter what)
2021-01-11 15:19:26 -08:00
John Keiser 38da15b501 Rename checkpoint() -> position(), add token_position type 2021-01-11 15:19:24 -08:00
John Keiser 0f515785c6 Support reading scalars out of order 2021-01-11 15:17:46 -08:00
John Keiser 6fed8d2a26 Use active implementation for stage 1 on demand 2021-01-11 14:57:52 -08:00
John Keiser 0039c5b981 Disallow parser.iterate("1"_padded), as it won't work 2021-01-01 19:18:00 -08:00
John Keiser a1cf588d5f Fix GCC 7 warning when inlining does its job 2020-12-21 09:19:07 -08:00
John Keiser e7e09e444c Use find_field in benchmark 2020-12-20 11:39:56 -08:00
John Keiser 195acc3e45 Add find_field / find_field_unordered to object 2020-12-20 11:39:50 -08:00
John Keiser b8426584fc Make field lookup order-insensitive. 2020-12-19 13:57:29 -08:00
Daniel Lemire 85001c55fb
Fixing UTF-8 validation under PPC64 (#1346)
* Entering a new UTF-8 test

* Maybe *I* had a bug in the tests.

* Replacing nulls with 1s.

* Let us try to be more verbose.

* Return 0.

* Fixing issue.

* Adding puzzler scenario.

* Fixing PPC64

Co-authored-by: Daniel Lemire <dlemire@rcs-power9-talos>
2020-12-19 10:42:27 -05:00
John Keiser f785f76d98
Merge pull request #1342 from simdjson/jkeiser/iter-safety
Safety: assert in debug mode when on demand values are used out of order
2020-12-16 15:58:29 -08:00
John Keiser d670906ff3 Don't keep a separate at_start boolean in object 2020-12-15 11:29:31 -08:00
Daniel Lemire c578f63f34
Fixing issue 1337 (#1338)
* Fixing issue 1337

* This test should always run.

* Removing bad attribute access.
2020-12-14 16:20:36 -05:00
Daniel Lemire 9a404bdcdc
Reenabling the optimized kernels (main branch). (#1344)
* Reenabling the optimized kernels (main branch).

* Defining SIMDJSON_CAN_ALWAYS_RUN_PPC64 and SIMDJSON_CAN_ALWAYS_RUN_ARM64

* Adding the bad UTF8 string from the fuzzer.

* Taking into account John's comments.

* Bumping the lib version.

* Update CMakeLists.txt
2020-12-14 14:28:45 -05:00
John Keiser f9c6dedca1 Add test for out-of-order parse asserts 2020-12-13 13:39:47 -08:00
John Keiser 73c7b7db3d Safety: check index of scalar values before use 2020-12-11 11:57:09 -08:00
John Keiser 6b02b06581
Merge pull request #1330 from simdjson/jkeiser/depth-tracking
Permit partial iteration in On Demand
2020-12-09 12:40:58 -08:00
John Keiser 806cb39103 Clean up some more benchmark/test code 2020-12-07 13:44:58 -08:00
John Keiser 3baba73cf5 Don't call functions in assume (VS2019 Clang doesn't like it) 2020-12-07 11:17:24 -08:00
John Keiser db6bf15361 Allow all array/object/etc. to be copied 2020-12-06 16:29:53 -08:00
John Keiser 3aa3175378 Add tests for recovering from partial child iteration 2020-12-06 16:19:06 -08:00
John Keiser 4c63956624 Enable object["x"]["y"] 2020-12-06 15:23:52 -08:00
John Keiser c89647af9e Make all values use same underlying iterator 2020-12-06 15:23:52 -08:00
John Keiser 1303a88769 Unconditionally track depth 2020-12-06 15:23:52 -08:00
Daniel Lemire cbacec0760 Releasing 0.7.0. 2020-12-04 18:56:15 -05:00
Daniel Lemire aaa12c8fb6
more thread testing (#1324)
* Adding a few tests.

* More tests.

* Trailing.

* More links/documentations.
2020-12-02 11:33:36 -05:00
Daniel Lemire 9b5150daef
Fixing issue 1247 (#1325) 2020-12-01 19:09:55 -05:00
Daniel Lemire bc4087ac96
Hunting for data races in document_stream (#1323)
* Adding matching tests.

* Actually adding thread testing actions.
2020-12-01 12:40:15 -05:00
Daniel Lemire dc69bc28ae
Trying to verify recent document stream issues. (#1318)
* Trying to verify recent document stream issues.

* Adding another one.

* More thorough tests.

* Removing trailing spaces.

* Working toward exposing some issues.

* Tweaking.
2020-11-27 17:04:10 -05:00
Daniel Lemire 53577f11e1 Explicit worker disposal. 2020-11-27 15:39:43 -05:00
Daniel Lemire 1abb64f6f6
This fixes issue 1302 (#1303)
* This verifies issue 1302

* I believe that this fixes the issue.

* Let us do it differently.

* Better comments.
2020-11-27 14:19:36 -05:00
Mikhail Matrosov b442b29ccf
Mark comparison operators as const (#1320)
Co-authored-by: Mikhail Matrosov <mikhail.matrosov@aimtech.com>
2020-11-27 14:18:42 -05:00
Daniel Lemire 5609851747
This will guard the batch_size so it cannot be so low as to accidentally burn through your CPU. (#1319)
* This will guard the batch_size so it cannot be so low as to accidentally burn through your CPU.
2020-11-27 09:37:54 -05:00
Paul Dreik 68a8004518
don't memcpy after failed alloc (#1315)
Should fix https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27675
2020-11-21 07:52:24 +01:00
Paul Dreik 0b39e3a6cf
add fuzzer for padded_string (#1312)
This also fixes an overflow problem.
2020-11-19 16:51:56 +01:00
Daniel Lemire 268f26a84f Tweak of the documentation. 2020-11-07 16:33:12 -05:00
Paul Dreik d9456b3030
fix numeric_limits usage problem on windows caused by the max macro (#1293)
See https://stackoverflow.com/questions/2789481/problem-calling-stdmax/2789509#2789509
2020-11-06 06:55:27 +01:00
Daniel Lemire 218c274090
Updating main branch for legacy libc++ support (#1288)
* Updating main branch for legacy libc++ support

* Adopting

* Removing unnecessary math header.

* Updating the single-header files so we can pass the new tests.

* Portable infinite-value detection is hard.

* Working toward disabling boost json selectively.

* Selectively disabling Boost JSON

* More work toward selectively disabling boost json.
2020-11-04 12:24:42 -05:00
Tyler Kennedy 921c79f26f
Add method to get the total number of slots used by an array in the t… (#1253)
* Add method to get the total number of slots used by an array in the tape.

* Whoops
2020-11-04 10:35:54 -05:00
Paul Dreik af4db55e66
remove trailing whitespace (#1284) 2020-11-03 21:48:09 +01:00
Convery e6258377d9
Additional define to silence the 32-bit portability warning. (#1279)
Under compilers like MSVC the `#pragma message` about portability will be issued for each translation module; regardless of whether or not `SIMDJSON_PORTABILITY_H` has already been defined. As such a new define `SIMDJSON_NO_PORTABILITY_WARNING`, can be defined prior to the inclusion of `<simdjson.h>` to silence it.
2020-11-02 08:37:13 -05:00
Paul Dreik 54ffbbe7db
remove asserts from compute_float_64 (#1276) 2020-11-01 18:34:42 +01:00
Paul Dreik 0b82f07115
fix segfault in numberparsing #1273 (#1274)
This was a read overflow.
2020-11-01 18:27:21 +01:00
Paul Dreik 265db2e533
fix non ascii sources (#1275)
Master does not build because of non-ascii sources, merging without waiting for CI.
2020-11-01 11:14:01 +01:00
Paul Dreik f93fb21c95
optionally disable deprecated apis (#1271)
Introduce cmake option SIMDJSON_DISABLE_DEPRECATED_API (default Off)
which turns off deprecated simdjson api functions by setting the macro
 SIMDJSON_DISABLE_DEPRECATED_API.

For non-cmake users, users will have to set SIMDJSON_DISABLE_DEPRECATED_API
by some other means to disable the api.

Closes #1264
2020-11-01 06:38:52 +01:00
Daniel Lemire b1444b4dfb
Mostly tiny changes, with one optimization to fallback for number parsing. (#1265)
* Mostly tiny changes, with one optimization to fallback for number parsing.

* Missed an update.
2020-10-29 11:18:11 -04:00
Danila Kutenin f46a0f64f2
PPC64 support (#1254)
* Initial PPC64 support

* Add travis CI

* Fix outdated cmake version for travis

* Fix indendtation

* Try another workaround for outdated cmake in travis

* Try beta cmake

* Add dash before beta

* Use builtin snaps

* Use cmake as rocksdb

* Test cmake on bionic

* Remove unnecessary things from travis

* Remove unnecessary things from travis

* Another try of compiler install

* Add all major compilers

* Add all major compilers

* Add all major compilers

* Tweak travis a bit

* Typo

* More robust travis

* Typos typos typos

* Add fewer compilers, add non specific build for clang and gcc, should be the final config

* CMAKE_FLAGS is in incorrect place

* Remove default implementation

* Limit build thread number

* Fall back prefix_xor to a usual implementation, no performance boost is noticed

* Test for power9 as it is the main architecture for OpenPOWER right now

* Add to documentation to build with power9 as the implementation is compatible but compiler optimizations is not

* Replace ARM with PPC in the comment
2020-10-27 18:43:39 -04:00
Daniel Lemire a75c07065f
Fix for issue 1246. We document the relationship between parser instances and elements (#1250)
* Fix for issue 1246.

* Adopting John's wording.
2020-10-26 08:40:45 -04:00
Daniel Lemire 6a86ef5a7d Issue release 2020-10-23 09:32:25 -04:00
Daniel Lemire 0d6919dd99
Reenable the on-demand tests and allows us to convert a raw string into a C++ string. (#1232)
* Reenable the on-demand tests and allows us to convert a raw string into a C++ string.

* Fixing a 1-byte buffer overrun.

* More documentation.

* Adding more tests.

* Enabling the new tests

* Committing a nicer example.

* Not yet happy but this should fix our failures.

* Duh.

* Ok. Making it easier to get string_view instances from field instances.

* It is a struct.

* Trying to satisfy VS.

* Adopting John's name.
2020-10-19 20:22:24 -04:00
Paul Dreik f1b4a54991
add fuzz element (#1204)
* add definitions for is_number and tie (by lemire)
* add fuzzer for element
* update fuzz documentation
* fix UB in creating an empty padded string
* don't bother null terminating padded_string, it is done by the std::memset already
*  refactor fuzz data splitting into a separate class
2020-10-17 05:48:50 +02:00
Paul Dreik 58e7106df1
remove unused function parse_unsigned 2020-10-16 22:17:11 +02:00
Paul Dreik 7bf391c54a
fix potential use of uninitialized value warning, avoid casting away const
This fixes a "potentially use of uninitialized value" warning, as well as a cstyle cast to non-const.
2020-10-16 22:14:42 +02:00
Daniel Lemire 07a6e098c8
This would allow users to find out what builtin is. (#1227)
* This would allow users to find out what builtin is.

* Trying another approach.

* Added instructions.

* Cleaning up the printout.

* Let us be less invasive.

* Adding a comment.
2020-10-15 21:58:42 -04:00
Daniel Lemire e4897d6b54
We have hardcoded 32 (#1236) 2020-10-15 21:57:10 -04:00
Daniel Lemire bb2bc98a22
Fix issue https://github.com/simdjson/simdjson/issues/1127 (#1224) 2020-10-13 09:18:54 -04:00
Daniel Lemire 43da4f7ccc Corrected number 2020-10-12 17:59:13 -04:00
Daniel Lemire 37e6d1e9c7
new number parsing (#1222)
* Remove our dependency on strtod_l by bundling our own slow path.

* Ok. Let us drop strtod entirely.

* Trimming down the powers to -342.

* Removing useless line.

* Many more comments.

* Adding some DLL exports.

* Let the gods help those who rely on windows+gcc.

* Marking the subnormals as unlikely. This is pretty much "performance neutral", but it might help just a bit with twitter.json.
2020-10-10 12:47:49 -04:00
John Keiser 676a3d068c Add non-top-level array iteration test 2020-10-06 11:29:46 -07:00
John Keiser 5533f8d87b Add object iteration error tests 2020-10-06 11:29:46 -07:00
John Keiser 364ad5529d Add basic error tests 2020-10-06 11:29:46 -07:00
John Keiser 9088792b0e Disable value["x"] (unsafe, convert to object first) 2020-10-06 11:29:45 -07:00
John Keiser 00f9bb8a07 Add null tests 2020-10-06 11:29:45 -07:00
John Keiser a90e1637cb Add boolean tests 2020-10-06 11:29:45 -07:00
John Keiser ce09d82fc7 Test strings 2020-10-06 11:29:45 -07:00