Commit Graph

439 Commits

Author SHA1 Message Date
mir4cle c0d18452fc
Add an option to get current location from value (#1738)
Co-authored-by: Igor Logvanev <igor.logvanev@aimtech.team>
2021-10-24 16:55:49 -04:00
Daniel Lemire 6d308a08c5
Fixing issue 1736 (#1737)
* Fixing issue 1736

* Updating google benchmark.

* Minor trimming.

* Using the variable (to silence a warning).

* Adding assignment operator.
2021-10-20 12:15:35 -04:00
Daniel Lemire b7c4d1eeef
Adding test for issue 1729. (#1730)
* Adding test for issue 1729.

* Adding comment.

* Trying to move to 11.7.

* Tweaking.

* More tweaking.

* Adding additional test.

* Missing "<<".

* Minor update.

* Removing legacy systems.
2021-10-13 09:30:37 -04:00
Daniel Lemire 91908ade4d
Additional documentation following issue 1723 (#1724)
* Some extra documentation regarding issue 1723.

* Adding comments.

* Minor fix.

* [no ci] more documentation
2021-10-09 11:41:20 -04:00
Daniel Lemire d996ffc494
Minor typo. (#1721)
* Minor typo.

* Minor fixes.

* Patching...
2021-09-25 11:34:44 -04:00
Daniel Lemire cae5e5342f
Additional ndjson tests. (#1717)
* Additional ndjson tests.

* Switching the data source.

* Fixing.
2021-09-18 16:29:10 -04:00
Daniel Lemire af4ff7cc33
Adding fast "get_number_type()" function, bypassing "get_number()" (#1713)
* Adding fast "get_number_type" function, bypassing "get_number"

* Minor tweak.

* Adding missing get_number_type().
2021-09-07 14:34:40 -04:00
Nicolas Boyer c9179ad81d
Add count_fields method for objects (#1712)
* Implement count_elements for object

* Add count_elements() for simdjson_result

* Add count_elements for documents(arrays,objects).

* Add tests for objects.

* Add tests for documents array. Typos.

* Renaming to count_fields() for objects.

* Update doc.

* Apply patch
2021-09-02 16:18:48 -04:00
Daniel Lemire cebe3fb299
Tweaking current_location(). (#1707)
* Tweaking current_location().

* Well.
2021-08-28 20:19:30 -04:00
Nicolas Boyer ed7343f7f2
Provide current location in JSON input (#1695)
* Setup.

* Add current_location().

* Make return simdjson_result and fix cast issues.

* Whitespace.

* Add broken JSON tests. Add null parser check.

* Remove unused variables.

* Alive fix.

* Fix merge issues.

* Simplification for out of bounds.

* More tests.

* Move pointer back for unrecoverable errors.

* Add new error OUT_OF_BOUNDS

* Remove unnecessary include and fix OUT_OF_BOUNDS.

* Add more tests. Fix unrecoverable errors.

* Fix tests.

* Modify one test.

* Update doc.

* Typos.

* Add read_me tests.

* Update doc.

* Add current_location for simdjson_result and document_reference

* Typos.

Co-authored-by: Daniel Lemire <lemire@gmail.com>
2021-08-27 13:41:59 -04:00
Daniel Lemire 35158257c6
Implementing get_number for the document instances. (#1706)
2021-08-27 10:26:01 -04:00
Daniel Lemire b935ce2e06
Allowing casts instead of get_double, get_uint64 and get_int64 (#1705)
2021-08-27 10:25:17 -04:00
Daniel Lemire 4afe7565b4
ondemand dynamically-typed numbers (#1704)
* Building up a number type.

* Implemented is_integer and is_negative.

* Implemented get_number in value_iterator.

* Final prototype.

* [no ci] typo
2021-08-26 12:16:44 -04:00
Daniel Lemire 6bed34ad61
This exposes 'reset' for object and array instances. (#1696)
* This exposes 'rewind' for object and array instances.

* Putting really_inline back to count_elements()

* Update array.h

* Adding empty array rewind.

* Adds "is_empty" method to arrays.

* More fragmentation.

* Tweaking implementation.

* Fixing issue with get_value() on document instances.

* Changing the name of the new rewind functions to reset.
2021-08-21 10:23:59 -04:00
Daniel Lemire 0ad52a7e22
Renaming scalar to is_scalar. (#1698)
2021-08-21 10:23:22 -04:00
Daniel Lemire aa52cf6868
Alive fix. (#1700)
2021-08-21 10:22:59 -04:00
strager d036fdf919
Reduce #include bloat (<iostream>) (#1697)
Including <iostream> has two problems:

* Compile times are worse because of over-inclusion
* Binary sizes are worse when statically linking libstdc++ because
  iostreams cannot be dead-code-stripped

simdjson only needs std::ostream. Include the header declaring only what
we need (<ostream>), omitting stuff we don't need (std::cout and its
initialization, for example).

This commit should not change behavior, but it might break users who
assume that including <simdjson/simdjson.h> will make std::cout
available (such as many of simdjson's own files).
2021-08-13 11:24:36 -04:00
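
The `<ostream>` versus `<iostream>` distinction described in commit d036fdf919 can be sketched as follows. This is a minimal illustration, not simdjson code: the `point` type and the `to_text` helper are hypothetical names introduced only for this example. It shows that a header declaring a streaming operator needs only `std::ostream`, which `<ostream>` provides.

```cpp
// Header-style code: only the std::ostream declaration is required.
#include <ostream>  // declares std::ostream; std::cout lives in <iostream>
#include <sstream>  // std::ostringstream, used by the helper below
#include <string>

// Hypothetical type used only for illustration (not part of simdjson).
struct point {
  int x;
  int y;
};

// A streaming operator depends on std::ostream alone, so <ostream> suffices.
inline std::ostream &operator<<(std::ostream &out, const point &p) {
  return out << "(" << p.x << ", " << p.y << ")";
}

// Hypothetical helper: formats a point through any std::ostream,
// here an in-memory std::ostringstream.
inline std::string to_text(const point &p) {
  std::ostringstream buffer;
  buffer << p;
  return buffer.str();
}
```

Code that wants to print to `std::cout` should include `<iostream>` itself rather than rely on another header pulling it in, which is the breakage the commit message warns about.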
Daniel Lemire de4deb8c4e
Makes it possible to cast a document to a value. (#1690)
* Makes it possible to cast a document to a value.
2021-08-11 20:02:30 -04:00
Daniel Lemire ba46616cbc
Small test for document_reference usage. (#1694)
2021-08-10 21:08:59 -04:00
Daniel Lemire 19902abaf8
Guarding first/second access. (#1688)
* Guarding first/second access.

* Correcting our own usage.

* Adding more documentation.
2021-08-06 20:25:05 -04:00
Daniel Lemire 06643fc9f5
Additional tests and document tuning (#1684)
* Additional example.

* Adds more tests.

* Actually using the variable.
2021-08-02 16:35:02 -04:00
Daniel Lemire 0fa68d8930
Fixing noexcept on operator << with simdjson_result. (#1678)
* Additional tests.

* Finishing touch.

* Extending to IO.
2021-07-31 17:54:27 -04:00
Daniel Lemire cc98358453
Adding error handling examples to the documentation (#1679)
* Adding error handling examples.

* Guarding the exception-throwing test.
2021-07-31 14:31:48 -04:00
Daniel Lemire d83e69d977
Fix an issue with truncated-byte function. (#1674)
2021-07-30 13:12:42 -04:00
Daniel Lemire c6ef2105ab
Minor tweak.
2021-07-27 10:56:05 -04:00
Daniel Lemire eb93b98d6a
verify and fix issue 1668 (#1673)
* Adding test.

* Verifies and fixes issue 1668. This commit updates the previous behavior of the
On Demand stream support by returning a value type (document_reference) instead
of a reference to a document. This allows us to bridge with the usual simdjson
error system, with its simdjson_result types.

* Minor reformat.

* Adds a test with initial tests passing.

* Adding an example.
2021-07-27 08:51:07 -04:00
Nicolas Boyer 7d887fdc1e
Parse numbers inside strings (#1667)
* Update basics.md to document JSON pointer for On Demand.

* Add automatic rewind for at_pointer

* Remove DOM examples in basics.md and update documentation reflecting addition of at_pointer automatic rewinding.

* Review

* Add test

* Naive implementation for doubles in string.

* Add double from string in atom doc.

* Simplification (removed all *_from_string())

* Add int and uint parsing in string.

* Make duplicates instead.

* Make tests exceptionless.

* Add missing declarations.

* Add more tests (errors, JSON pointer).

* Add crypto json tests.

* Update doc.

* Update doc after review.

Co-authored-by: Daniel Lemire <lemire@gmail.com>
2021-07-27 08:50:44 -04:00
Daniel Lemire b79261eebc
This cleans up the current code a bit, especially with respect to EOF guards. (#1669)
* Upgrading the GitHub Actions.

* Upgrading appveyor

* Upgrading circle ci.

* Cleaning.
2021-07-25 10:36:22 -04:00
Daniel Lemire 47a62db559
Isolated jkeiser fix for issue 1632: make it so that INCORRECT_TYPE is a recoverable condition in On Demand (#1663)
2021-07-23 11:32:26 -04:00
Nicolas Boyer 5c590b8434
Bringing ndjson(document_stream) to On Demand (#1643)
* Update basics.md to document JSON pointer for On Demand.

* Add automatic rewind for at_pointer

* Remove DOM examples in basics.md and update documentation reflecting addition of at_pointer automatic rewinding.

* Review

* Add test

* Add document_stream constructors and iterate_many

* Attempt to implement streaming.

* Kind of fixed next() for getting next document

* Temporary save.

* Putting in working order.

* Add working doc_index and add function next_document()

* Attempt to implement streaming.

* Re-anchoring json_iterator after a call to stage 1

* I am convinced it should be a 'while'.

* Add source() with test.

* Add truncated_bytes().

* Fix casting issues.

* Fix old style cast.

* Fix privacy issue.

* Fix privacy issues.

* Again

* .

* Add more tests. Add error() for iterator class.

* Fix source() to not include whitespace between documents.

* Fixing CI.

* Fix source() for multiple batches. Add new tests.

* Fix batch_start when document has leading spaces. Add new tests for that.

* Add new tests.

* Temporary save.

* Working hacky multithread version.

* Small fix in header files.

* Correct version (not working).

* Adding a move assignment to ondemand::parser.

* Fix attempt by changing std::swap.

* Moving DEFAULT_BATCH_SIZE and MINIMAL_BATCH_SIZE.

* Update doc and readme tests.

* Update basics.md

* Update readme_examples tests.

* Fix exceptions in test.

* Partial setup for amazon_cellphones.

* Benchmark with vectors.

* Benchmark with maps

* With vectors again.

* Fix for weighted average.

* DOM benchmark.

* Fix typos. Add On Demand benchmark.

* Add large amazon_cellphones benchmark for DOM

* Add benchmark for On demand.

* Fix broken read_me test.

* Add parser.threaded to enable/disable thread usage.

Co-authored-by: Daniel Lemire <lemire@gmail.com>
2021-07-20 14:17:23 -04:00
Daniel Lemire 2dac3705d2
renames 'to_string' to 'to_json_string' and makes it ridiculously fast (#1642)
* Changing the name of the function to 'to_json_string' from 'to_string' to avoid confusion.

* Moving to a fast string_view model

* Making it exception-safe.

* Tweaking.

* Workaround for exceptions.

* more robust to_json_string (#1651)

* WIP.

* Fuzzing timeout  (bug fix) (#1650)

* prove pull request #1648 introduces an infinite loop

* Interesting bug!

* Tweak.

Co-authored-by: Paul Dreik <github@pauldreik.se>

* It should now work.

* Moving car examples to exception mode

* Simplifying somewhat.

* I forgot to abandon. Let us do that.

* Adding more tests.
2021-07-19 10:24:36 -04:00
Daniel Lemire 33f73b577c
Another attempt at producing problems with threads (more tests) (#1655)
* Another attempt at producing problems with threads.

* Fixing code

* Trying to please visual studio
2021-07-13 17:30:29 -04:00
Daniel Lemire ea3d4e7ce5
Fuzzing timeout (bug fix) (#1650)
* prove pull request #1648 introduces an infinite loop

* Interesting bug!

* Tweak.

Co-authored-by: Paul Dreik <github@pauldreik.se>
2021-07-06 14:36:38 -04:00
Daniel Lemire bea1483cde
Fixing minor issue with document stream (DOM). (#1648)
* Fixing minor issue with document stream (DOM).

* Porting over the fix.
2021-07-05 17:40:04 -04:00
Nicolas Boyer eb849662c0
Update basics.md to document JSON pointer for On Demand. (#1618)
* Update basics.md to document JSON pointer for On Demand.

* Add automatic rewind for at_pointer

* Remove DOM examples in basics.md and update documentation reflecting addition of at_pointer automatic rewinding.

* Review

* Add test

Co-authored-by: Daniel Lemire <lemire@gmail.com>
2021-06-26 11:38:17 -04:00
Daniel Lemire f146294a85
Partial documentation regarding relative JSON pointers. (#1630)
* Attempt at bringing some sanity to partial/relative JSON pointers.

* Removing some white spaces.
2021-06-26 11:36:38 -04:00
Daniel Lemire 5b99a75ae1
count_elements did not like empty arrays. (#1631)
* count_elements did not like empty arrays.

* Minor cleaning.

* I don't understand.

* More cleaning.
2021-06-24 11:08:13 -04:00
John Keiser 1ba73b9e6b
Merge pull request #1629 from simdjson/jkeiser/vscode-config
Add .vscode workspace settings
2021-06-23 19:29:21 -06:00
Daniel Lemire cfe3adb599
Added tests over invalid documents. (#1626)
* Added tests over invalid documents.

* Tweaking.
2021-06-23 18:02:00 -04:00
John Keiser ca8e21583c
Add basic workspace configuration for vscode
2021-06-23 12:28:00 -06:00
Daniel Lemire 1c01fc35eb
This better documents invalidation. (#1625)
* This better documents invalidation.

* Tweak.
2021-06-22 11:33:25 -04:00
Nicolas Boyer ce38fe7bea
Add automatic rewind for at_pointer (#1624)
2021-06-21 15:17:24 -04:00
Nicolas Boyer a4803d50c5
Add JSON Pointer for On Demand (#1615)
* Add working JSON pointer for array of atoms.

* Add working JSON pointer for object with key-atom pairs.

* Add first version of JSON pointer.

* Update tests (2 tests).

* Make tests exceptionless.

* Fix building issues.

* Add more tests. Add json_pointer validation in array-inl.h and object-inl.h and empty json_pointer in document-inl.h.

* Fix errors in tests.

* Review.

* Add missing comment.
2021-06-11 14:20:05 -04:00
Nicolas Boyer 3ba221eb8e
Add max_capacity setting for On Demand (#1610)
* First try at implementing max_capacity for simdjson_ondemand.

* Add max_capacity check.

* Update doc.

* Add one more example in doc for fixed capacity.

* Make allocate() public.

* Remove whitespace

* Found culprit whitespace.

* Duplicating variable.
2021-06-08 14:42:42 -04:00
Daniel Lemire 13ab123daf
Testing issue 1607. (#1608)
2021-06-07 10:50:48 -04:00
Daniel Lemire 16e8db1f17
Adding 'count_elements' method. (#1577)
* Adding 'count_elements' method.

* Actually reporting errors.

* removing white space.

* Removing white space again.

* Adding an extra example.

* Prettier.

* Making the functionality more error-proof.

* Avoiding exceptions.

* Various fixes including extending count_elements to value types.

* Various fixes.

* Minor fixes.

* Correcting comment.

* Trimming white spaces.
2021-06-06 17:56:00 -04:00
Daniel Lemire eb0ae041e3
Verification and bug fix of issue 1511 (#1602)
* Verification and bug fix.

* Removing comment.

* Removing spaces.

* Guarding exceptions.

* Tweaking the test
2021-06-06 17:55:33 -04:00
John Keiser 893e613faa
Don't #include "simdjson.cpp" in tests (#1605)
2021-06-06 14:44:04 -04:00
Daniel Lemire 714f0ba222
This deletes most of our data files making the repository much smaller (#1582)
* This deletes most of our data files making the repository much smaller.

* Removing dead code.

* Various minor fixes.
2021-06-04 09:24:03 -04:00
Daniel Lemire 19c3b1315a
Rewind functionality. (#1539)
* Rewind functionality.


* Keeping just the document rewind.
2021-06-04 09:22:33 -04:00