Commit Graph

582 Commits

Author SHA1 Message Date
John Keiser e180dc44bc Move container logging into json_iterator 2020-08-18 21:25:03 -07:00
John Keiser 268b8845a9 Document tape_builder 2020-08-18 21:25:03 -07:00
John Keiser 74c47995a3 Document json_iterator 2020-08-18 21:25:03 -07:00
John Keiser 24f5936cbf Give is_array responsibility to json_iterator 2020-08-18 21:25:03 -07:00
John Keiser bdfa8aca28 Separate interface from implementation to make interface clearer 2020-08-18 21:25:03 -07:00
John Keiser 15eb1ad922 Preface visitor methods with visit() 2020-08-18 21:25:03 -07:00
John Keiser 6ec98ee8b1 Add error codes to all things 2020-08-18 21:25:03 -07:00
John Keiser c5862d6de9 Remove empty_object/empty_array 2020-08-18 21:25:03 -07:00
John Keiser 57eb55446f Cache string value locally 2020-08-18 21:25:03 -07:00
John Keiser abd1399a7f Don't check depth at the end (unnecessary check) 2020-08-18 21:25:03 -07:00
John Keiser 57eba21ee5 Fall through goto labels where possible 2020-08-18 21:25:03 -07:00
John Keiser 1b56211a70 Give start_*/end_* error codes 2020-08-18 21:25:03 -07:00
John Keiser d8974d53b2 Keep value around between states 2020-08-18 21:25:03 -07:00
John Keiser 5ecd17f49e Log unconsumed input as an error 2020-08-18 21:25:03 -07:00
John Keiser 6bb99aec3c Merge structural_parser+iterator into json_iterator 2020-08-18 21:25:03 -07:00
John Keiser a67e83e24e Remove parse_* from visitor method names 2020-08-18 21:25:03 -07:00
John Keiser 5a3c3134ec Move value into common place to be shared across states 2020-08-18 21:25:03 -07:00
John Keiser ce8b9ee8c4 Move finish() out to walk_document 2020-08-18 21:25:03 -07:00
John Keiser 970dfc9f67 builder -> visitor, parser -> iter 2020-08-18 21:25:03 -07:00
John Keiser 04d39c0961 Make tape_builder primary entry point to stage 2 2020-08-18 21:25:01 -07:00
John Keiser d6339aa015 Set is_array in builder 2020-08-18 17:41:48 -07:00
John Keiser 11076bf337 containing_scope -> open_container 2020-08-18 17:41:16 -07:00
Daniel Lemire 8a8eea53a2
Prefixing macros (issue 1035) (#1124)
* Renaming partially done.

* More prefixing.

* I thought that this was fixed.

* Missed one.

* Missed a few.

* Missed another one.

* Minor fixes.
2020-08-18 18:25:36 -04:00
John Keiser ab6b7a8044 Make last_structural() helper 2020-08-18 10:12:42 -07:00
John Keiser 07c2fe726e Fail if hash is unclosed at start 2020-08-18 10:10:01 -07:00
Daniel Lemire 17f6d5208f
Documenting and fixing the case where a string is immediately followed by a scalar (#1106)
* Documenting and fixing.

* More cleaning.

* Being a bit cleaner.
2020-08-14 16:19:57 -04:00
John Keiser bee4d7a12b
Merge pull request #1108 from simdjson/jkeiser/ncgdoc
Remove information about nonexistent computed gotos :)
2020-08-12 10:39:20 -07:00
John Keiser 1b69612246 Remove information about nonexistent computed gotos :) 2020-08-10 16:29:24 -07:00
Daniel Lemire 9e93509a56
Fix number parsing (too lenient). (#1107)
* Fix number parsing (too lenient).

* Minor tweak.

* These are Booleans.

* Tweaking test config
2020-08-10 18:10:11 -04:00
John Keiser 5dd625916b Decrement depth just before checking 2020-08-03 23:09:21 -07:00
John Keiser e3d7718cf3 Simplify value switch statements 2020-08-03 23:09:21 -07:00
John Keiser 9eccd7b1fb Inline start_object/start_array 2020-08-03 23:09:21 -07:00
John Keiser 5b05d126b4 Consolidate start_object calls 2020-08-03 23:09:20 -07:00
John Keiser 03aaf189c1 Use parse_primitive (negative perf!) 2020-08-03 23:09:20 -07:00
John Keiser 6ef9395419 "parser.parser" -> "parser.dom_parser" 2020-08-03 23:09:20 -07:00
John Keiser 3a56e13b78 Make parse() a method 2020-08-03 23:09:19 -07:00
John Keiser ec28acba3d De-templatize stage2::structural_parser 2020-08-03 23:09:15 -07:00
John Keiser ee6647ce40 Make parse part of structural_parser 2020-08-03 17:50:51 -07:00
John Keiser 03d54f8f6e Use SAX model for stage 2 2020-08-03 17:50:51 -07:00
John Keiser 553e6d7549 Don't check max depth on startup 2020-08-03 17:49:14 -07:00
John Keiser e6896ee71e Keep current JSON after checking primitive type 2020-08-03 13:30:13 -07:00
John Keiser e6762f9b48 Advance immediately upon evaluating a character 2020-08-03 13:26:56 -07:00
John Keiser 099bb1afef Pass buffer to primitive parse functions 2020-08-03 12:56:35 -07:00
John Keiser 9c33093c91 Name goto labels consistently 2020-08-03 11:47:38 -07:00
John Keiser 634d8038b9 Increment depth before starting a scope 2020-08-03 11:35:46 -07:00
John Keiser ad46154f2f Hardcode document start/end creation 2020-08-03 10:23:32 -07:00
John Keiser fa81068ea8 Simplify structural_parser.start() 2020-08-03 09:49:15 -07:00
John Keiser 70c2a1c9f9 Short-circuit empty objects/arrays 2020-08-03 09:36:18 -07:00
John Keiser 66a68ce264 Return errors immediately instead of using goto 2020-08-02 12:04:12 -07:00
John Keiser 6bca1225e6 Add unlikely in strategic places 2020-08-01 18:19:36 -07:00
John Keiser 379a4e6a01 namespace { -> unnamed namespace 2020-08-01 14:46:23 -07:00
John Keiser 460cfcaf3e Make parse_structurals inline 2020-08-01 14:43:50 -07:00
John Keiser 8e69103822 Remove computed GOTO 2020-08-01 14:43:50 -07:00
John Keiser 2f67dab2b6 Remove extraneous machine addresses 2020-08-01 14:43:50 -07:00
John Keiser bb65ebd8be Remove computed gotos from parse_value 2020-08-01 14:43:50 -07:00
John Keiser c46ea0390c Move { and [ to the start of the switch 2020-08-01 14:43:50 -07:00
John Keiser bc8a6dd2e3 Remove dead code 2020-08-01 14:43:10 -07:00
John Keiser b1478c37f6 Fix arm64 build 2020-08-01 14:43:10 -07:00
John Keiser 4e944a9f3c Eliminate unused functions in fallback 2020-08-01 14:43:10 -07:00
John Keiser c7fa9b5fe8 Make entire implementation namespaces anonymous 2020-08-01 14:43:10 -07:00
John Keiser 65148b123b Put anonymous namespace in front of everything 2020-08-01 14:43:10 -07:00
John Keiser 3acfc0b630
Merge pull request #1045 from simdjson/jkeiser/generic-2
Define namespaces inside generic files
2020-07-24 12:42:39 -07:00
Daniel Lemire 2ce5f69def
fix recently introduced overflow (#1060)
* Various fixes.

* Clearer comment.
2020-07-24 13:59:24 -04:00
John Keiser 7d347be902 Untangle amalgamated headers 2020-07-24 02:56:41 -07:00
John Keiser a456d78fe0 really_inline more things 2020-07-24 02:56:41 -07:00
John Keiser bf67c967d6 Inline jsoncharutils per-implementation 2020-07-24 02:56:41 -07:00
John Keiser 44b7a7145c Include bitmanip/simd everywhere 2020-07-24 02:56:39 -07:00
John Keiser 3867ee71ed Include files where they are used 2020-07-24 02:56:37 -07:00
John Keiser 464f4813e3 Define namespaces inside generic files 2020-07-24 02:56:36 -07:00
John Keiser af8b52e7e8 Target region for entire compilation of an implementation 2020-07-24 02:48:25 -07:00
Daniel Lemire 4beb2ed507
Make simd8 64 uncopyable and other Visual Studio optimizations (#1031)
* Working on making simd8x64 immutable


* Even less invasive
2020-07-21 18:11:21 -04:00
Daniel Lemire e9c91a1ce2
lookup4 (new UTF-8 validation) (#993)
* lookup4

* Self-document lookup4 and clean up extra bits

* Maintenance, to match against upcoming PR.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
Co-authored-by: John Keiser <john@johnkeiser.com>
2020-07-20 18:20:07 -04:00
Daniel Lemire 534632dc52
Minor tweak on number parsing (#1041)
* Tweak.
2020-07-17 12:14:10 -04:00
Daniel Lemire 8bf5f3d869
Trying to document more carefully the use of memcpy. (#1038)
* Trying to document more carefully the use of memcpy.

* Patching spelling.
2020-07-17 09:58:34 -04:00
Vitaly Baranov 1e4aa116e5
Choose active implementation only once. (#1044) 2020-07-16 18:17:56 -04:00
John Keiser 90cc1411da
Merge pull request #1018 from simdjson/jkeiser/simplify-integer-parse
Remove some branches from number parsing
2020-07-16 12:21:43 -07:00
Vitaly Baranov 6bd64c6873
Fix clang warning -Wused-but-marked-unused. (#1042)
* Fix clang warning -Wused-but-marked-unused.

* Fix build.
2020-07-15 13:28:51 -04:00
Vitaly Baranov a2f0933d01
Fix undefined behavior: load of misaligned address in atomparsing.h (#1037) 2020-07-13 08:46:52 -04:00
John Keiser 6797a6ab56 Use const uint8_t * in number parsing 2020-07-10 09:17:23 -07:00
John Keiser 86b5928f5e Use parse_digit for decimal and exp parsing as well 2020-07-10 09:16:43 -07:00
John Keiser 6dbd15aa71 Move SIMDJSON_SKIPNUMBERPARSING method out 2020-07-09 15:55:10 -07:00
John Keiser 22e5b081c4 Remove is_integer 2020-07-09 15:55:10 -07:00
John Keiser d848f33c48 Simplify integer parsing 2020-07-09 15:55:10 -07:00
John Keiser c64367536d Eliminate "found_minus" parse_number() parameter 2020-07-09 15:55:09 -07:00
John Keiser fc0102b079 Use common parse_digit() funtion in int parsing 2020-07-09 15:33:22 -07:00
Daniel Lemire d0ce2f0b5a
Fixing clang under visual studio (#1028)
* Lots of fixes

* Removing some lambdas

* Removing some functional programming.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-07-06 18:58:19 -04:00
John Keiser 82fb45aa2a
Merge pull request #990 from simdjson/jkeiser/fast-large-integer
Don't reparse large integers
2020-07-01 12:49:43 -07:00
Daniel Lemire 74870a8189
Fixing issue 1013. (#1016)
* Fixing issue 1013.

* Bumping to 0.4.6

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-07-01 14:14:51 -04:00
John Keiser 7a9f6b48f4 Replace TODOs with comments about why we DIDNTDO 2020-07-01 10:31:10 -07:00
John Keiser d3c089130d Check overflow without reparsing integers 2020-07-01 09:51:48 -07:00
John Keiser e0f3060527 Add negative/positive integer writing 2020-07-01 09:51:48 -07:00
John Keiser 4c1256acc4 Reduce nesting somewhat with different if() order 2020-07-01 09:51:48 -07:00
John Keiser 85f6f5bd29 Use macros to remove #ifdefs on every write 2020-07-01 09:51:48 -07:00
John Keiser 4d9eac663a Use a macro to get rid of #ifdefs on each invalid number check 2020-07-01 09:51:48 -07:00
Daniel Lemire 0ef4d90ad0
Fix for issue 1014. (#1015)
* Fix for issue 1014.

* Explanation.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-30 19:36:26 -04:00
Daniel Lemire 444ec4ad27 Stupid me 2020-06-26 19:29:28 -04:00
Daniel Lemire bb5ce007e6 Something better. 2020-06-26 19:03:28 -04:00
Daniel Lemire deaa74d378 Re-enabling tests generally. 2020-06-26 18:57:34 -04:00
Daniel Lemire b6997a56df Patching things up and adding tests. 2020-06-26 12:15:16 -04:00
Brendan Knapp 41f33ecbb9 Permit 32-bit GCC compilation 2020-06-25 17:07:17 -07:00
John Keiser b4b968ff44 Fix #953 2020-06-23 09:53:24 -07:00
Daniel Lemire 2bb101bd19 Code reformatting. 2020-06-22 16:50:57 -04:00
Daniel Lemire a6cbf1f922 Going generic... 2020-06-22 16:25:11 -04:00
Daniel Lemire b836164a38 Fix. 2020-06-22 02:12:49 +00:00
Daniel Lemire 058507badf Putting back the loop 2020-06-21 21:21:49 -04:00
Daniel Lemire ad40e90790 Patching. 2020-06-21 20:14:00 -04:00
Daniel Lemire 066269153e Explaining decision. 2020-06-21 18:02:34 -04:00
Daniel Lemire 5dbcdf1484 Ok 2020-06-21 17:52:30 -04:00
Daniel Lemire f03a6ab5a4 Tweaking. 2020-06-21 17:39:24 -04:00
Daniel Lemire 5dc07ed295 It builds. 2020-06-21 17:20:33 -04:00
Daniel Lemire 064d4255d5 Ok. 2020-06-21 17:09:06 -04:00
Daniel Lemire 04139eb82e Ok. 2020-06-21 17:05:55 -04:00
John Keiser 76c9f4f5a6
Merge pull request #941 from simdjson/jkeiser/forgot
Remove unnecessary functions
2020-06-17 09:09:28 -07:00
Daniel Lemire 942ef3b7f2
Merge pull request #939 from simdjson/dlemire/lookup3
Introducing lookup3 (UTF-8 validation).
2020-06-17 11:19:09 -04:00
John Keiser f8f36c085c Remove unnecessary functions 2020-06-17 07:11:53 -07:00
John Keiser 7339f67dd7
Merge pull request #462 from simdjson/jkeiser/if-backslash
Wrap backslash processing in a branch
2020-06-17 07:07:58 -07:00
Daniel Lemire 71a889ed73 Introducing lookup3 (UTF-8 validation). 2020-06-16 19:08:25 -04:00
John Keiser 610c79fbf3 Don't use backslash branch on ARM 2020-06-13 07:51:28 -07:00
John Keiser fd44c2a2ff
Merge pull request #927 from simdjson/dlemire/exposingthestringminifier
Exposing the string minifier.
2020-06-13 07:47:20 -07:00
John Keiser a86a82b39c Rename minify class to minifier so the minify() method is cleared up 2020-06-12 17:05:25 -07:00
Daniel Lemire bd2d0f769f
One unlikely too many (#930) 2020-06-12 17:58:10 -04:00
John Keiser 664b03bb13 Short circuit find escapes if there is a backslash 2020-06-12 10:10:35 -07:00
John Keiser bbd61eb13f Let tape writing be put in a register 2020-06-12 09:18:20 -07:00
John Keiser e15e1e253d peek_char -> peek_next_char 2020-06-12 09:10:16 -07:00
Daniel Lemire a6e4933d93 Exposing the string minifier. 2020-06-11 13:07:18 -04:00
John Keiser ea08e7d192 Remove unused extra copy of find_next_document_index 2020-06-09 17:52:13 -07:00
John Keiser d178e089a6 Stop caching current structural, keep current index around instead of next 2020-06-08 15:21:54 -07:00
John Keiser 5f00b37e21 Stop caching the buffer index 2020-06-08 15:21:54 -07:00
John Keiser 8a8792d47f Remove most uses of current_char() 2020-06-08 15:21:54 -07:00
John Keiser 59d9bc9e48 Store the pointer to the next structural instead of base structural_indexes and an index 2020-06-08 15:21:54 -07:00
John Keiser 8793dd3ceb Don't store len locally 2020-06-08 15:21:54 -07:00
John Keiser 48062380fa Move parser to structural_iterator 2020-06-08 15:21:54 -07:00
John Keiser 3636aa5522 Extend structural_parser from structural_iterator 2020-06-08 15:21:54 -07:00
John Keiser a1aea4588f Move document stream state to implementation 2020-06-08 15:21:54 -07:00
John Keiser 1d4fffb799 Fix fallback implementation 2020-06-08 15:21:52 -07:00
John Keiser 6f90f5dc5f Remove templating from finish() method 2020-06-08 15:20:56 -07:00
John Keiser 9dd6972d26 Remove impossible checks, add EMPTY check to normal parser 2020-06-08 15:20:56 -07:00
John Keiser d731a7d52c Privatize structural_parser 2020-06-08 15:20:56 -07:00
John Keiser 059468b74e Eliminate streaming_structural_parser subclass with templates 2020-06-08 15:20:56 -07:00
John Keiser 5e69fb782a Call a function to parse structurals 2020-06-08 15:20:56 -07:00
John Keiser a5beffda78 Remove streaming_structural_parser.h 2020-06-08 15:20:56 -07:00
John Keiser 7de7ce5fdc Move document stream state to implementation 2020-06-08 15:20:56 -07:00
John Keiser 0dbda65e44 Fix fallback implementation 2020-06-08 14:52:23 -07:00
John Keiser d43a4e9df9 Remove SUCCESS_AND_HAS_MORE (internal only value) 2020-06-07 16:20:55 -07:00
John Keiser ef63a84a3e Move document stream state to implementation 2020-06-07 16:20:44 -07:00
John Keiser 8c16ba372e Acknowledge that we always have a remainder 2020-06-06 16:46:38 -07:00
John Keiser 9be4a17687 Separate definition from declaration, arrange top down 2020-06-06 16:46:38 -07:00
John Keiser ed0c815735 Move unclosed array check to stage 2 2020-06-05 12:39:13 -07:00
Daniel Lemire 7a69da16e4
Fixing issue 906 (#912)
* Fixing issue 906

* Safe patching.

* Now with explanations.

* Bumping up memory allocation.

* Putting the patch back.

* fallback fixes.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-05 15:37:09 -04:00
Daniel Lemire 52f44de257
This introduces a tiny simplification in number parsing. (#910)
* This introduces a tiny simplification in number parsing.

* Removing unnecessary function.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-04 17:13:02 -04:00