John Keiser
03aaf189c1
Use parse_primitive (negative perf!)
2020-08-03 23:09:20 -07:00
John Keiser
6ef9395419
"parser.parser" -> "parser.dom_parser"
2020-08-03 23:09:20 -07:00
John Keiser
3a56e13b78
Make parse() a method
2020-08-03 23:09:19 -07:00
John Keiser
ec28acba3d
De-templatize stage2::structural_parser
2020-08-03 23:09:15 -07:00
John Keiser
ee6647ce40
Make parse part of structural_parser
2020-08-03 17:50:51 -07:00
John Keiser
03d54f8f6e
Use SAX model for stage 2
2020-08-03 17:50:51 -07:00
John Keiser
553e6d7549
Don't check max depth on startup
2020-08-03 17:49:14 -07:00
John Keiser
e6896ee71e
Keep current JSON after checking primitive type
2020-08-03 13:30:13 -07:00
John Keiser
e6762f9b48
Advance immediately upon evaluating a character
2020-08-03 13:26:56 -07:00
John Keiser
099bb1afef
Pass buffer to primitive parse functions
2020-08-03 12:56:35 -07:00
John Keiser
9c33093c91
Name goto labels consistently
2020-08-03 11:47:38 -07:00
John Keiser
634d8038b9
Increment depth before starting a scope
2020-08-03 11:35:46 -07:00
John Keiser
ad46154f2f
Hardcode document start/end creation
2020-08-03 10:23:32 -07:00
John Keiser
fa81068ea8
Simplify structural_parser.start()
2020-08-03 09:49:15 -07:00
John Keiser
70c2a1c9f9
Short-circuit empty objects/arrays
2020-08-03 09:36:18 -07:00
John Keiser
66a68ce264
Return errors immediately instead of using goto
2020-08-02 12:04:12 -07:00
John Keiser
6bca1225e6
Add unlikely in strategic places
2020-08-01 18:19:36 -07:00
John Keiser
379a4e6a01
namespace { -> unnamed namespace
2020-08-01 14:46:23 -07:00
John Keiser
460cfcaf3e
Make parse_structurals inline
2020-08-01 14:43:50 -07:00
John Keiser
8e69103822
Remove computed GOTO
2020-08-01 14:43:50 -07:00
John Keiser
2f67dab2b6
Remove extraneous machine addresses
2020-08-01 14:43:50 -07:00
John Keiser
bb65ebd8be
Remove computed gotos from parse_value
2020-08-01 14:43:50 -07:00
John Keiser
c46ea0390c
Move { and [ to the start of the switch
2020-08-01 14:43:50 -07:00
John Keiser
bc8a6dd2e3
Remove dead code
2020-08-01 14:43:10 -07:00
John Keiser
b1478c37f6
Fix arm64 build
2020-08-01 14:43:10 -07:00
John Keiser
4e944a9f3c
Eliminate unused functions in fallback
2020-08-01 14:43:10 -07:00
John Keiser
c7fa9b5fe8
Make entire implementation namespaces anonymous
2020-08-01 14:43:10 -07:00
John Keiser
65148b123b
Put anonymous namespace in front of everything
2020-08-01 14:43:10 -07:00
John Keiser
3acfc0b630
Merge pull request #1045 from simdjson/jkeiser/generic-2
Define namespaces inside generic files
2020-07-24 12:42:39 -07:00
Daniel Lemire
2ce5f69def
fix recently introduced overflow ( #1060 )
* Various fixes.
* Clearer comment.
2020-07-24 13:59:24 -04:00
John Keiser
7d347be902
Untangle amalgamated headers
2020-07-24 02:56:41 -07:00
John Keiser
a456d78fe0
really_inline more things
2020-07-24 02:56:41 -07:00
John Keiser
bf67c967d6
Inline jsoncharutils per-implementation
2020-07-24 02:56:41 -07:00
John Keiser
44b7a7145c
Include bitmanip/simd everywhere
2020-07-24 02:56:39 -07:00
John Keiser
3867ee71ed
Include files where they are used
2020-07-24 02:56:37 -07:00
John Keiser
464f4813e3
Define namespaces inside generic files
2020-07-24 02:56:36 -07:00
John Keiser
af8b52e7e8
Target region for entire compilation of an implementation
2020-07-24 02:48:25 -07:00
Daniel Lemire
4beb2ed507
Make simd8 64 uncopyable and other Visual Studio optimizations ( #1031 )
* Working on making simd8x64 immutable
* Even less invasive
2020-07-21 18:11:21 -04:00
Daniel Lemire
e9c91a1ce2
lookup4 (new UTF-8 validation) ( #993 )
* lookup4
* Self-document lookup4 and clean up extra bits
* Maintenance, to match against upcoming PR.
Co-authored-by: Daniel Lemire <lemire@gmai.com>
Co-authored-by: John Keiser <john@johnkeiser.com>
2020-07-20 18:20:07 -04:00
Daniel Lemire
534632dc52
Minor tweak on number parsing ( #1041 )
* Tweak.
2020-07-17 12:14:10 -04:00
Daniel Lemire
8bf5f3d869
Trying to document more carefully the use of memcpy. ( #1038 )
* Trying to document more carefully the use of memcpy.
* Patching spelling.
2020-07-17 09:58:34 -04:00
Vitaly Baranov
1e4aa116e5
Choose active implementation only once. ( #1044 )
2020-07-16 18:17:56 -04:00
John Keiser
90cc1411da
Merge pull request #1018 from simdjson/jkeiser/simplify-integer-parse
Remove some branches from number parsing
2020-07-16 12:21:43 -07:00
Vitaly Baranov
6bd64c6873
Fix clang warning -Wused-but-marked-unused. ( #1042 )
* Fix clang warning -Wused-but-marked-unused.
* Fix build.
2020-07-15 13:28:51 -04:00
Vitaly Baranov
a2f0933d01
Fix undefined behavior: load of misaligned address in atomparsing.h ( #1037 )
2020-07-13 08:46:52 -04:00
John Keiser
6797a6ab56
Use const uint8_t * in number parsing
2020-07-10 09:17:23 -07:00
John Keiser
86b5928f5e
Use parse_digit for decimal and exp parsing as well
2020-07-10 09:16:43 -07:00
John Keiser
6dbd15aa71
Move SIMDJSON_SKIPNUMBERPARSING method out
2020-07-09 15:55:10 -07:00
John Keiser
22e5b081c4
Remove is_integer
2020-07-09 15:55:10 -07:00
John Keiser
d848f33c48
Simplify integer parsing
2020-07-09 15:55:10 -07:00
John Keiser
c64367536d
Eliminate "found_minus" parse_number() parameter
2020-07-09 15:55:09 -07:00
John Keiser
fc0102b079
Use common parse_digit() function in int parsing
2020-07-09 15:33:22 -07:00
Daniel Lemire
d0ce2f0b5a
Fixing clang under visual studio ( #1028 )
* Lots of fixes
* Removing some lambdas
* Removing some functional programming.
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-07-06 18:58:19 -04:00
John Keiser
82fb45aa2a
Merge pull request #990 from simdjson/jkeiser/fast-large-integer
Don't reparse large integers
2020-07-01 12:49:43 -07:00
Daniel Lemire
74870a8189
Fixing issue 1013. ( #1016 )
* Fixing issue 1013.
* Bumping to 0.4.6
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-07-01 14:14:51 -04:00
John Keiser
7a9f6b48f4
Replace TODOs with comments about why we DIDNTDO
2020-07-01 10:31:10 -07:00
John Keiser
d3c089130d
Check overflow without reparsing integers
2020-07-01 09:51:48 -07:00
John Keiser
e0f3060527
Add negative/positive integer writing
2020-07-01 09:51:48 -07:00
John Keiser
4c1256acc4
Reduce nesting somewhat with different if() order
2020-07-01 09:51:48 -07:00
John Keiser
85f6f5bd29
Use macros to remove #ifdefs on every write
2020-07-01 09:51:48 -07:00
John Keiser
4d9eac663a
Use a macro to get rid of #ifdefs on each invalid number check
2020-07-01 09:51:48 -07:00
Daniel Lemire
0ef4d90ad0
Fix for issue 1014. ( #1015 )
* Fix for issue 1014.
* Explanation.
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-30 19:36:26 -04:00
Daniel Lemire
444ec4ad27
Stupid me
2020-06-26 19:29:28 -04:00
Daniel Lemire
bb5ce007e6
Something better.
2020-06-26 19:03:28 -04:00
Daniel Lemire
deaa74d378
Re-enabling tests generally.
2020-06-26 18:57:34 -04:00
Daniel Lemire
b6997a56df
Patching things up and adding tests.
2020-06-26 12:15:16 -04:00
Brendan Knapp
41f33ecbb9
Permit 32-bit GCC compilation
2020-06-25 17:07:17 -07:00
John Keiser
b4b968ff44
Fix #953
2020-06-23 09:53:24 -07:00
Daniel Lemire
2bb101bd19
Code reformatting.
2020-06-22 16:50:57 -04:00
Daniel Lemire
a6cbf1f922
Going generic...
2020-06-22 16:25:11 -04:00
Daniel Lemire
b836164a38
Fix.
2020-06-22 02:12:49 +00:00
Daniel Lemire
058507badf
Putting back the loop
2020-06-21 21:21:49 -04:00
Daniel Lemire
ad40e90790
Patching.
2020-06-21 20:14:00 -04:00
Daniel Lemire
066269153e
Explaining decision.
2020-06-21 18:02:34 -04:00
Daniel Lemire
5dbcdf1484
Ok
2020-06-21 17:52:30 -04:00
Daniel Lemire
f03a6ab5a4
Tweaking.
2020-06-21 17:39:24 -04:00
Daniel Lemire
5dc07ed295
It builds.
2020-06-21 17:20:33 -04:00
Daniel Lemire
064d4255d5
Ok.
2020-06-21 17:09:06 -04:00
Daniel Lemire
04139eb82e
Ok.
2020-06-21 17:05:55 -04:00
John Keiser
76c9f4f5a6
Merge pull request #941 from simdjson/jkeiser/forgot
Remove unnecessary functions
2020-06-17 09:09:28 -07:00
Daniel Lemire
942ef3b7f2
Merge pull request #939 from simdjson/dlemire/lookup3
Introducing lookup3 (UTF-8 validation).
2020-06-17 11:19:09 -04:00
John Keiser
f8f36c085c
Remove unnecessary functions
2020-06-17 07:11:53 -07:00
John Keiser
7339f67dd7
Merge pull request #462 from simdjson/jkeiser/if-backslash
Wrap backslash processing in a branch
2020-06-17 07:07:58 -07:00
Daniel Lemire
71a889ed73
Introducing lookup3 (UTF-8 validation).
2020-06-16 19:08:25 -04:00
John Keiser
610c79fbf3
Don't use backslash branch on ARM
2020-06-13 07:51:28 -07:00
John Keiser
fd44c2a2ff
Merge pull request #927 from simdjson/dlemire/exposingthestringminifier
Exposing the string minifier.
2020-06-13 07:47:20 -07:00
John Keiser
a86a82b39c
Rename minify class to minifier so the minify() method is cleared up
2020-06-12 17:05:25 -07:00
Daniel Lemire
bd2d0f769f
One unlikely too many ( #930 )
2020-06-12 17:58:10 -04:00
John Keiser
664b03bb13
Short circuit find escapes if there is a backslash
2020-06-12 10:10:35 -07:00
John Keiser
bbd61eb13f
Let tape writing be put in a register
2020-06-12 09:18:20 -07:00
John Keiser
e15e1e253d
peek_char -> peek_next_char
2020-06-12 09:10:16 -07:00
Daniel Lemire
a6e4933d93
Exposing the string minifier.
2020-06-11 13:07:18 -04:00
John Keiser
ea08e7d192
Remove unused extra copy of find_next_document_index
2020-06-09 17:52:13 -07:00
John Keiser
d178e089a6
Stop caching current structural, keep current index around instead of next
2020-06-08 15:21:54 -07:00
John Keiser
5f00b37e21
Stop caching the buffer index
2020-06-08 15:21:54 -07:00
John Keiser
8a8792d47f
Remove most uses of current_char()
2020-06-08 15:21:54 -07:00
John Keiser
59d9bc9e48
Store the pointer to the next structural instead of base structural_indexes and an index
2020-06-08 15:21:54 -07:00
John Keiser
8793dd3ceb
Don't store len locally
2020-06-08 15:21:54 -07:00
John Keiser
48062380fa
Move parser to structural_iterator
2020-06-08 15:21:54 -07:00
John Keiser
3636aa5522
Extend structural_parser from structural_iterator
2020-06-08 15:21:54 -07:00
John Keiser
a1aea4588f
Move document stream state to implementation
2020-06-08 15:21:54 -07:00
John Keiser
1d4fffb799
Fix fallback implementation
2020-06-08 15:21:52 -07:00
John Keiser
6f90f5dc5f
Remove templating from finish() method
2020-06-08 15:20:56 -07:00
John Keiser
9dd6972d26
Remove impossible checks, add EMPTY check to normal parser
2020-06-08 15:20:56 -07:00
John Keiser
d731a7d52c
Privatize structural_parser
2020-06-08 15:20:56 -07:00
John Keiser
059468b74e
Eliminate streaming_structural_parser subclass with templates
2020-06-08 15:20:56 -07:00
John Keiser
5e69fb782a
Call a function to parse structurals
2020-06-08 15:20:56 -07:00
John Keiser
a5beffda78
Remove streaming_structural_parser.h
2020-06-08 15:20:56 -07:00
John Keiser
7de7ce5fdc
Move document stream state to implementation
2020-06-08 15:20:56 -07:00
John Keiser
0dbda65e44
Fix fallback implementation
2020-06-08 14:52:23 -07:00
John Keiser
d43a4e9df9
Remove SUCCESS_AND_HAS_MORE (internal only value)
2020-06-07 16:20:55 -07:00
John Keiser
ef63a84a3e
Move document stream state to implementation
2020-06-07 16:20:44 -07:00
John Keiser
8c16ba372e
Acknowledge that we always have a remainder
2020-06-06 16:46:38 -07:00
John Keiser
9be4a17687
Separate definition from declaration, arrange top down
2020-06-06 16:46:38 -07:00
John Keiser
ed0c815735
Move unclosed array check to stage 2
2020-06-05 12:39:13 -07:00
Daniel Lemire
7a69da16e4
Fixing issue 906 ( #912 )
* Fixing issue 906
* Safe patching.
* Now with explanations.
* Bumping up memory allocation.
* Putting the patch back.
* fallback fixes.
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-05 15:37:09 -04:00
Daniel Lemire
52f44de257
This introduces a tiny simplification in number parsing. ( #910 )
* This introduces a tiny simplification in number parsing.
* Removing unnecessary function.
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-04 17:13:02 -04:00
John Keiser
b75fa26dc1
Move containing_scope and ret_address to .cpp
2020-06-01 12:15:55 -07:00
John Keiser
3d22a2d845
One weird trick: set a bogus error value in the parser impl
This makes us faster under both gcc and clang somehow.
2020-06-01 12:15:55 -07:00
John Keiser
1aab4752e2
Store all parser state in the implementation
2020-06-01 12:15:54 -07:00
John Keiser
86f8a4a9d2
Don't set parser.valid or parser.error
This regresses performance and is ONLY here because the next
two commits are here; this lets us see the impact of removing
parser.error separately from the impact of the next commit.
2020-06-01 12:14:09 -07:00
John Keiser
db2cb061cb
Remove on_error function
Solely here to make the next patch smaller and more isolatable
2020-06-01 12:14:09 -07:00
John Keiser
6a71b24495
Reuse stored buf and len from parser
2020-06-01 12:14:09 -07:00
John Keiser
84712a8bbc
Store buf and len in parser implementation
2020-06-01 12:14:09 -07:00
John Keiser
b86fb95306
Rename doc_parser -> parser
2020-06-01 12:14:09 -07:00
John Keiser
a3a9bde83e
Move DOM parsing into concrete interface implementation
2020-06-01 12:14:09 -07:00
Daniel Lemire
12150baa5e
Using just ASCII. ( #899 )
* Using just ASCII.
* Let us prune checkperf.
* Moving the description of lookup2 to the HACKING.md file.
2020-05-21 21:59:06 -04:00
John Keiser
4551e60f8b
Don't write start object/array until the end
2020-05-21 14:28:47 -07:00
John Keiser
5651fbedc4
Add logging to stage 2
2020-05-21 09:47:19 -07:00
Daniel Lemire
40d57da83c
fixes issue 891 ( #893 )
2020-05-20 11:54:53 -04:00
John Keiser
e6c9dfbd91
Make include files more fine-grained
2020-05-19 14:42:04 -07:00
John Keiser
64abc3e86c
Include top-level .h files outside #if statements
2020-05-19 13:33:14 -07:00
John Keiser
7ad4020829
Make main compilation chunks into .cpp files
2020-05-19 13:32:35 -07:00
John Keiser
72ab0d11ff
Move stage 1 and 2 files to their own directories
2020-05-19 13:30:34 -07:00
John Keiser
4ea866f050
Move stage2 classes into their own files
2020-05-19 13:30:34 -07:00
John Keiser
a476531524
Share ref_address everywhere it's used
2020-05-19 13:30:34 -07:00
John Keiser
dbb3316511
Move current_string_buf_loc to stage 2
2020-05-11 06:11:32 -07:00
John Keiser
cd6f204c77
Move write_tape() to stage 2 code
2020-05-11 06:09:48 -07:00
John Keiser
269131ed21
Move on_number_* to stage 2 code
2020-05-11 06:04:54 -07:00
John Keiser
65d784e88e
Move on_start/end_string to stage 2 code
2020-05-11 05:49:40 -07:00
John Keiser
35afb6cae0
Move on_error, on_success to stage 2 code
2020-05-11 05:46:18 -07:00
John Keiser
27bce09be8
Consolidate start_scope/end_scope
2020-05-11 05:40:02 -07:00
John Keiser
4f25b6ac0c
Move on_end_* to stage 2 code
2020-05-11 05:34:49 -07:00
John Keiser
3d5ed1a7e3
Move on_start_* to stage 2 code
2020-05-11 05:30:35 -07:00
John Keiser
a03115a4a6
Move end_scope to stage 2 code
2020-05-11 05:24:12 -07:00
John Keiser
7219d28a31
Call end_scope directly from stage 2 code
2020-05-11 05:20:04 -07:00
John Keiser
0875bce68f
Don't pass depth to on_end_*
2020-05-11 05:15:39 -07:00
John Keiser
54fe302907
Don't pass depth to end_scope
2020-05-11 05:06:41 -07:00
John Keiser
edaa8f811f
Move on_start_* depth management to stage 2 code
2020-05-11 05:03:25 -07:00
John Keiser
2c8fd109de
Move increment_count to stage 2
2020-05-11 04:58:50 -07:00