Daniel Lemire
88da62ba09
Better documentation in the code.
2020-06-26 13:02:12 -04:00
Daniel Lemire
b6997a56df
Patching things up and adding tests.
2020-06-26 12:15:16 -04:00
Daniel Lemire
2956bce047
Minor fixes to avoid 32-bit warnings.
2020-06-25 21:12:26 -04:00
Brendan Knapp
41f33ecbb9
Permit 32-bit GCC compilation
2020-06-25 17:07:17 -07:00
Daniel Lemire
86241e2871
Merge pull request #987 from simdjson/issue985
...
Removing optional since it is not C++11, and it is not used
2020-06-25 11:04:36 -04:00
Daniel Lemire
1b63a9a9b5
Removing optional since it is not C++11
2020-06-25 10:25:57 -04:00
Daniel Lemire
32348c2b0b
Elaborating.
2020-06-25 10:14:29 -04:00
Daniel Lemire
5e690c5d04
Fixing the string_view issue.
2020-06-25 10:02:10 -04:00
Daniel Lemire
e01f1434fb
Bumping up the version number
2020-06-23 20:55:52 -04:00
John Keiser
187084ce46
Merge pull request #970 from simdjson/jkeiser/singleheader-tests
...
Make singleheader tests be test-only
2020-06-23 17:07:03 -07:00
Daniel Lemire
544fa57641
Damn merge conflicts.
2020-06-23 19:15:47 -04:00
John Keiser
d9929edbc1
Run -Weffc++ in CI
2020-06-23 13:44:25 -07:00
Daniel Lemire
b84a3a0230
Merge branch 'master' into issue961
2020-06-23 14:33:06 -04:00
Daniel Lemire
49d70232f8
Merge pull request #969 from simdjson/dlemire/minor_pre0.4_cleaning
...
Very minor cleaning.
2020-06-23 14:30:47 -04:00
John Keiser
257089884f
Merge pull request #958 from simdjson/jkeiser/is
...
Make simdjson_result<element>.is() return bool
2020-06-23 09:51:37 -07:00
John Keiser
c650ea9765
Merge pull request #960 from simdjson/jkeiser/idiomatic-get
...
Convert simdjson to use .get()
2020-06-23 09:49:41 -07:00
John Keiser
e369d45b9c
Fix non-compileable examples
2020-06-23 09:48:17 -07:00
John Keiser
2d84b6f6d9
Make simdjson_result<element>.is() return bool
2020-06-23 09:09:24 -07:00
John Keiser
eef1171944
Merge pull request #954 from simdjson/jkeiser/parse-many-result
...
Return error from parse_many
2020-06-23 09:06:20 -07:00
Daniel Lemire
f1a03bfb04
Very minor cleaning.
2020-06-23 11:05:58 -04:00
Daniel Lemire
696b0e29e4
Fixing issue 961
2020-06-23 10:47:32 -04:00
Daniel Lemire
33e003616d
Fixing the name of the variable
2020-06-22 16:29:38 -04:00
Daniel Lemire
bf03d77ab9
Passing by value the string_view
2020-06-22 16:28:35 -04:00
Daniel Lemire
d6f056f266
Fixing documentation issues.
2020-06-22 16:17:11 -04:00
Daniel Lemire
a76c67c19f
Fixing...
2020-06-22 15:57:54 -04:00
John Keiser
1ff55c2729
Replace auto [x,error] with .get() everywhere
2020-06-21 16:26:59 -07:00
Daniel Lemire
5dbcdf1484
Ok
2020-06-21 17:52:30 -04:00
Daniel Lemire
f03a6ab5a4
Tweaking.
2020-06-21 17:39:24 -04:00
John Keiser
6fa5abcd7e
Replace x.get<T>() with x.get(v) or T(x)
2020-06-21 14:36:38 -07:00
Daniel Lemire
5dc07ed295
It builds.
2020-06-21 17:20:33 -04:00
John Keiser
1b1a122b1f
Fix copy constructor issue on older gcc
2020-06-21 12:06:14 -07:00
John Keiser
ae1bd891e7
Remove deprecated uses of parse_many
2020-06-21 11:19:06 -07:00
John Keiser
9899e5021d
Allow use of document_stream with tie()
2020-06-20 21:15:05 -07:00
John Keiser
94440e0170
Return simdjson_result from load_many/parse_many
2020-06-20 20:51:53 -07:00
John Keiser
a7fc7d4ffb
Switch from get(v,e) to e = get(v)
2020-06-20 17:57:09 -07:00
John Keiser
56e2b38048
Add bool result from tie()/get(), get<T>(T&,error_code&)
2020-06-20 17:55:46 -07:00
John Keiser
1d8c2d6c22
Make get_xxx the primary functions
2020-06-20 13:29:12 -07:00
John Keiser
0b8c357eff
Add get_X and is_X methods
2020-06-19 13:27:33 -07:00
John Keiser
05bc664c11
Don't extend from tape_ref in public classes
2020-06-19 13:25:52 -07:00
Daniel Lemire
c13c2650a2
Merge pull request #940 from simdjson/issue938
...
Verifying (and fixing) issue 938
2020-06-18 18:25:31 -04:00
Daniel Lemire
2f6091419f
Merge pull request #944 from simdjson/issue680
...
Document the complexity of array.at
2020-06-18 18:24:08 -04:00
Daniel Lemire
2022dd7d74
Merge pull request #945 from simdjson/issue678
...
Fixing issue 678
2020-06-18 18:23:56 -04:00
Daniel Lemire
ef688a74fe
Minor tweak to the documentation.
2020-06-18 18:18:12 -04:00
Daniel Lemire
04a19f9813
Fixes https://github.com/simdjson/simdjson/issues/937
2020-06-17 18:06:13 -04:00
Daniel Lemire
2cbc591c9d
Fixing issue 678
2020-06-17 16:17:17 -04:00
Daniel Lemire
3586fc4910
Fix for issue 680
2020-06-17 18:49:22 +00:00
Daniel Lemire
0b9df6d8c4
It turns out that we need fairly complicated logic.
2020-06-17 15:17:10 +00:00
Daniel Lemire
803b0c4bdb
Light touch.
2020-06-17 11:00:13 -04:00
Daniel Lemire
0d4e501239
Fixing the bug.
2020-06-17 10:06:16 -04:00
John Keiser
fd44c2a2ff
Merge pull request #927 from simdjson/dlemire/exposingthestringminifier
...
Exposing the string minifier.
2020-06-13 07:47:20 -07:00
John Keiser
a86a82b39c
Rename minify class to minifier so the minify() method is cleared up
2020-06-12 17:05:25 -07:00
Daniel Lemire
4dfbf98e4e
Using a worker instead of a thread per batch ( #920 )
...
In the parse_many function, one thread runs stage 1 while the main thread runs stage 2. So if stage 1 and stage 2 each took half the time, parse_many could run at twice the speed. It is unlikely to do so. Still, we see benefits of about 40% due to threading.
To achieve this interleaving, we load the data in batches (blocks) of some size. In the current code (master), we create a new thread for each batch. Thread creation is expensive so our approach only works over sizeable batches. This PR improves things and makes parse_many faster when using small batches.
This fixes our parse_stream benchmark which is just busted.
This replaces the one-thread-per-batch routine with a worker object that reuses the same thread. In benchmarks, this allows us to get the same maximal speed, but with smaller processing blocks. It does not help much with larger blocks because the cost of thread creation gets amortized efficiently.
This PR makes parse_many beneficial over small datasets. It also makes us less dependent on the thread creation time.
Unfortunately, it is going to be difficult to say anything definitive in general. The cost of creating a thread varies widely depending on the OS. On some systems, it might be cheap, on others very expensive. It should be expected that the new code will depend less drastically on the performance of the underlying system, since we create just one thread.
Co-authored-by: John Keiser <john@johnkeiser.com>
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-12 16:51:18 -04:00
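The worker idea described in the commit message above (#920) can be sketched roughly as follows. This is a minimal illustration, not simdjson's actual code: the names `stage1_worker`, `fake_stage1` and `process_batches` are hypothetical, and the "stages" are stand-ins that merely sum integers. One long-lived thread runs stage 1 on the next batch while the calling thread runs stage 2 on the current one.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: a single long-lived thread that runs one job at a time,
// so no thread is created per batch.
class stage1_worker {
public:
  stage1_worker() : thread_([this] { loop(); }) {}
  ~stage1_worker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }
  stage1_worker(const stage1_worker &) = delete;
  stage1_worker &operator=(const stage1_worker &) = delete;

  // Hand a job to the worker. Call finish() before calling run() again.
  void run(std::function<void()> job) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(!has_job_); // contract: at most one pending job
      job_ = std::move(job);
      has_job_ = true;
    }
    cv_.notify_all();
  }

  // Block until the pending job, if any, has completed.
  void finish() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !has_job_; });
  }

private:
  void loop() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      cv_.wait(lock, [this] { return has_job_ || stop_; });
      if (!has_job_) { return; } // woken only because stop_ was set
      std::function<void()> job = std::move(job_);
      lock.unlock();
      job(); // run outside the lock
      lock.lock();
      has_job_ = false;
      cv_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::function<void()> job_;
  bool has_job_ = false;
  bool stop_ = false;
  std::thread thread_; // declared last: starts after the other members exist
};

// Stand-in for stage 1 (an assumption for illustration): sum the batch.
long fake_stage1(const std::vector<int> &batch) {
  long sum = 0;
  for (int v : batch) { sum += v; }
  return sum;
}

// Interleave the stages: the worker prepares batch i+1 while the calling
// thread consumes the stage-1 result of batch i ("stage 2" here is an add).
long process_batches(const std::vector<std::vector<int>> &batches) {
  if (batches.empty()) { return 0; }
  stage1_worker worker;
  long slot[2] = {fake_stage1(batches[0]), 0}; // double buffer of stage-1 results
  long total = 0;
  for (std::size_t i = 0; i < batches.size(); i++) {
    const bool has_next = i + 1 < batches.size();
    if (has_next) {
      long *out = &slot[(i + 1) % 2];
      const std::vector<int> *next = &batches[i + 1];
      worker.run([out, next] { *out = fake_stage1(*next); });
    }
    total += slot[i % 2]; // "stage 2" on the current batch
    if (has_next) { worker.finish(); }
  }
  return total;
}
```

The two result slots are used alternately, so the worker never writes the slot the calling thread is reading, and `finish()` provides the synchronization before the slots swap roles. A real implementation would need error handling and would pass actual parser state rather than integers.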
John Keiser
bbd61eb13f
Let tape writing be put in a register
2020-06-12 09:18:20 -07:00
Daniel Lemire
a6e4933d93
Exposing the string minifier.
2020-06-11 13:07:18 -04:00
John Keiser
fe01da077e
Make threaded version work again
2020-06-07 16:21:00 -07:00
John Keiser
d43a4e9df9
Remove SUCCESS_AND_HAS_MORE (internal only value)
2020-06-07 16:20:55 -07:00
John Keiser
ef63a84a3e
Move document stream state to implementation
2020-06-07 16:20:44 -07:00
Daniel Lemire
7a69da16e4
Fixing issue 906 ( #912 )
...
* Fixing issue 906
* Safe patching.
* Now with explanations.
* Bumping up memory allocation.
* Putting the patch back.
* fallback fixes.
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-05 15:37:09 -04:00
John Keiser
b75fa26dc1
Move containing_scope and ret_address to .cpp
2020-06-01 12:15:55 -07:00
John Keiser
3d22a2d845
One weird trick: set a bogus error value in the parser impl
...
This makes us faster under both gcc and clang somehow.
2020-06-01 12:15:55 -07:00
John Keiser
1aab4752e2
Store all parser state in the implementation
2020-06-01 12:15:54 -07:00
John Keiser
6a71b24495
Reuse stored buf and len from parser
2020-06-01 12:14:09 -07:00
John Keiser
a3a9bde83e
Move DOM parsing into concrete interface implementation
2020-06-01 12:14:09 -07:00
Daniel Lemire
40d57da83c
fixes issue 891 ( #893 )
2020-05-20 11:54:53 -04:00
John Keiser
e6c9dfbd91
Make include files more fine-grained
2020-05-19 14:42:04 -07:00
John Keiser
7ad4020829
Make main compilation chunks into .cpp files
2020-05-19 13:32:35 -07:00
John Keiser
a476531524
Share ref_address everywhere it's used
2020-05-19 13:30:34 -07:00
Daniel Lemire
e03c5e9f23
We should guard the include ( #881 )
2020-05-13 20:02:46 -04:00
John Keiser
dbb3316511
Move current_string_buf_loc to stage 2
2020-05-11 06:11:32 -07:00
John Keiser
cd6f204c77
Move write_tape() to stage 2 code
2020-05-11 06:09:48 -07:00
John Keiser
269131ed21
Move on_number_* to stage 2 code
2020-05-11 06:04:54 -07:00
John Keiser
65d784e88e
Move on_start/end_string to stage 2 code
2020-05-11 05:49:40 -07:00
John Keiser
35afb6cae0
Move on_error, on_success to stage 2 code
2020-05-11 05:46:18 -07:00
John Keiser
4f25b6ac0c
Move on_end_* to stage 2 code
2020-05-11 05:34:49 -07:00
John Keiser
3d5ed1a7e3
Move on_start_* to stage 2 code
2020-05-11 05:30:35 -07:00
John Keiser
a03115a4a6
Move end_scope to stage 2 code
2020-05-11 05:24:12 -07:00
John Keiser
7219d28a31
Call end_scope directly from stage 2 code
2020-05-11 05:20:04 -07:00
John Keiser
0875bce68f
Don't pass depth to on_end_*
2020-05-11 05:15:39 -07:00
John Keiser
54fe302907
Don't pass depth to end_scope
2020-05-11 05:06:41 -07:00
John Keiser
edaa8f811f
Move on_start_* depth management to stage 2 code
2020-05-11 05:03:25 -07:00
John Keiser
2c8fd109de
Move increment_count to stage 2
2020-05-11 04:58:50 -07:00
John Keiser
16d88cc095
Don't pass depth to increment_count
2020-05-11 04:15:02 -07:00
Daniel Lemire
2a6e6b3dbd
Cleaning string_view ( #872 )
...
* Cleaning string_view
* Corrected typo
* Alignment.
2020-05-10 16:05:52 -04:00
John Keiser
afb369950c
Disable Intellisense-only warnings in simdjson.h/cpp
2020-05-04 11:47:04 -07:00
John Keiser
1d06624d38
Unset /D_CRT_SECURE_NO_WARNINGS
...
- Also localize DISABLE_DEPRECATED_WARNING so that we catch other
deprecations
2020-05-04 11:35:05 -07:00
Pavel P
d40069a018
Disable deprecation warnings for VS builds
...
fopen/getenv are standard C++ functions that are not deprecated.
2020-05-04 11:34:00 -07:00
Furkan Usta
e04cbd71d0
Only install singleheader/simdjson.h as part of the public API
2020-05-02 01:44:11 +03:00
Daniel Lemire
fc1ddcd2f8
Faster case-insensitive comparisons. ( #837 )
...
* Faster case-insensitive comparisons.
2020-04-30 15:52:28 -04:00
Furkan Usta
73d7d704c1
CMake: Remove export_private_library
...
Since we are exporting all the targets as part of the main simdjson target we do not need private
exports anymore
2020-04-30 02:06:19 +03:00
Furkan Usta
eee07e6cfd
Use the same export name for all targets
2020-04-29 23:47:27 +03:00
Nong Li
0f9dbf84b7
Fix incorrect check for case insensitive key lookup ( #824 )
2020-04-29 13:55:28 -04:00
Daniel Lemire
2a1f8fa8f1
Provides support for clang under Windows. ( #817 )
2020-04-27 22:09:27 -04:00
John Keiser
49da7e74cd
usage.md -> basics.md ( #823 )
2020-04-27 16:03:19 -04:00
PavelP
0514588175
Improves clang-cl build with Visual Studio ( #809 )
2020-04-27 08:59:32 -04:00
Daniel Lemire
b99a7344c9
missing spaces.
2020-04-25 22:26:18 -04:00
Daniel Lemire
f3ac0be0e6
Merge branch 'master' of github.com:simdjson/simdjson
2020-04-23 18:39:56 -04:00
Daniel Lemire
18c9468af5
Fixed typo
2020-04-23 18:39:32 -04:00
ostri
d4239aaa8f
default initialisation ( #779 )
...
* padded_string.* default initialisation
parsedjson_iterator - copy constructor; depth_index not necessary
2020-04-23 18:32:11 -04:00
Daniel Lemire
4d0c7d706d
Warn 32-bit users about their doom. ( #783 )
2020-04-23 16:01:19 -04:00
Daniel Lemire
382392e03b
This should enable -Weffc++ ( #777 )
...
* Enabling -Weffc++
2020-04-23 13:03:04 -04:00