145 Commits

Author SHA1 Message Date
Daniel Lemire 4c9f11b78a Missing character. 2020-06-25 10:15:13 -04:00
Daniel Lemire 5e690c5d04 Fixing the string_view issue. 2020-06-25 10:02:10 -04:00
Daniel Lemire 8f2a5649fe
Merge pull request #983 from TkTech/patch-1
Fix documentation links in basics.md
2020-06-24 20:44:46 -04:00
Daniel Lemire c3b25e12a5
Update implementation-selection.md 2020-06-24 20:42:04 -04:00
Daniel Lemire 6d3e33d440
Update parse_many.md 2020-06-24 20:41:38 -04:00
Daniel Lemire c11f7ce54f
Update performance.md 2020-06-24 20:41:06 -04:00
Tyler Kennedy 84806cc174
Fix documentation links in basics.md
Links to other files need to be either relative to themselves (doc/performance.md -> performance.md) or absolute (doc/performance.md -> /doc/performance.md). This change fixes the documentation when read on GitHub.
2020-06-24 20:20:14 -04:00
Daniel Lemire 3e35729eb6
Merge pull request #968 from simdjson/issue961
Fixing issue 961
2020-06-23 19:48:43 -04:00
Daniel Lemire 7e94309046
Update basics.md 2020-06-23 19:08:14 -04:00
Daniel Lemire c8a70a0a73 Tweaking the documentation. 2020-06-23 14:39:16 -04:00
Daniel Lemire b84a3a0230
Merge branch 'master' into issue961 2020-06-23 14:33:06 -04:00
Daniel Lemire 8cc9f496ee
Merge branch 'master' into dlemire/improving_documentation 2020-06-23 13:07:29 -04:00
Daniel Lemire 1547f2ec80 Pleasing John 2020-06-23 13:05:19 -04:00
John Keiser c650ea9765
Merge pull request #960 from simdjson/jkeiser/idiomatic-get
Convert simdjson to use .get()
2020-06-23 09:49:41 -07:00
John Keiser eef1171944
Merge pull request #954 from simdjson/jkeiser/parse-many-result
Return error from parse_many
2020-06-23 09:06:20 -07:00
John Keiser 12ccdcf858 Include document_stream line in parse_many docs 2020-06-23 08:49:47 -07:00
Daniel Lemire 696b0e29e4 Fixing issue 961 2020-06-23 10:47:32 -04:00
Daniel Lemire 5eb748ae17 This improves slightly the documentation, adding instructions for CMake users. 2020-06-23 09:33:15 -04:00
Daniel Lemire 89c2582376 Extending the documentation. 2020-06-22 16:32:00 -04:00
Daniel Lemire a76c67c19f Fixing... 2020-06-22 15:57:54 -04:00
John Keiser 1ff55c2729 Replace auto [x,error] with .get() everywhere 2020-06-21 16:26:59 -07:00
Daniel Lemire 38bb08778a With an example. 2020-06-21 17:57:22 -04:00
John Keiser 6fa5abcd7e Replace x.get<T>() with x.get(v) or T(x) 2020-06-21 14:36:38 -07:00
John Keiser a7fc7d4ffb Switch from get(v,e) to e = get(v) 2020-06-20 17:57:09 -07:00
John Keiser f336103f63 Convert tools/docs/benchmarks to bool get() idiom 2020-06-20 17:55:46 -07:00
John Keiser 56e2b38048 Add bool result from tie()/get(), get<T>(T&,error_code&) 2020-06-20 17:55:46 -07:00
Daniel Lemire 5ccdbef7d5
Merge pull request #936 from simdjson/dlemire/new_examples
New examples.
2020-06-18 18:29:06 -04:00
John Keiser f632e7c043 Put C++11 capable version back, change name to readme style 2020-06-18 12:50:49 -07:00
Daniel Lemire 3f00e79bcb
Merge branch 'master' into dlemire/better_doxygen_home_page 2020-06-17 16:02:49 -04:00
Daniel Lemire 14ceacac73 Tweaking. 2020-06-17 13:27:17 -04:00
Daniel Lemire 4474f8ef18 Cleaning a bit the examples. 2020-06-17 16:24:55 +00:00
Daniel Lemire b5ea504ad2 Tweaks doxygen so that we have a better main page. 2020-06-17 11:07:21 -04:00
Daniel Lemire 27a75a9085 Tweaking. 2020-06-15 17:54:34 -04:00
Daniel Lemire 954d6c326d New examples. 2020-06-15 17:45:15 -04:00
Daniel Lemire 16f41ea059 Added a word. 2020-06-14 18:48:42 -04:00
Daniel Lemire 0a7270fc29 More tweaks. 2020-06-14 18:47:22 -04:00
Daniel Lemire 23fbd9d004 Some tweaks. 2020-06-14 18:28:09 -04:00
John Keiser fd44c2a2ff
Merge pull request #927 from simdjson/dlemire/exposingthestringminifier
Exposing the string minifier.
2020-06-13 07:47:20 -07:00
John Keiser a86a82b39c Rename minify class to minifier so the minify() method is cleared up 2020-06-12 17:05:25 -07:00
Daniel Lemire 4dfbf98e4e
Using a worker instead of a thread per batch (#920)
In the parse_many function, one thread does stage 1 while the main thread does stage 2. So if stage 1 and stage 2 each took half the time, parse_many could run at twice the speed. It is unlikely to do so. Still, we see benefits of about 40% due to threading.

To achieve this interleaving, we load the data in batches (blocks) of some size. In the current code (master), we create a new thread for each batch. Thread creation is expensive so our approach only works over sizeable batches. This PR improves things and makes parse_many faster when using small batches.

* This fixes our parse_stream benchmark, which is just busted.
* This replaces the one-thread-per-batch routine with a worker object that reuses the same thread. In benchmarks, this allows us to get the same maximal speed, but with smaller processing blocks. It does not help much with larger blocks because the cost of thread creation gets amortized efficiently.
This PR makes parse_many beneficial over small datasets. It also makes us less dependent on the thread creation time.

Unfortunately, it is going to be difficult to say anything definitive in general. The cost of creating a thread varies widely depending on the OS. On some systems, it might be cheap; on others, very expensive. The new code should depend less drastically on the performance of the underlying system, since we create just one thread.
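
The worker idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the names batch_worker, stage1, stage2 and process_batches are hypothetical, and the two "stages" are trivial stand-ins (summing a batch, then accumulating the sum) rather than real JSON parsing stages. What it shows is one long-lived thread that is handed the next batch while the main thread works on the current one, so no thread is created per batch.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <numeric>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical worker: owns a single thread that is reused for every batch.
class batch_worker {
public:
  batch_worker() : thread_([this] { loop(); }) {}
  ~batch_worker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }
  // Hand a job to the worker thread; the previous job must have finished.
  void run(std::function<void()> job) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(!job_ && !busy_);
      job_ = std::move(job);
      busy_ = true;
    }
    cv_.notify_all();
  }
  // Block until the job handed to run() has completed.
  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !busy_; });
  }

private:
  void loop() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      cv_.wait(lock, [this] { return stop_ || job_; });
      if (stop_) return;
      std::function<void()> job = std::move(job_);
      job_ = nullptr;
      lock.unlock();
      job();  // run the job without holding the lock
      lock.lock();
      busy_ = false;
      cv_.notify_all();
    }
  }
  std::mutex mutex_;
  std::condition_variable cv_;
  std::function<void()> job_;
  bool busy_ = false;
  bool stop_ = false;
  std::thread thread_;  // declared last so the members above exist first
};

// Stand-in for stage 1: reduce a batch to a single value.
long stage1(const std::vector<int>& batch) {
  return std::accumulate(batch.begin(), batch.end(), 0L);
}

// Stand-in for stage 2: consume the result of stage 1.
long stage2(long staged) { return staged; }

// Stage 1 of batch i+1 runs on the worker while the calling thread
// runs stage 2 of batch i.
long process_batches(const std::vector<std::vector<int>>& batches) {
  if (batches.empty()) return 0;
  long staged = 0;  // written by the worker, read here after wait()
  long total = 0;
  batch_worker worker;
  worker.run([&] { staged = stage1(batches[0]); });
  for (std::size_t i = 0; i < batches.size(); i++) {
    worker.wait();
    long current = staged;
    if (i + 1 < batches.size()) {
      worker.run([&, i] { staged = stage1(batches[i + 1]); });
    }
    total += stage2(current);
  }
  return total;
}
```

With this shape, the thread-creation cost is paid once in the batch_worker constructor, no matter how many batches are processed, which is the property the message above attributes to the worker approach.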

Co-authored-by: John Keiser <john@johnkeiser.com>
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-12 16:51:18 -04:00
Daniel Lemire be707dbb6f Added a remark 2020-06-12 16:07:34 -04:00
Daniel Lemire 45e2178ada Duh. 2020-06-11 17:20:28 +00:00
Daniel Lemire a6e4933d93 Exposing the string minifier. 2020-06-11 13:07:18 -04:00
John Keiser e6c9dfbd91 Make include files more fine-grained 2020-05-19 14:42:04 -07:00
Daniel Lemire fa4ce6a8bc
There is confusion between gigabytes and gibibytes. Let us standardize throughout. (#838)
* There is confusion between gigabytes and gibibytes.

* Trying to be consistent.
2020-05-01 12:16:18 -04:00
Daniel Lemire 2a1f8fa8f1
Provides support for clang under Windows. (#817) 2020-04-27 22:09:27 -04:00
Daniel Lemire 04e47bde84
Update basics.md 2020-04-27 16:16:40 -04:00
Daniel Lemire 76314280cb
Update basics.md 2020-04-25 11:24:41 -04:00
Daniel Lemire d6716218bd
details. 2020-04-24 20:12:02 -04:00
Daniel Lemire ac0e6c5e6e
Update basics.md 2020-04-23 22:01:20 -04:00
Daniel Lemire f397b6fedf
Another example. (#790)
* Another example.

* Adding a reference to error chaining.
2020-04-23 21:48:41 -04:00
Daniel Lemire 4f72d5cfac
This adds another example (#785) 2020-04-23 18:29:28 -04:00
Daniel Lemire f0ac55ec0c
testing on freebsd (#768)
* Adding cirrus tests
* Adding cirrus badge.
2020-04-22 21:22:09 -04:00
Daniel Lemire d94cd65dfd
We used to have a requirements section which went away. I think it is required. (#749) 2020-04-20 19:03:26 -04:00
Daniel Lemire 38289fe381
Tweaking a sentence (#747) 2020-04-20 11:46:02 -04:00
John Keiser 289cc3e7a0 Treat warnings as errors during compilation 2020-04-15 19:59:38 -07:00
Daniel Lemire 6d7c77ddc1
Let us try to check with the exceptions disabled. (#707)
* Tweaking code so that we can run all tests with exceptions off.
* Removing SIMDJSON_DISABLE_EXCEPTIONS
2020-04-15 16:45:36 -04:00
Daniel Lemire b523c43927
Can we provide a size() function to arrays and objects? (eager approach) [TO BE MERGED] (#690)
* This is an implementation of "size()" for arrays and objects.
* Adding benchmark
* Adding a size() remark in the documentation.
* Extending size() to result types.
2020-04-15 10:15:48 -04:00
Daniel Lemire 3c6ef83046
Trying to correct the documentation so that it actually describes how the code behaves. (Attempt two) (#712)
* Trying to correct the documentation so that it actually describes how the code behaves.

* tweaking the wording.

* Improving.

* Removing confusing sentence.

* Fixing formatting.

* Now with working example, tested.

* Added a smaller piece of code
2020-04-14 22:31:21 -04:00
Daniel Lemire d7370cc916
Let us document the relationship between a parser instance and the parsed document. (#699) 2020-04-14 08:30:06 -04:00
John Keiser 1e30b6e334 Compile under C++ 11 2020-04-08 14:00:13 -07:00
Daniel Lemire 74d9b41b7d
Minor fixes to our documentation regarding thread safety. (#683)
* Minor fixes to our documentation regarding thread safety.

* A bit more pessimistic.
2020-04-08 16:41:08 -04:00
John Keiser 6eec2d6b4f Simplify cars example 2020-04-05 09:15:20 -07:00
John Keiser 13aee51011 Add element.type() for type switching 2020-04-02 14:07:19 -07:00
Marko Radišić 4060f64232
Update basics.md
Fix links to singleheader .h and .cpp
2020-04-01 17:13:53 +02:00
John Keiser d93af1161d Remove set_capacity, replace with allocate
Makes allocation point more predictable
2020-03-30 13:49:54 -07:00
John Keiser dc918d764e
Merge pull request #646 from simdjson/jkeiser/quickstart-example
Compile all .md examples in CI
2020-03-30 13:44:43 -07:00
John Keiser b5a1017afa Update JsonStream.md -> parse_many to new API 2020-03-30 13:44:03 -07:00
John Keiser 7badc230a4 Add RELEASES.md 2020-03-30 13:44:03 -07:00
Daniel Lemire 6369cf4dd9
Better documentation for the -H flag. (#651) 2020-03-30 15:44:04 -04:00
John Keiser 2115596ed3 Compile performance.md examples in tests 2020-03-29 16:28:34 -07:00
John Keiser 7ed65e42d7 Add actual examples from basics.md to readme_examples 2020-03-29 16:28:29 -07:00
John Keiser ea8a5020e2 Remove array indexer, make object indexer key lookup 2020-03-28 15:56:43 -07:00
John Keiser 622d9c9480 Replace as_X and is_X with get<T> and is<T> 2020-03-28 15:29:53 -07:00
John Keiser 03746b966b Move document/element/etc. under dom 2020-03-28 13:42:21 -07:00
John Keiser 9f265711a8
Merge pull request #629 from simdjson/jkeiser/parse-element
Return document::element from parser.parse()
2020-03-28 08:45:14 -07:00
Daniel Lemire 32afcd2e48
Better documentation for issue 70 (#638) 2020-03-27 19:44:01 -04:00
John Keiser 5ad405006c Return document::element from parse, load, parse_many, load_many 2020-03-27 12:24:41 -07:00
Daniel Lemire 1b6a31b277
Updating the performance numbers. (#634)
* Updating the performance numbers.

* Updating with growing file sizes.
2020-03-27 14:11:02 -04:00
John Keiser 26b15251e2 Split docs into multiple files 2020-03-25 18:25:14 -07:00
Bruce Mitchener c3c43769ae Fix typos. 2020-03-22 09:14:14 -07:00
John Keiser 40c6213d7e Add parser.load() and load_many() to load files 2020-03-11 17:19:41 -07:00
John Keiser 9a7c8fb5be Use parse_many in examples/tests/docs 2020-03-05 12:04:45 -08:00
John Keiser b3ea8c406e Add simdjson.cpp for unified use (#515) 2020-03-04 10:12:27 -08:00
Daniel Lemire 28710f8ad5
fix for Issue 467 (#469)
* Fix for issue467

* Updating single-header

* Let us make it so that JsonStream is constructed from a padded_string which will avoid dangerous overruns.

* Fixing parse_stream

* Updating documentation.
2020-01-29 19:00:18 -05:00
dbj c6f2f60b03 clarifications -- documentation update (#448) 2020-01-20 10:39:47 -05:00
Nexus Web Development f2b48ede4c Update JsonStream.md : simple typo (#413)
Exemple to example.
2019-12-23 11:35:09 -05:00
Daniel Lemire e63f258470
missed one 2019-11-26 14:51:40 -05:00
Daniel Lemire ede9f9117f
Minor cleaning 2019-11-26 14:51:01 -05:00
Jeremie Piotte db141e82c9
Specifying that RFC7464 is not supported 2019-11-26 10:33:33 -05:00
Jeremie Piotte f163155929 JsonStream documentation (#381)
* adding Multiline JSON competition chart to doc
* Completing the comments for JsonStream
* Adding a page for JsonStream's documentation.
2019-11-25 18:11:55 -05:00
Daniel Lemire 4495619a2e Moved file to proper directory. 2019-04-18 13:30:27 -04:00
Daniel Lemire 33f45582af adding twitter comment 2019-04-18 13:28:06 -04:00
Daniel Lemire 46ef59c679 Cleaning. 2018-12-27 20:19:10 -05:00
Daniel Lemire 61e9b82af2 Adding figures. 2018-12-19 01:04:13 -05:00