* Minor edits regarding the On Demand documentation.
* Adding more instructions for CMake
* Tweaking.
* Adding changes requested by John.
* Bringing back detailed explanations of -march=native.
* Initial PPC64 support
* Add travis CI
* Fix outdated cmake version for travis
* Fix indentation
* Try another workaround for outdated cmake in travis
* Try beta cmake
* Add dash before beta
* Use builtin snaps
* Use cmake like rocksdb does
* Test cmake on bionic
* Remove unnecessary things from travis
* Remove unnecessary things from travis
* Another try of compiler install
* Add all major compilers
* Add all major compilers
* Add all major compilers
* Tweak travis a bit
* Typo
* More robust travis
* Typos typos typos
* Add fewer compilers, add a non-specific build for clang and gcc; should be the final config
* CMAKE_FLAGS is in the wrong place
* Remove default implementation
* Limit build thread number
* Fall back prefix_xor to a plain implementation, as no performance boost was noticed
* Test for power9 as it is the main architecture for OpenPOWER right now
* Add documentation on building with power9, as the implementation is compatible but the compiler optimizations are not
* Replace ARM with PPC in the comment
* Adding a distinct user id benchmark
* Reenabling everything.
* Removing an unnecessary "value()".
* Better tests of the examples and some fixes.
* Guarding exception code.
* Reenable the on-demand tests and allow converting a raw string into a C++ string.
* Fixing a 1-byte buffer overrun.
* More documentation.
* Adding more tests.
* Enabling the new tests
* Committing a nicer example.
* Not yet happy but this should fix our failures.
* Duh.
* Ok. Making it easier to get string_view instances from field instances.
* It is a struct.
* Trying to satisfy VS.
* Adopting John's name.
* This would allow users to find out what the builtin implementation is.
* Trying another approach.
* Added instructions.
* Cleaning up the printout.
* Let us be less invasive.
* Adding a comment.
* This adds new tests regarding ordering.
* Updating the documentation with more examples.
* Adding compilation tests.
* Pruning code for exceptions.
* Guarding exceptionless.
* Make it possible to check that an implementation is supported at runtime.
* Add CI fuzzing on 64-bit ARM
This adds fuzzing on drone.io arm64
For some reason, leak detection had to be disabled. If it is enabled, the fuzzer falsely reports a crash at the end of fuzzing.
Closes: #1188
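For context, a fuzz target for the parser generally follows the libFuzzer entry-point convention shown in the sketch below. This is a generic illustration, not the project's actual fuzzer; the `simdjson::dom::parser` usage and the leak-detection workaround are assumptions based on the description above.

```cpp
// Generic libFuzzer-style harness sketch (illustration only, not the project's fuzzer).
// Leak detection can typically be disabled at run time, e.g. ./fuzzer -detect_leaks=0
// (assumption: this is how the false crash report mentioned above was avoided).
#include <cstddef>
#include <cstdint>
#include "simdjson.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  simdjson::dom::parser parser;
  // Throw arbitrary bytes at the parser; parse errors are expected and ignored,
  // we only care about crashes and sanitizer findings.
  auto result = parser.parse(data, size);
  (void)result;
  return 0;
}
```

Running the same target on arm64 is then mostly a matter of CI configuration rather than code changes.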
* Guarding the implementation accesses.
* Better doc.
* Updating cxxopts.
* We need to accommodate cxxopts
Co-authored-by: Paul Dreik <github@pauldreik.se>
* Adding test.
* Saving.
* With exceptions.
* Added extensive tests.
* Better documentation.
* Tweaking CI
* Cleaning.
* Do not assume make.
* Let us make the build verbose
* Reorg
* I do not understand how circle ci works.
* Breaking it up.
* Better syntax.
* Specification is not followed.
* Fixes.
* Do not pass string_view by reference.
* Better documentation.
* The example is written for exceptions.
* Better documentation.
* Updating with deprecation.
* Updating example.
* Updating example.
Links to other files need to be either relative to themselves (doc/performance.md -> performance.md) or absolute (doc/performance.md -> /doc/performance.md). This change fixes the documentation when read on GitHub.
In the parse_many function, we have one thread doing stage 1, while the main thread does stage 2. So if stage 1 and stage 2 each take half the time, parse_many could run at twice the speed. It is unlikely to do so in practice. Still, we see benefits of about 40% due to threading.
To achieve this interleaving, we load the data in batches (blocks) of some size. In the current code (master), we create a new thread for each batch. Thread creation is expensive, so our approach only works over sizeable batches. This PR improves things and makes parse_many faster when using small batches.
This fixes our parse_stream benchmark which is just busted.
This replaces the one-thread-per-batch routine by a worker object that reuses the same thread (a rough sketch of this pattern follows below). In benchmarks, this allows us to get the same maximal speed, but with smaller processing blocks. It does not help much with larger blocks because the cost of thread creation gets amortized efficiently.
This PR makes parse_many beneficial over small datasets. It also makes us less dependent on the thread creation time.
Unfortunately, it is going to be difficult to say anything definitive in general. The cost of creating a thread varies widely depending on the OS. On some systems, it might be cheap, in others very expensive. It should be expected that the new code will depend less drastically on the performance of the underlying system, since we create just one thread.
Co-authored-by: John Keiser <john@johnkeiser.com>
Co-authored-by: Daniel Lemire <lemire@gmai.com>
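To make the design choice above concrete, here is a minimal, generic sketch of a worker object that keeps one thread alive and reuses it for successive batches, instead of creating a thread per batch. It is an illustration of the pattern under simple assumptions, not simdjson's actual internal worker code.

```cpp
// Minimal sketch of a reusable worker thread (pattern illustration,
// not simdjson's internal implementation).
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

class reusable_worker {
public:
  reusable_worker() : thread_([this] { run(); }) {}

  ~reusable_worker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }

  // Hand a task (e.g. "run stage 1 on the next batch") to the existing thread.
  void submit(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      task_ = std::move(task);
      has_task_ = true;
    }
    cv_.notify_all();
  }

  // Block until the previously submitted task has finished.
  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !has_task_; });
  }

private:
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      cv_.wait(lock, [this] { return has_task_ || stop_; });
      if (stop_) { return; }
      std::function<void()> task = std::move(task_);
      lock.unlock();
      task();                 // e.g. stage 1 over one batch
      lock.lock();
      has_task_ = false;
      cv_.notify_all();       // wake up wait()
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::function<void()> task_;
  bool has_task_ = false;
  bool stop_ = false;
  std::thread thread_;        // created once, reused for every batch
};
```

With such a worker, the main thread can submit stage 1 for the next batch, run stage 2 on the current one, and call wait() before moving on; since the thread is created only once, small batches no longer pay the thread-creation cost on every iteration.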