simdjson

Commit Graph

Author	SHA1	Message	Date
Paul Dreik	f93fb21c95	optionally disable deprecated apis (#1271) Introduce cmake option SIMDJSON_DISABLE_DEPRECATED_API (default Off) which turns off deprecated simdjson api functions by setting the macro SIMDJSON_DISABLE_DEPRECATED_API. For non-cmake users, users will have to set SIMDJSON_DISABLE_DEPRECATED_API by some other means to disable the api. Closes #1264	2020-11-01 06:38:52 +01:00
Daniel Lemire	a8bf10ea5a	Minor patch.	2020-10-30 14:51:50 -04:00
Daniel Lemire	a75c07065f	Fix for issue 1246. We document the relationship between parser instances and elements (#1250) * Fix for issue 1246. * Adopting John's wording.	2020-10-26 08:40:45 -04:00
Daniel Lemire	bb2bc98a22	Fix issue https://github.com/simdjson/simdjson/issues/1127 (#1224)	2020-10-13 09:18:54 -04:00
Daniel Lemire	37e6d1e9c7	new number parsing (#1222) * Remove our dependency on strtod_l by bundling our own slow path. * Ok. Let us drop strtod entirely. * Trimming down the powers to -342. * Removing useless line. * Many more comments. * Adding some DLL exports. * Let the gods help those who rely on windows+gcc. * Marking the subnormals as unlikely. This is pretty much "performance neutral", but it might help just a bit with twitter.json.	2020-10-10 12:47:49 -04:00
John Keiser	4eb80ec75a	Add DOM API exception tests	2020-10-06 11:29:45 -07:00
Daniel Lemire	9865bb6904	Make it possible to check that an implementation is supported at runtime (#1197) * Make it possible to check that an implementation is supported at runtime. * add CI fuzzing on arm 64 bit This adds fuzzing on drone.io arm64 For some reason, leak detection had to be disabled. If it is enabled, the fuzzer falsely reports a crash at the end of fuzzing. Closes: #1188 * Guarding the implementation accesses. * Better doc. * Updating cxxopts. * Make it possible to check that an implementation is supported at runtime. * Guarding the implementation accesses. * Better doc. * Updating cxxopts. * We need to accomodate cxxopts Co-authored-by: Paul Dreik <github@pauldreik.se>	2020-10-02 11:04:51 -04:00
Daniel Lemire	8b5a89c136	Parsing floats with 19 significant digits should be fine. (#1191) * Parsing floats with 19 significant digits should be fine. * Adding more tests with very long mantissa.	2020-09-29 19:42:43 -04:00
Daniel Lemire	da093c1982	Fixing "undefined behavior" issue in new fast_itoa functions (#1186) * Fixing "undefined behavior" issue. * Simplifying our custom atoi * Fixing minor bug	2020-09-29 19:17:03 -04:00
Daniel Lemire	048fb6278a	This adds two tests to verify a new fuzzer issue. (So far I could not verify.) (#1194)	2020-09-29 11:45:41 -04:00
Daniel Lemire	0e584fa4a5	Attempt to fix issue 1187. (#1192)	2020-09-27 12:04:47 -04:00
Daniel Lemire	60c139a844	Faster and more correct serialization (#1168) * Adding new files. * Better. * Fixing minifier and adding tests. * Adding benchmarks. * Including the array header. * Replacing old stream-based code by the new code. * Doubling up the itoa. * Hidden away to_chars in internal namespace. * Removing the repetitions. * Documented the atoi functions. * Tuning the escape sequences. * Moving the operators off the main namespace. * Added more tests. * Tweaking the implementation so that it works with and without exp. * The string_builder template and mini_formatter class are not part of our public API and are subject to change at any time! * Adding a benchmark and some optimization. * Cleaning. * Strictly speaking, this header is needed.	2020-09-23 10:00:39 -04:00
Daniel Lemire	f410213003	Improve documentation on padding - Improves and clarifies the documentation on padding. - Use std:: prefix for memcpy, strlen etc. Related to issues #1175 and #1178	2020-09-23 09:07:14 +02:00
Daniel Lemire	bfbac12f76	We were forgetting to check the end bytes at the end of the UTF8 validation. (#1173) * We were forgetting to check the end bytes at the end of the UTF8 validation. * Silencing the sanitizer * Better explanation.	2020-09-15 11:33:09 -04:00
Daniel Lemire	3e5497e2f9	Fixes issue 1170 and makes the usage of minify easier. (#1171) * Fixes issue 1170 and makes the usage of minify easier. * This should get the fallback implementation to detect unclosed strings.	2020-09-12 16:20:20 -04:00
Daniel Lemire	8a8eea53a2	Prefixing macros (issue 1035) (#1124) * Renaming partially done. * More prefixing. * I thought that this was fixed. * Missed one. * Missed a few. * Missed another one. * Minor fixes.	2020-08-18 18:25:36 -04:00
Daniel Lemire	039d82ff1b	Returning basictests to its original function: basic tests (only) (#1010) * The initial motivation behind basictests was for a quick set of sanity tests to check whether your code made sense. It was not meant for thorough testing to find corner cases. However, over time, it grew to include such expensive tests. This PR takes them out. It also allows us to bring back basictests to MinGW tests, since it is now cheap. This is not an exercise in software engineering and making things prettier. This is a pragmatic change to improve our test coverage and quality of life. * Adds many more cheap tests. Co-authored-by: Daniel Lemire <lemire@gmai.com>	2020-07-13 09:39:35 -04:00
Daniel Lemire	ccc94c9b05	Mingw tests (32-bit and 64-bit) (#1004)	2020-06-29 21:10:54 -04:00
John Keiser	257089884f	Merge pull request #958 from simdjson/jkeiser/is Make simdjson_result<element>.is() return bool	2020-06-23 09:51:37 -07:00
John Keiser	c650ea9765	Merge pull request #960 from simdjson/jkeiser/idiomatic-get Convert simdjson to use .get()	2020-06-23 09:49:41 -07:00
John Keiser	2d84b6f6d9	Make simdjson_result<element>.is() return bool	2020-06-23 09:09:24 -07:00
John Keiser	eef1171944	Merge pull request #954 from simdjson/jkeiser/parse-many-result Return error from parse_many	2020-06-23 09:06:20 -07:00
John Keiser	1ff55c2729	Replace auto [x,error] with .get() everywhere	2020-06-21 16:26:59 -07:00
Daniel Lemire	5dbcdf1484	Ok	2020-06-21 17:52:30 -04:00
John Keiser	6fa5abcd7e	Replace x.get<T>() with x.get(v) or T(x)	2020-06-21 14:36:38 -07:00
John Keiser	ae1bd891e7	Remove deprecated uses of parse_many	2020-06-21 11:19:06 -07:00
John Keiser	9899e5021d	Allow use of document_stream with tie()	2020-06-20 21:15:05 -07:00
John Keiser	a7fc7d4ffb	Switch from get(v,e) to e = get(v)	2020-06-20 17:57:09 -07:00
John Keiser	56e2b38048	Add bool result from tie()/get(), get<T>(T&,error_code&)	2020-06-20 17:55:46 -07:00
John Keiser	0b8c357eff	Add get_X and is_X methods	2020-06-19 13:27:33 -07:00
John Keiser	efc168f473	Make test changes only	2020-06-19 13:27:33 -07:00
John Keiser	d8428f98d9	Add cast_tester.h	2020-06-19 13:27:33 -07:00
John Keiser	60f17d26a3	Move test macros to a header	2020-06-19 13:27:00 -07:00
Daniel Lemire	c13c2650a2	Merge pull request #940 from simdjson/issue938 Verifying (and fixing) issue 938	2020-06-18 18:25:31 -04:00
Daniel Lemire	04a19f9813	Fixes https://github.com/simdjson/simdjson/issues/937	2020-06-17 18:06:13 -04:00
Daniel Lemire	6537d0dc76	Avoiding the unused errors.	2020-06-17 14:19:58 +00:00
Daniel Lemire	8d609607e2	Verifying the bug.	2020-06-16 20:04:09 -04:00
John Keiser	fd44c2a2ff	Merge pull request #927 from simdjson/dlemire/exposingthestringminifier Exposing the string minifier.	2020-06-13 07:47:20 -07:00
John Keiser	a86a82b39c	Rename minify class to minifier so the minify() method is cleared up	2020-06-12 17:05:25 -07:00
Daniel Lemire	4dfbf98e4e	Using a worker instead of a thread per batch (#920) In the parse_many function, we have one thread doing the stage 1, while the main thread does stage 2. So if stage 1 and stage 2 take half the time, the parse_many could run at twice the speed. It is unlikely to do so. Still, we see benefits of about 40% due to threading. To achieve this interleaving, we load the data in batches (blocks) of some size. In the current code (master), we create a new thread for each batch. Thread creation is expensive so our approach only works over sizeable batches. This PR improves things and makes parse_many faster when using small batches. This fixes our parse_stream benchmark which is just busted. This replaces the one-thread per batch routine by a worker object that reuses the same thread. In benchmarks, this allows us to get the same maximal speed, but with smaller processing blocks. It does not help much with larger blocks because the cost of the thread create gets amortized efficiently. This PR makes parse_many beneficial over small datasets. It also makes us less dependent on the thread creation time. Unfortunately, it is going to be difficult to say anything definitive in general. The cost of creating a thread varies widely depending on the OS. On some systems, it might be cheap, in others very expensive. It should be expected that the new code will depend less drastically on the performances of the underlying system, since we create juste one thread. Co-authored-by: John Keiser <john@johnkeiser.com> Co-authored-by: Daniel Lemire <lemire@gmai.com>	2020-06-12 16:51:18 -04:00
Daniel Lemire	45e2178ada	Duh.	2020-06-11 17:20:28 +00:00
Daniel Lemire	a6e4933d93	Exposing the string minifier.	2020-06-11 13:07:18 -04:00
John Keiser	fe01da077e	Make threaded version work again	2020-06-07 16:21:00 -07:00
John Keiser	c4a0fe1606	Add tests for parse_many() errors	2020-06-07 16:20:46 -07:00
Daniel Lemire	7a69da16e4	Fixing issue 906 (#912) * Fixing issue 906 * Safe patching. * Now with explanations. * Bumping up memory allocation. * Putting the patch back. * fallback fixes. Co-authored-by: Daniel Lemire <lemire@gmai.com>	2020-06-05 15:37:09 -04:00
Daniel Lemire	12150baa5e	Using just ASCII. (#899) * Using just ASCII. * Let us prune checkperf. * Moving the description of lookup2 to the HACKING.md file.	2020-05-21 21:59:06 -04:00
John Keiser	5312fd30e5	Fix CRT_SECURE warnings in clang	2020-05-04 11:36:00 -07:00
John Keiser	0e6ea76e88	Make checkperf work on Windows (#799) * Make command line arguments work for Windows * Run checkperf on Windows	2020-04-27 14:20:05 -04:00
John Keiser	d4a37f6ef5	Enable conversion warnings on Linux and Windows	2020-04-22 14:21:30 -07:00
John Keiser	ff09b6c824	Run fewer redundant steps and configs in CI	2020-04-17 12:23:05 -07:00

118 Commits