Commit Graph

14 Commits

Author SHA1 Message Date
Daniel Lemire 0d4e501239 Fixing the bug. 2020-06-17 10:06:16 -04:00
Daniel Lemire 4dfbf98e4e
Using a worker instead of a thread per batch (#920)
In the parse_many function, one thread runs stage 1 while the main thread runs stage 2. So if stage 1 and stage 2 each take half the time, parse_many could in principle run at twice the speed. It is unlikely to do so in practice. Still, we see benefits of about 40% due to threading.

To achieve this interleaving, we load the data in batches (blocks) of some size. In the current code (master), we create a new thread for each batch. Thread creation is expensive, so that approach only pays off with sizeable batches. This PR improves things and makes parse_many faster when using small batches.

  This fixes our parse_stream benchmark, which is currently broken.
  This replaces the one-thread-per-batch routine with a worker object that reuses the same thread. In benchmarks, this allows us to reach the same maximal speed, but with smaller processing blocks. It does not help much with larger blocks because the cost of thread creation is already amortized efficiently.
This PR makes parse_many beneficial over small datasets. It also makes us less dependent on the thread creation time.

Unfortunately, it is going to be difficult to say anything definitive in general. The cost of creating a thread varies widely depending on the OS. On some systems it might be cheap; on others, very expensive. The new code should depend less drastically on the performance of the underlying system, since we create just one thread.
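
The reusable-worker idea can be illustrated with a minimal sketch. This is not the code from this PR: `batch_worker` and `process_batches` are hypothetical names, and the two stages are replaced by trivial stand-ins (summing a batch, then adding that sum to a running total). It only shows how a single long-lived thread can prepare batch i+1 while the calling thread consumes batch i.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper (not simdjson's actual class): one long-lived thread
// that runs a single job at a time, so no thread is created per batch.
class batch_worker {
public:
  batch_worker() : thread_([this] { loop(); }) {}
  ~batch_worker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }
  batch_worker(const batch_worker &) = delete;
  batch_worker &operator=(const batch_worker &) = delete;

  // Hand a job to the worker thread; returns immediately.
  void run(std::function<void()> job) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(!busy_ && "call finish() before submitting another job");
      job_ = std::move(job);
      busy_ = true;
    }
    cv_.notify_all();
  }

  // Block until the job submitted with run() has completed.
  void finish() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !busy_; });
  }

private:
  void loop() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      cv_.wait(lock, [this] { return busy_ || stop_; });
      if (stop_) { return; }
      std::function<void()> job = std::move(job_);
      lock.unlock(); // run the job without holding the lock
      job();
      lock.lock();
      busy_ = false;
      cv_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::function<void()> job_;
  bool busy_ = false;
  bool stop_ = false;
  std::thread thread_; // declared last: starts after the other members exist
};

// Stand-in pipeline (assumption for illustration): "stage 1" sums a batch,
// "stage 2" adds that sum to a running total.
long process_batches(const std::vector<std::vector<int>> &batches) {
  if (batches.empty()) { return 0; }
  auto stage1 = [](const std::vector<int> &batch) {
    long sum = 0;
    for (int v : batch) { sum += v; }
    return sum;
  };
  long total = 0;
  long current = stage1(batches[0]); // first batch runs on the calling thread
  long next = 0;
  batch_worker worker; // the only extra thread, reused for every batch
  for (std::size_t i = 0; i < batches.size(); i++) {
    const bool has_next = i + 1 < batches.size();
    if (has_next) {
      worker.run([&, i] { next = stage1(batches[i + 1]); });
    }
    total += current; // stage 2 of batch i overlaps stage 1 of batch i+1
    if (has_next) {
      worker.finish(); // synchronizes before `next` is read
      current = next;
    }
  }
  return total;
}
```

With a thread created per batch, the creation cost would be paid on every loop iteration; here it is paid once in the `batch_worker` constructor, which is why the benefit shows up mainly with small batches.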

Co-authored-by: John Keiser <john@johnkeiser.com>
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-12 16:51:18 -04:00
Furkan 89332e1696
Temporary fix to #914 (#917) 2020-06-05 21:01:41 -04:00
Daniel Lemire 2fe2dd170b
The "competition tests" are being made portable (#907)
* More portable competition

* This will enable SIMDJSON_COMPETITION everywhere by default.

* Minor fixes
2020-05-31 20:34:06 -04:00
John Keiser 23dd0bdaa1 Remove /nologo MSVC flag 2020-05-04 11:35:57 -07:00
John Keiser 1d06624d38 Unset /D_CRT_SECURE_NO_WARNINGS
- Also localize DISABLE_DEPRECATED_WARNING so that we catch other
  deprecations
2020-05-04 11:35:05 -07:00
Furkan Usta 064eb0b24f CMake: Make simdjson-internal-flags subsume simdjson-flags 2020-05-03 02:48:29 +03:00
Furkan Usta 293c104cc4 CMake: Separate public and private compilation flags
simdjson-internal-flags for macros and warnings
simdjson-flags for pthread, sanitizer, and libcpp
2020-05-02 04:08:47 +03:00
John Keiser e7f774f964
Merge pull request #836 from furkanusta/fix830
Make library consumable after CMake installation (Fixes #830)
2020-04-30 11:34:08 -07:00
Furkan Usta 73d7d704c1 CMake: Remove export_private_library
Since we are exporting all the targets as part of the main simdjson target we do not need private
exports anymore
2020-04-30 02:06:19 +03:00
John Keiser c3dec1a5ea Default SIMDJSON_GOOGLE_BENCHMARKS to ON. 2020-04-29 15:21:43 -07:00
Furkan Usta 44b06d70e8 CMake: Link Threads the old-style
If linked against Threads::Threads target while building static libraries, cmake cannot find the
threads library while trying to use the installed target afterwards
2020-04-29 23:47:36 +03:00
Furkan Usta eee07e6cfd Use the same export name for all targets 2020-04-29 23:47:27 +03:00
John Keiser 1d069e5077 Split simdjson-flags and cmakecache into include files 2020-04-23 17:06:35 -07:00