Minor fixes to our documentation regarding thread safety. (#683)

* Minor fixes to our documentation regarding thread safety.

* A bit more pessimistic.
Daniel Lemire 2020-04-08 16:41:08 -04:00 committed by GitHub
parent ff0b0c54b7
commit 74d9b41b7d
2 changed files with 15 additions and 10 deletions


@ -298,11 +298,15 @@ See [parse_many.md](parse_many.md) for detailed information and design.
Thread Safety
-------------
The simdjson library is mostly single-threaded. Thread safety is the responsibility of the caller:
it is unsafe to reuse a dom::parser object between different threads.
We built simdjson with thread safety in mind.
simdjson's CPU detection, which runs the first time parsing is attempted and switches to the fastest
The simdjson library is single-threaded except for [`parse_many`](https://github.com/simdjson/simdjson/blob/master/doc/parse_many.md), which may use secondary threads under its control when the library is compiled with thread support.
We recommend using one `dom::parser` object per thread, in which case the library is thread-safe.
It is unsafe to reuse a `dom::parser` object between different threads.
The parsed results (`dom::document`, `dom::element`, `array`, `object`, etc.) depend on the `dom::parser`; it is therefore also potentially unsafe to share the results of parsing between different threads.
The CPU detection, which runs the first time parsing is attempted and switches to the fastest
parser for your CPU, is transparent and thread-safe.
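
The one-parser-per-thread recommendation can be sketched as follows. This is a minimal illustration, not simdjson code: `ToyParser` and `count_objects_per_thread` are hypothetical names, and `ToyParser` merely stands in for `dom::parser` by owning an internal buffer that every parse overwrites. Each worker builds its own parser and hands back only a plain value, so nothing that depends on a parser crosses a thread boundary.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for dom::parser: it owns a buffer that is rewritten
// on every parse, which is why sharing one instance across threads is unsafe.
struct ToyParser {
  std::string buffer;

  // Placeholder for real parsing: counts the '{' characters in the input.
  std::size_t count_objects(const std::string &json) {
    buffer = json;  // internal state is mutated without any locking
    std::size_t n = 0;
    for (char c : buffer) {
      if (c == '{') { ++n; }
    }
    return n;
  }
};

// Runs one worker per document. Each worker constructs its own parser and
// writes only a plain integer into its own slot of the result vector.
std::vector<std::size_t> count_objects_per_thread(
    const std::vector<std::string> &docs) {
  std::vector<std::size_t> results(docs.size(), 0);
  std::vector<std::thread> workers;
  workers.reserve(docs.size());
  for (std::size_t i = 0; i < docs.size(); ++i) {
    workers.emplace_back([&docs, &results, i] {
      ToyParser parser;  // one parser per thread, never shared
      results[i] = parser.count_objects(docs[i]);
    });
  }
  for (auto &worker : workers) { worker.join(); }
  return results;
}
```

With the real library, the equivalent pattern would presumably be to declare the `dom::parser` inside the thread body and to copy out plain values before the thread ends, rather than passing parsed elements to another thread.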
The json stream parser is threaded, using a second thread under its own control. Like the single
document parser


@ -12,8 +12,6 @@ Contents
- [How it works](#how-it-works)
- [Support](#support)
- [API](#api)
- [Concurrency mode](#concurrency-mode)
- [Example](#example)
- [Use cases](#use-cases)
Motivations
@ -88,14 +86,17 @@ sweet spot for now.
### Threads
But how can we make use of threads? We found a pretty cool algorithm that allows us to quickly
But how can we make use of threads if they are available? We found a pretty cool algorithm that allows us to quickly
identify the position of the last JSON document in a given batch. Knowing exactly where the end of
the batch is, we no longer need stage 2 to finish in order to load a new batch. We already know
where to start the next batch. Therefore, we can run stage 1 on the next batch concurrently while
the main thread is going through stage 2. Now, running stage 1 in a different thread can, in best
cases, remove almost entirely it's cost and replaces it by the overhead of a thread, which is orders
the main thread is going through stage 2. Running stage 1 in a different thread can, in the best
cases, remove its cost almost entirely and replace it with the overhead of a thread, which is orders
of magnitude cheaper. Ain't that awesome!
Thread support is only active if thread support is detected, in which case the macro
SIMDJSON_THREADS_ENABLED is set. Otherwise, the library runs in single-thread mode.
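
The overlap described above, stage 1 of the next batch running while the main thread finishes stage 2 of the current one, can be sketched roughly as below. This is an illustration under assumptions, not the library's implementation: `stage1_index`, `stage2_consume`, and `process_batches` are hypothetical names, the stages are trivial placeholders, and `std::async` is used here only as a convenient way to get a second thread.

```cpp
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical stage 1: a cheap scan recording where each '{' occurs.
std::vector<std::size_t> stage1_index(const std::string &batch) {
  std::vector<std::size_t> positions;
  for (std::size_t i = 0; i < batch.size(); ++i) {
    if (batch[i] == '{') { positions.push_back(i); }
  }
  return positions;
}

// Hypothetical stage 2: consumes the index; here it only counts entries.
std::size_t stage2_consume(const std::vector<std::size_t> &index) {
  return index.size();
}

// Processes all batches, overlapping stage 1 of batch i+1 with stage 2 of
// batch i. Only the first batch pays for stage 1 on the main thread.
std::size_t process_batches(const std::vector<std::string> &batches) {
  if (batches.empty()) { return 0; }
  std::size_t total = 0;
  std::vector<std::size_t> current = stage1_index(batches[0]);
  for (std::size_t i = 0; i < batches.size(); ++i) {
    const bool has_next = i + 1 < batches.size();
    std::future<std::vector<std::size_t>> next;
    if (has_next) {
      // Start stage 1 for the next batch on another thread.
      next = std::async(std::launch::async, [&batches, i] {
        return stage1_index(batches[i + 1]);
      });
    }
    total += stage2_consume(current);        // main thread: stage 2
    if (has_next) { current = next.get(); }  // wait for the worker's index
  }
  return total;
}
```

The sketch relies on the property the text describes: because the end of each batch is known up front, the next batch can be handed to stage 1 before stage 2 of the current batch has finished.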
Support
-------