Updating the performance numbers. (#634)
* Updating the performance numbers.
* Updating with growing file sizes.
parent 2e420169c3
commit 1b6a31b277
50
README.md
|
@@ -76,25 +76,47 @@ The simdjson library uses three-quarters less instructions than state-of-the-art
|
||||||
fifty percent less than sajson. To our knowledge, simdjson is the first fully-validating JSON parser
|
fifty percent less than sajson. To our knowledge, simdjson is the first fully-validating JSON parser
|
||||||
to run at gigabytes per second on commodity processors.
|
to run at gigabytes per second on commodity processors.
|
||||||
|
|
||||||
|
The following figure represents parsing speed in GB/s for parsing various files
|
||||||
|
on an Intel Skylake processor (3.4 GHz) using the GNU GCC 9 compiler (with the -O3 flag).
|
||||||
|
We compare against the best and fastest C++ libraries.
|
||||||
|
The simdjson library offers full unicode (UTF-8) validation and exact
|
||||||
|
number parsing. The RapidJSON library is tested in two modes: fast and
|
||||||
|
exact number parsing. The sajson library offers fast (but not exact)
|
||||||
|
number parsing and partial unicode validation. In this data set, the file
|
||||||
|
sizes range from 65KB (github_events) all the way to 3.3GB (gsoc-2018).
|
||||||
|
Many files are mostly made of numbers: canada, mesh.pretty, mesh, random
|
||||||
|
and numbers. In such instances, we see lower JSON parsing speeds due to the
|
||||||
|
high cost of number parsing.
|
||||||
|
|
||||||
<img src="doc/gbps.png" width="90%">
|
<img src="doc/gbps.png" width="90%">
|
||||||
|
|
||||||
On a Skylake processor, the parsing speeds (in GB/s) of various processors on the twitter.json file
|
On a Skylake processor, the parsing speeds (in GB/s) of various parsers on the twitter.json file are as follows, using again GNU GCC 9.1 (with the -O3 flag). The popular JSON for Modern C++ library is particularly slow: it obviously trades parsing speed for other desirable features.
|
||||||
are as follows.
|
|
||||||
|
|
||||||
| parser | GB/s |
|
| parser | GB/s |
|
||||||
| ------------------------------------- | ---- |
|
| ------------------------------------- | ---- |
|
||||||
| simdjson | 2.2 |
|
| simdjson | 2.5 |
|
||||||
| RapidJSON encoding-validation | 0.51 |
|
| RapidJSON UTF8-validation | 0.29 |
|
||||||
| RapidJSON encoding-validation, insitu | 0.71 |
|
| RapidJSON UTF8-valid., exact numbers | 0.28 |
|
||||||
| sajson (insitu, dynamic) | 0.70 |
|
| RapidJSON insitu, UTF8-validation | 0.41 |
|
||||||
| sajson (insitu, static) | 0.97 |
|
| RapidJSON insitu, UTF8-valid., exact | 0.39 |
|
||||||
| dropbox | 0.14 |
|
| sajson (insitu, dynamic) | 0.62 |
|
||||||
| fastjson | 0.26 |
|
| sajson (insitu, static) | 0.88 |
|
||||||
| gason | 0.85 |
|
| dropbox | 0.13 |
|
||||||
| ultrajson | 0.42 |
|
| fastjson | 0.27 |
|
||||||
| jsmn | 0.28 |
|
| gason | 0.59 |
|
||||||
| cJSON | 0.34 |
|
| ultrajson | 0.34 |
|
||||||
| JSON for Modern C++ (nlohmann/json) | 0.10 |
|
| jsmn | 0.25 |
|
||||||
|
| cJSON | 0.31 |
|
||||||
|
| JSON for Modern C++ (nlohmann/json) | 0.11 |
|
||||||
|
|
||||||
|
|
||||||
|
The simdjson library offers high speed whether it processes tiny files (e.g., 300 bytes)
|
||||||
|
or larger files (e.g., 3MB). The following plot presents parsing
|
||||||
|
speed for [synthetic files over various sizes generated with a script](https://github.com/simdjson/simdjson_experiments_vldb2019/blob/master/experiments/growing/gen.py) on a 3.4 GHz Skylake processor (GNU GCC 9, -O3).
|
||||||
|
<img src="doc/growing.png" width="90%">
|
||||||
|
|
||||||
|
|
||||||
|
[All our experiments are reproducible](https://github.com/simdjson/simdjson_experiments_vldb2019).
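
The figures in the updated twitter.json table can be compared by simple division. The following is a minimal sketch in Python, not part of simdjson or its benchmark suite. The helper names `gbps` and `speedup` are hypothetical, and it assumes 1 GB = 10^9 bytes, which the README does not state. The values in `SPEEDS_GBPS` are copied from the new column of the table.

```python
# Throughput figures (GB/s) copied from the updated twitter.json table.
SPEEDS_GBPS = {
    "simdjson": 2.5,
    "RapidJSON UTF8-validation": 0.29,
    "sajson (insitu, static)": 0.88,
    "JSON for Modern C++ (nlohmann/json)": 0.11,
}


def gbps(num_bytes: int, seconds: float) -> float:
    """Return throughput in GB/s, assuming 1 GB = 10**9 bytes (an assumption)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_bytes / seconds / 1e9


def speedup(baseline: str, other: str = "simdjson") -> float:
    """Return how many times faster `other` is than `baseline` in the table."""
    return SPEEDS_GBPS[other] / SPEEDS_GBPS[baseline]


if __name__ == "__main__":
    for name in SPEEDS_GBPS:
        print(f"simdjson vs {name}: {speedup(name):.1f}x")
```

With these numbers, simdjson comes out roughly 8.6 times as fast as RapidJSON with UTF-8 validation on this file. Ratios from a single file and a single machine should not be read as general.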
|
||||||
|
|
||||||
Real-world usage
|
Real-world usage
|
||||||
----------------
|
----------------
|
||||||
|
|
BIN
doc/gbps.png
Binary file not shown.
Size before: 49 KiB. Size after: 67 KiB.
Binary file not shown.
Size after: 44 KiB.