diff --git a/README.md b/README.md
index b5dde0c4..6bd51e13 100644
--- a/README.md
+++ b/README.md
@@ -76,25 +76,47 @@ The simdjson library uses three-quarters less instructions than state-of-the-art
 fifty percent less than sajson. To our knowledge, simdjson is the first fully-validating JSON parser
 to run at gigabytes per second on commodity processors.
 
+The following figure represents parsing speed in GB/s for parsing various files
+on an Intel Skylake processor (3.4 GHz) using the GNU GCC 9 compiler (with the -O3 flag).
+We compare against the best and fastest C++ libraries.
+The simdjson library offers full unicode (UTF-8) validation and exact
+number parsing. The RapidJSON library is tested in two modes: fast and
+exact number parsing. The sajson library offers fast (but not exact)
+number parsing and partial unicode validation. In this data set, the file
+sizes range from 65KB (github_events) all the way to 3.3GB (gsoc-2018).
+Many files are mostly made of numbers: canada, mesh.pretty, mesh, random
+and numbers: in such instances, we see lower JSON parsing speeds due to the
+high cost of number parsing.
+
-On a Skylake processor, the parsing speeds (in GB/s) of various processors on the twitter.json file
-are as follows.
+On a Skylake processor, the parsing speeds (in GB/s) of various parsers on the twitter.json file are as follows, again using GNU GCC 9.1 (with the -O3 flag). The popular JSON for Modern C++ library is particularly slow: it obviously trades parsing speed for other desirable features.
 | parser                                | GB/s |
 | ------------------------------------- | ---- |
-| simdjson                              | 2.2  |
-| RapidJSON encoding-validation         | 0.51 |
-| RapidJSON encoding-validation, insitu | 0.71 |
-| sajson (insitu, dynamic)              | 0.70 |
-| sajson (insitu, static)               | 0.97 |
-| dropbox                               | 0.14 |
-| fastjson                              | 0.26 |
-| gason                                 | 0.85 |
-| ultrajson                             | 0.42 |
-| jsmn                                  | 0.28 |
-| cJSON                                 | 0.34 |
-| JSON for Modern C++ (nlohmann/json)   | 0.10 |
+| simdjson                              | 2.5  |
+| RapidJSON UTF8-validation             | 0.29 |
+| RapidJSON UTF8-valid., exact numbers  | 0.28 |
+| RapidJSON insitu, UTF8-validation     | 0.41 |
+| RapidJSON insitu, UTF8-valid., exact  | 0.39 |
+| sajson (insitu, dynamic)              | 0.62 |
+| sajson (insitu, static)               | 0.88 |
+| dropbox                               | 0.13 |
+| fastjson                              | 0.27 |
+| gason                                 | 0.59 |
+| ultrajson                             | 0.34 |
+| jsmn                                  | 0.25 |
+| cJSON                                 | 0.31 |
+| JSON for Modern C++ (nlohmann/json)   | 0.11 |
+
+
+The simdjson library offers high speed whether it processes tiny files (e.g., 300 bytes)
+or larger files (e.g., 3MB). The following plot presents parsing
+speed for [synthetic files over various sizes generated with a script](https://github.com/simdjson/simdjson_experiments_vldb2019/blob/master/experiments/growing/gen.py) on a 3.4 GHz Skylake processor (GNU GCC 9, -O3).
+
+
+
+[All our experiments are reproducible](https://github.com/simdjson/simdjson_experiments_vldb2019).
 
 Real-world usage
 ----------------
diff --git a/doc/gbps.png b/doc/gbps.png
index 32949109..f4efe05d 100644
Binary files a/doc/gbps.png and b/doc/gbps.png differ
diff --git a/doc/growing.png b/doc/growing.png
new file mode 100644
index 00000000..4a8bfa54
Binary files /dev/null and b/doc/growing.png differ