Better documentation for issue 70 (#638)
parent 47859f3560
commit 32afcd2e48
@ -152,3 +152,16 @@ If you wish to forcefully disable computed gotos, you can do so by compiling the
`-DSIMDJSON_NO_COMPUTED_GOTO=1`. It is not recommended to disable computed gotos if your compiler
supports them. In fact, you should almost never need to be concerned with computed gotos.

Number parsing
--------------

Some JSON files contain many floating-point values. This is the case with many GeoJSON files. Accurately
parsing decimal strings into binary floating-point values with proper rounding is challenging. To
our knowledge, it is not possible, in general, to parse streams of numbers at gigabytes per second
using a single core. With the simdjson library, you may be limited to a few hundred megabytes per
second if your JSON documents are densely packed with floating-point values.

- When possible, you should favor integer values written without a decimal point, as it is simpler and faster to parse decimal integer values.
- When serializing numbers, you should not use more digits than necessary: 17 significant digits are all that is needed to recover any double-precision floating-point number exactly when parsing. Using many more digits than necessary will make your files larger and slower to parse.
- When benchmarking parsing speeds, always report whether your JSON documents consist mostly of floating-point numbers, since number parsing can then dominate the parsing time.

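The 17-digit advice in the list above can be checked with a small sketch. It uses only the C++ standard library, not the simdjson API, and the helper names `to_decimal` and `roundtrips` are hypothetical, introduced here for illustration. It assumes that the platform's `strtod` rounds correctly, which is the case with common C libraries such as glibc.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: format `value` with `digits` significant decimal
// digits, using the printf "%.*g" conversion.
std::string to_decimal(double value, int digits) {
    char buffer[64];
    const int written = std::snprintf(buffer, sizeof(buffer), "%.*g", digits, value);
    // Guard against a formatting error or a truncated result.
    assert(written > 0 && written < static_cast<int>(sizeof(buffer)));
    return std::string(buffer);
}

// Hypothetical helper: returns true if parsing the decimal string gives back
// exactly the same double. Assumes a correctly rounding strtod.
bool roundtrips(double value, int digits) {
    const std::string text = to_decimal(value, digits);
    return std::strtod(text.c_str(), nullptr) == value;
}
```

For example, `roundtrips(1.0 / 3.0, 17)` holds while `roundtrips(1.0 / 3.0, 15)` does not, so 15 digits can lose information, and digits beyond 17 only add bytes that the parser has to process.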
@ -1,4 +1,28 @@
Files from https://github.com/plokhotnyuk/jsoniter-scala/tree/master/jsoniter-scala-benchmark/src/main/resources/com/github/plokhotnyuk/jsoniter_scala/benchmark

See issue "Lower performance on small files":
See issue:
https://github.com/lemire/simdjson/issues/70

The files che-*.geo.json are number-parsing stress tests.

```
$ for i in *.json ; do echo $i; ./parsingcompetition $i ; done
che-1.geo.json
simdjson : 4.841 cycles per input byte (best) 4.880 cycles per input byte (avg) 0.689 GB/s (error margin: 0.005 GB/s)
RapidJSON (accurate number parsing) : 18.326 cycles per input byte (best) 19.185 cycles per input byte (avg) 0.185 GB/s (error margin: 0.008 GB/s)
RapidJSON (insitu, accurate number parsing) : 18.158 cycles per input byte (best) 18.957 cycles per input byte (avg) 0.187 GB/s (error margin: 0.008 GB/s)
nlohmann-json : 90.423 cycles per input byte (best) 91.077 cycles per input byte (avg) 0.038 GB/s (error margin: 0.000 GB/s)

che-2.geo.json
simdjson : 4.849 cycles per input byte (best) 4.882 cycles per input byte (avg) 0.687 GB/s (error margin: 0.005 GB/s)
RapidJSON (accurate number parsing) : 18.248 cycles per input byte (best) 19.197 cycles per input byte (avg) 0.186 GB/s (error margin: 0.009 GB/s)
RapidJSON (insitu, accurate number parsing) : 18.178 cycles per input byte (best) 18.951 cycles per input byte (avg) 0.186 GB/s (error margin: 0.008 GB/s)
nlohmann-json : 91.483 cycles per input byte (best) 91.842 cycles per input byte (avg) 0.037 GB/s (error margin: 0.000 GB/s)

che-3.geo.json
simdjson : 4.862 cycles per input byte (best) 4.892 cycles per input byte (avg) 0.686 GB/s (error margin: 0.004 GB/s)
RapidJSON (accurate number parsing) : 18.316 cycles per input byte (best) 19.202 cycles per input byte (avg) 0.185 GB/s (error margin: 0.008 GB/s)
RapidJSON (insitu, accurate number parsing) : 18.143 cycles per input byte (best) 18.957 cycles per input byte (avg) 0.187 GB/s (error margin: 0.008 GB/s)
nlohmann-json : 91.462 cycles per input byte (best) 91.758 cycles per input byte (avg) 0.037 GB/s (error margin: 0.000 GB/s)
```