374 Commits

Author SHA1 Message Date
John Keiser f75e856d2b Compare records to ensure benchmarks work 2020-10-04 12:47:30 -07:00
John Keiser 44d689bc6e Make instructions / cycle counters more useful 2020-10-04 12:47:30 -07:00
John Keiser 9e433c2f19 Move benchmarks into their own directories 2020-10-04 12:47:30 -07:00
John Keiser 4d89076bdc Check for EOF when skipping containers
Revert that?

Or not
2020-10-04 12:47:30 -07:00
John Keiser 283ac3191f Rename parse->iterate, add iterate_raw 2020-10-04 12:47:29 -07:00
John Keiser 4dd0c80dad Move current_string_buf_loc to json_iterator 2020-10-04 12:47:29 -07:00
John Keiser 3b53c6ca47 Use json_iterator as shared state instead of document 2020-10-04 12:47:29 -07:00
John Keiser a90b8fb449 Remove depth tracking from ondemand api 2020-10-04 12:47:29 -07:00
John Keiser 98be2c91df Fix SAX benchmarks to actually push to vector 2020-10-04 12:47:29 -07:00
John Keiser 2657e5e226 Fix points SAX to actually record points 2020-10-04 12:47:29 -07:00
John Keiser cfcb0d4fb7 Use json_iterator in array/object 2020-10-04 12:47:29 -07:00
John Keiser 4065529bdf Don't try to compile Haswell benchmarks on ARM 2020-10-04 12:47:29 -07:00
John Keiser 6be2db8c42 Fix SAX benchmark to actually add tweets 2020-10-04 12:47:29 -07:00
John Keiser 5cf68416d8 Don't bother comparing field names in parserandom 2020-10-04 12:47:29 -07:00
John Keiser ebcb3c6b3b On-demand parse implementation 2020-10-04 12:47:29 -07:00
Paul Dreik 04267e0f6b
add boost.json to benchmark (#1202)
Add boost.json to the benchmark.
It was accepted into boost on 2020-10-03, see https://lists.boost.org/Archives/boost/2020/10/250129.php.

The upstream repo is (expected to eventually be migrated to boost): https://github.com/CPPAlliance/json
2020-10-04 10:00:09 +02:00
Daniel Lemire a540e6afc5
Testing on minimalist alpine (linux) images (#1200)
* Tweaking header includes to make it safer.

* Adding the actual tests.

* Fixing my syntax.
2020-10-02 13:32:09 -04:00
Daniel Lemire 9865bb6904
Make it possible to check that an implementation is supported at runtime (#1197)
* Make it possible to check that an implementation is supported at runtime.

* add CI fuzzing on arm 64 bit

This adds fuzzing on drone.io arm64

For some reason, leak detection had to be disabled. If it is enabled, the fuzzer falsely reports a crash at the end of fuzzing.

Closes: #1188

* Guarding the implementation accesses.

* Better doc.

* Updating cxxopts.

* We need to accommodate cxxopts

Co-authored-by: Paul Dreik <github@pauldreik.se>
2020-10-02 11:04:51 -04:00
Daniel Lemire 60c139a844
Faster and more correct serialization (#1168)
* Adding new files.

* Better.

* Fixing minifier and adding tests.

* Adding benchmarks.

* Including the array header.

* Replacing old stream-based code by the new code.

* Doubling up the itoa.

* Hidden away to_chars in internal namespace.

* Removing the repetitions.

* Documented the atoi functions.

* Tuning the escape sequences.

* Moving the operators off the main namespace.

* Added more tests.

* Tweaking the implementation so that it works with and without exp.

* The string_builder template and mini_formatter class
 are not part of our public API and are subject to change
 at any time!

* Adding a benchmark and some optimization.

* Cleaning.

* Strictly speaking, this header is needed.
2020-09-23 10:00:39 -04:00
Daniel Lemire f410213003
Improve documentation on padding
- Improves and clarifies the documentation on padding.
 - Use std:: prefix for memcpy, strlen etc.

Related to issues #1175 and #1178
2020-09-23 09:07:14 +02:00
Daniel Lemire 4c11652808
This must be a typo (#1140) 2020-08-28 20:35:13 -04:00
John Keiser b2779c35df Fix issue with unsupported unreachable on Windows 2020-08-18 21:35:12 -07:00
John Keiser 18564f1ae2 Don't benchmark unless haswell is available 2020-08-18 21:25:03 -07:00
John Keiser 638f1deb62 Add DOM tweet reader for comparison 2020-08-18 21:25:03 -07:00
John Keiser 7e74d30f45 [WIP] tweet reader SAX benchmark 2020-08-18 21:25:03 -07:00
Daniel Lemire 8a8eea53a2
Prefixing macros (issue 1035) (#1124)
* Renaming partially done.

* More prefixing.

* I thought that this was fixed.

* Missed one.

* Missed a few.

* Missed another one.

* Minor fixes.
2020-08-18 18:25:36 -04:00
Daniel Lemire 501fed6c4f
This would disable bash scripts under FreeBSD. (#1118)
* This would disable bash scripts under FreeBSD.

* Let us also disable GIT.

* Let us try to just disable GIT

* Nope. We must have both bash and git disabled.
2020-08-17 11:50:57 -04:00
Daniel Lemire 2f92a34bb7
Turns out that passing dom::element by reference can be a performance killer. (#1086)
* Turns out that passing dom::element by reference can be a performance killer.

* Tweaking.
2020-08-01 10:31:47 -04:00
Daniel Lemire 84dc398d32 Adding a couple of tests. 2020-07-31 15:29:10 -04:00
Daniel Lemire f80668e87f
This removes the crazy alignment requirements. (#1073)
* This removes the crazy alignment requirements.
2020-07-27 16:19:01 -04:00
Daniel Lemire af18d5ed81
This adds a validation benchmark (#1040) 2020-07-20 18:56:39 -04:00
Daniel Lemire d0ce2f0b5a
Fixing clang under visual studio (#1028)
* Lots of fixes

* Removing some lambdas

* Removing some functional programming.

Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-07-06 18:58:19 -04:00
Daniel Lemire 29e744fdbb Adding warning message. 2020-06-24 19:23:02 -04:00
Daniel Lemire 515b87bcbe Disabling perfcheck for ninja 2020-06-24 18:45:47 -04:00
Daniel Lemire 5b4acf14ea Removing space. 2020-06-24 16:51:28 -04:00
Daniel Lemire 5fc6cb15b8 This should make things even more robust. If .git is not found, just disable all git work. 2020-06-24 16:12:19 -04:00
Daniel Lemire f6e9a8eee4 Making the cmake more verbose so we can figure out what is happening. 2020-06-24 15:44:22 -04:00
Daniel Lemire cb8a9ef2c0 This removes git as a dependency 2020-06-24 15:13:47 -04:00
John Keiser 1ff55c2729 Replace auto [x,error] with .get() everywhere 2020-06-21 16:26:59 -07:00
John Keiser 6fa5abcd7e Replace x.get<T>() with x.get(v) or T(x) 2020-06-21 14:36:38 -07:00
John Keiser 9899e5021d Allow use of document_stream with tie() 2020-06-20 21:15:05 -07:00
John Keiser a7fc7d4ffb Switch from get(v,e) to e = get(v) 2020-06-20 17:57:09 -07:00
John Keiser f336103f63 Convert tools/docs/benchmarks to bool get() idiom 2020-06-20 17:55:46 -07:00
John Keiser 56e2b38048 Add bool result from tie()/get(), get<T>(T&,error_code&) 2020-06-20 17:55:46 -07:00
John Keiser 7339f67dd7
Merge pull request #462 from simdjson/jkeiser/if-backslash
Wrap backslash processing in a branch
2020-06-17 07:07:58 -07:00
Daniel Lemire 7ea05d038e
New API traversal tests. (#931)
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-15 13:15:52 -04:00
Daniel Lemire 33930ff046 Adding link. 2020-06-15 13:07:53 -04:00
John Keiser fd44c2a2ff
Merge pull request #927 from simdjson/dlemire/exposingthestringminifier
Exposing the string minifier.
2020-06-13 07:47:20 -07:00
John Keiser a86a82b39c Rename minify class to minifier so the minify() method is cleared up 2020-06-12 17:05:25 -07:00
Daniel Lemire d1a54249e7 New API traversal tests. 2020-06-12 17:42:57 -04:00
Daniel Lemire 4dfbf98e4e
Using a worker instead of a thread per batch (#920)
In the parse_many function, we have one thread doing stage 1, while the main thread does stage 2. So if stage 1 and stage 2 each take half the time, parse_many could run at twice the speed. It is unlikely to do so. Still, we see benefits of about 40% due to threading.

To achieve this interleaving, we load the data in batches (blocks) of some size. In the current code (master), we create a new thread for each batch. Thread creation is expensive so our approach only works over sizeable batches. This PR improves things and makes parse_many faster when using small batches.

  This fixes our parse_stream benchmark which is just busted.
  This replaces the one-thread-per-batch routine with a worker object that reuses the same thread. In benchmarks, this allows us to get the same maximal speed, but with smaller processing blocks. It does not help much with larger blocks because the cost of thread creation gets amortized efficiently.
This PR makes parse_many beneficial over small datasets. It also makes us less dependent on the thread creation time.

Unfortunately, it is going to be difficult to say anything definitive in general. The cost of creating a thread varies widely depending on the OS. On some systems, it might be cheap, in others very expensive. It should be expected that the new code will depend less drastically on the performance of the underlying system, since we create just one thread.

Co-authored-by: John Keiser <john@johnkeiser.com>
Co-authored-by: Daniel Lemire <lemire@gmai.com>
2020-06-12 16:51:18 -04:00
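The worker idea described in #920 above can be illustrated with a minimal sketch: one long-lived thread receives each batch instead of a new thread being created per batch. The class and method names (`batch_worker`, `run`, `finish`) are hypothetical and are not taken from the pull request; this only shows the general technique.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical sketch of a reusable worker; not the class from the pull request.
class batch_worker {
public:
  // The thread is created once, here, and reused for every batch.
  batch_worker() : thread_([this] { loop(); }) {}

  ~batch_worker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }

  // Hand one batch to the worker thread (waits until the worker is idle).
  void run(std::function<void()> job) {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !job_; });
    job_ = std::move(job);
    cv_.notify_all();
  }

  // Block until the most recently submitted batch has been processed.
  void finish() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !job_; });
  }

private:
  void loop() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      cv_.wait(lock, [this] { return stop_ || job_; });
      if (!job_) return;                    // stop requested and nothing pending
      std::function<void()> job = job_;     // job_ stays set as the "busy" marker
      lock.unlock();
      job();                                // process the batch outside the lock
      lock.lock();
      job_ = nullptr;                       // idle again
      cv_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::function<void()> job_;
  bool stop_ = false;
  std::thread thread_;  // declared last so the other members exist before it starts
};
```

A caller would construct one `batch_worker`, call `run` once per batch, and call `finish` before reading the results, so the thread-creation cost is paid once regardless of batch size.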
Daniel Lemire 1b6258ec8c Added std::minify 2020-06-12 16:37:41 -04:00
John Keiser 7c6723d912 Print progress bar even if there is only one file 2020-06-12 10:01:19 -07:00
Daniel Lemire a6e4933d93 Exposing the string minifier. 2020-06-11 13:07:18 -04:00
John Keiser ae6dddfff4
Merge pull request #903 from simdjson/jkeiser/dom-parser-implementation
Move parser state to implementation-specific class
2020-06-04 13:09:57 -07:00
John Keiser 1aab4752e2 Store all parser state in the implementation 2020-06-01 12:15:54 -07:00
John Keiser 6a71b24495 Reuse stored buf and len from parser 2020-06-01 12:14:09 -07:00
John Keiser a3a9bde83e Move DOM parsing into concrete interface implementation 2020-06-01 12:14:09 -07:00
Daniel Lemire 2fe2dd170b
The "competition tests" are being made portable (#907)
* More portable competition

* This will enable SIMDJSON_COMPETITION everywhere by default.

* Minor fixes
2020-05-31 20:34:06 -04:00
John Keiser e6c9dfbd91 Make include files more fine-grained 2020-05-19 14:42:04 -07:00
Daniel Lemire 8927a0561f
Obvious fix. (#885) 2020-05-14 20:39:44 -04:00
John Keiser 8c600ca553 Make benchfeatures work again 2020-05-05 09:39:29 -07:00
John Keiser 5312fd30e5 Fix CRT_SECURE warnings in clang 2020-05-04 11:36:00 -07:00
Furkan Usta af968c5b44 Merge branch 'master' of github.com:simdjson/simdjson into cmake-flags 2020-05-03 02:12:23 +03:00
Furkan Usta 293c104cc4 CMake: Separate public and private compilation flags
simdjson-internal-flags for macros and warnings
simdjson-flags for pthread, sanitizer, and libcpp
2020-05-02 04:08:47 +03:00
Furkan Usta 5a3035bb72 Propagate some CMake variables to checkperf
Although it passes user-defined options, if the project is built in Debug mode or with Clang (since
CXX defaults to gcc on Linux) results can fluctuate
2020-05-02 02:40:13 +03:00
Daniel Lemire fa4ce6a8bc
There is confusion between gigabytes and gibibytes. Let us standardize throughout. (#838)
* There is confusion between gigabytes and gibibytes.

* Trying to be consistent.
2020-05-01 12:16:18 -04:00
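For context on #838 above: a gigabyte (GB) is 10^9 bytes and a gibibyte (GiB) is 2^30 bytes, so the same measurement reads about 7% lower when reported in GiB/s. A minimal sketch of the two conversions follows; the helper names are hypothetical and are not taken from the simdjson benchmark code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only.
constexpr double BYTES_PER_GB  = 1000000000.0;  // gigabyte: 10^9 bytes
constexpr double BYTES_PER_GIB = 1073741824.0;  // gibibyte: 2^30 bytes

// Throughput in GB/s for `bytes` processed in `seconds`.
constexpr double gigabytes_per_second(std::uint64_t bytes, double seconds) {
  return static_cast<double>(bytes) / BYTES_PER_GB / seconds;
}

// Throughput in GiB/s for `bytes` processed in `seconds`.
constexpr double gibibytes_per_second(std::uint64_t bytes, double seconds) {
  return static_cast<double>(bytes) / BYTES_PER_GIB / seconds;
}
```

Whichever unit a benchmark picks, labelling it explicitly avoids the ambiguity the commit refers to.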
John Keiser c3dec1a5ea Default SIMDJSON_GOOGLE_BENCHMARKS to ON. 2020-04-29 15:21:43 -07:00
Daniel Lemire f0d5337818
Adding independent benchmarks using Google Benchmark (#826)
* Adding independent benchmarks using Google Benchmark
2020-04-29 13:53:54 -04:00
Daniel Lemire 4cd9de5c37
This will change the default of the parse benchmark so that it works over hot buffers (#827)
* This will change the default of the parse benchmark so that it works over hot buffers
by default, thus omitting memory allocation as part of the benchmark.

* Everyone should be using '-H' from now on.
2020-04-29 13:43:27 -04:00
John Keiser 92d7af0881 Don't include benchmark overhead in documents/s 2020-04-28 13:15:01 -07:00
John Keiser 0e6ea76e88
Make checkperf work on Windows (#799)
* Make command line arguments work for Windows

* Run checkperf on Windows
2020-04-27 14:20:05 -04:00
Daniel Lemire 0d1c574cb1
A few more changes... (#775)
* More nitpicking.
2020-04-23 11:36:52 -04:00
Daniel Lemire e030f02776 Merge branch 'master' into jkeiser/wconversion 2020-04-22 22:03:34 -04:00
Daniel Lemire 80dbf9a32a
We should not enable checkperf on anything but Linux. (#762) 2020-04-22 21:54:03 -04:00
John Keiser d4a37f6ef5 Enable conversion warnings on Linux and Windows 2020-04-22 14:21:30 -07:00
John Keiser 3091e2dc0e Add fallback, westmere and unthreaded checkperf 2020-04-20 11:18:40 -07:00
John Keiser fc50a36cc5 Make checkperf build master with the same options 2020-04-20 11:14:46 -07:00
John Keiser 9bf9fba2ec Add checkperf to cmake 2020-04-20 11:14:46 -07:00
John Keiser 289cc3e7a0 Treat warnings as errors during compilation 2020-04-15 19:59:38 -07:00
John Keiser 7480b87e07
Merge pull request #693 from simdjson/jkeiser/cmake-quickstartcpp
Add C++11 tests to cmake
2020-04-15 19:53:14 -07:00
Daniel Lemire befa6423be
This massively improves the performance of tight loops relying on a type() call. (#721)
* This massively improves the performance of tight loops relying on a type() call.

* Adding a few more benchmarks
2020-04-15 20:45:40 -04:00
John Keiser 09cf18a646 Add C++11 tests to cmake
- Add simdjson-flags target so callers don't have flags forced on them
2020-04-15 17:26:25 -07:00
Daniel Lemire 326c175dcb
Massive performance boost for get<double>. (#719)
* Massive performance boost for get<double>.
2020-04-15 20:09:45 -04:00
Daniel Lemire 6d7c77ddc1
Let us try to check with the exceptions disabled. (#707)
* Tweaking code so that we can run all tests with exceptions off.
* Removing SIMDJSON_DISABLE_EXCEPTIONS
2020-04-15 16:45:36 -04:00
Daniel Lemire b523c43927
Can we provide a size() function to arrays and objects? (eager approach) [TO BE MERGED] (#690)
* This is an implementation of "size()" for arrays and objects.
* Adding benchmark
* Adding a size() remark in the documentation.
* Extending size() to result types.
2020-04-15 10:15:48 -04:00
Paul Dreik 75545ff70d
ref qualify parser methods to avoid use of dangling objects (#703)
To avoid using data belonging to a temporary, the parse functions are ref qualified to get a compile error if used on an rvalue. See https://github.com/simdjson/simdjson/issues/696

Compilation tests are also added, to make sure bad usage fails to compile.

Reviewed by jkeiser.
2020-04-15 09:57:52 +02:00
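The ref-qualification technique mentioned in #703 above can be sketched in isolation. `toy_parser` and `can_parse` are hypothetical names used only for illustration, not the simdjson classes: the lvalue overload is usable, while the rvalue overload is deleted so that calling `parse` on a temporary fails to compile.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a parser that owns the data its results refer to.
class toy_parser {
public:
  // Callable only on lvalues: the parser outlives the returned reference.
  const std::string& parse(const std::string& json) & {
    last_ = json;
    return last_;
  }
  // Calling parse() on a temporary parser would return a dangling reference,
  // so that overload is deleted and such calls become compile errors.
  const std::string& parse(const std::string& json) && = delete;

private:
  std::string last_;
};

// Detection helper: true when an object of type P can call parse(const std::string&).
template <typename P, typename = void>
struct can_parse : std::false_type {};

template <typename P>
struct can_parse<P, decltype(void(std::declval<P>().parse(
                        std::declval<const std::string&>())))> : std::true_type {};

static_assert(can_parse<toy_parser&>::value, "lvalue parser: allowed");
static_assert(!can_parse<toy_parser&&>::value, "temporary parser: rejected");
```

The `static_assert` lines play the role of the compilation tests the commit mentions: they check at compile time that the bad usage is rejected.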
John Keiser 7317fe1440 Don't reinitialize submodules
Add ability to turn competitive benchmarks off (no need for submodules)
2020-04-09 08:52:29 -07:00
John Keiser 7b58fea911 Add benchmark competitions to cmake 2020-04-09 08:52:29 -07:00
John Keiser 6dabfa176a Add competition libraries 2020-04-09 08:52:29 -07:00
John Keiser 3dcc188d93 Add more tests to cmake 2020-04-08 14:52:56 -07:00
John Keiser 10b7556a37 Specify cmake tests, benchmarks and tools idiomatically 2020-04-08 14:52:56 -07:00
John Keiser 54b7291c34 Reference simdjson by name, don't specify include files individually 2020-04-08 14:52:55 -07:00
Daniel Lemire 21dce6cca9
Displaying the numbers of documents parsed per second (#652)
* Some users are interested, as a metric, in the number of documents parsed per second.
Obviously, this means reusing the same parser again and again.

* Adding a sentence

* This updates the parsingcompetition benchmark so that it displays the number of documents parsed per second.
2020-03-30 17:51:03 -04:00
John Keiser d93af1161d Remove set_capacity, replace with allocate
Makes allocation point more predictable
2020-03-30 13:49:54 -07:00
John Keiser 434776db1a Deprecate more things 2020-03-30 13:48:43 -07:00
John Keiser 622d9c9480 Replace as_X and is_X with get<T> and is<T> 2020-03-28 15:29:53 -07:00
John Keiser 03746b966b Move document/element/etc. under dom 2020-03-28 13:42:21 -07:00
Daniel Lemire 450e19858b Minor fix to distinctuseridcompetition 2020-03-28 15:56:10 -04:00
John Keiser e836c28008 Deprecate parser error code methods
- Also make competitions compile without warnings
2020-03-28 10:13:20 -07:00
John Keiser 5ad405006c Return document::element from parse, load, parse_many, load_many 2020-03-27 12:24:41 -07:00
John Keiser 2e420169c3 Remove document::parse and document::load 2020-03-26 10:13:09 -07:00
Daniel Lemire ab0e22a316
Trying to migrate distinctuseridcompetition to new API. (#624)
* Trying to migrate distinctuseridcompetition to new API.

* Ok. Good performance + got rid of old API.
2020-03-26 12:06:28 -04:00
John Keiser a0bce440a6 Remove document_iterator, document::iterator, ParsedJsonIterator
Keep ParsedJson::Iterator only, without template, in same form as
it was in 0.2
2020-03-25 18:26:51 -07:00
Daniel Lemire 1cf4fe405d
Fixing issue 602 (#621) 2020-03-25 21:06:20 -04:00
Daniel Lemire 6b8f5d3354
Fixing issue 601 (#618)
* Fixing issue 601
2020-03-25 20:44:55 -04:00
John Keiser d5af359365
Fix compile error in master (#619) 2020-03-25 20:11:23 -04:00
Daniel Lemire d84e70b6e5
migrating minifier competition to new API (#597)
* Migrating minifiercompetition to new API.
2020-03-24 10:13:55 -04:00
Daniel Lemire 7ff034504d
Migrating parsingcompetition to new API. (#593)
* Migrating parsingcompetition to new API.

* Removing ParsedJson
2020-03-24 10:06:44 -04:00
Daniel Lemire 5d1e3efce8
faster minifier (#568)
* Fallback should use our scalar code.
* parse should have a nicer error message.
* Making it so that "minify" can use different architectures.
* Let us change the minifier competition so that it tests all implementations.
* Documenting the untaken optimization opportunity.

Co-authored-by: John Keiser <john@johnkeiser.com>
2020-03-20 16:14:47 -04:00
Daniel Lemire 6cefeb338b
std::tie does not work on some compilers (#567)
* std::tie workaround.

* Cleaner solution
2020-03-19 16:56:45 -04:00
John Keiser af203aaf86 Add fallback parser for pre-SSE4.2 machines 2020-03-17 14:59:47 -07:00
John Keiser 8e2c06cb0e Compile with -fno-exceptions 2020-03-17 13:54:37 -07:00
John Keiser 1a5d8f1957 Add tests for SIMDJSON_EXCEPTIONS=0, add `tie()` support 2020-03-17 13:54:37 -07:00
Daniel Lemire 317fc6ba0e
accurate number parsing (#558) 2020-03-15 22:30:21 -04:00
John Keiser 0c190b165c Benchmark minify 2020-03-13 18:59:15 -07:00
John Keiser e4e89fe27a
Fix parse benchmarker (#554)
* Fix parse benchmarker

* Make CI fail when parse doesn't work
2020-03-13 16:19:21 -04:00
Daniel Lemire fb15886a1c Simple fix for name erasure. 2020-03-13 14:41:19 -04:00
Daniel Lemire 12c85d3e23
If we are going to have a google benchmark flag, we better make sure … (#551)
* If we are going to have a google benchmark flag, we better make sure that we test it out minimally (it should build).

* Fix bench_dom_api

Co-authored-by: John Keiser <john@johnkeiser.com>
2020-03-12 17:48:30 -04:00
John Keiser a5afec1f94 Make #defines into simdjson::constants 2020-03-11 19:16:29 -07:00
John Keiser 40c6213d7e Add parser.load() and load_many() to load files 2020-03-11 17:19:41 -07:00
John Keiser d140bc23f5 Automatically allocate memory as needed in parse 2020-03-11 16:14:54 -07:00
John Keiser 00f0859e1f Add ability to run multiple files 2020-03-11 16:05:05 -07:00
John Keiser 66a2807210 Rename invalid_json to simdjson_error 2020-03-06 16:12:51 -08:00
John Keiser 3bdfe167de Support cout << error 2020-03-06 15:41:51 -08:00
John Keiser 31e8a12e88 Make error_message(error_code) return C string
- Also move all error message logic to include inline
2020-03-06 15:41:51 -08:00
John Keiser 9a7c8fb5be Use parse_many in examples/tests/docs 2020-03-05 12:04:45 -08:00
John Keiser b3ea8c406e Add simdjson.cpp for unified use (#515) 2020-03-04 10:12:27 -08:00
John Keiser 99667f7c55 Create top level simdjson.h (#515)
- Allows everyone to #include the same way, singleheader or not.
2020-03-04 10:12:27 -08:00
John Keiser 0b21203141 Document navigation API 2020-03-02 14:49:03 -08:00
John Keiser 910f272467
Add parser implementation interface and selection API (#501)
* Make architecture implementations virtual functions

- Easier to add new architectures (add implementation to implementation.cpp)
- Easier to add new algorithms / functions to architecture selection
(add to implementation.h, implement)
- Automatically select best implementation in static initialization
- Allow user to explicitly select implementation with a string (i.e.
parameter)
- Allow user to inspect current implementation name/description
- Allow user to list available implementations
- Eliminate architecture enum and architecture-based templating
- Add noexcept in non-inline functions

* Move implementation static methods to their own classes

* Detect best supported implementation on first use

* available_implementationsI() -> available_implementations
2020-02-21 16:34:27 -05:00
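The design described in #501 above (architecture implementations behind virtual functions, with selection of the best supported one or selection by name) can be sketched roughly as follows. All class and function names here are hypothetical, and CPU detection is replaced by a constructor flag; this is not the simdjson interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface: each architecture is one subclass.
class toy_implementation {
public:
  virtual ~toy_implementation() = default;
  virtual std::string name() const = 0;
  virtual bool supported() const = 0;  // stand-in for a runtime CPU check
};

// Always-available scalar implementation.
class toy_fallback final : public toy_implementation {
public:
  std::string name() const override { return "fallback"; }
  bool supported() const override { return true; }
};

// Pretend SIMD implementation; support is passed in instead of detected.
class toy_fast final : public toy_implementation {
public:
  explicit toy_fast(bool cpu_has_it) : cpu_has_it_(cpu_has_it) {}
  std::string name() const override { return "fast"; }
  bool supported() const override { return cpu_has_it_; }

private:
  bool cpu_has_it_;
};

using implementation_list = std::vector<const toy_implementation*>;

// The list is ordered best-first; pick the first one the CPU supports.
inline const toy_implementation* detect_best(const implementation_list& list) {
  for (const toy_implementation* impl : list) {
    if (impl->supported()) return impl;
  }
  return nullptr;
}

// Explicit selection by name, e.g. from a command-line parameter.
inline const toy_implementation* find_by_name(const implementation_list& list,
                                              const std::string& name) {
  for (const toy_implementation* impl : list) {
    if (impl->name() == name) return impl;
  }
  return nullptr;
}
```

Adding a new architecture in this sketch means adding one subclass and one list entry, which mirrors the "easier to add new architectures" point in the commit message.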
John Keiser da34f9a253 Add Google Benchmark for calling conventions
- disable it on ubuntu 18.04 tests, which fail for [really can't figure
out why]
2020-02-18 08:37:07 -08:00
John Keiser 1f76737510 Make valstat-ish parse APIs 2020-02-18 08:37:07 -08:00
John Keiser bc8bc7d1a8
Lowercase Architecture and ErrorValues (#487)
ErrorValues -> error_code, Architecture -> architecture
2020-02-14 15:21:28 -08:00
John Keiser 8e7d1a5f09
Separate document state from ParsedJson
This creates a "document" class with only user-facing document state (no parser internals).

- document: user-facing document state
- document::iterator: iterator (equivalent of ParsedJsonIterator)
- document::parser: parser state plus a "docked" document we parse into (equivalent of ParsedJson)

Usage:

```c++
auto doc = simdjson::document::parse(buf, len); // less efficient but simplest
```

```c++
simdjson::document::parser parser; // reusable parser
parser.allocate_capacity(len);
simdjson::document* doc = parser.parse(buf, len); // pointer to doc inside parser
doc = parser.parse(buf2, len); // reuses all buffers and overwrites doc; more efficient
```
2020-02-07 10:02:36 -08:00
Daniel Lemire 4518f1fba1 Some minor nitpicking. 2020-02-07 10:41:45 -05:00
Daniel Lemire 5c59b3a775 Fixing memory leaks. (Minor issue.) 2020-02-07 10:29:15 -05:00
Daniel Lemire 28710f8ad5
fix for Issue 467 (#469)
* Fix for issue467

* Updating single-header

* Let us make it so that JsonStream is constructed from a padded_string which will avoid dangerous overruns.

* Fixing parse_stream

* Updating documentation.
2020-01-29 19:00:18 -05:00
John Keiser 6978a0b8d4 Benchmark escapes (#464)
* Add escapes as a feature we benchmark

* Don't print effectiveness metric unless verbose is on
2020-01-27 09:58:14 -05:00
Daniel Lemire aea79912ec
Adding a "get_corpus" benchmark. (#456)
* Adding a "get_corpus" benchmark.

* Improving portability.
2020-01-20 17:27:25 -05:00
Daniel Lemire 80b4dd2e8a
Removing all stdout, stderr from main library. (#455)
* Removing all stdout,stderr from main library.
2020-01-20 16:03:15 -05:00
Daniel Lemire f87e64f988
Add option to make buffers hot and remove recent benchmarking changes (#443)
* This reverts the code back to how it was prior to the silly "run two stages" routine and instead
adds an option to benchmark the code over hot buffers. It turns out that it can be expensive,
when the files are large, to allocate the pages.
2020-01-15 19:48:00 -05:00
Daniel Lemire f97b655f02
Instead of emulating the whole parsing as stage 1 + stage 2, let us benchmark the real thing. (#441)
* Instead of emulating the whole parsing as stage 1 + stage 2, let us
benchmark the real thing.

* Adding explicit constructor.

* Adding warning to the benchmark user.

* Making re-running optional.
2020-01-11 10:14:22 -05:00
Daniel Lemire 7bde23590a
Debugging jsonstream (#432)
Fixes #424 (and provides tests for it), as well as #401
2020-01-03 22:22:47 -05:00
John Keiser 3b9e6bff3c Print stage 2 information in feature benchmarker 2020-01-02 17:23:21 -07:00
Paul Dreik 27293cc1c1
don't add integers to string literals (#410)
* string literal + integer means unintended and incorrect pointer arithmetic

fixes a clang warning. it could not be triggered, because it can only be
triggered if the string given to getopt is not covered among the
cases in the switch.

* handle review comment
2019-12-24 20:19:22 +01:00
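The pitfall behind #410 above is that adding an integer to a string literal performs pointer arithmetic instead of concatenation. A minimal sketch with hypothetical function names follows; the pointer is stored in a variable first so that the example itself does not trigger the clang warning.

```cpp
#include <cassert>
#include <string>

// What `"option" + 2` actually does: the literal decays to a const char*,
// and adding 2 moves the pointer two characters ahead.
inline std::string pointer_arithmetic_result() {
  const char* base = "option";
  const char* shifted = base + 2;  // same effect as "option" + 2
  return std::string(shifted);
}

// What was most likely intended: append the number as text.
inline std::string intended_concatenation(int value) {
  return std::string("option") + std::to_string(value);
}
```

Here `pointer_arithmetic_result()` yields `"tion"`, while `intended_concatenation(2)` yields `"option2"`. An offset past the end of the literal would be undefined behaviour, which is why the compiler flags the pattern.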
Daniel Lemire b2ebdb0d07
I think we can align the numbers better (so it is prettier). (#399)
* I think we can align the numbers better (so it is prettier).

* Remove space before %, align third line better

Co-authored-by: John Keiser <john@johnkeiser.com>
2019-12-20 19:58:49 -05:00
John Keiser 60916318f7 Show miss rate, make it more accurate 2019-12-18 14:38:25 -08:00
John Keiser e2f349e7bd Measure impact of utf-8 blocks and structurals per block directly 2019-12-17 11:41:13 -08:00
Daniel Lemire f32b97733b Updating the json minifier benchmark to match that of the new API. 2019-12-02 10:46:03 -05:00