Commit Graph

181 Commits

Author SHA1 Message Date
Daniel Lemire 21dce6cca9
Displaying the numbers of documents parsed per second (#652)
* Some users are interested, as a metric, in the number of documents parsed per second.
Obviously, this means reusing the same parser again and again.

* Adding a sentence

* This update the parsingcompetition benchmark so that it displays the number of documents parsed per second.
2020-03-30 17:51:03 -04:00
John Keiser d93af1161d Remove set_capacity, replace with allocate
Makes allocation point more predictable
2020-03-30 13:49:54 -07:00
John Keiser 434776db1a Deprecate more things 2020-03-30 13:48:43 -07:00
John Keiser 622d9c9480 Replace as_X and is_X with get<T> and is<T> 2020-03-28 15:29:53 -07:00
John Keiser 03746b966b Move document/element/etc. under dom 2020-03-28 13:42:21 -07:00
Daniel Lemire 450e19858b Minor fix to distinctuseridcompetition 2020-03-28 15:56:10 -04:00
John Keiser e836c28008 Deprecate parser error code methods
- Also make competitions compile without warnings
2020-03-28 10:13:20 -07:00
John Keiser 5ad405006c Return document::element from parse, load, parse_many, load_many 2020-03-27 12:24:41 -07:00
John Keiser 2e420169c3 Remove document::parse and document::load 2020-03-26 10:13:09 -07:00
Daniel Lemire ab0e22a316
Trying to migrate distinctuseridcompetition to new API. (#624)
* Trying to migrate distinctuseridcompetition to new API.

* Ok. Good performance + got rid of old API.
2020-03-26 12:06:28 -04:00
John Keiser a0bce440a6 Remove document_iterator, document::iterator, ParsedJsonIterator
Keep ParsedJson::Iterator only, without template, in same form as
it was in 0.2
2020-03-25 18:26:51 -07:00
Daniel Lemire 1cf4fe405d
Fixing issue 602 (#621) 2020-03-25 21:06:20 -04:00
Daniel Lemire 6b8f5d3354
Fixing issue 601 (#618)
* Fixing issue 601
2020-03-25 20:44:55 -04:00
John Keiser d5af359365
Fix compile error in master (#619) 2020-03-25 20:11:23 -04:00
Daniel Lemire d84e70b6e5
migrating minifier competition to new API (#597)
* Migrating minifiercompetition to new API.
2020-03-24 10:13:55 -04:00
Daniel Lemire 7ff034504d
Migrating parsingcompetition to new API. (#593)
* Migrating parsingcompetition to new API.

* Removing ParsedJson
2020-03-24 10:06:44 -04:00
Daniel Lemire 5d1e3efce8
faster minifier (#568)
* Fallback should use our scalar code.
* parse should have a nicer error message.
* Making it so that "minify" can use different architectures.
* Let us change the minifier competition so that it tests all implementations.
* Documenting the untaken optimization opportunity.

Co-authored-by: John Keiser <john@johnkeiser.com>
2020-03-20 16:14:47 -04:00
Daniel Lemire 6cefeb338b
std::tie does not work on some compilers (#567)
* std::tie workaround.

* Cleaner solution
2020-03-19 16:56:45 -04:00
John Keiser af203aaf86 Add fallback parser for pre-SSE4.2 machines 2020-03-17 14:59:47 -07:00
John Keiser 8e2c06cb0e Compile with -fno-exceptions 2020-03-17 13:54:37 -07:00
John Keiser 1a5d8f1957 Add tests for SIMDJSON_EXCEPTIONS=0, add `tie()` support 2020-03-17 13:54:37 -07:00
Daniel Lemire 317fc6ba0e
accurate number parsing (#558) 2020-03-15 22:30:21 -04:00
John Keiser 0c190b165c Benchmark minify 2020-03-13 18:59:15 -07:00
John Keiser e4e89fe27a
Fix parse benchmarker (#554)
* Fix parse benchmarker

* Make CI fail when parse doesn't work
2020-03-13 16:19:21 -04:00
Daniel Lemire fb15886a1c Simple fix for name erasure. 2020-03-13 14:41:19 -04:00
Daniel Lemire 12c85d3e23
If we are going to have a google benchmark flag, we better make sure … (#551)
* If we are going to have a google benchmark flag, we better make sure that we test it out minimal (it should build).

* Fix bench_dom_api

Co-authored-by: John Keiser <john@johnkeiser.com>
2020-03-12 17:48:30 -04:00
John Keiser a5afec1f94 Make #defines into simdjson::constants 2020-03-11 19:16:29 -07:00
John Keiser 40c6213d7e Add parser.load() and load_many() to load files 2020-03-11 17:19:41 -07:00
John Keiser d140bc23f5 Automatically allocate memory as needed in parse 2020-03-11 16:14:54 -07:00
John Keiser 00f0859e1f Add ability to run multiple files 2020-03-11 16:05:05 -07:00
John Keiser 66a2807210 Rename invalid_json to simdjson_error 2020-03-06 16:12:51 -08:00
John Keiser 3bdfe167de Support cout << error 2020-03-06 15:41:51 -08:00
John Keiser 31e8a12e88 Make error_message(error_code) return C string
- Also move all error message logic to include inline
2020-03-06 15:41:51 -08:00
John Keiser 9a7c8fb5be Use parse_many in examples/tests/docs 2020-03-05 12:04:45 -08:00
John Keiser b3ea8c406e Add simdjson.cpp for unified use (#515) 2020-03-04 10:12:27 -08:00
John Keiser 99667f7c55 Create top level simdjson.h (#515)
- Allows everyone to #include the same way, singleheader or not.
2020-03-04 10:12:27 -08:00
John Keiser 0b21203141 Document navigation API 2020-03-02 14:49:03 -08:00
John Keiser 910f272467
Add parser implementation interface and selection API (#501)
* Make architecture implementations virtual functions

- Easier to add new architectures (add implementation to implementation.cpp)
- Easier to add new algorithms / functions to architecture selection
(add to implementation.h, implement)
- Automatically select best implementation in static initialization
- Allow user to explicitly select implementation with a string (i.e.
parameter)
- Allow user to inspect current implementation name/description
- Allow user to list available implementations
- Eliminate architecture enum and architecture-based templating
- Add noexcept in non-inline functions

* Move implementation static methods to their own classes

* Detect best supported implementation on first use

* available_implementationsI() -> available_implementations
2020-02-21 16:34:27 -05:00
John Keiser da34f9a253 Add Google Benchmark for calling conventions
- disable it on ubuntu 18.04 tests, which fail for [really can't figure
out why]
2020-02-18 08:37:07 -08:00
John Keiser 1f76737510 Make valstat-ish parse APIs 2020-02-18 08:37:07 -08:00
John Keiser bc8bc7d1a8
Lowercase Architecture and ErrorValues (#487)
ErrorValues -> error_code, Architecture -> architecture
2020-02-14 15:21:28 -08:00
John Keiser 8e7d1a5f09
Separate document state from ParsedJson
This creates a "document" class with only user-facing document state (no parser internals).

- document: user-facing document state
- document::iterator: iterator (equivalent of ParsedJsonIterator)
- document::parser: parser state plus a "docked" document we parse into (equivalent of ParsedJson)

Usage:

```c++
auto doc = simdjson::document::parse(buf, len); // less efficient but simplest
```

```c++
simdjson::document::parser parser; // reusable parser
parser.allocate_capacity(len);
simdjson::document* doc = parser.parse(buf, len); // pointer to doc inside parser
doc = parser.parse(buf2, len); // reuses all buffers and overwrites doc; more efficient
```
2020-02-07 10:02:36 -08:00
Daniel Lemire 4518f1fba1 Some minor nitpicking. 2020-02-07 10:41:45 -05:00
Daniel Lemire 5c59b3a775 Fixing memory leaks. (Minor issue.) 2020-02-07 10:29:15 -05:00
Daniel Lemire 28710f8ad5
fix for Issue 467 (#469)
* Fix for issue467

* Updating single-header

* Let us make it so that JsonStream is constructed from a padded_string which will avoid dangerous overruns.

* Fixing parse_stream

* Updating documentation.
2020-01-29 19:00:18 -05:00
John Keiser 6978a0b8d4 Benchmark escapes (#464)
* Add escapes as a feature we benchmark

* Don't print effectiveness metric unless verbose is on
2020-01-27 09:58:14 -05:00
Daniel Lemire aea79912ec
Adding a "get_corpus" benchmark. (#456)
* Adding a "get_corpus" benchmark.

* Improving portability.
2020-01-20 17:27:25 -05:00
Daniel Lemire 80b4dd2e8a
Removing all stdout, stderr from main library. (#455)
* Removing all stdout,stderr from main library.
2020-01-20 16:03:15 -05:00
Daniel Lemire f87e64f988
Add option to make buffers hot and remove recent benchmarking changes (#443)
* This revert the code back to how it was prior to the silly "run two stages" routine and instead
adds an option to benchmark the code over hot buffers. It turns out that it can be expensive,
when the files are large, to allocate the pages.
2020-01-15 19:48:00 -05:00
Daniel Lemire f97b655f02
Instead of emulating the whole parsing as stage 1 + stage 2, let us benchmark the real thing. (#441)
* Instead of emulating the whole parsing as stage 1 + stage 2, let us
benchmark the real thing.

* Adding explicit constructor.

* Adding warning to the benchmark user.

* Making re-running optional.
2020-01-11 10:14:22 -05:00