374 Commits

Author SHA1 Message Date
strager d036fdf919
Reduce #include bloat (<iostream>) (#1697)
Including <iostream> has two problems:

* Compile times are worse because of over-inclusion
* Binary sizes are worse when statically linking libstdc++ because
  iostreams cannot be dead-code-stripped

simdjson only needs std::ostream. Include the header declaring only what
we need (<ostream>), omitting stuff we don't need (std::cout and its
initialization, for example).

This commit should not change behavior, but it might break users who
assume that including <simdjson/simdjson.h> will make std::cout
available (such as many of simdjson's own files).
2021-08-13 11:24:36 -04:00
Daniel Lemire eb93b98d6a
verify and fix issue 1668 (#1673)
* Adding test.

* Verifies and fixes issue 1668. This commit updates the previous behavior of the
On Demand stream support by returning a value type (document_reference) instead
of a reference to a document. This allows us to bridge with the usual simdjson
error system, with its simdjson_result types.

* Minor reformat.

* Adds a test with initial tests passing.

* Adding an example.
2021-07-27 08:51:07 -04:00
Nicolas Boyer 5c590b8434
Bringing ndjson(document_stream) to On Demand (#1643)
* Update basic.md to document JSON pointer for On Demand.

* Add automatic rewind for at_pointer

* Remove DOM examples in basics.md and update documentation reflecting addition of at_pointer automatic rewinding.

* Review

* Add test

* Add document_stream constructors and iterate_many

* Attempt to implement streaming.

* Kind of fixed next() for getting next document

* Temporary save.

* Putting in working order.

* Add working doc_index and add function next_document()

* Attempt to implement streaming.

* Re-anchoring json_iterator after a call to stage 1

* I am convinced it should be a 'while'.

* Add source() with test.

* Add truncated_bytes().

* Fix casting issues.

* Fix old style cast.

* Fix privacy issue.

* Fix privacy issues.

* Again

* .

* Add more tests. Add error() for iterator class.

* Fix source() to not include whitespace between documents.

* Fixing CI.

* Fix source() for multiple batches. Add new tests.

* Fix batch_start when document has leading spaces. Add new tests for that.

* Add new tests.

* Temporary save.

* Working hacky multithread version.

* Small fix in header files.

* Correct version (not working).

* Adding a move assignment to ondemand::parser.

* Fix attempt by changing std::swap.

* Moving DEFAULT_BATCH_SIZE and MINIMAL_BATCH_SIZE.

* Update doc and readme tests.

* Update basics.md

* Update readme_examples tests.

* Fix exceptions in test.

* Partial setup for amazon_cellphones.

* Benchmark with vectors.

* Benchmark with maps

* With vectors again.

* Fix for weighted average.

* DOM benchmark.

* Fix typos. Add On Demand benchmark.

* Add large amazon_cellphones benchmark for DOM

* Add benchmark for On demand.

* Fix broken read_me test.

* Add parser.threaded to enable/disable thread usage.

Co-authored-by: Daniel Lemire <lemire@gmail.com>
2021-07-20 14:17:23 -04:00
Daniel Lemire 774999ee95 Adds some benchmarks for the minifier. 2021-07-16 11:54:55 -04:00
Daniel Lemire b085b56e32
This solves a minor issue with our legacy benchmark tools. (#1653)
* This solves a minor issue with our legacy benchmark tools.

* Slightly better code.

* Removing bad typo.
2021-07-13 09:18:58 -04:00
Nicolas Boyer 03f7396d50
Fix branches. (#1619) 2021-06-17 18:31:40 -04:00
Nicolas Boyer d90714e8df
Add RapidJSON and nlohmann_json SAX to partial_tweets benchmark (#1597)
* Add first working version of rapidjson_sax for partial tweets.

* Add cleaner and faster rapidjson_sax

* Add nlohmann_json_sax.

* Replace array of bool by bitsets.

* Replace strdup to copy string in rapidjson_sax.

* Change std::string_view assignment in rapidjson_sax.
2021-06-03 16:41:20 -04:00
Nicolas Boyer c7fd7353a8
Add RapidJSON and nlohmann_json SAX to top_tweet benchmark (#1599)
* Add rapidjson_sax.h and fix typo in rapidjson.h

* Add nlohmann_json_sax.h and add user key check for screen_name in rapidjson_sax

* Change std::string_view assignment for text and screen_name.
2021-06-03 16:41:00 -04:00
Nicolas Boyer 05f15d88b6
Add large_random/rapidjson_sax.h and large_random/nlohmann_json_sax.h. Clean up kostya/rapidjson_sax.h (add flags also) and kostya/nlohmann_json_sax.h (#1600) 2021-06-03 16:40:39 -04:00
Nicolas Boyer d7d81c7152
Add RapidJSON and nlohmann_json SAX to find_tweet benchmark (#1598)
* Add rapidjson_sax.h .

* Add nlohmann_json_sax.h . Fix typos distinct_user_id/nlohmann_json_sax.h, find_tweet/rapidjson.h and find_tweet/rapidjson_sax.h .

* Add extra check for id key when looking for find_id.
2021-06-03 12:43:54 -04:00
Nicolas Boyer 73b510225f
Add RapidJSON and nlohmann_json SAX to distinct_user_id benchmark (#1593)
* Add rapidjson_sax for distinct_user_id

* Add nlohmann_json_sax.h for distinct_user_id

* Add flags for RapidJSON.

* Fix revisions.

* Fix revisions again.

* Replace strcpy with memcpy. Increase performance fix.
2021-06-01 14:51:27 -04:00
Nicolas Boyer 369f66be35
Add RapidJSON and nlohmann_json SAX to kostya benchmark (#1592)
* Add RapidJSON and nlohmann_json SAX to kostya benchmark

* Remove trailing whitespaces

* Fix typo
2021-05-31 10:15:50 -04:00
Daniel Lemire af5c8175b4
By default, we should not do the DOM checkperf… (#1571)
* By default, we should not do the DOM checkperf. These targets assume that main branch remains
compatible, an assumption that will break over time.
2021-05-15 15:28:59 -04:00
Daniel Lemire 729c35c0f8 Removes docker file which is unused and untested, and updates the path to dom/parse. 2021-05-01 10:31:00 -04:00
Daniel Lemire c1dffac28c
This moves all DOM (benchmark + test) files to a subdir (#1549)
* This moves all DOM (benchmark + test) files to a subdir

* Missing file.

* CMake + DLL is not pretty.

* Capitalizing AND

* Fixing mismatch endif

* Flipping the order.

* onedemand => ondemand
2021-04-30 18:33:45 -04:00
Daniel Lemire 911b06186b
Delete Dockerfile 2021-04-26 09:08:34 -04:00
D. Stolle be9d5d4e31
adjust GitHub links to current repository URL (#1553)
Switch links (mostly in comments) from old repository URL
<https://github.com/lemire/simdjson/> to the current URL
<https://github.com/simdjson/simdjson/>.
2021-04-26 09:08:14 -04:00
friendlyanon 5ec85197f8
CMake refactor stage1 (#1512)
* Remove CMP0025 policy

This policy is already set to NEW by the minimum required version.

* Use HOMEPAGE_URL in the project call

* Use VERSION in the project call

* Detect if this is the top project

* Port simdjson-user-cmakecache to a CMake script

* Create a developer mode

The SIMDJSON_DEVELOPER_MODE option set to ON will enable targets that
are only useful for developers of simdjson.

* Consolidate root CML commands into logical sections

* Warn about intended use of developer mode

* Prettify the just_ascii test

* Remove redundant CMake variables

* Inline CML contents from include and src

* Raise minimum CMake requirement to 3.14

* Define proper install rules

* Restore thread support variable

* Add BUILD_SHARED_LIBS as a top level only option

* Force developer mode to be on in CI

* Include flags earlier in developer mode

* Set CMAKE_BUILD_TYPE conditionally

CMAKE_BUILD_TYPE is used only by single configuration generators and is
otherwise completely ignored.

* Remove useless static/shared options

simdjson now uses the CMake builtin BUILD_SHARED_LIBS to switch the
built artifact's type.

* Remove unused CMAKE_MODULE_PATH variable

* Refactor implementation switching into a module

* Factor exception option out into a module

* Reformat simdjson-flags.cmake

* Rename simdjson-flags to developer-options

* Accumulate properties into an include module

This is done this way to avoid using utility targets that must be
exported and installed, which could potentially be misused by users of
the library.

* Port impl definitions to props

* Port exception options to props

* Lift normal options to the top

* Port developer options to props

* Remove simdjson-flags from benchmark

* Document the developer mode in HACKING

* Fix include path in installed config file

* Fix formatting of prop commands

* Fix tests that include .cpp files

* Change GCC AVX fixes back to compile options

* Deprecate SIMDJSON_BUILD_STATIC

* Always link fuzz targets to simdjson

* Install CMake from simdjson's debian repo

* Add gnupg for apt-key

* Make sure ASan link flags come first

* Pass CI env variable to cmake invocation

* Install package for apt-add-repository

* Remove return() from flush macro

* Use directory level commands instead of props

* Restore the github repository variable

* Set developer mode unconditionally for checkperf

The CI env variable is only set in the CI and this target is always run
in developer mode.

* Attempt to fix ODR violation in parsing checks

These tests were compiling the simdjson.cpp file again and linking to
the simdjson library target causes ODR violations.

Instead of linking to the target, just inherit its props.

* Move variables before the source dir

* Mark props to be flushed after adding more

* Use props for every command for the library

* Use keyword form for linking libs

* Handle deprecation of SIMDJSON_JUST_LIBRARY

* Handle deprecations in a separate module

Co-authored-by: friendlyanon <friendlyanon@users.noreply.github.com>
2021-04-23 09:24:56 -04:00
Daniel Lemire 8eed8f5155
Document stream: truncate final unfinished document and give access to the number of truncated bytes. (#1534)
* Truncate final unclosed string.

* Adding more precise remarks.

* Better documentation and more robust code.

* ARM + PPC corrections.

* Patching ARM implementation with new stage1_mode parameter.

* Fixed most problems.

* Correcting white spaces and adding a remark.

* This adds the truncated_bytes() method to the stream instances.
2021-04-23 09:24:00 -04:00
John Keiser 94563328c4 Make ctest succeed after running make all_tests 2021-03-20 14:01:52 -07:00
Daniel Lemire 0a5bba7235
Provides a more correct simdjson::ondemand implementation message. (#1492) 2021-03-09 11:39:19 -05:00
Daniel Lemire 9577c54999
Provide the CMake install with the necessary information (and flags) to handle Windows DLL and add Windows installation tests (#1457)
* This gives the CMake install the necessary information (and flags) to know
whether we have a Windows DLL and in such cases how to handle the linkage.
2021-02-26 16:17:05 -05:00
Daniel Lemire 81609393f1
Fixing issue 1449. (#1451) 2021-02-21 16:33:05 -05:00
Daniel Lemire 610b3ad302
Adds Visual Studio 2017 to CI (for real) and adapt our build/tests (#1444) 2021-02-15 19:49:12 -05:00
Daniel Lemire d6f33e4830
This adds a little test to see if we can compile with very strict flags (conventional casts) (#1417)
* This adds a little test to see if we can compile with very strict flags.

* Trimming a leftover old-style cast.

* More cleaning.

* A few more pedantic casts.
2021-01-27 18:37:30 -05:00
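As a rough illustration of the kind of change the commit above describes, the sketch below replaces C-style casts with named C++ casts. It assumes the strict flags include something like GCC's or Clang's `-Wold-style-cast`; the commit message does not name the exact flags. The helper names `low_byte` and `as_bytes` are hypothetical and not taken from simdjson.

```cpp
#include <cstdint>

// Hypothetical helper (not from simdjson).
// A C-style cast such as `(std::uint8_t)value` would be reported by
// -Wold-style-cast; static_cast states the narrowing conversion explicitly.
inline std::uint8_t low_byte(std::uint32_t value) {
  return static_cast<std::uint8_t>(value & 0xFFu);
}

// Hypothetical helper (not from simdjson).
// Viewing a character buffer as raw bytes needs reinterpret_cast,
// which makes the pointer reinterpretation visible at the call site.
inline const std::uint8_t *as_bytes(const char *text) {
  return reinterpret_cast<const std::uint8_t *>(text);
}
```

Named casts are easier to search for and review than C-style casts, which may be why a build with very strict flags is useful as a test.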
Daniel Lemire 2a714f4e37
Hide the std::pair inheritance in our result instances (#1396)
* Fixing issue 1243

* The tie must go.

* Having std::pair be a protected inheritance breaks on demand.

* Putting it back.

* You really want to use emplace.

* Fixing one botched test.

* Prettier test.

* Using safer code.

* Fixing unsafe code.

* Simplifying the fuzzer.

* Trying another way.

* Ok. It should work without exceptions.

* Removing trailing spaces.
2021-01-18 12:00:02 -05:00
Daniel Lemire 6e5d232ccc Fixing forgotten namespace. 2021-01-16 02:50:55 +00:00
Daniel Lemire 8e96e38099
We should trim out old benchmarks. (#1379) 2021-01-15 14:46:05 -05:00
John Keiser 55faf4c5bc
Recommend simdjson::ondemand over simdjson::builtin::ondemand (#1380)
Co-authored-by: Daniel Lemire <lemire@gmail.com>
2021-01-14 17:33:49 -05:00
John Keiser be61650102 Add top_tweet benchmark to test laziness 2021-01-11 15:19:26 -08:00
John Keiser 66db102c70 Use imprecise double comparison for sajson 2021-01-11 15:12:12 -08:00
John Keiser ab859f7952 Add nlohmann_json benchmarks 2021-01-11 15:12:12 -08:00
John Keiser 6367e55a5f Use new double differ in kostya/large_random benchmarks 2021-01-11 15:12:12 -08:00
Daniel Lemire b61f2799a8 This makes the float errors explicit. 2021-01-11 15:12:12 -08:00
John Keiser 1b4d3bcbb6 Add sajson benchmarks 2021-01-11 15:12:12 -08:00
John Keiser cd27bf0745 Add yyjson_insitu tests 2021-01-05 12:16:19 -08:00
John Keiser 62ded15cd8 Rename tweets/text/points -> result 2021-01-05 11:55:57 -08:00
John Keiser bc6907d280 Handle in situ document copies outside of the loop 2021-01-05 11:52:05 -08:00
John Keiser dcd2e13aec Measure time more accurately 2021-01-05 10:45:49 -08:00
John Keiser 2d760e75dc Remove public: from structs 2021-01-05 09:10:22 -08:00
John Keiser f071a15591 Add insitu versions of rapidjson benchmark 2021-01-04 20:30:54 -08:00
John Keiser 6a595231b0 Get rid of templates from rapidjson benchmarks 2021-01-04 20:20:24 -08:00
John Keiser 065ea00066 Fix kostya<yyjson> issue 2021-01-04 20:03:22 -08:00
John Keiser 680cd6df34 Add usage benchmarks for rapidjson 2021-01-04 20:03:21 -08:00
John Keiser 5583a3c89b Add error handling to yyjson 2021-01-04 13:05:37 -08:00
John Keiser 25d1c7e622 Fix yyjson double reading 2021-01-04 12:37:25 -08:00
John Keiser 5add8ac255 Rearrange benchmarks to be easier to create 2021-01-04 12:33:41 -08:00
John Keiser 3af54a9978 Add Yyjson benchmarks 2021-01-01 23:04:19 -08:00
John Keiser 1dc4e9a84c Create custom result printers to show actual differences 2021-01-01 22:03:44 -08:00
John Keiser 9af41dd988 Add PartialTweets<Yyjson> benchmark 2021-01-01 22:03:38 -08:00