Commit Graph

65 Commits

Author SHA1 Message Date
Daniel Lemire cae5e5342f
Additional ndjson tests. (#1717)
* Additional ndjson tests.

* Switching the data source.

* Fixing.
2021-09-18 16:29:10 -04:00
Paul Dreik d28e5534d9
ignore unused variable (#1714) 2021-09-12 17:37:58 -04:00
Paul Dreik d3f0e2afb3
[no ci] remove references to bintray (#1702)
* download fuzz corpus from www.pauldreik.se

* remove reference to bintray
2021-08-19 08:39:07 -04:00
strager d036fdf919
Reduce #include bloat (<iostream>) (#1697)
Including <iostream> has two problems:

* Compile times are worse because of over-inclusion
* Binary sizes are worse when statically linking libstdc++ because
  iostreams cannot be dead-code-stripped

simdjson only needs std::ostream. Include the header declaring only what
we need (<ostream>), omitting stuff we don't need (std::cout and its
initialization, for example).

This commit should not change behavior, but it might break users who
assume that including <simdjson/simdjson.h> will make std::cout
available (such as many of simdjson's own files).
2021-08-13 11:24:36 -04:00
Paul Dreik 90efd79055
Get fuzzing working again (#1646)
* upload corpus to https://www.pauldreik.se/ from the x64 github action job (keep the github action cache)
* drop the github action cache (which was not working anyway) for power fuzzer and download the fuzz corpus from https://www.pauldreik.se/ instead
* resurrect arm64 fuzzing on drone CI, downloading the fuzz corpus from https://www.pauldreik.se/
* update the fuzzing documentation
2021-07-05 09:42:57 +02:00
Daniel Lemire d539781cf3
This attempts to fix the fuzzers. (#1564)
* This attempts to fix the fuzzers.

* Retiring bintray.

* Disabling ARM fuzzing.
2021-05-07 22:59:26 -04:00
Dirk Stolle 2abcc35031
fix serveral typos (#1558)
* fix typos in markdown files

* fix typos in CMake files

* fix typos in headers and test code
2021-05-01 10:19:53 -04:00
friendlyanon 5ec85197f8
CMake refactor stage1 (#1512)
* Remove CMP0025 policy

This policy is already set to NEW by the minimum required version.

* Use HOMEPAGE_URL in the project call

* Use VERSION in the project call

* Detect if this is the top project

* Port simdjson-user-cmakecache to a CMake script

* Create a developer mode

The SIMDJSON_DEVELOPER_MODE option set to ON will enable targets that
are only useful for developers of simdjson.

* Consolidate root CML commands into logical sections

* Warn about intended use of developer mode

* Prettify the just_ascii test

* Remove redundant CMake variables

* Inline CML contents from include and src

* Raise minimum CMake requirement to 3.14

* Define proper install rules

* Restore thread support variable

* Add BUILD_SHARED_LIBS as a top level only option

* Force developer mode to be on in CI

* Include flags earlier in developer mode

* Set CMAKE_BUILD_TYPE conditionally

CMAKE_BUILD_TYPE is used only by single configuration generators and is
otherwise completely ignored.

* Remove useless static/shared options

simdjson now uses the CMake builtin BUILD_SHARED_LIBS to switch the
built artifact's type.

* Remove unused CMAKE_MODULE_PATH variable

* Refactor implementation switching into a module

* Factor exception option out into a module

* Reformat simdjson-flags.cmake

* Rename simdjson-flags to developer-options

* Accumulate properties into an include module

This is done this way to avoid using utility targets that must be
exported and installed, which could potentially be misused by users of
the library.

* Port impl definitions to props

* Port exception options to props

* Lift normal options to the top

* Port developer options to props

* Remove simdjson-flags from benchmark

* Document the developer mode in HACKING

* Fix include path in installed config file

* Fix formatting of prop commands

* Fix tests that include .cpp files

* Change GCC AVX fixes back to compile options

* Deprecate SIMDJSON_BUILD_STATIC

* Always link fuzz targets to simdjson

* Install CMake from simdjson's debian repo

* Add gnupg for apt-key

* Make sure ASan link flags come first

* Pass CI env variable to cmake invocation

* Install package for apt-add-repository

* Remove return() from flush macro

* Use directory level commands instead of props

* Restore the github repository variable

* Set developer mode unconditionally for checkperf

The CI env variable is only set in the CI and this target is always run
in developer mode.

* Attempt to fix ODR violation in parsing checks

These tests were compiling the simdjson.cpp file again and linking to
the simdjson library target causes ODR violations.

Instead of linking to the target, just inherit its props.

* Move variables before the source dir

* Mark props to be flushed after adding more

* Use props for every command for the library

* Use keyword form for linking libs

* Handle deprecation of SIMDJSON_JUST_LIBRARY

* Handle deprecations in a separate module

Co-authored-by: friendlyanon <friendlyanon@users.noreply.github.com>
2021-04-23 09:24:56 -04:00
Paul Dreik a79bbd63a3
use github action cache instead of bintray (#1536)
* use github action cache instead of bintray

* add note on where to get the corpus

Co-authored-by: Paul Dreik <paul@simdjson>
2021-04-12 16:58:30 -04:00
Daniel Lemire 9577c54999
Provide the CMake install the necessarily information (and flags) to hand Windows DLL and add Windows installation tests (#1457)
* This gives the CMake install the necessarily information (and flags) to know
whether we have a Windows DLL and in such cases how to handle the linkage.
2021-02-26 16:17:05 -05:00
Daniel Lemire 81609393f1
Fixing issue 1449. (#1451) 2021-02-21 16:33:05 -05:00
Daniel Lemire 610b3ad302
Adds Visual Studio 2017 to CI (for real) and adapt our build/tests (#1444) 2021-02-15 19:49:12 -05:00
Daniel Lemire 2a714f4e37
Hide the std::pair inheritance in our result instances (#1396)
* Fixing issue 1243

* The tie must go.

* Having std::pair be a protected inheritance breaks on demand.

* Putting it back.

* You really want to use emplace.

* Fixing one botched test.

* Prettier test.

* Using safer code.

* Fixing unsafe code.

* Simplifying the fuzzer.

* Trying another way.

* Ok. It should work without exceptions.

* Removing trailing spaces.
2021-01-18 12:00:02 -05:00
John Keiser 55faf4c5bc
Recommend simdjson::ondemand over simdjson::builtin::ondemand (#1380)
Co-authored-by: Daniel Lemire <lemire@gmail.com>
2021-01-14 17:33:49 -05:00
Daniel Lemire 85001c55fb
Fixing UTF-8 validation under PPC64 (#1346)
* Entering a new UTF-8 test

* Maybe *I* had a bug in the tests.

* Replacing nulls with 1s.

* Let us try to be more verbose.

* Return 0.

* Fixing issue.

* Adding puzzler scenario.

* Fixing PPC64

Co-authored-by: Daniel Lemire <dlemire@rcs-power9-talos>
2020-12-19 10:42:27 -05:00
Daniel Lemire c90ee57203
This might make the fuzzer error debuggable. (#1345) 2020-12-16 18:31:29 -05:00
Paul Dreik 5f7b2bac12
fuzz on power (#1326)
* first try

* use ubuntu 20.04, do the fuzzing

* new try at power fuzz

* hard code clang version

* setting env variables does not seem to work

* use fuzzer-no-link

* switch to Debian Buster for power fuzz

* use non-sanitizer build for power

* me not like yaml

* fix bad syntax
2020-12-07 18:12:36 -05:00
Paul Dreik 725ca010e7
add ndjson fuzzer (#1304)
* add ndjson fuzzer

* reproduce #1310 in the newly added unit test

Had to replace the input, because:
1)
the fuzzer uses the first part of the input to determine
the batch_size to use, so that has to be cut off

2)
the master now protects against low values of batch_size

I also made the test not return early, so the error is triggered.
2020-12-01 15:58:41 -05:00
Paul Dreik 0b39e3a6cf
add fuzzer for padded_string (#1312)
This also fixes an overflow problem.
2020-11-19 16:51:56 +01:00
Paul Dreik 5202d07a77
add script and instructions for minimizing and cleansing fuzz crashes (#1305) 2020-11-17 20:49:29 +01:00
Paul Dreik af4db55e66
remove trailing whitespace (#1284) 2020-11-03 21:48:09 +01:00
Paul Dreik f93fb21c95
optionally disable deprecated apis (#1271)
Introduce cmake option SIMDJSON_DISABLE_DEPRECATED_API (default Off)
which turns off deprecated simdjson api functions by setting the macro
 SIMDJSON_DISABLE_DEPRECATED_API.

For non-cmake users, users will have to set SIMDJSON_DISABLE_DEPRECATED_API
by some other means to disable the api.

Closes #1264
2020-11-01 06:38:52 +01:00
Paul Dreik b7fe764e6c
fuzz more of generic/numberparsing.h (#1272)
make the ondemand fuzzer use more of the api
2020-11-01 06:29:16 +01:00
Paul Dreik 500e4c3572
fuzz with the intended clang version (#1267)
This builds the CI fuzzers with the intended clang version. It also allows users to set the clang version locally,
in case they need to.

It also switches the CI fuzzers to use an optimized sanitizer build, to do something oss-fuzz doesn't and get more done in the short time the CI fuzzer runs.
2020-10-31 08:22:49 +01:00
Paul Dreik ac87437588
fuzz the on demand api (#1220) 2020-10-29 19:14:44 +01:00
Jeremy Ong 9856201f5c
Disable exceptions when compiling with MSVC (#1256)
Projects that link simdjson from MSVC with exceptions off will
include simdjson headers which transitively include STL headers.
The MSVC STL stipulates that _HAS_EXCEPTIONS=0 be defined or code
requiring exceptions will be enabled. This change adds a new job
to the appveyor build matrix to verify the build and tests with
exceptions disabled, and disables exceptions at the compiler level
when SIMDJSON_EXCEPTIONS is specified to OFF.
2020-10-26 14:11:25 -04:00
Paul Dreik f1b4a54991
add fuzz element (#1204)
* add definitions for is_number and tie (by lemire)
* add fuzzer for element
* update fuzz documentation
* fix UB in creating an empty padded string
* don't bother null terminating padded_string, it is done by the std::memset already
*  refactor fuzz data splitting into a separate class
2020-10-17 05:48:50 +02:00
Daniel Lemire bb2bc98a22
Fix issue https://github.com/simdjson/simdjson/issues/1127 (#1224) 2020-10-13 09:18:54 -04:00
Paul Dreik 8a68163905
simplify fuzzing only dynamically supported implementations (#1201)
This refactors the dynamic check of which implementations are supported at runtime.

It also reduces duplicated effort in the CI fuzzing job, the differential fuzzers don't need to run with different values of SIMDJSON_FORCE_IMPLEMENTATION.

There is also a convenience script to run the fuzzers locally, to quickly check that the fuzzers still build, run and no easy to find bugs are there. It should be handy not only when developing the fuzzers, but also when modifying simdjson.
2020-10-09 05:29:54 +02:00
Paul Dreik 1f98e64b71
fix merge conflict on master (#1217) 2020-10-07 10:05:31 +02:00
John Keiser 6d978c383a Kinder, gentler implementation selection
- Allow user to specify SIMDJSON_BUILTIN_IMPLEMENTATION
- Make cmake -DSIMDJSON_IMPLEMENTATION=haswell *only* specify haswell
- Move negative implementation selection to
-DSIMDJSON_EXCLUDE_IMPLEMENTATION
- Automatically select SIMDJSON_BUILTIN_IMPLEMENTATION if
SIMDJSON_IMPLEMENTATION is set
- Move implementation enablement mostly to implementation files
- Make implementation enablement and selection simpler and more robust
- Fix bug where programs linked against simdjson were not passed
SIMDJSON_XXX_IMPLEMENTATION or SIMDJSON_EXCEPTIONS
2020-10-06 11:29:45 -07:00
Daniel Lemire 9865bb6904
Make it possible to check that an implementation is supported at runtime (#1197)
* Make it possible to check that an implementation is supported at runtime.

* add CI fuzzing on arm 64 bit

This adds fuzzing on drone.io arm64

For some reason, leak detection had to be disabled. If it is enabled, the fuzzer falsely reports a crash at the end of fuzzing.

Closes: #1188

* Guarding the implementation accesses.

* Better doc.

* Updating cxxopts.

* Make it possible to check that an implementation is supported at runtime.

* Guarding the implementation accesses.

* Better doc.

* Updating cxxopts.

* We need to accomodate cxxopts

Co-authored-by: Paul Dreik <github@pauldreik.se>
2020-10-02 11:04:51 -04:00
Paul Dreik e06ddea784
add CI fuzzing on arm 64 bit
This adds fuzzing on drone.io arm64

For some reason, leak detection had to be disabled. If it is enabled, the fuzzer falsely reports a crash at the end of fuzzing.

Closes: #1188
2020-10-01 10:12:37 +02:00
Paul Dreik f1b0778f79
add utf8 fuzzer
This enables the utf8 fuzzer, now when #1187 is fixed
2020-09-27 21:11:13 +02:00
Paul Dreik f44386008c
add minifier fuzzers (#1172)
This adds a minifier fuzzer. There is also an utf-8 fuzzer, but it is disabled until  #1187 is fixed.

Run all fuzzers bug the utf-8 one in the github CI fuzz.
2020-09-26 14:25:00 +02:00
Paul Dreik 30b912fc81
fuzz at_pointer
This adds a fuzzer for at_pointer() which recently had a bug.

The #1142 bug had been found with this fuzzer

Also, it polishes the github action job:

    cross pollinate the fuzzer corpora (lets fuzzers reuse results from other fuzzers)
    use github action syntax instead of bash checks
    only run on push if on master
2020-09-16 21:17:43 +02:00
Paul Dreik 6ecbcc7c19
add multi implementation fuzzer (#1162)
This adds a fuzzer which parses the same input using all the available implementations (haswell, westmere, fallback on x64).

This should get the otherwise uncovered sourcefiles (mostly fallback) to show up in the fuzz coverage.
For instance, the fallback directory has only one line covered.
As of the 20200909 report, 1866 lines are covered out of 4478.

Also, it will detect if the implementations behave differently:

    by making sure they all succeed, or all error
    turning the parsed data into text again, should produce equal results

While at it, I corrected some minor things:

    clean up building too many variants, run with forced implementation (closes #815 )
    always store crashes as artefacts, good in case the fuzzer finds something
    return value of the fuzzer function should always be 0
    reduce log spam
    introduce max size for the seed corpus and the CI fuzzer
2020-09-11 23:46:22 +02:00
Daniel Lemire 8a8eea53a2
Prefixing macros (issue 1035) (#1124)
* Renaming partially done.

* More prefixing.

* I thought that this was fixed.

* Missed one.

* Missed a few.

* Missed another one.

* Minor fixes.
2020-08-18 18:25:36 -04:00
John Keiser a7fc7d4ffb Switch from get(v,e) to e = get(v) 2020-06-20 17:57:09 -07:00
Daniel Lemire c009e4a57d
Fuzzing should not require loading lots of complicated dependencies (#879)
* I don't think we need google benchmark as part of the fuzzer

* I don't think we should load the "competition" for fuzzing.
2020-05-13 08:30:09 -04:00
Furkan Usta 064eb0b24f CMake: Make simdjson-internal-flags subsume simdjson-flags 2020-05-03 02:48:29 +03:00
Furkan Usta 940c8fcb5e Fix: simdjson-private-flags to simdjson-internal-flags 2020-05-02 05:23:53 +03:00
Furkan Usta 71e0148eb4 CMake: Link simdjson-fuzzer to simdjson-source as before
simdjson target adds extra definitions which MSVC doesn't like
2020-05-02 05:21:57 +03:00
Furkan Usta 293c104cc4 CMake: Separate public and private compilation flags
simdjson-internal-flags for macros and warnings
simdjson-flags for pthread, sanitizer, and libcpp
2020-05-02 04:08:47 +03:00
John Keiser 6ac47734c0
Only build fuzzers when fuzzing. (#822) 2020-04-27 16:02:19 -04:00
John Keiser db314bc381 Make fuzzing work without exceptions 2020-04-21 17:33:47 -07:00
John Keiser 09cf18a646 Add C++11 tests to cmake
- Add simdjson-flags target so callers don't have flags forced on them
2020-04-15 17:26:25 -07:00
Daniel Lemire 6d7c77ddc1
Let us try to check with the exceptions disabled. (#707)
* Tweaking code so that we can run all tests with exceptions off.
* Removing SIMDJSON_DISABLE_EXCEPTIONS
2020-04-15 16:45:36 -04:00
Paul Dreik 92c34f7f38
do not use deprecated apis in the fuzzers (#705)
* move from deprecated interface in fuzz dump raw tape

* update fuzz_dump to the non deprecated replacement

* replace use of deprecated api

* hopefully fix windows build
2020-04-14 07:45:30 +02:00
Paul Dreik 93328c8d6d
improve the fuzzer documentation, refresh links (#697) 2020-04-12 17:49:40 -04:00