name: Fuzz and run valgrind

on:
  push:
    branches:
      - master
  pull_request:
    branches:
      - master
  schedule:
    - cron: 23 */8 * * *

jobs:
  build:
    if: >-
      ! contains(toJSON(github.event.commits.*.message), '[skip ci]') &&
      ! contains(toJSON(github.event.commits.*.message), '[skip github]')
    runs-on: ubuntu-latest
    env:
      # fuzzers that change behaviour with SIMDJSON_FORCE_IMPLEMENTATION
      defaultimplfuzzers: atpointer dump dump_raw_tape element minify parser print_json
      # fuzzers that loop over the implementations themselves, or don't need to switch.
      implfuzzers: implementations minifyimpl ndjson ondemand padded utf8
      # add multi implementation fuzzer (#1162)
      # This adds a fuzzer which parses the same input using all the available implementations (haswell, westmere, fallback on x64).
      # This should get the otherwise uncovered source files (mostly fallback) to show up in the fuzz coverage.
      # For instance, the fallback directory has only one line covered.
      # As of the 20200909 report, 1866 lines are covered out of 4478.
      # Also, it will detect if the implementations behave differently:
      #   * by making sure they all succeed, or all error
      #   * turning the parsed data into text again should produce equal results
      # While at it, I corrected some minor things:
      #   * clean up building too many variants, run with forced implementation (closes #815)
      #   * always store crashes as artefacts, good in case the fuzzer finds something
      #   * return value of the fuzzer function should always be 0
      #   * reduce log spam
      #   * introduce max size for the seed corpus and the CI fuzzer
      # 2020-09-12 05:46:22 +08:00
      implementations: haswell westmere fallback
      UBSAN_OPTIONS: halt_on_error=1
      MAXLEN: -max_len=4000
      CLANGVERSION: 11
      # which optimization level to use for the sanitizer build (see build_fuzzer_variants.sh)
      OPTLEVEL: -O3

    steps:
      - name: Install packages necessary for building
        run: |
          sudo apt update
          sudo apt-get install --quiet ninja-build valgrind zip unzip
          wget https://apt.llvm.org/llvm.sh
          chmod +x llvm.sh
          sudo ./llvm.sh $CLANGVERSION

      - uses: actions/checkout@v1

      - uses: actions/cache@v2
        with:
          path: dependencies/.cache
          key: ${{ hashFiles('dependencies/CMakeLists.txt') }}

      - name: Create and prepare the initial seed corpus
        run: |
          fuzz/build_corpus.sh
          mv corpus.zip seed_corpus.zip
          mkdir seedcorpus
          unzip -q -d seedcorpus seed_corpus.zip

      - name: Download the corpus from the last run
        run: |
          wget --quiet https://dl.bintray.com/pauldreik/simdjson-fuzz-corpus/corpus/corpus.tar
          tar xf corpus.tar
          rm corpus.tar

      - name: List clang versions
        run: |
          ls /usr/bin/clang*
          which clang++
          clang++ --version

      - name: Build all the variants
        run: CLANGSUFFIX=-$CLANGVERSION fuzz/build_fuzzer_variants.sh

      - name: Explore fast (release build, default implementation)
        run: |
          set -eux
          for fuzzer in $defaultimplfuzzers $implfuzzers; do
            mkdir -p out/$fuzzer # in case this is a new fuzzer, or corpus.tar is broken
            # get input from everyone else (corpus cross pollination)
            others=$(find out -type d -not -name $fuzzer -not -name out -not -name cmin)
            build-fast/fuzz/fuzz_$fuzzer out/$fuzzer $others seedcorpus -max_total_time=30 $MAXLEN
          done

      - name: Fuzz default impl. fuzzers with sanitizer+asserts (good at detecting errors)
        run: |
          set -eux
          for fuzzer in $defaultimplfuzzers; do
            # get input from everyone else (corpus cross pollination)
            others=$(find out -type d -not -name $fuzzer -not -name out -not -name cmin)
            for implementation in $implementations; do
              export SIMDJSON_FORCE_IMPLEMENTATION=$implementation
              build-sanitizers$OPTLEVEL/fuzz/fuzz_$fuzzer out/$fuzzer $others seedcorpus -max_total_time=20 $MAXLEN
            done
            echo now have $(ls out/$fuzzer |wc -l) files in corpus
          done

      - name: Fuzz differential impl. fuzzers with sanitizer+asserts (good at detecting errors)
        run: |
          set -eux
          for fuzzer in $implfuzzers; do
            # get input from everyone else (corpus cross pollination)
            others=$(find out -type d -not -name $fuzzer -not -name out -not -name cmin)
            build-sanitizers$OPTLEVEL/fuzz/fuzz_$fuzzer out/$fuzzer $others seedcorpus -max_total_time=20 $MAXLEN
            echo now have $(ls out/$fuzzer |wc -l) files in corpus
          done

      - name: Minimize the corpus with the fast fuzzer on the default implementation
        run: |
          set -eux
          for fuzzer in $defaultimplfuzzers $implfuzzers; do
            mkdir -p out/cmin/$fuzzer
            # get input from everyone else (corpus cross pollination)
            others=$(find out -type d -not -name $fuzzer -not -name out -not -name cmin)
            build-fast/fuzz/fuzz_$fuzzer -merge=1 $MAXLEN out/cmin/$fuzzer out/$fuzzer $others seedcorpus
            rm -rf out/$fuzzer
            mv out/cmin/$fuzzer out/$fuzzer
          done

      - name: Package the corpus into an artifact
        run: |
          for fuzzer in $defaultimplfuzzers $implfuzzers; do
            tar rf corpus.tar out/$fuzzer
          done

      - name: Save the corpus as a github artifact
        uses: actions/upload-artifact@v2
        with:
          name: corpus
          path: corpus.tar

      # This takes a subset of the minimized corpus and runs it through valgrind. It is slow,
      # therefore take a "random" subset. The random selection is accomplished by sorting on filenames,
      # which are hashes of the content.
      - name: Run some of the minimized corpus through valgrind (replay build, default implementation)
        run: |
          for fuzzer in $defaultimplfuzzers $implfuzzers; do
            find out/$fuzzer -type f |sort|head -n200|xargs -n40 valgrind build-replay/fuzz/fuzz_$fuzzer 2>&1|tee valgrind-$fuzzer.txt
          done

      - name: Compress the valgrind output
        run: tar cf valgrind.tar valgrind-*.txt

      - name: Save valgrind output as a github artifact
        uses: actions/upload-artifact@v2
        if: always()
        with:
          name: valgrindresults
          path: valgrind.tar
          if-no-files-found: ignore

      - name: Upload the corpus and results to bintray if we are on master
        if: ${{ github.event_name == 'schedule' }}
        run: |
          echo uploading each artifact twice, otherwise it will not be published
          curl -T corpus.tar -upauldreik:${{ secrets.bintrayApiKey }} https://api.bintray.com/content/pauldreik/simdjson-fuzz-corpus/corpus/0/corpus/corpus.tar";publish=1;override=1"
          curl -T corpus.tar -upauldreik:${{ secrets.bintrayApiKey }} https://api.bintray.com/content/pauldreik/simdjson-fuzz-corpus/corpus/0/corpus/corpus.tar";publish=1;override=1"
          curl -T valgrind.tar -upauldreik:${{ secrets.bintrayApiKey }} https://api.bintray.com/content/pauldreik/simdjson-fuzz-corpus/corpus/0/corpus/valgrind.tar";publish=1;override=1"
          curl -T valgrind.tar -upauldreik:${{ secrets.bintrayApiKey }} https://api.bintray.com/content/pauldreik/simdjson-fuzz-corpus/corpus/0/corpus/valgrind.tar";publish=1;override=1"

      - name: Archive any crashes as an artifact
        uses: actions/upload-artifact@v2
        if: always()
        with:
          name: crashes
          path: |
            crash-*
            leak-*
            timeout-*
          if-no-files-found: ignore