880 Commits

Author SHA1 Message Date
Daniel Lemire 9b7832c39a Adding another test (powers of two). 2019-10-16 17:47:52 -04:00
Daniel Lemire e3e29b720d Adding a check that powers of ten can be parsed (sanity check). 2019-10-16 16:27:50 -04:00
John Keiser 64872bddf4 Eliminate stage1_find_marks_flatten.h 2019-10-14 12:33:46 -07:00
John Keiser 81f2249575 Move stage1 into a class to pass fewer parameters 2019-10-14 12:33:46 -07:00
John Keiser 9bbd6bd874 Move headers to implementation area
- jsoncharutils.h, numberparsing.h, simdprune_tables.h
2019-10-14 11:51:41 -07:00
Daniel Lemire 13e477ebfe Update CONTRIBUTORS 2019-10-10 14:29:35 -04:00
Daniel Lemire c284b54fc2 adding link to Go port 2019-10-09 16:28:01 -04:00
John Keiser 69caa477fb Use struct for UTF-8 checks, remove templating
- Removes templating from simd_input, utf8_checker, and parse_string
- Make drone gcc run a lot faster
- Make drone clang run a little faster (NOTE:
https://hub.docker.com/r/silkeh/clang helps even more, but I wasn't sure
whether we wanted to trust that)
- Make drone arm run in parallel to get results quicker
2019-10-08 17:58:45 -07:00
Daniel Lemire 81f9aac13f Fixing minor perf. regression. 2019-10-07 16:31:44 -04:00
Juho Lauri b2eff3c90c case insensitive move_to_key (#324)
* case insensitive move_to_key
* portable strcmpi
2019-10-07 16:08:17 -04:00
Daniel Lemire aa45fe7359 Adding check for subnormal parsing. (#328) 2019-10-07 11:01:34 -04:00
Daniel Lemire 253af0766c adding Juho 2019-10-02 14:26:42 -04:00
Juho Lauri cf9dbe583d improved const correctness (#321) 2019-10-02 14:25:28 -04:00
John Keiser de8df0a05f Combined performance patch (5% overall, 15% stage 1) (#317)
* Allow -f

* Support parse -s (force sse)

* Simplify flatten_bits

- Add directly to base instead of storing variable
- Don't modify base_ptr after beginning of function
- Eliminate base variable and increment base_ptr instead

* De-unroll the flatten_bits loops

* Decrease dependencies in stage 1

- Do all finalize_structurals work before computing the quote mask; mask
  out the quote mask later
- Join find_whitespace_and_structurals and finalize_structurals into
  single find_structurals call, to reduce variable leakage
- Rework pseudo_pred algorithm to refer to "primitive" for clarity and some
  dependency reduction
- Rename quote_mask to in_string to describe what we're trying to
  achieve ("mask" could mean many things)
- Break up find_quote_mask_and_bits into find_quote_mask and
  invalid_string_bytes to reduce data leakage (i.e. don't expose quote bits
  or odd_ends at all to find_structural_bits)
- Genericize overflow methods "follows" and "follows_odd_sequence" for
  descriptiveness and possible lifting into a generic simd parsing library

* Mark branches as likely/unlikely

* Reorder and unroll+interleave stage 1 loop

* Nest the cnt > 16 branch inside cnt > 8
2019-10-01 12:01:08 -04:00
Daniel Lemire 53b6deaeae Safer handling of error codes, fixes https://github.com/lemire/simdjson/issues/318 (#319) 2019-09-29 12:12:15 -04:00
Opemipo 462858efa3 Fix Typo (#311)
escapted -> escaped
2019-09-12 10:16:33 -04:00
John Keiser f7e893667d Use simd_input generic methods for utf8 checking (#301)
* Use generic each/reduce in simdutf8check

* Remove macros from generic simd_input uses

* Use array instead of members to store simd registers

* Default local checkperf to clone from .
2019-09-02 12:46:05 -04:00
Daniel Lemire 5765c81f66 Fixing number parsing of large ints 2019-09-02 12:40:39 -04:00
Daniel Lemire 92334a8e28 Better tests. 2019-09-02 12:32:44 -04:00
Daniel Lemire c4218c8e40 Accept large unsigned integers (#295) (#306)
* handle uint64 value in JSON
* Add integer_tests
* Add get_unsigned_integer() on ParsedJson::BasicIterator
* Write 'u' to tape when the value seems unsigned
* Add to handle 'u' element
* Brush up integer_tests.cpp
* Append tests/integer_tests in .gitignore
* Add comments to is_integer and is_unsigned_integer
2019-09-02 11:56:26 -04:00
saka1 c1f27fb848 Accept large unsigned integers (#295)
* handle uint64 value in JSON
* Add integer_tests
* Add get_unsigned_integer() on ParsedJson::BasicIterator
* Write 'u' to tape when the value seems unsigned
* Add to handle 'u' element
* Brush up integer_tests.cpp
* Append tests/integer_tests in .gitignore
* Add comments to is_integer and is_unsigned_integer
2019-09-02 10:50:24 -04:00
Valeriy Van 6d0fd5bb93 Stylistic fix in README.md (#305) 2019-09-01 12:35:50 -04:00
Daniel Lemire bd15d3ae24 Pointing appveyor badge to master 2019-08-30 17:26:18 -04:00
John Keiser 7f249cd179 Use non-interleaved map() to make structurals clearer (#304) 2019-08-29 21:38:41 -04:00
John Keiser aef3f4be99 Merge pull request #296 from lemire/wide_mask
Genericize bitmask building to make algorithms clearer
2019-08-28 08:53:21 -07:00
John Keiser bf8083888d Validate perf against master, not v0.2.1 2019-08-26 13:35:18 -07:00
John Keiser f4fa5b7340 Add MAP_CHUNKS2, make parameter name related to input 2019-08-26 09:46:49 -07:00
John Keiser 169568ca47 Use map() to interleave instructions for parallelism 2019-08-26 09:46:49 -07:00
John Keiser 9cc4ddfc88 Use map().to_bitmask() instead of build_bitmask() 2019-08-26 09:46:49 -07:00
John Keiser 441963c84c Add AMD64 build_bitmask 2019-08-26 09:46:49 -07:00
John Keiser cf4ae61ac6 Modify checkperf to print out perfdiff command
to make it easier to run it yourself without having to recompile the
world
2019-08-26 09:46:49 -07:00
John Keiser da0f1cacea Remove static modifiers 2019-08-26 09:46:48 -07:00
John Keiser 5e5592178d Update amalgamated cpp 2019-08-26 09:46:48 -07:00
John Keiser b01222518d Genericize bitmask building to make algorithms clearer 2019-08-26 09:46:48 -07:00
Daniel Lemire 2060cf8a70 Updating reference to paper 2019-08-26 10:24:31 -04:00
saka1 a4bd87119b Remove duplicate lines in .gitignore (#300) 2019-08-25 09:34:13 -04:00
Daniel Lemire 9f26355fe0 This should lower false positives. (#299) 2019-08-25 09:33:00 -04:00
Daniel Lemire f667d4965d This is a bug fix: our prev function was buggy. (#291) 2019-08-23 18:59:43 -04:00
John Keiser 585f84a734 Move architecture-specific headers to src/ (#287)
* Use namespaces instead of templates for stage1 impls

* Move stage1 implementation into the src/ directory

* Move architecture-specific code to src/
2019-08-21 07:59:49 -04:00
Daniel Lemire a1bff85263 Documenting the limits of move_to_key with respect to Unicode Equivalence. 2019-08-20 17:10:30 -04:00
Daniel Lemire fb920bba62 ZippyJSON 2019-08-20 14:11:53 -04:00
saka1 58697f6f3b Remove unnecessary x permissions from JSON files (#290) 2019-08-18 20:48:48 -04:00
saka1 18c5b8d68a Update .gitignore for cmake files (#289) 2019-08-18 17:24:38 -04:00
John Keiser 08cf140811 Merge pull request #285 from lemire/methods
Use methods instead of functions for simd_input
2019-08-16 17:45:42 -07:00
John Keiser 94673bcdf2 Use methods for utf8 checker 2019-08-16 14:15:37 -07:00
John Keiser aa15917c9d Use methods instead of functions for simd_input 2019-08-16 14:07:30 -07:00
John Keiser 85fb37b6ea Lower the bar for performance check 2019-08-16 12:34:28 -07:00
John Keiser ae3ae9a474 Merge pull request #280 from lemire/circle-reuse
Parallelize Circle CI and speed up gcc runs
2019-08-16 09:55:39 -07:00
Vitaly Baranov e9be643db5 Fix condition in ParsedJson::allocate_capacity(). (#283) 2019-08-16 08:38:59 -04:00
John Keiser b49eefbee6 Give test jobs better names 2019-08-15 19:41:30 -07:00