Update iterate_many.md
parent d3f0e2afb3
commit ca3f3cc49d
@ -1,8 +1,14 @
iterate_many
==========

An interface providing features to work with files or streams containing multiple small JSON documents.
As fast and convenient as possible.
An interface providing features to work with files or streams containing multiple small JSON documents. Given an input such as
```JSON
{"text":"a"}
{"text":"b"}
{"text":"c"}
...
```
... you want to read the entries (individual JSON documents) as quickly and as conveniently as possible. Importantly, the input might span several gigabytes, but you want to use a small (fixed) amount of memory. Ideally, you'd also like to parallelize the processing (using more than one core) to speed it up.
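
To make the fixed-memory goal concrete, here is a minimal sketch that uses only the C++ standard library. It is not this library's interface: `for_each_document` is a hypothetical helper, and it assumes exactly one JSON document per line. It only shows that an input of any size can be walked while holding a single document in memory at a time; it does no JSON parsing and no parallel processing.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper (not part of any library): calls `on_document` once per
// non-empty line of `input`, assuming exactly one JSON document per line.
// Only the current line is buffered, so memory use is bounded by the longest
// single document rather than by the size of the whole input.
inline std::size_t for_each_document(
    std::istream &input,
    const std::function<void(std::string_view)> &on_document) {
  std::size_t count = 0;
  std::string line;  // buffer reused for every document
  while (std::getline(input, line)) {
    if (line.empty()) continue;  // skip blank separator lines
    on_document(line);           // the view is valid only during this call
    ++count;
  }
  return count;
}
```

A caller would pass any `std::istream` (a file or an in-memory stream) and a callback that handles one document; the return value is the number of documents seen.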

Contents
--------

@ -226,4 +232,4 @@ This will print:
39 bytes
```

Importantly, you should call `truncated_bytes()` only after iterating through all of the documents: until then, the stream may not have accessed the end of the data yet, so it cannot tell whether the input ends with truncated documents.
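
The reason for this ordering can be illustrated with a simplified, standard-library-only sketch. It is not how the library computes `truncated_bytes()`: `count_truncated_tail` is a hypothetical helper that assumes every top-level document is an object or an array. The point is that the count of unfinished trailing bytes is only known once the scan has reached the end of the input.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not a library function): scans a buffer of concatenated
// JSON objects/arrays and returns how many bytes at the end belong to a
// document that was opened but never closed. Returns 0 if the input ends on a
// complete document.
inline std::size_t count_truncated_tail(std::string_view input) {
  std::size_t depth = 0;      // current nesting level of {} and []
  std::size_t doc_start = 0;  // offset where the currently open document began
  bool in_string = false;     // inside a JSON string literal
  bool escaped = false;       // previous character was a backslash
  for (std::size_t i = 0; i < input.size(); ++i) {
    const char c = input[i];
    if (in_string) {
      // Brackets inside strings must not affect the nesting depth.
      if (escaped) escaped = false;
      else if (c == '\\') escaped = true;
      else if (c == '"') in_string = false;
      continue;
    }
    if (c == '"') {
      in_string = true;
    } else if (c == '{' || c == '[') {
      if (depth == 0) doc_start = i;  // a new top-level document starts here
      ++depth;
    } else if ((c == '}' || c == ']') && depth > 0) {
      --depth;
    }
  }
  // The answer depends on the state at the very end of the input.
  return depth > 0 ? input.size() - doc_start : 0;
}
```

Stopping the loop early would yield a count for the prefix only, which says nothing about the true end of the data; that is the same limitation the stream has before iteration is complete.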