* This revert the code back to how it was prior to the silly "run two stages" routine and instead
adds an option to benchmark the code over hot buffers. It turns out that it can be expensive,
when the files are large, to allocate the pages.
* Instead of emulating the whole parsing as stage 1 + stage 2, let us
benchmark the real thing.
* Adding explicit constructor.
* Adding warning to the benchmark user.
* Making re-running optional.
* I think we can align the numbers better (so it is prettier).
* Remove space before %, align third line better
Co-authored-by: John Keiser <john@johnkeiser.com>