This includes our current full matrix (windows, linux and macos), for evaluting
purposes.
We should disconsider failures when evaluating PRs.
TODO:
- deploy
- coverage
- github release notes
Even with the above missing, I still believe it would be nice to merge
this and have GitHub actions working in parallel so we can evaluate performance
and usability from now on.