https://github.com/amazon-science/beyondcorrelation

Implementation of the paper "Beyond Correlation: The impact of human uncertainty in measuring the effectiveness of automatic evaluation and LLM-as-a-judge"