Updated 6 months ago

tiny_qa_benchmark_pp • Rank 2.1 • Science 77%

Tiny QA Benchmark++ a micro-benchmark suite (52-item gold + on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultra-fast LLM smoke-testing & regression-catching.