https://github.com/creolehustle/everything_has_a_beginning
Project One: Knowledge and Training Base
https://github.com/creolehustle/ds_pysyft
Perform data science on data that remains on someone else's server
https://github.com/creolehustle/dstc8-schema-guided-dialogue
The Schema-Guided Dialogue Dataset
https://github.com/creolehustle/ds_pykale
Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!
https://github.com/creolehustle/dl_abstract-reasoning-matrices
Progressive matrices dataset, as described in "Measuring abstract reasoning in neural networks" (Barrett*, Hill*, Santoro*, Morcos, Lillicrap), ICML 2018
https://github.com/creolehustle/beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use: evaluate your models across 15+ diverse IR datasets.
https://github.com/crhf/swt-bench
[NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating repository-level test generation by LLMs
https://github.com/crhf/github-box-maker
Renders Markdown into GitHub-style HTML.
https://github.com/crhf/openai-rft-examples
Input-output examples highlighting the importance of fine-tuning LLMs for bug fixing tasks.
https://github.com/crhf/swe-bench-docker
A Docker-based solution for the SWE-bench evaluation framework
https://github.com/crhf/cerberus
Program repair platform that provides an interface to multiple state-of-the-art program repair tools
https://github.com/crhf/qasan
QASan is a custom QEMU 3.1.1 that detects memory errors in the guest using AddressSanitizer.
https://github.com/crim-ca/stac-populator
Workflow logic to populate a STAC catalog with demo datasets.