Recent Releases of fasta-ai_csi-r1
fasta-ai_csi-r1 -
🤖 Fasta AI_CSI R1 – GPT-based Country Semantic Inference Module (v1.0.1)
This release introduces Fasta AI_CSI R1, an AI-powered metadata normalization module that infers the most likely country from ambiguous or inconsistent location strings in avian influenza virus FASTA records.
🧠 What It Does
When rule-based ISO 3166-1 matching fails, this module leverages OpenAI's GPT models (gpt-3.5 / gpt-4o) to semantically infer the country of origin, achieving over 99.6% resolution accuracy.
📦 Included Files
Fasta AI_CSI R1.ipynb– Executable Jupyter notebook for semantic inferencelocation_to_country_AI.json– Live dictionary of inferred country mappingsother_locations.csv– Unresolved entries for manual reviewcountry_stat.csv– Final country-level sample summaryREADME.md– Full documentation with OpenAI API setup guide
🔐 Requirements
```bash pip install openai biopython pandas tqdm Set your API key via environment variable:
export OPENAIAPIKEY="your-api-key-here"
📘 This module is designed to be used after rule-based processing, enabling scalable, high-resolution country assignment in global avian influenza surveillance.
- HTML
Published by Bambusaoldhamii 10 months ago