Recent Releases of fasta-ai_csi-r1

fasta-ai_csi-r1 -

🤖 Fasta AI_CSI R1 – GPT-based Country Semantic Inference Module (v1.0.1)

This release introduces Fasta AI_CSI R1, an AI-powered metadata normalization module that infers the most likely country from ambiguous or inconsistent location strings in avian influenza virus FASTA records.

🧠 What It Does

When rule-based ISO 3166-1 matching fails, this module leverages OpenAI's GPT models (gpt-3.5 / gpt-4o) to semantically infer the country of origin, achieving over 99.6% resolution accuracy.

📦 Included Files

  • Fasta AI_CSI R1.ipynb – Executable Jupyter notebook for semantic inference
  • location_to_country_AI.json – Live dictionary of inferred country mappings
  • other_locations.csv – Unresolved entries for manual review
  • country_stat.csv – Final country-level sample summary
  • README.md – Full documentation with OpenAI API setup guide

🔐 Requirements

```bash pip install openai biopython pandas tqdm Set your API key via environment variable:

export OPENAIAPIKEY="your-api-key-here"

📘 This module is designed to be used after rule-based processing, enabling scalable, high-resolution country assignment in global avian influenza surveillance.

- HTML
Published by Bambusaoldhamii 10 months ago