Recent Releases of llm-as-code-selectors-paper
llm-as-code-selectors-paper - Source code for the paper: "Large Language Models as Medical Codes Selectors: a benchmark using the International Classification of Primary Care"
Background: Medical coding is critical for structuring healthcare data. It can lead to a better understanding of population health and can guide quality improvement interventions and policy making. This study investigates the ability of large language models (LLMs) to select appropriate codes from the International Classification of Primary Care, 2nd edition (ICPC-2), based on the results of a specialized search engine.
Methods: A dataset of 437 clinical expressions in Brazilian Portuguese was used, each annotated with relevant ICPC-2 codes. A semantic search engine based on OpenAI’s text-embedding-3-large model retrieved candidate expressions from a corpus of 73,563 ICPC-2-labeled concepts. Thirty-three LLMs (both open-source and private) were prompted with each query and a ranked list of retrieved results, and asked to return the best-matching ICPC-2 code. Performance was evaluated using F1-score, with additional analysis of token usage, cost, response time, and formatting adherence.
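The retrieve-then-evaluate flow described in the Methods can be sketched roughly as below. This is a minimal illustration, not the repository's implementation: the three-dimensional vectors stand in for real text-embedding-3-large embeddings, the toy corpus and the names `cosine`, `retrieve`, and `f1_score` are hypothetical, and the set-based per-query F1 is an assumption, since the abstract does not state how scores are averaged.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, corpus, k=3):
    """Return the k (expression, code) pairs most similar to the query.

    corpus is a list of (expression, icpc2_code, vector) tuples.
    """
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[2]), reverse=True)
    return [(expr, code) for expr, code, _ in ranked[:k]]

def f1_score(predicted, gold):
    """Set-based F1 between predicted and annotated codes for one query."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy corpus with made-up vectors (assumption: real vectors come from an embedding model).
toy_corpus = [
    ("tosse", "R05", [0.9, 0.1, 0.0]),
    ("febre", "A03", [0.1, 0.9, 0.0]),
    ("dor de cabeça", "N01", [0.0, 0.1, 0.9]),
]

candidates = retrieve([1.0, 0.0, 0.0], toy_corpus, k=2)
print(candidates)
print(f1_score(["R05"], ["R05", "R07"]))
```

In the study itself, the ranked candidates are passed to an LLM together with the query, and the code the model returns is what gets scored; the sketch stops short of that model call.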
Results: Of the 33 models evaluated, 28 achieved a maximum F1-score above 0.8, and 10 exceeded 0.85. The top-performing models were gpt-4.5-preview, o3, and gemini-2.5-pro. Optimizing the retriever can improve performance by up to 4 percentage points. Most models returned valid codes in the expected format and restricted their outputs to the retrieved results, reducing hallucination risk. Notably, smaller models (<3B parameters) underperformed due to format inconsistencies and sensitivity to input length.
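The two checks mentioned in the Results, format adherence and restriction to retrieved candidates, could be approximated as follows. This is a sketch under stated assumptions: the pattern of one letter followed by two digits reflects the shape of ICPC-2 rubric codes such as R05, but the exact response format the paper expects is not given here, and `check_response` is a hypothetical helper.

```python
import re

# Assumption: a code is one letter followed by two digits (e.g. "R05").
ICPC2_PATTERN = re.compile(r"\b[A-Z]\d{2}\b")

def check_response(response, candidate_codes):
    """Extract code-like tokens from a model response and check them.

    well_formed: at least one code-like token was found.
    grounded: every extracted code appears among the retrieved candidates.
    """
    allowed = set(candidate_codes)
    codes = ICPC2_PATTERN.findall(response.upper())
    return {
        "codes": codes,
        "well_formed": bool(codes),
        "grounded": bool(codes) and all(code in allowed for code in codes),
    }

print(check_response("R05", ["R05", "A03"]))
print(check_response("I think it is Z99", ["R05", "A03"]))
```

A response flagged as not grounded names a code outside the retrieved list, which is the kind of output the abstract associates with hallucination risk.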
Conclusions: LLMs show strong potential for automating ICPC-2 code selection, with many models achieving high performance even without task-specific fine-tuning. This work establishes a benchmark for future studies and describes some of the challenges in achieving better results.
- Python
Published by almeidava93 8 months ago