https://github.com/captaincodercool/optimized-tfidf-search-with-query-handling

This Python program implements a TF-IDF-based document retrieval system with query optimization. It preprocesses text files by tokenizing, removing stopwords, and stemming. It then builds posting lists, normalizes the document vectors, and retrieves the most relevant documents by cosine similarity, using top-10 element filtering for efficiency.
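The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: every function name (`stem`, `preprocess`, `build_index`, `search`) is hypothetical, the stopword list is a tiny placeholder, and the suffix stripper only stands in for a real stemmer such as Porter's. The log-scaled TF times IDF weighting is an assumption, and so is the reading of "top-10 element filtering" as keeping only the ten highest-weighted entries of each posting list, since the description does not spell either out.

```python
import math
import re
from collections import Counter, defaultdict

# Placeholder stopword list (assumption); a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "on"}


def stem(token):
    # Crude suffix stripper standing in for a real stemmer (assumption).
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    # Tokenize on lowercase letters, drop stopwords, then stem.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]


def build_index(docs, top_k=10):
    # Returns {term: [(doc_id, weight), ...]} with unit-length document
    # vectors; each posting list is sorted by weight and cut to top_k.
    n_docs = len(docs)
    counts = {doc_id: Counter(preprocess(text)) for doc_id, text in docs.items()}
    doc_freq = Counter(term for c in counts.values() for term in c)
    postings = defaultdict(list)
    for doc_id, c in counts.items():
        weights = {
            t: (1 + math.log10(tf)) * math.log10(n_docs / doc_freq[t])
            for t, tf in c.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm == 0:
            continue  # every term occurs in all documents; nothing to index
        for t, w in weights.items():
            if w > 0:
                postings[t].append((doc_id, w / norm))
    return {
        t: sorted(plist, key=lambda p: (-p[1], p[0]))[:top_k]
        for t, plist in postings.items()
    }


def search(query, postings):
    # Cosine similarity between the normalized query vector and the
    # documents found in the (truncated) posting lists of the query terms.
    q_counts = Counter(preprocess(query))
    q_weights = {t: 1 + math.log10(tf) for t, tf in q_counts.items()}
    q_norm = math.sqrt(sum(w * w for w in q_weights.values()))
    if q_norm == 0:
        return []
    scores = defaultdict(float)
    for t, qw in q_weights.items():
        for doc_id, dw in postings.get(t, []):
            scores[doc_id] += (qw / q_norm) * dw
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    "d1": "cats chasing mice",
    "d2": "dogs chasing cats in the park",
    "d3": "stock markets fell",
}
index = build_index(docs)
print(search("stock markets", index))
```

With truncated posting lists, a document missing from one query term's list receives only a partial score, so the ranking is exact only when every list holds ten or fewer documents. A fuller implementation would need a fallback for that case, such as an upper-bound estimate for the missing weights.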

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic links in README
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (2.3%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: CAPTAINCODERCOOL
  • License: apache-2.0
  • Language: Jupyter Notebook
  • Default Branch: master
  • Size: 201 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 10 months ago
Metadata Files
License

Owner

  • Login: CAPTAINCODERCOOL
  • Kind: user

GitHub Events

Total
  • Push event: 1
  • Create event: 1
Last Year
  • Push event: 1
  • Create event: 1