facts
Repository for the article in the online magazine Data Science Collective.
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (17.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: stefanpietrusky
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://medium.com/@stefanpietrusky/facts-v2-filtering-and-analysis-of-content-in-textual-sources-1a16cdac811b
- Size: 23.4 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md

FACTS V2.5 APP
Filtering and Analysis of Content in Textual Sources
This repository, developed by Stefan Pietrusky, accompanies the article published in Data Science Collective [1]. In that article, I describe the functionality of an enhanced version (V2) of the FACTS application. The first version (V1) has already been tested, and the results pointed to important improvements. That test also provides concrete insights into the future of education in the age of AI [2].
The adapted version (V1.5) of the application was successfully tested during the 6th IGSP Congress. The results of this test are available at peDOCS and show that FACTS provides answers to the questions posed by the congress [3]. Version V2 has been further improved and adapted so that the entire process can now be controlled via a common interface.
In the current version (V2.5), the article search function has been revised. Because the structure of ERIC has changed, the ERIC search has been reworked, and the methods used for the other databases have also been adapted. The design has been modified, and processes that have been started can now be terminated. In the future, additional databases and new evaluation options are to be integrated. This is an open-source project for educational and research purposes.
FACTS Structure
The structure of the current (V2.5) FACTS app is shown below.
FACTS working principle
Below is a short GIF showing the structure and function of the app.

FACTS availability
The code to run the app is already in the repository. It is available in both German (GER) and English.
Installing and running the application
- Clone this repository on your local computer:
```bash
git clone https://github.com/stefanpietrusky/factsv2.git
```
- Install the required dependencies:
```bash
pip install -r requirements.txt
```
- Install Ollama and load the model Llama3.1 (8B). Alternatively, another model can be used, but you need to adapt the code (parsing/regex).
- Install Python 3.10.11.
- Download a suitable web driver, for example GeckoDriver. Adjust the constant GECKODRIVERPATH accordingly.
- Create the specific versions of the LLM models with the following commands:
```bash
ollama create llama3.1p -f PATH\modelfile.txt
ollama create llama3.1p2 -f PATH\modelfile.txt
```
- Start the FACTS app:
```bash
python app.py
```
References
[1] Pietrusky, S. (2025). How I automatically find numerous answers for any given question. FACTS V2: Filtering and Analysis of Content in Textual Sources. Data Science Collective.
[2] Pietrusky, S. (2024). Automatic answering of scientific questions using the FACTS-V1 framework: New methods in research to increase efficiency through the use of generative AI. arXiv cs.DL.
[3] Pietrusky, S. (2025). Changing school practice. Can artificial intelligence help provide answers to educational research questions? 6th IGSP Congress. peDOCS.
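The prerequisites named in the installation steps (Python 3.10, a web driver, Ollama) can be checked before the first start. The following is a minimal sketch using only the standard library; it is not part of FACTS. The function name `preflight`, the default driver path, and the decision to compare only the major and minor Python version are assumptions made for illustration.

```python
import os
import shutil
import sys

# Assumed values for illustration; adjust them to your own setup.
REQUIRED_PYTHON = (3, 10)        # the README asks for Python 3.10.11
GECKODRIVERPATH = "geckodriver"  # hypothetical default; FACTS defines its own constant


def preflight(driver_path=GECKODRIVERPATH, version=None):
    """Return a list of problems found; an empty list means every check passed."""
    version = tuple((version or sys.version_info)[:2])
    problems = []

    # Only major.minor is compared (an assumption; the README pins 3.10.11).
    if version != REQUIRED_PYTHON:
        problems.append(
            "Python %d.%d expected, found %d.%d" % (REQUIRED_PYTHON + version)
        )

    # The driver may be given as a file path or as a command found on PATH.
    if not (os.path.isfile(driver_path) or shutil.which(driver_path)):
        problems.append("web driver not found: %s" % driver_path)

    # The 'ollama create' commands need the Ollama CLI on PATH.
    if shutil.which("ollama") is None:
        problems.append("ollama executable not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Running the script prints one warning per missing prerequisite and nothing if all checks pass. It does not verify that the models `llama3.1p` and `llama3.1p2` have been created.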
Owner
- Login: stefanpietrusky
- Kind: user
- Repositories: 1
- Profile: https://github.com/stefanpietrusky
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you cite this repository, please use the following reference."
title: "FILTERING AND ANALYSIS OF CONTENT IN TEXTUAL SOURCES (FACTS) V2.5"
authors:
  - family-names: "Pietrusky"
    given-names: "Stefan"
    orcid: "https://orcid.org/0009-0008-9739-5542"
version: "1.0.0"
date-released: "2025-04-04"
GitHub Events
Total
- Push event: 4
Last Year
- Push event: 4
Dependencies
- Flask ==2.2.5
- PyMuPDF ==1.23.3
- beautifulsoup4 ==4.12.2
- gensim ==4.3.1
- matplotlib ==3.7.1
- networkx ==2.8.8
- nltk ==3.8.1
- numpy ==1.23.5
- pandas ==1.5.3
- plotly ==5.14.1
- pyLDAvis ==3.3.1
- requests ==2.28.2
- selenium ==4.10.0
- wordcloud ==1.8.2.3
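Because every dependency is pinned to an exact version, it can help to compare the pins with what is actually installed. The sketch below is one way to do that with the standard library; it assumes the pins live in `requirements.txt` in `name==version` form, and the helper names `parse_pins` and `check_pins` are hypothetical.

```python
import os
from importlib.metadata import PackageNotFoundError, version


def parse_pins(lines):
    """Parse 'name==version' lines; blank lines and '#' comments are skipped."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, _, pinned = line.partition("==")
        pins[name.strip()] = pinned.strip()
    return pins


def check_pins(pins, get_version=version):
    """Return {name: (pinned, installed_or_None)} for every package that differs."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    # Assumed file name, taken from the 'pip install -r requirements.txt' step.
    if os.path.isfile("requirements.txt"):
        with open("requirements.txt", encoding="utf-8") as handle:
            for name, (pinned, installed) in check_pins(parse_pins(handle)).items():
                print(f"{name}: pinned {pinned}, installed {installed}")
```

The `get_version` parameter exists only so the check can be exercised without installing anything; in normal use the default `importlib.metadata.version` lookup is sufficient.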