article_graph
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 1 DOI reference(s) in README
- ✓ Academic publication links: links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: JorgeMIng
- License: apache-2.0
- Language: Jupyter Notebook
- Default Branch: main
- Size: 71.9 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 3
Metadata Files
README.md
Article_Graph
Article_Graph is a tool that extracts and enriches information from a set of academic papers and journals.
It combines Grobid, which extracts the relevant information from each paper, with machine learning techniques (similarity analysis, topic modeling, and named entity recognition) that enrich it.
The final output is an RDF graph that includes all the extracted and reconciled information about the papers and their relations.
A simple application is also available to visualize and interact with the resulting knowledge graph (KG).
Requirements
Python >= 3.11 is required to run the experiments.
Grobid is required for the first step of the pipeline; you can follow the installation instructions here.
PDF_ArticleAnalyzer is required to interact with the Grobid service; you can follow the installation instructions here.
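Before starting the pipeline, it can help to confirm that the Grobid service is reachable. The sketch below assumes Grobid is running at its default local address, `http://localhost:8070`, and calls its `/api/isalive` health-check endpoint. The function names and the injectable `fetch` parameter are illustrative and are not part of Article_Graph or PDF_ArticleAnalyzer.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Assumption: Grobid's default local address; change it if your service runs elsewhere.
DEFAULT_GROBID_URL = "http://localhost:8070"


def isalive_url(base_url: str = DEFAULT_GROBID_URL) -> str:
    """Build the URL of Grobid's health-check endpoint."""
    return base_url.rstrip("/") + "/api/isalive"


def _http_get(url: str) -> str:
    """Fetch a URL and return the body as text (5-second timeout)."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def grobid_is_alive(base_url: str = DEFAULT_GROBID_URL, fetch=_http_get) -> bool:
    """Return True if the service answers 'true' on /api/isalive."""
    try:
        return fetch(isalive_url(base_url)).strip().lower() == "true"
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar errors.
        return False
```

Calling `grobid_is_alive()` with no arguments checks the default address; pass a different `base_url` if Grobid runs on another host or port.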
Running the Application with Docker and the KG on a Remote Server
To try the application with the pregenerated graph (also available under
the rdf directory), follow these steps.
- Clone the repository:
git clone https://github.com/JorgeMIng/Article_Graph
cd Article_Graph
- Build the Docker image:
docker build -t graph_tool docker
- Run the image:
docker run -p 8501:8501 graph_tool
By default, the application queries the KG generated by examples/article_graph.ipynb,
which is hosted on a remote server at http://yordi111nas.synology.me:3030/articles/query.
The same graph is also available under the rdf directory.
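To inspect the remote KG outside the application, the endpoint can be queried directly over the SPARQL 1.1 Protocol. The following is a minimal sketch using only the standard library (the project itself lists SPARQLWrapper among its dependencies). The helper names are illustrative, the sample query is deliberately generic because the KG's vocabulary is not described here, and the remote server may not always be available.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint quoted in the text above; its availability is not guaranteed.
ENDPOINT = "http://yordi111nas.synology.me:3030/articles/query"

# Generic query: makes no assumption about the KG's vocabulary.
SAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(results: dict) -> list[dict]:
    """Flatten SPARQL JSON results into plain {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_select(endpoint: str, query: str) -> list[dict]:
    """Send the query and return the result rows (performs a network call)."""
    with urlopen(build_request(endpoint, query), timeout=30) as response:
        return bindings_to_rows(json.load(response))
```

For example, `run_select(ENDPOINT, SAMPLE_QUERY)` returns up to five triples as dictionaries keyed by `s`, `p`, and `o`.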
Running the Application from Source with the KG on a Remote Server
To try the application with the pregenerated graph (also available under
the rdf directory), follow these steps.
- Clone the repository:
git clone https://github.com/JorgeMIng/Article_Graph
cd Article_Graph
- Create a Python environment (conda is recommended):
conda create -n article-graph-3.11 python=3.11
conda activate article-graph-3.11
- Install all the dependencies:
pip install -r requirements_app.txt
- Execute the application:
python Start.py
By default, the application queries the KG generated by examples/article_graph.ipynb,
which is hosted on a remote server at http://yordi111nas.synology.me:3030/articles/query.
Running the Application with a Custom KG Hosted Locally
To try the application with a different graph hosted locally, follow these steps.
- Clone the repository:
git clone https://github.com/JorgeMIng/Article_Graph
cd Article_Graph
- Create a Python environment (conda is recommended):
conda create -n article-graph-3.11 python=3.11
conda activate article-graph-3.11
- Install all the dependencies:
pip install -r requirements_app.txt
- Host the KG in Jena Fuseki with Docker:
docker run -p 3030:3030 stain/jena-fuseki
- Execute the application:
python Start.py
- Go to the Settings section of the application and configure the remote server so that it points to your local Fuseki instance.
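The steps above do not say how the graph gets into Fuseki. One possible approach is to upload it over HTTP using the SPARQL Graph Store Protocol, sketched below. Several things here are assumptions rather than facts from this project: that a dataset named `articles` has already been created in Fuseki, that it exposes the default `data` and `query` services, that the graph is serialized as Turtle, and that write access does not require credentials in your setup. All function names are illustrative.

```python
from urllib.request import Request, urlopen

# Assumptions: Fuseki runs locally on port 3030 and a dataset named
# "articles" already exists; both values are illustrative.
FUSEKI_URL = "http://localhost:3030"
DATASET = "articles"


def data_endpoint(base_url: str = FUSEKI_URL, dataset: str = DATASET) -> str:
    """URL of the dataset's Graph Store Protocol endpoint (default graph)."""
    return f"{base_url.rstrip('/')}/{dataset}/data?default"


def query_endpoint(base_url: str = FUSEKI_URL, dataset: str = DATASET) -> str:
    """URL of the dataset's SPARQL query service."""
    return f"{base_url.rstrip('/')}/{dataset}/query"


def build_upload_request(turtle_bytes: bytes, url: str) -> Request:
    """Build a POST that adds Turtle triples to the target graph."""
    return Request(url, data=turtle_bytes, method="POST",
                   headers={"Content-Type": "text/turtle; charset=utf-8"})


def upload_turtle(path: str, url: str) -> int:
    """Upload a Turtle file and return the HTTP status (network call)."""
    with open(path, "rb") as handle:
        request = build_upload_request(handle.read(), url)
    with urlopen(request, timeout=60) as response:
        return response.status
```

After a successful upload, the value returned by `query_endpoint()` has the same shape as the default remote URL, so it is a plausible value to enter in the Settings section. For a serialization other than Turtle, adjust the `Content-Type` header accordingly.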
Running the Experiments
To reproduce the experiments yourself, follow these steps.
- Clone the repository:
git clone https://github.com/JorgeMIng/Article_Graph
cd Article_Graph
- Create a Python environment (conda is recommended):
conda create -n article-graph-3.11 python=3.11
conda activate article-graph-3.11
- Install all the dependencies:
pip install -r requirements.txt
- Run the example notebook at examples/article_graph.ipynb
Examples
- Full KG Generation: examples/article_graph.ipynb
- Similarity Analysis: examples/examples_similarity.ipynb
- Topic Modeling: examples/topic_modeling.ipynb
- NER Analysis and Project Extraction: ner/extract_element.ipynb
License
Please refer to the LICENSE file.
Authors
- Jorge Martín Izquierdo
- Gloria Cumia Espinosa de los Monteros
- Marco Ciccalè Baztán
Owner
- Login: JorgeMIng
- Kind: user
- Repositories: 1
- Profile: https://github.com/JorgeMIng
CodeMeta (codemeta.json)
{
"@context": "https://doi.org/10.5063/schema/codemeta-2.0",
"@type": "SoftwareSourceCode",
"license": "https://spdx.org/licenses/Apache-2.0",
"codeRepository": "git+https://github.com/JorgeMIng/Article_Graph",
"dateCreated": "2024-04-11",
"datePublished": "2024-05-22",
"dateModified": "2024-05-22",
"name": "Article_Graph",
"version": "1.0.0",
"identifier": "https://zenodo.org/doi/10.5281/zenodo.11242795",
"description": "Article_Graph is a tool that extracts and enriches information from a set of academic papers and journals. It makes use of advanced and powerful machine learning tools to extract as much information as possible. Also, it uses Grobid to extract all the relevant information about the papers. The final output of this experiment is a RDF Graph that includes all the extracted and reconciled information about the papers and their relations. A simple application is also available to visualize and interact with the KG.",
"applicationCategory": "Research",
"developmentStatus": "active",
"keywords": [
"text",
"research",
"analysis",
"python",
"grobid",
"bert",
"lda",
"topic-modeling",
"ner"
],
"programmingLanguage": [
"Python 3"
],
"operatingSystem": [
"Linux",
"Windows",
"MacOS"
],
"softwareRequirements": [
"Python 3",
"Grobid"
],
"author": [
{
"@type": "Person",
"@id": "https://orcid.org/0009-0005-7696-8995",
"givenName": "Jorge",
"familyName": "Martin",
"email": "jorge.martin.izquierdo@alumnos.upm.es"
},
{
"@type": "Person",
"@id": "https://orcid.org/0009-0004-8215-9978",
"givenName": "Gloria",
"familyName": "Cumia",
"email": "gloria.cumia@alumnos.upm.es"
},
{
"@type": "Person",
"@id": "https://orcid.org/0009-0000-8821-0587",
"givenName": "Marco",
"familyName": "Ciccal",
"email": "marcociccalebaztan@gmail.com"
}
]
}
Dependencies
- sphinx ==7.1.2
- sphinx-rtd-theme ==1.3.0rc1
- sphinx_mdinclude ==0.5.3
- SPARQLWrapper ==2.0.0
- folium ==0.16.0
- gensim ==4.3.2
- omegaconf ==2.3.0
- pandas ==1.5.2
- rdflib ==7.0.0
- reconciler ==0.2.2
- rocrate ==0.10.0
- scikit-learn ==1.3.2
- sentence-transformers ==2.2.2
- streamlit ==1.28.0
- streamlit-agraph ==0.0.45
- streamlit-extras ==0.3.1
- streamlit-lottie ==0.0.5
- streamlit_folium ==0.20.0
- transformers ==4.34.1
- SPARQLWrapper ==2.0.0
- folium ==0.16.0
- omegaconf ==2.3.0
- pandas ==1.5.2
- streamlit ==1.28.0
- streamlit-agraph ==0.0.45
- streamlit-extras ==0.3.1
- streamlit-lottie ==0.0.5
- streamlit_folium ==0.20.0