Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: JorgeMIng
  • License: apache-2.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 71.9 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 3
Created about 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation Codemeta

README.md

Article_Graph

Article_Graph is a tool that extracts and enriches information from a set of academic papers and journals.

It uses Grobid to extract the relevant information from each paper, and machine learning techniques such as topic modeling, named entity recognition (NER), and similarity analysis to enrich it.

The final output of this experiment is an RDF graph that includes all the extracted and reconciled information about the papers and their relations.

A simple application is also available to visualize and interact with the resulting knowledge graph (KG).
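To make the idea of an RDF graph of papers and their relations more concrete, here is a minimal sketch that writes a few statements as N-Triples using only the standard library. The `http://example.org/` namespace and the `title` and `cites` properties are hypothetical placeholders, not the vocabulary this project actually uses.

```python
# Minimal sketch: hand-written N-Triples lines for two papers.
# The example.org namespace and the property names are hypothetical.

EX = "http://example.org/"


def uri(local_name: str) -> str:
    """Wrap a local name in the hypothetical namespace as an N-Triples IRI."""
    return f"<{EX}{local_name}>"


def literal(text: str) -> str:
    """Encode a plain string literal, escaping backslashes and double quotes.

    Newlines and other control characters are not handled in this sketch.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one subject-predicate-object statement as an N-Triples line."""
    return f"{subject} {predicate} {obj} ."


triples = [
    triple(uri("paper/1"), uri("title"), literal("A first paper")),
    triple(uri("paper/2"), uri("title"), literal('A "second" paper')),
    triple(uri("paper/2"), uri("cites"), uri("paper/1")),
]

print("\n".join(triples))
```

In practice a library such as rdflib (listed in requirements.txt) would handle serialization. The sketch only illustrates the shape of the data.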

Requirements

Python >= 3.11 is required for running the experiments.
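A quick way to confirm the interpreter meets this requirement before installing anything is sketched below. The helper name `meets_minimum` is an illustrative assumption and is not part of the repository.

```python
import sys

MINIMUM = (3, 11)  # minimum version stated in the requirement above


def meets_minimum(version, minimum=MINIMUM) -> bool:
    """Return True when the (major, minor, ...) version is at least the minimum."""
    return tuple(version[:2]) >= tuple(minimum)


if not meets_minimum(sys.version_info):
    found = sys.version.split()[0]
    print(f"Python {MINIMUM[0]}.{MINIMUM[1]}+ is required, found {found}")
```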

Grobid is required for the first step of the pipeline; follow its installation instructions.

PDF_ArticleAnalyzer is required to interact with the Grobid service; follow its installation instructions.

Running the Application with Docker and the KG on a Remote Server

To try the application with the pregenerated graph (also available under the rdf directory), follow these steps:

  1. Clone the repository:

git clone https://github.com/JorgeMIng/Article_Graph
cd Article_Graph

  2. Build the Docker image:

docker build -t graph_tool docker

  3. Run the image:

docker run -p 8501:8501 graph_tool

By default, the application loads the KG generated in examples/article_graph.ipynb from a remote server at http://yordi111nas.synology.me:3030/articles/query. The graph is also available under the rdf directory.
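The URL above looks like a SPARQL query endpoint, so it can in principle be queried directly. The sketch below builds a SPARQL Protocol GET URL and flattens a SPARQL JSON results document, using only the standard library. The query is deliberately generic because the graph's vocabulary is not documented here, the helper names are illustrative, and the server may not be reachable.

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the text above; it may not be reachable.
ENDPOINT = "http://yordi111nas.synology.me:3030/articles/query"

# Generic query that makes no assumption about the graph's vocabulary.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_query_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL with the query URL-encoded."""
    return f"{endpoint}?{urlencode({'query': query})}"


def parse_bindings(payload: str) -> list:
    """Flatten a SPARQL JSON results document into rows of plain values."""
    bindings = json.loads(payload)["results"]["bindings"]
    return [{name: cell["value"] for name, cell in row.items()} for row in bindings]


print(build_query_url(ENDPOINT, QUERY))
```

To actually run the query, pass the URL to `urllib.request.urlopen` with an `Accept: application/sparql-results+json` header and feed the decoded response body to `parse_bindings`. The SPARQLWrapper package listed in requirements_app.txt wraps the same protocol.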

Running the Application from Source with the KG on a Remote Server

To try the application from source with the pregenerated graph (also available under the rdf directory), follow these steps:

  1. Clone the repository:

git clone https://github.com/JorgeMIng/Article_Graph
cd Article_Graph

  2. Create a Python environment (conda is recommended):

conda create -n article-graph-3.11 python=3.11
conda activate article-graph-3.11

  3. Install all the dependencies:

pip install -r requirements_app.txt

  4. Execute the application:

python Start.py

By default, the application loads the KG generated in examples/article_graph.ipynb from a remote server at http://yordi111nas.synology.me:3030/articles/query.

Running the Application with a Custom KG Hosted Locally

To try the application with a different graph that you generate and host locally, follow these steps:

  1. Clone the repository:

git clone https://github.com/JorgeMIng/Article_Graph
cd Article_Graph

  2. Create a Python environment (conda is recommended):

conda create -n article-graph-3.11 python=3.11
conda activate article-graph-3.11

  3. Install all the dependencies:

pip install -r requirements_app.txt

  4. Host the KG in Jena Fuseki with Docker:

docker run -p 3030:3030 stain/jena-fuseki

  5. Execute the application:

python Start.py

  6. In the application, go to the Settings section and point the remote server setting at your local Fuseki instance.
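The steps above start Fuseki but do not show how a graph gets into it. One option, sketched below under several assumptions, is the SPARQL Graph Store HTTP Protocol, which Fuseki exposes by default under a dataset's `data` service. The dataset name `articles`, the Turtle content type, and the absence of authentication are all assumptions, and the dataset must already exist in Fuseki. The sketch only prepares the request and does not send it.

```python
from urllib.request import Request


def build_upload_request(base_url: str, dataset: str, rdf_data: bytes,
                         content_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) a Graph Store Protocol POST to the default graph."""
    url = f"{base_url.rstrip('/')}/{dataset}/data?default"
    return Request(url, data=rdf_data,
                   headers={"Content-Type": content_type}, method="POST")


# Hypothetical values: a local Fuseki on port 3030 and a dataset named "articles".
sample = b"<http://example.org/a> <http://example.org/b> <http://example.org/c> ."
request = build_upload_request("http://localhost:3030", "articles", sample)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the upload. Depending on how the Fuseki container is configured, credentials may be required, and the Fuseki web interface offers an upload form as an alternative.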

Running the Experiments

To reproduce the experiments yourself, follow these steps:

  1. Clone the repository:

git clone https://github.com/JorgeMIng/Article_Graph
cd Article_Graph

  2. Create a Python environment (conda is recommended):

conda create -n article-graph-3.11 python=3.11
conda activate article-graph-3.11

  3. Install all the dependencies:

pip install -r requirements.txt

  4. Run the example notebook at examples/article_graph.ipynb

Examples

  • Full KG Generation: examples/article_graph.ipynb
  • Similarity Analysis: examples/examples_similarity.ipynb
  • Topic Modeling: examples/topic_modeling.ipynb
  • NER Analysis and Project Extraction: ner/extract_element.ipynb

License

Please refer to the LICENSE file.

Authors

  • Jorge Martín Izquierdo
  • Gloria Cumia Espinosa de los Monteros
  • Marco Ciccalè Baztán

Owner

  • Login: JorgeMIng
  • Kind: user

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "license": "https://spdx.org/licenses/Apache-2.0",
  "codeRepository": "git+https://github.com/JorgeMIng/Article_Graph",
  "dateCreated": "2024-04-11",
  "datePublished": "2024-05-22",
  "dateModified": "2024-05-22",
  "name": "Article_Graph",
  "version": "1.0.0",
  "identifier": "https://zenodo.org/doi/10.5281/zenodo.11242795",
  "description": "Article_Graph is a tool that extracts and enriches information from a set of academic papers and journals. It makes use of advanced and powerful machine learning tools to extract as much information as possible. Also, it uses Grobid to extract all the relevant information about the papers. The final output of this experiment is a RDF Graph that includes all the extracted and reconciled information about the papers and their relations. A simple application is also available to visualize and interact with the KG.",
  "applicationCategory": "Research",
  "developmentStatus": "active",
  "keywords": [
    "text",
    "research",
    "analysis",
    "python",
    "grobid",
    "bert",
    "lda",
    "topic-modeling",
    "ner"
  ],
  "programmingLanguage": [
    "Python 3"
  ],
  "operatingSystem": [
    "Linux",
    "Windows",
    "MacOS"
  ],
  "softwareRequirements": [
    "Python 3",
    "Grobid"
  ],
  "author": [
    {
      "@type": "Person",
      "@id": "https://orcid.org/0009-0005-7696-8995",
      "givenName": "Jorge",
      "familyName": "Martin",
      "email": "jorge.martin.izquierdo@alumnos.upm.es"
    },
    {
      "@type": "Person",
      "@id": "https://orcid.org/0009-0004-8215-9978",
      "givenName": "Gloria",
      "familyName": "Cumia",
      "email": "gloria.cumia@alumnos.upm.es"
    },
    {
      "@type": "Person",
      "@id": "https://orcid.org/0009-0000-8821-0587",
      "givenName": "Marco",
      "familyName": "Ciccalè",
      "email": "marcociccalebaztan@gmail.com"
    }
  ]
}
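CodeMeta date properties are expected to be ISO 8601 dates (YYYY-MM-DD), so a small sanity check can be useful when maintaining a file like this one. The sketch below flags date values that `date.fromisoformat` rejects and lists author names. The helper names and the sample record are hypothetical and only exercise the functions.

```python
from datetime import date

DATE_FIELDS = ("dateCreated", "datePublished", "dateModified")


def unparseable_dates(metadata: dict) -> dict:
    """Return the date fields whose string values date.fromisoformat rejects."""
    problems = {}
    for field in DATE_FIELDS:
        value = metadata.get(field)
        if value is None:
            continue
        try:
            date.fromisoformat(value)
        except ValueError:
            problems[field] = value
    return problems


def author_names(metadata: dict) -> list:
    """Join givenName and familyName for each author entry."""
    return [f"{a.get('givenName', '')} {a.get('familyName', '')}".strip()
            for a in metadata.get("author", [])]


# Hypothetical excerpt used only to exercise the helpers.
sample = {
    "dateCreated": "2024-04-11",
    "datePublished": "22-05-2024",
    "author": [{"givenName": "Ada", "familyName": "Lovelace"}],
}
print(unparseable_dates(sample))
print(author_names(sample))
```

To check a real file, load it with `json.load` and pass the resulting dictionary to both helpers.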

Dependencies

docs/requirements.txt pypi
  • sphinx ==7.1.2
  • sphinx-rtd-theme ==1.3.0rc1
  • sphinx_mdinclude ==0.5.3
requirements.txt pypi
  • SPARQLWrapper ==2.0.0
  • folium ==0.16.0
  • gensim ==4.3.2
  • omegaconf ==2.3.0
  • pandas ==1.5.2
  • rdflib ==7.0.0
  • reconciler ==0.2.2
  • rocrate ==0.10.0
  • scikit-learn ==1.3.2
  • sentence-transformers ==2.2.2
  • streamlit ==1.28.0
  • streamlit-agraph ==0.0.45
  • streamlit-extras ==0.3.1
  • streamlit-lottie ==0.0.5
  • streamlit_folium ==0.20.0
  • transformers ==4.34.1
requirements_app.txt pypi
  • SPARQLWrapper ==2.0.0
  • folium ==0.16.0
  • omegaconf ==2.3.0
  • pandas ==1.5.2
  • streamlit ==1.28.0
  • streamlit-agraph ==0.0.45
  • streamlit-extras ==0.3.1
  • streamlit-lottie ==0.0.5
  • streamlit_folium ==0.20.0
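Comparing the two pinned lists shows which packages are needed only for the experiments and not for the application. The sketch below parses pins in the form shown above and computes that difference. The parser is a simplified assumption that handles only `name ==version` lines, not the full requirements-file syntax.

```python
def parse_pins(text: str) -> dict:
    """Map normalized package name -> pinned version for 'name ==1.2.3' lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip().lstrip("•").strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Treat '_' and '-' as equivalent and ignore case when comparing names.
        pins[name.strip().lower().replace("_", "-")] = version.strip()
    return pins


# Pins copied from the requirements.txt and requirements_app.txt lists above.
FULL = """
SPARQLWrapper ==2.0.0
folium ==0.16.0
gensim ==4.3.2
omegaconf ==2.3.0
pandas ==1.5.2
rdflib ==7.0.0
reconciler ==0.2.2
rocrate ==0.10.0
scikit-learn ==1.3.2
sentence-transformers ==2.2.2
streamlit ==1.28.0
streamlit-agraph ==0.0.45
streamlit-extras ==0.3.1
streamlit-lottie ==0.0.5
streamlit_folium ==0.20.0
transformers ==4.34.1
"""

APP = """
SPARQLWrapper ==2.0.0
folium ==0.16.0
omegaconf ==2.3.0
pandas ==1.5.2
streamlit ==1.28.0
streamlit-agraph ==0.0.45
streamlit-extras ==0.3.1
streamlit-lottie ==0.0.5
streamlit_folium ==0.20.0
"""

experiment_only = sorted(set(parse_pins(FULL)) - set(parse_pins(APP)))
print(experiment_only)
```

The result is the set of graph-building and machine learning packages (gensim, rdflib, reconciler, rocrate, scikit-learn, sentence-transformers, transformers), which is consistent with the application needing only the lighter requirements_app.txt.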