openscience

Repository for Artificial Intelligence And Open Science In Research Software Engineering deliverables

https://github.com/mrgg14/openscience

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.4%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: MrGG14
  • License: apache-2.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 89.3 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 2
  • Open Issues: 0
  • Releases: 5
Created about 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation Codemeta

README.md

OpenScience

## Description

Repository for Artificial Intelligence And Open Science. The aim of this repository is to create a Grobid client that analyzes 30 open-access articles and performs the following tasks:

  • Extract metadata such as abstract, authors, publication date, referenced authors and organizations.

  • Compare papers by assigning a similarity metric between them.

  • Perform topic modelling using LDA.

  • Build a local Knowledge Graph in RDF.

  • Expand the Knowledge Graph with external information.
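As an illustration of the paper-comparison task listed above, the following is a minimal sketch of one common similarity metric: cosine similarity over word counts. The README does not state which metric the repository actually uses, so the choice of metric and the function names `tokenize` and `cosine_similarity` are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Only words present in both texts contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_a * norm_b)
```

Applied to two abstracts, the function returns a value between 0 (no words in common) and 1 (identical word distributions).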

## Requirements

  • Python >= 3.5

  • Grobid library (uses requests as its only dependency beyond the Python standard library).

  • Docker

## Installation instructions

First, install Grobid from Docker as specified in Grobid's containers documentation. The CRF-only image is recommended.

To install it, start the Docker daemon and execute:

docker pull lfoppiano/grobid:0.8.0

Then clone this repository locally.

Finally, install the packages required to run the code. Open a terminal in the cloned repository, change to the 'docs' folder (cd docs), and run:

pip install -r requirements.txt

You can also install Grobid's Python client by following its installation instructions.

## Execution instructions

### Locally

Initialize Grobid (start the Docker container pulled during installation).

Once Grobid is up and running, place the papers you want to analyze in the 'papers' folder.
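To confirm that the service is up before placing papers and running the analysis, a small health check can help. The sketch below assumes the container is reachable on Grobid's default port 8070 on the local machine and uses the service's `/api/isalive` endpoint. The function name `grobid_is_alive` and the `opener` parameter (present only so the check can be exercised without a running server) are hypothetical and not part of this repository.

```python
import urllib.error
import urllib.request

# Assumption: Grobid is exposed on its default port 8070 on this machine.
GROBID_URL = "http://localhost:8070"


def grobid_is_alive(base_url=GROBID_URL, timeout=5, opener=urllib.request.urlopen):
    """Return True if the Grobid service answers its health-check endpoint."""
    try:
        with opener(f"{base_url}/api/isalive", timeout=timeout) as response:
            # The endpoint is expected to answer with the text "true".
            return response.read().decode("utf-8").strip().lower() == "true"
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar errors.
        return False
```

Calling `grobid_is_alive()` before running 'main.py' gives an early, readable failure if the container is not running.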

Finally just execute the 'main.py' file. The outputs generated will be in the 'output' folder.

After that, run the 'modelling.ipynb' notebook, which collects and processes all the data needed to create the Knowledge Graph (KG) and stores it in the 'papers_metadata.pickle' file.
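The storage step can be pictured as a plain pickle round trip. This is a minimal sketch, not the notebook's actual code: the helper names `save_metadata` and `load_metadata` and the dictionary layout in the usage example are assumptions, since the README does not describe the structure of 'papers_metadata.pickle'.

```python
import pickle
from pathlib import Path


def save_metadata(metadata, path):
    """Write the metadata dictionary to a pickle file."""
    with Path(path).open("wb") as handle:
        pickle.dump(metadata, handle)


def load_metadata(path):
    """Read the metadata dictionary back from a pickle file."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    # Hypothetical layout: one entry per paper, keyed by file name.
    example = {"paper1.pdf": {"title": "A Sample Paper", "authors": ["A. Author"]}}
    save_metadata(example, "papers_metadata.pickle")
    print(load_metadata("papers_metadata.pickle") == example)
```

Pickle files can execute arbitrary code when loaded, so only open pickles produced by your own run of the notebook.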

Once the dictionary has been created, run the 'dicttordf.py' file. It converts all the information collected in the previous step to RDF format and enriches the KG with information from Wikidata and ORCID, using their APIs. The resulting KG is stored in the 'knowledgegraphlinked.rdf' file.
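To make the dictionary-to-RDF conversion concrete, here is a minimal sketch that turns one paper dictionary into N-Triples using only the standard library. It is not the code in 'dicttordf.py': the output syntax (N-Triples instead of whatever RDF serialization the script writes), the Dublin Core properties, the `example.org` URI, and the dictionary keys `title` and `authors` are all assumptions chosen for illustration. It also leaves out the Wikidata and ORCID enrichment.

```python
# Assumption: Dublin Core elements are used for titles and authors.
DC = "http://purl.org/dc/elements/1.1/"


def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def paper_to_ntriples(paper_uri, paper):
    """Return N-Triples text describing one paper dictionary."""
    lines = [f'<{paper_uri}> <{DC}title> "{escape_literal(paper["title"])}" .']
    for author in paper.get("authors", []):
        lines.append(f'<{paper_uri}> <{DC}creator> "{escape_literal(author)}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical paper entry and URI.
    paper = {"title": "A Sample Paper", "authors": ["A. Author", "B. Author"]}
    print(paper_to_ntriples("http://example.org/paper/1", paper))
```

In practice an RDF library handles serialization, namespaces, and merging with external graphs, which is why a hand-written serializer like this one is only suitable as an illustration.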

Lastly, use the 'querys.ipynb' notebook to run SPARQL queries, such as those you would write in any web-based SPARQL query editor.

## Running example

We will run an example using 10 Deep Learning papers in PDF format located in the 'papers' folder.

First, we execute the 'main.py' file, which writes its results to the 'output' folder.

After that, we run 'modelling.ipynb', which extracts all paper metadata and enriches the data with similarity metrics between papers and topic modelling.
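The metadata extraction can be sketched as follows, assuming the notebook works from the TEI XML that Grobid returns for each PDF. The function name `extract_title_and_abstract` and the hand-written sample document are assumptions. A real Grobid result is far larger, and the notebook may extract the fields differently.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def extract_title_and_abstract(tei_xml):
    """Return (title, abstract) from a TEI document given as a string."""
    root = ET.fromstring(tei_xml)
    title = root.find(".//tei:titleStmt/tei:title", TEI_NS)
    abstract = root.find(".//tei:profileDesc/tei:abstract", TEI_NS)

    def text_of(element):
        # Join all nested text and collapse runs of whitespace.
        if element is None:
            return ""
        return " ".join("".join(element.itertext()).split())

    return text_of(title), text_of(abstract)


# Hand-written sample, far smaller than a real Grobid result.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc><titleStmt><title>A Sample Paper</title></titleStmt></fileDesc>
    <profileDesc><abstract><p>We study a sample problem.</p></abstract></profileDesc>
  </teiHeader>
</TEI>"""

print(extract_title_and_abstract(SAMPLE_TEI))
# ('A Sample Paper', 'We study a sample problem.')
```

Missing elements yield empty strings rather than errors, which keeps a batch run going when a paper lacks an abstract.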

Next, to generate our Knowledge Graph and merge the local graph with entities from Wikidata and ORCID, we run 'dicttordf.py', which serializes the previously extracted data in RDF.

Finally, we can run any queries we like with 'querys.ipynb'. The file contains some examples, but you can write any query you like.

## Preferred citation

Read the CFF file (CITATION.cff).

## Where to get help

  • Grobid's documentation

  • Docker's documentation

## Problems

Installing Grobid's Python client inside the Docker image appears to fail. The Dockerfile tries to install it both by following Grobid's steps (setup.py install) and directly with pip, and neither seems to install it correctly. Both methods work locally, and the cause of the error is unknown.

Owner

  • Name: Nico Vega
  • Login: MrGG14
  • Kind: user

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Open Science assesment 1
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Nicolas
    family-names: Vega Muñoz
    email: nicovegamunoz@gmail.com
  - given-names: Jorge
    family-names: Saenz de Miera Marzo
    email: 
  - given-names: Daniel
    family-names: Fernandez Gomez
    email: danifg42@gmail.com
identifiers:
  - type: doi
    value: 10.5281/zenodo.11244385
  - type: url
    value: 'https://github.com/MrGG14/OpenScience'
repository-code: 'https://github.com/MrGG14/OpenScience'
abstract: Analysis over 10 open-access articles using Grobid
keywords:
  - Gobrid
  - keyword cloud
  - Open Science
license: Apache-2.0
version: '1'
date-released: '2024-02-25'

CodeMeta (Codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "license": "https://spdx.org/licenses/Apache-2.0",
  "codeRepository": "https://github.com/MrGG14/OpenScience",
  "dateCreated": "2024-02-25",
  "datePublished": "2024-02-25",
  "name": "Gobrid_Article_Analizer",
  "version": "1.0.0",
  "identifier": "10.5281/zenodo.10702189",
  "description": "This software performs an analysis over 10\nopen-access articles being able to draw a keyword cloud based on the words found in the abstract of your papers, create a visualization showing the number of figures per article and create a list of the links found in each paper.",
  "applicationCategory": "Education",
  "keywords": [
    "open science",
    "article",
    "paper",
    "open-access"
  ],
  "programmingLanguage": [
    "Python"
  ],
  "author": [
    {
      "@type": "Person",
      "givenName": "Nicolas",
      "familyName": "Vega"
    }
  ]
}

Dependencies

Dockerfile docker
  • python 3.11-buster build
docker-compose.yml docker
  • lfoppiano/grobid 0.8.0
docs/requirements.txt pypi
  • et-xmlfile ==1.1.0
  • grobid_client ==0.8.5
  • matplotlib ==3.7.1
  • wordcloud ==1.9.3