openscience
Repository for Artificial Intelligence And Open Science In Research Software Engineering deliverables
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ✓ Academic publication links: Links to: zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (16.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: MrGG14
- License: apache-2.0
- Language: Jupyter Notebook
- Default Branch: main
- Size: 89.3 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 5
Metadata Files
README.md
OpenScience
## Description
Repository for Artificial Intelligence And Open Science. The aim of this repository is to create a Grobid client that analyzes 30 open-access articles and:
- Extracts metadata such as abstract, authors, publication date, referenced authors and organizations.
- Compares papers, assigning a similarity metric between them.
- Performs topic modelling using LDA.
- Builds a local Knowledge Graph in RDF.
- Expands the Knowledge Graph with external information.
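As an illustration of the metadata extraction step, the sketch below reads title, abstract and header authors from a Grobid TEI XML string using only the standard library. It is not the code in `main.py`: the function name `extract_metadata` and the returned dictionary layout are assumptions made for this example, and it relies only on the TEI namespace and header elements that Grobid emits.

```python
import xml.etree.ElementTree as ET

# Namespace of the TEI XML documents produced by Grobid.
TEI_URI = "http://www.tei-c.org/ns/1.0"
TEI_NS = {"tei": TEI_URI}
TEI_TAG = "{" + TEI_URI + "}"


def extract_metadata(tei_xml):
    """Return title, abstract and header authors from a TEI string.

    Hypothetical helper: the name and the result layout are assumptions.
    """
    root = ET.fromstring(tei_xml)
    header = root.find("tei:teiHeader", TEI_NS)
    if header is None:
        return {"title": "", "abstract": "", "authors": []}

    title = header.findtext(".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS)

    abstract_el = header.find(".//tei:abstract", TEI_NS)
    abstract = ""
    if abstract_el is not None:
        # Join all nested text and collapse runs of whitespace.
        abstract = " ".join("".join(abstract_el.itertext()).split())

    authors = []
    # Searching only inside the header skips authors of cited references.
    for pers in header.iter(TEI_TAG + "persName"):
        parts = [(child.text or "").strip() for child in pers]
        authors.append(" ".join(p for p in parts if p))

    return {"title": title.strip(), "abstract": abstract, "authors": authors}
```

Referenced authors live in the bibliography part of the TEI body, so a similar loop over that part of the tree would be needed to collect them.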
## Requirements
- Python >= 3.5
- Grobid library (uses requests as its only dependency beyond the Python standard library).
- Docker
## Installation instructions
First, install Grobid from Docker as specified in Grobid's containers documentation. The CRF-only image is recommended.
To install it, start the Docker daemon and execute:
docker pull lfoppiano/grobid:0.8.0
Then clone this repository locally.
Finally, install the packages needed to run the code. Open a terminal in the cloned repository, change to the 'docs' folder (cd docs) and run:
pip install -r requirements.txt
You can also install Grobid's Python client by following its instructions.
## Execution instructions
### Locally
Initialize Grobid.
Once Grobid is up and running, place the papers you want to analyze in the 'papers' folder.
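Before running the analysis it can help to confirm that the Grobid service is reachable. The sketch below polls Grobid's `/api/isalive` endpoint with the standard library. The base URL `http://localhost:8070` assumes the container was started on Grobid's default port, and the helper names are made up for this example.

```python
import urllib.request


def isalive_url(base_url):
    """Build the health-check URL for a Grobid server."""
    return base_url.rstrip("/") + "/api/isalive"


def grobid_is_alive(base_url="http://localhost:8070", timeout=3.0):
    """Return True if the Grobid health check answers with HTTP 200.

    Assumption: Grobid listens on its default port 8070 on this machine.
    """
    try:
        with urllib.request.urlopen(isalive_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts and HTTP errors
        # (urllib's URLError and HTTPError are subclasses of OSError).
        return False


if __name__ == "__main__":
    print("Grobid reachable:", grobid_is_alive())
```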
Then execute the 'main.py' file. The generated outputs will be in the 'output' folder.
After that, run the 'modelling.ipynb' notebook, which collects and processes all the data needed to create the KG and stores it in the 'papers_metadata.pickle' file.
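The hand-off between the notebook and the next script is a pickled dictionary. A minimal sketch of that round trip follows. The dictionary layout shown in the usage note is an assumption, since the actual structure is defined in 'modelling.ipynb'.

```python
import pickle
from pathlib import Path


def save_metadata(metadata, path="papers_metadata.pickle"):
    """Write the metadata dictionary to disk with pickle."""
    with Path(path).open("wb") as fh:
        pickle.dump(metadata, fh)


def load_metadata(path="papers_metadata.pickle"):
    """Read the metadata dictionary back from disk."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)
```

For example, `save_metadata({"paper1.pdf": {"title": "T", "authors": ["A"]}})` followed by `load_metadata()` returns an equal dictionary. Pickle files should only be loaded when they come from a trusted source, because unpickling can execute arbitrary code.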
Once the dictionary has been created, run the 'dicttordf.py' file. It converts all the information collected in the previous step to RDF and enriches the KG with information from Wikidata and ORCID through their APIs, storing the result in the 'knowledgegraphlinked.rdf' file.
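To show what serializing paper metadata as RDF involves, the sketch below writes triples by hand in N-Triples syntax using only the standard library. It does not reproduce 'dicttordf.py': the output format here is N-Triples rather than whatever serialization the '.rdf' file uses, the Dublin Core Terms properties are this example's choice of vocabulary, and the `http://example.org/` IRI in the usage note is a placeholder. A dedicated RDF library would normally be used for real graphs.

```python
# Dublin Core Terms namespace (vocabulary chosen for this example only).
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def paper_to_ntriples(paper_iri, title, authors):
    """Return N-Triples lines describing one paper.

    Hypothetical helper: one title triple plus one creator triple per author.
    """
    lines = ['<%s> <%stitle> "%s" .' % (paper_iri, DCTERMS, escape_literal(title))]
    for name in authors:
        lines.append('<%s> <%screator> "%s" .' % (paper_iri, DCTERMS, escape_literal(name)))
    return "\n".join(lines) + "\n"
```

Calling `paper_to_ntriples("http://example.org/paper/1", "Deep Nets", ["Ada Lovelace"])` yields two triples, one for the title and one for the author.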
Lastly, use the 'querys.ipynb' notebook to run SPARQL queries over the Knowledge Graph.
## Running example
We will run an example using 10 Deep Learning papers in PDF format located in the 'papers' folder.
We just need to execute the 'main.py' file, which writes its results to the 'output' folder.
After that, we run 'modelling.ipynb', which extracts all paper metadata and enriches the data with similarity metrics between papers and topic modelling.
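The README does not say which similarity metric the notebook uses. As one common option, the sketch below computes cosine similarity between two abstracts over raw word counts, using only the standard library. The tokenizer and function names are assumptions for this example, not the notebook's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts over word-count vectors (0.0 to 1.0)."""
    a = Counter(tokenize(text_a))
    b = Counter(tokenize(text_b))
    # Only words present in both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

Identical texts score close to 1.0, texts with no shared words score 0.0, and an empty text scores 0.0 against anything. Weighting schemes such as TF-IDF or embedding-based vectors would change the scores but not the overall shape of the computation.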
Next, to generate our Knowledge Graph and merge the local graph with entities from Wikidata and ORCID, we run 'dicttordf.py', which serializes the previously extracted data in RDF.
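Linking a local entity to Wikidata usually starts with a name search. The sketch below builds a request to the Wikidata `wbsearchentities` API action and picks the first hit from the JSON response. It is an illustration and not the logic of 'dicttordf.py'; the helper names and the `User-Agent` string are assumptions, and taking the first result is a simplification that can mislink ambiguous names.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def wikidata_search_url(name, language="en", limit=1):
    """Build a wbsearchentities request URL for the given name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def first_entity_id(response_json):
    """Return the id of the first search hit (for example 'Q42'), or None."""
    results = response_json.get("search", [])
    return results[0]["id"] if results else None


def lookup_entity(name):
    """Query Wikidata over the network and return the first matching id."""
    request = urllib.request.Request(
        wikidata_search_url(name),
        headers={"User-Agent": "openscience-example/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return first_entity_id(json.load(resp))
```

The returned identifier can then be attached to the local entity as an external link in the graph.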
Finally, we can run any queries we like with 'querys.ipynb'. The file contains some examples, but you can write any query you like.
## Preferred citation
Read CFF
## Where to get help
- Grobid's documentation
- Docker's documentation
## Problems
There seems to be a problem installing Grobid's Python client from Docker. In the Dockerfile we try to install it both by following Grobid's steps (setup.py install) and directly with pip, and neither seems to install it correctly. Locally, however, both methods work, and we do not know where this error comes from.
Owner
- Name: Nico Vega
- Login: MrGG14
- Kind: user
- Repositories: 2
- Profile: https://github.com/MrGG14
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: Open Science assesment 1
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Nicolas
    family-names: Vega Muñoz
    email: nicovegamunoz@gmail.com
  - given-names: Jorge
    family-names: Saenz de Miera Marzo
    email:
  - given-names: Daniel
    family-names: Fernandez Gomez
    email: danifg42@gmail.com
identifiers:
  - type: doi
    value: 10.5281/zenodo.11244385
  - type: url
    value: 'https://github.com/MrGG14/OpenScience'
repository-code: 'https://github.com/MrGG14/OpenScience'
abstract: Analysis over 10 open-access articles using Grobid
keywords:
  - Gobrid
  - keyword cloud
  - Open Science
license: Apache-2.0
version: '1'
date-released: '2024-02-25'
CodeMeta (Codemeta.json)
{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "license": "https://spdx.org/licenses/Apache-2.0",
  "codeRepository": "https://github.com/MrGG14/OpenScience",
  "dateCreated": "2024-02-25",
  "datePublished": "2024-02-25",
  "name": "Gobrid_Article_Analizer",
  "version": "1.0.0",
  "identifier": "10.5281/zenodo.10702189",
  "description": "This software performs an analysis over 10\nopen-access articles being able to draw a keyword cloud based on the words found in the abstract of your papers, create a visualization showing the number of figures per article and create a list of the links found in each paper.",
  "applicationCategory": "Education",
  "keywords": [
    "open science",
    "article",
    "paper",
    "open-access"
  ],
  "programmingLanguage": [
    "Python"
  ],
  "author": [
    {
      "@type": "Person",
      "givenName": "Nicolas",
      "familyName": "Vega"
    }
  ]
}
Dependencies
- python 3.11-buster build
- lfoppiano/grobid 0.8.0
- et-xmlfile ==1.1.0
- grobid_client ==0.8.5
- matplotlib ==3.7.1
- wordcloud ==1.9.3