2023-2024-atreides-code

The repository for team Atreides of the Open Science course, academic year 2023/2024

https://github.com/open-sci/2023-2024-atreides-code

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 34 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.8%) to scientific vocabulary
Last synced: 7 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: open-sci
  • License: ISC
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 19.3 MB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 5
Created about 2 years ago · Last pushed 8 months ago
Metadata Files
Readme License Citation

README.md

Usage

Installation

```sh
# Clone the repository
git clone https://github.com/open-sci/2023-2024-atreides-code

# Move to the repository folder
cd 2023-2024-atreides-code

# Install required dependencies using uv
# uv install options: https://docs.astral.sh/uv/getting-started/installation/
uv sync

# Activate the virtual environment
source .venv/bin/activate
```

Run the software

Create the necessary datasets ('IRIS in Meta' and 'IRIS in Index' are required to answer the research questions) by running the following command:

    python3 -m scripts.create_datasets -meta <path_to_meta_zip> -iris <path_to_iris_zip> [-index <path_to_index_zip>] <dataset_of_choice> [--year_cutoff <year>]

Arguments

  • -meta, --meta_path: Required. The path to the folder (or zip file) containing the OpenCitations Meta dump.
  • -iris, --iris_path: Required. The path to the folder (or zip file) containing the IRIS CSV files.
  • -index, --index_path: The path to the OpenCitations Index dump folder (or zip).
  • -iim, --iris_in_meta: Create the "Iris In Meta" dataset, which contains all the entities with external IDs in IRIS that are in Meta.
  • -iii, --iris_in_index: Create the "Iris In Index" dataset, which contains all the entities with external IDs in IRIS that are in the OpenCitations Index.
  • -inim, --iris_not_in_meta: Create the "Iris Not In Meta" dataset, which contains all the entities with external IDs in IRIS that are not in Meta.
  • -inoid, --iris_no_id: Create the "Iris No ID" dataset, which contains all the entities with no external IDs in IRIS.
  • -yc, --year-cutoff: (Optional) Specify a year cutoff for the mapping of IRIS data. Only entities published in or before this year are included in the new datasets.
  • --search_for_titles: (Experimental) Try to reconcile the IRIS entities without PIDs using their title in OC Meta. This can take around 3 hours to complete.
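
The options above map naturally onto an `argparse` parser. The following is a minimal sketch of how such a command line could be declared. It is not the repository's actual `scripts/create_datasets.py`: the function name `build_parser`, the help strings, and the decision to accept both `--year-cutoff` and `--year_cutoff` (the two spellings that appear in this README) are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the project's own code)."""
    parser = argparse.ArgumentParser(prog="create_datasets")

    # Required inputs: each may be a folder or a zip file.
    parser.add_argument("-meta", "--meta_path", required=True,
                        help="OpenCitations Meta dump (folder or zip)")
    parser.add_argument("-iris", "--iris_path", required=True,
                        help="IRIS CSV files (folder or zip)")

    # Optional input, documented without the 'Required' label.
    parser.add_argument("-index", "--index_path", default=None,
                        help="OpenCitations Index dump (folder or zip)")

    # One boolean switch per dataset that can be created.
    dataset_flags = [
        ("-iim", "--iris_in_meta", "IRIS entities with external IDs found in Meta"),
        ("-iii", "--iris_in_index", "IRIS entities with external IDs found in the Index"),
        ("-inim", "--iris_not_in_meta", "IRIS entities with external IDs not in Meta"),
        ("-inoid", "--iris_no_id", "IRIS entities without external IDs"),
    ]
    for short, long, description in dataset_flags:
        parser.add_argument(short, long, action="store_true", help=description)

    # Assumption: both spellings found in the README are accepted as aliases.
    parser.add_argument("-yc", "--year-cutoff", "--year_cutoff",
                        dest="year_cutoff", type=int, default=None,
                        help="keep only entities published in or before this year")

    parser.add_argument("--search_for_titles", action="store_true",
                        help="(experimental) reconcile entities without PIDs by title")
    return parser


if __name__ == "__main__":
    # Example invocation with placeholder file names.
    args = build_parser().parse_args(
        ["-meta", "meta.zip", "-iris", "iris.zip", "-iim", "-yc", "2023"]
    )
    print(args)
```

In this sketch the dataset switches are independent booleans, so several datasets can be requested in a single run; whether the real script allows that is not stated here.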

Alternatively, you can download the processed datasets from the links provided below and place them in the data/ directory of the repository folder.

Use the following command to get the answers to the research questions:

    python3 -m scripts.answer_research_questions [-rq <research_question_number>]

  • -rq <research_question_number>: (Optional) Specify the research question number to answer a specific question.
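
One way to picture the optional `-rq` switch is as a selector over a table of per-question handlers. The sketch below is hypothetical and does not reproduce `scripts/answer_research_questions.py`: the handler table, the placeholder answers, and in particular the behaviour of running every question when `-rq` is omitted are assumptions.

```python
import argparse
from typing import Callable, Dict, List, Optional


def _placeholder(number: int) -> Callable[[], str]:
    """Build a stand-in handler; a real handler would query the datasets."""
    def answer() -> str:
        return f"answer to research question {number} (placeholder)"
    return answer


# Hypothetical dispatch table: one handler per research question (1 to 5).
HANDLERS: Dict[int, Callable[[], str]] = {n: _placeholder(n) for n in range(1, 6)}


def select_questions(rq: Optional[int]) -> List[int]:
    """Return the question numbers to run.

    Assumption: with no -rq value, every question is answered.
    """
    if rq is None:
        return sorted(HANDLERS)
    if rq not in HANDLERS:
        raise ValueError(f"unknown research question: {rq}")
    return [rq]


def main(argv: Optional[List[str]] = None) -> List[str]:
    parser = argparse.ArgumentParser(prog="answer_research_questions")
    parser.add_argument("-rq", type=int, default=None,
                        help="number of the single research question to answer")
    args = parser.parse_args(argv)
    return [HANDLERS[n]() for n in select_questions(args.rq)]


if __name__ == "__main__":
    for line in main(["-rq", "3"]):
        print(line)
```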

For more detailed guidelines, consult the protocol for the software on protocols.io.

Research questions:

1) What is the coverage of the publications available in IRIS, that strictly concern research conducted within the University of Bologna, in OpenCitations Meta?
2) What are the types of publications that are better covered in OpenCitations Meta?
3) What is the amount of citations (according to OpenCitations Index) the IRIS publications included in OpenCitations Meta are involved in (as citing entity and as cited entity)?
4) How many of these citations come from and go to publications not included in IRIS?
5) How many of these citations involve publications in IRIS as both citing and cited entities?
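
Conceptually, these questions reduce to set operations over identifiers and over (citing, cited) pairs. The sketch below illustrates that reading on toy data. It is not the project's implementation (which works on the full dumps); the function names, the single-letter identifiers, and the publication types are invented for the example, and the numbers it produces say nothing about the real datasets.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Citation = Tuple[str, str]  # (citing_id, cited_id)


def coverage(iris_ids: Set[str], meta_ids: Set[str]) -> float:
    """RQ1: share of IRIS identifiers that also appear in Meta."""
    if not iris_ids:
        return 0.0
    return len(iris_ids & meta_ids) / len(iris_ids)


def coverage_by_type(iris_types: Dict[str, str], meta_ids: Set[str]) -> Dict[str, float]:
    """RQ2: the same share, computed separately for each publication type."""
    total = Counter(iris_types.values())
    covered = Counter(t for pid, t in iris_types.items() if pid in meta_ids)
    return {t: covered[t] / total[t] for t in total}


def citation_counts(citations: Iterable[Citation],
                    iris_in_meta: Set[str],
                    iris_ids: Set[str]) -> Dict[str, int]:
    """RQ3 to RQ5: classify each citation by where its two ends fall."""
    counts = {"as_citing": 0, "as_cited": 0, "internal": 0,
              "to_outside_iris": 0, "from_outside_iris": 0}
    for citing, cited in citations:
        citing_in = citing in iris_in_meta
        cited_in = cited in iris_in_meta
        if citing_in:
            counts["as_citing"] += 1              # RQ3, citing side
        if cited_in:
            counts["as_cited"] += 1               # RQ3, cited side
        if citing_in and cited_in:
            counts["internal"] += 1               # RQ5 (one possible reading)
        if citing_in and cited not in iris_ids:
            counts["to_outside_iris"] += 1        # RQ4, outgoing
        if cited_in and citing not in iris_ids:
            counts["from_outside_iris"] += 1      # RQ4, incoming
    return counts


# Toy data, invented purely for illustration.
iris_types = {"a": "article", "b": "article", "c": "book", "d": "book"}
iris_ids = set(iris_types)
meta_ids = {"a", "b", "c", "x", "y"}
citations = [("a", "b"), ("a", "x"), ("y", "b"), ("x", "y"), ("c", "a")]

iris_in_meta = iris_ids & meta_ids
print(coverage(iris_ids, meta_ids))
print(coverage_by_type(iris_types, meta_ids))
print(citation_counts(citations, iris_in_meta, iris_ids))
```

A citation between two publications outside IRIS, such as `("x", "y")` in the toy list, contributes to none of the counters.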

Download original datasets

Output datasets

Owner

  • Name: open-sci
  • Login: open-sci
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software in your research, please cite it using the metadata below."
title: "2023-2024-atreides-code"
version: "1.0.0"
doi: "10.5281/zenodo.11262416"
date-released: 2024-05-28
authors:
  - family-names: Zilli
    given-names: Leonardo
    orcid: https://orcid.org/0009-0007-4127-4875
    affiliation: University of Bologna
  - family-names: Andreose
    given-names: Erica
    orcid: https://orcid.org/0009-0003-7124-9639
    affiliation: University of Bologna
  - family-names: Di Marzo
    given-names: Salvatore
    orcid: https://orcid.org/0009-0006-0853-1772
    affiliation: University of Bologna

repository-code: https://github.com/open-sci/2023-2024-atreides-code
url: https://github.com/open-sci/2023-2024-atreides-code
license: ISC
keywords:
  - Open Science
  - OpenCitations
  - Bibliographic Metadata
  - Research Software

GitHub Events

Total
  • Release event: 1
  • Push event: 13
  • Create event: 1
Last Year
  • Release event: 1
  • Push event: 13
  • Create event: 1

Dependencies

requirements.txt pypi
  • Requests ==2.31.0
  • SPARQLWrapper ==2.0.0
  • pandas ==2.0.3
  • polars ==0.20.24
  • tqdm ==4.65.0