Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.3%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: infoqualitylab
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 46.9 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 2
  • Open Issues: 1
  • Releases: 0
Created about 4 years ago · Last pushed over 3 years ago
Metadata Files
Readme Citation

README.md

Author Network Analysis

ExRx

The ExRx folder contains the following folders:

  1. raw data : Contains the raw dataset (CSV files) along with data extracted through Scopus.
  2. data : Contains the new and old versions of the databank deposit data.
  3. ExRx Data Quality : Contains the following Python notebooks and outputs used to generate the final data file.
    • exrxauthor.csv - Author details retrieved from Scopus, followed by a manual check that added author IDs for papers that could not be retrieved from Scopus by article name/DOI.
    • dataqualitytitle.ipynb - Matches the article titles in Articlelist.csv against those in the Scopus-retrieved exrxauthor.csv.
    • dataqualityauthoridcheck.ipynb - Checks for oversplitting and undersplitting issues, e.g. authors with the same name who have different IDs. After the manual changes were made, the final file exrxauthorfinal.csv was created by Manasi on 4/2/2022.
    • authornamecomponentcheck.ipynb - Checks each cluster in the visualization to make sure that there are no suspiciously similar names in the same cluster.
    • Final manual changes were made to the exrxauthorfinal.csv file, and the dataset is in the folder data/databankdepositv2 as of 4/9/2022.
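The author ID check described above can be illustrated with a small, dependency-free sketch. It is not the code in dataqualityauthoridcheck.ipynb; the function name `find_split_candidates` and the `(author_id, author_name)` input shape are assumptions made for illustration. It only flags candidates, which would still need the manual review the README describes.

```python
from collections import defaultdict


def find_split_candidates(rows):
    """Flag author records that may need manual review.

    rows: iterable of (author_id, author_name) pairs (hypothetical input shape).
    Returns two dicts:
      - same_name_ids: normalised name -> sorted IDs, for names with more than one ID
      - id_name_variants: author ID -> sorted name spellings, for IDs with more than one name
    """
    ids_by_name = defaultdict(set)
    names_by_id = defaultdict(set)
    for author_id, name in rows:
        # Normalise case and whitespace so trivial differences do not hide a match.
        key = " ".join(name.lower().split())
        ids_by_name[key].add(author_id)
        names_by_id[author_id].add(name.strip())

    same_name_ids = {n: sorted(ids) for n, ids in ids_by_name.items() if len(ids) > 1}
    id_name_variants = {i: sorted(ns) for i, ns in names_by_id.items() if len(ns) > 1}
    return same_name_ids, id_name_variants


if __name__ == "__main__":
    sample = [("1", "Jane Doe"), ("2", "jane  doe"), ("3", "A. Smith"), ("3", "Alan Smith")]
    same_name_ids, id_name_variants = find_split_candidates(sample)
    print(same_name_ids)
    print(id_name_variants)
```

A name that maps to several IDs is a possible oversplitting case, while an ID that carries several different names may be a harmless spelling variant or a sign of undersplitting; either way the decision is left to a human reviewer.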

  4. CoAuthor Network Visualization : This folder contains the following two folders.
  • python_code -
    • Generating co-author network.ipynb - A Python notebook that generates co-author visualizations (the entire graph and the main connected component graph). It also produces tables that help in understanding the nodes and the co-author network more broadly, i.e. degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, effective size, and the top 10 authors with the most co-authors.
    • edgenodecolorreviewstudy_exrx.ipynb - A Python notebook that reads the data from the author and article list datasets and, where the same author ID appears under different names, replaces them with a single name. The visualizations are split into two kinds, edge color and node color. Edge color - displays the review article percentage for each edge, i.e. whether the co-authors are 100% review or publisher. Node color - displays the review article type percentage calculated for each author (node in the graph), i.e. whether the author is a 100% reviewer/publisher.
    • edgenodecolorpublicationsexrx.ipynb - A Python notebook that reads the data from the author and article list datasets and, where the same author ID appears under different names, replaces them with a single name. The difference is that the authors included in the graph depend on the number of publications per author: the threshold is set to 3, so only authors with 3 or more publications are included in the graph as nodes. The visualizations are split into two kinds, edge color and node color. Edge color - displays the review article percentage for each edge, i.e. whether the co-authors are 100% review or publisher. Node color - displays the review article type percentage calculated for each author (node in the graph), i.e. whether the author is a 100% reviewer/publisher.

The colors of the graph's edges and nodes can be changed in the edge_cmap (edge) or cmap (node) section, as noted in the code comments.

  • Visualizations - Contains the output of each graph/visualization generated by the Python notebooks listed above.
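To make the co-author network step more concrete, here is a minimal sketch using only the Python standard library. It is not taken from the notebooks: the input shape (a dict mapping article IDs to author ID lists) and all function names are assumptions made for illustration. It covers three of the outputs listed above: degree centrality, the main connected component, and the authors with the most co-authors.

```python
from itertools import combinations


def build_coauthor_graph(papers):
    """papers: dict of article id -> list of author ids (hypothetical input shape).

    Returns an adjacency dict: author id -> set of co-author ids.
    """
    graph = {}
    for authors in papers.values():
        unique = sorted(set(authors))
        for author in unique:
            graph.setdefault(author, set())  # keep single-author papers as isolated nodes
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Degree of each node divided by (number of nodes - 1)."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def largest_component(graph):
    """Return the set of nodes in the largest connected component."""
    seen, best = set(), set()
    for start in graph:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:  # depth-first traversal from the start node
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    stack.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def top_coauthors(graph, k=10):
    """Author ids ordered by number of co-authors (ties broken by id)."""
    return sorted(graph, key=lambda node: (-len(graph[node]), node))[:k]


if __name__ == "__main__":
    papers = {"p1": ["A", "B", "C"], "p2": ["C", "D"], "p3": ["E", "F"], "p4": ["G"]}
    graph = build_coauthor_graph(papers)
    print(degree_centrality(graph))
    print(largest_component(graph))
    print(top_coauthors(graph, k=3))
```

Betweenness, closeness, and eigenvector centrality and effective size are more involved and are usually computed with a graph library rather than by hand; they are left out of this sketch.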

Salt

The Salt folder contains the following folders:

  1. data_prep : Contains the raw dataset (CSV files) along with data extracted through Scopus.
  2. data : Contains the new and old versions of the databank deposit datasets after correction.
  3. data quality : Contains the following Python notebooks and outputs used to generate the final data file.
    • aftercorrection.csv - Author details retrieved from Scopus, followed by a manual check that added author IDs for papers that could not be retrieved from Scopus by article name/DOI.
    • dataqualitycheckzoiquestion.ipynb - Matches the article titles in Articlelist.csv against those in the Scopus-retrieved exrxauthor.csv.
    • undersplittingchecking.csv - Checks for oversplitting and undersplitting issues, e.g. authors with the same name who have different IDs.
    • dataqualitycheck.ipynb - Checks each cluster in the visualization to make sure that there are no suspiciously similar names in the same cluster.
    • Final manual changes were made to the aftercorrections.csv file, and the dataset is in the folder data/databankdepositv2 as of 4/9/2022.
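The title-matching step can be sketched as below. This is an illustration under stated assumptions, not the logic of the notebook itself: the function names, the normalisation rule, and the 0.9 similarity cutoff are all hypothetical choices. Titles are first compared after normalisation, and remaining titles fall back to a fuzzy match with `difflib.get_close_matches` from the standard library.

```python
import difflib


def normalize_title(title):
    """Lower-case the title, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def match_titles(article_titles, scopus_titles, cutoff=0.9):
    """Pair each article-list title with a Scopus title when one is close enough.

    Returns (matched, unmatched): a dict of article title -> Scopus title,
    and a list of article titles with no sufficiently similar Scopus title.
    The cutoff is an assumed value and would need tuning on real data.
    """
    scopus_by_key = {normalize_title(t): t for t in scopus_titles}
    keys = list(scopus_by_key)
    matched, unmatched = {}, []
    for title in article_titles:
        key = normalize_title(title)
        if key in scopus_by_key:  # exact match after normalisation
            matched[title] = scopus_by_key[key]
            continue
        close = difflib.get_close_matches(key, keys, n=1, cutoff=cutoff)
        if close:  # fuzzy match above the cutoff
            matched[title] = scopus_by_key[close[0]]
        else:
            unmatched.append(title)
    return matched, unmatched


if __name__ == "__main__":
    articles = ["Exercise and Health: A Review", "Salt intake in adults", "Unrelated paper"]
    scopus = ["Exercise and health - a review", "Salt intake in adult"]
    matched, unmatched = match_titles(articles, scopus)
    print(matched)
    print(unmatched)
```

Titles left in `unmatched` correspond to the papers that, per the description above, need author IDs added by hand.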

  4. CoAuthor Network Visualization : This folder contains the following two folders.
  • python_code -

    • Generating co-author network-salt.ipynb - A Python notebook that generates co-author visualizations (the entire graph and the main connected component graph). It also produces tables that help in understanding the nodes and the co-author network more broadly, i.e. degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, effective size, and the top 10 authors with the most co-authors.
    • edgenodecolorreviewstudy_salt.ipynb - A Python notebook that reads the data from the author and article list datasets and, where the same author ID appears under different names, replaces them with a single name. The visualizations are split into two kinds, edge color and node color. Edge color - displays the review article percentage for each edge, i.e. whether the co-authors are 100% review or publisher. Node color - displays the review article type percentage calculated for each author (node in the graph), i.e. whether the author is a 100% reviewer/publisher.
    • edgenodecolorpublicationssalt.ipynb - A Python notebook that reads the data from the author and article list datasets and, where the same author ID appears under different names, replaces them with a single name. The difference is that the authors included in the graph depend on the number of publications per author: the threshold is set to 3, so only authors with 3 or more publications are included in the graph as nodes. The visualizations are split into two kinds, edge color and node color. Edge color - displays the review article percentage for each edge, i.e. whether the co-authors are 100% review or publisher. Node color - displays the review article type percentage calculated for each author (node in the graph), i.e. whether the author is a 100% reviewer/publisher.

    The colors of the graph's edges and nodes can be changed in the edge_cmap (edge) or cmap (node) section, as noted in the code comments.

  • Visualizations - Contains the output of each graph/visualization generated by the Python notebooks listed above.

  5. community_detection ?? (Yuanxi)
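The publication threshold and the per-author review percentage used for node coloring in the notebooks above can be sketched as follows. This is a hedged illustration, not the notebook code: the `(author_id, article_id, is_review)` record shape and the function name `review_percent_by_author` are assumptions, and only the default threshold of 3 comes from the description above.

```python
from collections import defaultdict


def review_percent_by_author(records, min_publications=3):
    """Compute the percentage of review articles for each sufficiently productive author.

    records: iterable of (author_id, article_id, is_review) tuples (hypothetical shape).
    Authors with fewer than min_publications distinct articles are left out,
    mirroring the threshold of 3 described for the publications notebooks.
    """
    articles_by_author = defaultdict(dict)
    for author_id, article_id, is_review in records:
        # Keying by article id means a repeated row is counted only once.
        articles_by_author[author_id][article_id] = bool(is_review)

    percentages = {}
    for author_id, by_article in articles_by_author.items():
        total = len(by_article)
        if total < min_publications:
            continue  # below the threshold: not a node in the graph
        reviews = sum(by_article.values())
        percentages[author_id] = 100.0 * reviews / total
    return percentages


if __name__ == "__main__":
    records = [
        ("A", "p1", True), ("A", "p2", True), ("A", "p3", True),
        ("B", "p1", True), ("B", "p4", False), ("B", "p5", False), ("B", "p6", False),
        ("C", "p1", True),
    ]
    print(review_percent_by_author(records))
```

Values of 100.0 and 0.0 mark authors whose included publications are all reviews or all non-reviews; the resulting numbers are the kind of per-node values a colormap such as the cmap setting mentioned above would map to colors.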

Owner

  • Name: InfoQualityLab
  • Login: infoqualitylab
  • Kind: organization

Information Quality Lab

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Manasi"
  given-names: "Joshi"
  orcid: "https://orcid.org/0000-0003-2176-4870"
- family-names: "Yuanxi"
  given-names: "Fu"
  orcid: "https://orcid.org/0000-0003-2726-0999"
title: "Author Network Analysis"
version: 1.0.0
doi:
date-released: 2022-05-16
url: "https://github.com/infoqualitylab/author_network_analysis"

GitHub Events

Total
  • Fork event: 1
Last Year
  • Fork event: 1