https://github.com/bluebrain/citation-graph

Blue Brain Citation & Knowledge Graph implementation in Neo4j

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.4%) to scientific vocabulary

Keywords

citation-network knowledge-graph neo4j
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: BlueBrain
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 9.76 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 2
  • Open Issues: 0
  • Releases: 1
Topics
citation-network knowledge-graph neo4j
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog License Authors

README.md

Blue Brain Citation Graph

3D Force Graph

Table of Contents

Introduction

Generating or Loading the Database

Gallery

Funding and Acknowledgement

Introduction

The Blue Brain Citation Graph uses Neo4j and Neo4j Bloom features to support the exploration and analysis of citation data. Key features include:

  • Perspectives: These are specialized views that focus on different aspects of the knowledge graph. In this repository, you will find several perspectives under src/citations/perspectives/:

    • BBP or Not Perspective: This perspective helps in distinguishing between articles published by the Blue Brain Project and those by other collaborators.
    • Timeview Perspective: This perspective provides a temporal view of the citation data, allowing users to analyze trends and changes over time.
    • Topics Perspective: This perspective focuses on thematic clustering, enabling users to explore articles based on their subject matter.
  • Search Phrases: These are predefined queries that facilitate quick access to specific data points within the graph. They are designed to help users efficiently navigate the vast amount of information stored in the database.

  • Scene Actions: These are interactive elements that allow users to manipulate and explore the graph dynamically. Scene actions can include filtering nodes, highlighting specific paths, or triggering animations to better visualize relationships and patterns within the data.

By utilizing these Neo4j technologies, the Blue Brain Citation Graph provides a robust framework for researchers and analysts to gain deeper insights into the citation landscape of the Blue Brain Project and its collaborators.

Generating or Loading the Database

When working with the Blue Brain Citation Graph, you have two primary options: generating the database from scratch or loading an existing database.

Generating the Database

To generate the database, you will need to follow a series of steps that involve gathering articles, authors, and citation data, embedding articles, clustering, and performing dimension reduction. This process requires access to external APIs, specifically the SERP API and the OpenAI API. Please be aware that using these APIs may incur costs, as they are not free services. The generation process is comprehensive and allows for a customized and up-to-date database tailored to your specific needs.
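The clustering and dimension reduction steps can be pictured with a minimal sketch. It assumes the article embeddings are already available as a numeric matrix, and it uses scikit-learn's KMeans and PCA purely as stand-ins; the repository's own pipeline may use different algorithms. The function name `cluster_and_project` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def cluster_and_project(embeddings, n_clusters=3, seed=0):
    """Return a cluster label and 2D coordinates for each embedding row.

    KMeans and PCA are illustrative choices, not necessarily the
    algorithms used by the project.
    """
    # Cluster in the full embedding space.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(embeddings)
    # Project to two dimensions so the clusters can be plotted.
    coords = PCA(n_components=2, random_state=seed).fit_transform(embeddings)
    return labels, coords


# Synthetic stand-in for article embeddings (assumption):
# 3 well-separated groups of 20 vectors in 16 dimensions.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 16))
embeddings = np.vstack([c + rng.normal(size=(20, 16)) for c in centers])

labels, coords = cluster_and_project(embeddings)
print(labels.shape, coords.shape)
```

In a real run, the embeddings would come from the embedding step rather than from random data, and the labels and coordinates would be stored alongside each article.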

For detailed instructions on creating the database, refer to the step-by-step tutorial. Before starting, install the necessary packages in a fresh virtual environment with:

    pip install .

or, for an editable install:

    pip install -e .

The tutorial covers gathering articles and authors, fetching citations, embedding articles, clustering, dimension reduction, and integrating the data into Neo4j. It also includes additional steps for generating keywords and integrating them into the database.
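As an illustration of the final integration step, the sketch below batches citation edges into a single parameterized Cypher statement. The `Article` label, the `uid` property, the `CITES` relationship type, and the helper names are assumptions made for this example, not the schema used by the repository. The `RecordingSession` class is a stand-in so the sketch runs without a database; any object with a `run(query, **params)` method, such as a `neo4j.Session`, could take its place.

```python
# Hypothetical schema: Article nodes keyed by uid, linked by CITES.
CITES_QUERY = """
UNWIND $rows AS row
MERGE (src:Article {uid: row.source})
MERGE (dst:Article {uid: row.target})
MERGE (src)-[:CITES]->(dst)
"""


def to_rows(citations):
    """Turn (source, target) pairs into de-duplicated parameter rows."""
    seen = set()
    rows = []
    for source, target in citations:
        if (source, target) in seen:
            continue  # skip repeated edges
        seen.add((source, target))
        rows.append({"source": source, "target": target})
    return rows


def load_citations(session, citations, batch_size=1000):
    """Send citation edges in batches; return the number of rows sent."""
    rows = to_rows(citations)
    for start in range(0, len(rows), batch_size):
        session.run(CITES_QUERY, rows=rows[start:start + batch_size])
    return len(rows)


class RecordingSession:
    """Stand-in for a database session that only records calls."""

    def __init__(self):
        self.calls = []

    def run(self, query, **params):
        self.calls.append((query, params))


session = RecordingSession()
sent = load_citations(
    session, [("A", "B"), ("A", "B"), ("B", "C")], batch_size=1
)
print(sent, len(session.calls))
```

Using `MERGE` rather than `CREATE` keeps the load idempotent, so re-running it should not duplicate nodes or relationships. With the official driver, a session obtained from `GraphDatabase.driver(...).session()` could be passed in place of `RecordingSession`.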

Loading the Database

In Neo4j Desktop, create a new project, then import the neo4j.dump file by clicking add -> File. Once it is loaded, click the ... next to the imported file and select create new DBMS from dump.

After the loading process is complete, you can start your Neo4j database.

To use the perspectives in Neo4j Bloom, you can import them from here (the perspective files are under src/citations/perspectives/) for an enhanced user experience.

Gallery

Below are some visualizations generated from the citation graph data:

Author Works on Keyword: Visualization of author works on specific keywords.

Author Collaboration Network: A network visualization of author collaborations extracted from the citation data.

Keyword Co-occurrence: Co-occurrence of keywords extracted from articles, highlighting thematic groupings.

Top 3 Keywords Per Year (Node Weighted): Top 3 keywords per year with node weighting.

Top 3 Keywords Per Year (Weighted): Top 3 keywords per year with weighting.

UMAP Cluster Louvain: UMAP clustering using the Louvain method.

These images provide a glimpse into the complex relationships and structures within the citation graph, offering insights into the research landscape.
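The keyword views above rest on simple counting, which can be sketched in plain Python. This is an illustrative sketch only: the record layout (`year` and `keywords` fields), the function names, and the sample articles are assumptions, and the repository may compute these statistics differently, for example directly in Neo4j.

```python
from collections import Counter, defaultdict
from itertools import combinations


def keyword_cooccurrence(articles):
    """Count how often each unordered keyword pair shares an article."""
    counts = Counter()
    for article in articles:
        # Sort so each pair has one canonical order.
        keywords = sorted(set(article["keywords"]))
        counts.update(combinations(keywords, 2))
    return counts


def top_keywords_per_year(articles, k=3):
    """Return the k most frequent keywords for each publication year."""
    per_year = defaultdict(Counter)
    for article in articles:
        per_year[article["year"]].update(set(article["keywords"]))
    top = {}
    for year, counter in sorted(per_year.items()):
        # Break ties alphabetically so the result is deterministic.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        top[year] = [keyword for keyword, _ in ranked[:k]]
    return top


# Hypothetical sample records, not data from the project.
articles = [
    {"year": 2020, "keywords": ["neuron", "simulation", "cortex"]},
    {"year": 2020, "keywords": ["neuron", "cortex"]},
    {"year": 2021, "keywords": ["neuron", "synapse"]},
]

pairs = keyword_cooccurrence(articles)
top = top_keywords_per_year(articles, k=2)
print(pairs.most_common(1), top)
```

The pair counts correspond to edge weights in a co-occurrence graph, and the per-year ranking corresponds to the top-keywords views.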

Funding and Acknowledgement

The development of this software was supported by funding to the Blue Brain Project, a research center of the École polytechnique fédérale de Lausanne (EPFL), from the Swiss government’s ETH Board of the Swiss Federal Institutes of Technology.

Copyright (c) 2024 Blue Brain Project/EPFL

Owner

  • Name: The Blue Brain Project
  • Login: BlueBrain
  • Kind: organization
  • Email: bbp.opensource@epfl.ch
  • Location: Geneva, Switzerland

Open Source Software produced and used by the Blue Brain Project

GitHub Events

Total
  • Public event: 1
  • Push event: 12
  • Pull request event: 2
  • Fork event: 1
  • Create event: 1
Last Year
  • Public event: 1
  • Push event: 12
  • Pull request event: 2
  • Fork event: 1
  • Create event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 5
  • Average time to close issues: N/A
  • Average time to close pull requests: about 14 hours
  • Total issue authors: 0
  • Total pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 5
  • Average time to close issues: N/A
  • Average time to close pull requests: about 14 hours
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • KeremKurban (3)
  • cszsol (3)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

.github/workflows/ci.yaml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
pyproject.toml pypi
  • aiohttp *
  • asyncio *
  • httpx *
  • neo4j *
  • openai *
  • pandas *
  • pydantic *
  • python-dotenv *
  • pyyaml *
  • scikit-learn *
  • serpapi *
  • tqdm *