inspire-hep

Fetch citations and self-excluded citations for each article

https://github.com/xju2/inspire-hep

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.5%) to scientific vocabulary
Last synced: 10 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: xju2
  • Language: TeX
  • Default Branch: main
  • Size: 314 KB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 5 years ago · Last pushed 11 months ago
Metadata Files
Readme Citation

README.md

inspire-hep

The Inspire-HEP API is a RESTful API for high-energy physics (HEP) literature. This repo is a collection of scripts that interact with the API.

Fetch the citation count and the citation count excluding self-citations for each article listed in mypub.py, create the BibTeX file mypub.bib, and create the CSV file mypub.csv with the columns texkeys, arxiv_eprints, preprint_date, citation_count, citation_count_without_self_citations, doi, and title.
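As an illustration of the CSV layout described above, the following minimal sketch writes records with exactly those columns using the standard library. The helper name `write_rows` and the sample record are hypothetical and are not taken from the repository's update.py.

```python
import csv
import io

# Column names as listed in the README.
COLUMNS = [
    "texkeys",
    "arxiv_eprints",
    "preprint_date",
    "citation_count",
    "citation_count_without_self_citations",
    "doi",
    "title",
]


def write_rows(records, out):
    """Write records (dicts) to the open text file `out`, keeping only COLUMNS."""
    # extrasaction="ignore" drops keys such as "bibtex" that are not CSV columns.
    writer = csv.DictWriter(out, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Placeholder record for illustration only; not real INSPIRE data.
records = [
    {
        "texkeys": "Doe:2020abc",
        "arxiv_eprints": "0000.00000",
        "preprint_date": "2020-01-01",
        "citation_count": 10,
        "citation_count_without_self_citations": 7,
        "doi": "N/A",
        "title": "A placeholder title",
        "bibtex": "@article{Doe:2020abc}",
    }
]

buffer = io.StringIO()
write_rows(records, buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch free of side effects; passing a file opened with `newline=""` would write to disk instead.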

Instructions:

In Faraday, conda-start tuning.

  • Write down the HEP Inspire IDs in mypub.py.
  • Run python update.py -w 4 --mode u to produce two files: mypub.bib and mypub.csv.

Get bib

python get_bib.py test_bib.txt

Get total citation count

python get_citation_count.py publications/mypub.csv
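The source of get_citation_count.py is not shown here, so the following is only a sketch of how such a total could be computed from the CSV columns named earlier. The function name `total_citations` and the sample rows are assumptions made for illustration.

```python
import csv
import io


def total_citations(csv_file):
    """Sum both citation columns over all rows of an open CSV file."""
    total = 0
    total_without_self = 0
    for row in csv.DictReader(csv_file):
        total += int(row["citation_count"])
        total_without_self += int(row["citation_count_without_self_citations"])
    return total, total_without_self


# Made-up rows for illustration only; not real INSPIRE data.
sample = io.StringIO(
    "texkeys,citation_count,citation_count_without_self_citations\n"
    "Doe:2020abc,10,7\n"
    "Doe:2021xyz,5,5\n"
)
print(total_citations(sample))  # (15, 12)
```

For a file on disk, the same function can be called as `total_citations(open("publications/mypub.csv", newline=""))`.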

Example of json file from Inspire-API

example.json
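The keys that citations.py reads from such a record can be summarized in a small sketch. The record below is a trimmed, made-up stand-in (all values are placeholders); only the key names are taken from citations.py, and a real API response contains many more fields.

```python
import json

# Trimmed placeholder record: key names follow citations.py, values are invented.
record_text = """
{
  "id": "0000000",
  "links": {"bibtex": "https://example.org/placeholder.bib"},
  "metadata": {
    "texkeys": ["Doe:2020abc"],
    "titles": [{"title": "A placeholder title"}],
    "authors": [{"full_name": "Doe, Jane"}, {"full_name": "Roe, Richard"}],
    "citation_count": 10,
    "citation_count_without_self_citations": 7,
    "dois": [{"value": "10.0000/placeholder"}],
    "arxiv_eprints": [{"value": "0000.00000", "categories": ["hep-ex"]}],
    "preprint_date": "2020-01-01"
  }
}
"""

data = json.loads(record_text)
meta = data["metadata"]

summary = {
    "texkey": meta["texkeys"][0],
    "title": meta["titles"][0]["title"],
    "authors": ", ".join(author["full_name"] for author in meta["authors"]),
    "citations": meta["citation_count"],
    # Self-citations are the difference between the two counts.
    "self_citations": meta["citation_count"]
    - meta["citation_count_without_self_citations"],
}
print(summary)
```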

Owner

  • Name: Xiangyang Ju
  • Login: xju2
  • Kind: user
  • Location: Berkeley, CA
  • Company: Lawrence Berkeley National Laboratory

Ph.D. in Physics from the University of Wisconsin-Madison.

Citation (citations.py)

#!/usr/bin/env python

import urllib.request
import json
import bib as bibHelper
import pprint

pp = pprint.PrettyPrinter(indent=2)

def citations(id_type, id_value, debug=False):
    inspire_api = f'https://inspirehep.net/api/{id_type}/{id_value}'
    if debug:
        print(f"launching API: {inspire_api}")
    data = json.loads(urllib.request.urlopen(inspire_api).read())
    if debug:
        pp.pprint(data)
    meta = data['metadata']
    bibtex_url = data['links']['bibtex']
    bibtex = urllib.request.urlopen(bibtex_url).read()
    if isinstance(bibtex, bytes):
        bibtex = bibtex.decode("utf-8")
        bibtex = bibHelper.correct_lhc_authors(bibtex)

    # authors
    authors = [author["full_name"] for author in meta['authors']]
    authors = ", ".join(authors)

    # the paper may not have a DOI yet (e.g. not published in a journal)
    try:
        doi = meta['dois'][0]['value']
    except (KeyError, IndexError):
        doi = "N/A"

    # the paper may not be available on arXiv
    try:
        arxiv_eprint = meta['arxiv_eprints'][0]['value']
        arxiv_category = meta['arxiv_eprints'][0]['categories'][0]
        preprint_date = meta['preprint_date']
    except (KeyError, IndexError):
        print("No arXiv info found for INSPIRE ID:", id_value)
        arxiv_eprint = arxiv_category = preprint_date = "N/A"

    # use the publication date instead of the preprint date when it is available
    try:
        preprint_date = meta['imprint'][0]['date']
    except (KeyError, IndexError):
        pass

    return {
        "texkeys": meta['texkeys'][0],
        "citation_count": meta['citation_count'],
        "citation_count_without_self_citations": meta['citation_count_without_self_citations'],
        "doi": doi,
        "arxiv_eprints": arxiv_eprint,
        "arxiv_category": arxiv_category,
        'preprint_date': preprint_date,
        "title": meta['titles'][0]['title'],
        "bibtex": bibtex,
        "inspire_id": data['id'],
        "authors": authors,
    }

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description="Fetch citations for HEP articles")
    add_arg = parser.add_argument
    add_arg("--id-type", help="identification type",
            choices=['literature', 'doi', 'arxiv'], default='literature')
    add_arg("--id-value", help='identification value', default=1851403)
    add_arg('-d', '--debug', help='debug mode', action='store_true')

    args = parser.parse_args()
    res = citations(args.id_type, args.id_value, args.debug)
    print("\n")
    pp.pprint(res)
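The three try/except blocks in citations() all follow the same pattern: take the first entry of a list-valued metadata field, read one key from it, and fall back to a default when something is missing. A minimal sketch of a helper that captures this pattern is shown below. The name `first_item` is hypothetical and does not exist in the repository, and the sample metadata is made up.

```python
def first_item(meta, key, field, default="N/A"):
    """Return meta[key][0][field], or `default` if any step is missing."""
    try:
        return meta[key][0][field]
    except (KeyError, IndexError, TypeError):
        # KeyError: key or field absent; IndexError: empty list;
        # TypeError: the value is not a list of dicts.
        return default


# Made-up metadata for illustration only.
meta = {"dois": [], "arxiv_eprints": [{"value": "0000.00000"}]}

print(first_item(meta, "dois", "value"))           # N/A
print(first_item(meta, "arxiv_eprints", "value"))  # 0000.00000
print(first_item(meta, "imprint", "date"))         # N/A
```

Catching IndexError alongside KeyError also covers the case where a field is present but holds an empty list.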

GitHub Events

Total
  • Push event: 2
Last Year
  • Push event: 2

Dependencies

requirements.txt pypi
  • bibtexparser *