tracking_citations
A tool that tracks citation information for a group or project, based on data from Google Scholar
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (8.1%) to scientific vocabulary
Last synced: 6 months ago
Repository
Basic Info
- Host: GitHub
- Owner: yangha7
- Language: Jupyter Notebook
- Default Branch: main
- Size: 40 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Created 10 months ago · Last pushed 6 months ago
Metadata Files
Readme
Citation
README.md
Tracking_Citations
A tool that tracks citation information for a group or project, based on data from Google Scholar
- The DIALS-related publication list can be found and updated here.
- Save the Google Scholar links in urls.txt, one link per line.
- Citationbyyear.ipynb scrapes yearly citation counts from Google Scholar. For each paper, it collects the number of citations per year and saves them to a CSV file in long ("database") format: one row per paper and year, with the columns Title, Year, and Citations. Running the notebook locally in Jupyter Notebook is recommended over running it on JupyterHub.
- The CSV file can later be processed in Microsoft Power BI to visualize the results. An example dashboard is linked below; in it, you can click individual papers or years to see their contributions.
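Before running the notebook, it can help to check urls.txt, because the notebook waits 10 seconds after every entry, including placeholder lines and repeated links. The following is a minimal sketch and not part of the repository: `clean_url_list` is a hypothetical helper, and it assumes that the `citation_for_view` query parameter identifies a paper, which is inferred from the link format rather than from any documentation.

```python
from urllib.parse import urlparse, parse_qs


def clean_url_list(lines):
    """Return valid, de-duplicated Google Scholar citation URLs.

    Hypothetical helper for illustration; not part of the notebook.
    """
    seen = set()
    cleaned = []
    for raw in lines:
        url = raw.strip()
        if not url:
            continue  # skip blank lines, as the notebook does
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or parts.netloc != "scholar.google.com":
            print(f"Skipping non-Scholar entry: {url!r}")
            continue
        # Assumption: citation_for_view identifies the paper, while cstart,
        # pagesize, and sortby only reflect how the page was reached.
        key = parse_qs(parts.query).get("citation_for_view", [None])[0]
        if key is None:
            print(f"Skipping URL without citation_for_view: {url!r}")
            continue
        if key in seen:
            continue  # same paper listed twice
        seen.add(key)
        cleaned.append(url)
    return cleaned


if __name__ == "__main__":
    with open("urls.txt", "r") as f:
        for url in clean_url_list(f):
            print(url)
```

The cleaned list could be written back to urls.txt or passed to the notebook's own loop; either choice leaves the scraping code unchanged.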
📊 View DIALS PDB Deposition Dashboard
[Click here to open the interactive dashboard](https://app.powerbi.com/view?r=eyJrIjoiZWQxYzQ3OGUtZGIwYS00NDZmLTk1YjctNDU1YmViNTI5ZDNjIiwidCI6IjM5NjU3M2NiLWYzNzgtNGI2OC05YmM4LTE1NzU1YzBjNTFmMyIsImMiOjZ9)
Owner
- Name: Yang Ha
- Login: yangha7
- Kind: user
- Repositories: 1
- Profile: https://github.com/yangha7
Citation (Citation_By_Year_V3.ipynb)
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"from bs4 import BeautifulSoup\n",
"import time\n",
"import csv\n",
"from datetime import datetime\n",
"import os\n",
"import re\n",
"\n",
"def parse_citations(url):\n",
"    headers = {\n",
"        'User-Agent': 'Mozilla/5.0'\n",
"    }\n",
"\n",
"    try:\n",
"        response = requests.get(url, headers=headers)\n",
"        response.raise_for_status()\n",
"        soup = BeautifulSoup(response.text, 'html.parser')\n",
"\n",
"        # Extract title\n",
"        title_element = soup.find('div', id='gsc_oci_title')\n",
"        title = title_element.text.strip() if title_element else \"Title not found\"\n",
"\n",
"        # Extract citation data from bars\n",
"        citation_links = soup.find_all('a', class_='gsc_oci_g_a')\n",
"        year_count_map = {}\n",
"        for link in citation_links:\n",
"            href = link.get('href', '')\n",
"            match = re.search(r'as_ylo=(\\d+)&as_yhi=(\\d+)', href)\n",
"            if match:\n",
"                year = match.group(1)\n",
"                count_span = link.find('span', class_='gsc_oci_g_al')\n",
"                count = int(count_span.text.strip()) if count_span else 0\n",
"                year_count_map[year] = count\n",
"\n",
"        if not year_count_map:\n",
"            print(\"⚠️ No citation data found.\")\n",
"            return []\n",
"\n",
"        # Fill in zeroes for missing years\n",
"        min_year = min(map(int, year_count_map))\n",
"        max_year = max(map(int, year_count_map))\n",
"        citation_data = {str(y): year_count_map.get(str(y), 0) for y in range(min_year, max_year + 1)}\n",
"\n",
"        # Prepare long-format rows\n",
"        rows = []\n",
"        for year, count in sorted(citation_data.items()):\n",
"            rows.append({'Title': title, 'Year': year, 'Citations': count})\n",
"\n",
"        return rows\n",
"\n",
"    except Exception as e:\n",
"        print(f\"❌ Error parsing {url}: {e}\")\n",
"        return []\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"def parse_all_from_file(input_file='urls.txt', output_dir='.'):\n",
"    date_str = datetime.now().strftime('%Y-%m-%d')\n",
"    output_file = os.path.join(output_dir, f\"citations_{date_str}.csv\")\n",
"\n",
"    all_rows = []\n",
"\n",
"    with open(input_file, 'r') as f:\n",
"        urls = [line.strip() for line in f if line.strip()]\n",
"\n",
"    for i, url in enumerate(urls, 1):\n",
"        print(f\"\\n➡️ Parsing {i}/{len(urls)}: {url}\")\n",
"        rows = parse_citations(url)\n",
"        all_rows.extend(rows)\n",
"        time.sleep(10)\n",
"\n",
"    # Write combined CSV\n",
"    if all_rows:\n",
"        with open(output_file, 'w', newline='', encoding='utf-8') as f:\n",
"            writer = csv.DictWriter(f, fieldnames=['Title', 'Year', 'Citations'])\n",
"            writer.writeheader()\n",
"            writer.writerows(all_rows)\n",
"        print(f\"\\n✅ Saved all results to: {output_file}\")\n",
"    else:\n",
"        print(\"⚠️ No data to write.\")\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"➡️ Parsing 1/38: google scholar link\n",
"❌ Error parsing google scholar link: Invalid URL 'google scholar link': No scheme supplied. Perhaps you meant https://google scholar link?\n",
"\n",
"➡️ Parsing 2/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&citation_for_view=ZfF1SeUAAAAJ:7H_MAutzIkAC\n",
"\n",
"➡️ Parsing 3/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&cstart=20&pagesize=80&citation_for_view=ZfF1SeUAAAAJ:wKETBy42zhYC\n",
"\n",
"➡️ Parsing 4/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&citation_for_view=ZfF1SeUAAAAJ:vDZJ-YLwNdEC\n",
"\n",
"➡️ Parsing 5/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&citation_for_view=ZfF1SeUAAAAJ:1taIhTC69MYC\n",
"\n",
"➡️ Parsing 6/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=20&pagesize=80&citation_for_view=f-rQ8e4AAAAJ:f2IySw72cVMC\n",
"\n",
"➡️ Parsing 7/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&cstart=20&pagesize=80&citation_for_view=ZfF1SeUAAAAJ:WHdLCjDvYFkC\n",
"\n",
"➡️ Parsing 8/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=L4uNnk0AAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=L4uNnk0AAAAJ:n3vGvpFsckwC\n",
"\n",
"➡️ Parsing 9/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=L4uNnk0AAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=L4uNnk0AAAAJ:JTqpx9DYBaYC\n",
"\n",
"➡️ Parsing 10/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=Uu0bROMAAAAJ&sortby=pubdate&citation_for_view=Uu0bROMAAAAJ:SpbeaW3--B0C\n",
"\n",
"➡️ Parsing 11/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=L4uNnk0AAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=L4uNnk0AAAAJ:ndLnGcHYRF0C\n",
"\n",
"➡️ Parsing 12/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:3s1wT3WcHBgC\n",
"\n",
"➡️ Parsing 13/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:IWHjjKOFINEC\n",
"\n",
"➡️ Parsing 14/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:dfsIfKJdRG4C\n",
"\n",
"➡️ Parsing 15/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:ldfaerwXgEUC\n",
"\n",
"➡️ Parsing 16/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=4eVETvsAAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=4eVETvsAAAAJ:NDuN12AVoxsC\n",
"\n",
"➡️ Parsing 17/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=L4uNnk0AAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=L4uNnk0AAAAJ:QsKbpXNoaWkC\n",
"\n",
"➡️ Parsing 18/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:bnK-pcrLprsC\n",
"⚠️ No citation data found.\n",
"\n",
"➡️ Parsing 19/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:BrmTIyaxlBUC\n",
"\n",
"➡️ Parsing 20/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:t6usbXjVLHcC\n",
"⚠️ No citation data found.\n",
"\n",
"➡️ Parsing 21/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:dfsIfKJdRG4C\n",
"\n",
"➡️ Parsing 22/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=ZfF1SeUAAAAJ:yqoGN6RLRZoC\n",
"\n",
"➡️ Parsing 23/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=100&pagesize=100&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:k_IJM867U9cC\n",
"\n",
"➡️ Parsing 24/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=100&pagesize=100&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:mVmsd5A6BfQC\n",
"\n",
"➡️ Parsing 25/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=100&pagesize=100&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:9yKSN-GCB0IC\n",
"\n",
"➡️ Parsing 26/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=100&pagesize=100&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:W7OEmFMy1HYC\n",
"\n",
"➡️ Parsing 27/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=ZfF1SeUAAAAJ:OTTXONDVkokC\n",
"\n",
"➡️ Parsing 28/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=100&pagesize=100&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:RHpTSmoSYBkC\n",
"\n",
"➡️ Parsing 29/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=100&pagesize=100&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:ns9cj8rnVeAC\n",
"\n",
"➡️ Parsing 30/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=ZfF1SeUAAAAJ:bKqednn6t2AC\n",
"\n",
"➡️ Parsing 31/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=zFhp8b4AAAAJ&citation_for_view=zFhp8b4AAAAJ:UeHWp8X0CEIC\n",
"\n",
"➡️ Parsing 32/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=ZfF1SeUAAAAJ:Bg7qf7VwUHIC\n",
"\n",
"➡️ Parsing 33/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=ZfF1SeUAAAAJ:GFxP56DSvIMC\n",
"\n",
"➡️ Parsing 34/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=3THzG8oAAAAJ&sortby=pubdate&citation_for_view=3THzG8oAAAAJ:qUcmZB5y_30C\n",
"⚠️ No citation data found.\n",
"\n",
"➡️ Parsing 35/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ZfF1SeUAAAAJ&sortby=pubdate&citation_for_view=ZfF1SeUAAAAJ:Br1UauaknNIC\n",
"\n",
"➡️ Parsing 36/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=YcB4CksAAAAJ&citation_for_view=YcB4CksAAAAJ:d1gkVwhDpl0C\n",
"\n",
"➡️ Parsing 37/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=zFhp8b4AAAAJ&cstart=20&pagesize=80&citation_for_view=zFhp8b4AAAAJ:WF5omc3nYNoC\n",
"⚠️ No citation data found.\n",
"\n",
"➡️ Parsing 38/38: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=f-rQ8e4AAAAJ&cstart=20&pagesize=80&sortby=pubdate&citation_for_view=f-rQ8e4AAAAJ:u5HHmVD_uO8C\n",
"\n",
"✅ Saved all results to: ./citations_2025-05-15.csv\n"
]
}
],
"source": [
"# Example usage:\n",
"if __name__ == \"__main__\":\n",
"    parse_all_from_file()\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
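The notebook above writes a long-format CSV with the columns `Title`, `Year`, and `Citations`. As a quick check before loading it into a dashboard, the rows can be totaled per year with the standard library alone. This is a minimal sketch, not part of the repository: `total_citations_by_year` is a hypothetical helper, and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def total_citations_by_year(csv_file):
    """Sum the Citations column per Year across all papers.

    Hypothetical helper; expects the Title/Year/Citations columns
    that the notebook writes.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["Year"]] += int(row["Citations"])
    # Four-digit years sort correctly as strings.
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    # Made-up sample rows in the same layout as the notebook's output.
    sample_csv = io.StringIO(
        "Title,Year,Citations\n"
        "Paper A,2020,2\n"
        "Paper A,2021,4\n"
        "Paper B,2021,1\n"
    )
    for year, total in total_citations_by_year(sample_csv).items():
        print(year, total)
```

To run it on real output, an opened file such as `open("citations_2025-05-15.csv", newline="", encoding="utf-8")` can be passed in place of the in-memory sample.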
GitHub Events
Total
- Push event: 5
- Create event: 2
Last Year
- Push event: 5
- Create event: 2