lja_python_for_librarians
Instructional material for Library Juice Academy Python For Librarians
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (1.2%) to scientific vocabulary
Last synced: 6 months ago
Repository
Basic Info
- Host: GitHub
- Owner: elibtronic
- Language: Jupyter Notebook
- Default Branch: main
- Size: 8.81 MB
Statistics
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
- Releases: 0
Created over 5 years ago · Last pushed 7 months ago
Metadata Files
Readme
Citation
README.md
Python for Librarians
Instructional material for Library Juice Academy Python For Librarians
Week 1
Week 2
Week 3
Week 4
Owner
- Name: Tim Ribaric
- Login: elibtronic
- Kind: user
- Location: Canada
- Website: http://elibtronic.ca
- Twitter: elibtronic
- Repositories: 22
- Profile: https://github.com/elibtronic
Citation (Citation Data Prep for Week 4 Homework.ipynb)
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Citation Counts and Impact Factors\n",
"\n",
"The dataset is from [https://datadryad.org/stash/dataset/doi:10.5061/dryad.2h4j5](https://datadryad.org/stash/dataset/doi:10.5061/dryad.2h4j5); the accompanying article is [https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001675](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001675)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 1\n",
"\n",
"- Download the text file from [https://datadryad.org/stash/downloads/file_stream/30779](https://datadryad.org/stash/downloads/file_stream/30779)\n",
"\n",
"- Copy it into the working directory"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 2\n",
"\n",
"- Open with pandas and normalize"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"import pandas\n"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Journal</th>\n",
" <th>Score1</th>\n",
" <th>Score2</th>\n",
" <th>IF2</th>\n",
" <th>IF5</th>\n",
" <th>NoOfScores</th>\n",
" <th>Citations</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>Immunity</td>\n",
" <td>8</td>\n",
" <td>NaN</td>\n",
" <td>24.221</td>\n",
" <td>22.133</td>\n",
" <td>1</td>\n",
" <td>220</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>Journal of the American Chemical Society</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>9.023</td>\n",
" <td>8.981</td>\n",
" <td>2</td>\n",
" <td>40</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>Science (New York)</td>\n",
" <td>8</td>\n",
" <td>10.0</td>\n",
" <td>31.377</td>\n",
" <td>31.777</td>\n",
" <td>2</td>\n",
" <td>1003</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>Gastroenterology</td>\n",
" <td>6</td>\n",
" <td>8.0</td>\n",
" <td>12.032</td>\n",
" <td>12.403</td>\n",
" <td>2</td>\n",
" <td>85</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>Nature Medicine</td>\n",
" <td>10</td>\n",
" <td>NaN</td>\n",
" <td>25.430</td>\n",
" <td>27.887</td>\n",
" <td>1</td>\n",
" <td>213</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5806</td>\n",
" <td>Cerebral Cortex</td>\n",
" <td>6</td>\n",
" <td>NaN</td>\n",
" <td>6.844</td>\n",
" <td>7.200</td>\n",
" <td>1</td>\n",
" <td>12</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5807</td>\n",
" <td>The Journal of Biological Chemistry</td>\n",
" <td>6</td>\n",
" <td>NaN</td>\n",
" <td>5.328</td>\n",
" <td>5.498</td>\n",
" <td>1</td>\n",
" <td>108</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5808</td>\n",
" <td>Molecular Biology and Evolution</td>\n",
" <td>6</td>\n",
" <td>NaN</td>\n",
" <td>5.510</td>\n",
" <td>8.907</td>\n",
" <td>1</td>\n",
" <td>37</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5809</td>\n",
" <td>The Journal of Biological Chemistry</td>\n",
" <td>8</td>\n",
" <td>NaN</td>\n",
" <td>5.328</td>\n",
" <td>5.498</td>\n",
" <td>1</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5810</td>\n",
" <td>Photochemistry and Photobiology</td>\n",
" <td>6</td>\n",
" <td>NaN</td>\n",
" <td>2.679</td>\n",
" <td>2.552</td>\n",
" <td>1</td>\n",
" <td>11</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5811 rows × 7 columns</p>\n",
"</div>"
],
"text/plain": [
" Journal Score1 Score2 IF2 \\\n",
"0 Immunity 8 NaN 24.221 \n",
"1 Journal of the American Chemical Society 6 6.0 9.023 \n",
"2 Science (New York) 8 10.0 31.377 \n",
"3 Gastroenterology 6 8.0 12.032 \n",
"4 Nature Medicine 10 NaN 25.430 \n",
"... ... ... ... ... \n",
"5806 Cerebral Cortex 6 NaN 6.844 \n",
"5807 The Journal of Biological Chemistry 6 NaN 5.328 \n",
"5808 Molecular Biology and Evolution 6 NaN 5.510 \n",
"5809 The Journal of Biological Chemistry 8 NaN 5.328 \n",
"5810 Photochemistry and Photobiology 6 NaN 2.679 \n",
"\n",
" IF5 NoOfScores Citations \n",
"0 22.133 1 220 \n",
"1 8.981 2 40 \n",
"2 31.777 2 1003 \n",
"3 12.403 2 85 \n",
"4 27.887 1 213 \n",
"... ... ... ... \n",
"5806 7.200 1 12 \n",
"5807 5.498 1 108 \n",
"5808 8.907 1 37 \n",
"5809 5.498 1 10 \n",
"5810 2.552 1 11 \n",
"\n",
"[5811 rows x 7 columns]"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# read in our tab-separated text file; blank fields are treated as missing values (NaN)\n",
"citation_data = pandas.read_csv('F1000 data - Dryad.txt', sep=\"\\t\", skipinitialspace=True)\n",
"# no need to modify columns!\n",
"citation_data"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Journal</th>\n",
" <th>Score1</th>\n",
" <th>Score2</th>\n",
" <th>IF2</th>\n",
" <th>IF5</th>\n",
" <th>NoOfScores</th>\n",
" <th>Citations</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>Journal of the American Chemical Society</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>9.023</td>\n",
" <td>8.981</td>\n",
" <td>2</td>\n",
" <td>40</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>Science (New York)</td>\n",
" <td>8</td>\n",
" <td>10.0</td>\n",
" <td>31.377</td>\n",
" <td>31.777</td>\n",
" <td>2</td>\n",
" <td>1003</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>Gastroenterology</td>\n",
" <td>6</td>\n",
" <td>8.0</td>\n",
" <td>12.032</td>\n",
" <td>12.403</td>\n",
" <td>2</td>\n",
" <td>85</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>Neuron</td>\n",
" <td>8</td>\n",
" <td>8.0</td>\n",
" <td>14.027</td>\n",
" <td>14.927</td>\n",
" <td>2</td>\n",
" <td>336</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>The Journal of Cell Biology</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>9.921</td>\n",
" <td>10.123</td>\n",
" <td>2</td>\n",
" <td>177</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5753</td>\n",
" <td>Nature Immunology</td>\n",
" <td>10</td>\n",
" <td>8.0</td>\n",
" <td>25.668</td>\n",
" <td>25.934</td>\n",
" <td>2</td>\n",
" <td>125</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5774</td>\n",
" <td>Molecular and Cellular Biology</td>\n",
" <td>8</td>\n",
" <td>8.0</td>\n",
" <td>6.188</td>\n",
" <td>6.381</td>\n",
" <td>2</td>\n",
" <td>73</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5781</td>\n",
" <td>RNA (New York)</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>6.051</td>\n",
" <td>5.486</td>\n",
" <td>2</td>\n",
" <td>20</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5782</td>\n",
" <td>Molecular Biology and Evolution</td>\n",
" <td>6</td>\n",
" <td>8.0</td>\n",
" <td>5.510</td>\n",
" <td>8.907</td>\n",
" <td>2</td>\n",
" <td>11</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5791</td>\n",
" <td>The Journal of Biological Chemistry</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>5.328</td>\n",
" <td>5.498</td>\n",
" <td>2</td>\n",
" <td>30</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>1328 rows × 7 columns</p>\n",
"</div>"
],
"text/plain": [
" Journal Score1 Score2 IF2 \\\n",
"1 Journal of the American Chemical Society 6 6.0 9.023 \n",
"2 Science (New York) 8 10.0 31.377 \n",
"3 Gastroenterology 6 8.0 12.032 \n",
"5 Neuron 8 8.0 14.027 \n",
"6 The Journal of Cell Biology 6 6.0 9.921 \n",
"... ... ... ... ... \n",
"5753 Nature Immunology 10 8.0 25.668 \n",
"5774 Molecular and Cellular Biology 8 8.0 6.188 \n",
"5781 RNA (New York) 6 6.0 6.051 \n",
"5782 Molecular Biology and Evolution 6 8.0 5.510 \n",
"5791 The Journal of Biological Chemistry 6 6.0 5.328 \n",
"\n",
" IF5 NoOfScores Citations \n",
"1 8.981 2 40 \n",
"2 31.777 2 1003 \n",
"3 12.403 2 85 \n",
"5 14.927 2 336 \n",
"6 10.123 2 177 \n",
"... ... ... ... \n",
"5753 25.934 2 125 \n",
"5774 6.381 2 73 \n",
"5781 5.486 2 20 \n",
"5782 8.907 2 11 \n",
"5791 5.498 2 30 \n",
"\n",
"[1328 rows x 7 columns]"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
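],
"source": [
"# drop rows with any missing values; .copy() gives an independent DataFrame,\n",
"# so adding columns later does not raise SettingWithCopyWarning\n",
"citation_data = citation_data.dropna(how='any').copy()\n",
"citation_data"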
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/tim/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" \n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Journal</th>\n",
" <th>Score1</th>\n",
" <th>Score2</th>\n",
" <th>IF2</th>\n",
" <th>IF5</th>\n",
" <th>NoOfScores</th>\n",
" <th>Citations</th>\n",
" <th>TopCitation</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>Journal of the American Chemical Society</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>9.023</td>\n",
" <td>8.981</td>\n",
" <td>2</td>\n",
" <td>40</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>Science (New York)</td>\n",
" <td>8</td>\n",
" <td>10.0</td>\n",
" <td>31.377</td>\n",
" <td>31.777</td>\n",
" <td>2</td>\n",
" <td>1003</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>Gastroenterology</td>\n",
" <td>6</td>\n",
" <td>8.0</td>\n",
" <td>12.032</td>\n",
" <td>12.403</td>\n",
" <td>2</td>\n",
" <td>85</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>Neuron</td>\n",
" <td>8</td>\n",
" <td>8.0</td>\n",
" <td>14.027</td>\n",
" <td>14.927</td>\n",
" <td>2</td>\n",
" <td>336</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>The Journal of Cell Biology</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>9.921</td>\n",
" <td>10.123</td>\n",
" <td>2</td>\n",
" <td>177</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5753</td>\n",
" <td>Nature Immunology</td>\n",
" <td>10</td>\n",
" <td>8.0</td>\n",
" <td>25.668</td>\n",
" <td>25.934</td>\n",
" <td>2</td>\n",
" <td>125</td>\n",
" <td>5</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5774</td>\n",
" <td>Molecular and Cellular Biology</td>\n",
" <td>8</td>\n",
" <td>8.0</td>\n",
" <td>6.188</td>\n",
" <td>6.381</td>\n",
" <td>2</td>\n",
" <td>73</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5781</td>\n",
" <td>RNA (New York)</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>6.051</td>\n",
" <td>5.486</td>\n",
" <td>2</td>\n",
" <td>20</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5782</td>\n",
" <td>Molecular Biology and Evolution</td>\n",
" <td>6</td>\n",
" <td>8.0</td>\n",
" <td>5.510</td>\n",
" <td>8.907</td>\n",
" <td>2</td>\n",
" <td>11</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5791</td>\n",
" <td>The Journal of Biological Chemistry</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>5.328</td>\n",
" <td>5.498</td>\n",
" <td>2</td>\n",
" <td>30</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>1328 rows × 8 columns</p>\n",
"</div>"
],
"text/plain": [
" Journal Score1 Score2 IF2 \\\n",
"1 Journal of the American Chemical Society 6 6.0 9.023 \n",
"2 Science (New York) 8 10.0 31.377 \n",
"3 Gastroenterology 6 8.0 12.032 \n",
"5 Neuron 8 8.0 14.027 \n",
"6 The Journal of Cell Biology 6 6.0 9.921 \n",
"... ... ... ... ... \n",
"5753 Nature Immunology 10 8.0 25.668 \n",
"5774 Molecular and Cellular Biology 8 8.0 6.188 \n",
"5781 RNA (New York) 6 6.0 6.051 \n",
"5782 Molecular Biology and Evolution 6 8.0 5.510 \n",
"5791 The Journal of Biological Chemistry 6 6.0 5.328 \n",
"\n",
" IF5 NoOfScores Citations TopCitation \n",
"1 8.981 2 40 1 \n",
"2 31.777 2 1003 9 \n",
"3 12.403 2 85 3 \n",
"5 14.927 2 336 8 \n",
"6 10.123 2 177 6 \n",
"... ... ... ... ... \n",
"5753 25.934 2 125 5 \n",
"5774 6.381 2 73 2 \n",
"5781 5.486 2 20 0 \n",
"5782 8.907 2 11 0 \n",
"5791 5.498 2 30 0 \n",
"\n",
"[1328 rows x 8 columns]"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# split Citations into 10 deciles (labels 0-9; label 9 is the top 10%)\n",
"citation_data[\"TopCitation\"] = pandas.qcut(citation_data[\"Citations\"],10,labels=False)\n",
"citation_data"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Journal</th>\n",
" <th>Score1</th>\n",
" <th>Score2</th>\n",
" <th>IF2</th>\n",
" <th>IF5</th>\n",
" <th>NoOfScores</th>\n",
" <th>TopCitation</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>Journal of the American Chemical Society</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>9.023</td>\n",
" <td>8.981</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>Science (New York)</td>\n",
" <td>8</td>\n",
" <td>10.0</td>\n",
" <td>31.377</td>\n",
" <td>31.777</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>Gastroenterology</td>\n",
" <td>6</td>\n",
" <td>8.0</td>\n",
" <td>12.032</td>\n",
" <td>12.403</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>Neuron</td>\n",
" <td>8</td>\n",
" <td>8.0</td>\n",
" <td>14.027</td>\n",
" <td>14.927</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>The Journal of Cell Biology</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>9.921</td>\n",
" <td>10.123</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5753</td>\n",
" <td>Nature Immunology</td>\n",
" <td>10</td>\n",
" <td>8.0</td>\n",
" <td>25.668</td>\n",
" <td>25.934</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5774</td>\n",
" <td>Molecular and Cellular Biology</td>\n",
" <td>8</td>\n",
" <td>8.0</td>\n",
" <td>6.188</td>\n",
" <td>6.381</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5781</td>\n",
" <td>RNA (New York)</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>6.051</td>\n",
" <td>5.486</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5782</td>\n",
" <td>Molecular Biology and Evolution</td>\n",
" <td>6</td>\n",
" <td>8.0</td>\n",
" <td>5.510</td>\n",
" <td>8.907</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5791</td>\n",
" <td>The Journal of Biological Chemistry</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>5.328</td>\n",
" <td>5.498</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>1328 rows × 7 columns</p>\n",
"</div>"
],
"text/plain": [
" Journal Score1 Score2 IF2 \\\n",
"1 Journal of the American Chemical Society 6 6.0 9.023 \n",
"2 Science (New York) 8 10.0 31.377 \n",
"3 Gastroenterology 6 8.0 12.032 \n",
"5 Neuron 8 8.0 14.027 \n",
"6 The Journal of Cell Biology 6 6.0 9.921 \n",
"... ... ... ... ... \n",
"5753 Nature Immunology 10 8.0 25.668 \n",
"5774 Molecular and Cellular Biology 8 8.0 6.188 \n",
"5781 RNA (New York) 6 6.0 6.051 \n",
"5782 Molecular Biology and Evolution 6 8.0 5.510 \n",
"5791 The Journal of Biological Chemistry 6 6.0 5.328 \n",
"\n",
" IF5 NoOfScores TopCitation \n",
"1 8.981 2 0 \n",
"2 31.777 2 1 \n",
"3 12.403 2 0 \n",
"5 14.927 2 0 \n",
"6 10.123 2 0 \n",
"... ... ... ... \n",
"5753 25.934 2 0 \n",
"5774 6.381 2 0 \n",
"5781 5.486 2 0 \n",
"5782 8.907 2 0 \n",
"5791 5.498 2 0 \n",
"\n",
"[1328 rows x 7 columns]"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# collapse the decile labels to a binary flag: top decile (9) becomes 1, everything else 0\n",
"citation_data[\"TopCitation\"] = (citation_data[\"TopCitation\"] == 9).astype(int)\n",
"# the raw citation counts are no longer needed\n",
"citation_data.pop(\"Citations\")\n",
"citation_data"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Journal</th>\n",
" <th>Score1</th>\n",
" <th>Score2</th>\n",
" <th>IF2</th>\n",
" <th>IF5</th>\n",
" <th>NoOfScores</th>\n",
" </tr>\n",
" <tr>\n",
" <th>TopCitation</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>1197</td>\n",
" <td>1197</td>\n",
" <td>1197</td>\n",
" <td>1197</td>\n",
" <td>1197</td>\n",
" <td>1197</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>131</td>\n",
" <td>131</td>\n",
" <td>131</td>\n",
" <td>131</td>\n",
" <td>131</td>\n",
" <td>131</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Journal Score1 Score2 IF2 IF5 NoOfScores\n",
"TopCitation \n",
"0 1197 1197 1197 1197 1197 1197\n",
"1 131 131 131 131 131 131"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"citation_data.groupby(\"TopCitation\").count()"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"# Let's shorten a couple of our labels\n",
"# Proceedings of the National Academy of Sciences of the United States of America to Proceedings\n",
"# Science (New York) to Science\n",
"citation_data[\"Journal\"] = citation_data[\"Journal\"].replace({\n",
"    \"Proceedings of the National Academy of Sciences of the United States of America\": \"Proceedings\",\n",
"    \"Science (New York)\": \"Science\",\n",
"})"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1 2\n",
"2 2\n",
"3 2\n",
"5 2\n",
"6 2\n",
" ..\n",
"5753 2\n",
"5774 2\n",
"5781 2\n",
"5782 2\n",
"5791 2\n",
"Name: NoOfScores, Length: 1328, dtype: int64"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# NoOfScores doesn't really help us, so remove the column\n",
"citation_data.pop(\"NoOfScores\")"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Journal</th>\n",
" <th>Score1</th>\n",
" <th>Score2</th>\n",
" <th>IF2</th>\n",
" <th>IF5</th>\n",
" <th>TopCitation</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>Journal of the American Chemical Society</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>9.023</td>\n",
" <td>8.981</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>Science</td>\n",
" <td>8</td>\n",
" <td>10.0</td>\n",
" <td>31.377</td>\n",
" <td>31.777</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>Gastroenterology</td>\n",
" <td>6</td>\n",
" <td>8.0</td>\n",
" <td>12.032</td>\n",
" <td>12.403</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>Neuron</td>\n",
" <td>8</td>\n",
" <td>8.0</td>\n",
" <td>14.027</td>\n",
" <td>14.927</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>The Journal of Cell Biology</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>9.921</td>\n",
" <td>10.123</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5753</td>\n",
" <td>Nature Immunology</td>\n",
" <td>10</td>\n",
" <td>8.0</td>\n",
" <td>25.668</td>\n",
" <td>25.934</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5774</td>\n",
" <td>Molecular and Cellular Biology</td>\n",
" <td>8</td>\n",
" <td>8.0</td>\n",
" <td>6.188</td>\n",
" <td>6.381</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5781</td>\n",
" <td>RNA (New York)</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>6.051</td>\n",
" <td>5.486</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5782</td>\n",
" <td>Molecular Biology and Evolution</td>\n",
" <td>6</td>\n",
" <td>8.0</td>\n",
" <td>5.510</td>\n",
" <td>8.907</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5791</td>\n",
" <td>The Journal of Biological Chemistry</td>\n",
" <td>6</td>\n",
" <td>6.0</td>\n",
" <td>5.328</td>\n",
" <td>5.498</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>1328 rows × 6 columns</p>\n",
"</div>"
],
"text/plain": [
" Journal Score1 Score2 IF2 \\\n",
"1 Journal of the American Chemical Society 6 6.0 9.023 \n",
"2 Science 8 10.0 31.377 \n",
"3 Gastroenterology 6 8.0 12.032 \n",
"5 Neuron 8 8.0 14.027 \n",
"6 The Journal of Cell Biology 6 6.0 9.921 \n",
"... ... ... ... ... \n",
"5753 Nature Immunology 10 8.0 25.668 \n",
"5774 Molecular and Cellular Biology 8 8.0 6.188 \n",
"5781 RNA (New York) 6 6.0 6.051 \n",
"5782 Molecular Biology and Evolution 6 8.0 5.510 \n",
"5791 The Journal of Biological Chemistry 6 6.0 5.328 \n",
"\n",
" IF5 TopCitation \n",
"1 8.981 0 \n",
"2 31.777 1 \n",
"3 12.403 0 \n",
"5 14.927 0 \n",
"6 10.123 0 \n",
"... ... ... \n",
"5753 25.934 0 \n",
"5774 6.381 0 \n",
"5781 5.486 0 \n",
"5782 8.907 0 \n",
"5791 5.498 0 \n",
"\n",
"[1328 rows x 6 columns]"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"citation_data"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [],
"source": [
"citation_data.to_csv('week_4_citation_homework.csv',index=False)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
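The notebook above walks through the preparation one cell at a time. As a rough, self-contained sketch, the same steps can be gathered into one function. The function name `prepare`, the constants, and the sample rows are all hypothetical: the rows are illustrative stand-ins, not the real Dryad file, and they are laid out so that each complete row lands in its own decile.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'F1000 data - Dryad.txt' (NOT the real data):
# same column names, tab-separated, blank Score2 meaning "no second score".
LONG_PNAS = ("Proceedings of the National Academy of Sciences "
             "of the United States of America")
HEADER = "Journal\tScore1\tScore2\tIF2\tIF5\tNoOfScores\tCitations"
complete = [("Science (New York)", 1000), (LONG_PNAS, 90)]
complete += [(f"Journal {i}", i * 10) for i in range(1, 9)]
lines = [HEADER]
lines += [f"{name}\t6\t8\t5.0\t5.5\t2\t{cites}" for name, cites in complete]
lines += [
    "Immunity\t8\t\t24.2\t22.1\t1\t220",         # Score2 is blank
    "Nature Medicine\t10\t\t25.4\t27.9\t1\t213",  # Score2 is blank
]
SAMPLE_TSV = "\n".join(lines) + "\n"


def prepare(source):
    """Apply the notebook's cleaning steps to a tab-separated source."""
    data = pd.read_csv(source, sep="\t", skipinitialspace=True)
    # Drop incomplete rows; .copy() avoids SettingWithCopyWarning later on.
    data = data.dropna(how="any").copy()
    # Decile labels run from 0 to 9; only label 9 (the top 10%) is flagged.
    deciles = pd.qcut(data["Citations"], 10, labels=False)
    data["TopCitation"] = (deciles == 9).astype(int)
    # Shorten the two long journal labels.
    data["Journal"] = data["Journal"].replace(
        {LONG_PNAS: "Proceedings", "Science (New York)": "Science"}
    )
    # Citations is now encoded in TopCitation; NoOfScores is not needed.
    return data.drop(columns=["Citations", "NoOfScores"])


prepared = prepare(io.StringIO(SAMPLE_TSV))
print(prepared)
```

To run this against the real download, passing the file path instead of the `io.StringIO` object (for example `prepare('F1000 data - Dryad.txt')`) should work, since `pandas.read_csv` accepts either. Comparing the decile label with 9 does the same job as the notebook's two `replace` calls, without modifying a column in place.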
GitHub Events
Total
- Push event: 1
Last Year
- Push event: 1