lja_python_for_librarians

Instructional material for Library Juice Academy Python For Librarians

https://github.com/elibtronic/lja_python_for_librarians

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (1.2%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: elibtronic
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 8.81 MB
Statistics
  • Stars: 1
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 5 years ago · Last pushed 7 months ago
Metadata Files
Readme Citation

README.md

Python for Librarians


Instructional material for Library Juice Academy Python For Librarians

Week 1

Week 1 Workalong

Week 1 Homework

Week 2

Week 2 Workalong

Week 2 Homework

Week 3

Week 3 Workalong

Week 3 Homework

Week 4

Week 4 Workalong

Week 4 Homework

Owner

  • Name: Tim Ribaric
  • Login: elibtronic
  • Kind: user
  • Location: Canada

Citation (Citation Data Prep for Week 4 Homework.ipynb)
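
The notebook reproduced below as raw JSON loads a tab-separated file, drops incomplete rows, bins citation counts, and keeps a binary top-decile flag. A minimal sketch of those steps follows, under stated assumptions: the inline sample stands in for 'F1000 data - Dryad.txt' and contains only the twelve rows visible in the notebook's own output, so its decile boundaries differ from those of the full 5811-row file. The `prepare` helper and the `SAMPLE_TSV` name are hypothetical, introduced here for illustration only.

```python
import io

import pandas as pd

# Stand-in for 'F1000 data - Dryad.txt' (assumption: tab-separated, blank = missing).
# Rows are the ones displayed in the notebook output, not the full dataset.
SAMPLE_TSV = "\n".join([
    "Journal\tScore1\tScore2\tIF2\tIF5\tNoOfScores\tCitations",
    "Immunity\t8\t\t24.221\t22.133\t1\t220",
    "Journal of the American Chemical Society\t6\t6\t9.023\t8.981\t2\t40",
    "Science (New York)\t8\t10\t31.377\t31.777\t2\t1003",
    "Gastroenterology\t6\t8\t12.032\t12.403\t2\t85",
    "Nature Medicine\t10\t\t25.430\t27.887\t1\t213",
    "Neuron\t8\t8\t14.027\t14.927\t2\t336",
    "The Journal of Cell Biology\t6\t6\t9.921\t10.123\t2\t177",
    "Nature Immunology\t10\t8\t25.668\t25.934\t2\t125",
    "Molecular and Cellular Biology\t8\t8\t6.188\t6.381\t2\t73",
    "RNA (New York)\t6\t6\t6.051\t5.486\t2\t20",
    "Molecular Biology and Evolution\t6\t8\t5.510\t8.907\t2\t11",
    "The Journal of Biological Chemistry\t6\t6\t5.328\t5.498\t2\t30",
])


def prepare(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper mirroring the notebook's preparation steps."""
    # Drop rows with any missing value; .copy() yields an independent frame,
    # so the column assignments below do not raise SettingWithCopyWarning.
    df = raw.dropna(how="any").copy()
    # Bin citation counts into 10 equal-sized groups (deciles), labelled 0..9.
    decile = pd.qcut(df["Citations"], 10, labels=False)
    # Keep only a binary flag: 1 for the top decile (label 9), 0 otherwise.
    df["TopCitation"] = (decile == 9).astype(int)
    # Shorten long journal names by exact match.
    df["Journal"] = df["Journal"].replace({
        "Proceedings of the National Academy of Sciences of the United States of America": "Proceedings",
        "Science (New York)": "Science",
    })
    # The raw count and the score tally are no longer needed.
    return df.drop(columns=["Citations", "NoOfScores"])


raw = pd.read_csv(io.StringIO(SAMPLE_TSV), sep="\t", skipinitialspace=True)
prepared = prepare(raw)
print(prepared)
```

Assigning the result back to the column, rather than calling `replace(..., inplace=True)` on a selected column, is a design choice made here so that the update reaches the DataFrame regardless of whether the selection is a view or a copy.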

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Citation Counts and Impact Factors\n",
    "\n",
    "The dataset is from [https://datadryad.org/stash/dataset/doi:10.5061/dryad.2h4j5](https://datadryad.org/stash/dataset/doi:10.5061/dryad.2h4j5); the accompanying article is [https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001675](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001675)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 1\n",
    "\n",
    "- Download the text file from [https://datadryad.org/stash/downloads/file_stream/30779](https://datadryad.org/stash/downloads/file_stream/30779)\n",
    "\n",
    "- Copy it into the working directory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2\n",
    "\n",
    "- Open with pandas and normalize"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Journal</th>\n",
       "      <th>Score1</th>\n",
       "      <th>Score2</th>\n",
       "      <th>IF2</th>\n",
       "      <th>IF5</th>\n",
       "      <th>NoOfScores</th>\n",
       "      <th>Citations</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>Immunity</td>\n",
       "      <td>8</td>\n",
       "      <td>NaN</td>\n",
       "      <td>24.221</td>\n",
       "      <td>22.133</td>\n",
       "      <td>1</td>\n",
       "      <td>220</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>Journal of the American Chemical Society</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>9.023</td>\n",
       "      <td>8.981</td>\n",
       "      <td>2</td>\n",
       "      <td>40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>Science (New York)</td>\n",
       "      <td>8</td>\n",
       "      <td>10.0</td>\n",
       "      <td>31.377</td>\n",
       "      <td>31.777</td>\n",
       "      <td>2</td>\n",
       "      <td>1003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>Gastroenterology</td>\n",
       "      <td>6</td>\n",
       "      <td>8.0</td>\n",
       "      <td>12.032</td>\n",
       "      <td>12.403</td>\n",
       "      <td>2</td>\n",
       "      <td>85</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>Nature Medicine</td>\n",
       "      <td>10</td>\n",
       "      <td>NaN</td>\n",
       "      <td>25.430</td>\n",
       "      <td>27.887</td>\n",
       "      <td>1</td>\n",
       "      <td>213</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5806</td>\n",
       "      <td>Cerebral Cortex</td>\n",
       "      <td>6</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.844</td>\n",
       "      <td>7.200</td>\n",
       "      <td>1</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5807</td>\n",
       "      <td>The Journal of Biological Chemistry</td>\n",
       "      <td>6</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5.328</td>\n",
       "      <td>5.498</td>\n",
       "      <td>1</td>\n",
       "      <td>108</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5808</td>\n",
       "      <td>Molecular Biology and Evolution</td>\n",
       "      <td>6</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5.510</td>\n",
       "      <td>8.907</td>\n",
       "      <td>1</td>\n",
       "      <td>37</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5809</td>\n",
       "      <td>The Journal of Biological Chemistry</td>\n",
       "      <td>8</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5.328</td>\n",
       "      <td>5.498</td>\n",
       "      <td>1</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5810</td>\n",
       "      <td>Photochemistry and Photobiology</td>\n",
       "      <td>6</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2.679</td>\n",
       "      <td>2.552</td>\n",
       "      <td>1</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5811 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                       Journal  Score1  Score2     IF2  \\\n",
       "0                                     Immunity       8     NaN  24.221   \n",
       "1     Journal of the American Chemical Society       6     6.0   9.023   \n",
       "2                           Science (New York)       8    10.0  31.377   \n",
       "3                             Gastroenterology       6     8.0  12.032   \n",
       "4                              Nature Medicine      10     NaN  25.430   \n",
       "...                                        ...     ...     ...     ...   \n",
       "5806                           Cerebral Cortex       6     NaN   6.844   \n",
       "5807       The Journal of Biological Chemistry       6     NaN   5.328   \n",
       "5808           Molecular Biology and Evolution       6     NaN   5.510   \n",
       "5809       The Journal of Biological Chemistry       8     NaN   5.328   \n",
       "5810           Photochemistry and Photobiology       6     NaN   2.679   \n",
       "\n",
       "         IF5  NoOfScores  Citations  \n",
       "0     22.133           1        220  \n",
       "1      8.981           2         40  \n",
       "2     31.777           2       1003  \n",
       "3     12.403           2         85  \n",
       "4     27.887           1        213  \n",
       "...      ...         ...        ...  \n",
       "5806   7.200           1         12  \n",
       "5807   5.498           1        108  \n",
       "5808   8.907           1         37  \n",
       "5809   5.498           1         10  \n",
       "5810   2.552           1         11  \n",
       "\n",
       "[5811 rows x 7 columns]"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#read in our tab-separated text file; blank fields are read as NaN (missing values)\n",
    "citation_data = pandas.read_csv('F1000 data - Dryad.txt', sep=\"\\t\", skipinitialspace = True)\n",
    "#no need to modify columns!\n",
    "citation_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Journal</th>\n",
       "      <th>Score1</th>\n",
       "      <th>Score2</th>\n",
       "      <th>IF2</th>\n",
       "      <th>IF5</th>\n",
       "      <th>NoOfScores</th>\n",
       "      <th>Citations</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>Journal of the American Chemical Society</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>9.023</td>\n",
       "      <td>8.981</td>\n",
       "      <td>2</td>\n",
       "      <td>40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>Science (New York)</td>\n",
       "      <td>8</td>\n",
       "      <td>10.0</td>\n",
       "      <td>31.377</td>\n",
       "      <td>31.777</td>\n",
       "      <td>2</td>\n",
       "      <td>1003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>Gastroenterology</td>\n",
       "      <td>6</td>\n",
       "      <td>8.0</td>\n",
       "      <td>12.032</td>\n",
       "      <td>12.403</td>\n",
       "      <td>2</td>\n",
       "      <td>85</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>Neuron</td>\n",
       "      <td>8</td>\n",
       "      <td>8.0</td>\n",
       "      <td>14.027</td>\n",
       "      <td>14.927</td>\n",
       "      <td>2</td>\n",
       "      <td>336</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>The Journal of Cell Biology</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>9.921</td>\n",
       "      <td>10.123</td>\n",
       "      <td>2</td>\n",
       "      <td>177</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5753</td>\n",
       "      <td>Nature Immunology</td>\n",
       "      <td>10</td>\n",
       "      <td>8.0</td>\n",
       "      <td>25.668</td>\n",
       "      <td>25.934</td>\n",
       "      <td>2</td>\n",
       "      <td>125</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5774</td>\n",
       "      <td>Molecular and Cellular Biology</td>\n",
       "      <td>8</td>\n",
       "      <td>8.0</td>\n",
       "      <td>6.188</td>\n",
       "      <td>6.381</td>\n",
       "      <td>2</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5781</td>\n",
       "      <td>RNA (New York)</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>6.051</td>\n",
       "      <td>5.486</td>\n",
       "      <td>2</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5782</td>\n",
       "      <td>Molecular Biology and Evolution</td>\n",
       "      <td>6</td>\n",
       "      <td>8.0</td>\n",
       "      <td>5.510</td>\n",
       "      <td>8.907</td>\n",
       "      <td>2</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5791</td>\n",
       "      <td>The Journal of Biological Chemistry</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>5.328</td>\n",
       "      <td>5.498</td>\n",
       "      <td>2</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1328 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                       Journal  Score1  Score2     IF2  \\\n",
       "1     Journal of the American Chemical Society       6     6.0   9.023   \n",
       "2                           Science (New York)       8    10.0  31.377   \n",
       "3                             Gastroenterology       6     8.0  12.032   \n",
       "5                                       Neuron       8     8.0  14.027   \n",
       "6                  The Journal of Cell Biology       6     6.0   9.921   \n",
       "...                                        ...     ...     ...     ...   \n",
       "5753                         Nature Immunology      10     8.0  25.668   \n",
       "5774            Molecular and Cellular Biology       8     8.0   6.188   \n",
       "5781                            RNA (New York)       6     6.0   6.051   \n",
       "5782           Molecular Biology and Evolution       6     8.0   5.510   \n",
       "5791       The Journal of Biological Chemistry       6     6.0   5.328   \n",
       "\n",
       "         IF5  NoOfScores  Citations  \n",
       "1      8.981           2         40  \n",
       "2     31.777           2       1003  \n",
       "3     12.403           2         85  \n",
       "5     14.927           2        336  \n",
       "6     10.123           2        177  \n",
       "...      ...         ...        ...  \n",
       "5753  25.934           2        125  \n",
       "5774   6.381           2         73  \n",
       "5781   5.486           2         20  \n",
       "5782   8.907           2         11  \n",
       "5791   5.498           2         30  \n",
       "\n",
       "[1328 rows x 7 columns]"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#we drop rows with any missing values\n",
    "citation_data = citation_data.dropna(how='any')\n",
    "citation_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/tim/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
      "  \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Journal</th>\n",
       "      <th>Score1</th>\n",
       "      <th>Score2</th>\n",
       "      <th>IF2</th>\n",
       "      <th>IF5</th>\n",
       "      <th>NoOfScores</th>\n",
       "      <th>Citations</th>\n",
       "      <th>TopCitation</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>Journal of the American Chemical Society</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>9.023</td>\n",
       "      <td>8.981</td>\n",
       "      <td>2</td>\n",
       "      <td>40</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>Science (New York)</td>\n",
       "      <td>8</td>\n",
       "      <td>10.0</td>\n",
       "      <td>31.377</td>\n",
       "      <td>31.777</td>\n",
       "      <td>2</td>\n",
       "      <td>1003</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>Gastroenterology</td>\n",
       "      <td>6</td>\n",
       "      <td>8.0</td>\n",
       "      <td>12.032</td>\n",
       "      <td>12.403</td>\n",
       "      <td>2</td>\n",
       "      <td>85</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>Neuron</td>\n",
       "      <td>8</td>\n",
       "      <td>8.0</td>\n",
       "      <td>14.027</td>\n",
       "      <td>14.927</td>\n",
       "      <td>2</td>\n",
       "      <td>336</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>The Journal of Cell Biology</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>9.921</td>\n",
       "      <td>10.123</td>\n",
       "      <td>2</td>\n",
       "      <td>177</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5753</td>\n",
       "      <td>Nature Immunology</td>\n",
       "      <td>10</td>\n",
       "      <td>8.0</td>\n",
       "      <td>25.668</td>\n",
       "      <td>25.934</td>\n",
       "      <td>2</td>\n",
       "      <td>125</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5774</td>\n",
       "      <td>Molecular and Cellular Biology</td>\n",
       "      <td>8</td>\n",
       "      <td>8.0</td>\n",
       "      <td>6.188</td>\n",
       "      <td>6.381</td>\n",
       "      <td>2</td>\n",
       "      <td>73</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5781</td>\n",
       "      <td>RNA (New York)</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>6.051</td>\n",
       "      <td>5.486</td>\n",
       "      <td>2</td>\n",
       "      <td>20</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5782</td>\n",
       "      <td>Molecular Biology and Evolution</td>\n",
       "      <td>6</td>\n",
       "      <td>8.0</td>\n",
       "      <td>5.510</td>\n",
       "      <td>8.907</td>\n",
       "      <td>2</td>\n",
       "      <td>11</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5791</td>\n",
       "      <td>The Journal of Biological Chemistry</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>5.328</td>\n",
       "      <td>5.498</td>\n",
       "      <td>2</td>\n",
       "      <td>30</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1328 rows × 8 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                       Journal  Score1  Score2     IF2  \\\n",
       "1     Journal of the American Chemical Society       6     6.0   9.023   \n",
       "2                           Science (New York)       8    10.0  31.377   \n",
       "3                             Gastroenterology       6     8.0  12.032   \n",
       "5                                       Neuron       8     8.0  14.027   \n",
       "6                  The Journal of Cell Biology       6     6.0   9.921   \n",
       "...                                        ...     ...     ...     ...   \n",
       "5753                         Nature Immunology      10     8.0  25.668   \n",
       "5774            Molecular and Cellular Biology       8     8.0   6.188   \n",
       "5781                            RNA (New York)       6     6.0   6.051   \n",
       "5782           Molecular Biology and Evolution       6     8.0   5.510   \n",
       "5791       The Journal of Biological Chemistry       6     6.0   5.328   \n",
       "\n",
       "         IF5  NoOfScores  Citations  TopCitation  \n",
       "1      8.981           2         40            1  \n",
       "2     31.777           2       1003            9  \n",
       "3     12.403           2         85            3  \n",
       "5     14.927           2        336            8  \n",
       "6     10.123           2        177            6  \n",
       "...      ...         ...        ...          ...  \n",
       "5753  25.934           2        125            5  \n",
       "5774   6.381           2         73            2  \n",
       "5781   5.486           2         20            0  \n",
       "5782   8.907           2         11            0  \n",
       "5791   5.498           2         30            0  \n",
       "\n",
       "[1328 rows x 8 columns]"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#split into deciles (10 equal-sized bins, labelled 0-9)\n",
    "citation_data[\"TopCitation\"] = pandas.qcut(citation_data[\"Citations\"],10,labels=False)\n",
    "citation_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Journal</th>\n",
       "      <th>Score1</th>\n",
       "      <th>Score2</th>\n",
       "      <th>IF2</th>\n",
       "      <th>IF5</th>\n",
       "      <th>NoOfScores</th>\n",
       "      <th>TopCitation</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>Journal of the American Chemical Society</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>9.023</td>\n",
       "      <td>8.981</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>Science (New York)</td>\n",
       "      <td>8</td>\n",
       "      <td>10.0</td>\n",
       "      <td>31.377</td>\n",
       "      <td>31.777</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>Gastroenterology</td>\n",
       "      <td>6</td>\n",
       "      <td>8.0</td>\n",
       "      <td>12.032</td>\n",
       "      <td>12.403</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>Neuron</td>\n",
       "      <td>8</td>\n",
       "      <td>8.0</td>\n",
       "      <td>14.027</td>\n",
       "      <td>14.927</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>The Journal of Cell Biology</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>9.921</td>\n",
       "      <td>10.123</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5753</td>\n",
       "      <td>Nature Immunology</td>\n",
       "      <td>10</td>\n",
       "      <td>8.0</td>\n",
       "      <td>25.668</td>\n",
       "      <td>25.934</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5774</td>\n",
       "      <td>Molecular and Cellular Biology</td>\n",
       "      <td>8</td>\n",
       "      <td>8.0</td>\n",
       "      <td>6.188</td>\n",
       "      <td>6.381</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5781</td>\n",
       "      <td>RNA (New York)</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>6.051</td>\n",
       "      <td>5.486</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5782</td>\n",
       "      <td>Molecular Biology and Evolution</td>\n",
       "      <td>6</td>\n",
       "      <td>8.0</td>\n",
       "      <td>5.510</td>\n",
       "      <td>8.907</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5791</td>\n",
       "      <td>The Journal of Biological Chemistry</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>5.328</td>\n",
       "      <td>5.498</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1328 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                       Journal  Score1  Score2     IF2  \\\n",
       "1     Journal of the American Chemical Society       6     6.0   9.023   \n",
       "2                           Science (New York)       8    10.0  31.377   \n",
       "3                             Gastroenterology       6     8.0  12.032   \n",
       "5                                       Neuron       8     8.0  14.027   \n",
       "6                  The Journal of Cell Biology       6     6.0   9.921   \n",
       "...                                        ...     ...     ...     ...   \n",
       "5753                         Nature Immunology      10     8.0  25.668   \n",
       "5774            Molecular and Cellular Biology       8     8.0   6.188   \n",
       "5781                            RNA (New York)       6     6.0   6.051   \n",
       "5782           Molecular Biology and Evolution       6     8.0   5.510   \n",
       "5791       The Journal of Biological Chemistry       6     6.0   5.328   \n",
       "\n",
       "         IF5  NoOfScores  TopCitation  \n",
       "1      8.981           2            0  \n",
       "2     31.777           2            1  \n",
       "3     12.403           2            0  \n",
       "5     14.927           2            0  \n",
       "6     10.123           2            0  \n",
       "...      ...         ...          ...  \n",
       "5753  25.934           2            0  \n",
       "5774   6.381           2            0  \n",
       "5781   5.486           2            0  \n",
       "5782   8.907           2            0  \n",
       "5791   5.498           2            0  \n",
       "\n",
       "[1328 rows x 7 columns]"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#flag the top decile (label 9) as 1 and every other decile as 0\n",
    "citation_data[\"TopCitation\"] = (citation_data[\"TopCitation\"] == 9).astype(int)\n",
    "citation_data.pop(\"Citations\")\n",
    "citation_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Journal</th>\n",
       "      <th>Score1</th>\n",
       "      <th>Score2</th>\n",
       "      <th>IF2</th>\n",
       "      <th>IF5</th>\n",
       "      <th>NoOfScores</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>TopCitation</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1197</td>\n",
       "      <td>1197</td>\n",
       "      <td>1197</td>\n",
       "      <td>1197</td>\n",
       "      <td>1197</td>\n",
       "      <td>1197</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>131</td>\n",
       "      <td>131</td>\n",
       "      <td>131</td>\n",
       "      <td>131</td>\n",
       "      <td>131</td>\n",
       "      <td>131</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             Journal  Score1  Score2   IF2   IF5  NoOfScores\n",
       "TopCitation                                                 \n",
       "0               1197    1197    1197  1197  1197        1197\n",
       "1                131     131     131   131   131         131"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "citation_data.groupby(\"TopCitation\").count()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Let's shorted a couple of our labels\n",
    "# Proceedings of the National Academy of Sciences of the United States of America to Proceedings\n",
    "# Science (New York) to Science\n",
    "citation_data[\"Journal\"].mask(citation_data[\"Journal\"]== \"Proceedings of the National Academy of Sciences of the United States of America\",'Proceedings',inplace=True)\n",
    "citation_data[\"Journal\"].mask(citation_data[\"Journal\"]== \"Science (New York)\",'Science',inplace=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1       2\n",
       "2       2\n",
       "3       2\n",
       "5       2\n",
       "6       2\n",
       "       ..\n",
       "5753    2\n",
       "5774    2\n",
       "5781    2\n",
       "5782    2\n",
       "5791    2\n",
       "Name: NoOfScores, Length: 1328, dtype: int64"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#NoOFScores doesn't really help us...\n",
    "citation_data.pop(\"NoOfScores\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Journal</th>\n",
       "      <th>Score1</th>\n",
       "      <th>Score2</th>\n",
       "      <th>IF2</th>\n",
       "      <th>IF5</th>\n",
       "      <th>TopCitation</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>Journal of the American Chemical Society</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>9.023</td>\n",
       "      <td>8.981</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>Science</td>\n",
       "      <td>8</td>\n",
       "      <td>10.0</td>\n",
       "      <td>31.377</td>\n",
       "      <td>31.777</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>Gastroenterology</td>\n",
       "      <td>6</td>\n",
       "      <td>8.0</td>\n",
       "      <td>12.032</td>\n",
       "      <td>12.403</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>Neuron</td>\n",
       "      <td>8</td>\n",
       "      <td>8.0</td>\n",
       "      <td>14.027</td>\n",
       "      <td>14.927</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>The Journal of Cell Biology</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>9.921</td>\n",
       "      <td>10.123</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5753</td>\n",
       "      <td>Nature Immunology</td>\n",
       "      <td>10</td>\n",
       "      <td>8.0</td>\n",
       "      <td>25.668</td>\n",
       "      <td>25.934</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5774</td>\n",
       "      <td>Molecular and Cellular Biology</td>\n",
       "      <td>8</td>\n",
       "      <td>8.0</td>\n",
       "      <td>6.188</td>\n",
       "      <td>6.381</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5781</td>\n",
       "      <td>RNA (New York)</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>6.051</td>\n",
       "      <td>5.486</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5782</td>\n",
       "      <td>Molecular Biology and Evolution</td>\n",
       "      <td>6</td>\n",
       "      <td>8.0</td>\n",
       "      <td>5.510</td>\n",
       "      <td>8.907</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5791</td>\n",
       "      <td>The Journal of Biological Chemistry</td>\n",
       "      <td>6</td>\n",
       "      <td>6.0</td>\n",
       "      <td>5.328</td>\n",
       "      <td>5.498</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1328 rows × 6 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                       Journal  Score1  Score2     IF2  \\\n",
       "1     Journal of the American Chemical Society       6     6.0   9.023   \n",
       "2                                      Science       8    10.0  31.377   \n",
       "3                             Gastroenterology       6     8.0  12.032   \n",
       "5                                       Neuron       8     8.0  14.027   \n",
       "6                  The Journal of Cell Biology       6     6.0   9.921   \n",
       "...                                        ...     ...     ...     ...   \n",
       "5753                         Nature Immunology      10     8.0  25.668   \n",
       "5774            Molecular and Cellular Biology       8     8.0   6.188   \n",
       "5781                            RNA (New York)       6     6.0   6.051   \n",
       "5782           Molecular Biology and Evolution       6     8.0   5.510   \n",
       "5791       The Journal of Biological Chemistry       6     6.0   5.328   \n",
       "\n",
       "         IF5  TopCitation  \n",
       "1      8.981            0  \n",
       "2     31.777            1  \n",
       "3     12.403            0  \n",
       "5     14.927            0  \n",
       "6     10.123            0  \n",
       "...      ...          ...  \n",
       "5753  25.934            0  \n",
       "5774   6.381            0  \n",
       "5781   5.486            0  \n",
       "5782   8.907            0  \n",
       "5791   5.498            0  \n",
       "\n",
       "[1328 rows x 6 columns]"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "citation_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
    "citation_data.to_csv('week_4_citation_homework.csv',index=False)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
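
The closing cells of the notebook above shorten two journal labels, drop the `NoOfScores` column, count rows per `TopCitation` class, and export a CSV. The following is a minimal sketch of those steps, assuming pandas is installed. The three sample rows and the in-memory buffer are made up for illustration and are not taken from the Dryad dataset; the notebook itself writes to `week_4_citation_homework.csv`.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real notebook works on the Dryad dataset.
LONG_PNAS = (
    "Proceedings of the National Academy of Sciences "
    "of the United States of America"
)
citation_data = pd.DataFrame(
    {
        "Journal": [LONG_PNAS, "Science (New York)", "Neuron"],
        "Score1": [6, 8, 8],
        "NoOfScores": [2, 2, 2],
        "TopCitation": [0, 1, 0],
    }
)

# Shorten the two long labels, assigning the result back to the column
# rather than relying on an in-place call on a column selection.
citation_data["Journal"] = citation_data["Journal"].replace(
    {LONG_PNAS: "Proceedings", "Science (New York)": "Science"}
)

# Drop the constant column; drop() returns a new frame.
citation_data = citation_data.drop(columns=["NoOfScores"])

# Count rows in each TopCitation class, similar to the groupby cell.
class_counts = citation_data.groupby("TopCitation").size()

# Write the CSV text to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
citation_data.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Assigning the replaced column back is one way to keep the update explicit; it should behave the same whether or not the installed pandas version has Copy-on-Write enabled.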

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1