cholera-disease-analysis
This repository presents a data analysis, written in the R programming language, of cholera, a disease that has been killing people for the last two centuries.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.2%) to scientific vocabulary
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Cholera-Disease-Analysis

1. Objective
- The objective of this project is to raise awareness of cholera and to understand its global spread using the R programming language. It also looks at the measures that can help eradicate a disease that has been killing people since the 1850s: developing the required healthcare infrastructure and providing access to clean drinking water and vaccines.
- Please watch the following videos to learn more about cholera and the story of John Snow, who used data analysis to find the source of the 1854 cholera outbreak that killed 600 people in just a few weeks in London.
| Sr. No | Title | Link To Video |
|:-----:|:-------------:|:--------:|
| 1 | John Snow & The 1854 Broad Street Cholera Outbreak | https://www.youtube.com/watch?feature=player_embedded&v=lNjrAXGRda4 |
| 2 | The Pandemic The World Has Forgotten | https://www.youtube.com/watch?feature=player_embedded&v=hj95IZMlZWw |
2. Introduction
- This repository presents a data analysis of cholera, a disease that has been killing people for the last two centuries. The analysis provides evidence that in countries without access to clean drinking water, sanitation, or basic healthcare infrastructure, cholera can kill thousands of people, if not millions.

- The above graphs show the top 12 countries with the most reported cholera cases (1949-2016). India and Bangladesh, which had half a million cases reported in the late 1940s, now report relatively few cases and deaths thanks to the availability of vaccination and sanitation. In Sub-Saharan Africa, Latin America, and the Caribbean, however, cholera remains a major threat because the healthcare infrastructure in these regions is weak and fragile (refer to the figure below).
- The following short documentary explains how the cholera outbreak in Haiti happened in 2010, after a devastating earthquake destroyed most of the country's infrastructure and forced millions of people to live in tents, as in a refugee camp. It shows how weak and fragile Haiti's infrastructure was and how that weakness, together with other consequences of poor infrastructure and governance, led to the outbreak.
3. How To Execute
- To view the Jupyter notebook, i.e. Cholera-Disease-Analysis.ipynb, click on the badge. If you want to execute the Jupyter notebook or the R code in the R IDE, follow the steps given below:
3.1 Execution Using Binder
- Click on the badge to execute the Jupyter notebook in Binder, which gives you an interactive experience.
NOTE: Binder will take at least 10-15 minutes to create an executable environment, because it installs all the dependencies for R and this project.
3.2 Execution On Local System
- You can also execute the R code on your local system in the R IDE: clone this repository, open the R IDE, set the working directory to the directory where the cloned repository is stored, and execute the Cholera-Disease-Analysis.r file.
NOTE: Maximise all the plots created by the R code in the R IDE for a better view.
- You can also execute the Jupyter notebook, i.e. Cholera-Disease-Analysis.ipynb, on your local system if you have Jupyter Notebook and the R kernel for Jupyter installed.
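The local steps above (set the working directory to the clone, then run the script) can be sketched in R as follows. This is a minimal sketch, not part of the repository: the helper name `run_analysis` and the example clone path are hypothetical, and only the script name comes from this README.

```r
# Hypothetical helper: run the analysis script from a local clone.
# `repo_dir` is wherever you cloned the repository; the default script
# name is the one mentioned in this README.
run_analysis <- function(repo_dir, script = "Cholera-Disease-Analysis.r") {
  script_path <- file.path(repo_dir, script)
  if (!file.exists(script_path)) {
    stop("Script not found: ", script_path)
  }
  # Switch to the clone so relative paths inside the script resolve,
  # and restore the previous working directory when the function exits.
  old_wd <- setwd(repo_dir)
  on.exit(setwd(old_wd), add = TRUE)
  source(script, echo = FALSE)
  invisible(script_path)
}

# Example (the path is a placeholder for your own clone location):
# run_analysis("~/Cholera-Disease-Analysis")
```

Restoring the previous working directory keeps the rest of your R session unaffected after the script finishes.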
4. About Dataset
- The dataset contains the total number of cases, the number of deaths, and the case fatality rate (CFR) for cholera from 1949 to 2016.
- The dataset was downloaded from the WHO Global Health Observatory (GHO) Data Repository website, which tracks disease outbreaks globally.
| Sr. No | Dataset Name | Source |
|:-----:|:-------------:|:--------:|
| 1 | Cholera Disease Annual Cases, Deaths and Fatality Rate Dataset (1949-2016) | WHO GHO Data Repository: Cholera Disease Dataset Download Link |
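As an illustration of the quantities in this dataset, the sketch below computes the case fatality rate (deaths divided by cases, as a percentage) and ranks countries by total reported cases. It is not the code from Cholera-Disease-Analysis.r: the column names `country`, `year`, `cases`, and `deaths`, the two function names, and the toy rows are all assumptions made for this example, and the numbers are invented rather than taken from the WHO data.

```r
# Case fatality rate in percent: deaths / cases * 100.
# Returns NA where no cases were reported, to avoid dividing by zero.
case_fatality_rate <- function(cases, deaths) {
  ifelse(cases > 0, deaths / cases * 100, NA_real_)
}

# Rank countries by total reported cases over all years and keep the top n.
# The column names `country` and `cases` are assumptions for illustration.
top_countries <- function(df, n = 12) {
  totals <- aggregate(cases ~ country, data = df, FUN = sum)
  totals <- totals[order(-totals$cases), ]
  head(totals, n)
}

# Toy rows with made-up numbers (NOT taken from the WHO dataset).
toy <- data.frame(
  country = c("A", "A", "B", "C"),
  year    = c(2000, 2001, 2000, 2000),
  cases   = c(100, 50, 400, 0),
  deaths  = c(5, 1, 8, 0)
)
toy$cfr <- case_fatality_rate(toy$cases, toy$deaths)
print(toy)
print(top_countries(toy, n = 2))
```

With the real dataset, the same two functions would apply once its columns are renamed to match the assumed names.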
NOTE: If you want to cite this repository, please copy the relevant style information (APA or BibTeX) provided under the "Cite this repository" option, as shown in the tutorial: https://github.blog/wp-content/uploads/2021/08/GitHub-citation-demo.gif
GNU General Public License v3.0
Owner
- Name: Suraj Sharma
- Login: strikersps
- Kind: user
- Location: Bangalore, Karnataka, India
- Company: Software Engineer, CommScope
- Website: https://www.linkedin.com/in/sps22/
- Twitter: _noble_liar_
- Repositories: 6
- Profile: https://github.com/strikersps
I am a Data Scientist and Software Developer. I love programming, writing, and playing soccer.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Sharma"
    given-names: "Suraj"
title: "Cholera-Disease-Analysis"
version: 1.0.0
url: "https://github.com/strikersps/Cholera-Disease-Analysis"
