public_open_source_data_science
A repository of open source data science projects for social good
https://github.com/neelsoumya/public_open_source_data_science
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 7 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.2%) to scientific vocabulary
Statistics
- Stars: 3
- Watchers: 4
- Forks: 1
- Open Issues: 0
- Releases: 4
Metadata Files
README.md
Introduction
Source code and data for open source data science for social good. This is a data science portfolio.
List of projects
1) university_sexcrimes
Analysis of data on sex crimes on US university campuses.
2) heartdiseaserisk_prediction
Predicting heart disease risk from open data.
3) cancermortalityprediction
Predicting cancer survival using logistic regression from open data.
4) predictingnewspopularity
Predicting popularity of news articles from open data.
5) opensourcemappingproject
Open source mapping project.
6) astroinformatics
Analysis of astronomy data using machine learning techniques.
7) scientific_collaboration
Project to analyze planetary-scale scientific collaboration data.
8) accident_prediction
Road accident forecasting and data exploration project.
Interactive website using shiny at:
https://neelsoumya.shinyapps.io/accident_prediction/
9) patternsincrime
Predicting patterns of crime using data science. Larger cities have disproportionately more crime per capita than smaller cities (super-linear scaling of crime). We used techniques from dynamical systems and complex systems to explain the super-linear scaling of crime in cities and other socio-technological systems.
10) spam_classification
Building an SVM-based spam classifier trained on data from the UCI repository.
11) breastcancerprediction
Downloads data from the UCI machine learning repository and uses a random forest to make predictions for breast cancer. A few features, such as epithelial cell size, turn out to be especially important for prediction.
12) fundingtrendsscience
Project to analyze data on funding trends in biomedical science.
13) infectiousdiseaseprediction
Project to analyze data on emerging infectious diseases.
14) forecasting_imports
Project to forecast imports and model supply chains.
15) deeplearningbasic
Basic deep learning model using Keras for prediction.
16) ai_healthcare
Machine learning and AI applied to healthcare.
17) aisocialgood
Machine learning, data science and AI for social good.
18) aibigdatabiology
Machine learning and bioinformatics for big data in biology.
19) browserbaseddata_science
Browser-based data science for democratic access to data science tools.
20) clinical_informatics
Open source privacy-preserving clinical informatics.
21) policypapergeneral_public
Policy paper for general public on Ethical Artificial Intelligence (EAI) for social good.
22) nlp
Resources, code and data for natural language processing.
23) selforganisingmapwinedataset
A self-organising map (SOM) on the UCI wine dataset using the Orange data science tool.
24) LLMs
Hackathons and resources for large language models (LLMs).
25) outreach
Outreach for machine learning and AI for the general public.
26) teaching_resources
Teaching resources for machine learning, data science and AI for a general audience.
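The super-linear scaling mentioned under patternsincrime can be illustrated with a short sketch. It assumes a power-law relation, crime = c * population^beta, and estimates beta by least squares on log-transformed values; beta > 1 corresponds to more crime per capita in larger cities. The function name and the data are made up for illustration and are not taken from the repository, and this is only a way to detect the scaling, not the dynamical-systems analysis used in the project.

```python
import math


def fit_scaling_exponent(populations, crime_counts):
    """Least-squares fit of log(crime) = log(c) + beta * log(population).

    Returns (beta, c). Hypothetical helper, not part of the repository.
    """
    xs = [math.log(p) for p in populations]
    ys = [math.log(c) for c in crime_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                  # slope on the log-log scale
    log_c = mean_y - beta * mean_x    # intercept on the log-log scale
    return beta, math.exp(log_c)


# Synthetic data generated from crime = 0.01 * population ** 1.2 (illustrative only).
populations = [10_000, 50_000, 100_000, 500_000, 1_000_000]
crime_counts = [0.01 * p ** 1.2 for p in populations]

beta, prefactor = fit_scaling_exponent(populations, crime_counts)
print(f"estimated exponent beta = {beta:.3f}")
print("super-linear" if beta > 1 else "linear or sub-linear")
```

With real city data the points will scatter around the fitted line, so the estimated exponent should be reported with a confidence interval before concluding that the scaling is super-linear.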
What is this repository for?
Quick summary
- Open source code and data for data science for social good.
Citation
If you use this code, please cite the paper and code:
- Citizen Data Science for Social Good: Case Studies and Vignettes from Recent Projects https://doi.org/10.13140/RG.2.1.1846.6002
- Citizen Data Science for Social Good in Complex Systems, Interdisciplinary Description of Complex Systems, 16(1):88-91, 2018 http://indecs.eu/index.php?s=x&y=2018&p=88-91
- Banerjee, Soumya. (2017, September 3). Citizen Data Science for Social Good: Case Studies and Vignettes from Recent Projects (Supplementary Resources). Zenodo. http://doi.org/10.5281/zenodo.883783
These projects are an example of my approach to data science for good. I work very closely with domain experts and stakeholders and use computational tools for good. I outline my design and work philosophy below.
Installation
Install R, RStudio, MATLAB and Python.
Install R (https://www.r-project.org/) and RStudio (https://www.rstudio.com/products/rstudio/download/preview/), then run the following installation script in R:
source("https://raw.githubusercontent.com/neelsoumya/rlib/master/INSTALL_MANY_MODULES.R")
Install Python dependencies as follows:
pip3 install -r requirements.txt
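After installing, it can help to confirm that the Python dependencies are actually present. The following is a minimal sketch using only the standard library; the helper names and the example requirement lines are hypothetical, since the actual contents of requirements.txt are not shown here, and the parsing is deliberately simplified (it ignores options such as `-r` and does not handle URLs).

```python
import re
from importlib import metadata


def requirement_names(text):
    """Extract distribution names from requirements-style text (simplified)."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line or line.startswith("-"):   # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Hypothetical requirement lines, for illustration only.
example = """
numpy>=1.20
pandas
scikit-learn==1.3.0  # pinned
"""
names = requirement_names(example)
print("declared:", names)
print("missing:", missing_packages(names))
```

To check the real file, read it with `open("requirements.txt").read()` and pass the text to `requirement_names`.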
Contact
Soumya Banerjee
https://sites.google.com/site/neelsoumya/
sb2333@cam.ac.uk
Owner
- Name: Soumya Banerjee
- Login: neelsoumya
- Kind: user
- Location: Cambridge, UK
- Company: University of Cambridge
- Website: https://sites.google.com/site/neelsoumya/
- Repositories: 249
- Profile: https://github.com/neelsoumya
My research interests are in complex systems data science, machine learning, computational biology, computational immunology and computational immunogenomics.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Banerjee"
    given-names: "Soumya"
    orcid: "https://orcid.org/0000-0001-7748-9885"
title: "CITIZEN DATA SCIENCE FOR SOCIAL GOOD IN COMPLEX SYSTEMS"
version: 1.0.0
doi: 10.7906/indecs.16.1.6
date-released: 2022-01-02
url: "http://indecs.eu/index.php?s=x&y=2018&p=88-91"
GitHub Events
Total
- Push event: 1
Last Year
- Push event: 1
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Soumya Banerjee | n****a@g****m | 41 |
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
