https://github.com/csyhuang/data-science-from-scratch
code for Data Science From Scratch book
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
○Academic publication links
-
✓Committers with academic emails
1 of 5 committers (20.0%) from academic institutions -
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (8.2%) to scientific vocabulary
Repository
code for Data Science From Scratch book
Basic Info
- Host: GitHub
- Owner: csyhuang
- License: unlicense
- Language: Python
- Default Branch: master
- Size: 210 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Data Science from Scratch
Here's all the code and examples from my book Data Science from Scratch. The code directory contains Python 2.7 versions, and the code-python3 direction contains the Python 3 equivalents. (I tested them in 3.5, but they should work in any 3.x.)
Each can be imported as a module, for example (after you cd into the /code directory):
python
from linear_algebra import distance, vector_mean
v = [1, 2, 3]
w = [4, 5, 6]
print distance(v, w)
print vector_mean([v, w])
Or can be run from the command line to get a demo of what it does (and to execute the examples from the book):
bat
python recommender_systems.py
Additionally, I've collected all the links from the book.
And, by popular demand, I made an index of functions defined in the book, by chapter and page number. The data is in a spreadsheet, or I also made a toy (experimental) searchable webapp.
Table of Contents
- Introduction
- A Crash Course in Python
- Visualizing Data
- Linear Algebra
- Statistics
- Probability
- Hypothesis and Inference
- Gradient Descent
- Getting Data
- Working With Data
- Machine Learning
- k-Nearest Neighbors
- Naive Bayes
- Simple Linear Regression
- Multiple Regression
- Logistic Regression
- Decision Trees
- Neural Networks
- Clustering
- Natural Language Processing
- Network Analysis
- Recommender Systems
- Databases and SQL
- MapReduce
- Go Forth And Do Data Science
Owner
- Name: Clare S. Y. Huang
- Login: csyhuang
- Kind: user
- Website: http://claresyhuang.info
- Twitter: claresyhuang
- Repositories: 43
- Profile: https://github.com/csyhuang
Data Scientist. Climate Scientist. Ph.D in Geophysical Sciences (U of Chicago). Love coding, writing and playing music.
GitHub Events
Total
Last Year
Committers
Last synced: over 2 years ago
Top Committers
| Name | Commits | |
|---|---|---|
| Joel Grus | j****s@g****m | 51 |
| Pandeesh | p****h@g****m | 1 |
| unknown | 曾****恺 | 1 |
| Brooke Anderson | g****s@j****u | 1 |
| Lucy Park | me@l****r | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: over 2 years ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0