https://github.com/cedrickchee/data-science-notebooks

Data science Python notebooks—a collection of Jupyter notebooks on machine learning, deep learning, statistical inference, data analysis and visualization.

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.7%) to scientific vocabulary

Keywords

data-science deep-learning fastai kaggle keras machine-learning notebooks numpy pandas python pytorch tensorflow
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: cedrickchee
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: master
  • Size: 3.3 MB
Statistics
  • Stars: 92
  • Watchers: 11
  • Forks: 29
  • Open Issues: 0
  • Releases: 0
Created over 7 years ago · Last pushed almost 3 years ago

README.md

Data Science Notebooks

Data science Python notebooks—a collection of Jupyter notebooks on machine learning, deep learning, statistical inference, data analysis and visualization.

This repo contains various Python Jupyter notebooks I have created to experiment with and learn the core libraries essential for working with data in Python, to work through exercises, assignments, and coursework, and to explore subjects that I find interesting, such as machine learning and deep learning. Familiarity with Python as a language is assumed.

The essential core libraries that I will be focusing on for working with data are NumPy, Pandas, Matplotlib, PyTorch, TensorFlow, Keras, Caffe, scikit-learn, spaCy, NLTK, Gensim, and related packages.

How to Use this Repo

  • Run the code using the Jupyter notebooks available in this repository's notebooks directory.
  • Launch a live notebook server with these notebooks using Binder.

About

The notebooks were written and tested with Python 3.6, though other Python 3.x versions should work in nearly all cases.

See index.ipynb for an index of the notebooks available.
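For a local clone, the notebooks can also be enumerated from Python. The following is a minimal sketch, assuming the notebooks are stored in the nbformat 4 JSON layout (a top-level `cells` list); `summarize_notebooks` is a hypothetical helper for illustration and is not part of this repository.

```python
import json
from pathlib import Path


def summarize_notebooks(root):
    """Return (relative path, number of code cells) for each notebook under root."""
    root = Path(root)
    summary = []
    for path in sorted(root.rglob("*.ipynb")):
        # Skip Jupyter's autosave copies.
        if ".ipynb_checkpoints" in path.parts:
            continue
        with path.open(encoding="utf-8") as f:
            nb = json.load(f)
        # Assumption: nbformat 4, where cells live in a top-level "cells" list.
        cells = nb.get("cells", [])
        n_code = sum(1 for cell in cells if cell.get("cell_type") == "code")
        summary.append((path.relative_to(root).as_posix(), n_code))
    return summary
```

Calling `summarize_notebooks("notebooks")` from the repository root would return one entry per notebook found in the notebooks directory.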

Software

The code in the notebooks was tested with Python 3.6, though most (but not all) of it should also run correctly on other Python 3.x versions.

The packages I used to run the code in the notebooks are listed in requirements.txt. Some of these exact version numbers may not be available on your platform, so you may have to tweak them for your own use. To install the requirements using conda, run the following at the command line:

    $ conda install --file requirements.txt

To create a stand-alone environment named DSN with Python 3.6 and all the required package versions, run the following:

    $ conda create -n DSN python=3.6 --file requirements.txt

You can read more about using conda environments in the Managing Environments section of the conda documentation.
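Because some pinned versions may need tweaking, it can help to see which pins the active environment already satisfies. The following is a minimal sketch, assuming requirements.txt contains simple `name==version` or conda-style `name=version` lines; `parse_pins` and `check_pins` are hypothetical helpers, and conda package names do not always match the installed distribution names that `importlib.metadata` looks up.

```python
import re
from importlib import metadata  # standard library in Python 3.8+

# name, "=" or "==", version, optional conda build string
PIN = re.compile(r"([A-Za-z0-9_.\-]+)\s*==?\s*([^=\s]+)(?:=\S+)?")
NAME = re.compile(r"[A-Za-z0-9_.][A-Za-z0-9_.\-]*")


def parse_pins(lines):
    """Map each requirement name to its pinned version (None if not exactly pinned)."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        pinned = PIN.fullmatch(line)
        if pinned:
            pins[pinned.group(1)] = pinned.group(2)
        else:
            name = NAME.match(line)
            if name:
                pins[name.group(0)] = None  # e.g. "scipy>=1.0" or a bare name
    return pins


def check_pins(pins):
    """Return {name: (wanted, installed)} for pins that are missing or differ."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or (wanted is not None and installed != wanted):
            problems[name] = (wanted, installed)
    return problems
```

One way to use it is `check_pins(parse_pins(open("requirements.txt")))`, which returns an empty dictionary when every pin is met.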

Deep Learning

Projects

| Notebook | Description |
| --- | --- |
| Deep Painterly Harmonization | Implement Deep Painterly Harmonization paper in PyTorch |
| Language modelling in Malay language for downstream NLP tasks | Implement Universal Language Model Fine-tuning for Text Classification (ULMFiT) in PyTorch |
| Not Hotdog AI Camera mobile app | Asia virtual study group project for fast.ai deep learning part 1, v3 course. Ship a convolutional neural network on Android/iOS with PyTorch and Android Studio/Xcode |

Language Models

Notebooks for trying out transformer and large language models.

| Notebook | Description |
| --- | --- |
| Flan-UL2 20B | Flan 20B with UL2 code walkthrough. Shows how to get it running on a single A100 40GB GPU with the HuggingFace library and 8-bit inference. Uses chain-of-thought (CoT) and zero-shot prompting (logical reasoning, story writing, common sense reasoning, speech writing). Tests large (2048) token input. |
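A rough back-of-the-envelope calculation suggests why 8-bit inference matters for a 20B-parameter model on one 40GB GPU. The sketch below counts weight memory only and ignores activations, caches, and framework overhead, so the figures are lower bounds; the byte widths per data type are standard, while `weight_memory_gb` is a hypothetical helper.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(n_params, dtype):
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


N_PARAMS = 20e9      # "20B", as stated in the notebook description
GPU_MEMORY_GB = 40   # one A100 40GB

for dtype in ("fp32", "fp16", "int8"):
    need = weight_memory_gb(N_PARAMS, dtype)
    headroom = need < GPU_MEMORY_GB
    print(f"{dtype}: ~{need:.0f} GB of weights, leaves headroom: {headroom}")
```

Under these assumptions, only the 8-bit weights (about 20 GB) leave room on the card for anything else; 16-bit weights alone would already fill it.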

DL Assignments, Exercises or Course Works

fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2018 (v2): Oct - Dec 2017

| Notebook | Description |
| --- | --- |
| lesson1, lesson1-vgg, lesson1-rxt50, keras_lesson1 | Lesson 1 - Recognizing Cats and Dogs |
| lesson2-image_models | Lesson 2 - Improving Your Image Classifier |
| lesson3-rossman | Lesson 3 - Understanding Convolutions |
| lesson4-imdb | Lesson 4 - Structured Time Series and Language Models |
| lesson5-movielens | Lesson 5 - Collaborative Filtering; Inside the Training Loop |
| lesson6-rnn, lesson6-sgd | Lesson 6 - Interpreting Embeddings; RNNs from Scratch |
| lesson7-cifar10, lesson7-CAM | Lesson 7 - ResNets from Scratch |

fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2019 (v3): Oct - Dec 2018

Deep Learning Part 1: 2019 Edition

| Notebook | Description |
| --- | --- |
| 00notebooktutorial.ipynb, lesson1-pets.ipynb | Lesson 1 - Image Recognition |
| lesson2-download.ipynb, lesson2-sgd.ipynb | Lesson 2 - Computer Vision: Deeper Applications |
| lesson3-planet.ipynb, lesson3-camvid.ipynb, lesson3-head-pose.ipynb, lesson3-imdb.ipynb | Lesson 3 - Multi-label, Segmentation, Image Regression, and More |
| lesson4-tabular.ipynb, lesson4-collab.ipynb | Lesson 4 - NLP, Tabular, and Collaborative Filtering |
| lesson5-sgd-mnist.ipynb | Lesson 5 - Foundations of Neural Networks |
| lesson6-rossmann.ipynb, rossmandataclean.ipynb, lesson6-pets-more.ipynb | Lesson 6 - Foundations of Convolutional Neural Networks |
| lesson7-resnet-mnist.ipynb, lesson7-superres-gan.ipynb, lesson7-superres-imagenet.ipynb, lesson7-superres.ipynb, lesson7-wgan.ipynb, lesson7-human-numbers.ipynb | Lesson 7 - ResNets, U-Nets, GANs and RNNs |

fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2017 (v1): Feb - Apr 2017

Deep Learning Part 2: 2017 Edition

| Notebook | Description |
| --- | --- |
| neural-style | Lesson 8 - Artistic Style |
| imagenet-processing | Lesson 9 - Generative Models |
| neural-sr, keras-dcgan, pytorch-tutorial, wgan-pytorch | Lesson 10 - Multi-modal & GANs |
| kmeans-clustering, babi-memory-neural-net | Lesson 11 - Memory Networks |
| spellingbeeRNN | Lesson 12 - Attentional Models |
| translate-pytorch, densenet-keras | Lesson 13 - Neural Translation |
| rossmann, tiramisu-keras | Lesson 14 - Time Series & Segmentation |

fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2018 (v2): Mar - May 2018

Deep Learning Part 2: 2018 Edition

| Notebook | Description |
| --- | --- |
| Pascal VOC—Object Detection | Lesson 8 - Object Detection |
| Pascal VOC—Multi Object Detection | Lesson 9 - Single Shot Multibox Detector (SSD) |
| IMDB—Language Model | Lesson 10 - Transfer Learning for NLP and NLP Classification |
| WMT15 Giga French-English—Neural Machine Translation, DeViSE (Deep Visual-Semantic Embedding Model) | Lesson 11 - Neural Translation; Multi-modal Learning |
| CIFAR-10 DarkNet, CIFAR-10 DAWNBench, Wasserstein GAN, CycleGAN | Lesson 12 - DarkNet; Generative Adversarial Networks (GANs) |
| TrainingPhase API, Neural Algorithm of Artistic Style Transfer | Lesson 13 - Image Enhancement; Style Transfer; Data Ethics |
| Super Resolution, Real-time Style Transfer Neural Net, Kaggle Carvana Image Masking, Kaggle Carvana Image Masking using U-Net, Kaggle Carvana Image Masking using U-Net Large | Lesson 14 - Super Resolution; Image Segmentation with U-Net |

Machine Learning

ML Assignments, Exercises or Course Works

Andrew Ng's "Machine Learning" class on Coursera

fast.ai's machine learning course

Libraries or Frameworks

NumPy

| Notebook | Description |
| --- | --- |
| NumPy in 10 minutes | Introduction to NumPy for deep learning in 10 minutes |

PyTorch

WIP

TensorFlow

| Notebook | Description |
| --- | --- |
| Guide to TensorFlow Keras on TPUs MNIST | Guide to TensorFlow + Keras on TPU v2 for free on Google Colab |

Keras

WIP

Pandas

WIP

Matplotlib

WIP

Kaggle Competitions

| Notebook | Description |
| --- | --- |
| planet_cv | Planet: Understanding the Amazon from Space—use satellite data to track the human footprint in the Amazon rainforest |
| Rossmann | Rossmann Store Sales—forecast sales using store, promotion, and competitor data |
| fish | The Nature Conservancy Fisheries Monitoring—Can you detect and classify species of fish? |

License

This repository contains a variety of content: some developed by Cedric Chee, and some from third parties. The third-party content is distributed under the license provided by those parties.

I am providing code and resources in this repository to you under an open source license. Because this is my personal repository, the license you receive to my code and resources is from me and not my employer.

The content developed by Cedric Chee is distributed under the following license:

Code

The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.

Text

The text content of the notebooks is released under the CC-BY-NC-ND license. Read more at Creative Commons.

Owner

  • Name: Cedric Chee
  • Login: cedrickchee
  • Kind: user
  • Location: PID 1
  • Company: InvictusByte

Lead Software Engineer | LLMs | full stack Go/JS dev, backend | product dev @ startups | 🧑‍🎓 CompSci | alumni: fast.ai, Antler.co

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Dependencies

requirements.txt pypi