https://github.com/darkstarstrix/advancedml_learning_materials
An Advanced ML Course
https://github.com/darkstarstrix/advancedml_learning_materials
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
○.zenodo.json file
-
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (11.5%) to scientific vocabulary
Keywords
Repository
An Advanced ML Course
Basic Info
- Host: GitHub
- Owner: DarkStarStrix
- Language: Python
- Default Branch: master
- Homepage: https://eureka.xyz/courses/303/enrollments/942
- Size: 23.3 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
Machine Learning Course
This repository contains code examples, data, and plots for a machine learning course that I'm taking. The course covers various advanced topics in machine learning and optimization. Below are the lessons and their respective topics:
Table of Contents
- Lesson 1: Abstract Linear Algebra for ML and Optimization (done)
- Lesson 2: Statistical Foundations and Probability Theory (done)
- Lesson 3: Deep Optimization Techniques (done)
- Lesson 4: Hyperparameter Tuning with Optuna (done)
- Lesson 5: Quantization and Model Compression (done)
- Lesson 6: Big Data Analytics and Distributed Systems (done)
- Lesson 7: Capstone Project: Real-World Application (done)
- Lesson 8: Optuna-Driven Quantization for Production (done)
- File Structure
- How to Use
Lesson 1: Abstract Linear Algebra for ML and Optimization
Introduction to Abstract Linear Algebra
Explore the role and significance of abstract linear algebra in machine learning and optimization. Understand foundational concepts that will be built upon in this lesson.
Singular Value Decomposition (SVD)
Learn the theoretical aspects of Singular Value Decomposition and its applications. Implement SVD using NumPy to perform dimensionality reduction on datasets.
Principal Component Analysis (PCA)
Master Principal Component Analysis for feature extraction and dimensionality reduction. Use NumPy to apply PCA on real-world datasets, enhancing data representation.
Matrix Factorization with JAX
Delve into the process of matrix factorization using the JAX library. Compare and contrast its performance with traditional methods like NumPy, focusing on scalability.
Practical Performance Comparisons
Analyze performance metrics by implementing matrix operations using NumPy and JAX. Gain insights into optimization strategies for large-scale datasets.
File location for this lesson:
All the scripts and plots for this lesson are located in the Abstract linear Algebra folder.
Lesson 2: Statistical Foundations and Probability Theory
Introduction to Probability Theory
Explore the foundational concepts of probability theory crucial for understanding machine learning models. Discuss probability distributions and their significance in statistical modeling.
Statistical Inference and Estimation
Learn about statistical inference methods and parameter estimation. Understand hypothesis testing and confidence intervals as tools for making data-driven decisions.
Bayesian Regression Models
Dive into Bayesian statistics and the construction of Bayesian regression models. Gain practical experience by building models using PyMC3 and interpreting their results.
Gaussian Mixture Models
Understand the theory and implementation of Gaussian Mixture Models for modeling a mixture of different distributions. Use TensorFlow Probability to build and evaluate these models.
Applications of Statistical Learning
Explore real-world applications of statistical methods and probability theory in machine learning. Discuss case studies and practice applying these techniques to datasets.
File location for this lesson:
All the scripts and plots for this lesson are located in the Statistical Foundations and Probability Theory folder.
Lesson 3: Deep Optimization Techniques
Understanding Optimization Strategies
Explore various advanced optimization strategies for deep learning. Analyze the theoretical foundations and their impact on training neural networks.
Implementing Adam Optimizer with PyTorch
Learn the workings of the Adam optimizer and its advantages over traditional methods. Implement the Adam optimizer on the MNIST dataset using PyTorch.
Gradient Descent and JAX
Understand the principles of gradient descent optimization. Implement gradient descent in JAX and compare its performance with PyTorch implementations.
Comparison of Optimization Frameworks
Compare and contrast optimization strategies using JAX and PyTorch. Evaluate performance through practical implementation scenarios and metrics.
Real-world Application of Optimization Techniques
Apply advanced optimization techniques to real-world machine learning problems. Discuss case studies where these methodologies have improved model performance.
File location for this lesson:
All the scripts and plots for this lesson are located in the Deep Optimization Techniques folder.
Lesson 4: Hyperparameter Tuning with Optuna
Understanding Hyperparameter Tuning
Learn the fundamentals of hyperparameter tuning and its importance in model optimization. This sets the foundation for using Optuna effectively.
Introduction to Optuna
Discover Optuna's features for hyperparameter tuning and explore its core components. Understand its advantages over traditional tuning techniques.
Setting Up and Running Optuna for CNNs
Learn how to implement Optuna for optimizing CNN hyperparameters on the CIFAR-10 dataset. Develop practical skills for conducting experiments and evaluating results.
Optimizing LSTM Models with Optuna
Master the use of Optuna to tune hyperparameters for LSTM models on time-series data. Gain insights into adjusting configurations for optimal forecasting performance.
Advanced Strategies for Hyperparameter Search
Explore advanced search strategies within Optuna, including Bayesian optimization and pruning techniques. Learn to enhance efficiency in finding optimal parameters.
File location for this lesson:
All the scripts and plots for this lesson are located in the Understanding Hyperparameter tuing folder.
Lesson 5: Quantization and Model Compression
Introduction to Quantization
Learn the fundamentals of quantization and why it's essential for efficient AI model deployment. Explore different types of quantization techniques such as post-training and quantization-aware training.
TensorFlow Lite for Model Quantization
Master the process of quantizing models using TensorFlow Lite. Understand how to quantize a pre-trained ResNet model and assess its performance on edge devices.
ONNX for Model Compression
Explore ONNX as a tool for model compression and quantization. Learn to perform post-training quantization and compare inference speeds across platforms.
Quantization-Aware Training
Delve into quantization-aware training to improve model accuracy post-quantization. Implement this technique on a MobileNet model using TensorFlow on the COCO Dataset.
Comparing Quantization Techniques
Analyze the effectiveness and trade-offs between different quantization methods. Compare accuracy, speed, and resource utilization to determine optimal practices for your AI needs.
File location for this lesson:
All the scripts and plots for this lesson are located in the Quantization and Model Compression folder.
Lesson 6: Big Data Analytics and Distributed Systems
Introduction to Distributed Systems for Big Data
Learn the basics of distributed systems and their role in handling large-scale data. Understand the key concepts and architecture behind Hadoop and PySpark.
Data Processing with Hadoop MapReduce
Explore the MapReduce programming model and its application in processing big data. Implement a simple word count problem to understand the basic operations of distributed data processing.
Introduction to PySpark
Discover PySpark as a powerful tool for large-scale data analysis. Get hands-on experience in setting up PySpark and performing basic operations like data ingestion and transformation.
Advanced Data Analytics with PySpark
Delve deeper into PySpark's data processing capabilities with real-world datasets. Learn to implement complex data aggregation, filtering, and transformation operations.
Building Scalable Applications with Distributed Systems
Understand the challenges and strategies for building scalable applications using distributed systems. Learn about resource management and fault tolerance in distributed environments.
Case Study: Distributed Machine Learning Pipeline
Analyze a complete end-to-end machine learning pipeline in a distributed system. Apply learned concepts to design and implement a multi-node distributed training pipeline.
File location for this lesson:
All the scripts and plots for this lesson are located in the Big Data Analytics and Distributed Systems folder.
Lesson 7: Capstone Project: Real-World Application
Project Selection and Scope Definition
Learn how to choose an impactful project within the areas of recommender systems or model compression. Define the project scope to ensure feasibility and alignment with the skills learned.
Project Planning and Milestones
Develop a roadmap for project execution with clear milestones. Establish timelines and resource allocation for effective project management.
Data Collection and Preprocessing
Master techniques for collecting and preprocessing data necessary for the capstone project. Focus on data cleaning, transformation, and validation to ensure quality results.
Model Development and Iteration
Apply machine learning and optimization techniques from previous lessons to develop initial models. Iterate and refine models to improve performance and efficiency.
Evaluation and Deployment Strategies
Design strategies to evaluate model performance effectively. Learn deployment techniques for scaling applications to real-world environments.
Presentation and Knowledge Sharing
Prepare and deliver a comprehensive presentation of the project findings. Emphasize lessons learned and possible future enhancements.
File location for this lesson:
All the scripts and plots for this lesson are located in the Capstone Project: Real-World Application folder.
Lesson 8: Optuna-Driven Quantization for Production
Integrating Optuna with JAX and TensorFlow
Learn how to integrate Optuna with JAX and TensorFlow for enhancing model optimization pipelines. Understand the benefits and use cases of combining these tools for streamlined machine learning workflows.
Optimizing Quantization Settings with Optuna
Discover how Optuna can be leveraged to optimize quantization settings for deep learning models. Explore techniques for automating the search for optimal parameters that maximize performance on diverse hardware.
Pipeline Construction for Model Optimization
Learn to build robust pipelines that integrate Optuna-driven hyperparameter tuning with model quantization processes. Develop an understanding of creating scalable and efficient pipelines for machine learning deployment.
Deploying Optimized Models in Production
Master the deployment of models optimized through Optuna-driven quantization. Gain insights into real-world constraints and devise strategies for efficient model serving in production environments.
Case Studies: Optuna-Driven Optimization Success
Analyze real-world case studies where Optuna-driven quantization significantly enhanced model performance. Discuss the methodologies and techniques that led to successful optimization and deployment.
File location for this lesson:
All the scripts and plots for this lesson are located in the Optuna-Driven Quantization for Production folder.
File Structure
AdvancedML_materials/
Abstract linear Algebra/
Scrpits/
Plots/
Statistical Foundations and Probability Theory/
Scrpits/
Plots/
Deep Optimization Techniques/
Scrpits/
Plots/
How to Use
Clone the repository:
sh git clone https://github.com/DarkStarStrix/AdvancedML_materials.git cd Machine-Learning-CourseInstall the required packages:
sh pip install -r requirements.txt
Course Material
The course material is organized into eight lessons, each focusing on a specific topic in machine learning and optimization. Each lesson includes theoretical explanations, practical implementations, and performance comparisons. The lessons are designed to build upon each other, providing a comprehensive understanding of advanced machine learning concepts and techniques.
Owner
- Name: Allan Murimi Wandia
- Login: DarkStarStrix
- Kind: user
- Location: U.S.A
- Company: Freelance
- Website: https://www.kaggle.com/allanwandia
- Repositories: 1
- Profile: https://github.com/DarkStarStrix
Full stack Dev Turning ideas into projects
GitHub Events
Total
- Watch event: 1
- Delete event: 2
- Push event: 8
- Pull request event: 4
- Create event: 3
Last Year
- Watch event: 1
- Delete event: 2
- Push event: 8
- Pull request event: 4
- Create event: 3