https://github.com/darkstarstrix/advancedml_learning_materials

An Advanced ML Course

https://github.com/darkstarstrix/advancedml_learning_materials

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.5%) to scientific vocabulary

Keywords

advanced educational-project machine-learning
Last synced: 9 months ago · JSON representation

Repository

An Advanced ML Course

Basic Info
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
advanced educational-project machine-learning
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme

README.md

Machine Learning Course

This repository contains code examples, data, and plots for a machine learning course that I'm taking. The course covers various advanced topics in machine learning and optimization. Below are the lessons and their respective topics:

Table of Contents

  1. Lesson 1: Abstract Linear Algebra for ML and Optimization (done)
  2. Lesson 2: Statistical Foundations and Probability Theory (done)
  3. Lesson 3: Deep Optimization Techniques (done)
  4. Lesson 4: Hyperparameter Tuning with Optuna (done)
  5. Lesson 5: Quantization and Model Compression (done)
  6. Lesson 6: Big Data Analytics and Distributed Systems (done)
  7. Lesson 7: Capstone Project: Real-World Application (done)
  8. Lesson 8: Optuna-Driven Quantization for Production (done)
  9. File Structure
  10. How to Use

Lesson 1: Abstract Linear Algebra for ML and Optimization

Introduction to Abstract Linear Algebra

Explore the role and significance of abstract linear algebra in machine learning and optimization. Understand foundational concepts that will be built upon in this lesson.

Singular Value Decomposition (SVD)

Learn the theoretical aspects of Singular Value Decomposition and its applications. Implement SVD using NumPy to perform dimensionality reduction on datasets.

Principal Component Analysis (PCA)

Master Principal Component Analysis for feature extraction and dimensionality reduction. Use NumPy to apply PCA on real-world datasets, enhancing data representation.

Matrix Factorization with JAX

Delve into the process of matrix factorization using the JAX library. Compare and contrast its performance with traditional methods like NumPy, focusing on scalability.

Practical Performance Comparisons

Analyze performance metrics by implementing matrix operations using NumPy and JAX. Gain insights into optimization strategies for large-scale datasets.

File location for this lesson: All the scripts and plots for this lesson are located in the Abstract linear Algebra folder.

Lesson 2: Statistical Foundations and Probability Theory

Introduction to Probability Theory

Explore the foundational concepts of probability theory crucial for understanding machine learning models. Discuss probability distributions and their significance in statistical modeling.

Statistical Inference and Estimation

Learn about statistical inference methods and parameter estimation. Understand hypothesis testing and confidence intervals as tools for making data-driven decisions.

Bayesian Regression Models

Dive into Bayesian statistics and the construction of Bayesian regression models. Gain practical experience by building models using PyMC3 and interpreting their results.

Gaussian Mixture Models

Understand the theory and implementation of Gaussian Mixture Models for modeling a mixture of different distributions. Use TensorFlow Probability to build and evaluate these models.

Applications of Statistical Learning

Explore real-world applications of statistical methods and probability theory in machine learning. Discuss case studies and practice applying these techniques to datasets.

File location for this lesson: All the scripts and plots for this lesson are located in the Statistical Foundations and Probability Theory folder.

Lesson 3: Deep Optimization Techniques

Understanding Optimization Strategies

Explore various advanced optimization strategies for deep learning. Analyze the theoretical foundations and their impact on training neural networks.

Implementing Adam Optimizer with PyTorch

Learn the workings of the Adam optimizer and its advantages over traditional methods. Implement the Adam optimizer on the MNIST dataset using PyTorch.

Gradient Descent and JAX

Understand the principles of gradient descent optimization. Implement gradient descent in JAX and compare its performance with PyTorch implementations.

Comparison of Optimization Frameworks

Compare and contrast optimization strategies using JAX and PyTorch. Evaluate performance through practical implementation scenarios and metrics.

Real-world Application of Optimization Techniques

Apply advanced optimization techniques to real-world machine learning problems. Discuss case studies where these methodologies have improved model performance.

File location for this lesson: All the scripts and plots for this lesson are located in the Deep Optimization Techniques folder.

Lesson 4: Hyperparameter Tuning with Optuna

Understanding Hyperparameter Tuning

Learn the fundamentals of hyperparameter tuning and its importance in model optimization. This sets the foundation for using Optuna effectively.

Introduction to Optuna

Discover Optuna's features for hyperparameter tuning and explore its core components. Understand its advantages over traditional tuning techniques.

Setting Up and Running Optuna for CNNs

Learn how to implement Optuna for optimizing CNN hyperparameters on the CIFAR-10 dataset. Develop practical skills for conducting experiments and evaluating results.

Optimizing LSTM Models with Optuna

Master the use of Optuna to tune hyperparameters for LSTM models on time-series data. Gain insights into adjusting configurations for optimal forecasting performance.

Advanced Strategies for Hyperparameter Search

Explore advanced search strategies within Optuna, including Bayesian optimization and pruning techniques. Learn to enhance efficiency in finding optimal parameters.

File location for this lesson: All the scripts and plots for this lesson are located in the Understanding Hyperparameter tuing folder.

Lesson 5: Quantization and Model Compression

Introduction to Quantization

Learn the fundamentals of quantization and why it's essential for efficient AI model deployment. Explore different types of quantization techniques such as post-training and quantization-aware training.

TensorFlow Lite for Model Quantization

Master the process of quantizing models using TensorFlow Lite. Understand how to quantize a pre-trained ResNet model and assess its performance on edge devices.

ONNX for Model Compression

Explore ONNX as a tool for model compression and quantization. Learn to perform post-training quantization and compare inference speeds across platforms.

Quantization-Aware Training

Delve into quantization-aware training to improve model accuracy post-quantization. Implement this technique on a MobileNet model using TensorFlow on the COCO Dataset.

Comparing Quantization Techniques

Analyze the effectiveness and trade-offs between different quantization methods. Compare accuracy, speed, and resource utilization to determine optimal practices for your AI needs.

File location for this lesson: All the scripts and plots for this lesson are located in the Quantization and Model Compression folder.

Lesson 6: Big Data Analytics and Distributed Systems

Introduction to Distributed Systems for Big Data

Learn the basics of distributed systems and their role in handling large-scale data. Understand the key concepts and architecture behind Hadoop and PySpark.

Data Processing with Hadoop MapReduce

Explore the MapReduce programming model and its application in processing big data. Implement a simple word count problem to understand the basic operations of distributed data processing.

Introduction to PySpark

Discover PySpark as a powerful tool for large-scale data analysis. Get hands-on experience in setting up PySpark and performing basic operations like data ingestion and transformation.

Advanced Data Analytics with PySpark

Delve deeper into PySpark's data processing capabilities with real-world datasets. Learn to implement complex data aggregation, filtering, and transformation operations.

Building Scalable Applications with Distributed Systems

Understand the challenges and strategies for building scalable applications using distributed systems. Learn about resource management and fault tolerance in distributed environments.

Case Study: Distributed Machine Learning Pipeline

Analyze a complete end-to-end machine learning pipeline in a distributed system. Apply learned concepts to design and implement a multi-node distributed training pipeline.

File location for this lesson: All the scripts and plots for this lesson are located in the Big Data Analytics and Distributed Systems folder.

Lesson 7: Capstone Project: Real-World Application

Project Selection and Scope Definition

Learn how to choose an impactful project within the areas of recommender systems or model compression. Define the project scope to ensure feasibility and alignment with the skills learned.

Project Planning and Milestones

Develop a roadmap for project execution with clear milestones. Establish timelines and resource allocation for effective project management.

Data Collection and Preprocessing

Master techniques for collecting and preprocessing data necessary for the capstone project. Focus on data cleaning, transformation, and validation to ensure quality results.

Model Development and Iteration

Apply machine learning and optimization techniques from previous lessons to develop initial models. Iterate and refine models to improve performance and efficiency.

Evaluation and Deployment Strategies

Design strategies to evaluate model performance effectively. Learn deployment techniques for scaling applications to real-world environments.

Presentation and Knowledge Sharing

Prepare and deliver a comprehensive presentation of the project findings. Emphasize lessons learned and possible future enhancements.

File location for this lesson: All the scripts and plots for this lesson are located in the Capstone Project: Real-World Application folder.

Lesson 8: Optuna-Driven Quantization for Production

Integrating Optuna with JAX and TensorFlow

Learn how to integrate Optuna with JAX and TensorFlow for enhancing model optimization pipelines. Understand the benefits and use cases of combining these tools for streamlined machine learning workflows.

Optimizing Quantization Settings with Optuna

Discover how Optuna can be leveraged to optimize quantization settings for deep learning models. Explore techniques for automating the search for optimal parameters that maximize performance on diverse hardware.

Pipeline Construction for Model Optimization

Learn to build robust pipelines that integrate Optuna-driven hyperparameter tuning with model quantization processes. Develop an understanding of creating scalable and efficient pipelines for machine learning deployment.

Deploying Optimized Models in Production

Master the deployment of models optimized through Optuna-driven quantization. Gain insights into real-world constraints and devise strategies for efficient model serving in production environments.

Case Studies: Optuna-Driven Optimization Success

Analyze real-world case studies where Optuna-driven quantization significantly enhanced model performance. Discuss the methodologies and techniques that led to successful optimization and deployment.

File location for this lesson: All the scripts and plots for this lesson are located in the Optuna-Driven Quantization for Production folder.

File Structure

AdvancedML_materials/ Abstract linear Algebra/ Scrpits/ Plots/ Statistical Foundations and Probability Theory/ Scrpits/ Plots/ Deep Optimization Techniques/ Scrpits/ Plots/

How to Use

  1. Clone the repository: sh git clone https://github.com/DarkStarStrix/AdvancedML_materials.git cd Machine-Learning-Course

  2. Install the required packages: sh pip install -r requirements.txt

Course Material

The course material is organized into eight lessons, each focusing on a specific topic in machine learning and optimization. Each lesson includes theoretical explanations, practical implementations, and performance comparisons. The lessons are designed to build upon each other, providing a comprehensive understanding of advanced machine learning concepts and techniques.

Owner

  • Name: Allan Murimi Wandia
  • Login: DarkStarStrix
  • Kind: user
  • Location: U.S.A
  • Company: Freelance

Full stack Dev Turning ideas into projects

GitHub Events

Total
  • Watch event: 1
  • Delete event: 2
  • Push event: 8
  • Pull request event: 4
  • Create event: 3
Last Year
  • Watch event: 1
  • Delete event: 2
  • Push event: 8
  • Pull request event: 4
  • Create event: 3