aims-dscbi
Data science course for Rwanda national statistical system (NSS) staff
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.9%) to scientific vocabulary
Repository
Data science course for Rwanda national statistical system (NSS) staff
Basic Info
- Host: GitHub
- Owner: dmatekenya
- License: mpl-2.0
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: https://dmatekenya.github.io/AIMS-DSCBI/
- Size: 5.94 MB
Statistics
- Stars: 2
- Watchers: 0
- Forks: 25
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
Data Science Capacity Building Initiative (DSCBI)
This repository provides information about the capacity building initiative, a collaboration between the African Institute for Mathematical Sciences (AIMS), the National Institute of Statistics of Rwanda (NISR), and Cenfri. The initiative aims to deliver data science training to staff from over 20 institutions that are part of Rwanda's National Statistical System (NSS).
Course Modules
The training program is structured around the following core modules:
- Python Foundations for Data Science: Covers core and advanced Python programming, object-oriented design, development tools, Git/GitHub workflows, Python packaging, and best practices for maintainable code.
- Data Analysis with Python: Focuses on cleaning, transforming, and analyzing structured data using pandas and polars, with applications to real-world datasets like census and surveys.
- Working with Spatial Data in Python: Introduces geospatial data handling using GeoPandas, shapely, and rasterio; includes spatial joins, projections, and mapping access to services.
- Working with Time Series Data in Python: Covers handling temporal data, resampling, rolling windows, time series decomposition, and forecasting using statsmodels and Prophet.
- Databases and APIs: Introduces SQL and relational databases, extracting and analyzing data from APIs, and building REST APIs with FastAPI.
- Introduction to Machine Learning with Scikit-learn: Provides foundations in supervised and unsupervised learning, including models like logistic regression, decision trees, and PCA, along with model evaluation.
- Natural Language Processing (NLP) and Large Language Models (LLMs): Covers foundational NLP, LLMs, embeddings, vector search, and building AI-powered applications using frameworks like Hugging Face and LangChain.
- Advanced Topics in Data Science: Explores interactive dashboards, data integration from unstructured sources, advanced database techniques, and cloud storage for analytics.
- Capstone Project: Teams develop and present real-world data projects scoped from their institutional needs, applying techniques learned across modules.
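To give a flavor of the time series module, here is a minimal, dependency-free sketch of what "resampling" and "rolling windows" mean. It is not taken from the course materials: the helper names `rolling_mean` and `resample_monthly` and the sample data are hypothetical, and in the course itself these operations would typically be done with pandas rather than by hand.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def rolling_mean(values, window):
    """Return the mean of each consecutive `window`-sized slice of `values`."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def resample_monthly(series):
    """Aggregate (date, value) pairs into a {(year, month): mean} mapping."""
    buckets = defaultdict(list)
    for day, value in series:
        buckets[(day.year, day.month)].append(value)
    return {month: mean(vals) for month, vals in sorted(buckets.items())}


# Hypothetical daily series: values 1.0 to 60.0, starting on 2024-01-01.
start = date(2024, 1, 1)
daily = [(start + timedelta(days=i), float(i + 1)) for i in range(60)]

# 7-day rolling mean smooths day-to-day variation.
smoothed = rolling_mean([value for _, value in daily], window=7)

# Monthly resampling reduces the daily series to one value per month.
monthly = resample_monthly(daily)
```

The rolling mean keeps the daily frequency but shortens the series by `window - 1` points, while resampling changes the frequency itself.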
Course Structure
The course is divided into self-contained modules, each designed to provide useful skills and knowledge. The modules are organized sequentially to build on skills learned in previous modules. To make the course engaging and informative, each module includes the following components:
- Lecture: Each lecture covers key conceptual knowledge for the topic at hand.
- Practical Labs: Programming activities provide learners with practical skills to implement solutions discussed in lectures. These labs include adaptable recipes for various use cases.
- Case Studies: Case studies showcase elaborate projects that demonstrate real-world applications.
- Assessment: Each module assessment combines theoretical (quizzes) and programming questions to evaluate learners' understanding of the concepts and skills covered in the module.
Repository Structure and Contents
This repository serves as the primary resource for accessing course content, including slides, Python programming labs, example applications using LLMs, and additional materials to support learning about Generative AI and building applications with LLMs. For easy navigation, use the link and contents outlined below.
Contents
Please visit the documentation website for a complete table of contents.
License
The template is licensed under the Mozilla Public License. Remember to replace the license if necessary. If open source, choose an open source license.
Owner
- Name: Dunstan Matekenya
- Login: dmatekenya
- Kind: user
- Location: Washington, DC
- Company: The World Bank
- Repositories: 1
- Profile: https://github.com/dmatekenya
Data Scientist at the World Bank Group
Citation (CITATION.cff)
cff-version: 1.2.0
message: "Country borders or names do not necessarily reflect the World Bank Group’s official position. All maps are for illustrative purposes and do not imply the expression of any opinion on the part of the World Bank, concerning the legal status of any country or territory or concerning the delimitation of frontiers or boundaries."
title: "World Bank Data Lab Project Template"
authors:
  - affiliation: World Bank
    family-names: Stefanini Vicente
    given-names: Gabriel
    orcid: https://orcid.org/0000-0001-6530-3780
keywords:
  - Open Science
repository-code: https://github.com/worldbank/template/tree/main
GitHub Events
Total
- Issues event: 1
- Watch event: 1
- Issue comment event: 3
- Member event: 2
- Push event: 28
- Pull request event: 3
- Fork event: 10
- Create event: 3
Last Year
- Issues event: 1
- Watch event: 1
- Issue comment event: 3
- Member event: 2
- Push event: 28
- Pull request event: 3
- Fork event: 10
- Create event: 3
Dependencies
- actions/checkout v4 composite
- actions/deploy-pages v4 composite
- actions/setup-python v5 composite
- actions/upload-pages-artifact v3 composite
- actions/checkout v4 composite
- actions/download-artifact v4 composite
- actions/setup-python v5 composite
- actions/upload-artifact v4 composite
- pypa/gh-action-pypi-publish release/v1 composite
- bokeh >=3,<4
- pandas >=2
- pycountry >=22.3.5
- requests >=2.28.1
- actions/checkout v4 composite
- actions/deploy-pages v4 composite
- actions/setup-python v5 composite
- actions/upload-pages-artifact v3 composite
- docutils >=0.18.1
- jupyter-book >=0.15.0
- myst-parser >=2.0.0
- sphinx >=7.0.0
- sphinx-book-theme >=1.0.0
- sphinx-external-toc >=0.3.1
- sphinx-multitoc-numbering >=0.1.3