aai-500-final-project

Statistical analysis of student performance from Portuguese school data, investigating impact of social, demographic, and academic factors on final grades.

https://github.com/apmalinsky/aai-500-final-project

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.9%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: apmalinsky
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 3.25 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme Citation

README.md

Statistical Analysis of Student Performance

This project is part of the AAI-500: Probability and Statistics for Artificial Intelligence course in the Applied Artificial Intelligence Program at the University of San Diego (USD).

Installation

The code for this project is found within three separate Jupyter Notebooks in the notebooks folder.

Project Description

The goal of this project is to investigate which factors have the greatest impact on final grades. These factors were derived from the “Student Performance” data set from the UC Irvine Machine Learning Repository (Cortez, 2008). The data consists of survey and questionnaire responses from students at two Portuguese schools in two distinct subjects: Mathematics (Math) and Portuguese language (Port).

The data is split into two tables, one per subject, with 395 students in the Math table and 649 students in the Port table. Both tables have the same 30 independent features and 3 dependent features. The dependent features represent first period (G1), second period (G2), and final (G3) grades. For our analysis, the independent features were divided into three categories: Social, Demographic, and Academic. We performed statistical analysis within each category to determine the impact of selected features on final grade outcomes, with one notebook per category:

  • AppendixA-SocialFactors analyzes the impact of social factors on final grades.
  • AppendixB-DemographicFactors analyzes the impact of demographic factors on final grades.
  • AppendixC-AcademicFactors analyzes the impact of academic factors on final grades.
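The table layout described above can be sketched as follows. This is a minimal illustration, not the notebooks' own loading code: it assumes the semicolon-separated CSV layout of the UCI distribution, shows only a handful of the 30 independent features, and uses invented values. The helper name `load_table` is hypothetical.

```python
import csv
import io

# Tiny made-up sample in a semicolon-separated layout (assumed).
# Only a few of the 30 independent features appear; all values are invented.
SAMPLE = """school;sex;age;studytime;failures;absences;G1;G2;G3
GP;F;18;2;0;6;5;6;6
GP;M;17;3;0;4;14;15;15
MS;F;16;1;1;10;8;8;9
"""

# Dependent features: first period, second period, and final grade
TARGETS = ("G1", "G2", "G3")


def load_table(text):
    """Parse one subject table into a list of (features, grades) pairs."""
    rows = []
    for record in csv.DictReader(io.StringIO(text), delimiter=";"):
        grades = {name: int(record[name]) for name in TARGETS}
        features = {k: v for k, v in record.items() if k not in TARGETS}
        rows.append((features, grades))
    return rows


table = load_table(SAMPLE)
final_grades = [grades["G3"] for _, grades in table]
print(len(table), "students; final grades:", final_grades)
```

Keeping the three grade columns separate from the independent features makes it harder to accidentally feed G1 or G2 into an analysis as an ordinary feature when the question is about G3.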

Project Report

We include our final report for further details about data cleaning and preparation, exploratory data analysis, model selection, model analysis, conclusions and recommendations, references, and outputs of the implemented code. The final report can be found here: Final Report.

Contributors

Andy Malinsky, Maha Jayapal, Scott Hogan

Methods/Technologies Used

  • Descriptive/Inferential statistics
  • Frequency Distribution plotting/analysis
  • Chi-square/Independent t-test significance testing
  • Ordinary Least Squares regression
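As a rough illustration of two of the methods listed above, the sketch below computes an independent two-sample t statistic (pooled-variance form) and a one-predictor least-squares fit using only the Python standard library. It is not the notebooks' code: the function names are hypothetical, the data is invented, and in practice libraries such as SciPy (`scipy.stats.ttest_ind`) or statsmodels (`statsmodels.api.OLS`) would also supply p-values and full regression summaries.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.fmean(group_a) - statistics.fmean(group_b)) / std_err
    return t, df


def simple_ols(x, y):
    """Least-squares slope, intercept, and R^2 for a single predictor."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Invented final grades for two hypothetical groups of students
t_stat, dof = pooled_t_statistic([12, 14, 16], [9, 10, 11])

# Invented predictor levels and final grades
slope, intercept, r_squared = simple_ols([1, 2, 3, 4], [8, 11, 12, 15])
print(f"t = {t_stat:.3f} (df = {dof})")
print(f"G3 = {intercept:.2f} + {slope:.2f} * x, R^2 = {r_squared:.3f}")
```

The t statistic on its own does not give a significance level; that requires comparing it against the t distribution with the returned degrees of freedom, which the standard library does not provide.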

Owner

  • Name: Andy Malinsky
  • Login: apmalinsky
  • Kind: user
