wine_classification

An implementation of supervised machine learning with the k-nearest neighbours and decision tree algorithms

https://github.com/nazliozum/wine_classification

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found)
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (13.4%) to scientific vocabulary

Keywords

decision-tree k-nearest-neighbours machine-learning

Repository

Basic Info

Host: GitHub
Owner: Nazliozum
License: other
Language: Python
Default Branch: master
Homepage:
Size: 891 KB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 3

Created over 8 years ago · Last pushed over 8 years ago

Metadata Files

Readme Contributing License Citation

Objectives

The aim of this project is to apply best practices in data science workflows and to try out some newly acquired supervised machine learning techniques.

In this project, I implement the k-nearest neighbours and decision tree algorithms on the Wine Data Set. I then compute each algorithm's accuracy score when predicting the type of wine for new inputs.

I present the accuracy scores as percentages. I am interested in whether the two algorithms differ dramatically in accuracy and, if so, which one scores higher.
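The comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this repository: it uses `load_wine` from a recent scikit-learn release as a stand-in for `wine_data.csv`, and the 25% test split, `n_neighbors=5` and the seed are arbitrary choices, as is the helper name `compare_classifiers`.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(test_size=0.25, n_neighbors=5, seed=0):
    """Return test-set accuracy (in percent) for k-NN and a decision tree."""
    # Assumption: scikit-learn's bundled wine data stands in for wine_data.csv.
    X, y = load_wine(return_X_y=True)

    # Hold out part of the data so accuracy is measured on unseen inputs.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )

    models = {
        "k-nearest neighbours": KNeighborsClassifier(n_neighbors=n_neighbors),
        "decision tree": DecisionTreeClassifier(random_state=seed),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        # accuracy_score returns a fraction; scale it to a percentage.
        scores[name] = 100 * accuracy_score(y_test, predictions)
    return scores


if __name__ == "__main__":
    for name, pct in compare_classifiers().items():
        print(f"{name}: {pct:.1f}%")
```

The scores depend on the split and the hyperparameters, so a single run like this shows the mechanics of the comparison rather than a definitive ranking of the two algorithms.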

Data

The data used in this project is from the UC Irvine Machine Learning Repository. It consists of 178 observations, 13 attributes and 3 classes of wine.

The data is also available in this repository as wine_data.csv.
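The dimensions quoted above can be checked with a short script. This sketch assumes scikit-learn's bundled copy of the UCI wine data (`load_wine`) rather than `wine_data.csv`, whose column layout is not described here; `describe_wine` is a hypothetical helper name.

```python
from collections import Counter

from sklearn.datasets import load_wine


def describe_wine():
    """Return (observations, attributes, per-class counts) for the wine data."""
    # Assumption: the bundled dataset mirrors the UCI Wine Data Set.
    wine = load_wine()
    n_observations, n_attributes = wine.data.shape
    class_counts = Counter(wine.target.tolist())
    return n_observations, n_attributes, class_counts


if __name__ == "__main__":
    n_observations, n_attributes, class_counts = describe_wine()
    print(f"{n_observations} observations, {n_attributes} attributes")
    print(f"{len(class_counts)} classes: {dict(class_counts)}")
```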

System Requirements

Python 3.6 and packages:
- scikit-learn==0.18.1
- pandas==0.20.1
- numpy==1.12.1
- argparse==1.4.0
- matplotlib==2.0.2

Dependency Diagram

Reproducing the Analysis

Clone or download this repository, then cd to the project directory on your computer. The project directory already contains the intermediate files from a previous run of the analysis. To remove them and re-run the analysis from scratch, first run make clean.

You can run the analysis in one of two ways, described below.

Using conda environment:

Run the command below.

conda env create -f environment.yml

This will create the Python environment required for the analysis. Then run the command below to carry out the analysis from top to bottom.

make all

Using docker image:

If you have Docker installed on your computer, you can run the command below, which will automatically download (pull) the Docker image required for this analysis. Don't forget to replace VOLUME_ON_YOUR_COMPUTER with the appropriate path to the project directory on your computer.

docker run --rm -it -v VOLUME_ON_YOUR_COMPUTER:/home/wine_classification nazliozum/wine_classification /bin/bash

Now, your prompt should change to look something like this:

root@1fc309a08883:/#

Then cd into the /home/wine_classification directory and run make all.

The whole analysis will now run from top to bottom.

Use the exit command to leave the container and return to your regular shell.

Author

Nazli Ozum Kafaee

Owner

Name: Özüm Kafaee
Login: Nazliozum
Kind: user
Location: Vancouver, Canada
Company: University of British Columbia

Repositories: 1
Profile: https://github.com/Nazliozum

Data scientist

Citation (CITATION.md)

Dependencies

Dockerfile docker

rocker/tidyverse latest build

environment.yml pypi

graphviz ==0.8.1
