classificationensembles

Automatically Builds 25 Classification Models (15 Individual and 10 Ensembles of Model) From Classification Data

https://github.com/infinitecuriosity/classificationensembles

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.9%) to scientific vocabulary

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: InfiniteCuriosity
License: other
Language: R
Default Branch: master
Size: 4.52 MB

Statistics

Stars: 1
Watchers: 1
Forks: 0
Open Issues: 10
Releases: 0

Created over 1 year ago · Last pushed 11 months ago

Metadata Files

Readme Changelog License

ClassificationEnsembles

The goal of ClassificationEnsembles is to automatically conduct a thorough analysis of classification data. The user only needs to provide the data and answer a few questions (such as which column to analyze). ClassificationEnsembles fits 25 models (15 individual models and 10 ensembles of models). The package also returns 13 plots, five tables and a summary report sorted by accuracy (highest to lowest).

Installation

You can install the development version of ClassificationEnsembles like so:

devtools::install_github("InfiniteCuriosity/ClassificationEnsembles")

Example

ClassificationEnsembles will model the shelving location quality of car seats (ShelveLoc: Good, Medium or Bad) based on the other features in the Carseats data set.

``` r
library(ClassificationEnsembles)

# Fit the models on the Carseats data; column 7 (ShelveLoc) is the target
Classification(
  data = ISLR::Carseats,
  colnum = 7,
  numresamples = 25,
  predictonnewdata = "N",
  setseed = "N",
  removeVIFabove = 5.00,
  scaleallnumericpredictorsindata = "N",
  howtohandlestrings = 1,
  savealltrainedmodels = "N",
  saveallplots = "N",
  useparallel = "Y",
  trainamount = 0.60,
  testamount = 0.20,
  validation_amount = 0.20
)
```
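The trainamount, testamount and validation_amount arguments in the call above describe a 60/20/20 split of the rows. How the package draws that split internally is not shown here. The following is only a minimal base R sketch of one way to make a random three-way split, using the built-in iris data and hypothetical variable names:

``` r
# Sketch only: a random 60/20/20 split by row index (not the package's code)
set.seed(42)   # arbitrary seed so the illustration is reproducible
df <- iris     # built-in data, standing in for any classification data set

train_share <- 0.60   # hypothetical names, chosen for this example
test_share <- 0.20

n <- nrow(df)
idx <- sample(n)                      # random permutation of the row numbers
n_train <- round(train_share * n)
n_test <- round(test_share * n)

train <- df[idx[seq_len(n_train)], ]
test <- df[idx[n_train + seq_len(n_test)], ]
validation <- df[idx[(n_train + n_test + 1):n], ]   # the remaining rows

c(train = nrow(train), test = nrow(test), validation = nrow(validation))
```

Each row lands in exactly one of the three sets, so the test and validation rows are never seen during training.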

The 20 models which are built automatically are:

Bagged Random Forest
Bagging
C50
Ensemble BaggedCart
Ensemble Bagged Random Forest
Ensemble C50
Ensemble NaiveBayes
Ensemble Random Forest
Ensemble Ranger
Ensemble Support Vector Machines
Ensemble Trees
Linear
Naive Bayes
Partial Least Squares
Penalized Discriminant Analysis
Random Forest
Ranger
RPart
Support Vector Machines
Trees

The 26 plots it returns automatically are:
1. Holdout accuracy / train accuracy by model, fixed scales
2. Residuals by model, free scales
3. Residuals by model, fixed scales
4. Classification error, free scales
5. Classification error, fixed scales
6. Accuracy data, free scales
7. Accuracy data, fixed scales
8. Accuracy by model, free scales
9. Accuracy by model, fixed scales
10. Histograms of numeric columns
11. Boxplots of numeric columns
12. Duration barchart
13. False negative rate, free scales
14. False negative rate, fixed scales
15. False positive rate, free scales
16. False positive rate, fixed scales
17. True negative rate, free scales
18. True negative rate, fixed scales
19. True positive rate, free scales
20. True positive rate, fixed scales
21. Over or underfitting barchart
22. Model accuracy barchart
23. Barchart of each feature vs target by percentage
24. Barchart of each feature vs target by value
25. Correlation of numeric data as circles and colors
26. Correlation of numeric data as numbers and colors
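
The first plot and the over or underfitting barchart both compare holdout accuracy with train accuracy. One common way to read that comparison is as a ratio, sketched below. The model names and accuracy values are made up for illustration, and the exact measure the package reports may differ:

``` r
# Hypothetical accuracies, invented for this example only
results <- data.frame(
  model = c("model_a", "model_b", "model_c"),
  train_accuracy = c(0.99, 0.90, 0.80),
  holdout_accuracy = c(0.80, 0.89, 0.82)
)

# Ratio of holdout to train accuracy:
# well below 1 suggests overfitting, close to 1 suggests a stable model
results$holdout_vs_train <- results$holdout_accuracy / results$train_accuracy

# Show the most overfit model first
results[order(results$holdout_vs_train), ]
```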

The 5 tables the package returns automatically are:
1. Head of the ensemble
2. Head of the data frame
3. Variance Inflation Factor of the numeric columns
4. Correlation of the data
5. Summary report, including accuracy, duration, overfitting, sum of diagonals
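
For readers unfamiliar with the Variance Inflation Factor (VIF): for each numeric predictor it equals 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the other predictors. The sketch below computes it in base R on the built-in mtcars data. The function name vif_sketch and the chosen columns are assumptions for illustration, and this is not the package's own implementation. The threshold of 5 mirrors the removeVIFabove = 5.00 value in the example call, on the assumption that the argument flags predictors above that value:

``` r
# Sketch only: VIF for each column of an all-numeric data frame
vif_sketch <- function(df) {
  sapply(names(df), function(v) {
    others <- setdiff(names(df), v)
    # Regress predictor v on all the other predictors
    fit <- lm(reformulate(others, response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

predictors <- mtcars[, c("disp", "hp", "wt", "drat")]
vifs <- vif_sketch(predictors)
vifs

# Predictors whose VIF exceeds the chosen threshold
threshold <- 5.00
names(vifs)[vifs > threshold]
```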

The package also returns 25 summary tables (sometimes called confusion matrices), one for each of the models. These can be found in the Console. For example, using the drybeanssmall classification data set:

ensemblebagrftestpred BARBUNYA BOMBAY CALI DERMASON HOROZ SEKER SIRA
BARBUNYA                    21      0    0        0     0     0    0
BOMBAY                       0     16    0        0     0     0    0
CALI                         0      0   35        0     0     0    0
DERMASON                     0      0    0       76     0     0    0
HOROZ                        0      0    0        0    36     0    0
SEKER                        0      0    0        0     0    48    0
SIRA                         0      0    0        0     0     0    51
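
The accuracy and the "sum of diagonals" in the summary report, as well as the true/false positive and negative rates in the plots, can all be derived from a table like this one. The sketch below rebuilds the table above by hand and computes those quantities in base R. It assumes rows are predictions and columns are actual classes, and the variable names are chosen for this example:

``` r
# Rebuild the confusion matrix shown above (all off-diagonal counts are 0)
classes <- c("BARBUNYA", "BOMBAY", "CALI", "DERMASON", "HOROZ", "SEKER", "SIRA")
cm <- diag(c(21, 16, 35, 76, 36, 48, 51))
dimnames(cm) <- list(predicted = classes, actual = classes)

# Overall accuracy: correctly classified rows (the diagonal) over all rows
sum_of_diagonals <- sum(diag(cm))        # 283
accuracy <- sum_of_diagonals / sum(cm)   # 283 / 283 = 1

# One-vs-rest counts for each class
tp <- diag(cm)
fp <- rowSums(cm) - tp   # predicted as this class, actually another
fn <- colSums(cm) - tp   # actually this class, predicted as another
tn <- sum(cm) - tp - fp - fn

rates <- data.frame(
  true_positive_rate = tp / (tp + fn),
  false_negative_rate = fn / (tp + fn),
  false_positive_rate = fp / (fp + tn),
  true_negative_rate = tn / (tn + fp)
)
rates
```

Because every count in this particular table lies on the diagonal, the accuracy is 1 and every class has a true positive rate of 1 and a false positive rate of 0.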

A data summary is also in the Console. Using drybeanssmall as an example:

$Data_summary
Eccentricity ConvexArea Extent Solidity roundness ShapeFactor4
Min. :0.2190 Min. : 20825 Min. :0.5802 Min. :0.9551 Min. :0.5718 Min. :0.9550
1st Qu.:0.7175 1st Qu.: 37052 1st Qu.:0.7240 1st Qu.:0.9859 1st Qu.:0.8320 1st Qu.:0.9941
Median :0.7642 Median : 45261 Median :0.7606 Median :0.9886 Median :0.8833 Median :0.9966
Mean :0.7517 Mean : 53997 Mean :0.7519 Mean :0.9874 Mean :0.8750 Mean :0.9952
3rd Qu.:0.8117 3rd Qu.: 62159 3rd Qu.:0.7887 3rd Qu.:0.9903 3rd Qu.:0.9191 3rd Qu.:0.9980
Max. :0.9082 Max. :229994 Max. :0.8325 Max. :0.9937 Max. :0.9879 Max. :0.9996

BARBUNYA: 79
BOMBAY : 31
CALI : 97
DERMASON:212
HOROZ :115
SEKER :121
SIRA :158

Owner

Name: Russ Conte
Login: InfiniteCuriosity
Kind: user
Location: Forest Park, Illinois

Website: DataScienceForBusiness.com
Repositories: 3
Profile: https://github.com/InfiniteCuriosity

Looking for ways to contribute and share in Data Science, feel free to contact me!

GitHub Events

Total

Issues event: 9
Watch event: 1
Issue comment event: 5
Push event: 25
Create event: 4

Last Year

Issues event: 9
Watch event: 1
Issue comment event: 5
Push event: 25
Create event: 4

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 5
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 2
Total pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 5
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 2
Pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

InfiniteCuriosity (6)
iMarcello (1)

Packages

Total packages: 1
Total downloads:
- cran 335 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 2
Total maintainers: 1

cran.r-project.org: ClassificationEnsembles

Automatically Builds 20 Classification Models

Homepage: https://github.com/InfiniteCuriosity/ClassificationEnsembles
Documentation: http://cran.r-project.org/web/packages/ClassificationEnsembles/ClassificationEnsembles.pdf
License: MIT + file LICENSE
Latest release: 0.6.0
published 11 months ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 335 Last month

Rankings

Dependent packages count: 26.9%

Dependent repos count: 33.1%

Average: 49.0%

Downloads: 86.9%

Maintainers (1)

russconte@mac.com