AutoFunc
AutoFunc: A Python package for automating and verifying functional modeling - Published in JOSS (2021)
Science Score: 87.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: Found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: Links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Scientific Fields
- Engineering (Computer Science): 80% confidence
- Artificial Intelligence and Machine Learning (Computer Science): 60% confidence
Last synced: 4 months ago
Repository
Data Mining for Automated Functional Representations
Basic Info
- Host: GitHub
- Owner: AlexMikes
- License: mit
- Language: Python
- Default Branch: master
- Size: 1.77 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 3
- Releases: 0
Fork of SoftwareDevEngResearch/AutoFunc
Created over 6 years ago · Last pushed almost 5 years ago
https://github.com/AlexMikes/AutoFunc/blob/master/
# AutoFunc

Data Mining for Automated Functional Representations

[Build status](https://travis-ci.org/AlexMikes/AutoFunc) [DOI](https://doi.org/10.5281/zenodo.3243689)

``AutoFunc`` is a Python package that automatically generates the functional representations of components based on data from design repositories. ``AutoFunc`` also contains methods to validate and optimize the automation algorithm.

A designer can use this software to input a list of components in their product, and it will automatically generate the functional representations for those components based on the most commonly seen functions and flows from previous products in the design repository.

The package uses common data-mining techniques for finding information and classifying new observations based on that data. ``AutoFunc`` also uses the common methods of cross-validation and the F1 score to find the accuracy at different values for the threshold variables.

`AutoFunc` was developed for use with the Design Repository housed at Oregon State University. A rudimentary web interface can be found here: http://ftest.mime.oregonstate.edu/repo/browse/

## Installation

`autofunc` has been tested on Linux with Python 3.6, 3.7, and 3.8.

### Pip

Use the package manager [pip](https://pip.pypa.io/en/stable/) to install autofunc. The package is not yet on PyPI, so it must be downloaded from here as a .zip file: https://github.com/AlexMikes/AutoFunc

Once downloaded as a .zip file, install with:

```bash
pip install /path/to/file/AutoFunc-master.zip
```

### From repository

To install from this repository:

```bash
git clone https://github.com/AlexMikes/AutoFunc.git
cd AutoFunc
python setup.py install
```

## Dependencies

This package uses Pandas (Python Data Analysis Library). It can be installed with pip using:

```bash
pip install pandas
```

Many of the examples also use Matplotlib for plotting. While not required to use the AutoFunc modules, it is required to run the examples.
It can be installed with:

```bash
pip install -U matplotlib
```

## Usage

Every module has NumPy-formatted docstrings that explain its inputs, outputs, and usage. Rudimentary API documentation can be found here: https://autofunc.readthedocs.io/

Example files are provided in the examples folder. Autofunc will automate the functional representations of components as long as the .csv file has the component in column 1 and the function-flow in column 2.

More information on the methods used in these files can be found in the various research papers that this software supports, especially IDETC2020-22346 "OPTIMIZING AN ALGORITHM FOR DATA MINING A DESIGN REPOSITORY TO AUTOMATE FUNCTIONAL MODELING". All of the plots for this paper were created in the ```example_optimize_with_comp_ratio.py``` file.

The following lists the examples included, with their expected functionality and outputs:

1. ```example_cross_validation.py``` uses the k-fold cross-validation functionality to find the accuracy of a data mining classifier. This example will print the maximum and average accuracies using this verification method.
1. ```example_find_f1_from_file.py``` finds the F1 score of a single product when the component-function-flow combinations for that product are in a separate .csv file. This example will print the recall, precision, and F1 score for that testing product.
1. ```example_find_f1_from_id.py``` finds the F1 score of a single product using that product's ID number from the original dataset. Any number of IDs can be used. This example will print the testing ID(s) used, and the recall, precision, and F1 score for those testing IDs.
1. ```example_find_similarity.py``` will create a similarity matrix for the training dataset. This is the percent of similar components between each product in the dataset.
   The main diagonal of this matrix consists of ones because every product is 100% similar to itself, but the matrix is not symmetric because each product can contain a different number of components. For example, consider a case where Product 1 has 20 components and Product 2 has 40 components. If they have 10 components in common, the similarity between Product 1 and Product 2 is 10/20 = 50%, but the similarity between Product 2 and Product 1 is 10/40 = 25%. The first product of the pair is known as the generating product, which is the product in the column of this matrix. This example will create a Pandas dataframe of the similarity matrix and write this to a .csv file.
1. ```example_get_func_rep.py``` will create a functional representation of the components in the input file using data mining and a classification threshold. This can be used to automate functional modeling by connecting the functions and flows at the interface of components in a product. This example will write a .csv file with the results of component-function-flow and optional frequency.
1. ```example_optimization.py``` incorporates all of the main modules and optimizes the similarity and classification thresholds. This example will display a lot of plots and print optimum values for thresholds.
1. ```example_optimize_with_comp_ratio.py``` begins with ```example_optimization.py``` and also includes the stratification and optimization of a training set. This example will display a lot of plots and print optimum values for thresholds.
1. ```example_try_best_ids.py``` is a subset of ```example_optimize_with_comp_ratio.py``` which only includes the stratified training set and some relevant plots. This example will display a plot of the F1 scores vs. classification threshold of the stratified dataset.
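
The asymmetric similarity described for ```example_find_similarity.py``` above can be illustrated with a small sketch. This is not the package's own implementation: the function name `similarity_matrix` and the component names are hypothetical, and only Pandas is assumed. It reproduces the 20-component versus 40-component case, with the generating product in the column.

```python
import pandas as pd


def similarity_matrix(products):
    """Build an asymmetric similarity matrix (hypothetical helper).

    `products` maps a product name to its list of components.
    Entry [row, column] is the fraction of the column (generating)
    product's components that also appear in the row product.
    """
    names = list(products)
    matrix = pd.DataFrame(0.0, index=names, columns=names)
    for generating in names:
        generating_components = set(products[generating])
        if not generating_components:
            continue  # leave zeros for a product with no components
        for other in names:
            shared = generating_components & set(products[other])
            matrix.loc[other, generating] = len(shared) / len(generating_components)
    return matrix


# Made-up data: 10 shared components, 20 in Product 1, 40 in Product 2
shared = [f"shared_{i}" for i in range(10)]
product_1 = shared + [f"p1_only_{i}" for i in range(10)]
product_2 = shared + [f"p2_only_{i}" for i in range(30)]

sim = similarity_matrix({"Product 1": product_1, "Product 2": product_2})
print(sim)
```

Reading down the `Product 1` column gives 1.0 on the diagonal and 0.5 against Product 2, while the `Product 2` column gives 0.25 against Product 1, matching the worked numbers in the list above.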
This is the ```example_get_func_rep.py``` file:

```python
from autofunc.get_top_results import get_top_results
from autofunc.counter_pandas import counter_pandas
from autofunc.get_func_rep import get_func_rep
# Assumed import path for the writer called at the end of this script
from autofunc.write_results import write_results_from_dict
import os.path
import pandas as pd

""" Example showing how to automate functional representation """

# Dataset used for data mining
script_dir = os.path.dirname(__file__)
file_to_learn = os.path.join(script_dir, '../assets/consumer_systems.csv')

include_frequencies = True

train_data = pd.read_csv(file_to_learn)
combos_sorted = counter_pandas(train_data)

# Use a threshold to get the top XX% of confidence values
threshold = 0.5
thresh_results = get_top_results(combos_sorted, threshold)

# Use a known product for verification
input_file = os.path.join(script_dir, '../assets/InputExample.csv')

# Get dictionary of functions and flows for each component based on data mining
results, unmatched = get_func_rep(thresh_results, input_file, include_frequencies)

# Optional write to file - comment out to skip, or rename the output file
write_results_from_dict(results, 'test1.csv')
```

Run from within the ```examples``` folder:

```bash
python example_get_func_rep.py
```

It will generate a file ```test1.csv``` with the results of the automated functional representation of the components in the ```input_file``` based on the data from the ```file_to_learn``` in the ```assets``` folder.

## Testing

All tests are automated through [Travis CI](https://travis-ci.org/). Visit [this page](https://travis-ci.org/github/AlexMikes/AutoFunc) to view the results.

## Support

Please submit requests for support or problems with the software as issues in the repository.

## Contributing

We welcome contributions to the `autofunc` package in the form of [pull requests](https://github.com/AlexMikes/AutoFunc/pulls) and [issues](https://github.com/AlexMikes/AutoFunc/issues) made in the repository. If you are having any problems using `autofunc`, please open an issue.
If there is some functionality you would like to see added to `autofunc`, you can also open an issue to discuss it. If you have a feature that you would like to propose for integration into `autofunc`, then you should open a pull request.

## License

[MIT](https://choosealicense.com/licenses/mit/)
Owner
- Login: AlexMikes
- Kind: user
- Repositories: 1
- Profile: https://github.com/AlexMikes
JOSS Publication
AutoFunc: A Python package for automating and verifying functional modeling
Published
February 04, 2021
Volume 6, Issue 58, Page 2362
Authors
Alex Mikes
Design Engineering Lab, Oregon State University
Katherine Edmonds
Design Engineering Lab, Oregon State University
Robert B. Stone
Design Engineering Lab, Oregon State University
Bryony DuPont
Design Engineering Lab, Oregon State University
Tags
engineering design, functional modeling, design repository, data mining
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 18
- Total pull requests: 0
- Average time to close issues: 27 days
- Average time to close pull requests: N/A
- Total issue authors: 3
- Total pull request authors: 0
- Average comments per issue: 1.56
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- srmnitc (10)
- e-dub (5)
- cmccomb (3)
