fbp
Frequency Based Pruning (FBP) is a feature selection algorithm based upon maximizing the Youden J statistic. FBP intelligently enumerates through combinations of features, using the frequency of smaller patterns to prune away large regions of the solution space.
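For readers unfamiliar with the objective, the sketch below computes the standard Youden J statistic (sensitivity + specificity - 1) for a single candidate pattern. The function name, argument names, and counts are hypothetical and chosen only for illustration; how FBP evaluates the statistic internally is not described here.

```python
def youden_j(cases_with_pattern, num_cases, ctrls_with_pattern, num_ctrls):
    """Standard Youden J for a risk pattern: sensitivity + specificity - 1."""
    # Sensitivity: fraction of cases that carry the pattern.
    sensitivity = cases_with_pattern / num_cases
    # Specificity: fraction of controls that do NOT carry the pattern.
    specificity = (num_ctrls - ctrls_with_pattern) / num_ctrls
    return sensitivity + specificity - 1.0


# Hypothetical counts: the pattern occurs in 30 of 40 cases and 10 of 50 controls.
j = youden_j(30, 40, 10, 50)
print(round(j, 2))  # 0.55
```

Equivalently, J is the pattern's frequency among cases minus its frequency among controls. For protective patterns the roles of cases and controls would presumably be swapped, but that is an assumption.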
FBP 1.0.0 is described in a paper from the 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM).
FBP 1.1.0 updates how the FBP_worker enumerates the patterns.
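This README does not spell out how FBP_worker enumerates patterns. As a rough illustration of the general idea of pruning by the frequency of smaller patterns, the sketch below uses a simple bound: a larger pattern can never occur in more cases than any of its sub-patterns, and J can never exceed the case frequency. The data model (samples as sets of feature labels) and all names are assumptions made for this example, not the project's implementation.

```python
def best_pattern(cases, ctrls, features, size):
    """Depth-first search for the size-`size` pattern with the highest Youden J.

    Illustrative only: samples are sets of feature labels, and a pattern
    matches a sample when it is a subset of that sample.
    """
    best = {"j": -1.0, "pattern": None, "evaluated": 0}

    def freq(samples, pattern):
        return sum(pattern <= s for s in samples) / len(samples)

    def extend(pattern, start):
        case_freq = freq(cases, pattern)
        # Every extension of `pattern` matches at most as many cases, and
        # J = case_freq - ctrl_freq <= case_freq, so a branch whose case
        # frequency does not exceed the incumbent cannot improve on it.
        if case_freq <= best["j"]:
            return
        if len(pattern) == size:
            best["evaluated"] += 1
            j = case_freq - freq(ctrls, pattern)
            if j > best["j"]:
                best["j"], best["pattern"] = j, pattern
            return
        for i in range(start, len(features)):
            extend(pattern | {features[i]}, i + 1)

    extend(frozenset(), 0)
    return best


# Hypothetical toy data: four cases and four controls over features a, b, c.
cases = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]
ctrls = [{"a"}, {"c"}, {"b", "c"}, set()]
result = best_pattern(cases, ctrls, ["a", "b", "c"], size=2)
print(sorted(result["pattern"]), result["j"], result["evaluated"])
```

On this toy input only one of the three size-2 patterns is fully evaluated; the other branches are cut because their case frequency cannot beat the incumbent.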
To Use
Configure the Makefile with the location of the Open MPI libraries and binary.
Compile with the Makefile by navigating to the root directory and entering: make
Update the configuration file.
Run the program. For example, enter: mpirun -np 4 ./fbp
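If the build and run steps need to be scripted, a small wrapper along the following lines may help. It is only a sketch: the helper names are hypothetical, it assumes `make` and `mpirun` are on the PATH, and it passes no extra arguments to `./fbp` because the example command above shows none.

```python
import shutil
import subprocess


def fbp_command(num_procs, binary="./fbp", launcher="mpirun"):
    """Assemble the launch command shown above for a given process count."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    return [launcher, "-np", str(num_procs), binary]


def build_and_run(num_procs=4, repo_root="."):
    """Run `make` in the repository root, then launch FBP under Open MPI."""
    if shutil.which("mpirun") is None:
        raise RuntimeError("mpirun not found; is Open MPI installed and on PATH?")
    subprocess.run(["make"], cwd=repo_root, check=True)
    subprocess.run(fbp_command(num_procs), cwd=repo_root, check=True)


print(" ".join(fbp_command(4)))  # mpirun -np 4 ./fbp
```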
Configuration
DATA_FILE - Tab-separated file in which the first NUM_CASES columns are cases and the next NUM_CTRLS columns are controls. The rows indicate features.
SCRATCH_DIR - Directory where results are recorded.
SOL_POOL_FILE - File with the best and worst objective values from the solution pool of each pattern size.
RUN_TAG - Run tag appended to the output file.
RISK - Boolean indicating whether risk patterns (true) or protective patterns (false) should be found.
NUM_CASES - The number of cases in DATA_FILE.
NUM_CTRLS - The number of controls in DATA_FILE.
NUM_EXPRS - The number of features in DATA_FILE.
NUM_HEAD_ROWS - The number of header rows in DATA_FILE.
NUM_HEAD_COLS - The number of header columns in DATA_FILE.
PATTERN_SIZE - The number of marker states in the pattern(s) to be found.
MISSING_SYMBOL - String used to indicate missing data in DATA_FILE.
MAX_PS - Maximum pattern size to find.
USE_SOLUTION_POOL - Boolean indicating whether all patterns above a threshold should be found (true) or just an optimal solution (false). If true, the worst objective value for each pattern size from SOL_POOL_FILE is used as the lower bound.
HIGH_VALUE - Value in DATA_FILE that indicates high expression.
NORM_VALUE - Value in DATA_FILE that indicates normal expression.
LOW_VALUE - Value in DATA_FILE that indicates low expression.
SET_NA_TRUE - Boolean indicating whether missing data is treated as both high and low.
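To make the expected layout of the data file concrete, the sketch below parses a tiny tab-separated table into case and control blocks. The function arguments mirror the configuration entries listed above. The sample values ("1", "0", "-1" for high, normal, low, and "NA" for missing) and the helper names are assumptions made for this example, and the code is not FBP's own reader.

```python
def split_data(text, num_cases, num_ctrls,
               num_head_rows=0, num_head_cols=0, missing_symbol="NA"):
    """Split tab-separated rows (features) into case and control columns."""
    cases, ctrls = [], []
    for line in text.splitlines()[num_head_rows:]:
        fields = line.split("\t")[num_head_cols:]
        if len(fields) != num_cases + num_ctrls:
            raise ValueError("row does not have num_cases + num_ctrls values")
        # Represent missing entries as None.
        values = [None if f == missing_symbol else f for f in fields]
        cases.append(values[:num_cases])   # first num_cases columns
        ctrls.append(values[num_cases:])   # next num_ctrls columns
    return cases, ctrls


def state_matches(value, target, set_na_true):
    """A missing value counts as a match only when set_na_true is enabled."""
    if value is None:
        return set_na_true
    return value == target


# Hypothetical file: 1 header row, 1 header column, 2 cases, 3 controls, 2 features.
sample = (
    "gene\tc1\tc2\tk1\tk2\tk3\n"
    "geneA\t1\t0\t-1\tNA\t0\n"
    "geneB\t0\t1\t0\t0\t-1\n"
)
cases, ctrls = split_data(sample, num_cases=2, num_ctrls=3,
                          num_head_rows=1, num_head_cols=1)
print(cases)  # [['1', '0'], ['0', '1']]
print(ctrls)  # [['-1', None, '0'], ['0', '0', '-1']]
```

Here the number of parsed rows (2) corresponds to the feature count, and `state_matches(None, "1", set_na_true=True)` returns `True`, which reflects one reading of treating missing data as both high and low.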
Outputs
PS#_
Notes
Using sync-greedy to generate SOL_POOL_FILE is recommended.
Requires Open MPI
DATA_FILE should be tab-separated; the columns represent individuals and the rows represent features.
Owner
- Name: Climer Lab
- Login: ClimerLab
- Kind: organization
- Location: Saint Louis Missouri
- Website: http://www.cs.umsl.edu/~climer/
- Repositories: 1
- Profile: https://github.com/ClimerLab
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Smith"
    given-names: "Ken"
    orcid: "https://orcid.org/0000-0000-0000-0000"
title: "Frequency Based Pruning"
version: 1.1.0
license: BSD-3-Clause
license-url: "https://github.com/ClimerLab/FBP/blob/main/LICENSE"
repository-code: "https://github.com/ClimerLab/FBP/"
keywords:
  - feature selection
  - youden j
type: software
url: "https://github.com/ClimerLab/FBP/"