ap-svm-data-cleaning
This repository contains the Affinity Propagation (AP) + Support Vector Machine (SVM) data cleaning model code for time-series signals generated by germanium detectors.
Repository
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
AP-SVM Data Cleaning
The Affinity Propagation (AP) + Support Vector Machine (SVM) Data Cleaning model is designed to remove anomalous signals and keep physical signals captured by germanium detectors, using a clustering + classification mechanism.
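As a rough illustration of the clustering + classification idea, the following sketch uses scikit-learn's `AffinityPropagation` and `SVC` on synthetic two-dimensional points. It is not the code in this repository: the synthetic data, the `preference` and `damping` values, and the rule used to tag exemplars as anomalous are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for per-signal feature vectors (NOT real detector data):
# one blob plays the role of physical signals, the other of anomalous ones.
physical = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(60, 2))
anomalous = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2))
X = np.vstack([physical, anomalous])

# Step 1 (clustering): AP selects exemplars and assigns each sample to one.
# The preference and damping values are assumptions chosen for this toy data.
ap = AffinityPropagation(preference=-50, damping=0.9, max_iter=1000,
                         random_state=0).fit(X)
exemplars = ap.cluster_centers_   # one representative sample per cluster
cluster_ids = ap.labels_          # cluster index of every training sample

# Step 2 (tagging): hypothetical rule standing in for a manual inspection
# of each exemplar; 1 means anomalous, 0 means physical.
exemplar_is_anomalous = (exemplars.sum(axis=1) > 5.0).astype(int)
y = exemplar_is_anomalous[cluster_ids]  # each sample inherits its exemplar's tag

# Step 3 (classification): the SVM learns the tags so it can label new signals.
svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

new_signals = np.array([[0.1, -0.2], [4.8, 5.1]])
keep_mask = svm.predict(new_signals) == 0  # True where a signal would be kept
print(keep_mask)
```

In this sketch AP reduces the training set to a few exemplars that only need to be tagged once, and the SVM then extends those tags to signals it has not seen.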
Software Requirements
Create a conda environment from the requirements.txt file with the following command:
conda create --name apsvm --file requirements.txt
Make sure to run the scripts and Jupyter notebooks of this repository from the apsvm conda environment.
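One way to guard against running from the wrong environment is a small check at the top of a script or notebook. The sketch below relies on the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated; the helper name `in_conda_env` is hypothetical and not part of this repository.

```python
import os


def in_conda_env(expected="apsvm", environ=None):
    """Return True if the active conda environment is named `expected`."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV to the active environment's name.
    return environ.get("CONDA_DEFAULT_ENV") == expected


if not in_conda_env():
    print("Warning: the 'apsvm' conda environment does not appear to be active.")
```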
Repository Structure
- data/: Contains training and testing data, configuration JSON files, and serialized model and data files produced when training the AP-SVM model.
- plots/: Contains plots generated during the training and testing of the AP-SVM model.
- test/: Contains notebooks to evaluate the performance of the AP-SVM model on test data, including sacrifice and leakage studies.
- train/: Contains scripts and notebooks for training and optimizing the AP-SVM model.
- vis/: Contains scripts and notebooks for visualizing the AP-SVM model in 3D.
Usage
1. Data Preparation
Open the data/ directory. There you will find instructions on how to access and process the data before feeding it into AP-SVM.
2. Training
Open the train/ directory. There you will find instructions on how to train and optimize AP and SVM.
3. Visualizing
Open the vis/ directory. There you will find instructions on how to create a 3D plot of the training dataset and the SVM decision regions.
4. Testing
Open the test/ directory. There you will find instructions on how to test the AP-SVM's performance and perform sacrifice and leakage studies.
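The README does not define sacrifice and leakage. The sketch below assumes the usual data-cleaning meaning: sacrifice is the fraction of physical signals that are wrongly removed, and leakage is the fraction of anomalous signals that are wrongly kept. The function name and the example labels are hypothetical, not taken from the notebooks in test/.

```python
def sacrifice_and_leakage(is_physical, is_kept):
    """Compute (sacrifice, leakage) from per-signal truth and cut decisions.

    is_physical: iterable of bools, True if the signal is truly physical.
    is_kept:     iterable of bools, True if the cleaning model kept the signal.
    """
    pairs = list(zip(is_physical, is_kept))
    n_physical = sum(1 for phys, _ in pairs if phys)
    n_anomalous = len(pairs) - n_physical

    # Physical signals that the model removed.
    n_sacrificed = sum(1 for phys, kept in pairs if phys and not kept)
    # Anomalous signals that the model kept.
    n_leaked = sum(1 for phys, kept in pairs if not phys and kept)

    sacrifice = n_sacrificed / n_physical if n_physical else 0.0
    leakage = n_leaked / n_anomalous if n_anomalous else 0.0
    return sacrifice, leakage


# Made-up labels for illustration only.
truth = [True, True, True, True, False, False]
kept = [True, True, True, False, True, False]
sacrifice, leakage = sacrifice_and_leakage(truth, kept)
print(f"sacrifice = {sacrifice:.2f}, leakage = {leakage:.2f}")
```

A lower value is better for both quantities, and tightening the cut typically trades one against the other.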
License
This project is licensed under the MIT License. See the LICENSE file for details.
Contact and Support
For any questions, issues, or feedback, please contact Esteban León.
Owner
- Name: Esteban León
- Login: esleon97
- Kind: user
- Repositories: 3
- Profile: https://github.com/esleon97
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "AP-SVM Data Cleaning"
date-released: 2024-09-01
authors:
  - family-names: León
    given-names: Esteban
    orcid: https://orcid.org/0000-0002-0073-5512
GitHub Events
Total
- Push event: 2
Last Year
- Push event: 2
Dependencies
- Babel ==2.15.0
- Brotli ==1.0.9
- GDAL ==3.6.2
- GitPython ==3.1.43
- Jinja2 ==3.1.3
- MarkupSafe ==2.1.3
- Pint ==0.22
- Pint-Pandas ==0.5
- PyJWT ==2.4.0
- PyQt5 ==5.15.10
- PyQt5-sip ==12.13.0
- PySocks ==1.7.1
- PyWavelets ==1.6.0
- PyYAML ==6.0.1
- Pygments ==2.18.0
- QtPy ==2.4.1
- SQLAlchemy ==2.0.31
- Send2Trash ==1.8.3
- anaconda-client ==1.11.1
- anaconda-navigator ==2.4.0
- anaconda-project ==0.11.1
- anyio ==4.4.0
- archspec ==0.2.3
- argon2-cffi ==23.1.0
- argon2-cffi-bindings ==21.2.0
- arrow ==1.3.0
- asttokens ==2.4.1
- async-lru ==2.0.4
- attrs ==23.2.0
- awkward ==2.5.2
- awkward-cpp ==28
- awkward-pandas ==2023.8.0
- backports.functools-lru-cache ==1.6.4
- backports.tempfile ==1.0
- backports.weakref ==1.0.post1
- beautifulsoup4 ==4.12.2
- bitarray ==2.9.2
- bitstring ==4.1.4
- bleach ==6.1.0
- blosc2 ==2.7.0
- boltons ==23.0.0
- boost_histogram ==1.4.1
- certifi ==2024.2.2
- cffi ==1.16.0
- chardet ==4.0.0
- charset-normalizer ==2.0.4
- click ==8.1.7
- clyent ==1.2.2
- colorlog ==6.7.0
- comm ==0.2.2
- conda ==24.3.0
- conda-build ==24.3.0
- conda-content-trust ==0.1.3
- conda-libmamba-solver ==24.1.0
- conda-pack ==0.6.0
- conda-package-handling ==2.2.0
- conda-repo-cli ==1.0.88
- conda-token ==0.4.0
- conda-verify ==3.4.2
- conda_index ==0.4.0
- conda_package_streaming ==0.9.0
- contourpy ==1.2.1
- cryptography ==42.0.5
- cycler ==0.12.1
- debugpy ==1.8.1
- decorator ==5.1.1
- defusedxml ==0.7.1
- distro ==1.8.0
- dspeed ==1.3.0
- exceptiongroup ==1.2.1
- executing ==2.0.1
- fastjsonschema ==2.16.2
- filelock ==3.13.1
- fonttools ==4.53.0
- fqdn ==1.5.1
- future ==0.18.3
- gitdb ==4.0.11
- greenlet ==3.0.3
- h11 ==0.14.0
- h5py ==3.11.0
- hdf5plugin ==4.2.0
- hist ==2.7.3
- histoprint ==2.4.0
- httpcore ==1.0.5
- httpx ==0.27.0
- idna ==3.4
- iminuit ==2.24.0
- importlib-metadata ==7.0.1
- importlib_resources ==6.4.0
- ipykernel ==6.29.4
- ipympl ==0.9.3
- ipython ==8.25.0
- ipython-genutils ==0.2.0
- ipywidgets ==8.1.3
- isoduration ==20.11.0
- jedi ==0.19.1
- joblib ==1.4.2
- json5 ==0.9.25
- jsonpatch ==1.33
- jsonpointer ==2.1
- jsonschema ==4.19.2
- jsonschema-specifications ==2023.7.1
- jupyter ==1.0.0
- jupyter-console ==6.6.3
- jupyter-lsp ==2.2.5
- jupyter_client ==8.6.2
- jupyter_core ==5.5.0
- jupyter_server_terminals ==0.5.3
- jupyterlab_pygments ==0.3.0
- jupyterlab_server ==2.27.2
- jupyterlab_widgets ==3.0.11
- kiwisolver ==1.4.5
- legend-plot-style ==0.0.1
- legend_daq2lh5 ==1.2.2
- legend_pydataobj ==1.7.0
- legendstyles ==0.1.dev46
- libarchive-c ==2.9
- libmambapy ==1.5.8
- llvmlite ==0.43.0
- matlabengine ==23.2
- matplotlib ==3.9.0
- matplotlib-inline ==0.1.7
- memory-profiler ==0.61.0
- menuinst ==2.0.2
- mistune ==3.0.2
- mkl-fft ==1.3.8
- mkl-random ==1.2.4
- mkl-service ==2.4.0
- more-itertools ==10.1.0
- msgpack ==1.0.8
- navigator-updater ==0.4.0
- nbclient ==0.10.0
- nbconvert ==7.16.4
- nbformat ==5.9.2
- ndindex ==1.8
- nest-asyncio ==1.6.0
- notebook ==7.2.1
- notebook_shim ==0.2.4
- numba ==0.60.0
- numexpr ==2.10.1
- numpy ==1.26.4
- overrides ==7.7.0
- packaging ==23.2
- pandas ==2.2.2
- pandocfilters ==1.5.1
- parse ==1.19.1
- parso ==0.8.4
- pexpect ==4.9.0
- pillow ==10.2.0
- pip ==22.3.1
- pkginfo ==1.9.6
- platformdirs ==3.10.0
- pluggy ==1.0.0
- ply ==3.11
- prometheus_client ==0.20.0
- prompt_toolkit ==3.0.47
- psutil ==5.9.0
- psycopg2-binary ==2.9.9
- ptyprocess ==0.7.0
- pure-eval ==0.2.2
- py-cpuinfo ==9.0.0
- pyFFTW ==0.13.1
- pyarrow ==15.0.0
- pybind11 ==2.10.4
- pycosat ==0.6.6
- pycparser ==2.21
- pyfcutils ==0.2.4
- pygama ==2.0.1
- pylegendmeta ==0.10.2
- pyparsing ==3.1.2
- python-dateutil ==2.8.2
- python-json-logger ==2.0.7
- pytz ==2023.3.post1
- pyzmq ==26.0.3
- qtconsole ==5.5.2
- referencing ==0.35.1
- requests ==2.31.0
- rfc3339-validator ==0.1.4
- rfc3986-validator ==0.1.1
- rpds-py ==0.18.1
- ruamel-yaml-conda ==0.17.21
- ruamel.yaml ==0.17.21
- ruamel.yaml.clib ==0.2.6
- scikit-learn ==1.3.2
- scipy ==1.13.1
- seaborn ==0.13.2
- setuptools ==65.6.3
- sip ==6.7.12
- six ==1.16.0
- smmap ==5.0.1
- sniffio ==1.3.1
- soupsieve ==2.5
- stack-data ==0.6.3
- tables ==3.9.2
- terminado ==0.18.1
- threadpoolctl ==3.5.0
- tinycss2 ==1.3.0
- tomli ==2.0.1
- tornado ==6.3.3
- tqdm ==4.66.4
- traitlets ==5.14.3
- truststore ==0.8.0
- types-python-dateutil ==2.9.0.20240316
- typing_extensions ==4.12.2
- tzdata ==2024.1
- uhi ==0.4.0
- ujson ==5.4.0
- uri-template ==1.3.0
- urllib3 ==2.1.0
- wcwidth ==0.2.13
- webcolors ==24.6.0
- webencodings ==0.5.1
- websocket-client ==1.8.0
- wheel ==0.38.4
- widgetsnbextension ==4.0.11
- xmltodict ==0.13.0
- zipp ==3.19.2
- zstandard ==0.19.0