Science Score: 77.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.8%) to scientific vocabulary
Repository
Machine Learning code for Pan-STARRS and ATLAS
Basic Info
- Host: GitHub
- Owner: genghisken
- License: gpl-3.0
- Language: Python
- Default Branch: master
- Size: 293 KB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
psat-ml
Automatic classification of Pan-STARRS and ATLAS images. Based on the code originally written by Darryl Wright, Ken W. Smith and Amanda Ibsen. Documentation written by Amanda Ibsen.
In a Nutshell:
This repo contains a pipeline to connect to the ATLAS (or PS1) database, get cutouts of difference images, build a data set, train a classifier to differentiate between real and bogus images, and plot the results.
How does it work?

GetCutOuts
Input options
configFile : .yaml with database credentials
mjds : list of nights
stampSize : size of cutouts
stampLocation : where to store cutouts
camera : '02a' for Haleakala, '01a' for Mauna Loa
downloadthreads : number of threads used for downloading cutouts
stampThreads : number of threads used for processing stamps
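The configFile option points to a YAML file holding the database credentials. A hypothetical example of what such a file might look like (all key names and values here are illustrative assumptions, not the project's actual schema):

```yaml
# Hypothetical credentials file for the ATLAS/PS1 database connection.
database:
  host: db.example.org
  port: 3306
  user: atlas_reader
  password: changeme
  name: atlas
```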
Explanation
-getATLASTrainingSetCutouts.py: It takes as input a config file, a list of dates (in MJD) and a directory to store the output in. It connects to the ATLAS database using the credentials in the config file and gets all exposures for the given time frame. For each exposure it creates a .txt file containing all x,y positions for the objects in the images and a 40x40 pixel cutout image for each object. It also creates a "good.txt" and a "bad.txt" file, containing the x,y positions of the real and bogus objects, respectively.
-getPS1TrainingSetCutouts.py: Same as the above script, but it connects to the PS1 database instead.
BuildMLDataset
Input options
good : file with x,y pixel positions for real objects
bad : file with x,y pixel positions for bogus objects
outputFile : .h5 output file
e : extent (default=10)
E : Extension (default=0)
s : skew, how many bogus objects per real one (default=3)
r : rotation (default=None)
N : normalization function (default='signPreserveNorm')
Explanation
-buildMLDataset.py: It takes as input the good.txt and bad.txt files with all x,y positions for real and bogus objects. From those, it builds an .h5 file containing the features (20x20 pixels of the image) and targets (real or bogus label) to be used later as training set.
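A minimal sketch of what buildMLDataset.py produces, assuming illustrative names throughout: the dataset keys ("X", "y"), the helper names, and the exact form of the signPreserveNorm normalisation are assumptions, not the project's actual implementation.

```python
import numpy as np
import h5py

def sign_preserve_norm(stamp):
    # Assumed form of the 'signPreserveNorm' option: scale each cutout to
    # unit maximum absolute value, preserving the sign of every pixel.
    peak = np.max(np.abs(stamp))
    return stamp / peak if peak > 0 else stamp

def build_dataset(good_stamps, bad_stamps, output_file, skew=3):
    # Keep at most `skew` bogus examples per real one (the -s option).
    bad_stamps = bad_stamps[: len(good_stamps) * skew]
    stamps = np.concatenate([good_stamps, bad_stamps])
    # Flatten each 20x20 cutout into a 400-element feature vector.
    X = np.stack([sign_preserve_norm(s).ravel() for s in stamps])
    # Targets: label 1 = real, label 0 = bogus.
    y = np.concatenate([np.ones(len(good_stamps)), np.zeros(len(bad_stamps))])
    with h5py.File(output_file, "w") as f:
        f.create_dataset("X", data=X)
        f.create_dataset("y", data=y)
```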
KerasTensorflowClassifier
Input options
outputcsv : output csv file
trainingset : .h5 input dataset
classifierfile : .h5 file to store model (classifier)
Explanation
-kerasTensorflowClassifier.py: It takes as input an .h5 file with the training set and a path to store the classifier as an .h5 file. If the model doesn't exist yet, it creates it, trains it, and classifies a test set. It returns a .csv file containing the targets and scores for all images. The classifier used is a CNN with the following architecture:
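The actual model is a Keras CNN whose exact architecture is not reproduced here. As a rough illustration of what such a network computes on one 20x20 cutout, here is a dependency-light NumPy sketch of a single conv filter, ReLU, 2x2 max pooling, and a dense sigmoid score; the layer sizes and function names are assumptions for illustration only.

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Naive 'valid' 2D cross-correlation, as a conv layer computes per filter.
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def toy_cnn_score(stamp, kernel, weights, bias):
    # One conv filter -> ReLU -> 2x2 max pooling -> dense sigmoid score.
    fmap = np.maximum(conv2d_valid(stamp, kernel), 0.0)
    h, w = fmap.shape[0] // 2 * 2, fmap.shape[1] // 2 * 2
    pooled = fmap[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
    z = pooled.ravel() @ weights + bias
    return 1.0 / (1.0 + np.exp(-z))  # real/bogus probability in (0, 1)
```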

PlotResults
Input options
inputFiles : csv files to be plotted, with both target and score for each object
outputFile : output .png file with the plots
Explanation
-plotResults.py: It takes as input a csv file with the scores and targets for all images and plots the ROC curve and the Detection error tradeoff graph for the data set.
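The ROC curve plotted by this step can be traced directly from the (target, score) pairs in the csv. A minimal sketch of that computation, assuming targets are encoded 1 = real and 0 = bogus (the function name is illustrative, not from the project):

```python
import numpy as np

def roc_points(targets, scores):
    # Sweep the decision threshold from high to low, accumulating true and
    # false positives; the (fpr, tpr) pairs trace the ROC curve.
    order = np.argsort(scores)[::-1]
    targets = np.asarray(targets)[order]
    tps = np.cumsum(targets == 1)
    fps = np.cumsum(targets == 0)
    tpr = tps / max(tps[-1], 1)   # true positive rate (recall)
    fpr = fps / max(fps[-1], 1)   # false positive rate
    return fpr, tpr
```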
Some results
ROC curve and trade-off plots for ATLAS test data-set

Recall for 'confirmed' and 'good' transients

How to run the pipeline?
When asked to run a task, the pipeline searches for the resources needed to complete it. If any are missing, it first runs the task that produces them, recursing in this way until the requested task can run.
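This recursive behaviour follows Luigi's requires/output/run pattern. A dependency-free sketch of the idea (the class and attribute names here are illustrative, not Luigi's actual API):

```python
import os

class Task:
    # Minimal stand-in for the pattern the pipeline uses: each task declares
    # the tasks it requires and the file it produces.
    requires = []        # Task classes this task depends on
    output = None        # path of the file this task produces

    def run(self):
        raise NotImplementedError

    @classmethod
    def complete(cls):
        return cls.output is not None and os.path.exists(cls.output)

def build(task_cls):
    # Recursively run missing requirements first, then the task itself,
    # mirroring how the pipeline resolves resources before running a task.
    for dep in task_cls.requires:
        if not dep.complete():
            build(dep)
    if not task_cls.complete():
        task_cls().run()
```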
To run a task:
python atlasClassificationPipeline.py Name_of_Task --local-scheduler --name_of_option1 option1 ... --name_of_optionN optionN
Examples:
-To run the PlotResults task
python atlasClassificationPipeline.py PlotResults --local-scheduler --inputfiles [file1.csv,...,filen.csv] --outputFile output.png
For more information on how to run a pipeline, see the Luigi documentation.
To set up:
- create a virtual environment with Python 3.6 and activate it
- pip install -r requirements.txt
Owner
- Login: genghisken
- Kind: user
- Twitter: TheGenghisKen
- Repositories: 15
- Profile: https://github.com/genghisken
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Smith"
    given-names: "Ken W."
    orcid: "https://orcid.org/0000-0001-9535-3199"
  - family-names: "Wright"
    given-names: "Darryl E."
  - family-names: "Ibsen"
    given-names: "Amanda"
title: "psat-ml"
version: 0.1.0
doi: 10.5281/zenodo.10869720
date-released: 2024-03-25
url: "https://github.com/genghisken/psat-ml"
GitHub Events
Total
- Watch event: 1
- Push event: 1
Last Year
- Watch event: 1
- Push event: 1
Committers
Last synced: 11 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Ken Smith | k****h@q****k | 38 |
| joshgithubbin | j****m@g****m | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 4 days
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- joshgithubbin (2)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- Keras ==2.0.4
- Markdown ==2.6.11
- Pillow ==5.2.0
- PyMySQL ==0.9.2
- PyWavelets ==0.5.2
- PyYAML ==3.13
- Theano ==1.0.2
- Werkzeug ==0.14.1
- absl-py ==0.2.2
- asn1crypto ==0.24.0
- astor ==0.7.1
- astropy ==3.0.3
- certifi ==2018.4.16
- cffi ==1.11.5
- chardet ==3.0.4
- cloudpickle ==0.5.3
- cov-core ==1.15.0
- coverage ==4.5.1
- cryptography >=2.3
- cycler ==0.10.0
- dask ==0.18.1
- decorator ==4.3.0
- dill ==0.2.8.2
- docopt ==0.6.2
- docutils ==0.14
- eventlet ==0.23.0
- fundamentals ==1.6.0
- gast ==0.2.0
- gkutils >=0.2.22
- greenlet ==0.4.14
- grpcio ==1.13.0
- h5py ==2.8.0
- idna ==2.7
- kiwisolver ==1.0.1
- lockfile ==0.12.2
- luigi ==2.7.6
- matplotlib ==2.2.2
- multiprocess ==0.70.6.1
- mysqlclient ==1.3.13
- networkx ==2.1
- nose2 ==0.7.4
- numpy ==1.14.5
- pandas ==0.23.3
- panstamps ==0.5.1
- protobuf ==3.6.0
- psutil ==5.4.6
- pycparser ==2.18
- pyparsing ==2.2.0
- pyprof2calltree ==1.4.3
- python-daemon ==2.1.2
- python-dateutil ==2.7.3
- pytz ==2018.5
- requests ==2.19.1
- scikit-image ==0.14.0
- scikit-learn ==0.19.2
- scipy ==1.1.0
- six ==1.11.0
- tensorboard ==1.9.0
- tensorflow ==1.9.0
- termcolor ==1.1.0
- threadpool ==1.3.2
- toolz ==0.9.0
- tornado ==4.5.3
- unicodecsv ==0.14.1
- urllib3 ==1.23