https://github.com/andrew-saydjari/apogeereduction.jl

An alternative pipeline for reducing APOGEE data from raw (3D) observations

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.2%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: andrew-saydjari
  • License: mit
  • Language: Julia
  • Default Branch: main
  • Homepage:
  • Size: 113 MB
Statistics
  • Stars: 4
  • Watchers: 3
  • Forks: 0
  • Open Issues: 21
  • Releases: 0
Created over 1 year ago · Last pushed 6 months ago
Metadata Files
Readme License

README.md

ApogeeReduction

Files

The pipeline produces files at many stages of reduction.

- ar3Dcal: Raw 3D datacubes. These are zero point adjusted 3D inputs into the photoelectron rate extractions and are experimental (not always created).
- ar2D: 2D images after 3D→2D extraction, before calibration
- ar2Dcal: 2D calibrated images after dark subtraction and flat fielding
- ar2Dresiduals: Residuals from 2D extraction process
- ar1D: Extracted 1D spectra for each fiber, in detector pixel units (before wavelength calibration and resampling)
- ar1Duni: 1D spectra resampled onto a uniform wavelength grid
- ar1Dunical: Flux (relative) calibrated 1D spectra on the uniform wavelength grid

Structure

There are four main types of files in this repository:

- scripts/*/run*.sh: wrapper scripts to run the pipeline (submit job, determine resources, etc.)
- scripts/*/makerunlist*.sh: interface to get the data to run the pipeline on
- pipeline.sh: how functions combine to process the data
- src/.jl: core functions of the repository

File Structure

├── src/ : core functions of the repository
├── scripts/ : scripts for running the pipeline
│   ├── run/ : scripts general users will interact with to run the pipeline
│   └── cal/ : scripts to build the calibrations files
├── test/ : test files for the repository (name matched to the src/ files they test)
├── metadata/ : metadata files for the repository (mostly dates for instrument changes/special calibrations runs)
├── data/ : input data (e.g. sky line lists from HITRAN)
├── dags/ : dags to run the pipeline through Airflow automations
├── pipeline.jl : main pipeline function (3D → 2D)
└── pipeline_2d_1d.jl : 2D pipeline function (2D → 1D)

Call Structure

Nightly Runs:
└── run_all.sh : run all the data for a given night
    ├── almanac: queries database containing targeting information and data transfer status
    ├── make_runlist_all.sh: convert almanac output into a runlist interpreted by the pipeline
    ├── pipeline.sh: reduces data from raw type (3D compressed) to 2D calibrated data
    ├── run_trace_cal.sh: extracts the traces from domeflats to define 1D extraction profiles
    │   ├── almanac
    │   ├── make_runlist_dome_flats.sh: scrape almanac outputs for dome flats
    │   ├── pipeline.sh
    │   └── make_traces_domeflats.jl: extracts/saves traces from dome flats via gaussian fits to the "y" direction
    ├── pipeline_2d_1d.sh: extracts and calibrates 1D spectra from 2D calibrated data
    └── plot_all.sh: makes end of night plots for validation/QA and posts them to Slack

The bulk reprocessing workflow is still to be determined. Because the nightly runs are already designed to be massively parallel, bulk reprocessing should look similar, with possible interruptions to build higher signal-to-noise calibrations by combining many calibration exposures.

Current Flag Bits

Certain pixels are entirely masked or have data of questionable quality. The pipeline flag bits record the root cause for why this (tiny fraction of the) data could not be processed.

| Bit | Value | Meaning |
| --- | ----- | ------- |
| - | 0 | No problems |
| 0 | 1 | reference array pixels |
| 1 | 2 | reference pixels |
| 2 | 4 | bad reference pixels |
| 3 | 8 | pixels not dark corrected |
| 4 | 16 | pixels with negative dark current |
| 5 | 32 | pixels with large dark current |
| 6 | 64 | flat response too low |
| 7 | 128 | one diff was dropped because it is a likely cosmic ray |
| 8 | 256 | more than one diff was dropped because they were likely cosmic rays (sus) |
| 9 | 512 | bad linear SUTR chi2 |
| 10 | 1024 | failed 1D extraction |
| 11 | 2048 | no nearby good pixels in 1D extraction |
| 12 | 4096 | neff>10 in 1D extraction |
| 13 | 8192 | pixel partially saturated |
| 14 | 16384 | pixel fully saturated |
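
Because each flag is a single bit, one integer per pixel can carry several conditions at once. The following is a minimal sketch of how such a value could be decoded, assuming the bit meanings listed in the table above. The names FLAG_MEANINGS, has_flag, and decode_flags are hypothetical and are not taken from the repository's source.

```julia
# Meanings copied from the flag table; entry i describes bit i-1.
# All names here are illustrative assumptions, not the package API.
const FLAG_MEANINGS = [
    "reference array pixels",
    "reference pixels",
    "bad reference pixels",
    "pixels not dark corrected",
    "pixels with negative dark current",
    "pixels with large dark current",
    "flat response too low",
    "one diff was dropped because it is a likely cosmic ray",
    "more than one diff was dropped because they were likely cosmic rays",
    "bad linear SUTR chi2",
    "failed 1D extraction",
    "no nearby good pixels in 1D extraction",
    "neff>10 in 1D extraction",
    "pixel partially saturated",
    "pixel fully saturated",
]

# True if the given bit (0-based, as in the table) is set in `flag`.
has_flag(flag::Integer, bit::Integer) = (flag >> bit) & 1 == 1

# Meanings of every bit set in `flag`; an empty vector means "No problems".
decode_flags(flag::Integer) =
    [FLAG_MEANINGS[bit + 1] for bit in 0:(length(FLAG_MEANINGS) - 1) if has_flag(flag, bit)]

# A pixel with value 128 + 512 has bits 7 and 9 set.
decode_flags(128 + 512)
```

A pixel with value 0 decodes to an empty list, which corresponds to the "No problems" row.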

Testing

To test the pipeline, run the run_all.sh script with the desired tele and SJD. For example:

bash ./src/run_scripts/run_all.sh apo 60639

This is good practice before declaring that a PR with substantial changes is ready to merge (in the absence of a CI pipeline, which is still in progress).

Nomenclature

SJD

SJD is an "SDSS Julian day," which is adjusted to roll over earlier than the usual MJD (modified Julian day) so that the day rollover does not collide with evening calibrations and preparations (defined in https://ui.adsabs.harvard.edu/abs/2015PASP..127..397W/abstract; updated for LCO, see for example https://github.com/sdss/sdsstools/blob/main/src/sdsstools/time.py#L21).

The two APOGEE instruments are at two different observatories: APO (north) and LCO (south).

MJD = JD - 2400000.5
SJD = MJD + 0.3 # at APO
SJD = MJD + 0.4 # at LCO

- APO is MST/MDT. The 0.3 day offset (7 h 12 min) means that a new SJD begins at 16:48 UTC, which is 9:48 AM MST (UTC-7), instead of the MJD rollover at 5:00 PM MST (UTC-7).
- LCO is CLT/CLST. The 0.4 day offset (9 h 36 min) means that a new SJD begins at 14:24 UTC, which is 10:24 AM CLT (UTC-4), instead of the MJD rollover at 8:00 PM CLT (UTC-4).
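
The definitions above translate directly into a few lines of Julia. This is a minimal sketch that uses only the offsets stated in this section; the names SJD_OFFSET, jd_to_mjd, and jd_to_sjd are hypothetical and may not match what the pipeline uses internally.

```julia
# Offsets in days, as stated above. The names are illustrative assumptions.
const SJD_OFFSET = Dict("apo" => 0.3, "lco" => 0.4)

# Modified Julian day from Julian day.
jd_to_mjd(jd::Real) = jd - 2400000.5

# Integer SJD for a given JD and telescope ("apo" or "lco").
function jd_to_sjd(jd::Real, tele::AbstractString)
    return floor(Int, jd_to_mjd(jd) + SJD_OFFSET[lowercase(tele)])
end

# The same instant (MJD 60639.65) falls on different SJDs at the two sites.
jd = 2460639.5 + 0.65
jd_to_sjd(jd, "apo")  # 60639
jd_to_sjd(jd, "lco")  # 60640
```

The last two calls show that the observatory has to be part of the conversion: the same timestamp can belong to different SJD folders at APO and LCO.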

SJD is only ever used for rough definitions of a "day" (taking only the integer part), mostly for foldering and for grouping nightly calibrations with observations. However, long daytime calibration runs can sometimes be broken up by the SJD switch. In all cases, when precise timing is necessary, we convert from TAI to JD and store the result at Float64 precision.
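
As a rough check of what Float64 precision means for a JD, the spacing between adjacent representable values near present-day Julian dates can be computed with Julia's built-in eps. The variable names below are illustrative only.

```julia
# JD at the start of MJD 60639 (60639 + 2400000.5).
jd = 2460639.5

# Gap between this JD and the next representable Float64, in days.
spacing_days = eps(jd)

# The same gap in seconds (86400 seconds per day): about 40 microseconds.
spacing_seconds = spacing_days * 86400
```

At this magnitude the spacing is 2^-31 days, so a JD stored as Float64 resolves time to a few tens of microseconds.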

Contributing

All contributions are welcome! Please feel free to open a PR with any changes you would like to see. We will help you troubleshoot any test failures, so PRs with in-progress code are fine, as is code in another language that has the functionality you would like to see. If you don't have any code related to your idea, please open an issue and we will help you get started.

Quick tips on Julia for Python programmers

To enable the Slack messaging functionality, the OAuth token for the bot must be set in your bashrc. Please contact the current repo owner for that token. Please also change the channel key ENV["SLACK_CHANNEL"] in the src/utils.jl file to the "dev" version during development to reduce noise on the daily processing channel.

Owner

  • Name: Andrew Saydjari
  • Login: andrew-saydjari
  • Kind: user
  • Location: Cambridge, MA
  • Company: @Harvard

5th Year PhD student @ Harvard Physics. BS/MS @ Yale '18. I am an astronomer interested in data science working on dust.

GitHub Events

Total
  • Issues event: 27
  • Watch event: 1
  • Delete event: 73
  • Issue comment event: 308
  • Push event: 617
  • Pull request review event: 117
  • Pull request review comment event: 256
  • Pull request event: 315
  • Create event: 116
Last Year
  • Issues event: 27
  • Watch event: 1
  • Delete event: 73
  • Issue comment event: 308
  • Push event: 617
  • Pull request review event: 117
  • Pull request review comment event: 256
  • Pull request event: 315
  • Create event: 116

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 20
  • Total pull requests: 213
  • Average time to close issues: 12 days
  • Average time to close pull requests: about 15 hours
  • Total issue authors: 5
  • Total pull request authors: 6
  • Average comments per issue: 0.5
  • Average comments per pull request: 0.89
  • Merged pull requests: 165
  • Bot issues: 1
  • Bot pull requests: 28
Past Year
  • Issues: 19
  • Pull requests: 195
  • Average time to close issues: 8 days
  • Average time to close pull requests: about 16 hours
  • Issue authors: 5
  • Pull request authors: 6
  • Average comments per issue: 0.53
  • Average comments per pull request: 0.96
  • Merged pull requests: 148
  • Bot issues: 1
  • Bot pull requests: 17
Top Authors
Issue Authors
  • andrew-saydjari (12)
  • ajwheeler (8)
  • KevinMcK95 (2)
  • github-actions[bot] (1)
  • andycasey (1)
Pull Request Authors
  • andrew-saydjari (97)
  • ajwheeler (53)
  • KevinMcK95 (34)
  • github-actions[bot] (24)
  • andycasey (9)
  • dependabot[bot] (4)
Top Labels
Issue Labels
good first issue (1)
Pull Request Labels
dependencies (4) github_actions (2)

Dependencies

.github/workflows/CI.yml actions
  • actions/checkout v4 composite
  • julia-actions/cache v2 composite
  • julia-actions/julia-buildpkg v1 composite
  • julia-actions/julia-runtest v1 composite
  • julia-actions/setup-julia v2 composite
.github/workflows/CompatHelper.yml actions
  • julia-actions/setup-julia v1 composite
.github/workflows/TagBot.yml actions
  • JuliaRegistries/TagBot v1 composite
.github/workflows/Format.yml actions
  • actions/checkout v4 composite
  • julia-actions/setup-julia latest composite
  • reviewdog/action-suggester v1 composite