hi-friends

SKA data challenge SDC2 solution

https://github.com/hi-friends-sdc2/hi-friends

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found CITATION.cff file
✓ codemeta.json file: found codemeta.json file
✓ .zenodo.json file: found .zenodo.json file
✓ DOI references: found 5 DOI reference(s) in README
✓ Academic publication links: links to zenodo.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (10.7%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: HI-FRIENDS-SDC2
License: gpl-3.0
Language: Jupyter Notebook
Default Branch: master
Size: 17.4 MB

Statistics

Stars: 4
Watchers: 2
Forks: 3
Open Issues: 1
Releases: 1

Created almost 5 years ago · Last pushed over 2 years ago

Metadata Files

Readme, Contributing, License, Code of conduct, Citation

Summary

This repository hosts a workflow to process HI data cubes produced by radio interferometers, in particular the large data cubes that future instruments like the SKA will produce. It extracts radio sources and characterizes their main properties.

The workflow is managed and executed with the snakemake workflow management system. It uses spectral-cube, which builds on the dask parallelization tool and the astropy suite, to divide the large cube into smaller pieces. On each subcube, we run Sofia-2 to mask the data, find sources, and characterize their properties. Finally, the individual catalogs are cleaned and concatenated into a single catalog, and duplicates from the overlapping regions are eliminated. Some diagnostic plots are produced with a Jupyter notebook.
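The splitting step described above can be illustrated with a small sketch. It is not the repository's implementation: the function names `overlapping_ranges` and `grid_subcubes` and the `overlap` parameter are hypothetical, and the code only computes pixel ranges, leaving the actual cube slicing to a library such as spectral-cube.

```python
from typing import List, Tuple

Range = Tuple[int, int]


def overlapping_ranges(n_pix: int, n_chunks: int, overlap: int) -> List[Range]:
    """Split [0, n_pix) into n_chunks half-open ranges that overlap their neighbours.

    Hypothetical helper: each tile edge is extended by `overlap` pixels on both
    sides and clipped to the axis, so neighbouring ranges share up to
    2 * overlap pixels.
    """
    if n_chunks < 1 or n_pix < n_chunks or overlap < 0:
        raise ValueError("need 1 <= n_chunks <= n_pix and overlap >= 0")
    # Edges of the non-overlapping tiling, spread as evenly as integers allow.
    edges = [round(i * n_pix / n_chunks) for i in range(n_chunks + 1)]
    return [
        (max(0, lo - overlap), min(n_pix, hi + overlap))
        for lo, hi in zip(edges[:-1], edges[1:])
    ]


def grid_subcubes(nx: int, ny: int, n_side: int, overlap: int) -> List[Tuple[Range, Range]]:
    """Return one (x_range, y_range) pair per subcube of an n_side x n_side grid."""
    xs = overlapping_ranges(nx, n_side, overlap)
    ys = overlapping_ranges(ny, n_side, overlap)
    return [(x, y) for y in ys for x in xs]


if __name__ == "__main__":
    for (x0, x1), (y0, y1) in grid_subcubes(100, 100, 2, 5):
        print(f"x: {x0}-{x1}  y: {y0}-{y1}")
```

Each pair of ranges could then be used to cut the spatial axes of the full cube while keeping every spectral channel. The overlap exists so that a source lying on a tile border is fully contained in at least one subcube, which is also why duplicates have to be removed after the catalogs are concatenated.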

HI-FRIENDS team: participation in the SKA Data Challenge 2

This repository contains the workflow used to find and characterize the HI sources in the data cube of the SKA Data Challenge 2. It was developed by the HI-FRIENDS team. The workflow was executed on the SP-SRC cluster at the IAA-CSIC. Documentation can be found in the HI-FRIENDS SDC2 Documentation (more details below).

Accessibility to the workflow

Following FAIR principles, we are trying to make the workflow as accessible as possible. The contents of this repository and the solution submitted to the SDC2 are published in this Zenodo record. The snakemake workflow is also provided as Singularity and Docker containers, and it is published in WorkflowHub. Installation and execution instructions can be found in the online documentation developed in this repository.

Installing

For details on installing and using HI-FRIENDS, please visit the documentation: installation, execution.

License

We use the GNU General Public License v3.0. See the full license here.

Citation

Please use this reference (it resolves to the most recent version in Zenodo): https://doi.org/10.5281/zenodo.5167659

Documentation

The repository documentation is available on the HI-FRIENDS SDC2 webpage, where you can find details on:

The SKA Data Challenge 2
- The HI-FRIENDS solution to the SDC2
- Workflow general description
- The HI-FRIENDS team
Methodology
- Data exploration
- Feedback from the workflow and logs
- Configuration
- Unit tests
- Software management and containerization
- Check conformance to coding standards
Workflow Description
- Workflow definition diagrams
- Workflow file structure
- Output products
- Snakemake execution and diagrams
Workflow installation
- Dependencies
- Installation
  1. Get conda
  2. Get the pipeline and install snakemake
- Deploy in containers
  - Docker
  - Singularity
  - Podman
- Use tarball of the workflow
- Use myBinder
Workflow execution
- Preparation
- Basic usage and verification of the workflow
- Execution on a data cube
SDC2 HI-FRIENDS results
- Our solution
- Score
SDC2 Reproducibility award
- Reproducibility of the solution check list
Developers
- define_chunks module
- eliminate_duplicates module
- filter_catalog module
- run_sofia module
- sofia2cat module
- split_subcube module
Acknowledgments
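Because neighbouring subcubes overlap, a source near a border can appear in two catalogs, and the duplicate elimination step has to keep only one copy. The sketch below shows one way to do this and is an assumption, not the logic of the eliminate_duplicates module: the column names (`ra`, `dec`, `freq`, `flux`), the tolerances, and the rule of keeping the brightest detection are all hypothetical.

```python
import math
from typing import Dict, List

Source = Dict[str, float]


def angular_sep_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees between two sky positions (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding errors before asin.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def drop_duplicates(sources: List[Source], max_sep_deg: float,
                    max_dfreq_hz: float) -> List[Source]:
    """Greedy de-duplication: keep a source only if no brighter kept source
    lies within both the positional and the frequency tolerance."""
    kept: List[Source] = []
    # Visit the brightest detections first so they are the ones that survive.
    for src in sorted(sources, key=lambda s: s["flux"], reverse=True):
        is_duplicate = any(
            angular_sep_deg(src["ra"], src["dec"], k["ra"], k["dec"]) <= max_sep_deg
            and abs(src["freq"] - k["freq"]) <= max_dfreq_hz
            for k in kept
        )
        if not is_duplicate:
            kept.append(src)
    return kept


if __name__ == "__main__":
    catalog = [
        {"id": 1, "ra": 180.0, "dec": -30.0, "freq": 1.0e9, "flux": 50.0},
        {"id": 2, "ra": 180.0005, "dec": -30.0, "freq": 1.0e9 + 1e4, "flux": 40.0},
        {"id": 3, "ra": 181.0, "dec": -30.0, "freq": 1.0e9, "flux": 10.0},
    ]
    print([s["id"] for s in drop_duplicates(catalog, 0.002, 1e5)])
```

The pairwise comparison is quadratic in the number of sources, which is acceptable for a sketch. A production catalog would more likely rely on an indexed sky match such as the ones astropy provides.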

Contributing

More details in CONTRIBUTING.MD. Summary here:

Coding

Nothing fancy here, just:

1. Fork this repo
2. Commit your code
3. Submit a pull request. The maintainers will review it and give you feedback so you can iterate on it.

Considerations

- Make sure existing tests pass
- Make sure your new code is properly tested and fully covered
- Following The seven rules of a great Git commit message is highly encouraged
- When adding a new feature, branch from the master branch

Testing

As mentioned above, existing tests must pass, and new features must be tested and fully covered.

Documenting

Code should be self-documenting. Where code may be hard to understand, it must include comments that make it easier to review and maintain later on.

Citation (CITATION.cff)

cff-version: 1.1.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Moldon
    given-names: Javier
    orcid: https://orcid.org/0000-0002-8079-7608
title: "HI-FRIENDS participation in the SKA Data Challenge 2"
version: 1.0.1
doi: https://doi.org/10.5281/zenodo.5167659
date-released: 2021-07-06

GitHub Events

Total

Fork event: 1

Last Year

Fork event: 1

Dependencies

docs/requirements.txt pypi

astropy ==4.3.post1
attrs ==21.2.0
cloudpickle ==1.6.0
dask ==2021.7.2
docutils ==0.16
fsspec ==2021.7.0
joblib ==1.0.1
locket ==0.2.1
markdown-it-py ==1.1.0
matplotlib *
mdit-py-plugins ==0.2.8
myst-parser ==0.15.1
myst-parser *
pandas *
partd ==1.2.0
pyerfa ==2.0.0
pyyaml ==5.4.1
radio-beam ==0.3.3
scipy ==1.7.1
spectral-cube ==0.5.0
sphinx-rtd-theme ==0.5.2
toolz ==0.11.1

.github/workflows/fair-software.yml actions

fair-software/howfairis-github-action 0.2.0 composite

environment.yml pypi
