Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: iop.org, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.7%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: floriangriese
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 19.6 MB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 4
Created about 4 years ago · Last pushed about 3 years ago
Metadata Files
Readme License Citation

README.md

Radio Galaxy Dataset


This Radio Galaxy Dataset is a collection and combination of several catalogues using the FIRST radio galaxy survey [1]. The following license applies to the images from the FIRST radio galaxy survey:

"Provenance: The FIRST project team: R.J. Becker, D.H. Helfand, R.L. White, M.D. Gregg, S.A. Laurent-Muehleisen. Copyright: 1994, University of California. Permission is granted for publication and reproduction of this material for scholarly, educational, and private non-commercial use. Inquiries for potential commercial uses should be addressed to: Robert Becker, Physics Dept, University of California, Davis, CA 95616"

Further, the following catalogues are included in this dataset:

* MiraBest [2], Source
* Gendre [3-4], Supplementary Data: mnras0404-1719-SD1.pdf, data tables CoNFIG-1 to CoNFIG-4
* Capetti 2017a [5], Table
* Capetti 2017b [6], Table
* Baldi 2018 [7], Table
* Proctor [8], Table, data from Table 1 with label “WAT” and “NAT”

Examples for the class definitions of FRI, FRII, Compact and Bent are shown below.

[Image: example images with the labels]

| classes | Label |
| ----------- | ----------- |
| FRI | 0 |
| FRII | 1 |
| Compact | 2 |
| Bent | 3 |

The dataset has the following total number of samples per class.

| classes/split | FRI | FRII | Compact | Bent | Total |
| ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
| total | 495 | 924 | 391 | 348 | 2158 |

We provide two splitting options for the dataset. The first splitting option (galaxy_data_h5.zip) splits the data into train, valid and test sets with the following number of samples per class.

| classes/split | FRI | FRII | Compact | Bent | Total |
| ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
| train | 395 | 824 | 291 | 248 | 1758 |
| valid | 50 | 50 | 50 | 50 | 200 |
| test | 50 | 50 | 50 | 50 | 200 |
| total | 495 | 924 | 391 | 348 | 2158 |

The second splitting option (galaxy_data_crossvalid_0_h5.zip to galaxy_data_crossvalid_4_h5.zip and galaxy_data_crossvalid_test_h5.zip) provides a 5-fold cross-validation dataset with a larger test set.

| classes/split | FRI | FRII | Compact | Bent | Total |
| ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
| 5-fold cross train | 316 | 659 | 232 | 198 | 1405 |
| 5-fold cross valid | 79 | 165 | 59 | 50 | 353 |
| test | 100 | 100 | 100 | 100 | 400 |
| total | 495 | 924 | 391 | 348 | 2158 |
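The split tables above can be cross-checked with a few lines of Python. The sketch below only re-enters the counts from the tables and verifies that each splitting option adds up to the per-class totals; the dictionary and function names are illustrative and not part of the dataset code.

```python
CLASSES = ("FRI", "FRII", "Compact", "Bent")

# Per-class sample counts copied from the tables above.
TOTAL = dict(zip(CLASSES, (495, 924, 391, 348)))

OPTION_1 = {
    "train": dict(zip(CLASSES, (395, 824, 291, 248))),
    "valid": dict(zip(CLASSES, (50, 50, 50, 50))),
    "test": dict(zip(CLASSES, (50, 50, 50, 50))),
}

OPTION_2 = {
    "5-fold cross train": dict(zip(CLASSES, (316, 659, 232, 198))),
    "5-fold cross valid": dict(zip(CLASSES, (79, 165, 59, 50))),
    "test": dict(zip(CLASSES, (100, 100, 100, 100))),
}


def class_sums(option):
    """Sum the counts of all splits of one option, separately per class."""
    return {c: sum(split[c] for split in option.values()) for c in CLASSES}


for name, option in (("option 1", OPTION_1), ("option 2", OPTION_2)):
    sums = class_sums(option)
    print(name, sums, "matches totals:", sums == TOTAL)

print("total samples:", sum(TOTAL.values()))
```

In both options the counts add up to the same 2158 samples. In the second option, the train and valid counts of a fold together equal the 1758 training samples of the first option.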

Installation for usage with pytorch

If you want to use the dataset via the dataset class FIRSTGalaxyData with pytorch, first install the necessary packages with

    pip3 install -r requirements.txt

Otherwise, you can

* use the dataset directly with the *.png files on disk or
* load the dataset directly from the HDF5 file.

Both options are described further below.

Usage with pytorch

    from firstgalaxydata import FIRSTGalaxyData
    import torchvision.transforms as transforms

    # Convert the images to tensors and normalize each channel
    transformRGB = transforms.Compose(
        [transforms.ToTensor(),
         transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])])

    # Load the training split from the HDF5 file
    data = FIRSTGalaxyData(root="./", selected_split="train", input_data_list=["galaxy_data_h5.h5"],
                           is_PIL=True, is_RGB=True, transform=transformRGB)

    print(data)

This will print out the following output:

    Dataset FIRSTGalaxyData
        Selected classes: dict_values(['FRI', 'FRII', 'Compact', 'Bent'])
        Number of datapoints in total: 1758
        Number of datapoint in class FRI: 395
        Number of datapoint in class FRII: 824
        Number of datapoint in class Compact: 291
        Number of datapoint in class Bent: 248
        Split: train
        Root Location: ./
        Transforms (if any): Compose(
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        )
        Target Transforms (if any): None
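For orientation, the transform used above rescales the pixel values: ToTensor converts 8-bit pixel values to floats in [0, 1], and Normalize with mean 0.5 and std 0.5 then computes (x - mean) / std, which maps them to [-1, 1]. A minimal plain-Python sketch of this arithmetic for a single pixel value follows; the helper name is illustrative and not part of torchvision.

```python
def to_tensor_then_normalize(pixel, mean=0.5, std=0.5):
    """Mimic ToTensor followed by Normalize for one 8-bit pixel value."""
    scaled = pixel / 255.0        # ToTensor: uint8 range [0, 255] -> float range [0.0, 1.0]
    return (scaled - mean) / std  # Normalize: (x - mean) / std


# Darkest, mid-grey and brightest 8-bit values
for pixel in (0, 128, 255):
    print(pixel, round(to_tensor_then_normalize(pixel), 3))
```

The darkest value 0 maps to -1.0 and the brightest value 255 maps to 1.0. Since mean and std are 0.5 for every channel, the same mapping applies to each of the three channels.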

Options

With selected_split the data split is selected. Choose either "train" or "valid" or "test".

With selected_classes only data containing the chosen classes is returned, e.g. ["FRI", "FRII"] returns only FRI and FRII images.

With selected_catalogues the dataset uses only the selected catalogues. All possible catalogues are listed here:

    selected_catalogues = ["Gendre", "MiraBest", "Capetti2017a", "Capetti2017b", "Baldi2018", "Proctor_Tab1"]

    data = FIRSTGalaxyData(root="./", selected_split="train", input_data_list=["galaxy_data_h5.h5"],
                           selected_catalogues=selected_catalogues,
                           is_PIL=True, is_RGB=True, transform=transformRGB)

Basic usage with files on disk

You will also find the dataset in the 'galaxy_data' folder by unzipping galaxy_data.zip. It contains the following folder structure with *.png images. The most important information is also part of the file name, separated by underscores: RA_DEC_Label_Source.png, e.g. 14.084_-9.608_3_MiraBest.png

    galaxy_data
    ├── all
    │   ├── Bent/*.png
    │   ├── Compact/*.png
    │   ├── FRI/*.png
    │   └── FRII/*.png
    ├── test
    │   ├── Bent/*.png
    │   ├── Compact/*.png
    │   ├── FRI/*.png
    │   └── FRII/*.png
    ├── train
    │   ├── Bent/*.png
    │   ├── Compact/*.png
    │   ├── FRI/*.png
    │   └── FRII/*.png
    └── valid
        ├── Bent/*.png
        ├── Compact/*.png
        ├── FRI/*.png
        └── FRII/*.png
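Since the file names carry the most important metadata, they can be parsed without opening the images. The following is a minimal sketch, assuming the underscore-separated pattern RA_DEC_Label_Source.png described above. The function and type names are illustrative, and the second file name is a made-up example rather than a file known to be in the dataset.

```python
from pathlib import Path
from typing import NamedTuple

LABEL_NAMES = {0: "FRI", 1: "FRII", 2: "Compact", 3: "Bent"}


class GalaxyFileInfo(NamedTuple):
    ra: float     # right ascension
    dec: float    # declination
    label: int    # class label, 0 to 3
    source: str   # catalogue name


def parse_galaxy_filename(path):
    """Split a file name of the assumed form RA_DEC_Label_Source.png into its parts."""
    stem = Path(path).stem  # file name without directory and ".png"
    # maxsplit=3 keeps underscores inside the source name (e.g. "Proctor_Tab1") intact
    ra, dec, label, source = stem.split("_", 3)
    return GalaxyFileInfo(float(ra), float(dec), int(label), source)


info = parse_galaxy_filename("14.084_-9.608_3_MiraBest.png")
print(info, LABEL_NAMES[info.label])

# Hypothetical file name, only to show a source that itself contains an underscore
print(parse_galaxy_filename("150.0_2.5_3_Proctor_Tab1.png"))
```

Limiting the split to the first three underscores is a design choice for catalogue names such as "Proctor_Tab1"; a plain split on every underscore would break them apart.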

Basic usage with HDF5 file

The dataset can also be accessed via the HDF5 file galaxy_data_h5.h5. Every data entry consists of a group named data_$(i) with i=1...n, where n is the total number of data entries. Each group consists of the following data:

* Img: two-dimensional uint8 array with shape (300, 300)
* Attributes of Img:
* RA: right ascension, equatorial coordinate system (J2000): double
* DEC: declination, equatorial coordinate system (J2000): double
* Source: string, ["Gendre", "MiraBest", "Capetti2017a", "Capetti2017b", "Baldi2018", "Proctor_Tab1"]
* Filepath_literature: string, relative path to the *.png file in the folder galaxy_data
* Label_literature: double scalar, 0: "FRI", 1: "FRII", 2: "Compact", 3: "Bent"
* Split_literature: string, ["train", "test", "valid"]
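As a rough illustration of this layout, the sketch below reads the image and its attributes for one entry with h5py (listed in requirements.txt). It assumes that data_$(i) stands for group names such as data_1, data_2 and so on. The function names are hypothetical, and the attributes are collected generically rather than by fixed name.

```python
def read_entry(h5_file, index):
    """Return (image, attributes) for the group data_<index> of an open HDF5 file."""
    img = h5_file[f"data_{index}"]["Img"]  # dataset holding the image
    image = img[()]                         # read the full array into memory
    attributes = dict(img.attrs)            # whatever attributes are stored on Img
    return image, attributes


def print_first_entry(path="galaxy_data_h5.h5"):
    """Open the HDF5 file and print shape, dtype and attributes of entry 1."""
    import h5py  # third-party; imported here so read_entry can be used without it

    with h5py.File(path, "r") as h5_file:
        print("top-level groups:", len(h5_file))
        image, attributes = read_entry(h5_file, 1)
        print(image.shape, image.dtype)
        for key, value in attributes.items():
            print(key, value)
```

Calling print_first_entry() requires the unzipped galaxy_data_h5.h5 in the working directory. The path default and the entry index 1 are assumptions taken from the description above.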

References

[1] R. H. Becker, R. L. White, D. J. Helfand, The FIRST Survey: Faint Images of the Radio Sky at Twenty Centimeters, The Astrophysical Journal 450 (1995) 559.

[2] H. Miraghaei, P. N. Best, The nuclear properties and extended morphologies of powerful radio galaxies: the roles of host galaxy and environment, Monthly Notices of the Royal Astronomical Society (2017) stx007.

[3] M. A. Gendre, P. N. Best, J. V. Wall, The Combined NVSS-FIRST Galaxies (CoNFIG) sample - II. Comparison of space densities in the Fanaroff-Riley dichotomy, Monthly Notices of the Royal Astronomical Society (2010).

[4] M. A. Gendre, J. V. Wall, The Combined NVSS-FIRST Galaxies (CoNFIG) sample - I. Sample definition, classification and evolution, Monthly Notices of the Royal Astronomical Society (2008).

[5] A. Capetti, F. Massaro, R. D. Baldi, FRICAT: A FIRST catalog of FR I radio galaxies, Astronomy & Astrophysics 598 (2017) A49.

[6] A. Capetti, F. Massaro, R. D. Baldi, FRIICAT: A FIRST catalog of FR II radio galaxies, Astronomy & Astrophysics 601 (2017) A81.

[7] R. D. Baldi, A. Capetti, F. Massaro, FR0CAT: a FIRST catalog of FR 0 radio galaxies, Astronomy & Astrophysics 609 (2018) A1.

[8] D. D. Proctor, Morphological annotations for groups in the FIRST database, The Astrophysical Journal Supplement Series 194 (2011) 31.

Owner

  • Name: Florian Griese
  • Login: floriangriese
  • Kind: user

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Griese"
  given-names: "Florian"
  affiliation: "CDCS"
  orcid: "https://orcid.org/0000-0003-3309-9783"
- family-names: "Kummer"
  given-names: "Janis"
  affiliation: "CDCS"
  orcid: "https://orcid.org/0000-0002-7853-0103"
- family-names: "Rustige"
  given-names: "Lennart"
  affiliation: "CDCS"
  orcid: "https://orcid.org/0000-0002-0292-2477"
title: "Radio Galaxy Dataset"
version: 0.1.1
doi: 10.5281/zenodo.7120632
url: "https://github.com/floriangriese/RadioGalaxyDataset"
license: MIT
date-released: 2022-10-6

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 37
  • Total Committers: 2
  • Avg Commits per committer: 18.5
  • Development Distribution Score (DDS): 0.432
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Florain Griese f****e@t****e 21
Florian f****e@g****e 16
Committer Domains (Top 20 + Academic)
gmx.de: 1 tuhh.de: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Dependencies

requirements.txt pypi
  • Pillow *
  • astropy *
  • h5py *
  • matplotlib *
  • numpy *
  • setuptools *
  • torch *
  • torchvision *
pyproject.toml pypi