zoraptera-occurrence-dataset
Zoraptera Occurrence Dataset - curated dataset of global occurrence records of Zoraptera
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
Zoraptera Occurrence Dataset
The dataset is stored in the `zoraptera_occs.csv` file. It is based on the Darwin Core standard and is updated manually or by the semi-automated workflow described below.
How to cite:
If you use this repository, please cite both the associated paper and the version of the data/repository that you used.
Published data descriptor:
Kaláb, O., Hoffmannova, J., Packova, G. et al. Curated global occurrence dataset of the insect order Zoraptera. Sci Data 12, 360 (2025). https://doi.org/10.1038/s41597-025-04696-4
Latest version of the dataset (1.1.0):
Note on fields:
Besides the fields defined by the Darwin Core standard (https://dwc.tdwg.org/list/), we added five custom fields:
| Field name  | Description                                |
|-------------|--------------------------------------------|
| zodID       | unique ID of the record in the dataset     |
| osmID       | ID of the related OSM geometry             |
| polygon_fid | ID of the related polygon in `geom.gpkg`   |
| gbifID      | ID of the related GBIF record              |
| inatID      | ID of the related iNaturalist record       |
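As a quick sanity check when loading the data, the custom fields can be separated from the remaining (Darwin Core) columns. The following is only a minimal Python sketch (the repository's own scripts are written in R); the helper name `split_fields` is hypothetical, and the sample header and row are invented placeholders, not real records.

```python
import csv
import io

# The five custom (non-Darwin Core) fields listed in the table above.
CUSTOM_FIELDS = ("zodID", "osmID", "polygon_fid", "gbifID", "inatID")


def split_fields(header):
    """Return (custom fields present, other fields, custom fields missing)."""
    present = [name for name in CUSTOM_FIELDS if name in header]
    missing = [name for name in CUSTOM_FIELDS if name not in header]
    other = [name for name in header if name not in CUSTOM_FIELDS]
    return present, other, missing


# Placeholder header and row for illustration; the real file has more columns.
sample = io.StringIO("zodID,scientificName,decimalLatitude,gbifID\nx,x,x,x\n")
reader = csv.DictReader(sample)
present, other, missing = split_fields(reader.fieldnames)
print("custom:", present)
print("other:", other)
print("missing:", missing)
```

With the real file, the same call would be made on `csv.DictReader(open("zoraptera_occs.csv", newline="", encoding="utf-8")).fieldnames`, assuming the file is UTF-8 encoded.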
[!NOTE] Notice on WKT geometries in `footprintWKT`
Be aware that spreadsheet applications may limit the number of characters per cell and may therefore trim values that are too long. This can cause problems with the `footprintWKT` column in particular: if the data are opened in such software, edited, and saved, longer WKT text may be truncated and the geometries invalidated. This does not affect the rest of the dataset, and the `footprintWKT` column can easily be restored from the original file or recalculated by running `geom_calc.r`, which retrieves the WKT geometries from `geom.gpkg`.
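To check whether a saved copy has been affected by this kind of truncation, a heuristic such as the one below can help. It is a sketch in Python, not part of the repository: the function names are hypothetical, the 32,767-character figure is Excel's documented per-cell limit (other software may differ), and unbalanced parentheses are only a rough indicator of a cut-off WKT string.

```python
import csv
import io

# Excel's documented limit on characters per cell; other tools may differ.
EXCEL_CELL_LIMIT = 32767


def looks_truncated(wkt, cell_limit=EXCEL_CELL_LIMIT):
    """Heuristic: True if a WKT string appears to have been cut off."""
    text = wkt.strip()
    if not text or text.upper().endswith("EMPTY"):
        return False  # empty cell or an explicit EMPTY geometry
    balanced = text.count("(") == text.count(")")
    return len(text) >= cell_limit or not balanced or not text.endswith(")")


def find_truncated(csv_file, id_field="zodID", wkt_field="footprintWKT"):
    """Return the IDs of rows whose WKT value looks truncated."""
    # WKT values can exceed the csv module's default field size limit.
    csv.field_size_limit(10**9)
    return [
        row[id_field]
        for row in csv.DictReader(csv_file)
        if looks_truncated(row.get(wkt_field) or "")
    ]


# Invented two-row example: the second geometry is cut off mid-ring.
sample = (
    "zodID,footprintWKT\n"
    '1,"POLYGON ((0 0, 1 0, 1 1, 0 0))"\n'
    '2,"POLYGON ((0 0, 1 0, 1"\n'
)
print(find_truncated(io.StringIO(sample)))
```

Rows flagged this way can then be restored from the original file or recalculated as described in the note.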
Graphical data summary:
Geographical distribution of Zoraptera records in the dataset by subfamily
Count of Zoraptera records in the dataset across years by family
Dataset update workflow
All updates can be tracked in the history of the file, or in the commit history in general. Semi-automated updates are tracked in the `update.log` file, including the date, source, and DOI if applicable.
Manual updates
Direct manual editing of `zoraptera_occs.csv`.
Semi-automated updates
iNaturalist
- a designated person revises identifications directly on iNaturalist
- run the script `scripts/inat.r`, which looks up the current iNaturalist data and checks whether the designated person (currently only Petr Kočárek) has made any new identifications; if so, they are automatically written to `zoraptera_occs.csv`, and information about the update (date, source) is written to the log file `update.log`
GBIF
- run the script `scripts/gbif.r`, which downloads the latest used GBIF dataset, with its DOI read from the log file `update.log`
- any new data found will be written to a CSV file, and information about the update (date, source, dataset DOI) will be written to the log file `update.log`
- the CSV file with the new data has to be manually checked and incorporated into `zoraptera_occs.csv` according to the methods published in the paper.
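The core of both semi-automated updates is the same pattern: find records not yet in the dataset and log the update. The Python sketch below illustrates that pattern only; it is not a port of `scripts/gbif.r` or `scripts/inat.r`. The function names, the comma-separated log layout, and the placeholder DOI are all assumptions, since the actual format of `update.log` is not described here.

```python
from datetime import date


def new_records(existing_rows, downloaded_rows, key="gbifID"):
    """Return downloaded rows whose key is not yet present in the dataset."""
    known = {row[key] for row in existing_rows if row.get(key)}
    return [
        row for row in downloaded_rows
        if row.get(key) and row[key] not in known
    ]


def log_line(source, doi=None, when=None):
    """Build one log entry (date, source, optional DOI); layout is assumed."""
    when = when or date.today()
    parts = [when.isoformat(), source]
    if doi:
        parts.append(doi)
    return ",".join(parts)


# Invented rows for illustration only.
existing = [{"zodID": "1", "gbifID": "100"}, {"zodID": "2", "gbifID": ""}]
downloaded = [{"gbifID": "100"}, {"gbifID": "101"}]

candidates = new_records(existing, downloaded)
entry = log_line("GBIF", doi="10.0000/placeholder", when=date(2025, 1, 1))
print(candidates)
print(entry)
```

Passing `key="inatID"` would apply the same comparison to iNaturalist records. In either case the candidates still need the manual check described above before they enter `zoraptera_occs.csv`.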
Geometry (coordinates) updates
If any new record without coordinates is added to the dataset, the coordinates and positional uncertainty will be obtained following this workflow:
QGIS
- use the OSM place search plugin to find the locality by name, and copy the appropriate features to the layer `geom/geom.gpkg`
- edit the geometry to represent the locality as closely as possible; if the desired place is not the feature itself but is related to it, add a new polygon while keeping the original feature attributes. Remove sea or ocean areas using OSM features tagged `natural=coastline`
- if the locality is not present in OSM, draw the polygon manually
- simplify the polygon with the QGIS `Simplify` algorithm from the geoprocessing toolbox (Visvalingam algorithm, tolerance 100)
- fill the `feature_origin` attribute with one of these categories:
  - `manual` - not related to any OSM feature, manually digitized from the description
  - `osm_related` - features related to OSM features, not intersecting them but manually digitized based on them
  - `osm_derived` - features derived from OSM features, with the features intersecting each other
  - `osm_exact` - features that are exact copies of OSM features
- polygon geometry can be edited in `geom/geom.gpkg` (e.g. improving a site polygon or adding new polygons)
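To give a sense of what the Visvalingam simplification step does, here is a minimal pure-Python sketch of the Visvalingam-Whyatt idea: repeatedly drop the vertex that forms the smallest triangle with its neighbours until every remaining triangle reaches the tolerance. It is not the QGIS implementation; treating the tolerance as an area threshold in squared layer units is an assumption, and the function names and sample ring are invented.

```python
def triangle_area(a, b, c):
    """Area of the triangle spanned by three (x, y) points."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return abs(cross) / 2.0


def visvalingam(points, tolerance, min_points=2):
    """Drop interior points whose effective triangle area is below tolerance.

    The first and last points are always kept, so a closed ring stays closed.
    """
    pts = list(points)
    while len(pts) > max(min_points, 2):
        # Effective area of each interior point given its current neighbours.
        areas = [
            triangle_area(pts[i - 1], pts[i], pts[i + 1])
            for i in range(1, len(pts) - 1)
        ]
        smallest = min(areas)
        if smallest >= tolerance:
            break
        del pts[areas.index(smallest) + 1]
    return pts


# Invented closed ring: a 100 x 100 square with one nearly collinear vertex.
ring = [(0, 0), (50, 1), (100, 0), (100, 100), (0, 100), (0, 0)]
simplified = visvalingam(ring, tolerance=100, min_points=4)
print(simplified)
```

Here the vertex `(50, 1)` spans a triangle of area 50 with its neighbours, which is below the tolerance, so it is removed; the four corners span much larger triangles and are kept. Passing `min_points=4` prevents a closed ring from collapsing below a triangle.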
R
- after any edit of `geom/geom.gpkg`, the coordinates and positional uncertainty should be recalculated with `scripts/geom_calc.r` to write the changes to `zoraptera_occs.csv`. Running `scripts/geom_calc.r` also recalculates the WKT geometries for the `footprintWKT` column in the `zoraptera_occs.csv` dataset.
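How `scripts/geom_calc.r` derives these values is not described here, so the following Python sketch shows just one plausible approach under stated assumptions: the coordinates are taken as the polygon centroid, the positional uncertainty as the largest distance from the centroid to any vertex, and the ring is given in planar (projected) units. All function names and the sample ring are hypothetical.

```python
import math


def polygon_centroid(ring):
    """Area-weighted centroid of a closed ring of (x, y) points."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0  # shoelace term for this edge
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate ring with zero area")
    return cx / (3 * area2), cy / (3 * area2)


def uncertainty_radius(ring, centre):
    """Largest distance from centre to any vertex, in the ring's units."""
    return max(math.dist(centre, point) for point in ring)


def to_wkt(ring):
    """Serialise a single closed ring as a WKT POLYGON string."""
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON (({coords}))"


# Invented 100 x 100 square in planar units.
square = [(0, 0), (100, 0), (100, 100), (0, 100), (0, 0)]
centre = polygon_centroid(square)
radius = uncertainty_radius(square, centre)
wkt = to_wkt(square)
print(centre, round(radius, 2), wkt)
```

For geographic coordinates in degrees, the distance would have to be computed geodesically or after projecting the polygon, which this sketch does not attempt.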
[!NOTE] The MIT licence applies to the repository, excluding the individual data records in the dataset. The licence for each entry (if applicable) is listed in the `licence` column of the `zoraptera_occs.csv` dataset and may be incompatible with MIT.
This research was supported by the Grant Agency of the Czech Republic (project No. 22-05024S; Evolution of angel insects (Zoraptera): from fossils and comparative morphology to cytogenetics and transcriptomes).
Owner
- Name: Oto Kaláb
- Login: kalab-oto
- Kind: user
- Location: Czechia
- Company: Department of Physical Geography and Geoecology / University of Ostrava & @GISMentors / @OpenGeoLabs
- Repositories: 11
- Profile: https://github.com/kalab-oto
ecology - GIS - spatial ecology - biogeography - orthoptera
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this repository, please cite both the article from preferred-citation and this dataset repository itself."
authors:
  - family-names: Kaláb
    given-names: Oto
    affiliation: Department of Physical Geography and Geoecology, Faculty of Science, University of Ostrava,
    orcid: 0000-0003-3485-9377
  - family-names: Hoffmannova
    given-names: Johana
    affiliation: Department of Zoology, Faculty of Science, Palacky University,
    orcid: 0000-0003-0216-6031
  - family-names: Packova
    given-names: Gabriela
    affiliation: Department of Zoology, Faculty of Science, Palacky University,
    orcid: 0000-0001-7949-619X
  - family-names: Kočárková
    given-names: Ivona
    affiliation: Department of Biology and Ecology, Faculty of Science, University of Ostrava,
    orcid: 0000-0002-8942-9481
  - family-names: Kundrata
    given-names: Robin
    affiliation: Department of Zoology, Faculty of Science, Palacky University,
    orcid: 0000-0001-9397-1030
  - family-names: Kočárek
    given-names: Petr
    affiliation: Department of Biology and Ecology, Faculty of Science, University of Ostrava,
    orcid: 0000-0002-1739-0143
preferred-citation:
  authors:
    - family-names: Kaláb
      given-names: Oto
      affiliation: Department of Physical Geography and Geoecology, Faculty of Science, University of Ostrava,
      orcid: 0000-0003-3485-9377
    - family-names: Hoffmannova
      given-names: Johana
      affiliation: Department of Zoology, Faculty of Science, Palacky University,
      orcid: 0000-0003-0216-6031
    - family-names: Packova
      given-names: Gabriela
      affiliation: Department of Zoology, Faculty of Science, Palacky University,
      orcid: 0000-0001-7949-619X
    - family-names: Kočárková
      given-names: Ivona
      affiliation: Department of Biology and Ecology, Faculty of Science, University of Ostrava,
      orcid: 0000-0002-8942-9481
    - family-names: Kundrata
      given-names: Robin
      affiliation: Department of Zoology, Faculty of Science, Palacky University,
      orcid: 0000-0001-9397-1030
    - family-names: Kočárek
      given-names: Petr
      affiliation: Department of Biology and Ecology, Faculty of Science, University of Ostrava,
      orcid: 0000-0002-1739-0143
  title: "Curated global occurrence dataset of the insect order Zoraptera"
  type: article
  database: DOI.org (Crossref)
  issn: 2052-4463
  issue: 1
  journal: Sci Data
  languages: en
  pages: 360
  volume: 12
  url: https://www.nature.com/articles/s41597-025-04696-4
  date-published: 2025-02-28
  identifiers:
    - type: doi
      value: 10.1038/s41597-025-04696-4
title: "Zoraptera Occurrence Dataset"
version: 1.1.0
doi:
date-released:
type: dataset
GitHub Events
Total
- Release event: 1
- Watch event: 2
- Push event: 25
- Create event: 1
Last Year
- Release event: 1
- Watch event: 2
- Push event: 25
- Create event: 1