EML
Ecological Metadata Language interface for R: synthesis and integration of heterogeneous data
Science Score: 33.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ✓ Committers with academic emails: 4 of 25 committers (16.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (19.9%) to scientific vocabulary
Keywords
eml
eml-metadata
metadata-standard
r
r-package
rstats
Keywords from Contributors
genome
phylogenetics
noaa-data
taxonomy
geocode
geo
taxize
biology
mock
biodiversity
Last synced: 9 months ago
Repository
Ecological Metadata Language interface for R: synthesis and integration of heterogeneous data
Basic Info
- Host: GitHub
- Owner: ropensci
- License: other
- Language: R
- Default Branch: master
- Homepage: https://docs.ropensci.org/EML
- Size: 6.92 MB
Statistics
- Stars: 98
- Watchers: 15
- Forks: 32
- Open Issues: 47
- Releases: 0
Topics
eml
eml-metadata
metadata-standard
r
r-package
rstats
Created almost 13 years ago · Last pushed 10 months ago
Metadata Files
Readme
License
README.Rmd
---
output: github_document
---

# EML

[DOI](https://zenodo.org/badge/latestdoi/10894022)

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

```{r include=FALSE}
has_pandoc <- rmarkdown::pandoc_available()
```

EML is a widely used metadata standard in the ecological and environmental sciences. We strongly recommend that interested users visit the [EML Homepage](https://eml.ecoinformatics.org/) for an introduction and thorough documentation of the standard. Additionally, the scientific article *[The New Bioinformatics: Integrating Ecological Data from the Gene to the Biosphere (Jones et al 2006)](https://doi.org/10.1146/annurev.ecolsys.37.091305.110031)* provides an excellent introduction to the role EML plays in building metadata-driven data repositories to address the needs of highly heterogeneous data that cannot be easily reduced to a traditional vertically integrated database. At this time, the `EML` R package provides support for the serializing and parsing of all low-level EML concepts, but still assumes some familiarity with the EML standard, particularly for users seeking to create their own EML files. We hope to add more higher-level functions which will make such familiarity less essential in future development.

## Notes on the EML v2.0 Release

`EML` v2.0 is a complete rewrite which aims to provide a drop-in replacement for the higher-level functions of the existing EML package while also providing additional functionality. This new `EML` version uses only simple and familiar list structures (S3 classes) instead of the more cumbersome use of S4 found in the original `EML`.
While the higher-level functions are identical, this makes it easier for most users and developers to work with `eml` objects and also to write their own functions for creating and manipulating EML objects. Under the hood, `EML` relies on the [emld](https://github.com/ropensci/emld/) package, which uses a Linked Data representation for EML. It is this approach which lets us combine the simplicity of lists with the specificity required by the XML schema.

This revision also supports the **[recently released EML 2.2.0 specification](https://eml.ecoinformatics.org/whats-new-in-eml-2-2-0.html)**.

# Creating EML

```{r message=FALSE, warning=FALSE}
library(EML)
```

## A minimal valid EML document:

```{r}
me <- list(individualName = list(givenName = "Carl", surName = "Boettiger"))
my_eml <- list(dataset = list(
  title = "A Minimal Valid EML Dataset",
  creator = me,
  contact = me)
)

write_eml(my_eml, "ex.xml")
eml_validate("ex.xml")
```

## A Richer Example

Here we show the creation of a relatively complete EML document using `EML`. This closely parallels the function calls shown in the original EML [R-package vignette](https://docs.ropensci.org/EML/articles/creating-EML.html).

## `set_*` methods

The original EML R package defines a set of higher-level `set_*` methods to facilitate the creation of complex metadata structures. `EML` provides these same methods, taking the same arguments for `set_coverage`, `set_attributes`, `set_physical`, `set_methods` and `set_TextType`, as illustrated here:

### Coverage metadata

```{r}
geographicDescription <- "Harvard Forest Greenhouse, Tom Swamp Tract (Harvard Forest)"
coverage <-
  set_coverage(begin = '2012-06-01', end = '2013-12-31',
               sci_names = "Sarracenia purpurea",
               geographicDescription = geographicDescription,
               west = -122.44, east = -117.15,
               north = 37.38, south = 30.00,
               altitudeMin = 160, altitudeMaximum = 330,
               altitudeUnits = "meter")
```

### Reading in text from Word and Markdown

We read in detailed methods written in a Word doc.
This uses EML's docbook-style markup to preserve formatting of paragraphs, lists, titles, and so forth. (This is a drop-in replacement for EML `set_method()`.)

```{r eval=has_pandoc}
methods_file <- system.file("examples/hf205-methods.docx", package = "EML")
methods <- set_methods(methods_file)
```

We can also read in text that uses Markdown for markup elements:

```{r eval=has_pandoc}
abstract_file <- system.file("examples/hf205-abstract.md", package = "EML")
abstract <- set_TextType(abstract_file)
```

### Attribute Metadata from Tables

Attribute metadata can be verbose, and is often defined in separate tables (e.g. separate Excel sheets or `.csv` files). Here we use attribute metadata and factor definitions as given in `.csv` files.

```{r}
attributes <- read.table(system.file("extdata/hf205_attributes.csv", package = "EML"))
factors <- read.table(system.file("extdata/hf205_factors.csv", package = "EML"))
attributeList <-
  set_attributes(attributes,
                 factors,
                 col_classes = c("character",
                                 "Date",
                                 "Date",
                                 "Date",
                                 "factor",
                                 "factor",
                                 "factor",
                                 "numeric"))
```

### Data file format

Though the `physical` metadata specifying the file format is extremely flexible, the `set_physical` function provides defaults appropriate for `.csv` files. DEVELOPER NOTE: ideally the `set_physical` method should guess the appropriate metadata structure based on the file extension.

```{r}
physical <- set_physical("hf205-01-TPexp1.csv")
```

## Generic construction

In the original `EML` R package, objects for which there is no `set_` method are constructed using the `new()` S4 constructor. This provided an easy way to see the list of available slots. In `eml2`, all objects are just lists, and so there is no need for special methods. We can create any object directly by nesting lists with names corresponding to the EML elements.
Here we create a `keywordSet` from scratch:

```{r}
keywordSet <- list(
  list(
    keywordThesaurus = "LTER controlled vocabulary",
    keyword = list("bacteria",
                   "carnivorous plants",
                   "genetics",
                   "thresholds")
  ),
  list(
    keywordThesaurus = "LTER core area",
    keyword = list("populations", "inorganic nutrients", "disturbance")
  ),
  list(
    keywordThesaurus = "HFR default",
    keyword = list("Harvard Forest", "HFR", "LTER", "USA")
  ))
```

Of course, this assumes that we have some knowledge of what the possible terms permitted in an EML keywordSet are! Not so useful for novices. We can get a preview of the elements that any object can take using the `emld::template()` option, but this involves a two-part workflow. Instead, `eml2` provides generic `construct` methods for all objects.

## Constructor methods

For instance, the function `eml$creator()` has function arguments corresponding to each possible slot for a creator. This means we can rely on *tab completion* (and/or autocomplete previews in RStudio) to see what the possible options are. `eml$` functions exist for all complex types. If `eml$` does not exist for an argument (e.g. there is no `eml$givenName`), then the field takes a simple string argument.
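
The constructor arguments described above can also be inspected from code rather than through tab completion. The following is a minimal sketch, not a documented workflow: it assumes the `eml$` constructors are ordinary R functions whose formal arguments name the permitted child elements and that they return plain lists, as stated above. The person and email address are placeholders.

```r
library(EML)

# Formal arguments of the constructor: the same names tab completion would offer.
creator_slots <- names(formals(eml$creator))
print(creator_slots)

# Build a creator from placeholder values; a nested complex type
# (individualName) is built with its own constructor.
jane <- eml$creator(
  individualName = eml$individualName(givenName = "Jane", surName = "Doe"),
  electronicMailAddress = "jane.doe@example.org")

# The result is an ordinary list, so base R tools such as str() and `$` apply.
str(jane)
jane$individualName$surName
```

Because the result is a plain list, it can be passed directly as the `creator` or `contact` element of a dataset.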
### Creating parties (creator, contact, publisher)

```{r}
aaron <- eml$creator(
  individualName = eml$individualName(
    givenName = "Aaron",
    surName = "Ellison"),
  electronicMailAddress = "fakeaddress@email.com")
```

```{r}
HF_address <- eml$address(
  deliveryPoint = "324 North Main Street",
  city = "Petersham",
  administrativeArea = "MA",
  postalCode = "01366",
  country = "USA")
```

```{r}
publisher <- eml$publisher(
  organizationName = "Harvard Forest",
  address = HF_address)
```

```{r}
contact <-
  list(
    individualName = aaron$individualName,
    electronicMailAddress = aaron$electronicMailAddress,
    address = HF_address,
    organizationName = "Harvard Forest",
    phone = "000-000-0000")
```

### Putting it all together

```{r}
my_eml <- eml$eml(
  packageId = uuid::UUIDgenerate(),
  system = "uuid",
  dataset = eml$dataset(
    title = "Thresholds and Tipping Points in a Sarracenia",
    creator = aaron,
    pubDate = "2012",
    intellectualRights = "http://www.lternet.edu/data/netpolicy.html.",
    abstract = abstract,
    keywordSet = keywordSet,
    coverage = coverage,
    contact = contact,
    methods = methods,
    dataTable = eml$dataTable(
      entityName = "hf205-01-TPexp1.csv",
      entityDescription = "tipping point experiment 1",
      physical = physical,
      attributeList = attributeList)
  ))
```

## Serialize and validate

We can also validate first and then serialize:

```{r}
eml_validate(my_eml)
write_eml(my_eml, "eml.xml")
```

## Setting the version

EML will use the latest EML specification by default. To switch to a different version, use `emld::eml_version()`:

```{r}
emld::eml_version("eml-2.1.1")
```

Switch back to the 2.2.0 release:

```{r}
emld::eml_version("eml-2.2.0")
```

```{r include = FALSE}
unlink("eml.xml")
unlink("ex.xml")
codemetar::write_codemeta()
```
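
As a complement to the serialize-and-validate step, a written file can be read back to check that it round-trips. This is a hedged sketch rather than part of the original walkthrough: it uses the package's `read_eml()` reader and assumes the parsed document exposes the title at `doc_out$dataset$title`. The names and the temporary file path are placeholders.

```r
library(EML)

# Minimal document, mirroring the minimal example earlier in this README
# (placeholder names).
me <- list(individualName = list(givenName = "Jane", surName = "Doe"))
doc_in <- list(dataset = list(
  title = "A Minimal Valid EML Dataset",
  creator = me,
  contact = me))

# Serialize to a temporary file so the working directory stays clean.
path <- tempfile(fileext = ".xml")
write_eml(doc_in, path)

# Validate the file on disk, then parse it back into a nested list.
valid <- eml_validate(path)
doc_out <- read_eml(path)
title_out <- doc_out$dataset$title

# Remove the temporary file.
unlink(path)
```

Comparing `title_out` with the title supplied in `doc_in` is a quick way to confirm that nothing was lost between the list representation and the XML on disk.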
Owner
- Name: rOpenSci
- Login: ropensci
- Kind: organization
- Email: info@ropensci.org
- Location: Berkeley, CA
- Website: https://ropensci.org/
- Twitter: rOpenSci
- Repositories: 307
- Profile: https://github.com/ropensci
GitHub Events
Total
- Watch event: 1
- Issue comment event: 2
- Push event: 2
Last Year
- Watch event: 1
- Issue comment event: 2
- Push event: 2
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Carl Boettiger | c****g@g****m | 726 |
| masalmon | m****n@y****e | 44 |
| Bryce Mecum | p****h@g****m | 34 |
| cpfaff | c****f@a****e | 32 |
| Matt Jones | g****e@m****g | 17 |
| Anna Liu | l****o@g****m | 15 |
| Jeanette Clark | j****k@n****u | 9 |
| Karthik Ram | k****m@g****m | 8 |
| rstudio | r****o@e****m | 4 |
| laijasmine | j****i@n****u | 4 |
| Duncan Temple Lang | d****n@r****g | 4 |
| Matthias Grenié | m****e@e****r | 3 |
| Jeroen Ooms | j****s@g****m | 3 |
| Ian Brunjes | 7****v | 2 |
| An T. Nguyen | 4****8 | 1 |
| Anna Krystalli | a****i@g****m | 1 |
| Colin Smith | c****h@w****u | 1 |
| Darío Hereñú | m****a@g****m | 1 |
| Dominic Mullen | d****7@g****m | 1 |
| Edmund Hart | e****t@g****m | 1 |
| Julien Brun | b****7 | 1 |
| Lauren Walker | w****r@n****u | 1 |
| ropenscibot | m****t@g****m | 1 |
| mmfink | m****k | 1 |
| Ivan Hanigan | i****n@g****m | 1 |
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 70
- Total pull requests: 32
- Average time to close issues: 2 months
- Average time to close pull requests: 27 days
- Total issue authors: 30
- Total pull request authors: 13
- Average comments per issue: 3.94
- Average comments per pull request: 2.19
- Merged pull requests: 28
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 3.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- jeanetteclark (8)
- cboettig (8)
- atn38 (7)
- peterdesmet (6)
- amoeba (6)
- RobLBaker (4)
- laijasmine (3)
- earnaud (3)
- mbjones (2)
- BrennieDev (2)
- karawoo (2)
- kzollove (1)
- lkuiucsb (1)
- sagesteppe (1)
- annakrystalli (1)
Pull Request Authors
- jeanetteclark (9)
- amoeba (8)
- BrennieDev (3)
- cboettig (3)
- atn38 (1)
- katrinleinweber (1)
- kant (1)
- laijasmine (1)
- laurenwalker (1)
- srearl (1)
- annakrystalli (1)
- clnsmth (1)
- mmfink (1)
Top Labels
Issue Labels
bug (3)
schema / S4 class (1)
Pull Request Labels
Packages
- Total packages: 2
- Total downloads:
  - cran: 709 last month
- Total docker downloads: 88,618
- Total dependent packages: 10 (may contain duplicates)
- Total dependent repositories: 44 (may contain duplicates)
- Total versions: 19
- Total maintainers: 1
cran.r-project.org: EML
Read and Write Ecological Metadata Language Files
- Homepage: https://docs.ropensci.org/EML/
- Documentation: http://cran.r-project.org/web/packages/EML/EML.pdf
- License: MIT + file LICENSE
- Latest release: 2.0.7 (published 10 months ago)
Rankings
Docker downloads count: 0.0%
Forks count: 2.4%
Dependent repos count: 3.9%
Stargazers count: 4.2%
Average: 5.2%
Dependent packages count: 6.6%
Downloads: 14.1%
Maintainers (1)
Last synced: 10 months ago
conda-forge.org: r-eml
- Homepage: https://github.com/ropensci/EML
- License: MIT
- Latest release: 2.0.5 (published over 5 years ago)
Rankings
Dependent packages count: 15.6%
Dependent repos count: 24.3%
Average: 26.2%
Forks count: 30.2%
Stargazers count: 34.5%
Last synced: 10 months ago