https://github.com/juliaearth/coda.jl

Compositional data analysis in Julia

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.7%) to scientific vocabulary

Keywords

coda compositional-data compositional-data-analysis compositions simplex statistics

Keywords from Contributors

stochastic-processes
Last synced: 6 months ago

Repository

Compositional data analysis in Julia

Basic Info
  • Host: GitHub
  • Owner: JuliaEarth
  • License: mit
  • Language: Julia
  • Default Branch: master
  • Homepage:
  • Size: 298 KB
Statistics
  • Stars: 61
  • Watchers: 5
  • Forks: 8
  • Open Issues: 1
  • Releases: 51
Topics
coda compositional-data compositional-data-analysis compositions simplex statistics
Created about 8 years ago · Last pushed 6 months ago
Metadata Files
Readme License

README.md


This package defines a Composition{D} type representing a D-part composition as defined by Aitchison (1986). In Aitchison's geometry, the D-simplex together with addition (a.k.a. perturbation) and scalar multiplication (a.k.a. scaling) form a vector space, and important properties hold:

  • Scaling invariance
  • Perturbation invariance
  • Permutation invariance
  • Subcompositional coherence

In practice, this means that one can operate on compositional data (i.e. vectors whose entries represent parts of a total) without destroying the ratios of the parts.
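As a rough illustration of why the ratios survive, here is a minimal sketch in plain Julia that does not use CoDa.jl at all. The helper name `closure` is an assumption made for this example and is not taken from the package's API.

```julia
# Hypothetical helper (not the CoDa.jl API): rescale the parts to sum to one.
closure(x) = x ./ sum(x)

x = [2.0, 0.1, 0.3]   # raw amounts of three parts
y = 10 .* x           # the same composition expressed in different units

# Scaling invariance: both vectors close to the same point of the simplex.
closure(x) ≈ closure(y)        # true

# Closure leaves the ratios between parts unchanged.
cx = closure(x)
cx[1] / cx[2] ≈ x[1] / x[2]    # true
```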

Installation

Get the latest stable release with Julia's package manager:

```julia
] add CoDa
```

Usage

Basics

Compositions are static vectors with named parts:

```julia
julia> using CoDa

julia> c = Composition(CO₂=2.0, CH₄=0.1, N₂O=0.3)
                  3-part composition
       ┌                                        ┐
   CO₂ ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 2.0
   CH₄ ┤■■ 0.1
   N₂O ┤■■■■■ 0.3
       └                                        ┘

julia> CoDa.parts(c)
(:CO₂, :CH₄, :N₂O)

julia> CoDa.components(c)
3-element StaticArrays.SVector{3, Union{Missing, Float64}} with indices SOneTo(3):
 2.0
 0.1
 0.3

julia> c.CO₂
2.0
```

Default names are added otherwise:

```julia
julia> c = Composition(1.0, 0.1, 0.1)
                 3-part composition
      ┌                                        ┐
   w1 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 1.0
   w2 ┤■■■■ 0.1
   w3 ┤■■■■ 0.1
      └                                        ┘
```

and serve for internal compile-time checks.

Compositions can be added, subtracted, negated, and multiplied by scalars. Other operations are also defined, including the dot product, induced norm, and distance:

```julia
julia> cₒ = Composition(CO₂=1.0, CH₄=0.1, N₂O=0.1)
                  3-part composition
       ┌                                        ┐
   CO₂ ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 1.0
   CH₄ ┤■■■■ 0.1
   N₂O ┤■■■■ 0.1
       └                                        ┘

julia> -cₒ
                  3-part composition
       ┌                                        ┐
   CO₂ ┤■■ 0.047619047619047616
   CH₄ ┤■■■■■■■■■■■■■■■■■■■ 0.47619047619047616
   N₂O ┤■■■■■■■■■■■■■■■■■■■ 0.47619047619047616
       └                                        ┘

julia> 0.5c
                  3-part composition
       ┌                                        ┐
   CO₂ ┤■■■■■■■■■■■■■■■■■■■■ 0.6207690197922022
   CH₄ ┤■■■■ 0.13880817265812764
   N₂O ┤■■■■■■■■ 0.24042280754967013
       └                                        ┘

julia> c - cₒ
                  3-part composition
       ┌                                        ┐
   CO₂ ┤■■■■■■■■■■■■■■■■■■■■■■■ 0.3333333333333333
   CH₄ ┤■■■■■■■■■■■■ 0.16666666666666666
   N₂O ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.5
       └                                        ┘

julia> c ⋅ cₒ
3.7554028908352994

julia> norm(c)
2.1432393747688687

julia> aitchison(c, cₒ) # Aitchison distance
0.7856640352007868
```
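The numbers above can be checked by hand. The following is a minimal sketch in plain Julia of the textbook Aitchison operations on ordinary vectors; all helper names (`closure`, `plain_clr`, `perturb`, `power`, `inverse`, `inner`, `anorm`, `adist`) are hypothetical and chosen for this example, and the sketch makes no claim about how CoDa.jl implements these operations internally.

```julia
using LinearAlgebra: dot, norm
using Statistics: mean

# Hypothetical helpers in plain Julia; these are not CoDa.jl functions.
closure(x) = x ./ sum(x)                      # rescale parts to sum to one
plain_clr(x) = log.(x) .- mean(log.(x))       # centered log-ratio coordinates

perturb(x, y) = closure(x .* y)               # Aitchison addition
power(λ, x) = closure(x .^ λ)                 # Aitchison scalar multiplication
inverse(x) = closure(1 ./ x)                  # negation
inner(x, y) = dot(plain_clr(x), plain_clr(y)) # Aitchison inner product
anorm(x) = norm(plain_clr(x))                 # induced norm
adist(x, y) = norm(plain_clr(x) .- plain_clr(y))  # Aitchison distance

c  = [2.0, 0.1, 0.3]
c0 = [1.0, 0.1, 0.1]

inverse(c0)              # ≈ [0.0476, 0.4762, 0.4762]
power(0.5, c)            # ≈ [0.6208, 0.1388, 0.2404]
perturb(c, inverse(c0))  # ≈ [0.3333, 0.1667, 0.5]
inner(c, c0)             # ≈ 3.7554
anorm(c)                 # ≈ 2.1432
adist(c, c0)             # ≈ 0.7857
```

Subtraction is written here as perturbation by the inverse, which is why `c - cₒ` in the session above equals the closure of the part-wise quotient.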

More complex functions can be defined in terms of these operations. For example, the function below defines the composition line passing through cₒ in the direction of c:

```julia
julia> f(λ) = cₒ + λ*c
f (generic function with 1 method)
```

Finally, two compositions are considered to be equal when their closure is approximately equal:

```julia
julia> c == c
true

julia> c == cₒ
false
```
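The equality rule can be pictured with a small sketch on plain vectors. The helper names `closure` and `same_composition` are hypothetical, and the default tolerance of `≈` is an assumption; the package may use a different one.

```julia
# Hypothetical helpers (not the CoDa.jl API).
closure(x) = x ./ sum(x)
same_composition(x, y) = closure(x) ≈ closure(y)

# Proportional vectors describe the same composition.
same_composition([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])  # true

# Vectors with different ratios do not.
same_composition([2.0, 0.1, 0.3], [1.0, 0.1, 0.1])     # false
```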

Log-ratio transformations

Currently, the following log-ratio transformations are implemented:

```julia
julia> alr(c)
2-element StaticArrays.SArray{Tuple{2},Float64,1,2} with indices SOneTo(2):
  1.8971199848858813
 -1.0986122886681096

julia> clr(c)
3-element StaticArrays.SArray{Tuple{3},Float64,1,3} with indices SOneTo(3):
  1.6309507528132907
 -1.3647815207407001
 -0.2661692320725906

julia> ilr(c)
2-element StaticArrays.SArray{Tuple{2},Float64,1,2} with indices SOneTo(2):
 -2.1183026052494185
 -0.3259894019031434
```

and their inverses alrinv, clrinv and ilrinv.
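For readers who want to see what these transformations compute, here is a minimal sketch in plain Julia on ordinary vectors. The `*_plain` names are hypothetical helpers, not CoDa.jl functions. The alr sketch uses the last part as the denominator and the ilr sketch uses one particular orthonormal basis; both choices were picked because they reproduce the numbers printed above for c = (2.0, 0.1, 0.3), and whether the package uses exactly these conventions internally is an assumption.

```julia
using Statistics: mean

# Hypothetical helpers in plain Julia; not the CoDa.jl implementation.
closure(x) = x ./ sum(x)

# Additive log-ratio: log of each part relative to the last part.
alr_plain(x) = log.(x[1:end-1] ./ x[end])
alrinv_plain(y) = closure([exp.(y); 1.0])

# Centered log-ratio: log of each part relative to the geometric mean.
clr_plain(x) = log.(x) .- mean(log.(x))
clrinv_plain(y) = closure(exp.(y))

# Isometric log-ratio in one orthonormal basis: coordinate i compares
# part i+1 with the geometric mean of parts 1 through i.
function ilr_plain(x)
    D = length(x)
    [sqrt(i / (i + 1)) * log(x[i+1] / exp(mean(log.(x[1:i])))) for i in 1:D-1]
end

x = [2.0, 0.1, 0.3]

alr_plain(x)   # ≈ [1.8971, -1.0986]
clr_plain(x)   # ≈ [1.6310, -1.3648, -0.2662]
ilr_plain(x)   # ≈ [-2.1183, -0.3260]

# The inverses recover the closed composition.
alrinv_plain(alr_plain(x)) ≈ closure(x)   # true
clrinv_plain(clr_plain(x)) ≈ closure(x)   # true
```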

Transforms for tables are defined in the TableTransforms.jl package: Compose, Closure, Remainder, ALR, CLR, and ILR. These transforms are functors that can be used as follows:

```julia
julia> table |> ILR()
```

Arrays

It is often useful to compose D columns of a table into D-part compositions. The package provides a CoDaArray type that implements the Julia array interface and the Tables.jl interface. We recommend using the function compose(table, cols) to construct such arrays:

```julia
julia> table = (a=[1,2,3], b=[4,5,6], c=[7,8,9])
(a = [1, 2, 3], b = [4, 5, 6], c = [7, 8, 9])

julia> ctable = compose(table, (:a,:b))
(c = [7, 8, 9], coda = Composition{2, (:a, :b)}[1.000 : 4.000, 2.000 : 5.000, 3.000 : 6.000])

julia> ctable.coda[1]
                 2-part composition
     ┌                                        ┐
   a ┤■■■■■■■■■ 1.0
   b ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 4.0
     └                                        ┘
```
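Conceptually, composing columns means regrouping the selected columns row by row and keeping the remaining columns as they are. The sketch below mimics that reshaping in plain Julia, using named tuples as a stand-in for Composition objects. It is an illustration only: the real `compose` returns a CoDaArray, and nothing here reflects its actual implementation.

```julia
table = (a=[1, 2, 3], b=[4, 5, 6], c=[7, 8, 9])
cols = (:a, :b)   # columns to group into 2-part rows

# One named tuple of parts per row, built from the selected columns.
coda = [NamedTuple{cols}(Tuple(float(table[k][i]) for k in cols))
        for i in eachindex(table.a)]

# Columns that were not selected are kept unchanged.
rest = (; (k => table[k] for k in keys(table) if k ∉ cols)...)

# Same layout as the result above: remaining columns plus a `coda` column.
ctable = (; rest..., coda=coda)

ctable.coda[1]   # (a = 1.0, b = 4.0)
```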

Random

D-part compositions can be created at random from a Dirichlet distribution:

```julia
julia> rand(Composition{3})
                 3-part composition
      ┌                                        ┐
   w1 ┤■■■■■■■■■■■■■■■■■ 0.39938229705106565
   w2 ┤■■■■■■ 0.1491859823748656
   w3 ┤■■■■■■■■■■■■■■■■■■■ 0.45143172057406883
      └                                        ┘
```
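The text does not say which Dirichlet parameters are used. As a sketch under the assumption of a flat Dirichlet(1, …, 1), which is uniform on the simplex, a random composition can be drawn in plain Julia by normalizing independent exponential samples. The helper `rand_composition` is hypothetical and not part of CoDa.jl.

```julia
using Random: randexp

# Hypothetical helper: sample from Dirichlet(1, …, 1) by normalizing
# D independent Exp(1) draws so that the parts sum to one.
function rand_composition(D)
    e = randexp(D)
    return e ./ sum(e)
end

w = rand_composition(3)
sum(w) ≈ 1.0     # true
all(w .> 0)      # true
```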

Plots

Separate packages are available for plotting compositional data:

References

This package is heavily influenced by Aitchison's monograph:

  • Aitchison, J. 1986. The Statistical Analysis of Compositional Data

and by other textbooks:

  • van den Boogaart, K. G. & Tolosana-Delgado, R. 2013. Analyzing Compositional Data with R
  • Pawlowsky-Glahn et al. 2015. Modeling and Analysis of Compositional Data
  • Pawlowsky-Glahn, V. & Buccianti, A. 2011. Compositional Data Analysis - Theory and Applications

Notes

The unicode display of composition objects can be obtained with the following code:

```julia
using UnicodePlots
using CoDa

function Base.show(io::IO, mime::MIME"text/plain", c::Composition{D,PARTS}) where {D,PARTS}
  w = CoDa.components(c)
  x = Vector{Float64}()
  p = Vector{Symbol}()
  m = Vector{Symbol}()
  for i in 1:D
    if ismissing(w[i])
      push!(m, PARTS[i])
    else
      push!(p, PARTS[i])
      push!(x, w[i])
    end
  end
  plt = barplot(p, x, title="$D-part composition")
  isempty(m) || annotate!(plt, :t, "missing: $(join(m,", "))")
  show(io, mime, plt)
end
```

The code is not added to the CoDa.jl package itself because the UnicodePlots.jl package has become a very heavy dependency; see UnicodePlots/issues/291.

Owner

  • Name: JuliaEarth
  • Login: JuliaEarth
  • Kind: organization

Fostering geospatial data science and geostatistical modeling in Earth sciences

GitHub Events

Total
  • Create event: 1
  • Commit comment event: 2
  • Release event: 1
  • Watch event: 3
  • Delete event: 1
  • Issue comment event: 1
  • Push event: 4
  • Pull request event: 2
Last Year
  • Create event: 1
  • Commit comment event: 2
  • Release event: 1
  • Watch event: 3
  • Delete event: 1
  • Issue comment event: 1
  • Push event: 4
  • Pull request event: 2

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 219
  • Total Committers: 7
  • Avg Commits per committer: 31.286
  • Development Distribution Score (DDS): 0.105
Top Committers
| Name | Email | Commits |
|---|---|---|
| Júlio Hoffimann | j****n@g****m | 196 |
| Rodrigo Loro Schuller | r****o@i****r | 8 |
| github-actions[bot] | 4****]@u****m | 5 |
| José A. S. Silva | 3****t@u****m | 4 |
| mralbu | m****e@g****m | 3 |
| Rodrigo Schuller | 1****r@u****m | 2 |
| Okon Samuel | 3****l@u****m | 1 |
Committer Domains (Top 20 + Academic)
impa.br: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 9
  • Total pull requests: 42
  • Average time to close issues: 11 days
  • Average time to close pull requests: 1 day
  • Total issue authors: 3
  • Total pull request authors: 8
  • Average comments per issue: 6.89
  • Average comments per pull request: 1.9
  • Merged pull requests: 35
  • Bot issues: 0
  • Bot pull requests: 18
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: about 12 hours
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 2
Top Authors
Issue Authors
  • juliohm (7)
  • JuliaTagBot (1)
  • Iddingsite (1)
Pull Request Authors
  • dependabot[bot] (15)
  • github-actions[bot] (7)
  • eliascarv (7)
  • mrr00b00t (6)
  • mralbu (4)
  • juliohm (3)
  • rlschuller (3)
  • OkonSamuel (1)
Top Labels
Issue Labels
  • enhancement (5)
  • help wanted (3)
  • good first issue (3)
  • tests (1)
Pull Request Labels
  • dependencies (15)
  • formatting (1)
  • automated pr (1)
  • no changelog (1)
  • bug (1)

Dependencies

.github/workflows/CI.yml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • codecov/codecov-action v1 composite
  • julia-actions/julia-buildpkg v1 composite
  • julia-actions/julia-processcoverage v1 composite
  • julia-actions/julia-runtest v1 composite
  • julia-actions/setup-julia v1 composite
.github/workflows/CompatHelper.yml actions
  • julia-actions/setup-julia latest composite
.github/workflows/TagBot.yml actions
  • JuliaRegistries/TagBot v1 composite