dosearch

dosearch: R Package for Identifying General Causal Queries

https://github.com/santikka/dosearch

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 9 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (19.7%) to scientific vocabulary

Keywords

c-plus-plus causal-inference causal-models causality causality-algorithms directed-acyclic-graph graphs labeled-graphs r
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: santikka
  • License: gpl-3.0
  • Language: R
  • Default Branch: master
  • Homepage:
  • Size: 2.64 MB
Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created about 5 years ago · Last pushed 7 months ago
Metadata Files
Readme Changelog Contributing License Codemeta

README.Rmd

---
output: github_document
---



```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r srr-tags, eval = FALSE, echo = FALSE}
#' @srrstats {G1.0} Primary reference is:
#'   S. Tikka, A. Hyttinen and J. Karvanen.
#'   "Causal effect identification from multiple incomplete data sources:
#'   a general search-based approach." \emph{Journal of Statistical Software},
#'   99(5):1--40, 2021.
#' @srrstats {G1.1} First implementation of an original algorithm.
#' @srrstats {G1.2} Life Cycle Statement is included in the README.
```

# dosearch


[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![R-CMD-check](https://github.com/santikka/dosearch/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/santikka/dosearch/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/santikka/dosearch/branch/master/graph/badge.svg)](https://app.codecov.io/gh/santikka/dosearch?branch=master)
[![CRAN version](https://www.r-pkg.org/badges/version/dosearch)](https://CRAN.R-project.org/package=dosearch)



The `dosearch` [R](https://www.r-project.org/) package facilitates 
identification of causal effects from arbitrary observational and experimental 
probability distributions via do-calculus and standard probability 
manipulations using a search-based algorithm (Tikka et al., 2019, 2021).
Formulas of identifiable target distributions are returned in character format
using LaTeX syntax. The causal graph may additionally include mechanisms 
related to: 

* Selection bias (Bareinboim and Tian, 2015)
* Transportability (Bareinboim and Pearl, 2014) 
* Missing data (Mohan et al., 2013)
* Context-specific independence (Corander et al., 2019)
  
See the package vignette or the references for further information.
  
### Citing the package

If you use the `dosearch` package in a publication, please cite the 
corresponding paper in the Journal of Statistical Software:

Tikka S, Hyttinen A, Karvanen J (2021). “Causal Effect Identification from 
Multiple Incomplete Data Sources: A General Search-Based Approach.” 
*Journal of Statistical Software*, 99(5), 1--40. 
[doi:10.18637/jss.v099.i05](https://doi.org/10.18637/jss.v099.i05).

## Installation

You can install the latest release version from CRAN:
```{r, eval = FALSE}
install.packages("dosearch")
```

Alternatively, you can install the latest development version of `dosearch`:
```{r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("santikka/dosearch")
```

## Examples

```{r, echo = FALSE}
library(dosearch)
```

```{r, eval = TRUE}
# back-door formula: z is an observed confounder of x and y
data <- "p(x,y,z)"
query <- "p(y|do(x))"
graph <- "
  x -> y
  z -> x
  z -> y
"
dosearch(data, query, graph)

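# front-door formula: z mediates the effect of x on y,
# and the bidirected edge x <-> y denotes an unobserved confounder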
graph <- "
  x -> z
  z -> y
  x <-> y
"
dosearch(data, query, graph)

# the 'napkin' graph
data <- "p(x,y,z,w)"
graph <- "
  x -> y
  z -> x
  w -> z
  x <-> w
  w <-> y
"
dosearch(data, query, graph)

# case-control design: x* and y* are proxies of x and y,
# and r_x and r_y are the corresponding response indicators
data <- "
  p(x*,y*,r_x,r_y)
  p(y)
"
graph <- "
  x -> y
  y -> r_y
  r_y -> r_x
"
md <- "r_x : x, r_y : y"
dosearch(data, query, graph, missing_data = md)
```
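
The additional mechanisms are supplied as named arguments in the same way as `missing_data` above, and the returned object can be inspected programmatically rather than only printed. The following is a minimal sketch, not an example taken from the package documentation: the graph and the input distribution are a hypothetical toy setting in which `x` affects both `y` and a selection node `s`, and the sketch assumes the accessor functions `is_identifiable()` and `get_formula()` exported by recent versions of `dosearch`.

```r
library(dosearch)

# Hypothetical toy setting (an assumption for illustration):
# x causes y, and x also affects the selection mechanism s,
# so the data are available only conditional on selection.
data <- "p(x,y|s)"
query <- "p(y|do(x))"
graph <- "
  x -> y
  x -> s
"

# selection_bias names the node of the graph that acts as the selection variable
res <- dosearch(data, query, graph, selection_bias = "s")

# Inspect the result instead of relying on the print method
if (is_identifiable(res)) {
  # The formula is a character string in LaTeX syntax
  cat(get_formula(res), "\n")
} else {
  message("The query is not identifiable from the given input.")
}
```

Storing the result in a variable makes it straightforward to branch on identifiability or to pass the LaTeX formula on to a report.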

## References

* Tikka S, Hyttinen A, Karvanen J (2021).
  "Causal effect identification from multiple incomplete data sources: a general search-based approach." 
  *Journal of Statistical Software*, 
  99(5), 1--40.
  [doi:10.18637/jss.v099.i05](https://doi.org/10.18637/jss.v099.i05)
* Tikka S, Hyttinen A, Karvanen J (2019). 
  "Identifying causal effects via context-specific independence relations." 
  In *Proceedings of the 33rd Annual Conference on Neural Information Processing Systems*.
  (https://papers.nips.cc/paper/2019/hash/d88518acbcc3d08d1f18da62f9bb26ec-Abstract.html)
* Bareinboim E, Tian J (2015).
  "Recovering causal effects from selection bias." 
  In *Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence*.
  (http://ftp.cs.ucla.edu/pub/stat_ser/r445.pdf)
* Bareinboim E, Pearl J (2014).
  "Transportability from multiple environments with limited experiments: completeness results." 
  In *Advances in Neural Information Processing Systems 27*.
  (http://ftp.cs.ucla.edu/pub/stat_ser/r443.pdf)
* Mohan K, Pearl J, Tian J (2013).
  "Graphical models for inference with missing data." 
  In *Advances in Neural Information Processing Systems 26*.
  (http://ftp.cs.ucla.edu/pub/stat_ser/r410.pdf)
* Corander J, Hyttinen A, Kontinen J, Pensar J, Väänänen J (2019).
  "A logical approach to context-specific independence." 
  *Annals of Pure and Applied Logic*,
  170(9), 975--992. 
  [doi:10.1016/j.apal.2019.04.004](https://doi.org/10.1016/j.apal.2019.04.004)

Owner

  • Name: Santtu Tikka
  • Login: santikka
  • Kind: user
  • Location: Finland
  • Company: University of Jyväskylä

Postdoctoral researcher at the University of Jyväskylä, Department of Mathematics and Statistics.

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "dosearch",
  "description": "Identification of causal effects from arbitrary observational and experimental probability distributions via do-calculus and standard probability manipulations using a search-based algorithm by Tikka, Hyttinen and Karvanen (2021) <doi:10.18637/jss.v099.i05>. Allows for the presence of mechanisms related to selection bias (Bareinboim and Tian, 2015) <doi:10.1609/aaai.v29i1.9679>, transportability (Bareinboim and Pearl, 2014) <http://ftp.cs.ucla.edu/pub/stat_ser/r443.pdf>, missing data (Mohan, Pearl, and Tian, 2013) <http://ftp.cs.ucla.edu/pub/stat_ser/r410.pdf>, and arbitrary combinations of these. Also supports identification in the presence of context-specific independence (CSI) relations through labeled directed acyclic graphs (LDAGs). For details on CSIs, see Corander et al. (2019) <doi:10.1016/j.apal.2019.04.004>.",
  "name": "dosearch: Causal Effect Identification from Multiple Incomplete Data Sources",
  "codeRepository": "https://github.com/santikka/dosearch",
  "issueTracker": "https://github.com/santikka/dosearch/issues",
  "license": "https://spdx.org/licenses/GPL-3.0",
  "version": "1.0.12",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.5.1 (2025-06-13 ucrt)",
  "provider": {
    "@id": "https://cran.r-project.org",
    "@type": "Organization",
    "name": "Comprehensive R Archive Network (CRAN)",
    "url": "https://cran.r-project.org"
  },
  "author": [
    {
      "@type": "Person",
      "givenName": "Santtu",
      "familyName": "Tikka",
      "email": "santtuth@gmail.com",
      "@id": "https://orcid.org/0000-0003-4039-4342"
    }
  ],
  "contributor": [
    {
      "@type": "Person",
      "givenName": "Antti",
      "familyName": "Hyttinen",
      "@id": "https://orcid.org/0000-0002-6649-3229"
    },
    {
      "@type": "Person",
      "givenName": "Juha",
      "familyName": "Karvanen",
      "@id": "https://orcid.org/0000-0001-5530-769X"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Santtu",
      "familyName": "Tikka",
      "email": "santtuth@gmail.com",
      "@id": "https://orcid.org/0000-0003-4039-4342"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "covr",
      "name": "covr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=covr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "dagitty",
      "name": "dagitty",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dagitty"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "DiagrammeR",
      "name": "DiagrammeR",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=DiagrammeR"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "DOT",
      "name": "DOT",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=DOT"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "igraph",
      "name": "igraph",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=igraph"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "knitr",
      "name": "knitr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=knitr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "mockr",
      "name": "mockr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=mockr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rmarkdown",
      "name": "rmarkdown",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rmarkdown"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "version": ">= 3.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "R",
      "name": "R",
      "version": ">= 4.0"
    },
    "2": {
      "@type": "SoftwareApplication",
      "identifier": "Rcpp",
      "name": "Rcpp",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=Rcpp"
    },
    "SystemRequirements": null
  },
  "fileSize": "9451.444KB",
  "citation": [
    {
      "@type": "ScholarlyArticle",
      "datePublished": "2021",
      "author": [
        {
          "@type": "Person",
          "givenName": "Santtu",
          "familyName": "Tikka"
        },
        {
          "@type": "Person",
          "givenName": "Antti",
          "familyName": "Hyttinen"
        },
        {
          "@type": "Person",
          "givenName": "Juha",
          "familyName": "Karvanen"
        }
      ],
      "name": "Causal Effect Identification from Multiple Incomplete Data Sources: A General Search-Based Approach",
      "identifier": "10.18637/jss.v099.i05",
      "pagination": "1--40",
      "@id": "https://doi.org/10.18637/jss.v099.i05",
      "sameAs": "https://doi.org/10.18637/jss.v099.i05",
      "isPartOf": {
        "@type": "PublicationIssue",
        "issueNumber": "5",
        "datePublished": "2021",
        "isPartOf": {
          "@type": [
            "PublicationVolume",
            "Periodical"
          ],
          "volumeNumber": "99",
          "name": "Journal of Statistical Software"
        }
      }
    },
    {
      "@type": "CreativeWork",
      "datePublished": "2024",
      "author": [
        {
          "@type": "Person",
          "givenName": "Santtu",
          "familyName": "Tikka"
        },
        {
          "@type": "Person",
          "givenName": "Antti",
          "familyName": "Hyttinen"
        },
        {
          "@type": "Person",
          "givenName": "Juha",
          "familyName": "Karvanen"
        }
      ],
      "name": "{dosearch}: Causal Effect Identification from Multiple Incomplete Data Sources",
      "identifier": "10.32614/CRAN.package.dosearch",
      "description": "{R} package version 1.0.11",
      "@id": "https://doi.org/10.32614/CRAN.package.dosearch",
      "sameAs": "https://doi.org/10.32614/CRAN.package.dosearch",
      "isPartOf": {
        "@type": "PublicationIssue",
        "datePublished": "2024",
        "isPartOf": {
          "@type": [
            "PublicationVolume",
            "Periodical"
          ],
          "name": "CRAN: Contributed Packages"
        }
      }
    }
  ],
  "releaseNotes": "https://github.com/santikka/dosearch/blob/main/NEWS.md",
  "readme": "https://github.com/santikka/dosearch/blob/master/README.md",
  "contIntegration": [
    "https://github.com/santikka/dosearch/actions/workflows/R-CMD-check.yaml",
    "https://app.codecov.io/gh/santikka/dosearch?branch=master"
  ],
  "developmentStatus": "https://www.repostatus.org/#active",
  "keywords": [
    "causal-models",
    "causal-inference",
    "causality-algorithms",
    "causality",
    "directed-acyclic-graph",
    "labeled-graphs",
    "graphs",
    "r",
    "c-plus-plus"
  ],
  "relatedLink": "https://CRAN.R-project.org/package=dosearch"
}

GitHub Events

Total
  • Watch event: 1
  • Push event: 1
Last Year
  • Watch event: 1
  • Push event: 1

Issues and Pull Requests

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: about 1 year
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • zys07 (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 325 last-month
  • Total docker downloads: 20,406
  • Total dependent packages: 1
  • Total dependent repositories: 0
  • Total versions: 10
  • Total maintainers: 1
cran.r-project.org: dosearch

Causal Effect Identification from Multiple Incomplete Data Sources

  • Versions: 10
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 325 Last month
  • Docker Downloads: 20,406
Rankings
Dependent packages count: 18.7%
Average: 27.6%
Downloads: 28.7%
Dependent repos count: 35.5%
Maintainers (1)