-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
149 lines (126 loc) · 4.8 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
```{r srr-tags, eval = FALSE, echo = FALSE}
#' @srrstats {G1.0} Primary reference is:
#' S. Tikka, A. Hyttinen and J. Karvanen.
#' "Causal effect identification from multiple incomplete data sources:
#' a general search-based approach." \emph{Journal of Statistical Software},
#' 99(5):1--40, 2021.
#' @srrstats {G1.1} First implementation of an original algorithm.
#' @srrstats {G1.2} Life Cycle Statement is included in the README.
```
# dosearch
<!-- badges: start -->
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![R-CMD-check](https://github.com/santikka/dosearch/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/santikka/dosearch/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/santikka/dosearch/branch/master/graph/badge.svg)](https://app.codecov.io/gh/santikka/dosearch?branch=master)
[![CRAN version](https://www.r-pkg.org/badges/version/dosearch)](https://CRAN.R-project.org/package=dosearch)
<!-- badges: end -->
The `dosearch` [R](https://www.r-project.org/) package facilitates
identification of causal effects from arbitrary observational and experimental
probability distributions via do-calculus and standard probability
manipulations using a search-based algorithm (Tikka et al., 2019, 2021).
Formulas of identifiable target distributions are returned in character format
using LaTeX syntax. The causal graph may additionally include mechanisms
related to:
* Selection bias (Bareinboim and Tian, 2015)
* Transportability (Bareinboim and Pearl, 2014)
* Missing data (Mohan et al., 2013)
* Context-specific independence (Corander et al., 2019)
See the package vignette or the references for further information.
### Citing the package
If you use the `dosearch` package in a publication, please cite the
corresponding paper in the Journal of Statistical Software:
Tikka S, Hyttinen A, Karvanen J (2021). “Causal Effect Identification from
Multiple Incomplete Data Sources: A General Search-Based Approach.”
*Journal of Statistical Software*, 99(5), 1--40.
[doi:10.18637/jss.v099.i05](https://doi.org/10.18637/jss.v099.i05).
## Installation
You can install the latest release version from CRAN:
```{r, eval = FALSE}
install.packages("dosearch")
```
Alternatively, you can install the latest development version of `dosearch`:
```{r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("santikka/dosearch")
```
## Examples
```{r, echo = FALSE}
library(dosearch)
```
```{r, eval = TRUE}
# back-door formula
data <- "p(x,y,z)"
query <- "p(y|do(x))"
graph <- "
x -> y
z -> x
z -> y
"
dosearch(data, query, graph)
# front-door formula
graph <- "
x -> z
z -> y
x <-> y
"
dosearch(data, query, graph)
# the 'napkin' graph
data <- "p(x,y,z,w)"
graph <- "
x -> y
z -> x
w -> z
x <-> w
w <-> y
"
dosearch(data, query, graph)
# case-control design
data <- "
p(x*,y*,r_x,r_y)
p(y)
"
graph <- "
x -> y
y -> r_y
r_y -> r_x
"
md <- "r_x : x, r_y : y"
dosearch(data, query, graph, missing_data = md)
```
## References
* Tikka S, Hyttinen A, Karvanen J (2021).
"Causal effect identification from multiple incomplete data sources: a general search-based approach."
*Journal of Statistical Software*,
99(5), 1--40.
[doi:10.18637/jss.v099.i05](https://doi.org/10.18637/jss.v099.i05)
* Tikka S, Hyttinen A, Karvanen J (2019).
"Identifying causal effects via context-specific independence relations."
In *Proceedings of the 33rd Annual Conference on Neural Information Processing Systems*.
(https://papers.nips.cc/paper/2019/hash/d88518acbcc3d08d1f18da62f9bb26ec-Abstract.html)
* Bareinboim E, Tian J (2015).
"Recovering causal effects from selection bias."
In *Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence*,
(http://ftp.cs.ucla.edu/pub/stat_ser/r445.pdf)
* Bareinboim E, Pearl J (2014).
"Transportability from multiple environments with limited Experiments: completeness Results."
In *Advances of Neural Information Processing Systems 27*,
(http://ftp.cs.ucla.edu/pub/stat_ser/r443.pdf)
* Mohan K, Pearl J, Tian J (2013).
"Graphical models for inference with missing data."
In *Advances of Neural Information Processing Systems 26*,
(http://ftp.cs.ucla.edu/pub/stat_ser/r410.pdf)
* Corander J, Hyttinen A, Kontinen J, Pensar J, Väänänen J (2019).
"A logical approach to context-specific independence."
*Annals of Pure and Applied Logic*,
170(9), 975--992.
[doi:10.1016/j.apal.2019.04.004](https://doi.org/10.1016/j.apal.2019.04.004)