jlmerclusterperm 1.0.3
CRAN release: 2023-07-19
diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml
index 0cb8e98..55e994c 100644
--- a/docs/pkgdown.yml
+++ b/docs/pkgdown.yml
@@ -10,7 +10,7 @@ articles:
reproducibility: reproducibility.html
tidying-output: tidying-output.html
jlmerclusterperm: jlmerclusterperm.html
-last_built: 2023-09-01T17:28Z
+last_built: 2023-09-06T12:45Z
urls:
reference: https://yjunechoe.github.io/jlmerclusterperm/reference
article: https://yjunechoe.github.io/jlmerclusterperm/articles
diff --git a/docs/search.json b/docs/search.json
index bf6bbb7..e6e6680 100644
--- a/docs/search.json
+++ b/docs/search.json
@@ -1 +1 @@
[{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/asynchronous-cpa.html","id":"setup","dir":"Articles","previous_headings":"","what":"Setup","title":"Asynchronous CPA","text":"Minimal example data and CPA specification: Example CPA (1000 simulations): The details of the actual CPA are skipped over, as they are not the focus of this vignette - the example is the same as the one in the readme, just with a higher nsim.","code":"library(jlmerclusterperm) jlmerclusterperm_setup(verbose = FALSE) chickweights <- ChickWeight chickweights$Time <- as.integer(factor(chickweights$Time)) chickweights_spec <- make_jlmer_spec( formula = weight ~ 1 + Diet, data = chickweights, subject = \"Chick\", time = \"Time\" ) set_rng_state(123L) CPA <- clusterpermute( chickweights_spec, threshold = 2.5, nsim = 1000, progress = FALSE )"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/asynchronous-cpa.html","id":"basic-idea","dir":"Articles","previous_headings":"","what":"Basic idea","title":"Asynchronous CPA","text":"The basic idea behind the asynchronous CPA strategy is to start a background R process whose sole job is to send instructions for the CPA to Julia. It's asynchronous because running the CPA this way does not block the interactive R session. Since the work is done in Julia anyways, the parallelization has virtually no impact on the performance of evaluating R code. The future package allows asynchronous evaluation of R code. It's a big package implementing a complex topic; you can read more about the project at futureverse - here we just show the bare minimum. For advanced users: note that we're only using the parallelization for its asynchronous properties (non-blocking evaluation of R code in a background process). It is not recommended to start multiple R processes each running a CPA, because the Julia session is shared and already multithreaded (if you can spare the cores, set options(\"jlmerclusterperm.nthreads\") before calling jlmerclusterperm_setup()).
Later updates to jlmerclusterperm may wrap this workflow in a more principled way, but for now this vignette serves as a minimally working example of asynchronously running a CPA.","code":""},{"path":[]},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/asynchronous-cpa.html","id":"setup-for-async-cpa","dir":"Articles","previous_headings":"Walkthrough","what":"Setup for async CPA","title":"Asynchronous CPA","text":"Three things are in order for this workflow: load the future package, initialize a multisession future, and grab the options from the jlmerclusterperm package environment. Please treat .jlmerclusterperm as an internal variable and a read-only object - it's unexported and not meant to be manipulated.","code":"library(future) plan(multisession) pkgopts <- as.list(jlmerclusterperm:::.jlmerclusterperm)"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/asynchronous-cpa.html","id":"creating-the-future-object","dir":"Articles","previous_headings":"Walkthrough","what":"Creating the future object","title":"Asynchronous CPA","text":"We start by creating a special object
using future::future(). You can simply pass your clusterpermute() code as the first argument to future(), with a slight modification to ensure that the background process connects to the Julia session. A template follows. First, make the pkgopts object (defined above) available to the future (via the globals argument). Then, inside the future expression, ensure that the background process shares the Julia session (via list2env(...)). A future object replicating the 1000-simulation CPA would look like the following: Once the future is created, it immediately starts executing its code in the background process. It is important not to evaluate the future object (f) directly. Instead, query it with future::resolved() - this simply tells us whether the background evaluation has completed or not: Here, we use a loop to show that the background CPA is non-blocking. We can evaluate R code in the interactive session simultaneously while the CPA is running: We reach the end of the loop in about 2.5 seconds, which is approximately how long the CPA took to complete (we confirm this in the next section). At this point the future has completed and the result is available for collection. Although our R code here was just a crude progress alert using Sys.sleep(), you can freely evaluate any R code except for functions that call Julia. You can make plots, clean data, write analyses, etc. - just don't run another CPA while one is already running, since the Julia process is shared.","code":"# Not run future( { list2env(pkgopts, jlmerclusterperm:::.jlmerclusterperm) ## Your CPA code below ## }, globals = structure(TRUE, add = \"pkgopts\") ) f <- future( { list2env(as.list(pkgopts), jlmerclusterperm:::.jlmerclusterperm) ## Your CPA code below ## set_rng_state(123L) clusterpermute( chickweights_spec, threshold = 2.5, nsim = 1000 ) }, globals = structure(TRUE, add = \"pkgopts\") ) resolved(f) #> [1] FALSE i <- 0 while (!resolved(f)) { Sys.sleep(0.5) i <<- i + 1 cat(sprintf(\"Elapsed: %.01fs\", i / 2), \"\\n\") } #> Elapsed: 0.5s #> Elapsed: 1.0s #> Elapsed: 1.5s #> Elapsed: 2.0s #> Elapsed: 2.5s resolved(f) #> [1] TRUE"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/asynchronous-cpa.html","id":"collecting-the-results","dir":"Articles","previous_headings":"Walkthrough","what":"Collecting the results","title":"Asynchronous CPA","text":"We can
collect the output of the background process with future::value(). But first, we double check to make sure that the background process has indeed finished evaluating: Once that's confirmed, we can use value() to collect the results of the future and assign it to a variable for inspection: Note that value() also prints the messages encountered while executing the CPA code. We see these because we did not specify progress = FALSE in the clusterpermute() call passed to the future. For our purposes, just note that adding up the times in the messages gives something similar to what we saw in the loop (2.5 seconds). The output of the asynchronous CPA (CPA_async) is identical to the initial CPA we ran at the beginning (CPA), as both were run with the default seed of 1 and an RNG counter value of 123L: A separate vignette covers the Julia RNG. In practice, you may want to use a random seed (via set_rng_seed()) for the background CPA.","code":"resolved(f) #> [1] TRUE CPA_async <- value(f) #> Connecting to Julia TCP server at localhost:11980 ... #> ℹ Detecting empirical clusters and calculating cluster-mass statistics. #> ✔ Detecting empirical clusters and calculating cluster-mass statistics. [75ms] #> #> ℹ Sampling cluster-mass statistics from a bootstrapped null distribution. #> ✔ Sampling cluster-mass statistics from a bootstrapped null distribution. [2s] #> #> ℹ Calculating the probability of the observed cluster-mass statistics. #> ✔ Calculating the probability of the observed cluster-mass statistics. [24ms] #> identical(CPA_async, CPA) #> [1] TRUE"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"background","dir":"Articles","previous_headings":"","what":"Background","title":"de Carvalho et al. 2021","text":"The data comes from an eye-tracking study by de Carvalho, Dautriche, Fiévet, & Christophe (2021) “Toddlers exploit referential and syntactic cues to flexibly adapt their interpretation of novel verb meanings.” This article reproduces and expands on the Experiment 2 analysis, which used a CPA (with t-tests) to conduct pairwise comparisons of three conditions (the relevant figure is Figure 7 of the paper). The data comes mostly ready for a CPA straight from the study's OSF repository. The collapsible chunk contains the code to prepare the data and reproduce the figure.
Data cleaning Code Figure 7 plots data original study (close reproduce ). E2_data_agg dataframe following four columns: Subject: Unique identifier subjects Condition: -participant factor variable three levels (\"Intransitive\", \"RightDislocated\", \"Transitive\") Time: continuous measure time 0-8000ms 50ms intervals Prop: Proportion looks target (averaged across trials within condition, subject) following reproduction Figure 7 original paper: original study used pairwise comparisons analyze relationship three conditions. smaller figures (B, C, D) plot following relationships: (B) looks target \"Transitive\" compared \"RightDislocated\" (C) looks target \"RightDislocated\" compared \"Intransitive\" (D) looks target \"Transitive\" compared \"Intransitive\" rest vignette organized follows: First, replicate individual pairwise CPAs original paper. Next, consider parsimonious analysis reformulates research hypothesis choice regression contrast. Lastly, build model (2) consider issues around model diagnostics complexity (linear vs. logistic; fixed vs. 
mixed) context CPA Load package start Julia instance jlmerclusterperm_setup() proceeding.","code":"library(dplyr) library(forcats) filepath <- \"https://osf.io/download/var2h/?view_only=2a06746ba360446e9857df73307d21be\" E2_data_raw <- readr::read_delim(filepath) E2_data <- E2_data_raw %>% filter(away != 1) %>% mutate(Target = as.integer(bin_2P == 1)) %>% mutate(Condition = fct_recode( factor(Condition, levels = c(\"intrans\", \"pros\", \"trans\")), \"Intransitive\" = \"intrans\", \"RightDislocated\" = \"pros\", \"Transitive\" = \"trans\" )) %>% select(Subject, Trial, Condition, Time, Target) E2_data_agg <- E2_data %>% group_by(Subject, Condition, Time) %>% summarize(Prop = mean(Target), .groups = \"drop\") # Unaggregated trial-level data of 1s and 0s E2_data #> # A tibble: 69,123 × 5 #> Subject Trial Condition Time Target #> #> 1 200.asc 0 RightDislocated 0 0 #> 2 200.asc 0 RightDislocated 400 0 #> 3 200.asc 0 RightDislocated 450 1 #> 4 200.asc 0 RightDislocated 500 1 #> 5 200.asc 0 RightDislocated 550 1 #> 6 200.asc 0 RightDislocated 600 1 #> 7 200.asc 0 RightDislocated 650 1 #> 8 200.asc 0 RightDislocated 700 1 #> 9 200.asc 0 RightDislocated 750 1 #> 10 200.asc 0 RightDislocated 800 1 #> # ℹ 69,113 more rows # Aggregated subject-mean proportions data used in original study E2_data_agg #> # A tibble: 11,540 × 4 #> Subject Condition Time Prop #> #> 1 200.asc RightDislocated 0 0.5 #> 2 200.asc RightDislocated 50 1 #> 3 200.asc RightDislocated 100 1 #> 4 200.asc RightDislocated 150 1 #> 5 200.asc RightDislocated 200 1 #> 6 200.asc RightDislocated 250 0.5 #> 7 200.asc RightDislocated 300 0.5 #> 8 200.asc RightDislocated 350 0.4 #> 9 200.asc RightDislocated 400 0.286 #> 10 200.asc RightDislocated 450 0.429 #> # ℹ 11,530 more rows library(ggplot2) #> Warning: package 'ggplot2' was built under R version 4.3.1 make_fig7_plot <- function(conditions) { E2_data %>% group_by(Condition, Time) %>% summarize( Prop = mean(Target), se = sqrt(var(Target) / n()), lower = Prop - se, 
upper = Prop + se, .groups = \"drop\" ) %>% filter(Condition %in% conditions) %>% ggplot(aes(Time, Prop, color = Condition, fill = Condition)) + geom_hline(aes(yintercept = .5)) + geom_ribbon( aes(ymin = lower, ymax = upper), alpha = .2, show.legend = FALSE, ) + geom_line(linewidth = 1) + scale_color_manual( aesthetics = c(\"color\", \"fill\"), values = setNames(scales::hue_pal()(3)[c(2, 1, 3)], levels(E2_data$Condition)) ) + scale_y_continuous(limits = c(.2, .8), oob = scales::oob_keep) + labs(y = NULL, x = NULL) + theme_minimal() + theme(axis.title.y = element_text(angle = 0, vjust = .5, hjust = 0)) } fig7_comparisons <- list( \"A\" = c(\"Intransitive\", \"RightDislocated\", \"Transitive\"), \"B\" = c(\"RightDislocated\", \"Transitive\"), \"C\" = c(\"Intransitive\", \"RightDislocated\"), \"D\" = c(\"Intransitive\", \"Transitive\") ) fig7 <- lapply(fig7_comparisons, make_fig7_plot) # Figure 7 combined plots library(patchwork) #> Warning: package 'patchwork' was built under R version 4.3.1 p_top <- fig7$A + scale_x_continuous(breaks = scales::breaks_width(1000)) + theme(legend.position = \"bottom\") p_bot <- (fig7$B + fig7$C + fig7$D) & guides(color = guide_none()) fig7_combined <- p_top / guide_area() / p_bot + plot_layout(guides = \"collect\", heights = c(2, .1, 1)) + plot_annotation(tag_levels = \"A\") E2_data_agg #> # A tibble: 11,540 × 4 #> Subject Condition Time Prop #> #> 1 200.asc RightDislocated 0 0.5 #> 2 200.asc RightDislocated 50 1 #> 3 200.asc RightDislocated 100 1 #> 4 200.asc RightDislocated 150 1 #> 5 200.asc RightDislocated 200 1 #> 6 200.asc RightDislocated 250 0.5 #> 7 200.asc RightDislocated 300 0.5 #> 8 200.asc RightDislocated 350 0.4 #> 9 200.asc RightDislocated 400 0.286 #> 10 200.asc RightDislocated 450 0.429 #> # ℹ 11,530 more rows fig7_combined library(jlmerclusterperm) jlmerclusterperm_setup(verbose = 
FALSE)"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"replicating-the-pairwise-cpas","dir":"Articles","previous_headings":"","what":"Replicating the pairwise CPAs","title":"de Carvalho et al. 2021","text":"three conditions experiment levels Condition factor variable: begin replication \"Transitive\" vs. \"RightDislocated\" comparison shown Figure 7B apply logic two pairwise comparisons 7C 7D.","code":"levels(E2_data_agg$Condition) #> [1] \"Intransitive\" \"RightDislocated\" \"Transitive\""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"transitive-vs--rightdislocated","dir":"Articles","previous_headings":"Replicating the pairwise CPAs","what":"Transitive vs. RightDislocated","title":"de Carvalho et al. 2021","text":"reproduced Figure 7B compares \"Transitive\" \"RightDislocated\" conditions. paper (caption Figure 7; emphasis mine) reports: transitive right-dislocated conditions differed second repetition novel verbs (~6400 ms onset test trials end trials). now replicate analysis. First, prepare specification object. Two things note : express original t-test regression model Condition predictor drop third, unused condition data factor representation Next, fit global model sanity check structure model output. get one estimate ConditionTransitive positive coefficient, ’d expect: Finally, call clusterpermute() threshold = 1.5 (original study) simulate 100 permutations: detect largest empirical cluster spanning 6400-8000ms reported original paper. 
converges around p=0.02 separate 10,000-simulation run (shown ).","code":"fig7$B spec_7B <- make_jlmer_spec( formula = Prop ~ Condition, data = E2_data_agg %>% filter(Condition %in% c(\"Transitive\", \"RightDislocated\")) %>% mutate(Condition = droplevels(Condition)), # or forcats::fct_drop() subject = \"Subject\", time = \"Time\" ) spec_7B #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: Prop ~ 1 + ConditionTransitive #> Predictors: #> Condition: ConditionTransitive #> Groupings: #> Subject: Subject #> Trial: #> Time: Time #> Data: #> # A tibble: 7,688 × 4 #> Prop ConditionTransitive Subject Time #> #> 1 0.5 0 200.asc 0 #> 2 1 0 200.asc 50 #> 3 1 0 200.asc 100 #> # ℹ 7,685 more rows #> ──────────────────────────────────────────────────────────────────────────────── jlmer(spec_7B) #> #> ────────────────────────────────────────────────────────────────────────────────── #> Coef. Std. Error z Pr(>|z|) Lower 95% Upper 95% #> ────────────────────────────────────────────────────────────────────────────────── #> (Intercept) 0.592999 0.00364424 162.72 <1e-99 0.585856 0.600141 #> ConditionTransitive 0.0569017 0.0051591 11.03 <1e-27 0.04679 0.0670133 #> ────────────────────────────────────────────────────────────────────────────────── clusterpermute(spec_7B, threshold = 1.5, nsim = 100L, progress = FALSE) #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 1.5) ──────────── ── #> ConditionTransitive (n = 100) #> Mean (SD): -2.674 (27.83) #> Coverage intervals: 95% [-61.740, 46.439] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 1.5) ──────────────────────── ── #> ConditionTransitive #> [300, 1150]: 40.643 (p=0.1287) #> [2700, 2750]: -4.068 (p=0.9010) #> [6150, 6200]: 3.249 (p=0.9505) #> [6400, 8000]: 88.742 (p=0.0198) #> 
────────────────────────────────────────────────────────────────────────────────"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"rightdislocated-vs--intransitive","dir":"Articles","previous_headings":"Replicating the pairwise CPAs","what":"RightDislocated vs. Intransitive","title":"de Carvalho et al. 2021","text":"reproduced Figure 7C compares \"RightDislocated\" \"Intransitive\" conditions. paper reports: intransitive right-dislocated conditions differed first repetition novel verbs (2100 ms 3500 ms beginning test trials). repeat CPA procedure pairwise comparison: largest empirical cluster detect spans 2150-3650ms, slightly different cluster reported original paper (2100-3500ms). relatively less “extreme” cluster converges around p=0.05 10,000-simulation run.","code":"fig7$C spec_7C <- make_jlmer_spec( formula = Prop ~ Condition, data = E2_data_agg %>% filter(Condition %in% c(\"RightDislocated\", \"Intransitive\")) %>% mutate(Condition = droplevels(Condition)), subject = \"Subject\", time = \"Time\" ) clusterpermute(spec_7C, threshold = 1.5, nsim = 100L, progress = FALSE) #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 1.5) ──────────── ── #> ConditionRightDislocated (n = 100) #> Mean (SD): 0.619 (33.26) #> Coverage intervals: 95% [-63.734, 67.326] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 1.5) ──────────────────────── ── #> ConditionRightDislocated #> [150, 200]: 3.562 (p=0.9010) #> [2150, 3650]: 72.842 (p=0.0297) #> [4550, 5050]: 23.836 (p=0.3960) #> [5150, 5350]: 8.589 (p=0.7723) #> [5700, 5750]: 3.332 (p=0.9208) #> ────────────────────────────────────────────────────────────────────────────────"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"transitive-vs--intransitive","dir":"Articles","previous_headings":"Replicating the pairwise 
CPAs","what":"Transitive vs. Intransitive","title":"de Carvalho et al. 2021","text":"reproduced Figure 7D compares \"Transitive\" \"Intransitive\" conditions. paper reports: transitive intransitive conditions differed slightly offset first sentence test trials (4500 ms beginning test trials end trials). repeat CPA procedure pairwise comparison: largest empirical cluster detect spans 4600-8000ms, slightly different cluster reported original paper (4500-8000ms). converges around p=0.001 separate 10,000-simulation run.","code":"fig7$D spec_7D <- make_jlmer_spec( formula = Prop ~ Condition, data = E2_data_agg %>% filter(Condition %in% c(\"Transitive\", \"Intransitive\")) %>% mutate(Condition = droplevels(Condition)), subject = \"Subject\", time = \"Time\" ) clusterpermute(spec_7D, threshold = 1.5, nsim = 100L, progress = FALSE) #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 1.5) ──────────── ── #> ConditionTransitive (n = 100) #> Mean (SD): 2.024 (41.73) #> Coverage intervals: 95% [-70.510, 88.501] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 1.5) ──────────────────────── ── #> ConditionTransitive #> [300, 1000]: 25.928 (p=0.4356) #> [1300, 1400]: 4.808 (p=0.8713) #> [2800, 2900]: 5.356 (p=0.8218) #> [3000, 4300]: 52.638 (p=0.2079) #> [4600, 8000]: 172.635 (p=0.0198) #> ────────────────────────────────────────────────────────────────────────────────"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"expressed-as-regression-contrasts","dir":"Articles","previous_headings":"","what":"Expressed as regression contrasts","title":"de Carvalho et al. 2021","text":"now consider parsimonious analysis translates research hypothesis contrast coding avoid multiple testing. 
Specifically, exploit fact original paper specifies hypothesis Intransitive < RightDislocated < Transitive.","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"helmert-deviation-coding","dir":"Articles","previous_headings":"Expressed as regression contrasts","what":"Helmert (deviation) coding","title":"de Carvalho et al. 2021","text":"Testing ordinal relationship levels category require possible pairwise comparisons; instead, can approximated via Helmert coding (.k.. deviance coding) K levels expressed K-1 contrasts, contrast successively comparing level vs. average previous (typically lower) levels. Critically, Helmert contrasts orthogonal, can test simultaneously single model. data, test ordinal relationship Intransitive < RightDislocated < Transitive via two contrasts: \"RightDislocated\" vs. \"Intransitive\" \"Transitive\" vs. average \"RightDislocated\" \"Intransitive\" corresponding numerical coding following: practice, Helmert contrasts often standardized deviations expressed unit 1. also comparison \"RightDislocated\" vs. \"Intransitive\" expressed 1 unit RvsI comparison \"Transitive\" vs. average \"RightDislocated\" \"Intransitive\" expressed 1 unit TvsRI: contrast matrix, make new column original data called ConditionHelm copying Condition column, apply contrasts new column: Lastly, build new specification making use full data. , predict Prop ConditionHelm estimate effect contrasts single model. sanity check, fit global model - expect estimate contrast indeed find RvsI TvsRI output positive coefficients. suggests ordinal relationship three conditions hold least globally. Note coefficient RvsI contrast exactly pairwise model using spec_7C earlier, also compared \"RightDislocated\" \"Intransitive\": spec_helm two conditions RvsI coded -0.5 0.5, spec_7C’s treatment coding coded 0 1. 
Since express difference \"Intransitive\" \"RightDislocated\" unit +1, two coefficients equal magnitude sign.","code":"condition_helm <- contr.helmert(3) colnames(condition_helm) <- c(\"RvsI\", \"TvsRI\") condition_helm #> RvsI TvsRI #> 1 -1 -1 #> 2 1 -1 #> 3 0 2 condition_helm[, 1] <- condition_helm[, 1] / 2 condition_helm[, 2] <- condition_helm[, 2] / 3 condition_helm #> RvsI TvsRI #> 1 -0.5 -0.3333333 #> 2 0.5 -0.3333333 #> 3 0.0 0.6666667 E2_data_agg$ConditionHelm <- E2_data_agg$Condition contrasts(E2_data_agg$ConditionHelm) <- condition_helm # For pretty-printing as fractions MASS::fractions(contrasts(E2_data_agg$ConditionHelm)) #> RvsI TvsRI #> Intransitive -1/2 -1/3 #> RightDislocated 1/2 -1/3 #> Transitive 0 2/3 spec_helm <- make_jlmer_spec( formula = Prop ~ ConditionHelm, data = E2_data_agg, subject = \"Subject\", time = \"Time\" ) spec_helm #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: Prop ~ 1 + ConditionHelmRvsI + ConditionHelmTvsRI #> Predictors: #> ConditionHelm: ConditionHelmRvsI, ConditionHelmTvsRI #> Groupings: #> Subject: Subject #> Trial: #> Time: Time #> Data: #> # A tibble: 11,540 × 5 #> Prop ConditionHelmRvsI ConditionHelmTvsRI Subject Time #> #> 1 0.5 0.5 -0.333 200.asc 0 #> 2 1 0.5 -0.333 200.asc 50 #> 3 1 0.5 -0.333 200.asc 100 #> # ℹ 11,537 more rows #> ──────────────────────────────────────────────────────────────────────────────── jlmer(spec_helm) %>% tidy(effects = \"fixed\") #> # A tibble: 3 × 5 #> term estimate std.error statistic p.value #> #> 1 (Intercept) 0.591 0.00215 275. 0 #> 2 ConditionHelmRvsI 0.0640 0.00526 12.2 5.67e-34 #> 3 ConditionHelmTvsRI 0.0889 0.00457 19.5 2.02e-84 jlmer(spec_7C) %>% tidy(effects = \"fixed\") #> # A tibble: 2 × 5 #> term estimate std.error statistic p.value #> #> 1 (Intercept) 0.529 0.00379 140. 
0 #> 2 ConditionRightDislocated 0.0640 0.00536 11.9 8.30e-33"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"interpreting-cpa-results","dir":"Articles","previous_headings":"Expressed as regression contrasts","what":"Interpreting CPA results","title":"de Carvalho et al. 2021","text":"proceed normal clusterpermute() using new spec_helm: ’s summary find CPA_helm: RvsI clusters similar detected previous “partial” CPA spec_7C. slight differences cluster-mass due part due Helmert-coded model simultaneously estimating TvsRI contrast. importantly, CPA_helm detects largest cluster \"RightDislocated\" \"Intransitive\" (2150-3650ms). converges around p=0.05 10,000-simulation run. TvsRI clusters new, largest cluster predictor spans 5800ms-8000ms. converges around p=0.01 separate 10,000-simulation run. cluster effectively captures region relationship Transitive > (RightDislocated & Intransitive) emerges robust. Essentially, TvsRI comparison line \"Transitive\" invisible line runs lines \"Intransitive\" \"RightDislocated\". conclude visualizing clusters two Helmert-coded terms, annotated empirical data. 
“invisible” line RI Helmert coding drawn dashed line.","code":"reset_rng_state() CPA_helm <- clusterpermute(spec_helm, threshold = 1.5, nsim = 100L, progress = FALSE) CPA_helm #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 1.5) ──────────── ── #> ConditionHelmRvsI (n = 100) #> Mean (SD): 4.019 (28.80) #> Coverage intervals: 95% [-45.805, 67.827] #> ConditionHelmTvsRI (n = 100) #> Mean (SD): 1.213 (24.59) #> Coverage intervals: 95% [-50.218, 47.462] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 1.5) ──────────────────────── ── #> ConditionHelmRvsI #> [150, 200]: 3.654 (p=0.8911) #> [2150, 3650]: 71.141 (p=0.0297) #> [4550, 5350]: 36.043 (p=0.2079) #> [5700, 5750]: 3.328 (p=0.9010) #> ConditionHelmTvsRI #> [300, 1050]: 35.863 (p=0.1881) #> [3500, 3700]: 8.348 (p=0.6436) #> [3850, 4250]: 15.985 (p=0.4455) #> [4800, 5600]: 36.143 (p=0.1782) #> [5800, 8000]: 123.631 (p=0.0099) #> ──────────────────────────────────────────────────────────────────────────────── fig7A_v2 <- fig7$A + geom_line( aes(Time, Prop, linetype = \"RI\"), inherit.aes = FALSE, data = . %>% filter(Condition %in% c(\"RightDislocated\", \"Intransitive\")) %>% group_by(Time) %>% summarize(Prop = mean(Prop)), ) + scale_linetype_manual(values = \"41\", guide = guide_legend(\"ConditionHelm\")) + guides(x = guide_none(\"\")) clusters_annotation <- tidy(CPA_helm$empirical_clusters) %>% mutate(contrast = gsub(\".*([TR])vs([RI]+)\", \"\\\\1 vs. 
\\\\2\", predictor)) %>% ggplot(aes(y = fct_rev(contrast))) + geom_segment( aes( x = start, xend = end, yend = contrast, color = pvalue < 0.05 ), linewidth = 8 ) + scale_color_manual(values = c(\"grey70\", \"steelblue\")) + scale_y_discrete() + scale_x_continuous(n.breaks = 9, limits = range(E2_data_agg$Time)) + theme_minimal() + theme( axis.title = element_blank(), axis.text.y = element_text(face = \"bold\"), panel.border = element_rect(fill = NA), panel.grid.major.y = element_blank() ) fig7A_v2 / clusters_annotation & plot_layout(heights = c(4, 1))"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"model-complexity","dir":"Articles","previous_headings":"","what":"Model complexity","title":"de Carvalho et al. 2021","text":"wrap case study considering complex CPA uses logistic mixed effects models trial-level data fixations target (1s 0s).","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"logistic-mixed-model","dir":"Articles","previous_headings":"Model complexity","what":"Logistic mixed model","title":"de Carvalho et al. 2021","text":"un-aggregated trial-level data use section E2_data, comes initial data preparation code chunk: apply Helmert/deviation-coded contrast matrix: specification object E2_data, add trial = \"Trial\" predict Target instead Prop. also add -subject random intercept formula: , CPA family = \"binomial\": now visualize results CPA_helm_complex CPA_helm side side: results largely , except largest, significant cluster identified TvsRI extends much complex CPA simple CPA. 
examine difference next.","code":"E2_data #> # A tibble: 69,123 × 5 #> Subject Trial Condition Time Target #> #> 1 200.asc 0 RightDislocated 0 0 #> 2 200.asc 0 RightDislocated 400 0 #> 3 200.asc 0 RightDislocated 450 1 #> 4 200.asc 0 RightDislocated 500 1 #> 5 200.asc 0 RightDislocated 550 1 #> 6 200.asc 0 RightDislocated 600 1 #> 7 200.asc 0 RightDislocated 650 1 #> 8 200.asc 0 RightDislocated 700 1 #> 9 200.asc 0 RightDislocated 750 1 #> 10 200.asc 0 RightDislocated 800 1 #> # ℹ 69,113 more rows E2_data$ConditionHelm <- E2_data$Condition contrasts(E2_data$ConditionHelm) <- condition_helm MASS::fractions(contrasts(E2_data$ConditionHelm)) #> RvsI TvsRI #> Intransitive -1/2 -1/3 #> RightDislocated 1/2 -1/3 #> Transitive 0 2/3 spec_helm_complex <- make_jlmer_spec( formula = Target ~ ConditionHelm + (1 | Subject), data = E2_data, subject = \"Subject\", trial = \"Trial\", time = \"Time\" ) spec_helm_complex #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: Target ~ 1 + ConditionHelmRvsI + ConditionHelmTvsRI + (1 | Subject) #> Predictors: #> ConditionHelm: ConditionHelmRvsI, ConditionHelmTvsRI #> Groupings: #> Subject: Subject #> Trial: Trial #> Time: Time #> Data: #> # A tibble: 69,123 × 6 #> Target ConditionHelmRvsI ConditionHelmTvsRI Subject Trial Time #> #> 1 0 0.5 -0.333 200.asc 0 0 #> 2 0 0.5 -0.333 200.asc 0 400 #> 3 1 0.5 -0.333 200.asc 0 450 #> # ℹ 69,120 more rows #> ──────────────────────────────────────────────────────────────────────────────── reset_rng_state() CPA_helm_complex <- clusterpermute( spec_helm_complex, family = \"binomial\", threshold = 1.5, nsim = 100, progress = FALSE ) CPA_helm_complex #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 1.5) ──────────── ── #> ConditionHelmRvsI (n = 100) #> Mean (SD): 1.791 (31.12) #> Coverage intervals: 95% [-60.956, 80.601] #> ConditionHelmTvsRI (n = 100) #> Mean (SD): 2.704 (28.90) #> Coverage intervals: 95% [-54.636, 62.925] #> 
──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 1.5) ──────────────────────── ── #> ConditionHelmRvsI #> [2100, 3450]: 69.563 (p=0.0495) #> [4200, 5050]: 40.460 (p=0.1584) #> [5150, 5350]: 8.379 (p=0.7228) #> ConditionHelmTvsRI #> [350, 800]: 19.358 (p=0.4158) #> [900, 1050]: 7.519 (p=0.6535) #> [1250, 1350]: 4.980 (p=0.7228) #> [3100, 4450]: 56.770 (p=0.0792) #> [4650, 8000]: 181.165 (p=0.0099) #> ──────────────────────────────────────────────────────────────────────────────── clusters_annotation2 <- tidy(CPA_helm_complex$empirical_clusters) %>% mutate(contrast = gsub(\".*([TR])vs([RI]+)\", \"\\\\1 vs. \\\\2\", predictor)) %>% ggplot(aes(y = fct_rev(contrast))) + geom_segment( aes( x = start, xend = end, yend = contrast, color = pvalue < 0.05 ), linewidth = 8 ) + scale_color_manual(values = c(\"grey70\", \"steelblue\")) + scale_y_discrete() + scale_x_continuous(n.breaks = 9, limits = range(E2_data$Time)) + theme_minimal() + theme( legend.position = \"bottom\", axis.title = element_blank(), axis.text.y = element_text(face = \"bold\"), panel.border = element_rect(fill = NA), panel.grid.major.y = element_blank() ) clusters_annotation / clusters_annotation2 & guides(color = guide_none()) & plot_annotation(tag_levels = list(c(\"Simple\", \"Complex\")))"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html","id":"comparison-of-cpas","dir":"Articles","previous_headings":"Model complexity","what":"Comparison of CPAs","title":"de Carvalho et al. 2021","text":"Looking timewise statistics computed simple vs. complex CPA tells us . figure plots information calls compute_timewise_statistics(): Whereas largest cluster starts emerge 5800ms CPA_helm, emerges much earlier 4650ms CPA_helm_complex. zoom region around 5800ms, see timewise statistics T vs. RI CPA_helm suddenly dip 1.5 threshold 5750ms: CPA better? 
existence dips spikes indicate problem, ’s consistent expectation simple CPA less robust variance. can inspect time-point model 5750ms two CPAs fitting : ’s standard way comparing goodness fit linear fixed-effects model logistic mixed-effects model fitted different data. complex model outperforms simple model classic metrics inspect glance(). doesn’t come surprise, differences largely driven number observations (nobs). Determining appropriate degree model complexity CPA beyond scope vignette, pursue discussion . Instead, conclude old wisdom: chain strong weakest link.","code":"# Compute the timewise statistics from the CPA specifications empirical_statistics <- bind_rows( tidy(compute_timewise_statistics(spec_helm)), tidy(compute_timewise_statistics(spec_helm_complex, family = \"binomial\")), .id = \"spec\" ) %>% mutate( CPA = c(\"Simple\", \"Complex\")[as.integer(spec)], Contrast = gsub(\".*([TR])vs([RI]+)\", \"\\\\1 vs. \\\\2\", predictor) ) # Time series plot of the statistics, with a line for each Helmert contrasts empirical_statistics_fig <- ggplot(empirical_statistics, aes(time, statistic)) + geom_line(aes(color = Contrast, linetype = CPA), linewidth = 1, alpha = .7) + geom_hline(yintercept = c(-1.5, 1.5), linetype = 2) + theme_classic() empirical_statistics_fig empirical_statistics_fig + geom_vline(xintercept = 5750) + coord_cartesian(xlim = 5750 + c(-500, 500), ylim = c(1, 2.5)) jlmer_simple_5750 <- to_jlmer( formula = Prop ~ ConditionHelm, data = E2_data_agg %>% filter(Time == 5750) ) jlmer_complex_5750 <- to_jlmer( formula = Target ~ ConditionHelm + (1 | Subject), family = \"binomial\", data = E2_data %>% filter(Time == 5750) ) glance(jlmer_simple_5750) #> # A tibble: 1 × 8 #> nobs df sigma logLik AIC BIC deviance df.residual #> #> 1 72 4 0.223 7.51 -7.02 2.08 3.42 68 glance(jlmer_complex_5750) #> # A tibble: 1 × 8 #> nobs df sigma logLik AIC BIC deviance df.residual #> #> 1 448 4 NA -301. 609. 626. 601. 
444"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"background","dir":"Articles","previous_headings":"","what":"Background","title":"Garrison et al. 2020","text":"data comes eye-tracking study Garrison, Baudet, Breitfield, Aberman, & Bergelson (2020) “Familiarity plays small role noun comprehension 12-18 months”, way peekbankr corpus package. original study used growth curve model analyze children’s looks object familiar (“”) vs. unfamiliar (“”) different ages (median-split “younger” “older” groups). replicating specific analysis . Instead, follow Katie Von Holzen’s tutorial used dataset conduct simpler, cluster-based permutation analysis (CPA) differences two age groups proportion looks target. tutorial used eyetrackingR package data preparation CPA, vignette use dplyr jlmerclusterperm tasks, respectively. following set-code reads data prepares analysis: data ts_data_agg ’ll use replicate tutorial just 4 columns relevant cluster-permutation analysis: subject_id: Unique identifier subjects age: -participant factor variable (\"Younger\", \"Older\") time: continuous measure time 0-3975ms 25ms intervals prop: -subject proportion looks target object following replicates figure tutorial introduces cluster-permutation analysis:","code":"library(dplyr) remotes::install_github(\"langcog/peekbankr\", quiet = TRUE) library(peekbankr) # Load data aoi_timepoints <- get_aoi_timepoints(dataset_name = \"garrison_bergelson_2020\") administrations <- get_administrations(dataset_name = \"garrison_bergelson_2020\") # Pre-processing ## Filter for age groups of interest ps_data <- aoi_timepoints %>% left_join(administrations, by = \"administration_id\") %>% filter(!between(age, 14, 16)) %>% mutate(age_binned = ifelse(age < 14, \"Younger\", \"Older\")) ## Filter for time window of interest ts_window <- ps_data |> filter(t_norm >= 0, t_norm < 4000) ## Identify trials to exclude (trackloss in >25% of samples) to_exclude <- ts_window |> 
group_by(subject_id, trial_id) |> summarize(prop = mean(aoi == \"missing\"), .groups = \"drop\") |> filter(prop > 0.25) ## Exclude disqualifying trials and keep information relevant for CPA ts_data <- ts_window |> anti_join(to_exclude, by = c(\"subject_id\", \"trial_id\")) |> mutate( target = aoi == \"target\", missing = aoi == \"missing\", age = factor(age_binned, c(\"Younger\", \"Older\")) ) |> select(subject_id, age, trial_id, time = t_norm, target, missing) # Data of subject mean proportions of fixations to target ts_data_agg <- ts_data |> group_by(subject_id, age, time) |> summarize(prop = sum(target) / sum(!missing), .groups = \"drop\") # Data of trial-level fixations to target (0s and 1s) ts_data_trials <- ts_data %>% filter(!missing) %>% mutate(target = as.integer(target)) %>% select(subject_id, age, trial_id, time, target) ts_data_agg #> # A tibble: 2,560 × 4 #> subject_id age time prop #> #> 1 1413 Older 0 0.7 #> 2 1413 Older 25 0.733 #> 3 1413 Older 50 0.7 #> 4 1413 Older 75 0.7 #> 5 1413 Older 100 0.677 #> 6 1413 Older 125 0.677 #> 7 1413 Older 150 0.645 #> 8 1413 Older 175 0.645 #> 9 1413 Older 200 0.581 #> 10 1413 Older 225 0.548 #> # ℹ 2,550 more rows library(ggplot2) tutorial_fig <- ggplot(ts_data_agg, aes(x = time, y = prop, color = age)) + stat_smooth(method = \"loess\", span = 0.1, se = TRUE, aes(fill = age), alpha = 0.3) + theme_bw() + labs(y = \"Prop. of Target Looks\") + geom_hline(yintercept = .5, color = \"black\") + geom_vline(xintercept = 0, color = \"black\") + coord_cartesian(ylim = c(0, 1)) + theme( legend.position = \"bottom\", legend.title = element_blank() ) tutorial_fig"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"outline","dir":"Articles","previous_headings":"","what":"Outline","title":"Garrison et al. 
2020","text":"case study vignette showcases four features CPA jlmerclusterperm: Prepping data CPA using make_jlmer_spec() CPA one go clusterpermute() modular, step--step approach CPA Using logistic regression trial-level data 1s 0s Load package start Julia instance jlmerclusterperm_setup() proceeding.","code":"library(jlmerclusterperm) jlmerclusterperm_setup(verbose = FALSE)"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"a-prepping-a-specification-object","dir":"Articles","previous_headings":"","what":"A) Prepping a specification object","title":"Garrison et al. 2020","text":"Conducting cluster-based permutation analysis jlmerclusterm starts creating special specification object, compiles information necessary CPA. function make_jlmer_spec() returns specification object class minimally requires two arguments: formula: R regression model formula data: data frame example, model prop using age predictor ts_data_agg data: printed display includes four pieces information specification object: Formula: model formula Julia syntax. , looks similar formula provided, contrasts “spelled ” (age becomes ageOlder) Predictors: list predictors form ( original : expanded ). case, one predictor age expanded ageOlder ’s treatment coded “Younger” reference level. Groupings: specified call make_jlmer_spec(). empty simple_spec provided formula data. Data: transformed dataframe whose columns terms expanded formula. Note default dummy coding applied discrete variable age - now ageOlder 0 “Younger” 1 “Older”. start bare specification object , lacks parts CPA, can use jlmerclusterperm’s simple interface Julia regression models. function jlmer() takes specification object input returns Julia regression model. model represents “global” model fitted entire data, collapsing across time. equivalent lm() model specifications: raise tangent jlmer() recommend sanity-checking design output global model prior using model specifications CPA. 
regression models fit Julia “hood”, want make sure output comparable expect R. global model can also tell upper bound model complexity. example, global model singular fit, time-wise models fitted subsets data likely . Functions CPA also take object require information time (calculate time-wise statistics) subject/trial (bootstrapped permutation). ts_data_agg data trial-level information summarized subject, just leave . trial argument can omitted object time sample/bin uniquely identified subject. Otherwise, trial column data uniquely identifies time values within subject (example, column items counterbalanced designed participant sees every item one variants).","code":"simple_spec <- make_jlmer_spec(formula = prop ~ 1 + age, data = ts_data_agg) simple_spec #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: prop ~ 1 + ageOlder #> Predictors: #> age: ageOlder #> Data: #> # A tibble: 2,560 × 2 #> prop ageOlder #> #> 1 0.7 1 #> 2 0.733 1 #> 3 0.7 1 #> # ℹ 2,557 more rows #> ──────────────────────────────────────────────────────────────────────────────── global_jmod <- jlmer(simple_spec) tidy(global_jmod) #> # A tibble: 2 × 5 #> term estimate std.error statistic p.value #> #> 1 (Intercept) 0.523 0.00387 135. 0 #> 2 ageOlder 0.0984 0.00548 18.0 3.45e-72 library(broom) # for the `tidy()` method for `lm()` global_rmod <- lm(formula = prop ~ 1 + age, data = ts_data_agg) tidy(global_rmod) #> # A tibble: 2 × 5 #> term estimate std.error statistic p.value #> #> 1 (Intercept) 0.523 0.00387 135. 
0 #> 2 ageOlder 0.0984 0.00548 18.0 4.45e-68 ts_data_agg_spec <- make_jlmer_spec( formula = prop ~ 1 + age, data = ts_data_agg, subject = \"subject_id\", time = \"time\", trial = NULL # This is the default value ) ts_data_agg_spec #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: prop ~ 1 + ageOlder #> Predictors: #> age: ageOlder #> Groupings: #> Subject: subject_id #> Trial: #> Time: time #> Data: #> # A tibble: 2,560 × 4 #> prop ageOlder subject_id time #> #> 1 0.7 1 1413 0 #> 2 0.733 1 1413 25 #> 3 0.7 1 1413 50 #> # ℹ 2,557 more rows #> ────────────────────────────────────────────────────────────────────────────────"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"b-cpa-with-clusterpermute","dir":"Articles","previous_headings":"","what":"B) CPA with clusterpermute()","title":"Garrison et al. 2020","text":"clusterpermute() function runs CPA one go, using information object. addition specification object, must also supply (t-value) threshold number simulations run: result list two elements: object holds information distribution bootstrapped cluster-mass statistics random permutations data: object holds per-predictor information clusters detected, p-values significance testing null: output explained next section, presented purposes simply note results similar reported Katie Von Holzen’s tutorial, copied . 
minor numerical differences due stochastic nature permutation test ’re comparing t-tests R regression models Julia.","code":"CPA_agg <- clusterpermute( ts_data_agg_spec, threshold = 2, nsim = 100, progress = FALSE # Suppress printing of progress for vignette ) CPA_agg #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 2) ────────────── ── #> ageOlder (n = 100) #> Mean (SD): -0.844 (29.71) #> Coverage intervals: 95% [-46.664, 71.698] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 2) ────────────────────────── ── #> ageOlder #> [850, 1525]: 74.724 (p=0.0396) #> [2675, 2775]: 11.017 (p=0.4356) #> [3500, 3900]: 46.295 (p=0.0990) #> ──────────────────────────────────────────────────────────────────────────────── #> ## Summary of Clusters ====== #> ## Cluster Direction SumStatistic StartTime EndTime Probability #> ## 1 1 Positive 74.723687 850 1550 0.051 #> ## 2 2 Positive 6.905622 2725 2800 0.405 #> ## 3 3 Positive 46.294644 3500 3925 0.111"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"c-a-step-by-step-approach","dir":"Articles","previous_headings":"","what":"C) A step-by-step approach","title":"Garrison et al. 2020","text":"clusterpermute() function showcased consists five parts, also standalone functions package: compute_timewise_statistics() extract_empirical_clusters() permute_timewise_statistics() extract_null_cluster_dists() calculate_clusters_pvalues() functions correspond algorithmic steps statistical test. clusterpermute() obviates need think individual components, knowing gives greater flexibility procedure. section walks CPA step--step using functions.","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"empirical-clusters","dir":"Articles","previous_headings":"C) A step-by-step approach","what":"1) Empirical clusters","title":"Garrison et al. 
2020","text":"compute_timewise_statistics() function takes object returns timewise statistics form matrix row predictor column time point: example see 25ms, t-value age -0.127. consistent figure tutorial replicated show two lines mostly overlapping. Just demonstration purposes, use to_jlmer() quickly fit model using formula just data 25ms. see t-value age model exactly saw : construct empirical clusters timewise statistics. Clusters defined continuous span statistics sign whose absolute value passes certain threshold. extract_empirical_clusters() function takes timewise statistics threshold just : t-value threshold 2, detect three clusters data main effect age (ageOlder). numbers brackets show span cluster number right show sum t-values cluster (.k.. cluster-mass statistic). can collect object data frame tidy(): add figure : point know whether size clusters observe (un)likely emerge chance. calculate probability, need simulate null distribution.","code":"empirical_statistics <- compute_timewise_statistics(ts_data_agg_spec) dim(empirical_statistics) #> [1] 1 160 empirical_statistics[1, 1:5, drop = FALSE] # First 5 time points #> Time #> Predictor 0 25 50 75 100 #> ageOlder 0.1166025 -0.1273179 -0.2387778 -0.4022095 -0.2804397 to_jlmer(prop ~ 1 + age, data = filter(ts_data_agg, time == 25)) %>% tidy() %>% filter(term == \"ageOlder\") %>% pull(statistic) #> [1] -0.1273179 empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 2) empirical_clusters #> ── Empirical clusters (t > 2) ────────────────────────── ── #> ageOlder #> [850, 1525]: 74.724 #> [2675, 2775]: 11.017 #> [3500, 3900]: 46.295 #> ──────────────────────────────────────────────────────────────────────────────── empirical_clusters_df <- tidy(empirical_clusters) empirical_clusters_df #> # A tibble: 3 × 6 #> predictor id start end length sum_statistic #> #> 1 ageOlder 1 850 1525 28 74.7 #> 2 ageOlder 2 2675 2775 5 11.0 #> 3 ageOlder 3 3500 3900 17 46.3 tutorial_fig + geom_segment( 
aes(x = start, xend = end, y = -Inf, yend = -Inf), linewidth = 10, color = \"steelblue\", inherit.aes = FALSE, data = empirical_clusters_df ) + geom_text( aes( y = -Inf, x = start + (end - start) / 2, label = paste(\"Cluster\", id) ), vjust = -2, inherit.aes = FALSE, data = empirical_clusters_df )"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"null-distribution","dir":"Articles","previous_headings":"C) A step-by-step approach","what":"2) Null distribution","title":"Garrison et al. 2020","text":"permute_timewise_statistics() function takes specification object number bootstrapped permutation data simulate (via nsim argument). returned 3-dimensional array like output compute_timewise_statistics() except additional dimension representing simulation: permutation algorithm preserves grouping structures data specified specification object. example, testing -subject predictor like age, age groups participants random swapped preserving temporal structure data. null distribution collection largest cluster-mass statistic simulated data. clusters detected simulation, contributes cluster-mass zero null. use extract_null_cluster_dists() construct null distribution, using threshold value 2 allow comparison empirical clusters: printed, returned object shows summary statistics. 
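The cluster-mass logic described above (a cluster is a run of same-sign statistics whose absolute value exceeds the threshold; a simulation with no cluster contributes zero to the null) can be sketched in base R with `rle()`. This is an illustration of the idea, not the package's actual implementation; the toy matrix stands in for the sims-by-time slice of `null_statistics`:

```r
# Largest cluster-mass of one timewise-statistic series
largest_cluster_mass <- function(stats, threshold = 2) {
  # Sign of each statistic, zeroed out below the threshold
  signs <- ifelse(abs(stats) > threshold, sign(stats), 0)
  runs <- rle(signs)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  # Sum of statistics within each supra-threshold run
  masses <- mapply(function(s, e, v) if (v == 0) 0 else sum(stats[s:e]),
                   starts, ends, runs$values)
  if (all(masses == 0)) 0 else masses[which.max(abs(masses))]
}

# Null distribution: largest cluster-mass per simulated series
sim_stats <- rbind(c(0.5, 2.5, 3.0, 1.0),
                   c(-2.1, -2.2, 0.3, 1.9))
apply(sim_stats, 1, largest_cluster_mass, threshold = 2)  # 5.5, -4.3
```

The first row has one positive cluster (2.5 + 3.0); the second has one negative cluster (-2.1 + -2.2).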
can use tidy() collect samples null data frame: null distribution can plotted like : Now left test probability observing cluster-mass statistics extreme detected clusters empirical_clusters, using null_cluster_dists reference.","code":"null_statistics <- permute_timewise_statistics(ts_data_agg_spec, nsim = 100) # simulation by time by predictor dim(null_statistics) #> [1] 100 160 1 # First 5 time points of the first three simulations null_statistics[1:3, 1:5, 1, drop = FALSE] #> , , Predictor = ageOlder #> #> Time #> Sim 0 25 50 75 100 #> 001 -0.1037696 0.1387838 0.4583545 0.4337538 0.69136613 #> 002 1.4358330 1.2745381 1.2985888 0.9111359 1.16401311 #> 003 0.2545470 0.3885999 0.3374939 0.3200497 0.05603617 null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 2) null_cluster_dists #> ── Null cluster-mass distribution (t > 2) ────────────── ── #> ageOlder (n = 100) #> Mean (SD): -6.150 (40.29) #> Coverage intervals: 95% [-114.564, 74.425] #> ──────────────────────────────────────────────────────────────────────────────── null_cluster_dists_df <- tidy(null_cluster_dists) null_cluster_dists_df #> # A tibble: 100 × 6 #> predictor start end length sum_statistic sim #> #> 1 ageOlder NA NA NA 0 001 #> 2 ageOlder NA NA NA 0 002 #> 3 ageOlder 2175 2225 3 -6.37 003 #> 4 ageOlder NA NA NA 0 004 #> 5 ageOlder 3425 3475 3 -6.67 005 #> 6 ageOlder NA NA NA 0 006 #> 7 ageOlder NA NA NA 0 007 #> 8 ageOlder NA NA NA 0 008 #> 9 ageOlder NA NA NA 0 009 #> 10 ageOlder 1525 2475 39 -114. 010 #> # ℹ 90 more rows plot( density(null_cluster_dists_df$sum_statistic), main = \"Null distribution of cluster-mass statistics\" )"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"significance-test","dir":"Articles","previous_headings":"C) A step-by-step approach","what":"3) Significance test","title":"Garrison et al. 
2020","text":"calculate_clusters_pvalues() function computes p-value empirical cluster returns augmented object: , p-values slightly different test stochastic. separate vignette covers issues replicability, seed, rng. collected tidy(), p-values added column: Note reporting pvalues, ’s customary add 1 numerator denominator. observed data, however unlikely, one particular arrangement (permutation) data. controlled via add1 argument - false default calculate_clusters_pvalues() true clusterpermute(). now come full circle: pvalue-augmented object, along object, also clusterpermute() function returned .","code":"empirical_clusters_tested <- calculate_clusters_pvalues(empirical_clusters, null_cluster_dists) empirical_clusters_tested #> ── Empirical clusters (t > 2) ────────────────────────── ── #> ageOlder #> [850, 1525]: 74.724 (p=0.1100) #> [2675, 2775]: 11.017 (p=0.4000) #> [3500, 3900]: 46.295 (p=0.1700) #> ──────────────────────────────────────────────────────────────────────────────── tidy(empirical_clusters_tested) #> # A tibble: 3 × 7 #> predictor id start end length sum_statistic pvalue #> #> 1 ageOlder 1 850 1525 28 74.7 0.11 #> 2 ageOlder 2 2675 2775 5 11.0 0.4 #> 3 ageOlder 3 3500 3900 17 46.3 0.17 calculate_clusters_pvalues(empirical_clusters, null_cluster_dists, add1 = TRUE) #> ── Empirical clusters (t > 2) ────────────────────────── ── #> ageOlder #> [850, 1525]: 74.724 (p=0.1188) #> [2675, 2775]: 11.017 (p=0.4059) #> [3500, 3900]: 46.295 (p=0.1782) #> ────────────────────────────────────────────────────────────────────────────────"},{"path":[]},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"aggregated-vs--trial-level-data","dir":"Articles","previous_headings":"D) As logistic regression","what":"Aggregated vs. trial-level data","title":"Garrison et al. 2020","text":"data using far aggregated data collapsed across trials within subject. 
words, exactly one measure proportion looks target per time point within subject: possible also fit logistic regression model trial-level measure “hits” (1s 0s) target area interest. un-aggregated, trial-level data comes data preparation code top vignette. demonstrate relationship two data sets: specification object fitting logistic regression models trial-level data requires two changes original: formula must predict binary target variable, instead prop trial grouping must specified uniquely identify time point within subject following appropriate object CPA using logistic regression: , sanity check global model using family = \"binomial\": estimates similar global_jmod previous section, log-odds.","code":"nrow(ts_data_agg) #> [1] 2560 ts_data_agg %>% distinct(subject_id, time) %>% nrow() #> [1] 2560 ts_data_trials #> # A tibble: 62,466 × 5 #> subject_id age trial_id time target #> #> 1 1413 Older 2659 0 1 #> 2 1413 Older 2659 25 1 #> 3 1413 Older 2659 50 1 #> 4 1413 Older 2659 75 1 #> 5 1413 Older 2659 100 1 #> 6 1413 Older 2659 125 1 #> 7 1413 Older 2659 150 1 #> 8 1413 Older 2659 175 1 #> 9 1413 Older 2659 200 1 #> 10 1413 Older 2659 225 1 #> # ℹ 62,456 more rows identical( ts_data_agg, ts_data_trials %>% group_by(subject_id, age, time) %>% summarize(prop = mean(target), .groups = \"drop\") ) #> [1] TRUE ts_data_trial_spec <- make_jlmer_spec( formula = target ~ 1 + age, data = ts_data_trials, subject = \"subject_id\", trial = \"trial_id\", time = \"time\" ) ts_data_trial_spec #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: target ~ 1 + ageOlder #> Predictors: #> age: ageOlder #> Groupings: #> Subject: subject_id #> Trial: trial_id #> Time: time #> Data: #> # A tibble: 62,466 × 5 #> target ageOlder subject_id trial_id time #> #> 1 1 1 1413 2659 0 #> 2 1 1 1413 2659 25 #> 3 1 1 1413 2659 50 #> # ℹ 62,463 more rows #> ──────────────────────────────────────────────────────────────────────────────── global_jmod_binom <- 
jlmer(ts_data_trial_spec, family = \"binomial\") tidy(global_jmod_binom) #> # A tibble: 2 × 5 #> term estimate std.error statistic p.value #> #> 1 (Intercept) 0.113 0.0121 9.31 1.24e- 20 #> 2 ageOlder 0.392 0.0164 23.9 1.08e-126"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html","id":"repeating-the-cpa-with-logistic-regression","dir":"Articles","previous_headings":"D) As logistic regression","what":"Repeating the CPA with logistic regression","title":"Garrison et al. 2020","text":"repeat procedure use new specification object set family = \"binomial\": Instead 3 clusters, now detect 5. can annotate detected clusters time series data : differences see aggregated vs. trial-level data completely expected follow straightforwardly following two facts: cluster-mass statistic derived size t-values time point, effect size. virtue unaggregated ts_data_trials data number observations, t-values overall inflated difference means (also see output global_jmod vs. global_jmod_binom well). Whereas aggregated data reflects complete pooling trial-level variation, binomial model trial-level data performs pooling treats trial independent. Thus, certain item experiment show outlier effect one time point, example, strongly influence mass cluster spanning given time point. first issue (inflated statistics) trivial just takes matter adjusting threshold. fact, good “stretching ” t-statistics, get higher resolution shape timewise statistics. However. second issue (pooling) troubling can warp fundamental shape timewise statistics. diagnose factors visualizing timewise statistics. also plot timewise statistics aggregated data comparison: see trial-level binomial model overall inflated t-values, well pronounced “peaks” (specifically maximum 1300ms). middle-ground, partial pooling approaching using mixed-effects models CPA showcased case study vignettes. 
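The inflation of t-values with the number of observations noted above can be seen in a toy base-R example (hypothetical numbers, not the vignette's data): holding the group means fixed, replicating each subject mean over many "trials" grows the t statistic:

```r
# Two small groups of subject means with a 0.1 mean difference
younger <- c(0.40, 0.45, 0.50, 0.55, 0.60)
older   <- c(0.50, 0.55, 0.60, 0.65, 0.70)
t_small <- t.test(older, younger)$statistic

# "Unaggregated" version: each subject mean replicated over 20 trials
t_large <- t.test(rep(older, each = 20), rep(younger, each = 20))$statistic

abs(t_large) > abs(t_small)  # TRUE: same mean difference, larger t
```

This is why raising the threshold for the trial-level binomial CPA (as done below with 2.8) is a reasonable adjustment on its own, while the pooling issue is not fixable by thresholding.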
present purposes, raise threshold value (2.8) gets us close detecting least largest cluster empirical_statistics, spanning 850ms-1525ms: run cluster permutation test threshold just largest cluster: gives us pvalue cluster spanning 850ms 1525ms similar one reported aggregated data: appear converge higher nsim data set. also time 1000-simulation runs showcase speed jlmerclusterperm. Note use set_rng_state(), makes two CPAs generate permutations observed data.","code":"empirical_statistics_binom <- compute_timewise_statistics(ts_data_trial_spec, family = \"binomial\") empirical_clusters_binom <- extract_empirical_clusters(empirical_statistics_binom, threshold = 2) empirical_clusters_binom #> ── Empirical clusters (t > 2) ────────────────────────── ── #> ageOlder #> [650, 725]: 8.727 #> [800, 1575]: 112.199 #> [1775, 1875]: 12.764 #> [2600, 2825]: 23.733 #> [3475, 3975]: 62.714 #> ──────────────────────────────────────────────────────────────────────────────── empirical_clusters_binom_df <- tidy(empirical_clusters_binom) tutorial_fig + geom_segment( aes(x = start, xend = end, y = -Inf, yend = -Inf), linewidth = 10, color = \"steelblue\", inherit.aes = FALSE, data = empirical_clusters_binom_df ) + geom_text( aes( y = -Inf, x = start + (end - start) / 2, label = paste(\"Cluster\", id) ), vjust = -2, inherit.aes = FALSE, data = empirical_clusters_binom_df ) timewise_statistics_fig <- ggplot(mapping = aes(time, statistic)) + geom_hline(aes(yintercept = 0)) + geom_line( aes(color = \"aggregated\"), linewidth = 1.5, data = tidy(empirical_statistics) ) + geom_line( aes(color = \"trial-level\"), linewidth = 1.5, data = tidy(empirical_statistics_binom) ) timewise_statistics_fig + geom_vline( aes(xintercept = time), data = tidy(empirical_statistics_binom) %>% slice(which.max(abs(statistic))) ) empirical_clusters_binom2 <- extract_empirical_clusters(empirical_statistics_binom, threshold = 2.8) empirical_clusters_binom2 #> ── Empirical clusters (t > 2.8) ──────────────────────── 
── #> ageOlder #> [850, 1525]: 102.450 #> [3500, 3875]: 51.365 #> ──────────────────────────────────────────────────────────────────────────────── CPA_trial <- clusterpermute( ts_data_trial_spec, family = \"binomial\", threshold = 2.8, nsim = 100, top_n = 1, # Just test the largest cluster progress = FALSE # Suppress printing of progress for vignette ) CPA_trial$empirical_clusters #> ── Empirical clusters (t > 2.8) ──────────────────────── ── #> ageOlder #> [850, 1525]: 102.450 (p=0.0297) #> ──────────────────────────────────────────────────────────────────────────────── CPA_agg$empirical_clusters #> ── Empirical clusters (t > 2) ────────────────────────── ── #> ageOlder #> [850, 1525]: 74.724 (p=0.0396) #> [2675, 2775]: 11.017 (p=0.4356) #> [3500, 3900]: 46.295 (p=0.0990) #> ──────────────────────────────────────────────────────────────────────────────── system.time({ set_rng_state(123L) CPA_agg_1000 <- clusterpermute( ts_data_agg_spec, threshold = 2, nsim = 1000, top_n = 1, progress = FALSE ) }) #> user system elapsed #> 0.00 0.07 5.63 CPA_agg_1000 #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 2) ────────────── ── #> ageOlder (n = 1000) #> Mean (SD): 0.078 (40.29) #> Coverage intervals: 95% [-76.748, 82.717] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 2) ────────────────────────── ── #> ageOlder #> [850, 1525]: 74.724 (p=0.0629) #> ──────────────────────────────────────────────────────────────────────────────── system.time({ set_rng_state(123L) CPA_trial_1000 <- clusterpermute( ts_data_trial_spec, family = \"binomial\", threshold = 2.8, nsim = 1000, top_n = 1, progress = FALSE ) }) #> user system elapsed #> 0.12 0.41 12.87 CPA_trial_1000 #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 2.8) ──────────── ── #> ageOlder (n = 1000) #> Mean (SD): -0.045 (55.09) #> Coverage intervals: 95% [-119.239, 125.506] #> 
──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 2.8) ──────────────────────── ── #> ageOlder #> [850, 1525]: 102.450 (p=0.0689) #> ────────────────────────────────────────────────────────────────────────────────"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Geller-et-al-2020.html","id":"background","dir":"Articles","previous_headings":"","what":"Background","title":"Geller et al. 2020","text":"data comes lexical decision task using pupilometry (specifically, measure pupil dilation) study differences processing difficulty print vs. cursive script. data available part gazer package Geller, Winn, Mahr, & Mirman 2020. follow Jason Geller’s tutorial used clusterperm package conduct cluster-based permutation analysis (CPA) differences pupil size print cursive conditions. data package (cursive_agg_data) looks slightly different used tutorial, ensure full reproducibility use exact data tutorial used: data comes prepared analysis box. following columns relevant analysis: subject: Unique identifier subjects script: within-participant factor variable (\"cursive\", \"print\") time: continuous measure time 0-2500ms 100ms intervals aggbaseline: response variable representing normalized (“baseline-corrected”) pupil size name cusrive_agg_data variable suggests, data aggregated within subject, collapsing across trials. following reproduces figure tutorial: within-participant predictors like script, permutation algorithm randomly swap labels condition trials within subject. 
preserves temporal structure trial-level data (swapping trial-level grouping) well subject-level grouping structure (swapping trials across participants).","code":"library(dplyr) cursive_agg_data <- as_tibble(read.csv(\"https://raw.githubusercontent.com/jgeller112/drjasongeller/main/content/blog/2020-07-10-CBPT/blog_data.csv\")) cursive_agg_data #> # A tibble: 2,184 × 5 #> X subject script timebins aggbaseline #> #> 1 1 10b.edf cursive 0 36.9 #> 2 2 10b.edf cursive 100 27.5 #> 3 3 10b.edf cursive 200 19.6 #> 4 4 10b.edf cursive 300 12.6 #> 5 5 10b.edf cursive 400 5.60 #> 6 6 10b.edf cursive 500 0.968 #> 7 7 10b.edf cursive 600 -5.98 #> 8 8 10b.edf cursive 700 -5.96 #> 9 9 10b.edf cursive 800 3.01 #> 10 10 10b.edf cursive 900 12.9 #> # ℹ 2,174 more rows library(ggplot2) #> Warning: package 'ggplot2' was built under R version 4.3.1 script_fig <- ggplot(cursive_agg_data, aes(timebins, aggbaseline)) + stat_summary(aes(linetype = script), geom = \"line\", linewidth = 1) script_fig"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Geller-et-al-2020.html","id":"outline","dir":"Articles","previous_headings":"","what":"Outline","title":"Geller et al. 2020","text":"case study vignette showcases five features CPA jlmerclusterperm: Prepping data CPA using make_jlmer_spec() clusterpermute() default statistic = \"t\" comparison “t” “chisq” Specifying random effects Contrast coding Load package start Julia instance jlmerclusterperm_setup() proceeding.","code":"library(jlmerclusterperm) jlmerclusterperm_setup(verbose = FALSE)"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Geller-et-al-2020.html","id":"a-prepping-a-specification-object","dir":"Articles","previous_headings":"","what":"A) Prepping a specification object","title":"Geller et al. 
2020","text":"start simple specification object model aggbaseline using script predictor script_fig data: sanity check, fit global model data… …check ’s comparable expect lm(): full specification object CPA must also declare grouping structures present data. rule thumb every observation (row) data must uniquely identified combination columns subject, trial, time. CPA time series data time must always specified, subject must also always specified permutation algorithm respect subject-level grouping data. cursive_agg_data data collapses across trials within subject, column trial. However, leave trial argument unspecified observations uniquely identified subject time alone. instead 2 rows per subject-time combination, one script condition: Therefore need “dummy” column trial distinctly mark “cursive” vs. “script” trials. save new data cursive_agg: makes observations uniquely identified columns subject, trial, time: final specification object looks like following:","code":"simple_spec <- make_jlmer_spec( formula = aggbaseline ~ 1 + script, data = cursive_agg_data ) simple_spec #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: aggbaseline ~ 1 + scriptprint #> Predictors: #> script: scriptprint #> Data: #> # A tibble: 2,184 × 2 #> aggbaseline scriptprint #> #> 1 36.9 0 #> 2 27.5 0 #> 3 19.6 0 #> # ℹ 2,181 more rows #> ──────────────────────────────────────────────────────────────────────────────── jlmer(simple_spec) #> #> ──────────────────────────────────────────────────────────────────────── #> Coef. Std. Error z Pr(>|z|) Lower 95% Upper 95% #> ──────────────────────────────────────────────────────────────────────── #> (Intercept) 46.7221 2.79622 16.71 <1e-61 41.2417 52.2026 #> scriptprint -10.2582 3.95445 -2.59 0.0095 -18.0088 -2.5076 #> ──────────────────────────────────────────────────────────────────────── summary(lm(formula = aggbaseline ~ 1 + script, data = cursive_agg_data))$coefficients #> Estimate Std. 
Error t value Pr(>|t|) #> (Intercept) 46.72214 2.796221 16.709029 4.512338e-59 #> scriptprint -10.25818 3.954454 -2.594083 9.547816e-03 cursive_agg_data %>% count(subject, timebins) #> # A tibble: 1,092 × 3 #> subject timebins n #> #> 1 10b.edf 0 2 #> 2 10b.edf 100 2 #> 3 10b.edf 200 2 #> 4 10b.edf 300 2 #> 5 10b.edf 400 2 #> 6 10b.edf 500 2 #> 7 10b.edf 600 2 #> 8 10b.edf 700 2 #> 9 10b.edf 800 2 #> 10 10b.edf 900 2 #> # ℹ 1,082 more rows cursive_agg <- cursive_agg_data %>% mutate(trial_type = paste0(script, \"_agg\")) cursive_agg #> # A tibble: 2,184 × 6 #> X subject script timebins aggbaseline trial_type #> #> 1 1 10b.edf cursive 0 36.9 cursive_agg #> 2 2 10b.edf cursive 100 27.5 cursive_agg #> 3 3 10b.edf cursive 200 19.6 cursive_agg #> 4 4 10b.edf cursive 300 12.6 cursive_agg #> 5 5 10b.edf cursive 400 5.60 cursive_agg #> 6 6 10b.edf cursive 500 0.968 cursive_agg #> 7 7 10b.edf cursive 600 -5.98 cursive_agg #> 8 8 10b.edf cursive 700 -5.96 cursive_agg #> 9 9 10b.edf cursive 800 3.01 cursive_agg #> 10 10 10b.edf cursive 900 12.9 cursive_agg #> # ℹ 2,174 more rows cursive_agg %>% count(subject, timebins, trial_type) %>% distinct(n) #> # A tibble: 1 × 1 #> n #> #> 1 1 cursive_agg_spec <- make_jlmer_spec( formula = aggbaseline ~ 1 + script, data = cursive_agg, subject = \"subject\", trial = \"trial_type\", time = \"timebins\" ) cursive_agg_spec #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: aggbaseline ~ 1 + scriptprint #> Predictors: #> script: scriptprint #> Groupings: #> Subject: subject #> Trial: trial_type #> Time: timebins #> Data: #> # A tibble: 2,184 × 5 #> aggbaseline scriptprint subject trial_type timebins #> #> 1 36.9 0 10b.edf cursive_agg 0 #> 2 27.5 0 10b.edf cursive_agg 100 #> 3 19.6 0 10b.edf cursive_agg 200 #> # ℹ 2,181 more rows #> 
────────────────────────────────────────────────────────────────────────────────"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Geller-et-al-2020.html","id":"b-cpa-with-default-statistic-t","dir":"Articles","previous_headings":"","what":"B) CPA with default statistic = \"t\"","title":"Geller et al. 2020","text":"CPA output original tutorial (using 100 simulations) copied comparison: Using threshold 2 default statistic = \"t\", clusterpermute() returns following: results look different original. due following: cluster-mass statistic (cms) lower. ran clusterpermute() default statistic = \"t\". Different clusters identified. simple model account subject-level variation, whereas original error term formula aggbaseline ~ script + Error(subject). sign cluster reversed. defaults aov() used original tutorial different default contrast coding regression model. now address issue turn, building CPA closely replicate results tutorial.","code":"#> #> effect b0 b1 sign cms p #> #> script 0 100 1 10.121215 0.1584158 #> #> script 900 1000 -1 8.756152 0.1782178 #> #> script 1900 2500 1 82.198279 0.0099010 set_rng_state(123L) clusterpermute( cursive_agg_spec, statistic = \"t\", # Default value spelled out threshold = 2, nsim = 100, progress = FALSE ) #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 2) ────────────── ── #> scriptprint (n = 100) #> Mean (SD): 0.000 (0.00) #> Coverage intervals: 95% [0.000, 0.000] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 2) ────────────────────────── ── #> scriptprint #> [2000, 2300]: -8.748 (p=0.0099) #> ────────────────────────────────────────────────────────────────────────────────"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Geller-et-al-2020.html","id":"c-comparing-t-vs--chisq","dir":"Articles","previous_headings":"","what":"C) Comparing “t” vs. “chisq”","title":"Geller et al. 
2020","text":"original tutorial used clusterperm::cluster_nhds() conduct CPA, fits ANOVA models time. , timewise statistic used chi-squared threshold determined p-value chi-squared statistic. Using chi-squared statistics p-value threshold also supported clusterpermute() using statistic = \"chisq\" (instead default \"t\"): returns cluster now numerically larger cluster-mass statistic, expected (chi-squared asymptotic t^2). , compare shape timewise statistics “t” threshold 2 “chisq” threshold p=0.05: find clusters identified 2000ms-2300ms “t” “chisq”, peaks pronounced “chisq”. differences two inconsequential example, may produce different results cases. chi-squared statistic (jlmerclusterperm computes via likelihood ratio test) often preferred testing single parameters makes less assumptions tend robust (glmmFAQ). goodness--fit tests level predictor model formula, “chisq” less interpretable multi-level predictors (k-1 > 1). teasing apart contribution individual levels multi-level predictor, using statistic = \"t\" appropriate.","code":"set_rng_state(123L) clusterpermute( cursive_agg_spec, statistic = \"chisq\", threshold = 0.05, # Threshold is now the p-value of the chi-squared statistic nsim = 100, progress = FALSE ) #> $null_cluster_dists #> ── Null cluster-mass distribution (chisq p < 0.05) ───── ── #> script (n = 100, df = 1) #> Mean (SD): 0.000 (0.00) #> Coverage intervals: 95% [0.000, 0.000] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (chisq p < 0.05) ───────────────── ── #> script (df = 1) #> [2000, 2300]: -19.085 (p=0.0099) #> ──────────────────────────────────────────────────────────────────────────────── timewise_ts <- compute_timewise_statistics(cursive_agg_spec, statistic = \"t\") timewise_chisqs <- compute_timewise_statistics(cursive_agg_spec, statistic = \"chisq\") library(ggplot2) timewise_fig <- ggplot(mapping = aes(x = time, y = statistic)) + geom_line( 
aes(color = \"fixed-t\"), linewidth = 1.5, data = tidy(timewise_ts) ) + geom_line( aes(color = \"fixed-chisq\"), linewidth = 1.5, data = tidy(timewise_chisqs) ) + geom_hline( aes(yintercept = c(-2, 2, qchisq(.95, df = 1), -qchisq(.95, df = 1))), color = rep(c(\"#00BFC4\", \"#F8766D\"), each = 2), linetype = 2 ) + scale_color_manual( values = setNames( c(\"#E69F00\", \"#56B4E9\", \"#009E73\", \"#F0E442\"), c(\"fixed-t\", \"fixed-chisq\", \"re-intercept-chisq\", \"re-max-chisq\") ) ) timewise_fig"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Geller-et-al-2020.html","id":"d-specifying-random-effects","dir":"Articles","previous_headings":"","what":"D) Specifying random effects","title":"Geller et al. 2020","text":"biggest missing component point subject random effects, original tutorial captures via error term Error(subject). isn’t strict equivalent regression, specifying random intercepts subject (1 | subject) gets us close: repeat results original tutorial comparison: now plot timewise statistics random intercept models: point may wonder whether results change much used maximal model random effects structure (1 + script | subject). first create another specification object maximal formula: compute timewise statistics: find chisq statistics maximal model virtually identical parsimonious, intercept-model (two lines overlap): Parsimony incredibly important simulation - jlmerclusterperm fast, model complexity still major bottleneck. adding extra random effect terms (random slope correlation) negligible effects statistic, removing likely inconsequential CPA . show following CPA run, uses maximal model. Notice results identical intercept-re_CPA substantially slower: run maximal mixed model CPA purely demonstration purposes. random effects structure actually unidentifiable given aggregated data, point time 2 observations subject (mean condition). 
Lastly, back figure data, annotate clusters detected chisq p-value threshold 0.05 random intercept CPA:","code":"cursive_agg_spec_re <- make_jlmer_spec( formula = aggbaseline ~ 1 + script + (1 | subject), data = cursive_agg, subject = \"subject\", time = \"timebins\", trial = \"trial_type\" ) set_rng_state(123L) system.time({ re_CPA <- clusterpermute( cursive_agg_spec_re, statistic = \"chisq\", threshold = 0.05, nsim = 100, progress = FALSE ) }) #> user system elapsed #> 0.03 0.06 13.65 re_CPA #> $null_cluster_dists #> ── Null cluster-mass distribution (chisq p < 0.05) ───── ── #> script (n = 100, df = 1) #> Mean (SD): 1.132 (12.58) #> Coverage intervals: 95% [-21.081, 32.649] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (chisq p < 0.05) ───────────────── ── #> script (df = 1) #> [0, 100]: -9.770 (p=0.1881) #> [900, 1000]: 8.521 (p=0.2475) #> [1900, 2500]: -73.440 (p=0.0099) #> ──────────────────────────────────────────────────────────────────────────────── #> #> effect b0 b1 sign cms p #> #> script 0 100 1 10.121215 0.1584158 #> #> script 900 1000 -1 8.756152 0.1782178 #> #> script 1900 2500 1 82.198279 0.0099010 timewise_chisqs_re <- compute_timewise_statistics(cursive_agg_spec_re, statistic = \"chisq\") timewise_fig_re <- timewise_fig + geom_line( aes(color = \"re-intercept-chisq\"), linewidth = 1.5, data = tidy(timewise_chisqs_re) ) timewise_fig_re cursive_agg_spec_re_max <- make_jlmer_spec( formula = aggbaseline ~ 1 + script + (1 + script | subject), data = cursive_agg, subject = \"subject\", time = \"timebins\", trial = \"trial_type\" ) timewise_chisqs_re_max <- compute_timewise_statistics(cursive_agg_spec_re_max, statistic = \"chisq\") #> ℹ 1 singular fit (3.85%). 
timewise_fig_re + geom_line( aes(color = \"re-max-chisq\"), linewidth = 1.5, data = tidy(timewise_chisqs_re_max) ) set_rng_state(123L) system.time({ re_max_CPA <- clusterpermute( cursive_agg_spec_re_max, statistic = \"chisq\", threshold = 0.05, nsim = 100, progress = FALSE ) }) #> user system elapsed #> 0.06 0.08 12.04 re_max_CPA #> $null_cluster_dists #> ── Null cluster-mass distribution (chisq p < 0.05) ───── ── #> script (n = 100, df = 1) #> Mean (SD): 1.132 (12.58) #> Coverage intervals: 95% [-21.081, 32.649] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (chisq p < 0.05) ───────────────── ── #> script (df = 1) #> [0, 100]: -9.770 (p=0.1881) #> [900, 1000]: 8.521 (p=0.2475) #> [1900, 2500]: -73.440 (p=0.0099) #> ──────────────────────────────────────────────────────────────────────────────── empirical_clusters_df <- tidy(extract_empirical_clusters(timewise_chisqs_re, threshold = 0.05)) script_fig + geom_segment( aes( x = start, xend = end, y = -Inf, yend = -Inf, color = factor(sign(sum_statistic)) ), linewidth = 10, inherit.aes = FALSE, data = empirical_clusters_df ) + geom_text( aes( y = -Inf, x = start + (end - start) / 2, label = paste(\"Cluster\", id) ), vjust = -2, inherit.aes = FALSE, data = empirical_clusters_df ) + labs(color = \"Sign of cluster\")"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/Geller-et-al-2020.html","id":"e-contrast-coding","dir":"Articles","previous_headings":"","what":"E) Contrast coding","title":"Geller et al. 2020","text":"last piece puzzle flipped sign effect. Whereas detect negative cluster line cursive line print, original tutorial reports positive cluster (output repeated ): Fixing trivial - just takes different choice contrast. 
example, can flip levels factor make “print” reference level: also sum-code set “print” -1 “cursive” 1: CPA operates domain timewise statistics (effect size), whether difference “print” “cursive” expressed unit 1 (treatment coding) 2 (contrast coding) bearing CPA. conclude, re-run time 1000-simulation CPA using intercept-model new treatment coding script “print” reference level. Finally, annotating just significant cluster figure:","code":"#> #> effect b0 b1 sign cms p #> #> script 0 100 1 10.121215 0.1584158 #> #> script 900 1000 -1 8.756152 0.1782178 #> #> script 1900 2500 1 82.198279 0.0099010 rev_contrast_df <- cursive_agg rev_contrast_df$script <- factor(rev_contrast_df$script, levels = c(\"print\", \"cursive\")) contrasts(rev_contrast_df$script) #> cursive #> print 0 #> cursive 1 reverse_contrast_spec <- make_jlmer_spec( formula = aggbaseline ~ 1 + script + (1 | subject), data = rev_contrast_df, subject = \"subject\", time = \"timebins\", trial = \"trial_type\" ) set_rng_state(123L) clusterpermute( reverse_contrast_spec, statistic = \"chisq\", threshold = 0.05, nsim = 100, progress = FALSE ) #> $null_cluster_dists #> ── Null cluster-mass distribution (chisq p < 0.05) ───── ── #> script (n = 100, df = 1) #> Mean (SD): -1.132 (12.58) #> Coverage intervals: 95% [-32.649, 21.081] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (chisq p < 0.05) ───────────────── ── #> script (df = 1) #> [0, 100]: 9.770 (p=0.1881) #> [900, 1000]: -8.521 (p=0.2475) #> [1900, 2500]: 73.440 (p=0.0099) #> ──────────────────────────────────────────────────────────────────────────────── sum_contrast_df <- cursive_agg sum_contrast_df$script <- factor(sum_contrast_df$script) contrasts(sum_contrast_df$script) <- contr.sum(2) contrasts(sum_contrast_df$script) #> [,1] #> cursive 1 #> print -1 sum_contrast_spec <- make_jlmer_spec( formula = aggbaseline ~ 1 + script + (1 | subject), data = sum_contrast_df, 
subject = \"subject\", time = \"timebins\", trial = \"trial_type\" ) set_rng_state(123L) clusterpermute( sum_contrast_spec, statistic = \"chisq\", threshold = 0.05, nsim = 100, progress = FALSE ) #> $null_cluster_dists #> ── Null cluster-mass distribution (chisq p < 0.05) ───── ── #> script (n = 100, df = 1) #> Mean (SD): -1.132 (12.58) #> Coverage intervals: 95% [-32.649, 21.081] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (chisq p < 0.05) ───────────────── ── #> script (df = 1) #> [0, 100]: 9.770 (p=0.1881) #> [900, 1000]: -8.521 (p=0.2475) #> [1900, 2500]: 73.440 (p=0.0099) #> ──────────────────────────────────────────────────────────────────────────────── set_rng_state(123L) system.time({ final_CPA <- clusterpermute( reverse_contrast_spec, statistic = \"chisq\", threshold = 0.05, nsim = 1000, progress = FALSE ) }) #> user system elapsed #> 0.05 0.08 5.67 final_CPA #> $null_cluster_dists #> ── Null cluster-mass distribution (chisq p < 0.05) ───── ── #> script (n = 1000, df = 1) #> Mean (SD): -0.107 (16.50) #> Coverage intervals: 95% [-36.966, 41.530] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (chisq p < 0.05) ───────────────── ── #> script (df = 1) #> [0, 100]: 9.770 (p=0.2108) #> [900, 1000]: -8.521 (p=0.2358) #> [1900, 2500]: 73.440 (p=0.0090) #> ──────────────────────────────────────────────────────────────────────────────── signif_clusters_df <- tidy(final_CPA$empirical_clusters) %>% filter(pvalue < 0.05) signif_clusters_df #> # A tibble: 1 × 7 #> predictor id start end length sum_statistic pvalue #> #> 1 script 3 1900 2500 7 73.4 0.00899 script_fig + geom_segment( aes(x = start, xend = end, y = -Inf, yend = -Inf), color = \"steelblue\", linewidth = 10, inherit.aes = FALSE, data = signif_clusters_df ) + geom_text( aes( y = -Inf, x = start + (end - start) / 2, label = paste(\"p 
=\", signif(pvalue, 3)) ), vjust = -2, inherit.aes = FALSE, data = signif_clusters_df )"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/jlmerclusterperm.html","id":"overview-of-cpa","dir":"Articles","previous_headings":"","what":"Overview of CPA","title":"Introduction to jlmerclusterperm","text":"Cluster-based permutation analysis (CPA) simulation-based, non-parametric statistical test difference groups time series. suitable analyzing densely-sampled time data (EEG eye-tracking research) research hypothesis often specified existence effect (e.g., predicted higher-order cognitive mechanisms) agnostic temporal details effect (precise moment emergence). CPA formalizes two intuitions means difference groups: countable unit difference (.e., cluster) contiguous, uninterrupted span sufficiently large differences time point (determined via threshold). degree extremity cluster (.e., cluster-mass statistic) measure sensitive magnitude difference, variability, sample size (e.g., t-statistic regression). CPA procedure identifies empirical clusters time series data tests significance observed cluster-mass statistics bootstrapped permutations data. bootstrapping procedure samples “null” (via random shuffling condition labels), yielding distribution cluster-mass statistics emerging chance. statistical significance cluster probability observing cluster-mass statistic extreme cluster’s simulated null distribution.","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/jlmerclusterperm.html","id":"package-design","dir":"Articles","previous_headings":"","what":"Package design","title":"Introduction to jlmerclusterperm","text":"jlmerclusterperm provides wholesale piecemeal approach conducting CPA. main workhorse function package clusterpermute(), composed five smaller functions called internally succession. smaller functions representing algorithmic steps CPA also exported, allow control procedure (e.g., debugging diagnosing CPA run). 
See function documentation .","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/jlmerclusterperm.html","id":"organization-of-vignettes","dir":"Articles","previous_headings":"","what":"Organization of vignettes","title":"Introduction to jlmerclusterperm","text":"package vignettes roughly divided two groups: topics case studies. researcher data ready analysis, recommended go case studies first pluck relevant bits desired analysis order increasing complexity, case studies : Garrison et al. 2020, introduces package’s functions running CPA, demonstrating wholesale piecemeal approaches. also compares use linear vs. logistic regression calculation timewise statistics. Geller et al. 2020, demonstrates use random effects regression contrasts CPA specification. also compares use t vs. chisq timewise statistics. de Carvalho et al. 2021, showcases using custom regression contrasts test relationship multiple levels factor. also discusses issues surrounding complexity timewise regression model. topics cover general package features: Tidying output: tidy() ways collecting jlmerclusterperm objects tidy data. Julia interface: interacting Julia objects using JuliaConnectoR package. Reproducibility: using Julia RNG make CPA runs reproducible. Asynchronous CPA: running CPA background process using future package.","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/jlmerclusterperm.html","id":"compared-to-other-implementations","dir":"Articles","previous_headings":"","what":"Compared to other implementations","title":"Introduction to jlmerclusterperm","text":"R packages cluster-based permutation analysis, clusterperm, eyetrackingR, permuco, permutes. Compared existing implementations, jlmerclusterperm package optimized CPAs based mixed-effects regression models, suitable typical experimental research data multiple, crossed grouping structures (e.g., subjects items). also package interface individual algorithmic steps CPA. 
Lastly, thanks to the Julia back-end, the bootstrapping is fast, even for mixed models.","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/jlmerclusterperm.html","id":"further-readings","dir":"Articles","previous_headings":"","what":"Further readings","title":"Introduction to jlmerclusterperm","text":"The original method paper on CPA is Maris & Oostenveld 2007, and a recent overview paper is Meyer et al. 2021 (on CPA for EEG research, but it applies broadly). A critical paper by Sassenhagen & Draschkow 2019 covers the DOs and DON’Ts of interpreting and reporting CPA results. A review paper of eyetracking analysis methods by Ito & Knoeferle 2022 compares CPA to other statistical techniques for testing a difference between groups in time series data. See also the JOSS paper for the permuco package (Frossard & Renaud 2021) and references therein, and Benedikt V. Ehinger’s blog post, which includes a detailed graphical explainer of the CPA procedure.","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/julia-interface.html","id":"setup","dir":"Articles","previous_headings":"","what":"Setup","title":"Julia interface","text":"","code":"library(jlmerclusterperm) jlmerclusterperm_setup(verbose = FALSE) julia_progress(show = FALSE)"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/julia-interface.html","id":"interacting-with-julia-objects","dir":"Articles","previous_headings":"","what":"Interacting with Julia objects","title":"Julia interface","text":"Most jlmerclusterperm functions collect Julia objects as R objects, except for jlmer() and to_jlmer(), which return GLM.jl or MixedModels.jl fitted model objects. You can use functions from JuliaConnectoR to interact with these (pointers to) Julia objects: You can also call functions from Julia packages that are already installed. For
example, Effects.jl marginaleffects/emmeans-like features: Note jlmer() to_lmer() just conveniences sanity checking steps cluster-based permutation test, thus used general interface fitting regression models Julia serious analyses.","code":"jmod <- to_jlmer(Reaction ~ Days + (Days | Subject), lme4::sleepstudy) jmod #> #> Variance components: #> Column Variance Std.Dev. Corr. #> Subject (Intercept) 565.51067 23.78047 #> Days 32.68212 5.71683 +0.08 #> Residual 654.94145 25.59182 #> ────────────────────────────────────────────────── #> Coef. Std. Error z Pr(>|z|) #> ────────────────────────────────────────────────── #> (Intercept) 251.405 6.63226 37.91 <1e-99 #> Days 10.4673 1.50224 6.97 <1e-11 #> ────────────────────────────────────────────────── library(JuliaConnectoR) juliaLet(\"x.rePCA\", x = jmod) #> #> (Subject = [0.5406660682947617, 1.0],) juliaCall(\"issingular\", jmod) #> [1] FALSE juliaEval(\"using Effects\") juliaLet(\"effects(x.namedelements, y)\", x = list(Days = 2:5), y = jmod) #> #> 4×5 DataFrame #> Row │ Days Reaction err lower upper #> │ Int64 Float64 Float64 Float64 Float64 #> ─────┼──────────────────────────────────────────── #> 1 │ 2 272.34 7.09427 265.245 279.434 #> 2 │ 3 282.807 7.70544 275.102 290.512 #> 3 │ 4 293.274 8.55557 284.719 301.83 #> 4 │ 5 303.742 9.58128 294.16 313.323"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/julia-interface.html","id":"julia-global-options","dir":"Articles","previous_headings":"","what":"Julia global options","title":"Julia interface","text":"functions involved cluster-based permutation analysis print progress bars update real time console. 
function julia_progress() controls whether show progress bar width printed progress bar.","code":"# Set Julia progress options and save old state old_progress_opts <- julia_progress(show = FALSE, width = 30) old_progress_opts #> $show #> [1] TRUE #> #> $width #> [1] 50 julia_progress() #> $show #> [1] FALSE #> #> $width #> [1] 30 # Restored old state julia_progress(old_progress_opts) identical(old_progress_opts, julia_progress()) #> [1] TRUE # The syntax to reset progress options julia_progress(show = TRUE, width = \"auto\") julia_progress()"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/julia-interface.html","id":"the-julia-environment","dir":"Articles","previous_headings":"","what":"The Julia environment","title":"Julia interface","text":"following packages loaded Julia environment via jlmerclusterperm_setup():","code":"cat(readLines(system.file(\"julia/Project.toml\", package = \"jlmerclusterperm\")), sep = \"\\n\") #> [deps] #> DataFrames = \"a93c6f00-e57d-5684-b7b6-d8193f3e46c0\" #> Distributions = \"31c24e10-a181-5473-b8eb-7969acd0382f\" #> GLM = \"38e38edf-8417-5370-95a0-9cbb8c7f171a\" #> JlmerClusterPerm = \"2a59065a-9564-4903-82dd-0e42ce19d0e1\" #> MixedModels = \"ff71e718-51f3-5ec2-a782-8ffcbfa3c316\" #> ProgressMeter = \"92933f4c-e287-5a05-a399-4b506db050ca\" #> Random = \"9a3f8284-a2c9-5f02-9a11-845980a1fd5c\" #> Random123 = \"74087812-796a-5b5d-8853-05524746bad3\" #> StatsBase = \"2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91\" #> StatsModels = \"3eaba693-59b7-5ba5-a881-562e759f1c8d\" #> Suppressor = \"fd094767-a336-5f1f-9728-57cf17d0bbfb\" #> #> [compat] #> DataFrames = \"1.5\" #> GLM = \"1.8.2\" #> MixedModels = \"4.12\" #> ProgressMeter = \"1.7\" #> Random123 = \"1.6\" #> StatsBase = \"0.33\" #> StatsModels = \"0.7\" #> Suppressor = \"0.2.1\" #> julia = 
\"1.8\""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/reproducibility.html","id":"setup","dir":"Articles","previous_headings":"","what":"Setup","title":"Reproducibility","text":"Minimal example data: jlmer specification object used showcase:","code":"library(jlmerclusterperm) jlmerclusterperm_setup(verbose = FALSE) library(dplyr, warn.conflicts = FALSE) library(ggplot2) chickweights <- as_tibble(ChickWeight) %>% mutate(diet = forcats::fct_collapse(Diet, A = c(1, 3), B = c(2, 4))) %>% filter(Time %in% 0:20) ggplot(chickweights, aes(Time, weight, color = diet)) + stat_summary(fun.args = list(mult = 1.96)) + stat_summary(geom = \"line\") + theme_minimal(base_size = 18) jlmer_spec <- make_jlmer_spec( formula = weight ~ diet + (1 | Chick), data = chickweights, subject = \"Chick\", time = \"Time\" ) jlmer_spec #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: weight ~ 1 + dietB + (1 | Chick) #> Predictors: #> diet: dietB #> Groupings: #> Subject: Chick #> Trial: #> Time: Time #> Data: #> # A tibble: 533 × 4 #> weight dietB Chick Time #> #> 1 42 0 1 0 #> 2 51 0 1 2 #> 3 59 0 1 4 #> # ℹ 530 more rows #> ────────────────────────────────────────────────────────────────────────────────"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/reproducibility.html","id":"rng-seed-vs--counter","dir":"Articles","previous_headings":"","what":"RNG seed vs. counter","title":"Reproducibility","text":"CPA powered bootstrapping, inherently stochastic procedure. Reproducibility guaranteed clusterpermute() calls ran seed counter values. properties Julia random number generator used jlmerclusterperm refer two slightly different concepts: seed conceptually equivalent “seed” talked R, set.seed(). seed used RNG’s initialization, locking specific state persists session. jlmerclusterperm, default seed 1 primarily controlled via options(\"jlmerclusterperm.seed\") must set calling jlmerclusterperm_setup(). 
The counter initializes at 0 and increments its value every time the RNG is consumed to produce a random number. There is no equivalent concept in base R, though it’s a handy and intuitive way of interfacing with the RNG’s state in an interactive session. The counter can be manipulated via the functions reset_rng_state(), set_rng_state(), and get_rng_state(). The rest of this vignette showcases interacting with the counter state, which is useful in practice.","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/reproducibility.html","id":"behavior-of-rng-counter","dir":"Articles","previous_headings":"","what":"Behavior of RNG counter","title":"Reproducibility","text":"When we start a fresh session, the counter is initialized to zero. Since we did not specify a seed, it’s one: We start with a simple clusterpermute() of 100 simulations. The reproducibility profile of this run is seed 1 and counter 0: After the run, the counter has now incremented to 3712: A call to clusterpermute() now runs with the new counter value and produces a different result. This is expected given the stochastic nature of a CPA (and thus the need to run tens of thousands of simulations before reporting results): The second run also increments the counter state, though note that it is not by exactly the same amount (nor is it possible to predict): The difference between the two runs is specifically in the null distribution of cluster-mass statistics. The two
100-simulation samples drawn share underlying (null) distribution, though identical one another (rarely ).","code":"get_rng_state() #> [1] 0 get_rng_seed() #> [1] 1 CPA_1_0 <- clusterpermute(jlmer_spec, threshold = 1, nsim = 100, progress = FALSE) CPA_1_0 #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 1) ────────────── ── #> dietB (n = 100) #> Mean (SD): 0.724 (5.54) #> Coverage intervals: 95% [-10.847, 12.592] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 1) ────────────────────────── ── #> dietB #> [2, 14]: 18.231 (p=0.0198) #> ──────────────────────────────────────────────────────────────────────────────── get_rng_state() #> [1] 3712 CPA_1_3712 <- clusterpermute(jlmer_spec, threshold = 1, nsim = 100, progress = FALSE) CPA_1_3712 #> $null_cluster_dists #> ── Null cluster-mass distribution (t > 1) ────────────── ── #> dietB (n = 100) #> Mean (SD): -0.618 (8.51) #> Coverage intervals: 95% [-18.328, 19.076] #> ──────────────────────────────────────────────────────────────────────────────── #> #> $empirical_clusters #> ── Empirical clusters (t > 1) ────────────────────────── ── #> dietB #> [2, 14]: 18.231 (p=0.0792) #> ──────────────────────────────────────────────────────────────────────────────── get_rng_state() #> [1] 7427 ggplot(mapping = aes(sum_statistic)) + geom_density( aes(fill = \"1\"), alpha = .5, data = tidy(CPA_1_0$null_cluster_dists) ) + geom_density( aes(fill = \"3712\"), alpha = .5, data = tidy(CPA_1_3712$null_cluster_dists) ) + guides(fill = guide_legend(\"Counter\")) + theme_classic()"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/reproducibility.html","id":"manipulating-rng-counter","dir":"Articles","previous_headings":"","what":"Manipulating RNG counter","title":"Reproducibility","text":"Using set_rng_counter(), possible reproduce prior run. 
example, reproduce second clusterpermute() call, set counter back 3712 set_rng_state(): results subsequent run counter state identical previous run: reproduce first CPA ran counter 0, can use set_rng_state() simply use reset_rng_state() alias set_rng_state(0): behavior can observed specific step simulating null distribution well. function corresponding step permute_timewise_statistics() run 100 simulations step counter zero: Running cluster detection algorithm result using extract_null_cluster_dists() returns null cluster-mass distribution identical CPA_1_0:","code":"set_rng_state(3712) get_rng_state() #> [1] 3712 identical( CPA_1_3712, clusterpermute(jlmer_spec, threshold = 1, nsim = 100, progress = FALSE) ) #> [1] TRUE reset_rng_state() get_rng_state() #> [1] 0 identical( CPA_1_0, clusterpermute(jlmer_spec, threshold = 1, nsim = 100, progress = FALSE) ) #> [1] TRUE reset_rng_state() get_rng_state() #> [1] 0 null_statistics <- permute_timewise_statistics(jlmer_spec, nsim = 100) get_rng_state() #> [1] 3712 null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 1) null_cluster_dists #> ── Null cluster-mass distribution (t > 1) ────────────── ── #> dietB (n = 100) #> Mean (SD): 0.724 (5.54) #> Coverage intervals: 95% [-10.847, 12.592] #> ──────────────────────────────────────────────────────────────────────────────── identical(null_cluster_dists, CPA_1_0$null_cluster_dists) #> [1] TRUE"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/reproducibility.html","id":"technical-details-and-notes","dir":"Articles","previous_headings":"","what":"Technical details and notes","title":"Reproducibility","text":"Julia back-end uses thread-safe, counter-based RNG called Threefry (specifically, Random123.Threefry2x) comes Random123.jl Julia library adaptation original algorithm developed Salmon, Moraes, Dror, Shaw (2011). RNG initialized “rounds” value 20, also default. Threefry used speed stability across Julia versions. 
Additionally, noted earlier, RNG initializes explicit seed 1. behavior different case R, random seed chosen user every new session. consequence absence explicit seeding via options(\"jlmerclusterperm.seed\"), first run every new session (.e., runs counter value zero) identical. Given interface RNG counter, many unique advantages using different seed manipulating RNG counter. interface RNG seed also provided via set_rng_seed() get_rng_seed(). Lastly, please aware changing seed R session (e.g., via set.seed()) effect jlmerclusterperm functions powered Julia back-end.","code":"# Restart Julia session jlmerclusterperm_setup(restart = TRUE, verbose = FALSE) # RNG seed is again 1 get_rng_seed() #> [1] 1 # So results of the first CPA of the session (with counter = 0) is the same get_rng_state() #> [1] 0 identical( CPA_1_0, clusterpermute(jlmer_spec, threshold = 1, nsim = 100, progress = FALSE) ) #> [1] TRUE"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/reproducibility.html","id":"miscellaneous","dir":"Articles","previous_headings":"","what":"Miscellaneous","title":"Reproducibility","text":"may wondering whether two CPAs overlap range counter values also share overlapping samples null. actually case practice shuffling CPA implemented: two colored lines represent two 100-simulation bootstrap permutations differ initial RNG counter value 1. point shape two lines look different - one just shifted form another. formal investigations (e.g., simulating power CPA), ’s course best practice generate random seeds. 
counter convenience balances interactivity reproducibility.","code":"# 100 simulations of timewise statistics ## First CPA using counter at 1 set_rng_state(1) x <- permute_timewise_statistics(jlmer_spec, nsim = 100) ## Second CPA using counter at 2 set_rng_state(2) y <- permute_timewise_statistics(jlmer_spec, nsim = 100) # The null cluster-mass statistics x_null <- extract_null_cluster_dists(x, threshold = 1) x_null_stats <- tidy(x_null)$sum_statistic y_null <- extract_null_cluster_dists(y, threshold = 1) y_null_stats <- tidy(y_null)$sum_statistic # Relationship between two CPAs with similar initial counters matplot(cbind(x_null_stats, y_null_stats), type = \"b\", pch = 1)"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/tidying-output.html","id":"setup","dir":"Articles","previous_headings":"","what":"Setup","title":"Tidying output","text":"Minimal example data: jlmer specification object used showcase:","code":"library(jlmerclusterperm) jlmerclusterperm_setup(verbose = FALSE) julia_progress(show = FALSE) library(dplyr, warn.conflicts = FALSE) library(ggplot2) chickweights <- as_tibble(ChickWeight) %>% mutate(diet = forcats::fct_collapse(Diet, A = c(1, 3), B = c(2, 4))) %>% filter(Time %in% 0:20) ggplot(chickweights, aes(Time, weight, color = diet)) + stat_summary(fun.args = list(mult = 1.96)) + stat_summary(geom = \"line\") + theme_minimal(base_size = 18) jlmer_spec <- make_jlmer_spec( formula = weight ~ diet + (1 | Chick), data = chickweights, subject = \"Chick\", time = \"Time\" ) jlmer_spec #> ── jlmer specification ───────────────────────────────────────── ── #> Formula: weight ~ 1 + dietB + (1 | Chick) #> Predictors: #> diet: dietB #> Groupings: #> Subject: Chick #> Trial: #> Time: Time #> Data: #> # A tibble: 533 × 4 #> weight dietB Chick Time #> #> 1 42 0 1 0 #> 2 51 0 1 2 #> 3 59 0 1 4 #> # ℹ 530 more rows #> 
────────────────────────────────────────────────────────────────────────────────"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/tidying-output.html","id":"a-tour-of-tidy-methods","dir":"Articles","previous_headings":"","what":"A tour of tidy() methods","title":"Tidying output","text":"All non-data frame objects returned by jlmerclusterperm functions have a tidy() method that returns a data frame representation of the object.","code":""},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/tidying-output.html","id":"timewise_statistics-objects","dir":"Articles","previous_headings":"A tour of tidy() methods","what":"<timewise_statistics> objects","title":"Tidying output","text":"The functions compute_timewise_statistics() and permute_timewise_statistics() return an object of class <timewise_statistics>, which is an array of statistics. Note that there isn't a stylized print() method for this class. Instead, it is recommended to use tidy() to inspect the results as a data frame in which each statistic is represented as a row. In the case of compute_timewise_statistics(), the output is a matrix (a 2D array) of predictor by time. Each dimension becomes a column in the tidy() data: In the case of permute_timewise_statistics(), the output is a 3D array of simulation by time by predictor. 
As before, each dimension becomes a column in the tidy() data:","code":"empirical_statistics <- compute_timewise_statistics(jlmer_spec) class(empirical_statistics) #> [1] \"timewise_statistics\" dim(empirical_statistics) # predictor x time #> [1] 1 11 tidy(empirical_statistics) #> # A tibble: 11 × 3 #> predictor time statistic #> #> 1 dietB 0 -1.09 #> 2 dietB 2 2.29 #> 3 dietB 4 3.14 #> 4 dietB 6 4.01 #> 5 dietB 8 2.90 #> 6 dietB 10 2.44 #> 7 dietB 12 2.18 #> 8 dietB 14 1.26 #> 9 dietB 16 0.673 #> 10 dietB 18 0.532 #> 11 dietB 20 0.809 null_statistics <- permute_timewise_statistics(jlmer_spec, nsim = 1000) class(null_statistics) #> [1] \"timewise_statistics\" dim(null_statistics) # simulation x time x predictor #> [1] 1000 11 1 tidy(null_statistics) #> # A tibble: 11,000 × 4 #> predictor time statistic sim #> #> 1 dietB 0 -0.309 0001 #> 2 dietB 0 2.10 0002 #> 3 dietB 0 -2.83 0003 #> 4 dietB 0 -0.568 0004 #> 5 dietB 0 2.10 0005 #> 6 dietB 0 -1.09 0006 #> 7 dietB 0 -0.830 0007 #> 8 dietB 0 -0.0515 0008 #> 9 dietB 0 -0.0515 0009 #> 10 dietB 0 0.206 0010 #> # ℹ 10,990 more rows"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/tidying-output.html","id":"empirical_clusters-objects","dir":"Articles","previous_headings":"A tour of tidy() methods","what":"<empirical_clusters> objects","title":"Tidying output","text":"The <empirical_clusters> object returned by extract_empirical_clusters() has a print method defined, which makes it easier on the eyes: The tidy() data frame representation additionally contains information like length, the number of time points that the cluster spans. 
The id variable uniquely identifies a cluster within a predictor.","code":"empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 1) class(empirical_clusters) #> [1] \"empirical_clusters\" empirical_clusters #> ── Empirical clusters (t > 1) ────────────────────────── ── #> dietB #> [2, 14]: 18.231 #> ──────────────────────────────────────────────────────────────────────────────── tidy(empirical_clusters) #> # A tibble: 1 × 6 #> predictor id start end length sum_statistic #> #> 1 dietB 1 2 14 7 18.2"},{"path":"https://yjunechoe.github.io/jlmerclusterperm/articles/tidying-output.html","id":"null_cluster_dists-objects","dir":"Articles","previous_headings":"A tour of tidy() methods","what":"<null_cluster_dists> objects","title":"Tidying output","text":"The <null_cluster_dists> object returned by extract_null_cluster_dists() also has a print method defined, which displays summary statistics: The null distribution is constructed from the largest cluster-mass statistic observed in each of the bootstrapped permutations of the original data. In other words, each simulation contributes one sample to the null. In the tidy() representation, each simulation-by-predictor combination is represented as a row. The sum_statistic column represents the size of the largest cluster from each simulation. When no clusters are found, the column is zero and the values of start, end, and length are NA. The outputs of tidy(null_statistics) and tidy(null_cluster_dists) form relational data. 
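Because the empirical clusters and the null distribution are available as comparable tidy() data frames, a cluster's p-value can be approximated by hand. The following is only a sketch of the logic (the package performs this calculation itself, e.g. via calculate_clusters_pvalues()), using the empirical_clusters and null_cluster_dists objects defined in this section:

```r
# Null cluster-mass samples (one per simulation) and the empirical cluster-mass
null_stats <- tidy(null_cluster_dists)$sum_statistic
emp_stats <- tidy(empirical_clusters)$sum_statistic

# Proportion of null samples at least as extreme as each empirical cluster;
# this approximates a two-sided cluster p-value
sapply(emp_stats, function(s) mean(abs(null_stats) >= abs(s)))
```

The exact convention (e.g., whether 1 is added to the numerator and denominator) may differ from the package's internal calculation; this sketch only conveys the idea of ranking the empirical cluster-mass against the null samples.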
For example, we can use left_join() to find the timewise statistics of the permuted data with the most extreme cluster-mass statistic: Note that sum_statistic is simply the sum of the statistic values that span the largest cluster:","code":"null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 1) class(null_cluster_dists) #> [1] \"null_cluster_dists\" null_cluster_dists #> ── Null cluster-mass distribution (t > 1) ────────────── ── #> dietB (n = 1000) #> Mean (SD): 0.014 (7.11) #> Coverage intervals: 95% [-15.360, 15.902] #> ──────────────────────────────────────────────────────────────────────────────── tidy(null_cluster_dists) #> # A tibble: 1,000 × 6 #> predictor start end length sum_statistic sim #> #> 1 dietB 8 20 7 11.1 0001 #> 2 dietB 6 10 3 3.75 0002 #> 3 dietB 18 20 2 2.07 0003 #> 4 dietB 2 10 5 6.00 0004 #> 5 dietB 4 10 4 5.91 0005 #> 6 dietB NA NA NA 0 0006 #> 7 dietB 8 20 7 12.2 0007 #> 8 dietB 2 8 4 5.29 0008 #> 9 dietB NA NA NA 0 0009 #> 10 dietB NA NA NA 0 0010 #> # ℹ 990 more rows largest_permuted_cluster <- tidy(null_cluster_dists) %>% slice(which.max(abs(sum_statistic))) %>% left_join(tidy(null_statistics), by = c(\"predictor\", \"sim\")) largest_permuted_cluster #> # A tibble: 11 × 8 #> predictor start end length sum_statistic sim time statistic #>
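The claim about sum_statistic can be verified directly from the joined table. A small sketch, assuming the largest_permuted_cluster tibble from the code above and that dplyr is attached:

```r
library(dplyr)

# Sum the timewise statistics that fall inside the cluster span;
# the recomputed value should match the reported sum_statistic
largest_permuted_cluster %>%
  filter(time >= start, time <= end) %>%
  summarise(recomputed = sum(statistic), reported = first(sum_statistic))
```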