This is the simulation code repository for our upcoming paper (Pharmacoepidemiol Drug Saf 2019) on multigroup empirical equipoise assessment. We extended the original definition (Walker et al. Comp Eff Res 2013;3:11) to the multigroup setting.
- Paper: PDS 2019 in press
*.R
: Main R script files for generating simulation data, analyzing data, and reporting results. Execution each file will generate a plain text report file named *.R.txt underlog/
.*_o2.sh
: Example shell scripts for the Linux SLURM batch job system. These are designed for Harvard Medical School’s O2 cluster specifically, and are not expected to work without modification elsewhere.data/
: Folder for simulation data. Due to file size issues, only the summary file for running06_assess_results.R
is kept.log/
: Folder for log files.out/
: Folder for figure PDFs.
Two custom R packages must be installed before running the R scripts provided in this repository.
## Install devtools (if you do not have it already)
install.packages("devtools")
## Install the data generation package
devtools::install_github(repo = "kaz-yos/datagen3")
## Install the simulation package
devtools::install_github(repo = "kaz-yos/empeq3")
## Install the simulation package
devtools::install_github(repo = "kaz-yos/trim3")
Additionally, packages doParallel
, doRNG
, grid
, gtable
, and tidyverse
, are required in the scripts.
Running the following will generate raw data files under the data/
folder using 8 cores.
Rscript ./01_generate_data.R 8
If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script.
sh ./01_generate_data_o2.sh
Running the following will process the specified data file (PS estimation and trimming) and generate a new file with the same name except that raw changes to prepared.
Rscript ./02_prepare_data.R ./data/scenario_raw001_part001_r50.RData 8
If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script. This will dispatch a SLURM job for each file.
sh ./02_prepare_data_o2.sh ./data/scenario_raw*
Running the following will analyze the specified data file (outcome model estimation) and generate a new file with the same name except that prepared changes to analyzed.
Rscript ./03_analyze_data.R ./data/scenario_prepared001_part001_r50.RData 8
If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script. This will dispatch a SLURM job for each file.
sh ./03_analyze_data_o2.sh ./data/scenario_prepared*
Running the following will aggregate all the analyzed data files under data/
and generate a single new file named all_analysis_results.RData
.
Rscript ./04_aggregate_results.R 1
If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script.
sh ./04_aggregate_results_o2.sh
Running the following will summarize the results (scenario-level summaries) and generate a new file named all_analysis_summary.RData
.
Rscript ./05_summarize_results.R ./data/all_analysis_results.RData 1
If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script.
sh ./05_summarize_results_o2.sh ./data/all_analysis_results.RData
Running the following will create figures under out/
describing the summary statistics in all_analysis_summary.RData
.
./Rscriptee ./06_assess_results.R ./data/all_analysis_summary.RData 1