This pipeline requires 4 mandatory input files:
- Phenotype Count Matrix (.tsv): A tab-separated file containing normalized, quality-controlled phenotype counts. It should contain at least the following column: phenotype_id. An example count matrix can be found here.
- Sample Metadata (.tsv): A tab-separated file containing metadata for the samples represented in the phenotype count matrix. It should contain at least the following columns: sample_id, genotype_id, qtl_group. An example sample metadata file can be found here.
- Phenotype Metadata (.tsv): A tab-separated file containing metadata for the phenotypes represented in the phenotype count matrix. It should contain at least the following columns: phenotype_id, chromosome, phenotype_pos, strand, gene_id, group_id. An example phenotype metadata file can be found here. We recommend using the phenotype metadata designed for the eQTL Catalogue project, which can be found here.
- Genotype Data (VCF or VCF.gz): A VCF file containing genotype data for all samples represented in the phenotype count matrix. The sample column names of the VCF file should correspond to the genotype_id column values in the sample metadata. An example genotype file can be found here.
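
The column requirements above can be checked before launching the pipeline. The following is a minimal sketch in Python and is not part of the pipeline itself: the function names (`missing_columns`, `vcf_sample_names`, `validate_inputs`, and so on) are hypothetical helpers introduced for illustration. It assumes that each .tsv file has a single header row and that the VCF follows the standard layout, in which sample names start at the tenth column of the `#CHROM` header line.

```python
import csv
import gzip

# Required columns, as listed in the input file descriptions above.
REQUIRED_COLUMNS = {
    "count_matrix": ["phenotype_id"],
    "sample_metadata": ["sample_id", "genotype_id", "qtl_group"],
    "phenotype_metadata": ["phenotype_id", "chromosome", "phenotype_pos",
                           "strand", "gene_id", "group_id"],
}


def missing_columns(header, required):
    """Return the required column names that are absent from a header row."""
    present = set(header)
    return [name for name in required if name not in present]


def read_tsv_header(path):
    """Return the first row of a tab-separated file (empty list if no rows)."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle, delimiter="\t"), [])


def read_column(path, column):
    """Return all values of one named column from a tab-separated file."""
    with open(path, newline="") as handle:
        return [row[column] for row in csv.DictReader(handle, delimiter="\t")]


def vcf_sample_names(path):
    """Return the sample names from the #CHROM header line of a VCF or VCF.gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # Columns 1-9 are fixed (CHROM..FORMAT); samples follow.
                return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found in " + path)


def genotype_ids_missing_from_vcf(genotype_ids, vcf_samples):
    """Return genotype_id values that have no matching VCF sample column."""
    return sorted(set(genotype_ids) - set(vcf_samples))


def validate_inputs(count_matrix, sample_metadata, phenotype_metadata, vcf):
    """Collect human-readable problems found across the four input files."""
    problems = []
    paths = {
        "count_matrix": count_matrix,
        "sample_metadata": sample_metadata,
        "phenotype_metadata": phenotype_metadata,
    }
    for name, path in paths.items():
        missing = missing_columns(read_tsv_header(path), REQUIRED_COLUMNS[name])
        if missing:
            problems.append(f"{name}: missing columns {missing}")

    # Only compare against the VCF if the genotype_id column exists.
    if "genotype_id" in read_tsv_header(sample_metadata):
        absent = genotype_ids_missing_from_vcf(
            read_column(sample_metadata, "genotype_id"), vcf_sample_names(vcf)
        )
        if absent:
            problems.append(
                f"sample_metadata: genotype_id values not found in VCF header: {absent}"
            )
    return problems
```

Calling `validate_inputs` with the paths to the four files returns an empty list when the checks pass. The sketch only covers the column names and the genotype_id correspondence described above; it does not check the contents of the count matrix or the genotypes.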