00 Accepted input formats

Accepted formats

amica format

Variable name	Column name or prefix	Description	Mandatory
Protein ID	Majority.protein.IDs	unique identifier	yes
Gene name	Gene.names		yes
LFQ intensity prefix	LFQIntensity_	MaxQuant's (MQ's) 'LFQ intensity' columns	no
Imputed intensity prefix	ImputedIntensity_	Imputed (and potentially re-normalized) intensities	yes
razor unique count	razorUniqueCount	MQ's 'razor+unique count' column	no
razor unique prefix	razorUniqueCount	MQ's 'razor+unique count' column per sample	no
p-value prefix	P.Value_	e.g. `P.Value_group1__vs__group2`	no
adj. p-value prefix	adj.P.Val_	e.g. `adj.P.Val_group1__vs__group2`	no
Log2 fold change prefix	logFC_	e.g. `logFC_group1__vs__group2`	yes
avg. expression prefix	AveExpr_	e.g. `AveExpr_group1__vs__group2`	no
comparison infix	`__vs__`	see below	yes
Quantified column	quantified	see below	no
Potential contaminant column	Potential.contaminant	MQ's 'Potential contaminant' column	no

IntensityPrefix, ImputedIntensityPrefix and abundancePrefix columns are log2-transformed. All 0s need to be converted to NaNs, and no Inf values are allowed. amica searches the column names for all Intensity prefixes, so you can provide more than the default intensities. However, all intensity prefixes must have the same number of samples in order to be processed.
ImputedIntensityPrefix should only contain filtered, imputed and normalized values.
Quantified column: all proteins that pass the filters on valid values, spectraCount and razorUniqueCount and that have been quantified are set to "+" in this column. Otherwise the entry is left empty (""). If no quantified column is provided, complete cases (i.e., rows with no missing values) across all ImputedIntensity columns and all columns containing the group comparison infix __vs__ are set as quantified.
comparisonInfix: the infix is needed to retrieve the group ids from a group comparison (e.g. for downstream visualizations such as heatmaps). The groups before and after the __vs__ infix need to match the groups defined in the uploaded experimental design.
razorUniqueCount is a column and razorUniquePrefix is the prefix of the count per sample, but they may well have the same value (just like in MaxQuant's proteinGroups.txt).
Proteins inferred from reverse hits and peptides "only identified by site" modifications are not to be written into amica's output. Additional columns may be supported in the future, but at the moment they are not considered when uploaded.
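
The log2 and missing-value rules above can be sketched as follows. This is only an illustration under stated assumptions, not amica's own code: the helper name `to_log2_or_nan` and the example values are hypothetical, and treating empty or non-numeric cells as missing is an assumption.

```python
import math


def to_log2_or_nan(raw_value):
    """Return log2 of a raw intensity, or NaN if the value is 0, negative,
    missing, non-numeric or infinite (hypothetical helper, not part of amica)."""
    try:
        value = float(raw_value)
    except (TypeError, ValueError):
        return math.nan
    if math.isnan(value) or math.isinf(value) or value <= 0:
        return math.nan
    return math.log2(value)


# Hypothetical raw intensities as they might be read from a text file.
raw_intensities = ["1024", "0", "", "inf", "8"]
log2_intensities = [to_log2_or_nan(v) for v in raw_intensities]

# Sanity check for the "no INF values" rule.
has_infinite = any(math.isinf(v) for v in log2_intensities)
```

Zeros are mapped to NaN before the transform because log2(0) is undefined, which is one way to avoid ending up with infinite values in the file.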

MaxQuant

For MaxQuant label-free quantification (LFQ) output, the following columns are parsed:

Variable name	Column name/Prefix	Comment
proteinId	`Majority protein IDs`
geneName	`Gene names`
intensityPrefix	`LFQ Intensity <sample>`
Imputed Int. prefix		calculated by amica
abundancePrefix	`iBAQ <sample>`
razorUniqueCount	`Razor + unique peptides`	specific column of summarized razor+unique count
razorUniquePrefix	`Razor + unique peptides <sample>`	corresponds to razor+unique count of a sample
spectraCount	`MS/MS count`
contaminantCol	`Potential contaminant`

amica automatically filters out reverse hits and proteins only identified by site.

FragPipe

For FragPipe/Philosopher LFQ output, the following columns are parsed:

Variable name	Column name/Prefix or Suffix	Comment
Default parameters

proteinId	`Protein ID`
geneName	`Gene Names`
intensityPrefix	`<sample> Razor Intensity`
Imputed Int. prefix		calculated by amica
abundancePrefix
razorUniqueCount	`Unique Stripped Peptides`
razorUniquePrefix	`<sample> Razor Spectral Count`
spectraCount	`Summarized Razor Spectral Count`

FragPipe v16 (MSFragger v3.3, Philosopher v4.0.0)

proteinId	`Protein ID`
geneName	`Gene Names`
intensityPrefix	`<sample> Intensity`
Imputed Int. prefix		calculated by amica
abundancePrefix
razorUniqueCount	`Combined Total Peptides`
razorUniquePrefix	`<sample> Razor Spectral Count`
spectraCount	`Combined Spectral Count`

FragPipe v17 (MSFragger v3.4, Philosopher v4.1.0)

proteinId	`Protein ID`
geneName	`Gene`
intensityPrefix	`<sample> MaxLFQ Intensity`
Imputed Int. prefix		calculated by amica
abundancePrefix
razorUniqueCount	`Combined Total Peptides`
razorUniquePrefix	`<sample> Razor Spectral Count`
spectraCount	`Combined Spectral Count`

For FragPipe/Philosopher TMT [abundance/ratio]_protein_[normalization].tsv output, the following columns are parsed:

Variable name	Column name/Prefix or Suffix	Comment
proteinId	`ProteinID`
geneName	`Index`
intensityPrefix	`<sample>`	There is no prefix.
spectraCount	`NumberPSM`

Spectronaut

For Spectronaut's PG report, the following columns are parsed:

Variable name	Column name/Prefix or Suffix	Comment
proteinId	`PG ProteinAccessions`
geneName	`PG Genes`
intensityPrefix	`PG Quantity <sample>`
razorUniqueCount	`PG RunEvidenceCount`	non-mandatory
razorUniquePrefix	`PG NrOfPrecursorsIdentified <sample>`	non-mandatory

DIA-NN

For DIA-NN's PG matrix, the following columns are parsed:

Variable name	Column name/Prefix or Suffix	Comment
proteinId	`Protein Group`
geneName	`Genes`
intensityPrefix	`<sample>`	There is no prefix.

Design

The design file has two columns: samples and groups. The sample names in the samples column need to match the column names of the input file in the order of the input file.

groups	samples
group1	group1_sample_1
group1	group1_sample_2
group1	group1_sample_3
group2	group2_sample_1
group2	group2_sample_2
group2	group2_sample_3
group3	group3_sample_1
group3	group3_sample_2
group3	group3_sample_3
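
Before uploading, the ordering requirement can be checked with a few lines of code. The sketch below is hypothetical and not part of amica. It assumes that a sample name is whatever follows the intensity prefix in a column name, and it uses `ImputedIntensity_` and a shortened three-sample design purely as an example.

```python
import csv
import io

# Shortened example design (tab-separated, with a header row).
DESIGN_TSV = (
    "groups\tsamples\n"
    "group1\tgroup1_sample_1\n"
    "group1\tgroup1_sample_2\n"
    "group2\tgroup2_sample_1\n"
)


def read_design(text):
    """Return (group, sample) pairs in file order."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [(row["groups"], row["samples"]) for row in reader]


def samples_in_header(header, prefix):
    """Sample names taken from the columns that start with `prefix`, in file order."""
    return [col[len(prefix):] for col in header if col.startswith(prefix)]


# Hypothetical header of the input file.
header = [
    "Majority.protein.IDs",
    "Gene.names",
    "ImputedIntensity_group1_sample_1",
    "ImputedIntensity_group1_sample_2",
    "ImputedIntensity_group2_sample_1",
]

design = read_design(DESIGN_TSV)
design_samples = [sample for _, sample in design]

# True only if the names agree and appear in the same order.
order_matches = samples_in_header(header, "ImputedIntensity_") == design_samples
```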

Contrast matrix

The contrast matrix tells amica which group comparisons to perform. The column names of this file can be chosen freely, but column names must be provided. For each row in this file, the comparison group1-group2 is performed. To change the sign of the fold changes, switch the position of the groups in the file (e.g. group2-group1 instead of group1-group2).

group1	group2
group1	group2
group1	group3
group2	group3
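
To make the direction of each comparison concrete, the sketch below reads the example contrast matrix and builds one name per row using the `__vs__` infix from the amica format. That each contrast row corresponds to a comparison named this way is an assumption drawn from the column naming in the amica format table, and the function names are hypothetical.

```python
import csv
import io

INFIX = "__vs__"

# The example contrast matrix; the first row holds the (freely chosen) column names.
CONTRASTS_TSV = (
    "group1\tgroup2\n"
    "group1\tgroup2\n"
    "group1\tgroup3\n"
    "group2\tgroup3\n"
)


def read_contrasts(text):
    """Return one (first_group, second_group) tuple per data row."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    return [tuple(row) for row in rows[1:] if row]  # skip the header row


def comparison_name(first, second):
    """Name of the comparison first-second, e.g. 'group1__vs__group2'."""
    return f"{first}{INFIX}{second}"


contrasts = read_contrasts(CONTRASTS_TSV)
names = [comparison_name(first, second) for first, second in contrasts]

# Swapping the two groups in a row reverses the comparison (and the fold-change sign).
flipped = comparison_name(*reversed(contrasts[0]))
```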

Custom tab-delimited input

Specification file

The specification file needs to be uploaded if a custom tab-delimited file is analyzed. The file has two columns, Variable and Pattern, which are used to set the prefixes (or suffixes) that identify the relevant columns in your data.

Following columns can be parsed:

Variable	Pattern	Mandatory
proteinId	...	yes
geneName	...	yes
intensityPrefix	...	yes
abundancePrefix	...	no
razorUniqueCount	...	no
razorUniquePrefix	...	no
spectraCount	...	no
contaminantCol	...	no

The proteinId column must contain only unique entries. If razorUniqueCount is missing, some functionality will be lost (DEqMS). It is important that the provided intensities are not log2-transformed. An example format is provided in the examples.zip file.

An example specification file is provided here (the corresponding custom file can be downloaded in amica's Input tab or from the file examples.zip):

Variable	Pattern
proteinId	Majority.protein.IDs
geneName	Gene.names
spectraCount	spectraCount
razorUniqueCount	razorUniqueCount
razorUniqueCountPrefix	razorUniqueCount_
abundancePrefix	iBAQ
intensityPrefix	LFQIntensity_
contaminantCol	Potential.contaminant
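
A specification file can be checked for the mandatory variables before uploading. The sketch below is a hypothetical helper, not amica's own validation. It takes the mandatory variables from the table above and uses a shortened version of the example file.

```python
import csv
import io

# Mandatory variables according to the table above.
MANDATORY = ("proteinId", "geneName", "intensityPrefix")

# Shortened example specification (tab-separated, with a header row).
SPEC_TSV = (
    "Variable\tPattern\n"
    "proteinId\tMajority.protein.IDs\n"
    "geneName\tGene.names\n"
    "intensityPrefix\tLFQIntensity_\n"
    "contaminantCol\tPotential.contaminant\n"
)


def read_specification(text):
    """Map each Variable to its Pattern."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Variable"]: row["Pattern"] for row in reader}


def missing_mandatory(spec):
    """Mandatory variables that are absent or have an empty pattern."""
    return [name for name in MANDATORY if not spec.get(name)]


spec = read_specification(SPEC_TSV)
missing = missing_mandatory(spec)  # empty list when the file is complete
```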

How to convert a tab-separated file into amica format

If you want to upload data into amica that have already been analyzed in a different tool or context (e.g. data from RNA-Seq), you need to change the column names of your file to amica's column names.

The following example demonstrates how to do this:

uniqueID	Gene	logExpr_sample_1	logExpr_sample_2	...	logExpr_sample_n	pval_trtmt/ctrl	padj_trtmt/ctrl	logfc_trtmt/ctrl
id_1	Gene_1	30	30.5	...	28.2	0.00012	0.002	1.7
id_2	Gene_2	28.6	28.5	...	26.9	0.0002	0.003	1.68
...	...	...	...	...	...	...	...	...
id_p	Gene_p	20	20.3	...	18	0.99	0.99	-0.02

The uniqueID column needs to be renamed to Majority.protein.IDs,

the Gene column to Gene.names

and all logExpr_ prefixes need to be replaced by ImputedIntensity_ (e.g. ImputedIntensity_sample_1, ImputedIntensity_sample_2, ..., ImputedIntensity_sample_n).

Columns containing the results of the differential expression analysis (pval_trtmt/ctrl, padj_trtmt/ctrl, logfc_trtmt/ctrl) need to be adapted so that they contain the correct prefixes and the __vs__ infix.

pval_trtmt/ctrl has to be changed to P.Value_trtmt__vs__ctrl,

padj_trtmt/ctrl to adj.P.Val_trtmt__vs__ctrl and

logfc_trtmt/ctrl to logFC_trtmt__vs__ctrl.

Furthermore, you could specify a quantified column that contains for each entry a + if it has been quantified, else it needs to be left empty. If none is provided, amica automatically creates one and sets a + in the quantified column for all entries that do not contain NAs in the ImputedIntensity and __vs__ - infix columns.

The data now look like this:

Majority.protein.IDs	Gene.names	ImputedIntensity_sample_1	ImputedIntensity_sample_2	...	ImputedIntensity_sample_n	P.Value_trtmt__vs__ctrl	adj.P.Val_trtmt__vs__ctrl	logFC_trtmt__vs__ctrl	quantified
id_1	Gene_1	30	30.5	...	28.2	0.00012	0.002	1.7	+
id_2	Gene_2	28.6	28.5	...	26.9	0.0002	0.003	1.68	+
...	...	...	...	...	...	...	...	...	...
id_p	Gene_p	20	20.3	...	18	0.99	0.99	-0.02	+

Save this file as a tab-separated txt/tsv file (you can choose any file name; amica's own output is named amica_protein_groups.txt by default).
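
The renaming steps and the fallback rule for the quantified column can be sketched in a few lines. This is an illustration for the example columns only, not amica's code: the function names are hypothetical, and the set of strings treated as missing (`""`, `NA`, `NaN`) is an assumption.

```python
INFIX = "__vs__"

# Columns that are renamed as a whole.
EXACT = {"uniqueID": "Majority.protein.IDs", "Gene": "Gene.names"}

# Old prefix -> amica prefix, for the example file above.
PREFIXES = {
    "logExpr_": "ImputedIntensity_",
    "pval_": "P.Value_",
    "padj_": "adj.P.Val_",
    "logfc_": "logFC_",
}

# Strings treated as missing values (assumption).
MISSING = {"", "NA", "NaN"}


def to_amica_name(column):
    """Translate one column name of the example file into amica's naming."""
    if column in EXACT:
        return EXACT[column]
    for old, new in PREFIXES.items():
        if column.startswith(old):
            rest = column[len(old):]
            if old != "logExpr_":
                # Comparison columns: 'trtmt/ctrl' becomes 'trtmt__vs__ctrl'.
                rest = rest.replace("/", INFIX)
            return new + rest
    return column  # anything else is left unchanged


def quantified_flag(row):
    """'+' if no ImputedIntensity_ or __vs__ column of the row is missing, else ''."""
    checked = [
        value for name, value in row.items()
        if name.startswith("ImputedIntensity_") or INFIX in name
    ]
    return "+" if all(value not in MISSING for value in checked) else ""


old_header = ["uniqueID", "Gene", "logExpr_sample_1",
              "pval_trtmt/ctrl", "padj_trtmt/ctrl", "logfc_trtmt/ctrl"]
new_header = [to_amica_name(column) for column in old_header]

row = dict(zip(new_header, ["id_1", "Gene_1", "30", "0.00012", "0.002", "1.7"]))
row["quantified"] = quantified_flag(row)
```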

Finally, we need to create a tab-separated experimental design that assigns the samples to their appropriate group. Here it is important to link the samples to the p-value and fold-change columns through the group comparison infixes (e.g. logFC_trtmt__vs__ctrl corresponds to the group comparison trtmt vs ctrl). All groups from the group comparison infixes need to be defined in the experimental design. If you have multiple Intensity prefixes in your amica file, it is important that all of them have the same number of samples. The sample names in the samples column of the design need to match the column names of the input file in the order of the input file.

groups	samples
trtmt	sample_1
trtmt	sample_2
trtmt	sample_3
ctrl	sample_4
ctrl	sample_5
ctrl	sample_6

Save this file as a tab-separated txt/tsv file (you can choose any file name). Now you can upload both files to analyze and visualize your data in amica.
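
A quick consistency check before uploading is to confirm that every group named in a comparison column is also defined in the design. The sketch below is a hypothetical helper, not part of amica. It assumes the comparison columns use the prefixes listed in the amica format table.

```python
INFIX = "__vs__"

# Prefixes of comparison columns, taken from the amica format table.
COMPARISON_PREFIXES = ("P.Value_", "adj.P.Val_", "logFC_", "AveExpr_")


def groups_in_comparisons(header):
    """Collect all group ids found before and after the __vs__ infix."""
    groups = set()
    for column in header:
        if INFIX not in column:
            continue
        for prefix in COMPARISON_PREFIXES:
            if column.startswith(prefix):
                left, right = column[len(prefix):].split(INFIX, 1)
                groups.update((left, right))
                break
    return groups


# Header of the converted example file (sample columns shortened).
header = [
    "Majority.protein.IDs", "Gene.names",
    "ImputedIntensity_sample_1", "ImputedIntensity_sample_4",
    "P.Value_trtmt__vs__ctrl", "adj.P.Val_trtmt__vs__ctrl",
    "logFC_trtmt__vs__ctrl", "quantified",
]

# (group, sample) pairs from the experimental design.
design = [("trtmt", "sample_1"), ("ctrl", "sample_4")]

design_groups = {group for group, _ in design}
undefined_groups = groups_in_comparisons(header) - design_groups  # empty if consistent
```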
