Pierre Lindenbaum edited this page May 13, 2015 · 17 revisions

##Motivation

A JavaScript-based reformatter for bioinformatics files, built on the Java Rhino engine ( http://en.wikipedia.org/wiki/Rhino_%28JavaScript_engine%29 ). It works something like awk for VCF, BAM, SAM, FASTQ, FASTA, etc.

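The program injects variables into the scope of the script, depending on the input format.

###VCF

For VCF, the example script on this page relies on three injected variables: `header` (the VCF header), `iter` (an iterator over the variants) and `out` (the output stream).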
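As a sketch of the awk-like pattern, the following counts records per chromosome by looping over an iterator with the same `hasNext()`/`next()` shape as the injected `iter`. The names `countByChrom` and `mockIter` are hypothetical and not part of BioAlcidae; the mock only exists so the snippet can be tried outside the tool. The `chr` property is the one read by the VCF example further down this page.

```javascript
// Hypothetical helper: count records per chromosome.
// `iter` must expose hasNext() and next(), like the injected iterator.
function countByChrom(iter) {
	var counts = {};
	while (iter.hasNext()) {
		var ctx = iter.next();
		var chrom = String(ctx.chr); // same property the VCF example reads
		counts[chrom] = (counts[chrom] || 0) + 1;
	}
	return counts;
}

// Stand-in for the injected `iter` (an assumption for illustration only).
function mockIter(records) {
	var i = 0;
	return {
		hasNext: function () { return i < records.length; },
		next: function () { return records[i++]; }
	};
}

var counts = countByChrom(mockIter([{ chr: "22" }, { chr: "22" }, { chr: "X" }]));
```

Inside BioAlcidae one would presumably call `countByChrom(iter)` directly and write each entry with `out.println(...)` instead of using the mock.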

##Compilation

See also Compilation.

$ make bioalcidae

##Synopsis

$ java -jar dist/bioalcidae.jar [options] (stdin|file1 file2 ... fileN|file.list) 

##Options

| Option | Description |
|---|---|
| -f (file) | javascript file |
| -e (expression) | javascript expression |
| -o (file) | output file. Default: stdout |
| -F (format) | [VCF, SAM, BAM, FASTA, FASTQ] optional. Required when reading stdin |
| -h | get help (this screen) and exit. |
| -v | print version and exit. |
| -L (level) | log level. One of java.util.logging.Level . Optional. |

##Source Code

Main code is: https://github.com/lindenb/jvarkit/blob/master/src/main/java/com/github/lindenb/jvarkit/tools/bioalcidae/BioAlcidae.java

##Example

###VCF

Reformatting a VCF: we want to convert a VCF into a tab-delimited table with the following header:

CHROM POS REF ALT GENOTYPE_SAMPLE1 GENOTYPE_SAMPLE2 ... GENOTYPE_SAMPLEN

We use the following JavaScript file:

// Header line: fixed columns, then one column per sample.
var samples = header.sampleNamesInOrder;
out.print("CHROM\tPOS\tREF\tALT");
for(var i=0;i< samples.size();++i)
	{
	out.print("\t"+samples.get(i));
	}
out.println();

// One output row per variant.
while(iter.hasNext())
	{
	var ctx = iter.next();
	// Skip variants that do not have exactly one ALT allele.
	if(ctx.alternateAlleles.size()!=1) continue;
	out.print(ctx.chr +"\t"+ctx.start+"\t"+ctx.reference.displayString+"\t"+ctx.alternateAlleles.get(0).displayString);
	for(var i=0;i< samples.size();++i)
		{
		var g = ctx.getGenotype(samples.get(i));

		out.print("\t");

		// Genotype code: 0 = hom-ref, 2 = hom-var, 1 = het, -9 = anything else.
		if(g.isHomRef())
			{
			out.print("0");
			}
		else if(g.isHomVar())
			{
			out.print("2");
			}
		else if(g.isHet())
			{
			out.print("1");
			}
		else
			{
			out.print("-9");
			}
		}
	out.println();
	}
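
Running the script, saved as jeter.js, on a 1000 Genomes chromosome 22 VCF read from stdin (hence -F vcf), keeping the first 5 lines and the first 10 columns:

$ curl -s  "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz" | \
gunzip -c | java -jar ./dist/bioalcidae.jar -f jeter.js -F vcf | head -n 5 | cut -f 1-10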

CHROM	POS	REF	ALT	HG00096	HG00097	HG00099	HG00100	HG00101	HG00102
22	16050075	A	G	0	0	0	0	0	0
22	16050115	G	A	0	0	0	0	0	0
22	16050213	C	T	0	0	0	0	0	0
22	16050319	C	T	0	0	0	0	0	0
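
The genotype columns in this output follow the branches of the script: 0 for homozygous reference, 1 for heterozygous, 2 for homozygous variant, and -9 for anything else (for example a no-call). A minimal sketch that isolates this mapping in a helper, so it can be reused or tried outside the tool. `genotypeCode` and `mockGenotype` are hypothetical names, and the mock is only an assumption-level stand-in for the genotype objects the tool provides.

```javascript
// Hypothetical helper: map a genotype to the code printed by the script.
// `g` must expose isHomRef(), isHomVar() and isHet(), as in the example.
function genotypeCode(g) {
	if (g.isHomRef()) return "0";
	if (g.isHomVar()) return "2";
	if (g.isHet()) return "1";
	return "-9"; // anything else, for example a no-call
}

// Mock genotype for illustration only (not a class provided by the tool).
function mockGenotype(kind) {
	return {
		isHomRef: function () { return kind === "homref"; },
		isHomVar: function () { return kind === "homvar"; },
		isHet: function () { return kind === "het"; }
	};
}

var codes = ["homref", "het", "homvar", "nocall"].map(function (k) {
	return genotypeCode(mockGenotype(k));
});
```

In the script itself, the if/else chain inside the sample loop could then be replaced by `out.print(genotypeCode(g));`.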

##Contribute

##See also

##History

* 2015 : Creation

##License

The project is licensed under the MIT license.
