Commit

Fixes after the last workshop
pmitev committed Sep 1, 2022
1 parent 9355457 commit ad50761
Showing 4 changed files with 5 additions and 3 deletions.
2 changes: 1 addition & 1 deletion docs/1.Simple_example.md
@@ -153,7 +153,7 @@ $ awk ' /pattern/ {action} ' file1 file2 ... fileN
- remove the minus sign in the `%-8s` formatting to see the effect.
- more string manipulations [exercises](Exercises/String_manipulation.md)

More on format modifiers: [gawk documentation](https://https://www.gnu.org/software/gawk/manual/html_node/Format-Modifiers.html#Format-Modifiers)
More on format modifiers: [gawk documentation](https://www.gnu.org/software/gawk/manual/html_node/Format-Modifiers.html#Format-Modifiers)
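
As a quick illustration of the `%-8s` exercise above, here is a minimal sketch of what the minus sign does. The string `gold` and the square brackets are only there to make the padding visible; they are assumptions for this example, not part of the course files.

```shell
# %-8s pads the string on the right (left-justified, width 8);
# %8s pads it on the left (right-justified, width 8).
awk 'BEGIN {
  printf "[%-8s]\n", "gold"   # prints [gold    ]
  printf "[%8s]\n",  "gold"   # prints [    gold]
}'
```

The `BEGIN` block runs without any input file, which makes it handy for trying out format strings.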

!!! example "Files"
* [coins.txt](data/coins.txt)
2 changes: 1 addition & 1 deletion docs/3.Shell_we_awk.md
@@ -90,7 +90,7 @@ Let's use the output from the Gaussian code as an example (*something from my re
2O,std. LJ params. for H2O, MC/QM\\-1,1\O,0,0.,0.,0.\H,0,-0.836605,-0.
. . . ( more lines ) . . .
```
The numbers that I am interested in are in bold. There are **56 such pairs** in the whole file. I need them tabulated in simple, two-column file that is easy to read, analyze and plot. Here I will not discuss other solutions. Instead, here is a possible awk solution:
The numbers that I am interested in are in bold. There are **56 such pairs** in the whole [file](https://github.com/pmitev/to-awk-or-not/raw/master/docs/data/gaussian.out). I need them tabulated in simple, two-column file that is easy to read, analyze and plot. Here I will not discuss other solutions. Instead, here is a possible awk solution:

``` awk title="extract-gaussian.awk"
#!/usr/bin/awk -f
2 changes: 1 addition & 1 deletion docs/6.One_line_programs.md
@@ -5,6 +5,6 @@ Many useful awk programs are as short as just a line or two. Here is a collectio
Here I realized that it is better to post some links and just mention some of my favorites, perhaps.

* [The best AWK one-liners](http://tuxgraphics.org/~guido/scripts/awk-one-liner.html)
* [AWK one-liners](http://www.softpanorama.org/Tools/Awk/awk_one_liners.shtml) by Softpanorama
* [awk one-liners](https://nixshell.wordpress.com/2009/04/01/awk-one-liners/) by *nix shell
* [Handy One-line Scripts for AWK](https://www.pement.org/awk/awk1line.txt) compiled by Eric Pement

2 changes: 2 additions & 0 deletions docs/Case_studies/manipulating_vcf.md
@@ -117,6 +117,8 @@ Count and sort the different genomic features in chromosome 4 by number.
}
```

??? "_Solution_ proposed by Loïs Rancilhac - 2022.08.30"
`awk '/_SNP/ {SNP++; print $0 > "chr4_SNPs.vcf"} /_DEL/ {DEL++; print $0 > "chr4_DEL.vcf"; LENGTH=length($4)-length($5); print LENGTH > "Deletions_lengths.txt"} /_INS/ {INS++; print $0 > "chr4_INS.vcf"; LENGTH=length($5)-length($4); print LENGTH > "Insertions_lengths.txt"} END{print "SNPs: "SNP"\nInsertions: "INS"\nDeletions: "DEL}' chr4.vcf`
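
The one-liner above is easier to follow when each pattern sits on its own line. The sketch below applies the same logic to a tiny made-up input: the three records, the file name `demo.vcf`, and the `demo_` output names are all hypothetical and chosen so that nothing in a real working directory is overwritten. It assumes, as the original does, that column 4 holds the reference allele and column 5 the alternative allele.

```shell
# Hypothetical three-record sample (CHROM POS ID REF ALT), not the workshop data.
printf '4\t100\t4_100_SNP\tA\tG\n'    >  demo.vcf
printf '4\t200\t4_200_DEL\tACGT\tA\n' >> demo.vcf
printf '4\t300\t4_300_INS\tT\tTGG\n'  >> demo.vcf

awk '
# Each pattern counts its matches and copies the record to its own file.
/_SNP/ { snp++; print > "demo_SNPs.vcf" }
/_DEL/ { del++; print > "demo_DEL.vcf"
         len = length($4) - length($5)   # deletion: REF is longer than ALT
         print len > "demo_del_lengths.txt" }
/_INS/ { ins++; print > "demo_INS.vcf"
         len = length($5) - length($4)   # insertion: ALT is longer than REF
         print len > "demo_ins_lengths.txt" }
# %d prints 0 for a counter that never matched.
END    { printf "SNPs: %d\nInsertions: %d\nDeletions: %d\n", snp, ins, del }
' demo.vcf
```

With this sample each counter ends at 1, and the two length files contain 3 and 2 respectively.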

#### *Follow-up task:*
Print nucleotide substitution that these SNPs introduce sorted by number. Remember the coins...
