Updated readme for Krona (#28)
jaebeom-kim authored Jul 5, 2023
1 parent d797b6c commit 58ed958
Showing 2 changed files with 23 additions and 8 deletions.
14 changes: 9 additions & 5 deletions README.md
@@ -83,11 +83,11 @@ metabuli classify --seq-mode 1 read.fna dbdir outdir jobid
--spacing-mask : Binary patterend mask for spaced k-mer. The same mask must be used for DB creation and classification. A mask should contain at least eight '1's, and '0' means skip.
* --min-score and --min-sp-score for precision mode are optimized only for short reads.
-* We don't recommend use them for long reads.
+* We don't recommend using them for long reads.
```
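The `--spacing-mask` rules stated above (a binary string with at least eight '1's, where '0' means skip) can be illustrated with a small check. This is only a sketch: `isValidSpacingMask` and `applySpacingMask` are hypothetical helper names, not Metabuli functions, and how Metabuli applies the mask internally is not shown in this diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Metabuli function): checks the two rules the
// README states for a mask: only '0'/'1' characters, and at least eight '1's.
bool isValidSpacingMask(const std::string &mask) {
    const bool binaryOnly = std::all_of(mask.begin(), mask.end(),
                                        [](char c) { return c == '0' || c == '1'; });
    const auto ones = std::count(mask.begin(), mask.end(), '1');
    return binaryOnly && ones >= 8;
}

// Hypothetical illustration of "'0' means skip": keeps the characters of
// `window` at the positions where the mask has '1'.
// Assumption: `window` and `mask` have the same length.
std::string applySpacingMask(const std::string &window, const std::string &mask) {
    assert(window.size() == mask.size());
    std::string kept;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (mask[i] == '1') {
            kept.push_back(window[i]);
        }
    }
    return kept;
}
```

Under these assumptions, a mask such as `1101101101` is rejected because it contains only seven '1's, while `1101101101101` passes with nine.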

-This will generate two result files: `Job ID_classifications.tsv` and `Job ID_report.tsv`
-#### Job ID_classifications.tsv
+This will generate two result files: `JobID_classifications.tsv`, `JobID_report.tsv`, and `JobID_krona.html`.
+#### JobID_classifications.tsv
1. Classified or not
2. Read ID
3. Taxonomy identifier
@@ -105,8 +105,8 @@ This will generate two result files: `Job ID_classifications.tsv` and `Job ID_re
0 read_3 0 294 0 0 0 no rank
```

-#### Job ID_report.tsv
-Proportion of reads that are assigned to each taxon.
+#### JobID_report.tsv
+The proportion of reads that are assigned to each taxon.
```
#Example
33.73 77571 77571 0 no rank unclassified
@@ -125,6 +125,10 @@ Proportion of reads that are assigned to each taxon.
0.01 24 24 170539 subspecies RS_GCF_000204275.1
```
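To make the report rows above easier to consume programmatically, here is a minimal parsing sketch. It rests on assumptions the README excerpt does not confirm: that columns are tab-separated (as the `.tsv` extension suggests) and that the six columns are, in order, percentage, clade read count, taxon read count, taxonomy ID, rank, and name. `ReportRow` and `parseReportLine` are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record type; the field names are this sketch's own and the
// column meanings are an assumption, not taken from the README.
struct ReportRow {
    double percent;
    long cladeReads;
    long taxonReads;
    int taxId;
    std::string rank;
    std::string name;
};

// Parses one tab-separated report line into `row`.
// Returns false for empty lines, comment lines starting with '#',
// lines without exactly six columns, and non-numeric numeric columns.
bool parseReportLine(const std::string &line, ReportRow &row) {
    if (line.empty() || line[0] == '#') {
        return false;
    }
    std::vector<std::string> cols;
    std::stringstream ss(line);
    std::string col;
    while (std::getline(ss, col, '\t')) {
        cols.push_back(col);
    }
    if (cols.size() != 6) {
        return false;
    }
    try {
        row.percent = std::stod(cols[0]);
        row.cladeReads = std::stol(cols[1]);
        row.taxonReads = std::stol(cols[2]);
        row.taxId = std::stoi(cols[3]);
    } catch (const std::exception &) {
        return false;
    }
    row.rank = cols[4];
    // Drop leading spaces, in case names are indented by taxonomic depth.
    const auto start = cols[5].find_first_not_of(' ');
    row.name = (start == std::string::npos) ? std::string() : cols[5].substr(start);
    return true;
}
```

Splitting on tabs rather than whitespace matters here because a rank such as `no rank` contains a space.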

+#### JobID_krona.html
+It is for an interactive taxonomy report (Krona). You can use any modern web browser to open `JobID_krona.html`.
+<p align="left"><img src="https://raw.githubusercontent.com/steineggerlab/Metabuli/master/.github/image.png" height="350" /></p>
+
#### Resource requirements
Metabuli can classify reads against a database of any size as long as the database is fits in the hard disk, regardless of the machine's RAM size.
We tested it with a MacBook Air (2020, M1, 8 GiB), where we classified about 1.5 M paired-end 150 bp reads (~5 GiB in size) against a database built with ~23K prokaryotic genomes (~69 GiB in size)
17 changes: 14 additions & 3 deletions src/commons/Classifier.cpp
@@ -1,5 +1,7 @@
#include "Classifier.h"
#include "LocalParameters.h"
+#include "krona_prelude.html.h"
+#include "taxonomyreport.cpp"
#include <ctime>

Classifier::Classifier(LocalParameters & par) : maskMode(par.maskMode), maskProb(par.maskProb) {
@@ -341,8 +343,9 @@ void Classifier::startClassify(const LocalParameters &par) {
cout << "The number of matches: " << totalMatchCnt << endl;
readClassificationFile.close();


// Write report files
-writeReportFile(outDir + "/" + jobId + "_report.tsv", numOfSeq, taxCounts);
+writeReportFile(outDir, numOfSeq, taxCounts);

// Memory deallocation
free(matchBuffer.buffer);
@@ -2074,12 +2077,20 @@ void Classifier::writeReadClassification(const vector<Query> & queryList, int qu
}
}

-void Classifier::writeReportFile(const string &reportFileName, int numOfQuery, unordered_map<TaxID, unsigned int> &taxCnt) {
+void Classifier::writeReportFile(const string &outdir, int numOfQuery, unordered_map<TaxID, unsigned int> &taxCnt) {
unordered_map<TaxID, TaxonCounts> cladeCounts = taxonomy->getCladeCounts(taxCnt);
FILE *fp;
-fp = fopen(reportFileName.c_str(), "w");
+fp = fopen((outdir + + "/" + jobId + "_report.tsv").c_str(), "w");
writeReport(fp, cladeCounts, numOfQuery);
fclose(fp);
+
+// Write Krona chart
+FILE *kronaFile = fopen((outDir + "/" + jobId + "_krona.html").c_str(), "w");
+fwrite(krona_prelude_html, krona_prelude_html_len, sizeof(char), kronaFile);
+fprintf(kronaFile, "<node name=\"all\"><magnitude><val>%zu</val></magnitude>", numOfQuery);
+kronaReport(kronaFile, *taxonomy, cladeCounts, numOfQuery);
+fprintf(kronaFile, "</node></krona></div></body></html>");
+
}

void Classifier::writeReport(FILE *FP, const std::unordered_map<TaxID, TaxonCounts> &cladeCounts,
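
A few details in the `writeReportFile` hunk above may be worth a second look: `numOfQuery` is an `int` but is printed with `%zu`, which expects a `size_t`; no `fclose(kronaFile)` appears in the lines shown; and neither `fopen` result is checked. The following is a minimal sketch of the same writing steps with those points handled. `outputPath` and `writeKronaShell` are hypothetical helpers, the prelude is passed in as a plain string instead of the embedded `krona_prelude_html` array, and the `kronaReport` call is left as a comment because its implementation is not part of this diff.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds "<outDir>/<jobId><suffix>", e.g. the
// "_report.tsv" or "_krona.html" paths used in the hunk above.
std::string outputPath(const std::string &outDir, const std::string &jobId,
                       const std::string &suffix) {
    return outDir + "/" + jobId + suffix;
}

// Hypothetical helper: writes the prelude, the opening "all" node with the
// query count, and the closing tags. Returns false if the file cannot be
// opened or closed cleanly.
bool writeKronaShell(const std::string &path, const std::string &prelude,
                     int numOfQuery) {
    FILE *fp = std::fopen(path.c_str(), "w");
    if (fp == nullptr) {
        std::perror(path.c_str());
        return false;
    }
    std::fwrite(prelude.data(), sizeof(char), prelude.size(), fp);
    // %d matches the int argument; %zu would require a size_t.
    std::fprintf(fp, "<node name=\"all\"><magnitude><val>%d</val></magnitude>",
                 numOfQuery);
    // In the real code, kronaReport(...) would emit the child nodes here.
    std::fputs("</node></krona></div></body></html>", fp);
    // Closing flushes buffered output and releases the file handle.
    return std::fclose(fp) == 0;
}
```

Returning a `bool` lets the caller decide whether a missing Krona chart should abort the run or only produce a warning; that policy is a design choice the commit itself does not address.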