pbfrandsen/metazoa_assemblies

Metazoa Assemblies

This repository houses the scripts for the manuscript, "Towards a genome sequence for every animal: where are we now?"

dedup_list_tab.py takes as input genome assembly metadata harvested from NCBI with the NCBI datasets tool, in CSV format. The CSV must be sorted first by taxid and then by contig N50. For each taxon, the script selects the assembly with the longest contig N50. It also reports whether an annotation exists on NCBI for that species (for any of its assemblies).
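The selection step described above can be illustrated with a minimal sketch. This is not the code from dedup_list_tab.py. The column names (`taxid`, `accession`, `contig_n50`, `annotation_name`) and the example rows are hypothetical, and the annotation check is approximated here from a column in the same CSV, which may differ from how the script actually determines it.

```python
import csv
import io
from itertools import groupby

# Hypothetical column names; the real NCBI datasets export may use others.
TAXID, N50, ACCESSION, ANNOTATION = "taxid", "contig_n50", "accession", "annotation_name"


def best_assembly_per_taxon(rows):
    """Yield one record per taxid: the assembly with the largest contig N50.

    Assumes rows for the same taxid are adjacent (e.g. the CSV is sorted by taxid).
    """
    for taxid, group in groupby(rows, key=lambda r: r[TAXID]):
        group = list(group)
        best = max(group, key=lambda r: int(r[N50]))
        # True if any assembly of this taxon has a non-empty annotation field.
        annotated = any(r[ANNOTATION].strip() for r in group)
        yield {
            "taxid": taxid,
            "accession": best[ACCESSION],
            "contig_n50": int(best[N50]),
            "has_annotation": annotated,
        }


# Made-up rows for illustration only; not real NCBI records.
example_csv = """taxid,accession,contig_n50,annotation_name
1001,ASM_A,50000,
1001,ASM_B,1200000,Annotation Release 100
1002,ASM_C,300000,
"""

results = list(best_assembly_per_taxon(csv.DictReader(io.StringIO(example_csv))))
for record in results:
    print(record)
```

Grouping with `itertools.groupby` relies on the same precondition the README states: rows for a given taxid must be adjacent, which sorting by taxid guarantees.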

scrape_assembly_info.py is a web scraper built on Beautiful Soup that collects metadata not included in the standard NCBI datasets output. Its only required input is an assembly accession number.
