Requirements

Scripts and config file to make a sequence alignment from a set of GPCR protein sequences.

See snooker_align.vsd for workflow.

The gapped alignment is based on the gpcrdb human swissprot alignment.

Steps to get an alignment

1. Create numbering schema
2. Download human swissprot alignment csv from gpcrdb website
3. Reduce the csv to only the positions of the numbering schema
4. Create fasta from csv
5. Run blast with the seed alignment as query against swissprot/trembl/ensembl
   5.1 Make sure all seed sequences have been found
6. Retrieve sequences of ids
7. Make sequences unique within same species
8. Remove species with fewer than 100 sequences
9. Run per-TM alignment script
10. Remove sequences that differ by fewer than 9 aa from another sequence within the same species
11. Remove species with fewer than 100 sequences
12. Make tree of sequences
13. Generate entropy file based on tree
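
The filtering steps (unique within species, minimum species size, minimum difference within species) can be illustrated with a small sketch. This is not the code in the scripts directory. The record layout `(seq_id, species, sequence)`, the function names, and the greedy keep-first strategy are all assumptions made for illustration. The difference count assumes the sequences are already aligned to equal length.

```python
from collections import defaultdict


def unique_within_species(records):
    """Keep the first record for each (species, sequence) pair.

    records: iterable of (seq_id, species, sequence) tuples (assumed layout).
    """
    seen = set()
    kept = []
    for seq_id, species, sequence in records:
        key = (species, sequence)
        if key not in seen:
            seen.add(key)
            kept.append((seq_id, species, sequence))
    return kept


def drop_small_species(records, min_count=100):
    """Remove all records of species that have fewer than min_count sequences."""
    counts = defaultdict(int)
    for _, species, _ in records:
        counts[species] += 1
    return [rec for rec in records if counts[rec[1]] >= min_count]


def hamming(a, b):
    """Number of differing positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def drop_near_duplicates(records, min_diff=9):
    """Greedily keep a record only if it differs by at least min_diff
    positions from every record already kept for the same species."""
    kept_by_species = defaultdict(list)
    kept = []
    for rec in records:
        _, species, sequence = rec
        if all(hamming(sequence, other) >= min_diff
               for other in kept_by_species[species]):
            kept_by_species[species].append(sequence)
            kept.append(rec)
    return kept
```

Because the near-duplicate filter is greedy, the result depends on input order. The size filter would be applied again afterwards, as in the steps above.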

See runs.md for commands to perform the steps.
