degen
Find genes where a sequence has degenerate bases.
How-to
Usage:
degen.py <fasta> <options>
Options:
--gb-id=<accession_id>   Accession id for reference
--gb-file=<gbfile>       Local Genbank file for reference
--tab-file=<tabfile>     TSV/CSV file for reference with fields name,start,end
Example:
degen sequence.fasta --gb-id 12398.91
degen sequence.fasta --gb-file tests/testinput/sequence.gb
degen sequence.fasta --tab-file tests/testinput/degen.tab
degen sequence.fasta --tab-file tests/testinput/degen.csv
Output:
Gene name, degenerate position, degenerate base:
anchored capsid protein 85 R
anchored capsid protein 88 Y
membrane glycoprotein precursor 509 R
nonstructural protein NS5 8513 Y
nonstructural protein NS5 8514 Y
nonstructural protein NS5 8515 Y
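The report above can be approximated with a short sketch. This is not degen's actual implementation: the names `DEGENERATE`, `degenerate_positions`, and `report` are hypothetical, and the sketch assumes that any IUPAC ambiguity code (including N) counts as degenerate and that gene coordinates are 1-based and inclusive.

```python
# Assumption: every IUPAC ambiguity code counts as a degenerate base.
DEGENERATE = set("RYSWKMBDHVN")

def degenerate_positions(seq):
    """Yield (1-based position, base) for each degenerate base in seq."""
    for pos, base in enumerate(seq.upper(), start=1):
        if base in DEGENERATE:
            yield pos, base

def report(seq, genes):
    """Return (gene name, position, base) rows.

    genes is a list of (name, start, stop) tuples; coordinates are
    assumed to be 1-based and inclusive.
    """
    rows = []
    for pos, base in degenerate_positions(seq):
        for name, start, stop in genes:
            if start <= pos <= stop:
                rows.append((name, pos, base))
    return rows

# Illustrative data only: a made-up sequence and the foo/bar gene table.
genes = [("foo", 1, 2), ("bar", 9, 33)]
for name, pos, base in report("ARGTACGTACYT", genes):
    print(name, pos, base, sep="\t")
# foo	2	R
# bar	11	Y
```

A position that falls inside several overlapping ranges is reported once per range in this sketch; whether degen does the same is not stated here.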
Gene/Tab File
degen.tab could look like:
genename start stop
foo 1 2
bar 9 33
The headers do not matter, but the start field must always come before the stop field, so the below example would also be valid:
start GENEName stop
1 foo 2
9 bar 33
Or, optionally, without headers:
1 foo 2
9 bar 33
Alternatively, with commas in place of tabs:
name,start,stop
foo,1,2
bar,9,33
You can also specify a coding region (CDS) in your file:
name,start,stop
CDS,3,33
foo,1,2
bar,9,33
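The rules above (optional headers, start before stop, tabs or commas) can be illustrated with a minimal parsing sketch. It is an assumption about how such a file might be read, not degen's own parser; `parse_gene_table` is a hypothetical name, and the sketch treats the two numeric fields in a row as start and stop, in that order, and the remaining field as the gene name.

```python
import csv
import io

def parse_gene_table(text):
    """Parse TSV/CSV text into a list of (name, start, stop) tuples.

    Rows that do not contain exactly one name and two integers
    (for example a header row) are skipped.
    """
    # Assumption: a file containing any comma is comma-separated.
    delimiter = "," if "," in text else "\t"
    genes = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        fields = [f.strip() for f in row if f.strip()]
        numbers = [int(f) for f in fields if f.isdigit()]
        names = [f for f in fields if not f.isdigit()]
        if len(numbers) != 2 or len(names) != 1:
            continue  # header, blank, or malformed row
        # The first number is start, the second is stop.
        genes.append((names[0], numbers[0], numbers[1]))
    return genes

tab_text = "start\tGENEName\tstop\n1\tfoo\t2\n9\tbar\t33\n"
print(parse_gene_table(tab_text))
# [('foo', 1, 2), ('bar', 9, 33)]
```

One limitation of this approach: a gene whose name consists only of digits would be mistaken for a coordinate.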
Genbank File
A Genbank file as downloaded from NCBI’s Entrez database. Use this option if you don’t have internet access.
An example:
LOCUS KJ189367 10452 bp ss-RNA linear VRL 10-FEB-2014
DEFINITION Dengue virus 1 isolate DENV-1/PR/BID-V8188/2010, complete genome.
ACCESSION KJ189367
VERSION KJ189367.1 GI:582052497
DBLINK BioProject: PRJNA31235
KEYWORDS .
SOURCE Dengue virus 1
ORGANISM Dengue virus 1
Viruses; ssRNA viruses; ssRNA positive-strand viruses, no DNA
stage; Flaviviridae; Flavivirus; Dengue virus group.
REFERENCE 1 (bases 1 to 10452)
AUTHORS Zody,M.C., Newman,R.M., Henn,M., Munoz-Jordan,J., McElroy,K.L.,
Santiago,G., Poon,T.W., Charlebois,P., Weiner,B., Yang,X.,
Piper,M.E., Fitzgerald,M., McCowan,C., Young,S., Gargeya,S.,
Levin,J., Malboeuf,C., Qu,J., Ireland,A., Chapman,S.B., Murphy,C.,
Wortman,J., Nusbaum,C. and Birren,B.
CONSRTM Genome Resources in Dengue Consortium; The Broad Institute Genomics
Platform; The Broad Institute Genome Sequencing Center for
Infectious Disease; Centers for Disease Control and Prevention
Division of Vector Borne Infectious Diseases; CDC Dengue Branch
Puerto Rico
TITLE Direct Submission
JOURNAL Submitted (22-JAN-2014) Broad Institute of MIT & Harvard, 7
Cambridge Center, Cambridge, MA 02142, USA
COMMENT ##Assembly-Data-START##
Assembly Method :: Vicuna v. 1
Sequencing Technology :: Illumina
##Assembly-Data-END##
FEATURES Location/Qualifiers
source 1..10452
/organism="Dengue virus 1"
/mol_type="genomic RNA"
/isolate="DENV-1/PR/BID-V8188/2010"
/isolation_source="cell supernatant"
/host="Homo sapiens"
/db_xref="taxon:11053"
/country="Puerto Rico"
/collection_date="2010"
/note="cell passage history: C6/36 1; cohort population:
Dengue Surveillance;
type: 1"
5'UTR 1..83
/note="indels in UTR have not been validated"
CDS 84..10262
/codon_start=1
/product="polyprotein"
/protein_id="AHI43750.1"
/db_xref="GI:582052498"
/translation="MNNQRKKTGRPSFNMLKRARNRVSTGSQLAKRFSKGLLSGQGPM
KLVMAFIAFLRFLAIPPTAGILARWSSFKKNGAIKVLRGFKKEISSMLNIMNRRKRSV
TMLLMLLPTALAFHLTTRGGEPHMIVSKQERGKSLLFKTSAGVNMCTLIAMDLGELCE
DTMTYKCPRITEAEPDDVDCWCNATDTWVTYGTCSQTGEHRREKRSVALAPHVGLGLE
TRTETWMSSEGAWKQIQRVETWALRHPGFTVIAFFLAHAIGTSITQKGIIFILLMLVT
PSMAMRCVGIGNRDFVEGLSGATWVDVVLEHGSCVTTMAKNKPTLDIELLKTEVTNPA
VLRKLCIEAKISNTTTDSRCPTQGEATLVEEQDANFVCRRTFVDRGWGNGCGLFGKGS
LLTCAKFKCVTKLEGKIVQYENLKYSVIVTVHTGDQHQVGNETTEHGTIATITPQAPT
SEIQLTDYGALTLDCSPRTGLDFNEMVLLTMKEKSWLVHKQWFLDLPLPWTSGASTSQ
ETWNRQDLLVTFKTAHAKKQEVVVLGSQEGAMHTALTGATEIQTSGTTTIFAGHLKCR
LKMDKLTLKGMSYVMCTGSFKLEKEVAETQHGTVLVQVKYEGTDAPCKIPFSTQDEKG
VTQNGRLITANPIVTDKEKPVNIETEPPFGESYIVVGAGEKALKLSWFKRGSSIGKMF
EATARGARRMAILGDTAWDFGSIGGVFTSVGKLVHQIFGTAYGVLFSGVSWTMKIGIG
ILLTWLGLNSRSTSLSMTCIVVGMVTLYLGVMVQADSGCVINWKGRELKCGSGIFVTN
EVHTWTEQYKFQADSPKRLSAAIGKAWEEGVCGIRSATRLENIMWKQISNELNHILLE
NDMKFTVVVGDANGILAQGKKMIRPQPMEHKYSWKSWGKAKIIGADIQNTTFIIDGPD
TPECPDGQRAWNIWEVEDYGFGVFTTNIWLKLRDSYTQMCDHRLMSAAIKDSKAVHAD
MGYWIESEKNETWKLARASFIEVKTCTWPKSHTLWSNGVLESEMIIPKIYGGPISQHN
YRPGYFTQTAGPWHLGKLELDFDLCEGTTVVVDEHCGNRGPSLRTTTVTGKIIHEWCC
RSCTLPPLRFRGEDGCWYGMEIRPVKEKEENLVRSMVSAGSGEVDSFSLGILCVSIMI
EEVMRSRWSRKMLMTGTLAVFLLLIMGQLTWNDLIRLCIMVGANASDRMGMGTTYLAL
MATFKMRPMFAVGLLFRRLTSREVLLLTIGLSLVASVELPNSLEELGDGLAMGIMMLK
LLTEFQPHQLWTTLLSLTFVKTTLSLDYAWKTTAMALSIVSLFPLCLSTTSQKTTWLP
VLLGSFGCKPLTMFLITENKIWGRKSWPLNEGIMAIGIVSILLSSLLKNDVPLAGPLI
AGGMLIACYVISGSSADLSLEKAAEVSWEQEAEHSGASHSILVEVQDDGTMKIKDEER
DDTLTILLKATLLAVSGVYPMSIPATLFVWYFWQKKKQRSGVLWDTPSPPEVERAVLD
NGIYRILQRGLLGRSQVGVGVFQDGVFHTMWHVTRGAVLMYQGKRLEPSWASVKKDLI
SYGGGWRFQGSWNTGEEVQVIAVEPGKNPKNVQTTPGTFKTPEGEVGAIALDFKPGTS
GSPIVNREGKIVGLYGNGVVTTSGTYVSAIAQAKASQEGPLPEIEDEVFKKRNLTIMD
LHPGSGKTRRYLPAIVREAIKRKLRTLILAPTRVVASEMAEALKGMPIRYQTTAVKSE
HTGREIVDLMCHATFTMRLLSPVRVPNYNMIIMDEAHFTDPASIAARGYISTRVGMGE
AAAIFMTATPPGSVEAFPQSNAVIQDEERDIPERSWNSGYDWITDFPGKTVWFVPSIK
SGNDIANCLRKNGKRVIQLSRKTFDTEYQKTKNNDWDYVVTTDISEMGANFRADRVID
PRRCLKPVILKDGPERVILAGPMPVTAASAAQRRGRIGRNQNKEGDQYVYMGQPLNND
EDHAHWTEAKMLLDNINTPEGIIPALFEPEREKSAAIDGEYRLRGEARKTFVELMRRG
DLPVWLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVEIWTKEGERKKLRPRWLDA
RTYSDPLALREFKEFAAGRRSVSGDLILEIGKLPQHLTLRAQNALDNLVMLHNSEQGG
KAYRHAMEELPDTIETLMLLALIAVLTGGVTLFFLSGKGLGKTSIGLLCVTASSALLW
MASVEPHWIAASIILEFFLMVLLIPEPDRQRTPQDNQLAYVVIGLLFMILTVAANEMG
LLETTKKDLGIGYVAAENHQHATMLDVDLHPASAWTLYAVATTVITPMMRHTIENTTA
NISLTAIANQAAILMGLDKGWPISKMDIGVPLLALGCYSQVNPLTLTAAVLMLVAHYA
IIGPGLQAKATREAQKRTAAGIMKNPTVDGIVAIDLDPVVYDAKFEKQLGQIMLLILC
TSQILLMRTTWALCESITLATGPLTTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGL
AFSLMKSLGGGRRGTGAQGETLGEKWKRQLNQLSKSEFNTYKRSGIMEVDRSEAKEGL
KRGETTKHAVSRGTAKLRWFVERNLVKPEGKVIDLGCGRGGWSYYCAGLKKVTEVKGY
TKGGPGHEEPIPMATYGWNLVKLHSGKDVFFMPPEKCDTLLCDIGESSPNPTIEEGRT
LRVLKMVEPWLRGNQFCIKILNPYMPSVVETLERMQRKHGGMLVRNPLSRNSTHEMYW
VSCGTGNIVSAVNMTSRMLLNRFTMAHRKPTYERDVDLGAGTRHVAVEPEVANLDIIG
QRIENIKNEHKSTWHYDEDNPYKTWAYHGSYEVKPSGSASSMVNGVVRLLTKPWDVIP
MVTQIAMTDTTPFGQQRVFKEKVDTRTPRAKRGTTQIMEVTAKWLWGFLSRNKKPRIC
TREEFTRKVRSNAAIGAVFVDENQWNSAKEAVEDERFWDLVHRERELHKQGKCATCVY
NMMGKREKKLGEFGKAKGSRAIWYMWLGARFLEFEALGFMNEDHWFSRENSLSGVEGE
GLHKLGYILRDISKIPGGNMYADDTAGWDTRVTEDDLQNEAKITDIMEPEHALLATSI
FKLTYQNKVVRVQRPAKNGTVMDVISRRDQRGSGQVGTYGLNTFTNMEVQLIRQMESE
GIFLPSELETPNLAERALDWLEKHGAERLKRMAISGDDCVVKPIDDRFATALTALNDM
GKVRKDIPQWEPSKGWNDWQQVPFCSHHFHQLIMKDGREIVVPCRNQDELVGRARVSQ
GAGWSLRETACLGKSYAQMWQLMYFHRRDLRLAANAICSAVPVDWVPTSRTTWSIHAH
HQWMTTEDMLSVWNRVWIDENPWMENKTHVSSWEEVPYLGKREDQWCGSLIGLTARAT
WATNIQVAINQVRRLIGNENYLDYMTSMKRFKNESDSEGALW"
mat_peptide 84..425
/product="anchored capsid protein"
mat_peptide 426..923
/product="membrane glycoprotein precursor"
mat_peptide 924..2408
/product="envelope protein"
mat_peptide 2409..3464
/product="nonstructural protein NS1"
mat_peptide 3465..4118
/product="nonstructural protein NS2A"
mat_peptide 4119..4508
/product="nonstructural protein NS2B"
mat_peptide 4509..6365
/product="nonstructural protein NS3"
mat_peptide 6366..6746
/product="nonstructural protein NS4A"
mat_peptide 6747..6815
/product="2K peptide"
mat_peptide 6816..7562
/product="nonstructural protein NS4B"
mat_peptide 7563..10259
/product="nonstructural protein NS5"
3'UTR 10263..10452
/note="indels in UTR have not been validated"
ORIGIN
1 catctggacc gacaagaaca gtttcgaatc ggaagcttgc ttaacgtagt tctaacagtt
61 ttttattaga gagcagatct ctgatgaaca accaacggaa aaagacgggt cgaccgtctt
121 tcaatatgct gaaacgcgcg agaaaccgcg tgtcaactgg ttcacagttg gcgaagagat
181 tctcaaaagg attgctttca ggccaaggac ccatgaaatt ggtgatggct ttcatagcat
241 ttctaagatt tctagccata cccccaacag caggaatttt ggctagatgg agctcattca
301 agaagaatgg agcaattaaa gtgttacggg gtttcaaaaa agagatctca agcatgttga
361 acataatgaa caggaggaaa agatccgtga ccatgctcct catgctgctg cccacagccc
421 tggcgtttca tttgaccaca cgagggggag agccacacat gatagttagt aagcaggaaa
481 gaggaaagtc actcttgttt aagacctctg cgggcgtcaa tatgtgcacc ctcattgcga
541 tggacttggg agagttatgt gaggacacaa tgacctacaa atgcccccgg atcactgagg
601 cggaaccaga tgacgttgac tgctggtgca atgccacaga cacatgggtg acctatggga
661 cgtgttctca aaccggcgaa caccgacgag agaaacgttc cgtggcactg gccccacacg
721 tgggacttgg tctagaaaca agaaccgaaa catggatgtc ctctgaaggc gcctggaaac
781 aaatacaaag agtggaaact tgggctttga gacacccagg attcacggtg atagcctttt
841 ttttagcaca tgctatagga acatccatca ctcagaaagg gatcattttc atcttgctga
901 tgctggtgac accatcaatg gccatgcgat gcgtgggaat aggcaacaga gacttcgttg
961 aaggactgtc aggagcaacg tgggtggacg tggtactgga gcacggaagc tgcgtcacca
1021 ccatggcaaa aaataaacca acattggaca ttgaactctt gaagacggag gtcacgaacc
1081 ctgccgtctt gcgcaaactg tgcattgaag ctaaaatatc aaacaccacc accgattcaa
1141 gatgtccaac acaaggagag gccacactgg tggaagaaca agacgcgaac tttgtgtgtc
1201 gccgaacgtt tgtggacaga ggctggggta atggctgcgg actattcgga aagggaagtc
1261 tattgacgtg tgccaagttc aagtgtgtga caaaactaga aggaaagata gttcaatatg
1321 aaaacctaaa atattcagtg atagtcactg tccacactgg ggaccagcac caggtgggaa
1381 acgagaccac agaacatgga acaattgcaa ccataacacc tcaagctccc acgtcggaaa
1441 tacagctgac cgactacgga gccctcacac tggactgctc acctagaaca gggctggact
1501 ttaatgagat ggtgctattg acaatgaaag aaaaatcatg gcttgtccac aaacaatggt
1561 ttctagactt gccactgcca tggacttcgg gggcttcaac atcccaagag acctggaaca
1621 gacaagattt gctggtcaca ttcaagacag ctcatgcaaa gaaacaggaa gtagtcgtat
1681 tgggatcaca ggaaggagca atgcatactg cgttgactgg ggcgacagaa atccagacgt
1741 caggaacgac aacaatcttc gcaggacacc tgaaatgcag actaaaaatg gataaactga
1801 ccttaaaggg gatgtcatat gtgatgtgca caggctcatt taagctagag aaggaagtgg
1861 ctgagaccca gcatggaact gttctagtgc aggtcaaata tgaaggaaca gacgcgccat
1921 gcaagatccc cttttcgacc caagatgaga aaggagtgac ccagaatggg agattgataa
1981 cagccaatcc catagttact gacaaagaaa aaccagtcaa cattgagaca gaaccacctt
2041 ttggtgagag ctacatcgtg gtaggggcag gcgaaaaagc tttgaaacta agctggttca
2101 agagaggaag cagcataggg aaaatgttcg aagcaaccgc ccgaggagca cgaaggatgg
2161 ctatcctggg agacaccgca tgggacttcg gttctatagg aggagtgttt acatctgtgg
2221 gaaaattggt acaccagatt tttggaaccg catatggggt tctgtttagc ggtgtttctt
2281 ggaccatgaa aataggaata gggattctgc tgacatggtt gggattaaat tcaaggagca
2341 cgtcactttc gatgacgtgc attgtagttg gcatggtcac actgtaccta ggagtcatgg
2401 ttcaagcgga ttcgggatgt gtgatcaact ggaagggcag agaacttaaa tgcggaagtg
2461 gcatttttgt cactaatgaa gtccacactt ggacagagca atacaaattc caggctgact
2521 ccccaaaaag actgtcagca gccattggaa aggcgtggga ggagggcgtg tgtggaattc
2581 gatcagccac gcgtcttgag aacatcatgt ggaagcagat atcaaatgaa ttgaaccaca
2641 ttttacttga gaatgacatg aaattcacag tggttgtagg agatgccaac ggaattttgg
2701 cccaaggaaa aaaaatgatt aggccacaac ccatggaaca caaatactca tggaaaagct
2761 ggggaaaagc taaaatcata ggagcagaca tacaaaatac caccttcatt atcgacggcc
2821 cagacacccc agaatgtcct gatggccaaa gagcatggaa catttgggaa gttgaggact
2881 atgggtttgg agttttcacg acaaacatat ggctgaaatt gcgtgactcc tacacccaaa
2941 tgtgtgacca ccggctaatg tcagctgcca tcaaggacag caaggcagtc catgctgaca
3001 tggggtactg gatagaaagt gaaaagaacg aaacctggaa gttggcgaga gcctccttca
3061 tagaagtcaa aacatgcacc tggccgaaat ctcacactct atggagcaat ggagttttgg
3121 aaagtgaaat gataatccca aagatatatg gaggaccaat atctcagcac aactacagac
3181 cagggtattt cacacaaaca gcagggccat ggcacctagg taagttggaa ctggattttg
3241 acttgtgtga aggcaccaca gttgttgtgg atgaacattg tggaaatcga ggtccatctc
3301 tcagaaccac aacagtcaca ggaaagataa tccatgaatg gtgttgcaga tcctgcacgc
3361 tacccccctt acgtttcaga ggagaagacg ggtgttggta tggcatggaa atcagaccag
3421 tgaaggagaa ggaggagaat ctagttaggt caatggtctc tgcagggtca ggagaagtgg
3481 acagtttttc attaggaata ctatgcgtat caataatgat tgaagaagtg atgagatcca
3541 gatggagtag aaagatgctg atgactggaa cactggctgt cttcctcctt cttataatgg
3601 gacaactgac atggaatgat ctgattaggt tatgcatcat ggtcggagct aacgcttcag
3661 acaggatggg gatgggaaca acgtacctag ccttgatggc tactttcaaa atgagaccaa
3721 tgttcgctgt agggctatta ttccgcagac taacatccag agaagttctt ctcctaacga
3781 ttggattaag cctggtggca tccgtggagc taccaaattc cttggaggag ctaggggatg
3841 gacttgcaat gggtatcatg atgttaaaat tgttgactga atttcagcca caccagttat
3901 ggaccacctt attgtctctg acatttgtca aaacaactct ctcattggat tatgcatgga
3961 aaacaacggc tatggcactg tctatcgtat ctctctttcc tttatgcctg tctacgacct
4021 cccaaaaaac aacatggctt ccggtgctgt taggatcttt tggatgcaaa ccattaacca
4081 tgtttcttat aacagaaaat aaaatctggg gaaggaaaag ttggcccctc aatgaaggaa
4141 ttatggctat tggaatagtc agcattctac taagctcact cctcaaaaat gatgtgccgt
4201 tggccgggcc attaatagct ggaggcatgc taatagcatg ttatgtcata tccggtagct
4261 cagccgattt atcattggag aaagcggctg aagtatcctg ggaacaagaa gcagaacact
4321 ccggtgcctc acacagcata ttagtagagg tccaagatga tggaactatg aaaataaaag
4381 atgaagagag ggatgacaca ctcaccatac tccttaaagc aactttgctg gcagtctcag
4441 gagtgtaccc aatgtcaata ccagcaactc tttttgtgtg gtatttttgg cagaaaaaga
4501 aacagagatc aggagtgtta tgggacacac ccagccctcc ggaagtggaa agagcagttc
4561 ttgataatgg catctataga atcttgcaaa gaggattgtt gggcaggtcc caagtaggag
4621 tgggagtttt ccaagacggc gtgttccaca caatgtggca cgttaccagg ggagctgtcc
4681 ttatgtacca agggaagaga ctggaaccaa gctgggccag tgtgaaaaag gacttgatct
4741 catatggagg aggttggagg ttccaaggat catggaacac gggagaagaa gtgcaggtaa
4801 tagctgttga accaggaaaa aaccccaaaa atgtacagac aacgccgggc acctttaaga
4861 ctcctgaagg cgaagttgga gccatagctc tagatttcaa acccggcaca tctggatctc
4921 ccatcgtgaa cagagaggga aaaatagtgg gtctgtatgg aaatggagtg gtgacaacaa
4981 gtggaaccta cgtcagtgcc attgcccaag ctaaagcatc acaggaaggg cctctaccag
5041 agattgagga cgaggtattt aagaaaagaa acttaacaat aatggacctg cacccaggat
5101 cagggaaaac aagaagatat cttccagcca tagtccgtga ggccataaaa aggaaactgc
5161 gtacgttaat cctggctccc acaagagttg tcgcctctga aatggcagag gcactcaagg
5221 gaatgccaat aagatatcag acaacagcag tgaagagtga acacacagga agggagatag
5281 ttgacctcat gtgccacgct acttttacca tgcgtctctt atccccagtg agagttccca
5341 attacaacat gatcattatg gatgaagcac attttaccga tccagctagc atagcggcca
5401 gagggtacat ctcaacccga gtgggtatgg gtgaagcagc tgcgatcttt atgacagcca
5461 ctcccccagg atcggtggag gcctttccac agagcaatgc agttatccaa gatgaggaaa
5521 gagacattcc tgagagatca tggaactcag gctacgactg gatcactgac tttccaggta
5581 aaacagtctg gtttgttcca agcattaaat caggaaatga cattgccaac tgtttaagaa
5641 agaacggaaa acgggtaatc caattgagca gaaaaacctt tgacactgag taccagaaaa
5701 caaaaaacaa tgactgggac tatgttgtca caacagacat ttctgaaatg ggggcaaatt
5761 tccgggccga cagggtaata gacccaaggc ggtgcttgaa accggtaata ctaaaagatg
5821 gtccagagcg tgtcattcta gccggaccga tgccagtgac tgcggccagt gctgcccaga
5881 ggagaggaag aattggaagg aaccaaaaca aggaaggtga tcagtatgtt tatatgggac
5941 agcctttaaa taatgatgag gatcacgctc attggacaga agcaaaaatg ctccttgaca
6001 atataaacac accagaaggg atcatcccag ccctttttga gccagagaga gaaaagagtg
6061 cagcaataga cggggagtac agactgcggg gagaagcaag gaaaacgttc gtggagctca
6121 tgagaagagg agatctacca gtttggctat cctacaaagt agcctcagaa ggtttccagt
6181 actccgacag aaggtggtgc tttgatgggg aaaggaacaa ccaggtgttg gaggagaaca
6241 tggacgtgga gatctggaca aaggaaggag aaagaaagaa attgcgacct cgctggttgg
6301 acgccagaac atactctgat ccattggccc tgcgcgagtt taaagagttc gcagcaggaa
6361 gaagaagtgt ctcaggtgac ctgatattgg aaatagggaa acttccacaa catttgacgt
6421 taagagccca gaatgctctg gacaacttgg tcatgttgca caattccgaa caaggaggaa
6481 aagcctacag acatgccatg gaggaactac cagacaccat agaaacattg atgctactag
6541 ctttgatagc tgtgttgact ggtggagtga cgctgttctt cctatcagga aaaggcctag
6601 ggaaaacatc cattggcttg ctctgtgtga cggcctcaag cgcactgtta tggatggcca
6661 gtgtggagcc ccattggata gcggcctcca tcatactaga gttctttttg atggtgctgc
6721 tcattccaga gccagacaga cagcgcactc cacaggacaa ccagctagca tatgtggtga
6781 taggtttgtt attcatgata ctgacagtgg cagccaatga gatgggatta ttggaaacca
6841 caaagaaaga cctggggatt ggctatgtag ccgccgaaaa ccaccaacat gccacaatgc
6901 tggacgtaga cctacaccca gcttcagcct ggaccctcta tgcagtagcc acaacagtca
6961 tcactcccat gatgagacac acaattgaaa atacaacggc aaacatttcc ctgaccgcca
7021 ttgcaaatca ggcagctata ttgatgggac ttgacaaggg atggccaata tcgaagatgg
7081 acataggagt tccacttctc gccttagggt gctattccca ggtgaaccca ttgacactga
7141 cagcggcggt gttgatgtta gtggctcatt atgccataat tggaccagga ctgcaagcaa
7201 aggccactag agaagcccaa aaaaggacag cagccggaat aatgaaaaat ccaaccgtag
7261 acgggattgt tgcaatagac ttggatcctg tggtttatga tgcaaaattt gaaaaacaac
7321 taggccaaat aatgttactg atactttgta catcacagat cctcttgatg cggaccacat
7381 gggccttgtg tgaatccatc acactggcta ctggacccct gaccactctc tgggagggat
7441 ctccaggaaa attctggaat accacaatag cagtgtccat ggcaaatatt ttcaggggaa
7501 gttatctagc aggagcaggt ctggctttct cattgatgaa atctttagga ggaggtagga
7561 gaggcacggg agctcaaggg gaaacactgg gagagaaatg gaaaagacag ttgaaccaac
7621 tgagcaagtc agaattcaac acctacaaaa ggagtgggat tatggaggtg gacagatccg
7681 aagccaaaga gggactgaaa agaggagaaa caaccaaaca tgcagtgtca agaggaacag
7741 ccaaactgag gtggtttgtg gagaggaacc tcgtgaaacc agaaggaaaa gtcatagacc
7801 tcggttgtgg aagaggtggc tggtcatatt attgtgctgg gctgaagaaa gttactgaag
7861 tgaagggata cacaaaagga ggacctggac atgaggaacc tatcccaatg gcgacctatg
7921 gatggaacct agtaaaacta cactctggaa aggatgtatt ttttatgcca cctgagaaat
7981 gtgacactct tctgtgtgat attggtgagt cctctccgaa tccaactata gaagaaggaa
8041 gaacgttacg tgttctaaaa atggtggaac catggctcag aggaaaccaa ttctgcataa
8101 aaatcctaaa tccttacatg ccaagtgtgg tagaaactct ggagcgaatg caaagaaaac
8161 atggagggat gctagtgcga aacccactct caagaaattc tacccatgaa atgtattggg
8221 tttcatgtgg aacaggaaac attgtgtcgg cagtgaacat gacatccaga atgttactga
8281 accgattcac aatggctcac aggaagccaa catatgaaag agacgtggac ttaggcgctg
8341 gaacaagaca tgtggcagtg gaaccagagg tagccaacct agatatcatt ggccagagga
8401 tagaaaatat aaaaaatgaa cacaagtcaa catggcatta tgatgaggac aatccataca
8461 aaacatgggc ctatcatgga tcatatgagg tcaagccatc aggatcagcc tcatctatgg
8521 tgaatggagt ggtgagattg ctcacgaaac catgggatgt catccccatg gtcacacaaa
8581 tagctatgac tgataccaca ccctttggac aacagagagt gtttaaagag aaagttgaca
8641 cgcgcacacc aagagcaaaa cgaggcacaa cacagattat ggaggtgaca gccaagtggt
8701 tatggggttt cctttccaga aacaaaaaac ccagaatctg cacaagagag gagttcacaa
8761 gaaaggttag gtcaaacgcg gcaataggag cagtgttcgt tgatgaaaac caatggaact
8821 cagcaaaaga agcagtggaa gacgaaaggt tttgggatct tgtgcacaga gagagggagc
8881 ttcataaaca gggaaaatgt gccacgtgtg tctacaacat gatggggaag agagagaaaa
8941 aattaggaga gtttggaaag gcaaaaggaa gtcgtgcaat atggtacatg tggctgggag
9001 cacgctttct ggagttcgaa gcccttggtt ttatgaatga agatcactgg tttagtagag
9061 agaattcact cagtggagtg gaaggagaag gactgcacaa acttggatac atactcagag
9121 acatatcaaa gattccgggg ggaaatatgt atgcagatga tacagccgga tgggacacaa
9181 gagtaacaga ggatgacctc cagaatgagg ctaaaatcac tgacatcatg gagcctgaac
9241 atgctctatt ggctacgtca atttttaagc tgacttatca aaacaaggtg gtgagggtgc
9301 aaagaccagc aaaaaatgga accgtgatgg atgttatatc cagacgtgat cagagaggga
9361 gtggacaggt cggaacttat ggcttaaata ctttcaccaa tatggaggtc caactaataa
9421 gacaaatgga gtctgaggga atctttttac ccagcgaatt ggaaaccccc aacctagctg
9481 agagggctct tgactggtta gaaaaacatg gcgccgaaag gctgaaacga atggcaatca
9541 gcggagatga ttgcgtggtg aaaccaattg acgacaggtt cgcaacagcc ttaacagctc
9601 tgaatgacat gggaaaagta aggaaagaca taccgcagtg ggaaccttca aaaggatgga
9661 atgattggca gcaagtgcct ttttgttcac accatttcca ccaactgatc atgaaggatg
9721 ggagggaaat agtggtgcca tgccgcaacc aagatgaact tgtgggcagg gctagagtat
9781 cacaaggcgc cggatggagc ctgagagaaa ctgcttgcct aggcaagtca tatgcacaaa
9841 tgtggcagct gatgtacttc cacaggagag acctgagact agcggctaac gctatctgtt
9901 cagccgtccc agttgattgg gtcccaacca gccgcacaac ctggtcaatc catgcccacc
9961 accaatggat gacaacagaa gacatgttat cagtgtggaa tagggtttgg atagacgaaa
10021 acccatggat ggagaacaaa actcatgtat ccagttggga agaagttcca tacctaggaa
10081 aaagggaaga tcaatggtgt ggatccctga taggcttgac agcgagggcc acctgggcca
10141 ccaacataca agtagccata aaccaagtga gaaggctcat cgggaatgag aattatttag
10201 attacatgac atcaatgaag agattcaaga atgagagtga ttccgaagga gcactctggt
10261 aagtcaacac actcatgaaa taaaggaaaa tagaagatca aacaaagtaa gaagtcaggc
10321 cagattaagc catagcacgg aaagagctat gctgcctgtg agccccgtcc aaggacgtaa
10381 aatgaagtca ggccgaaagc cacggattga gcaagccgtg ctgcctgtgg ctccatcgtg
10441 gggatgtagc tc
//
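The gene names in the output correspond to the /product values of the mat_peptide features in a record like the one above. As a rough, standard-library-only sketch of how those ranges could be pulled out of the feature table (an assumption, not degen's actual code; `genbank_genes` and its `wanted` parameter are hypothetical, and only simple `start..stop` locations with single-line /product qualifiers are handled):

```python
import re

# A feature line: key followed by a simple start..stop location.
FEATURE = re.compile(r"^\s*(\S+)\s+(\d+)\.\.(\d+)\s*$")
# A single-line /product qualifier.
PRODUCT = re.compile(r'^\s*/product="([^"]*)"')

def genbank_genes(text, wanted=("mat_peptide",)):
    """Return (product, start, stop) for each wanted feature in text."""
    genes = []
    pending = None  # location of the wanted feature awaiting its product
    for line in text.splitlines():
        feature = FEATURE.match(line)
        if feature:
            key, start, stop = feature.groups()
            # Any new feature resets the pending location.
            pending = (int(start), int(stop)) if key in wanted else None
            continue
        product = PRODUCT.match(line)
        if product and pending:
            genes.append((product.group(1), pending[0], pending[1]))
            pending = None
    return genes

sample = '''     CDS             84..10262
                     /codon_start=1
                     /product="polyprotein"
     mat_peptide     84..425
                     /product="anchored capsid protein"
     mat_peptide     426..923
                     /product="membrane glycoprotein precursor"
'''
print(genbank_genes(sample))
# [('anchored capsid protein', 84, 425), ('membrane glycoprotein precursor', 426, 923)]
```

For real records, a full Genbank parser is a safer choice, since locations can be joined, complemented, or partial.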