Below is the code you should add to and work with. It outlines
how the code should look, and it also includes the dictionary
to use in step 6.
I have copied and pasted the code here for easy use:
codon_map = {'UUU':'Phe','UUC':'Phe','UUA':'Leu','UUG':'Leu','CUU':'Leu',
'CUC':'Leu','CUA':'Leu','CUG':'Leu','AUU':'Ile','AUC':'Ile',
'AUA':'Ile','AUG':'Met','GUU':'Val','GUC':'Val','GUA':'Val',
'GUG':'Val','UCU':'Ser','UCC':'Ser','UCA':'Ser','UCG':'Ser',
'CCU':'Pro','CCC':'Pro','CCA':'Pro','CCG':'Pro','ACU':'Thr',
'ACC':'Thr','ACA':'Thr','ACG':'Thr','GCU':'Ala','GCC':'Ala',
'GCA':'Ala','GCG':'Ala','UAU':'Tyr','UAC':'Tyr','UAA':'STOP',
'UAG':'STOP','CAU':'His','CAC':'His','CAA':'Gln','CAG':'Gln',
'AAU':'Asn','AAC':'Asn','AAA':'Lys','AAG':'Lys','GAU':'Asp',
'GAC':'Asp','GAA':'Glu','GAG':'Glu','UGU':'Cys','UGC':'Cys',
'UGA':'STOP','UGG':'Trp','CGU':'Arg','CGC':'Arg','CGA':'Arg',
'CGG':'Arg','AGU':'Ser','AGC':'Ser','AGA':'Arg','AGG':'Arg',
'GGU':'Gly','GGC':'Gly','GGA':'Gly','GGG':'Gly'}
class Sequence:
    """A class to contain sequences"""

    def __init__(self, id, seq):
        """Initialize a sequence object with an id and a sequence"""
        self.id = id
        self.seq = seq.upper()

    def __str__(self):
        """Print out your sequence and id in a FASTA format"""
        return self.id + '\n' + self.seq

    def gc(self):
        """A method to calculate GC content of a sequence"""
        # We count the number Cs and Gs and divide by total length
        return (self.seq.count('G') + self.seq.count('C'))/len(self.seq)

    def rev_comp(self):
        """A method to return the reverse complement of a DNA sequence"""
        # We make a dictionary of DNA bases and their complements
        revdict = {'A':'T', 'C':'G', 'G':'C', 'T':'A'}
        # We reverse the string using slicing with a negative step value
        rev = self.seq[::-1]
        # We use a list comprehension to build a list of the complements of
        # each letter in rev using revdict
        revcomp = [revdict[letter] for letter in rev]
        # Now we rejoin the list of letters into a string and return it
        revcomp = ''.join(revcomp)
        return revcomp

    def to_rna(self):
        """A method to convert a DNA sequence to RNA"""
        rna = self.seq.replace('T', 'U')
        return rna

    def start(self):
        """A method that returns whether or not a sequence contains a start
        codon"""
        return 'ATG' in self.seq

    def gene_translation(self):
        """A method that, if a Sequence object contains a start codon,
        returns the translated protein sequence, starting with the start
        codon and ending at the stop codon."""
        # Put your code here
        if self.start():
            pass # Of course, delete pass once you have code to put here
        else:
            pass

def main():
    # Put your code here to open cyto_pol.fasta, read the sequence from
    # it, and create a Sequence object
    # Then call the gene_translation() method on that Sequence object
    # to print the protein sequence to the terminal.
    pass # Of course, delete pass once you have sequence to put here
This is all the information I have. If you need more information,
let me know what you need; do not just say that it needs more
information.
Please solve this accordingly. Someone has attempted it, but
unfortunately it was the wrong answer. Do not copy and paste that
solution; I will contact answers.
Thank you!!
Create a new .py file named HW_12.py that contains just the class definition for Sequence(), but none of the functions from the Lecture 20 Workshop. Add a new method to the Sequence class called gene_translation() that:
1) Calls the start() method to identify whether a start codon is present in a DNA sequence.
2) If start() returns True, converts the DNA sequence to RNA by calling the to_rna() method. If start() returns False, the method prints "Sorry, no gene in this sequence" and stops running.
3) Slices the RNA sequence to begin at the start codon 'AUG'.
4) Generates a list of codons beginning with 'AUG' and continuing every three nucleotides until the end of the sequence. So the sequence "AUGAGGACC" would generate the list ['AUG', 'AGG', 'ACC'].
5) Slices that list to contain everything from index 0 through the first occurrence of one of the following three stop codons: "UAG", "UAA", or "UGA".
**Hint:** There are a lot of ways you can find the first stop codon. I would probably use a list comprehension to make a list of all indices where those stop codons appear: [i for i in range(len(mylist)) if mylist[i] in ['UAG', 'UAA', 'UGA']] and slice my list up to and including the first index number in the resulting list.
6) Makes use of the codon_map dictionary provided to translate each triplet into the corresponding amino acid sequence and prints the translation to the terminal as a single string.
[Flowchart: self.start() -> if False: print('Sorry, no gene here.'); if True: self.to_rna() -> slice sequence to start at AUG -> split into a list of codon strings ['AUG', 'CCG', ...] -> find index of first stop codon -> slice list to be ['AUG', ..., [stop codon here]] -> translate with dictionary and print translation]
This method should require no arguments aside from self and returns nothing (it just prints the amino acid sequence to the terminal).
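Steps 3 through 6 can be sketched as a standalone function. This is only a rough illustration, not the required method: the names translate_rna, STOP_CODONS, and demo_map are assumptions made for this example, and demo_map is a shortened stand-in for the full codon_map. The sketch also assumes the caller has already confirmed that a start codon is present.

```python
# Hypothetical, shortened codon table for illustration only;
# the assignment's full codon_map would be used in practice.
demo_map = {'AUG': 'Met', 'AGG': 'Arg', 'ACC': 'Thr', 'GGG': 'Gly',
            'UAA': 'STOP', 'UAG': 'STOP', 'UGA': 'STOP'}

STOP_CODONS = ['UAG', 'UAA', 'UGA']


def translate_rna(rna, table):
    """Translate rna from the first AUG through the first in-frame stop codon.

    Assumes the caller has already checked that 'AUG' occurs in rna.
    """
    # Step 3: slice the RNA so it begins at the first start codon
    rna = rna[rna.find('AUG'):]
    # Step 4: split into codons, dropping any incomplete trailing codon
    usable = len(rna) - len(rna) % 3
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    # Step 5: collect the index of every stop codon, then keep the list
    # up to and including the first one (if there is one)
    stops = [i for i in range(len(codons)) if codons[i] in STOP_CODONS]
    if stops:
        codons = codons[:stops[0] + 1]
    # Step 6: look up each codon and join the names into a single string
    return ' '.join(table[codon] for codon in codons)


print(translate_rna('CCAUGAGGACCUAAGGG', demo_map))  # Met Arg Thr STOP
```

Inside the class, the same logic would start from self.to_rna() and print the result instead of returning it.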
After the end of the class definition, create a main() function that reads the file cyto_pol.fasta (containing the DNA sequence for the Human Cytomegalovirus polymerase gene), extracts the sequence id and the sequence, and then instantiates a Sequence object. Note that the sequence spans multiple lines of the FASTA file, but the file only contains the one sequence entry. main() should then call gene_translation() on that Sequence object.
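One possible way to pull the id and the multi-line sequence out of a single-entry FASTA file is sketched below. The function name read_single_fasta and the demo text are assumptions made for this example; the demo uses an in-memory string rather than the real cyto_pol.fasta, whose contents are not shown here.

```python
import io


def read_single_fasta(handle):
    """Return (id, sequence) from a file-like object holding one FASTA entry."""
    seq_id = ''
    seq_parts = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            seq_id = line  # keep the header line, including the '>'
        else:
            seq_parts.append(line)  # the sequence may span many lines
    return seq_id, ''.join(seq_parts)


# Made-up FASTA text standing in for the real file
demo = io.StringIO('>demo_id example\nATGAGG\nACCTAA\n')
seq_id, seq = read_single_fasta(demo)
print(seq_id)  # >demo_id example
print(seq)     # ATGAGGACCTAA
```

With the real file, the same function could be called as read_single_fasta(handle) inside a "with open('cyto_pol.fasta') as handle:" block, and the two returned values passed to Sequence().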
Your terminal output should look like this: Met Phe Phe Asn Pro Tyr Leu Ser Gly Gly Val Thr Gly Gly Ala Val Ala Gly Gly Arg Arg Gln Arg Ser Gln Pro Gly Ser Ala Gln Gly Ser Gly Lys Arg Pro Pro Gln Lys Gln Phe Leu Gln Ile Val Pro Arg Gly Val Met Phe Asp Gly Gln Thr Gly Leu Ile Lys His Lys Thr Gly Arg Leu Pro Leu Met Phe Tyr Arg Glu Ile Lys His Leu Leu Ser His Asp Met Val Trp Pro Cys Pro Trp Arg Glu Thr Leu Val Gly Arg Val Val Gly Pro Ile Arg Phe His Thr Tyr Asp Gln Thr Asp Ala Val Leu Phe Phe Asp Ser Pro Glu Asn Val Ser Pro Arg Tyr Arg Gln His Leu Val Pro Ser Gly Asn Val Leu Arg Phe Phe Gly Ala Thr Glu His Gly Tyr Ser Ile Cys Val Asn Val Phe Gly Gln Arg Ser Tyr Phe Tyr Cys Glu Tyr Ser Asp Thr Asp Arg Leu Arg Glu Val Ile Ala Ser Val Gly Glu Leu Val Pro Glu Pro Arg Thr Pro Tyr Ala Val Ser Val Thr Pro Ala Thr Lys Thr Ser Ile Tyr Gly Tyr Gly Thr Arg Pro Val Pro Asp Leu Gln Cys Val Ser Ile Ser Asn Trp Thr Met Ala Arg Lys Ile Gly Glu Tyr Leu Leu Glu Gln Gly Phe Pro Val Tyr Glu Val Arg Val Asp Pro Leu Thr Arg Leu Val Ile Asp Arg Arg Ile Thr Thr Phe Gly Trp Cys Ser Val Asn Arg Tyr Asp Trp Arg Gln Gln Gly Arg Ala Ser Thr Cys Asp Ile Glu Val Asp Cys Asp Val Ser Asp Leu Val Ala Val Pro Asp Asp Ser Ser Trp Pro Arg Tyr Arg Cys Leu Ser Phe Asp Ile Glu Cys Met Ser Gly Glu Gly Gly Phe Pro Cys Ala Glu Lys Ser Asp Asp Ile Val Ile Gln Ile Ser Cys Val Cys Tyr Glu Thr Gly Gly Asn Thr Ala Val Asp Gln Gly Ile Pro Asn Gly Asn Asp Gly Arg Gly Cys Thr Ser Glu Gly Val Ile Phe Gly His Ser Gly Leu His Leu Phe Thr Ile Gly Thr Cys Gly Gln Val Gly Pro Asp Val Asp Val Tyr Glu Phe Pro Ser Glu Tyr Glu Leu Leu Leu Gly Phe Met Leu Phe Phe Gln Arg Tyr Ala Pro Ala Phe Val Thr Gly Tyr Asn Ile Asn Ser Phe Asp Leu Lys Tyr Ile Leu Thr Arg Leu Glu Tyr Leu Tyr Lys Val Asp Ser Gln Arg Phe Cys Lys Leu Pro Thr Ala Gln Gly Gly Arg Phe Phe Leu His Ser Pro Ala Val Gly Phe Lys Arg Gln Tyr Ala Ala Ala Phe Pro Ser Ala Ser His Asn Asn Pro Ala Ser Thr Ala Ala Thr Lys Val Tyr Ile Ala Gly Ser Val Val Ile Asp Met Tyr Pro 
Val Cys Met Ala Lys Thr Asn Ser Pro Asn Tyr Lys Leu Asn Thr Met Ala Glu Leu Tyr Leu Arg Gln Arg Lys Asp Asp Leu Ser Tyr Lys Asp Ile Pro Arg Cys Phe Val Ala Asn Ala Glu Gly Arg Ala Gln Val Gly Arg Tyr Cys Leu Gln Asp Ala Val Leu Val Arg Asp Leu Phe Asn Thr Ile Asn Phe His Tyr Glu Ala Gly Ala Ile Ala Arg Leu Ala Lys Ile Pro Leu Arg Arg Val Ile Phe Asp Gly Gln Gln Ile Arg Ile Tyr Thr Ser Leu Leu Asp Glu Cys Ala Cys Arg Asp Phe Ile Leu Pro Asn His Tyr Ser Lys Gly Thr Thr Val Pro Glu Thr Asn Ser Val Ala Val Ser Pro Asn Ala Ala Ile Ile Ser Thr Ala Ala Val Pro Gly Asp Ala Gly Ser Val Ala Ala Met Phe Gln Met Ser Pro Pro Leu Gln Ser Ala Pro Ser Ser Gln Asp Gly Val Ser Pro Gly Ser Gly Ser Asn Ser Ser Ser Ser Val Gly Val Phe Ser Val Gly Ser Gly Ser Ser Gly Gly Val Gly Val Ser Asn Asp Asn His Gly Ala Gly Gly Thr Ala Ala Val Ser Tyr Gln Gly Ala Thr Val Phe Glu Pro Glu Val Gly Tyr Tyr Asn Asp Pro Val Ala Val Phe Asp Phe Ala Ser Leu Tyr Pro Ser Ile Ile Met Ala His Asn Leu Cys Tyr Ser Thr Leu Leu Val Pro Gly Gly Glu Tyr Pro Val Asp Pro Ala Asp Val Tyr Ser Val Thr Leu Glu Asn Gly Val Thr His Arg Phe Val Arg Ala Ser Val Arg Val Ser Val Leu Ser Glu Leu Leu Asn Lys Trp Val Ser Gln Arg Arg Ala Val Arg Glu Cys Met Arg Glu Cys Gln Asp Pro Val Arg Arg Met Leu Leu Asp Lys Glu Gln Met Ala Leu Lys Val Thr Cys Asn Ala Phe Tyr Gly Phe Thr Gly Val Val Asn Gly Met Met Pro Cys Leu Pro Ile Ala Ala Ser Ile Thr Arg Ile Gly Arg Asp Met Leu Glu Arg Thr Ala Arg Phe Ile Lys Asp Asn Phe Ser Glu Pro Cys Phe Leu His Asn Phe Phe Asn Gln Glu Asp Tyr Val Val Gly Thr Arg Glu Gly Asp Ser Glu Glu Ser Ser Ala Leu Pro Glu Gly Leu Glu Thr Ser Ser Gly Gly Ser Asn Glu Arg Arg Val Glu Ala Arg Val Ile Tyr Gly Asp Thr Asp Ser Val Phe Val Arg Phe Arg Gly Leu Thr Pro Gln Ala Leu Val Ala Arg Gly Pro Ser Leu Ala His Tyr Val Thr Ala Cys Leu Phe Val Glu Pro Val Lys Leu Glu Phe Glu Lys Val Phe Val Ser Leu Met Met Ile Cys Lys Lys Arg Tyr Ile Gly Lys Val Glu Gly Ala Ser Gly Leu Ser Met Lys Gly Val Asp Leu Val Arg Lys Thr Ala Cys 
Glu Phe Val Lys Gly Val Thr Arg Asp Val Leu Ser Leu Leu Phe Glu Asp Arg Glu Val Ser Glu Ala Ala Val Arg Leu Ser Arg Leu Ser Leu Asp Glu Val Lys Lys Tyr Gly Val Pro Arg Gly Phe Trp Arg Ile Leu Arg Arg Leu Val Gln Ala Arg Asp Asp Leu Tyr Leu His Arg Val Arg Val Glu Asp Leu Val Leu Ser Ser Val Leu Ser Lys Asp Ile Ser Leu Tyr Arg Gln Ser Asn Leu Pro His Ile Ala Val Ile Lys Arg Leu Ala Ala Arg Ser Glu Glu Leu Pro Ser Val Gly Asp Arg Val Phe Tyr Val Leu Thr Ala Pro Gly Val Arg Thr Ala Pro Gln Gly Ser Ser Asp Asn Gly Asp Ser Val Thr Ala Gly Val Val Ser Arg Ser Asp Ala Ile Asp Gly Thr Asp Asp Asp Ala Asp Gly Gly Gly Val Glu Glu Ser Asn Arg Arg Gly Gly Glu Pro Ala Lys Lys Arg Ala Arg Lys Pro Pro Ser Ala Val Cys Asn Tyr Glu Val Ala Glu Asp Pro Ser Tyr Val Arg Glu His Gly Val Pro Ile His Ala Asp Lys Tyr Phe Glu Gln Val Leu Lys Ala Val Thr Asn Val Leu Ser Pro Val Phe Pro Gly Gly Glu Thr Ala Arg Lys Asp Lys Phe Leu His Met Val Leu Pro Arg Arg Leu His Leu Glu Pro Ala Phe Leu Pro Tyr Ser Val Lys Ala His Glu Cys Cys STOP