Getting complement of a character

Question

Why doesn't the code below work to get the complement of the character entered? It seems like the loop never ends, but say I enter 'Z' as dna: why wouldn't it break and quit? Did I use break or if incorrectly? How about elif?

def get_complement(dna):

    ''' (ch) -> ch

    Reverse the 'A' to 'T' or vice versa and 'C' to 'G' and vice versa too.
    >>> get_complement('A')
    'C'
    >>> get_complement('G')
    'T'

    '''
    if dna == 'A':
        print ('C')
        if dna == 'C':
            print ('A')
            if dna == 'T':
                print ('G')
                if dna == 'G' :
                    print ('T')
                    while  {'A', 'C', 'G', 'T'}.isnotsubset(set(dna)) :
                        break
                    return ('')

As your example is written (and as Cyber has written his answer based on your example) you are not getting the complement. They are set up so that A -> C (instead of the complement T), T -> G instead of A, etc. Using a dictionary as Cyber has done, it should look like this: complement = {'A':'T', 'T':'A', 'C':'G', 'G':'C'} — iayork, Jun 30 '14 at 14:27

score 3 · Accepted Answer · edited Sep 25 '14 at 05:17

You should set up a mapping using a dictionary:

complement = {'A': 'C', 'C': 'A', 'T': 'G', 'G': 'T'}

Then, for some string, you can do:

original = "ATCGTCA"
"".join(complement[letter] for letter in original)

Output

'CGATGAC'

For just a single character:

complement['A']

Output

'C'
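
Tying this back to the function in the question, a minimal sketch might look like the following. It assumes the same mapping as above (which follows the question's docstring) and returns the result instead of printing it. Returning an empty string for an unrecognized character is an assumption based on the question's `return ('')`.

```python
# Mapping taken from the answer above (it mirrors the question's docstring).
COMPLEMENT = {'A': 'C', 'C': 'A', 'T': 'G', 'G': 'T'}


def get_complement(dna):
    """Return the mapped character for a single base, or '' if it is not A, C, G or T."""
    # dict.get returns the default ('') when the key is missing,
    # so no if/elif chain or loop is needed.
    return COMPLEMENT.get(dna, '')


print(get_complement('A'))        # C
print(repr(get_complement('Z')))  # ''
```

Because the lookup either finds the key or falls back to the default, there is nothing to loop over and nothing to break out of.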

edited Sep 25 '14 at 05:17 by smci
answered Jun 30 '14 at 13:02 by Cory Kramer

What does the i in complement stand for? And why is an empty list being joined? (I'm not really familiar with dictionaries.) – MingJian Jun 30 '14 at 13:45
`i` is just the current element; it is a temporary variable. I changed it to `letter` if that makes more sense to read. The call `"".join(...)` takes the output of that loop, which yields the complemented characters one at a time, and creates one single string instead of a collection of separate characters. – Cory Kramer Jun 30 '14 at 13:48
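
To make the comment above concrete, here is a small sketch that builds the same result step by step. The intermediate names `pieces` and `joined` are hypothetical and only there for illustration.

```python
complement = {'A': 'C', 'C': 'A', 'T': 'G', 'G': 'T'}
original = "ATCGTCA"

# Step 1: look up each letter and collect the results in a list.
pieces = [complement[letter] for letter in original]
print(pieces)  # ['C', 'G', 'A', 'T', 'G', 'A', 'C']

# Step 2: "" is the separator placed between the pieces; an empty
# separator simply glues them together into one string.
joined = "".join(pieces)
print(joined)  # CGATGAC

# A different separator makes the role of the string before .join visible.
print("-".join(pieces))  # C-G-A-T-G-A-C
```

So the `""` is not an empty list being joined; it is the (empty) string that goes between the joined items.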

score 0 · Answer 2 · answered Jun 30 '14 at 14:47

As your example is written (and as Cyber has written his answer based on your example) you are not getting the complement. You're getting A -> C (instead of the complement T), T -> G instead of A, etc.

Using a dictionary as Cyber has done, it should look like this:

complement = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}

And in code, including a check for non-DNA characters:

original = "ATCGTCA"
bad_original = "ATCGTCAZ"

complement = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
for dna in (original, bad_original):
    try:
        output = "".join([complement[x] for x in dna])
    except KeyError:
        output = "Contains non-DNA characters"

    print(output)

Where "original" yields "TAGCAGT" and "bad_original" yields "Contains non-DNA characters".

Note that this is complement, not the reverse complement, which is usually of more interest.
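
For reference, a reverse complement can be sketched by walking the sequence backwards while looking up each base. The function name `reverse_complement` is a hypothetical helper, not something from the question.

```python
complement = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}


def reverse_complement(dna):
    """Return the complement of dna read in the opposite direction."""
    # reversed() walks the string from the last base to the first;
    # a KeyError is raised if dna contains a non-DNA character.
    return "".join(complement[base] for base in reversed(dna))


print(reverse_complement("ATCGTCA"))  # TGACGAT
```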

More generally, if you are planning on using this for sequences of DNA, you should probably look into the BioPython module (http://biopython.org/wiki/Seq#Complement_and_reverse_complement), which will get you complement (and reverse complement) with more versatility, error checking, etc.

answered Jun 30 '14 at 14:47 by iayork

Why does the code below return no value? (As in, it proceeds to a new line after I enter get_complement('ACGTAC'), without anything returned.) complement = {'A':'C', 'C':'A', 'T':'G', 'G':'T'} while {'A', 'T', 'C', 'G'}.issubset(set(dna)): if len(dna) >0: output = "".join(complement(letter) for letter in dna) else: ({'A', 'T', 'C', 'G'}!=subset(set(dna))) ouput = 'Non-DNA character!' break return print (output) – MingJian Jul 02 '14 at 11:24
I can't tell exactly what you're doing here; please format your code. I see a number of problems: (1) You have a typo, "ouput" for "output" (2) I can't tell what you're trying to do in the "else" block but I am sure it's not what you want to do. Are you trying to check for non-DNA characters there? Should there be an "if" there? In any case it's nested within the "while" clause that shouldn't allow it to ever be true (3) The while clause is also not checking that your DNA characters are correct, anyway (4) You still don't have the correct complements. "C" is not the complement of "A", etc – iayork Jul 03 '14 at 12:18
`complement = {'A':'T', 'T':'A', 'C':'G','G':'C'}` `while {'A','T','C''G'}.issubset(set(dna)):` `if len(dna) > 0:` `output = "".join(complement(i)for i in dna)` `elif ({'A', 'T', 'C', 'G'}!=subset(set(dna))):` `output = 'Non-DNA character found'` `break` `return print(output)` – MingJian Jul 04 '14 at 07:52
Your formatting still isn't showing line breaks correctly. It looks as if you have your `return` line ahead of your `print` line, so the print statement will never be reached. Your `elif` logic doesn't work. Your `while` statement precludes it from ever happening, and even if it didn't the `if` statement means the `elif` will only be reached if there is no DNA (i.e. if `len(dna) <= 0`) which isn't what you want. Moreover you probably want it to read `elif not {'A','T','C','G'}.issubset(set(dna)):` There is a missing comma in your first nucleic acid set (`{'A','T','C''G'}`). – iayork Jul 04 '14 at 12:07
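
Putting the points from these comments together, one possible sketch is shown below. It assumes the function should return the result (rather than print it) and should return a message for invalid input, as in the answer above. The name `VALID_BASES` is hypothetical. No `while` loop is used, since a single `if` check appears to be enough here.

```python
VALID_BASES = {'A', 'T', 'C', 'G'}
complement = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}


def get_complement(dna):
    """Return the complement of a DNA string, or a message if it is invalid."""
    # set(dna) holds the distinct characters in the input; they must
    # all be valid bases. Note the direction: the input's characters
    # are a subset of the valid bases, not the other way round.
    if not set(dna).issubset(VALID_BASES):
        return 'Non-DNA character found'
    # complement is a dict, so it is indexed with [], not called with ().
    return "".join(complement[base] for base in dna)


print(get_complement('ACGTAC'))  # TGCATG
print(get_complement('ACGZ'))    # Non-DNA character found
```

The caller decides whether to print the returned value, which avoids mixing `return` and `print` on one line.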
