So basically I have a string:
string_1 = '(((A,B)123,C)456,(D,E)789)135'
containing a phylogenetic tree with bootstrap values in parenthetical notation (not really important to the question, but in case anyone was wondering). This example tree contains four relationships with four bootstrap values (the numbers following each close parenthesis). I have each of these relationships in a list of lists:
list_1 = [['(A,B)', 321], ['((A,B),C)', 654],
['(D,E)', 987], ['(((A,B),C),(D,E))', 531]]
each containing a relationship and its updated bootstrap value. All I need to do is to create a final string:
final = '(((A,B)321,C)654,(D,E)987)531'
where all the bootstrap values are updated to the values in list_1. I have a function to remove bootstrap values:
    import re

    def remove_bootstrap(string):
        # Split on any number that directly follows a ')' and rejoin the rest.
        matches = re.split(r'(?<=\))\d+\.*\d*', string)
        matches = ''.join(matches)
        return matches
and code to isolate relationships:
    list_of_bipart_relationships = []
    for bipart_file in list_bipart_files:
        with open(bipart_file) as open_file:
            read_file = open_file.read()
        length = len(read_file)
        for index in range(1, length):
            if read_file[index] == '(':
                # Scan forward to the parenthesis that closes this one.
                parenthesis_count = 1
                for sub_index in range(index + 1, length):
                    if read_file[sub_index] == '(':
                        parenthesis_count += 1
                    if read_file[sub_index] == ')':
                        parenthesis_count -= 1
                    if parenthesis_count == 0:
                        bad_relationship = read_file[index:sub_index + 1]
                        relationship_without_values = remove_length(bad_relationship)
                        bootstrap_value = extract(sub_index, length, read_file)
                        pair = []
                        pair.append(bootstrap_value)
                        pair.append(relationship_without_values)
                        list_of_bipart_relationships.insert(0, pair)
                        break
and I am completely at a loss. I cannot figure out how to get the program to recognize a larger relationship once a nested relationship's bootstrap value is updated. Any help would be greatly appreciated!
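
One possible direction, as a rough sketch rather than a definitive solution: instead of editing the string in place, walk the original string once, keep a stack of the positions of unmatched open parentheses, and at each close parenthesis look up the relationship as it appears in the original string with its values stripped. Because every lookup is made against the untouched input, updating a nested relationship never changes how the larger one is recognized. The names `strip_values` and `update_bootstraps` are hypothetical, `strip_values` is a `re.sub` variant of `remove_bootstrap` above, and the sketch assumes the tree has no branch lengths and that every number directly after a `)` is a bootstrap value.

```python
import re

# A number (optionally with a decimal part) such as 123 or 98.5.
NUMBER = re.compile(r'\d+(?:\.\d+)?')

def strip_values(tree):
    # Remove every number that directly follows a closing parenthesis.
    return re.sub(r'(?<=\))\d+(?:\.\d+)?', '', tree)

def update_bootstraps(tree, pairs):
    # Map each value-free relationship to its new bootstrap value.
    new_values = {relationship: value for relationship, value in pairs}
    out = []         # pieces of the rebuilt string
    open_stack = []  # positions in the ORIGINAL string of unmatched '('
    i = 0
    while i < len(tree):
        ch = tree[i]
        out.append(ch)
        i += 1
        if ch == '(':
            open_stack.append(i - 1)
        elif ch == ')':
            start = open_stack.pop()
            # Key is taken from the original string, so earlier updates
            # to nested relationships cannot affect this lookup.
            key = strip_values(tree[start:i])
            match = NUMBER.match(tree, i)
            old = match.group() if match else ''
            if match:
                i = match.end()  # skip over the old value
            # Fall back to the old value if the relationship is not listed.
            out.append(str(new_values.get(key, old)))
    return ''.join(out)

string_1 = '(((A,B)123,C)456,(D,E)789)135'
list_1 = [['(A,B)', 321], ['((A,B),C)', 654],
          ['(D,E)', 987], ['(((A,B),C),(D,E))', 531]]
final = update_bootstraps(string_1, list_1)
print(final)
```

With `string_1` and `list_1` from above, this should print the `final` string shown earlier. Relationships that do not appear in `list_1` keep their old values in this sketch; raising an error instead might be preferable if every relationship is expected to be present.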