I need to process many thousands of strings (each string is an element in a list, imported from records in a SQL table).
Each string comprises a number of phrases separated by a consistent delimiter. For each string I need to 1) eliminate duplicate phrases; 2) sort the remaining phrases and return the deduplicated, sorted phrases as a delimited string.
This is what I've conjured:
def dedupe_and_sort(list_element, delimiter):
    list_element = delimiter.join(set(list_element.split(f'{delimiter}')))
    return delimiter.join(sorted(list_element.split(f'{delimiter}')))
string_input = 'e\\\\a\\\\c\\\\b\\\\a\\\\b\\\\c\\\\a\\\\b\\\\d'
string_delimiter = "\\\\"
output = dedupe_and_sort(string_input, string_delimiter)
print(f"Input: {string_input}")
print(f"Output: {output}")
Output is as follows:
Input: e\\a\\c\\b\\a\\b\\c\\a\\b\\d
Output: a\\b\\c\\d\\e
Is this the most efficient approach or is there an alternative, more efficient method?
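For comparison, one variant I can think of splits the string only once and sorts the set directly, skipping the intermediate join and second split (the f-string around the delimiter is also redundant, since it is already a string). This is only a sketch: the name `dedupe_and_sort_once` and the sample `rows` list are made up for illustration, and I have not benchmarked it against the version above.

```python
def dedupe_and_sort_once(text, delimiter):
    """Split once, drop duplicates with a set, sort, and re-join.

    Hypothetical helper for illustration; it assumes exact string
    equality is the right notion of "duplicate" (no whitespace or
    case normalisation is applied).
    """
    return delimiter.join(sorted(set(text.split(delimiter))))


# Illustrative data standing in for the list imported from SQL.
delimiter = '\\\\'
rows = [
    'e\\\\a\\\\c\\\\b\\\\a\\\\b\\\\c\\\\a\\\\b\\\\d',
    'z\\\\y\\\\z',
]

# Apply the helper to every element of the list.
cleaned = [dedupe_and_sort_once(row, delimiter) for row in rows]

for before, after in zip(rows, cleaned):
    print(f"Input: {before}")
    print(f"Output: {after}")
```

Sorting still dominates the cost per string, so any gain would come only from avoiding the extra join and split; timing both versions with `timeit` on real data would show whether the difference matters.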