Convert string representation of list of lists to list of list python without eval

Question

I have this:

x = "[['ATRM', 'SIF', 'NWPX'], ['NAV','SENEA'], ['HES','AGYS', 'CBST', 'GTIM', 'XRSC']]"

x is a string, and I want this:

x = [['ATRM', 'SIF', 'NWPX'], ['NAV','SENEA'], ['HES','AGYS', 'CBST', 'GTIM', 'XRSC']]

where x is a list.

I would normally use eval or ast.literal_eval but those functions are unavailable. Any ideas? Maybe I can use re, but I don't know how.

And exactly *why* are these functions not available? They exist for a reason - is this some homework task? You could implement a parser function yourself. — Jan, May 21 '20 at 14:20
those functions are blacklisted on Quantopian, trying to work around them. — David Serero, May 21 '20 at 14:21
What have you tried so far? Which functions *are* available to you? — MisterMiyagi, May 21 '20 at 14:22
re.findall(r'"\s*([^"]*?)\s*"', x) something like that would work nice but this solution returns an empty list — David Serero, May 21 '20 at 14:25
Possible duplicate- https://stackoverflow.com/questions/1894269/convert-string-representation-of-list-to-list — McLovin, May 21 '20 at 14:27
Is the input always doubly-nested lists of strings? Do you have to deal with arbitrarily-nested lists, non-uniform nesting, and elements other than strings or lists? — MisterMiyagi, May 21 '20 at 14:35
@Solen'ya not duplicate because it works only with list not list of list and uses eval as the principal answer — David Serero, May 21 '20 at 14:38
@MisterMiyagi 1. Yes, 2.1 Yes, 2.2 Yes, 2.3 No elements other than lists and strings — David Serero, May 21 '20 at 14:39

score 2 · Answer 1 · answered May 21 '20 at 14:22

This is an odd workaround, but if you replace the single quotes with double quotes, you could always use a JSON parser.

>>> import json
>>> json.loads(x.replace("'", '"'))
[['ATRM', 'SIF', 'NWPX'], ['NAV', 'SENEA'], ['HES', 'AGYS', 'CBST', 'GTIM', 'XRSC']]
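One caveat, shown as a small sketch rather than a definitive implementation: the quote replacement assumes that no string contains an apostrophe or a double quote. The helper name `parse_via_json` and the `tricky` input are made up for illustration and do not come from the question.

```python
import json


def parse_via_json(text):
    # Swap single quotes for double quotes so the text becomes valid JSON.
    return json.loads(text.replace("'", '"'))


x = "[['ATRM', 'SIF', 'NWPX'], ['NAV','SENEA']]"
print(parse_via_json(x))

# Hypothetical input: an apostrophe inside a double-quoted string.
tricky = '[["O\'Neil", "SIF"]]'
try:
    parse_via_json(tricky)
except json.JSONDecodeError as exc:
    # The apostrophe was turned into a double quote, which ends the string early.
    print("replacement broke the JSON:", exc)
```

For ticker-style strings like the ones in the question this limitation should not matter.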

Cory Kramer

Nice catch - ++. – Jan May 21 '20 at 14:23
Great idea but json is blacklisted as well :/ – David Serero May 21 '20 at 14:23
@Ch3steR doesnt work, returns list of the characters without the [] but yes, list comprehension is the idea. – David Serero May 21 '20 at 14:31
@Ch3steR doesnt quite work, first element is a string when expected another list. you're getting there <3 – David Serero May 21 '20 at 14:33
@DavidSerero Check [this](https://repl.it/repls/DarkorangeBarrenAnalyst). This only works for list of lists though. – Ch3steR May 21 '20 at 14:37
@Ch3steR: I guess [writing your own parser](https://stackoverflow.com/a/61949002/1231450) is the only way to go (shamelessly self-promoting). – Jan May 22 '20 at 06:29
@Jan Yes agreed, writing parser should be first choice IMO, `ast.literal_eval` and `eval` should be the last resort. – Ch3steR May 22 '20 at 06:32

score 1 · Answer 2 · answered May 22 '20 at 06:23

Imo, you need to write your own little parser here, e.g.:

def tokenizer(string):
    buffer = ""
    quote = False
    for c in string:
        if quote:
            if c == "'":
                yield ("VALUE", buffer)
                buffer = ""
                quote = not quote
            else:
                buffer += c
        else:
            if c == "[":
                yield ("LIST_OPEN", None)
            elif c == "]":
                yield ("LIST_CLOSE", None)
            elif c == "'":
                quote = not quote
            else:
                pass


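def parser(tokens):
    # Use one shared iterator so recursive calls continue where the caller stopped.
    tokens = iter(tokens)
    lst = []
    for kind, value in tokens:
        if kind == "LIST_OPEN":
            # A nested list starts: parse it recursively and store the result.
            lst.append(parser(tokens))
        elif kind == "LIST_CLOSE":
            # The current list is complete.
            return lst
        elif kind == "VALUE":
            lst.append(value)
    # Only the outermost call gets here; lst wraps the top-level list.
    return lst[0]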

With some test assertions:

assert parser(tokenizer("['HES', ['ATRM', 'SIF', 'NAV']]")) == ['HES', ['ATRM', 'SIF', 'NAV']]
assert parser(tokenizer("[['ATRM', 'SIF', 'NWPX'], ['NAV','SENEA'], ['HES','AGYS', 'CBST', 'GTIM', 'XRSC']]")) == [['ATRM', 'SIF', 'NWPX'], ['NAV','SENEA'], ['HES','AGYS', 'CBST', 'GTIM', 'XRSC']]

The idea is to first tokenize your string into values and commands and then convert this to an actual list.
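As a possible variation on the same idea, the two steps can be folded into a single pass that keeps an explicit stack of open lists instead of recursing. This is only a sketch under the assumptions stated in the comments (single-quoted strings, no escaped quotes); the function name `parse_nested` is made up for illustration.

```python
def parse_nested(text):
    """Parse a string such as "[['a', 'b'], ['c']]" into nested lists of str.

    Assumes strings are single-quoted and contain no escaped quotes.
    """
    stack = []      # lists that are currently open, innermost last
    result = None   # the most recently closed list; the outermost one at the end
    buffer = None   # characters of the string being read, or None outside quotes
    for ch in text:
        if buffer is not None:
            # Inside a quoted string: collect characters until the closing quote.
            if ch == "'":
                stack[-1].append("".join(buffer))
                buffer = None
            else:
                buffer.append(ch)
        elif ch == "[":
            new_list = []
            if stack:
                stack[-1].append(new_list)  # nest inside the enclosing list
            stack.append(new_list)
        elif ch == "]":
            if not stack:
                raise ValueError("unbalanced ']'")
            result = stack.pop()
        elif ch == "'":
            if not stack:
                raise ValueError("string outside of a list")
            buffer = []
        # Commas and whitespace between items are ignored.
    if stack or buffer is not None or result is None:
        raise ValueError("input is not a complete list")
    return result


x = "[['ATRM', 'SIF', 'NWPX'], ['NAV','SENEA'], ['HES','AGYS', 'CBST', 'GTIM', 'XRSC']]"
print(parse_nested(x))
```

It uses no imports at all, which may help in environments where modules such as ast or json are blocked, and it handles arbitrary and non-uniform nesting.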

score 0 · Answer 3 · answered May 21 '20 at 14:48

I acknowledge this is a very janky and limited answer: it only works for input shaped like the example text.

def list_list_str_to_list(data_str):
    final_word_list_list = []
    for temp_list_as_str in data_str.split("],"):
        final_word_list = []
        for raw_word in temp_list_as_str.split(","):
            new_word = raw_word
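            # Remove brackets, commas and quotes left over from the list syntax.
            for letter in "[],'\"":
                new_word = new_word.replace(letter, "")
            # Also drop the spaces that follow the commas in the input.
            final_word_list.append(new_word.strip())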
        final_word_list_list.append(final_word_list)
    return final_word_list_list


def main():
    data_str = "[['ATRM', 'SIF', 'NWPX'], ['NAV','SENEA'], ['HES','AGYS', 'CBST', 'GTIM', 'XRSC']]"

    for final_word_list in list_list_str_to_list(data_str):
        print(final_word_list)


main()

The main idea is that an inner list ends wherever "]," occurs, so splitting the string on that substring separates the inner lists. The bulk of the code just cleans each word by removing unwanted surrounding characters such as brackets, quotes, and spaces. To reiterate, this only works if:

The string is a string representation of ONLY a 2d list and
There are no brackets, commas, or single/double quotes within the individual strings
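The same splitting idea can be written more compactly with `str.strip`, which removes a set of characters from both ends of a word. This is a minimal sketch under the two conditions listed above; the function name `split_2d` is made up for illustration.

```python
def split_2d(data_str):
    """Parse a two-level list of quoted strings by splitting on "],"."""
    # Drop the outermost brackets so only the inner lists remain.
    inner = data_str.strip()[1:-1]
    rows = inner.split("],")
    # strip() removes spaces, brackets and quotes from both ends of each word.
    return [[word.strip(" []'\"") for word in row.split(",")] for row in rows]


data_str = "[['ATRM', 'SIF', 'NWPX'], ['NAV','SENEA'], ['HES','AGYS', 'CBST', 'GTIM', 'XRSC']]"
for row in split_2d(data_str):
    print(row)
```

Like the longer version, this does not handle empty inner lists or deeper nesting.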

score 0 · Answer 4 · answered May 21 '20 at 14:52

OK, I think I found the answer using re.

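import re

x = "[['ATRM', 'SIF', 'NWPX'], ['NAV','SENEA'], ['HES','AGYS', 'CBST', 'GTIM', 'XRSC']]"

# Each inner list is the text between "[" and the next "]"; x[1:] skips the outer "[".
y = re.findall(r"\[(.+?)\]", x[1:])

# Within each inner list, every quoted string becomes one element.
answer = [re.findall(r"'(.+?)'", inner) for inner in y]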
