I am writing a Python script to read an .INI file. I know there is a library called configparser, but my .INI file differs a little from the "standard" format.

Normally, such a file looks like this:

[HEADER]
username = john
fruits = oranges, apples

but in my case, I have to read something like this:

[HEADER]
username john
fruits oranges apples

Is there an easy way to do that, or do I have to write my own parser?

-- Edited --

Guys, thanks for the answers. I forgot to mention something very important. The INI file (generated by painful proprietary software) can also contain multiple lines with the same key. Here is an example:

[HEADER]
day1 bananas oranges
day1 apples avocado
day2 bananas apples
  • Parsing this manually isn't even too hard - as long as none of your parameter names are meant to contain a space, you can get away with something like `line.split(' ')`. – Sebastian Lenartowicz Aug 10 '16 at 16:39
  • Just convert from your ini to regular ini before parsing: `newline = line.split(' ')[0] + "=" + ", ".join(line.split(' ')[1:])` – handle Aug 10 '16 at 17:18
  • Your example looks simple but there may be complications lurking in its format. If the same header appears twice, does it add or overwrite? Does it allow substitution variables? Can it have comments, and what does it use to mark them? Can options have multiple values, and how are they represented? Config parsing is a dark art! – tdelaney Aug 10 '16 at 17:25
  • See [the documentation on customizing the parser behavior](https://docs.python.org/3/library/configparser.html#customizing-parser-behaviour). It looks like at minimum you're looking at `configparser.ConfigParser(delimiters=(" ",), strict=False)`; however, you don't define what your last example should do! – Adam Smith Aug 10 '16 at 17:33
  • Hi guys, when there are two lines with the same key, the software just concatenates the values. I don't know why they chose to do that. In my example, a guy ate four fruits on day1 (bananas, oranges, apples and avocado). And I'm sorry, I'm not a native English speaker, so I'm not sure I've been clear enough. – Breno Arruda Aug 10 '16 at 17:34
  • Do you also seek proposals for parsing code? If yes, I'd come up with something neat. – Dominik George Aug 10 '16 at 17:35
  • Yes, Dominik. My objective is to read the INI file generated by the software, do some processing, and rewrite the INI file so that the software can run again with new parameters. The software is not well known. It was developed for a very specific situation, and I'm using the fruit example to simplify my question. – Breno Arruda Aug 10 '16 at 17:45

1 Answer

This seems like the kind of thing you'll have to write your own parser to handle. I'd still drop the result into a ConfigParser object, but only so it can be used more easily later. `configparser.ConfigParser` can do almost everything you need, but I'm not aware of any way to tell it to treat

[SomeHeader]
foo = bar
foo = baz

as `config["SomeHeader"]["foo"] == ["bar", "baz"]`, which is what you describe near the end of your question.
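To illustrate that limitation, here is a minimal sketch that feeds the question's sample to the parser options suggested in the comments (`delimiters=(" ",)` and `strict=False`). The `SAMPLE` string is just the question's example inlined for convenience.

```python
import configparser

# The question's sample, inlined so the sketch is self-contained.
SAMPLE = """\
[HEADER]
day1 bananas oranges
day1 apples avocado
day2 bananas apples
"""

# A single space is the only key/value delimiter; strict=False
# tolerates repeated keys instead of raising DuplicateOptionError.
config = configparser.ConfigParser(delimiters=(" ",), strict=False)
config.read_string(SAMPLE)

# A repeated key overwrites the earlier value rather than extending it.
day1 = config["HEADER"]["day1"]
day2 = config["HEADER"]["day2"]
print(day1)
print(day2)
```

With these options the second `day1` line replaces the first, so the built-in parser reads the format but loses values, which is why a hand-written loop is still needed for the concatenating behaviour.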

You could try something like:

import configparser
import re


def is_section_header(text):
    # What makes a section header? To me it's a word enclosed by
    # square brackets. Returns the section name, or None otherwise.
    match = re.match(r"\[([^\]]+)\]", text)
    if match:
        return match.group(1)
    return None


def get_option_value_tuple(line):
    """Returns a tuple of (option, comma-separated values as a string)"""
    # split() with no argument also copes with tabs and repeated spaces
    option, *values = line.split()
    return (option, ", ".join(values))


def parse(filename):
    config = configparser.ConfigParser()
    cursection = None
    with open(filename) as inifile:
        for line in inifile:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            sectionheader = is_section_header(line)
            if sectionheader:
                cursection = sectionheader
                try:
                    config.add_section(sectionheader)
                except configparser.DuplicateSectionError:
                    # This section already exists!
                    # how should you handle this?
                    continue  # ignore for now
            else:
                option, values = get_option_value_tuple(line)
                if config.has_option(cursection, option):
                    # repeated key: append to the values seen so far
                    config[cursection][option] += (", " + values)
                else:
                    config[cursection][option] = values
    return config

This will make:

[SomeHeader]
foo bar baz
foo spam eggs
bar baz

parse the same as a standard parse of

[SomeHeader]
foo = bar, baz, spam, eggs
bar = baz
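
Since the goal stated in the comments is to process the values and write the file back for the software, a plain dictionary of lists may be easier to work with than comma-joined strings. The following is only a sketch: `parse_text` and `dump_text` are hypothetical helper names, and it assumes the software accepts all values of a repeated key merged onto a single line, which the question does not confirm.

```python
def parse_text(text):
    """Parse space-separated INI text into {section: {option: [values]}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
            continue
        if current is None:
            raise ValueError(f"option outside any section: {line!r}")
        option, *values = line.split()
        # Repeated keys extend the existing list, mirroring the
        # concatenation described in the comments.
        sections[current].setdefault(option, []).extend(values)
    return sections


def dump_text(sections):
    """Write the structure back out, one 'option value value' line per key."""
    lines = []
    for name, options in sections.items():
        lines.append(f"[{name}]")
        for option, values in options.items():
            lines.append(" ".join([option, *values]))
        lines.append("")  # blank line after each section
    return "\n".join(lines)


SAMPLE = (
    "[HEADER]\n"
    "day1 bananas oranges\n"
    "day1 apples avocado\n"
    "day2 bananas apples\n"
)

data = parse_text(SAMPLE)
data["HEADER"]["day2"].append("kiwi")  # some processing before rewriting
output = dump_text(data)
print(output)
```

Unlike the ConfigParser version, this keeps option names in their original case and leaves each value as a separate list item, at the cost of giving up ConfigParser's own conveniences.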
Adam Smith
  • Thank you everybody for all the answers. Thank you, Adam. I wasn't expecting working code, but it will really help me since I'm a beginner in programming. – Breno Arruda Aug 11 '16 at 12:00