Most efficient way of splitting strings like rubik's cube notations in python?

Question

Given a string like "RL2R'F2LD'", what is the most efficient way of splitting it into the strings "R" "L2" "R'" "F2" "L" "D'"? I've tried a few methods, such as first splitting the string into individual characters and then trying to add them to a list, but nothing worked correctly.

@Madushan: I've updated my answer to explain the regular expression. — Martijn Pieters, Nov 07 '12 at 16:09

score 5 · Accepted Answer · answered Nov 07 '12 at 15:37

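import string

def rubikstring(s):
    """Yield each move in s: a letter plus any modifiers that follow it."""
    cumu = ''
    for c in s:
        # A letter starts a new move, so emit the one collected so far.
        if c in string.ascii_letters:
            if cumu:
                yield cumu
            cumu = ''
        cumu += c
    # Emit the final move.
    if cumu:
        yield cumu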

could do your job. With

>>> for i in rubikstring("RL2R'F2LD'"): i
...
'R'
'L2'
"R'"
'F2'
'L'
"D'"

you get your desired result, with

>>> list(rubikstring("RL2R'F2LD'"))
['R', 'L2', "R'", 'F2', 'L', "D'"]

as well.
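Since the question asks about efficiency, here is a rough way to time this generator against the precompiled regular expression suggested in the other answers. It is only a sketch: the repeat count of 10000 is an arbitrary choice, and the numbers will vary by machine and Python version, so no timings are quoted here.

```python
import re
import string
import timeit

def rubikstring(s):
    cumu = ''
    for c in s:
        if c in string.ascii_letters:
            if cumu:
                yield cumu
            cumu = ''
        cumu += c
    if cumu:
        yield cumu

# Precompiled pattern, as in the regular-expression answers.
cubedirs = re.compile(r"[RLFBUDrlfbudxyz][2']?")

moves = "RL2R'F2LD'"

# Both approaches should agree on valid input before timing them.
assert list(rubikstring(moves)) == cubedirs.findall(moves)

# number=10000 is an arbitrary repeat count chosen for illustration.
gen_time = timeit.timeit(lambda: list(rubikstring(moves)), number=10000)
re_time = timeit.timeit(lambda: cubedirs.findall(moves), number=10000)
print("generator: %.4fs  regex: %.4fs" % (gen_time, re_time))
```

For a string this short, either approach is probably fast enough; measuring on realistic input is the only way to be sure.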

answered Nov 07 '12 at 15:37

glglgl


I think this is the answer I'm looking for. re is a good way too, but it's not telling me how it works :D – Madushan Nov 07 '12 at 15:52
Note that this solution will also return invalid directions if there are errors in the Rubik's cube notation (e.g. it'll return `Q33'` just as happily if that's present in the input). – Martijn Pieters Nov 07 '12 at 16:08

Martijn Pieters · Answer 2 · 2012-11-07T16:07:22.737

You could use a regular expression:

import re
cubedirs = re.compile(r"[RLFBUDrlfbudxyz][2']?")
cubedirs.findall("RL2R'F2LD'")

This outputs ['R', 'L2', "R'", 'F2', 'L', "D'"].

The regular expression is actually very simple. The [..] character group means: match one character from the set given (so an R, or an L, or an F, etc.).

Then we look for a second character group optionally matching 1 character, namely a 2 or a '. The question mark after the second character group is what makes it optional; we are specifying that it's also fine if neither the ' nor the 2 character is there.

The .findall() method simply returns all matches that have been found, so you get a list of all substrings of the input string that match the pattern. Characters that don't match the pattern are skipped.
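Because anything that doesn't match is left out of the result, one way to catch bad input is to check that the matches add up to the original string. This is only a sketch: `split_moves` is a made-up helper name, and the set of accepted letters is simply the one from the pattern above.

```python
import re

# Same pattern as above: one face/rotation letter, then an optional 2 or '.
MOVE = re.compile(r"[RLFBUDrlfbudxyz][2']?")

def split_moves(s):
    """Split s into moves; raise ValueError if anything is left unmatched.

    split_moves is a hypothetical helper name used for illustration.
    """
    moves = MOVE.findall(s)
    # findall() skips characters that don't match, so if the pieces don't
    # add up to the original string, something in it was invalid.
    if "".join(moves) != s:
        raise ValueError("invalid move sequence: %r" % s)
    return moves

print(split_moves("RL2R'F2LD'"))
```

With this check, an input such as `Q33'` raises a ValueError instead of being silently dropped.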

@minitech: yup, better still. – Martijn Pieters Nov 07 '12 at 15:45

score 3 · Answer 3 · edited Sep 26 '13 at 07:57

You could use a regular expression:

[FBUDLRfbudlrxyz][2']?

Here's a live demo.

import re

s = "RL2R'F2LD'"

for m in re.finditer(r"[FBUDLRfbudlrxyz][2']?", s):
    print(m.group(0))

(Sorry for not explaining how to do it in the comments, I don't really know Python.)

edited Sep 26 '13 at 07:57

sehe


answered Nov 07 '12 at 15:38

Ry-


score 1 · Answer 4 · answered Nov 07 '12 at 15:38

As commented, a regular expression would be a good way:

>>> import re
>>> re.findall('[A-Z]{1}[0-9]{0,1}', "RL2R'F2LD'")
['R', 'L2', 'R', 'F2', 'L', 'D']

answered Nov 07 '12 at 15:38

Gonzalo


`{1}` is superfluous, `{0,1}` can be replaced by `?`, you ignored the `'`. – Ry- Nov 07 '12 at 15:39
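Applying those three points to the pattern gives something like the following. It is a sketch only: like the original, it still accepts any capital letter and any digit, so it does not validate the moves.

```python
import re

# {1} dropped, {0,1} written as ?, and ' added to the optional suffix.
result = re.findall(r"[A-Z][0-9']?", "RL2R'F2LD'")
print(result)
```

This returns ['R', 'L2', "R'", 'F2', 'L', "D'"], keeping the ' that the original pattern dropped.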
