2

How do I create a column that starts at "\" and ends at "]" from another column?

For example

A                  new_column

\\loc\ggg.x]ddj    \\loc\ggg.x]
+\\lol\lll.d]aaa   \\lol\lll.d]

I tried doing this

df['new_column'] = df['A'].str.split(']').str[0]

but it included unneeded text. I want the new column to start at the first "\" and end with "]".
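For reference, here is a minimal sketch that reproduces the attempt on the two sample rows above. The DataFrame construction and the name `attempt` are assumptions added for illustration; only the `split` call comes from the question.

```python
import pandas as pd

# Sample rows copied from the example table in the question.
df = pd.DataFrame({"A": [r"\\loc\ggg.x]ddj", r"+\\lol\lll.d]aaa"]})

# The attempt: split on "]" and keep the first piece.
attempt = df["A"].str.split("]").str[0]

# The leading "+" on the second row is kept, and the closing "]" is
# dropped on both rows because split() removes the separator.
print(attempt.tolist())
```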

helpme

2 Answers

1

Try .str.extract:

df["new_column"] = df["A"].str.extract(r"(\\.*?\])")
print(df)

Prints:

                                                                                       A                 new_column
0                                                                        \\loc\ggg.x]ddj               \\loc\ggg.x]
1                                                                       +\\lol\lll.d]aaa               \\lol\lll.d]
2  \\ddf\gdd\Ps\s\3\s[a.xls]ss'!e+'\\d\\P\2\d[d.xls]Canjet'!B42+'\\df\gds\+'\\s\P[s.pdf]  \\ddf\gdd\Ps\s\3\s[a.xls]
Andrej Kesely
  • Thanks. The problem I've now come across is that some strings contain several paths joined together. For example, "\\ddf\gdd\Ps\s\3\s\[a.xls]ss'!e+'\\d\\P\2\d\[d.xls]Canjet'!B42+'\\df\gds\+'\\s\P\[s.pdf]" returns more than just \\ddf\gdd\Ps\s\3\s\[a.xls]. – helpme Sep 16 '21 at 16:55
  • @helpme See my edit. (use the pattern `r"(\\.*?\])"`) – Andrej Kesely Sep 16 '21 at 16:56
  • Worked perfectly! Thanks! – helpme Sep 16 '21 at 17:51
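To see why the non-greedy `.*?` matters for the multi-path string raised in the comments, the following sketch may help. It compares the pattern from the answer with a greedy variant. The greedy pattern, the `"no path here"` row, `expand=False`, and the names `lazy` and `greedy` are assumptions added for comparison, not taken from the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [
        r"\\loc\ggg.x]ddj",
        r"+\\lol\lll.d]aaa",
        r"\\ddf\gdd\Ps\s\3\s[a.xls]ss'!e+'\\d\\P\2\d[d.xls]Canjet'!B42+'\\df\gds\+'\\s\P[s.pdf]",
        "no path here",  # hypothetical row with nothing to match
    ]
})

# Pattern from the answer: lazy, so the match ends at the first "]".
# expand=False returns a Series instead of a one-column DataFrame.
lazy = df["A"].str.extract(r"(\\.*?\])", expand=False)

# Greedy variant, shown only for contrast: the match runs to the last "]".
greedy = df["A"].str.extract(r"(\\.*\])", expand=False)

print(lazy.tolist())
print(greedy.tolist())
```

Rows without a match come back as NaN from `str.extract`, so the last row can be filtered or filled afterwards if needed.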
0

You could use str.replace here with a capture group:

# regex=True is needed on pandas >= 2.0; the lazy .*? stops at the first "]"
df["new_column"] = df["A"].str.replace(r'^.*?(\\\\.*?\]).*$', r'\1', regex=True)
Tim Biegeleisen
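As a rough sketch of how the `str.replace` approach behaves, the snippet below runs it on the question's sample rows. It assumes `regex=True` is passed explicitly (pandas 2.0 changed the default to literal matching) and that a lazy quantifier is used so the match stops at the first "]". The `"no path here"` row and the variable names are hypothetical, added only to show the no-match case.

```python
import pandas as pd

df = pd.DataFrame({"A": [r"\\loc\ggg.x]ddj", r"+\\lol\lll.d]aaa", "no path here"]})

# Replace the whole string with the captured group: two backslashes
# followed by everything up to the first "]".
pattern = r"^.*?(\\\\.*?\]).*$"
replaced = df["A"].str.replace(pattern, r"\1", regex=True)

# Same capture group with str.extract, for comparison.
extracted = df["A"].str.extract(r"(\\\\.*?\])", expand=False)

print(replaced.tolist())
print(extracted.tolist())
```

One difference worth noting: when the pattern does not match, `str.replace` leaves the original string unchanged, whereas `str.extract` returns NaN for that row.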