
I have this statement:

Credits Electronic deposits/bank credits Effective Posted date date Amount Transaction detail 07/01 2,023,825.24 Stagecoach Sweep Credit 07/02 2,023,825.24 Stagecoach Sweep Credit 07/02 19,479.00 WT Fed#02868 E Trade Securities /Org=Etrade Securities LLC Srf# 8785491 070220 Trn#200702058382 Rfb# 07/03 2,042,191.24 Stagecoach Sweep Credit 07/06 2,042,191.24 Stagecoach Sweep Credit 07/07 2,042,191.24 Stagecoach Sweep Credit 07/08 2,042,191.24 Stagecoach Sweep Credit 07/09 2,042,191.24 Stagecoach Sweep Credit 07/10 2,042,191.24 Stagecoach Sweep Credit 07/13 2,042,191.24 Stagecoach Sweep Credit 07/14 2,041,936.79 Stagecoach Sweep Credit 07/15 2,041,936.79 Stagecoach Sweep Credit 07/15 61,683.50 WT Fed#02317 E Trade Securities /Org=Etrade Securities LLC Srf# 8824249 071520 Trn#200715067847 Rfb#

I need to create a Regex formula which would extract and separate everything after the mm/dd format.

Example:

07/02 2,023,825.24 Stagecoach Sweep Credit 07/02 19,479.00 WT Fed#02868 E Trade Securities /Org=Etrade Securities LLC Srf# 8785491 070220 Trn#200702058382 Rfb#

then next line after this statement would be one which starts immediately after this one in mm/dd format

07/03 2,042,191.24 Stagecoach Sweep Credit 07/06 2,042,191.24 Stagecoach Sweep Credit

As I am completely new to regular expressions, I have no idea how to approach this.

Many thanks in advance,

jjj
  • Stack Overflow is not a free code-writing service. Show which solutions you have tried; a statement like "I just don't know how this works" is not how Stack Overflow works. For more information on regular expressions see here: https://docs.python.org/3/library/re.html or here: https://www.w3schools.com/python/python_regex.asp – MichaelJanz Sep 11 '20 at 07:04
  • @Jizef Fujka I don't understand what you are trying to achieve. Post a proper example with the desired output. – Zaraki Kenpachi Sep 11 '20 at 07:05

2 Answers


Unfortunately, your own example does not adhere to your rule of separating after the mm/dd delimiter. Therefore it is unclear what the actual delimiter is. Nevertheless, here is an idea.

The code below shows every mm/dd match and its span in the text. Use the spans of consecutive matches to slice the original text as you need. Once you have decided exactly where each slice should start and end, wrap the slicing in a function that collects the pieces into a list:

import re

txt = """\
Credits Electronic deposits/bank credits Effective Posted date date Amount Transaction detail 07/01 2,023,825.24 Stagecoach Sweep Credit 07/02 2,023,825.24 Stagecoach Sweep Credit 07/02 19,479.00 WT Fed#02868 E Trade Securities /Org=Etrade Securities LLC Srf# 8785491 070220 Trn#200702058382 Rfb# 07/03 2,042,191.24 Stagecoach Sweep Credit 07/06 2,042,191.24 Stagecoach Sweep Credit 07/07 2,042,191.24 Stagecoach Sweep Credit 07/08 2,042,191.24 Stagecoach Sweep Credit 07/09 2,042,191.24 Stagecoach Sweep Credit 07/10 2,042,191.24 Stagecoach Sweep Credit 07/13 2,042,191.24 Stagecoach Sweep Credit 07/14 2,041,936.79 Stagecoach Sweep Credit 07/15 2,041,936.79 Stagecoach Sweep Credit 07/15 61,683.50 WT Fed#02317 E Trade Securities /Org=Etrade Securities LLC Srf# 8824249 071520 Trn#200715067847 Rfb#"""

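# Two digits, a slash, two digits: the mm/dd token
pattern = re.compile(r'(\d{2}/\d{2})')

# finditer yields one match object per mm/dd occurrence
res = pattern.finditer(txt)

# Printing a match object shows its span and the matched text
for r in res:
    print(r)


# Two examples: slices between consecutive match starts (94, 137, 180)
print(txt[94:137])
print(txt[137:180])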
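
The slicing idea above can be wrapped in a small helper. This is only a sketch: the name `split_entries` is made up for illustration, and it assumes that every mm/dd token starts a new entry and that the header text before the first date can be dropped. The sample is an excerpt of the statement from the question.

```python
import re

# Excerpt of the statement text from the question
sample = (
    "Amount Transaction detail "
    "07/01 2,023,825.24 Stagecoach Sweep Credit "
    "07/02 2,023,825.24 Stagecoach Sweep Credit "
    "07/02 19,479.00 WT Fed#02868 E Trade Securities /Org=Etrade Securities LLC "
    "Srf# 8785491 070220 Trn#200702058382 Rfb#"
)

date_pattern = re.compile(r"\b\d{2}/\d{2}\b")


def split_entries(text):
    """Hypothetical helper: one string per mm/dd match, up to the next match."""
    starts = [m.start() for m in date_pattern.finditer(text)]
    # Each entry ends where the next one starts; the last runs to the end
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


entries = split_entries(sample)
for entry in entries:
    print(entry)
```

If entries sharing a date should stay together, as the example in the question suggests, the resulting list can be grouped by its first five characters afterwards.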
Jose Mir

This is a little old, but if you didn't find a solution, then you can do so as below in Alteryx.

First, make sure that your field size is large enough (you can use a Select). Then add a RegEx tool with the following config:

Expression: (\s\d\d/\d\d\s)

Method: Replace

Text: \n$1

Then use a Text to Columns: Delimiter (\n) and Split to Rows
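
For readers without Alteryx, the same replace-then-split idea can be sketched in Python. This is an analogue, not the Alteryx workflow itself: `re.sub` stands in for the Replace method and `str.split` for Text to Columns with Split to Rows. The sample is a shortened piece of the statement.

```python
import re

# Shortened piece of the statement text from the question
text = (
    "Transaction detail 07/01 2,023,825.24 Stagecoach Sweep Credit "
    "07/02 2,023,825.24 Stagecoach Sweep Credit"
)

# Replace step: put a newline in front of every " mm/dd " token
marked = re.sub(r"(\s\d\d/\d\d\s)", r"\n\1", text)

# Split-to-rows step: break on the inserted newlines and trim whitespace
rows = [row.strip() for row in marked.split("\n")]

for row in rows:
    print(row)
```

The first row holds whatever preceded the first date (here the column header), so it may need to be dropped.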

If you then need to move the data for the same date onto the same line, you can either use a Multi-Row, or split the date out and use a Summarise to Concatenate.
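
The "split the date out and concatenate" option can also be sketched in Python. This is an assumption-laden analogue of the Summarise step, not Alteryx output: the rows are shortened from the statement, and the `" | "` separator is an arbitrary choice.

```python
from collections import defaultdict

# Rows as they might look after splitting on mm/dd (shortened for illustration)
rows = [
    "07/02 2,023,825.24 Stagecoach Sweep Credit",
    "07/02 19,479.00 WT Fed#02868 E Trade Securities",
    "07/03 2,042,191.24 Stagecoach Sweep Credit",
]

# Split the date out of each row and collect the details per date
grouped = defaultdict(list)
for row in rows:
    date, _, detail = row.partition(" ")
    grouped[date].append(detail)

# Concatenate the details for each date onto one line (separator is arbitrary)
combined = [f"{date} {' | '.join(details)}" for date, details in grouped.items()]

for line in combined:
    print(line)
```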

KaneG