
I am working on a Python script that trims credit card statements so that they show only debit lines, removing everything else, such as the preface information and the rolling balance. The intended use is to copy the entire statement to the clipboard, run the program, and have it put the trimmed matches back on the clipboard to paste elsewhere. I am using the pyperclip module for clipboard input and output.

The script works when I have a string in the code to test the regex and output, but when I start using pyperclip.paste to assign the input, I get an empty output. The code looks like this:

#! python3

import re
import pyperclip

debitRegex = re.compile(r'''
\d+
/
.*
\d*
\.
\d\d
\n
''', re.VERBOSE)

text = str(pyperclip.paste())

matchesPartial = debitRegex.findall(text)

matches = []
for element in matchesPartial:
    matches.append(element.strip())

pyperclip.copy('\n'.join(matches))

As described above, if I assign a test string to the "text" variable, things work fine. I have troubleshot this by printing "matches" to check the output: it is correct when a string replaces "pyperclip.paste()" and empty when "pyperclip.paste()" is used.

I assume there is an aspect of pyperclip.paste() that I am misunderstanding, and that it is causing the regex to find no matches in the pasted text. Here is a sample of the kind of input I am using for testing purposes:

Having trouble making repayments....
03/06/19 01/06/19 MANGO 200.00
03/06/19 01/06/19 C ADV FEE 4.00
03/06/19 01/06/19 MULBERRY 200.00
03/06/19 03/06/19 INTERNET PAYMENT Linked A cc Trns 200.00 CR
03/06/19 03/06/19 INTERNET PAYMENT Linked A cc Trns 60.00 CR
04/06/19 04/06/19 BAT 100.00
04/06/19 04/06/19 C ADV FEE 2.50 

The output should strip the preface lines and the lines that end with "CR".

I am using this program on a vanilla Windows 10 machine with no known clipboard irregularities. Any help anyone can offer would be greatly appreciated!

  • Your regex requires a newline character to immediately follow the last digit of the amount. I see two possible problems here - there might be a trailing space on the line, or there might be a carriage return before the newline. Print the `repr()` of the clipboard text to see exactly what's in there. – jasonharper Nov 13 '20 at 04:48
  • Hey, thanks for the input. The numbers are right-aligned, with whitespace between them and the text. The .* seems to be picking up all the whitespace fine. There isn't anything after the digits I'm interested in except the \n character. The regex outputs what I want when I manually assign text but fails when I use pyperclip.paste to do so. – nothingbutsneakernet Nov 13 '20 at 06:01
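
The `repr()` check suggested in the first comment could look something like the sketch below. It rests on an unconfirmed assumption: that the clipboard text uses `\r\n` line endings (common on Windows) and that some lines carry a trailing space. The `sample` string is a hypothetical stand-in for `pyperclip.paste()`, and `debit_lines` and both pattern names are illustrative, not part of the original script.

```python
import re

# Hypothetical stand-in for pyperclip.paste(): assumes "\r\n" line endings
# and a trailing space on the last line. Not confirmed by the question.
sample = (
    "Having trouble making repayments....\r\n"
    "03/06/19 01/06/19 MANGO 200.00\r\n"
    "03/06/19 03/06/19 INTERNET PAYMENT Linked A cc Trns 200.00 CR\r\n"
    "04/06/19 04/06/19 C ADV FEE 2.50 \r\n"
)

# repr() makes invisible characters such as "\r" and trailing spaces visible.
print(repr(sample))

# Same pattern as the question, written on one line: "\n" must come
# directly after the two decimal digits.
strict = re.compile(r'\d+/.*\d*\.\d\d\n')

# Tolerant variant: allows trailing spaces/tabs and an optional "\r"
# before the end of each line.
tolerant = re.compile(r'^\d+/.*\d*\.\d\d[ \t]*\r?$', re.MULTILINE)


def debit_lines(text):
    """Return the lines that end in an amount, stripped of whitespace."""
    return [m.group().strip() for m in tolerant.finditer(text)]


print(strict.findall(sample))   # finds nothing under these assumptions
print('\n'.join(debit_lines(sample)))
```

If the `repr()` output of the real clipboard text shows `\r\n` or trailing spaces, that would explain why the pattern works on a hand-typed test string but not on pasted text. Lines ending in "CR" are still skipped by the tolerant pattern, because nothing but whitespace may follow the amount.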
