I'm trying to copy all the text from an Amazon search results page (say the search term is "laptop") by sending Ctrl+A and Ctrl+C through PyAutoGUI, then read it back with either pyperclip.paste() or pd.read_clipboard() and print it. Here's the code:

import pyautogui
import time
import pyperclip
import pandas as pd

keyword = 'laptop'

time.sleep(3)
pyautogui.click(x=750, y=135)
time.sleep(1)
pyautogui.write(keyword)
time.sleep(1)
pyautogui.press('enter')
time.sleep(5)
pyautogui.hotkey('ctrl', 'a')
pyautogui.hotkey('ctrl', 'c')
time.sleep(0.1)

#raw = pyperclip.paste()
raw = pd.read_clipboard()

print(raw)

Using Pandas gives this error:

Traceback (most recent call last):
  File "c:\Users\smfah\OneDrive\Desktop\tmp\regex.py", line 32, in <module>
    raw = pd.read_clipboard()
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\clipboards.py", line 88, in read_clipboard
    return read_csv(StringIO(text), sep=sep, **kwargs)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 611, in _read
    return parser.read(nrows)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 1778, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\python_parser.py", line 282, in read
    alldata = self._rows_to_cols(content)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\python_parser.py", line 1045, in _rows_to_cols
    self._alert_malformed(msg, row_num + 1)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\python_parser.py", line 765, in _alert_malformed
    raise ParserError(msg)
pandas.errors.ParserError: Expected 4 fields in line 726, saw 7. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

And using Pyperclip gives this error:

Traceback (most recent call last):
  File "c:\Users\smfah\OneDrive\Desktop\tmp\regex.py", line 45, in <module>
    print(raw)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u200c' in position 60: character maps to <undefined>

However, if I hardcode the text in the code editor (VS Code on Windows 11) and don't print it, I can work with the hardcoded data (e.g. apply regexes to it).

text = '''long block of text'''

But I want to work on the text copied into the clipboard. I tried applying various solutions, but none worked for me.

Note: This issue does not happen on Ubuntu 22.04, so it looks like a Windows-related issue.

Any help will be greatly appreciated! Thanks!

Fahim

1 Answer

On Windows, the clipboard can be accessed with win32clipboard, which is part of the pywin32 package. To get the latest text from the clipboard:

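import win32clipboard

# get clipboard data as Unicode text (the default format, CF_TEXT, returns bytes)
win32clipboard.OpenClipboard()
try:
    data = win32clipboard.GetClipboardData(win32clipboard.CF_UNICODETEXT)
finally:
    # always release the clipboard, even if reading fails
    win32clipboard.CloseClipboard()
print(data)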

Note that win32clipboard is not part of the default Python installation; it comes with the third-party pywin32 package, which you can install with pip install pywin32.
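
Both tracebacks in the question seem to come from what happens after the clipboard is read, not from the copy itself. The pandas traceback shows read_clipboard() handing the text to read_csv(), which expects a regular table, and the pyperclip traceback is raised by print() because the cp1252 console cannot encode U+200C. Below is a minimal sketch of one way to work with the raw text instead. The helper names (strip_zero_width, console_safe, clipboard_lines) and the sample string are hypothetical, and the list of zero-width characters is an assumption about what a copied web page may contain:

```python
import sys

# Zero-width characters that often appear in copied web pages (assumed list).
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\ufeff"))


def strip_zero_width(text: str) -> str:
    """Remove zero-width characters such as U+200C from text."""
    return text.translate(ZERO_WIDTH)


def console_safe(text: str, encoding: str = "cp1252") -> str:
    """Replace characters the target encoding cannot represent with '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


def clipboard_lines(text: str) -> list[str]:
    """Split copied page text into non-empty, stripped lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Hypothetical stand-in for the string returned by pyperclip.paste().
    sample = "Laptop\u200c 15.6 inch\r\n\r\n  $499.99  \r\n"
    cleaned = strip_zero_width(sample)
    print(console_safe(cleaned, sys.stdout.encoding or "utf-8"))
    print(clipboard_lines(cleaned))
```

In the original script, the string from pyperclip.paste() would take the place of sample. If you would rather keep every character, calling sys.stdout.reconfigure(encoding="utf-8") before printing (Python 3.7+) is another option, assuming your terminal can display UTF-8.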