3

Going through an itertuples, I need to export two data, located in two rows of an excel column ('G7', 'G8') to two columns of google sheets. How can I do this?

import gspread
from gspread_dataframe import get_as_dataframe, set_with_dataframe
from oauth2client.service_account import ServiceAccountCredentials
import pyperclip
import pyautogui as p
import rpa as r
import pandas as pd
import tabula
import openpyxl

r.init()
r.url('https://www.meudetran.ms.gov.br/veiculo.php#')
p.sleep(2)
janela = p.getActiveWindow()
janela.maximize()
p.sleep(2)

scope = ['https://spreadsheets.google.com/feeds']
credentials = ServiceAccountCredentials.from_json_keyfile_name('credentials.json', scope)
gc = gspread.authorize(credentials)
wks = gc.open_by_key('1AGYhinoPiE9xUABnrNEfVGjLf5s_bAJGjpz9hatfIQU')
worksheet = wks.get_worksheet(0)
dados = get_as_dataframe(worksheet)
df = pd.DataFrame.from_records(dados, columns=["Placa", "Renavam"])
set_with_dataframe(worksheet, df)
df2 = get_as_dataframe(worksheet)

for row in df2.itertuples():
    df = tabula.read_pdf(text, pages=1)[1]
    df.to_excel('dados.xlsx')
    wb = openpyxl.load_workbook('dados.xlsx')
    sheet = wb.active
    venc = sheet['G8'].value
    valor = sheet['G7'].value
    worksheet.update(row[3], venc)

This last line does not update column 3 of the google sheet

Tanaike
  • 181,128
  • 11
  • 97
  • 165
Lino Costa
  • 67
  • 1
  • 8
  • What is `df = tabula.read_pdf(text, pages=1)[1]` in your script? – Tanaike May 03 '21 at 01:40
  • I am converting a web pdf to table in xlsx. To get the data and feed it in the googlesheet, in the active tuple – Lino Costa May 03 '21 at 02:01
  • Thank you for replying. You want to retrieve the values from the cells "G7" and "G8" from 1st tab in the XLSX data converted from a PDF data. Is my understanding correct? If my understanding is correct, where do you want to put the retrieved values of "G7" and "G8" to Google Spreadsheet? It seems that in your script, only the value of "G8" is used. So I cannot understand about the detail of your goal. I apologize for this. – Tanaike May 03 '21 at 02:06
  • That's it. I need to import this data in 'G8' into cell 'C2' of google sheets. However, I need this to be done automatically according to the line being read in the tuple. In this case I could use worksheet.update ('C2', venc), but in all tuples, I would update the same cell in google sheets. In the next tuple, need to feed the Google 'C3' cell, then the 'C4' .... – Lino Costa May 03 '21 at 02:12
  • Thank you for replying. I have to apologize for my poor English skill. Unfortunately, from your replying, I cannot still understand about your goal. For example, you want to append the values to the column "C" in the Spreadsheet every run of script? For example, at 1st run, you want to put the retrieved values of "G7" and "G8" to the cells "C2" and "C3" of Spreadsheet. And, at 2nd run, you want to put the retrieved values of "G7" and "G8" to the cells "C4" and "C5" of Spreadsheet. You want to do this cycle. Is my understanding correct? – Tanaike May 03 '21 at 02:18
  • That's exactly it, but the value recovered from "G7" from excel to "C2" from google sheet, "G8" to "D2". Then again "G7" for the next line, "C3" and "G8" for "D3". So successively – Lino Costa May 03 '21 at 02:24
  • Thank you for replying. I understood your goal. Before I prepare an answer, I have 2 confirmation points for your situation. 1. You have already achieved about that you want to retrieve the values from the cells "G7" and "G8" from 1st tab in the XLSX data converted from a PDF data. Is my understanding correct? 2. You have already been able to get and put values for Google Spreadsheet using Sheets API. Is my understanding correct? – Tanaike May 03 '21 at 02:30
  • Yes, both points are correct – Lino Costa May 03 '21 at 02:33
  • Thanks. I would like to prepare an answer. Could you please wait for it? – Tanaike May 03 '21 at 02:34
  • Of course! Thank you in advance for your support – Lino Costa May 03 '21 at 02:38
  • Thank you for replying. From our discussions, I proposed an answer. Could you please confirm it? If that was not the direction you expect, I apologize. – Tanaike May 03 '21 at 02:46

1 Answers1

2

I believe your goal and your current situation as follows.

  • You want to retrieve the values from the cells "G7" and "G8" from 1st tab in the XLSX data converted from a PDF data.
    • You have already achieved this.
  • You want to append the values to the columns "C" and "D" in the Spreadsheet every run of script.
    • For example, at 1st run, you want to put the retrieved values of "G7" and "G8" to the cells "C2" and "D2" of Spreadsheet. And, at 2nd run, you want to put the retrieved values of "G7" and "G8" to the cells "C3" and "D3" of Spreadsheet. You want to do this cycle.
  • You have already been able to get and put values for Google Spreadsheet using Sheets API.

Modification points:

  • In your script, the values retrieved from Spreadsheet are converted to the dataframe. I thought that in your situation, this might not be required.
  • In this modification, I would like to propose the following flow.
    1. Retrieve the values from "G7" and "G8" from the XLSX data converted from PDF data.
    2. Retrieve the values from the columns "C" and "D" and retrieve the last row of the columns "C" and "D".
    3. Append the retrieved values to the columns "C" and "D" in Google Spreadsheet.

When above points are reflected to your script, it becomes as follows.

Modified script:

In this modified script, I modified below gc = gspread.authorize(credentials) in your script.

gc = gspread.authorize(credentials)
wks = gc.open_by_key('###') # Please set your Spreadsheet ID.
worksheet = wks.get_worksheet(0)

# 1. Retrieve the values from "G7" and "G8" from the XLSX data converted from PDF data.
df = tabula.read_pdf(text, pages=1)[1]
df.to_excel('dados.xlsx')
wb = openpyxl.load_workbook('dados.xlsx')
sheet = wb.active
venc = sheet['G8'].value
valor = sheet['G7'].value

# 2. Retrieve the values from the column "C" and retrieve the last row of the columns "C" and "D".
lastRow = max([len(worksheet.col_values(3)), len(worksheet.col_values(4))])

# 3. Append the retrieved values to the columns "C" and "D" in Google Spreadsheet.
worksheet.update('C' + str(lastRow + 1), [[valor, venc]])
  • In this modified script, it supposes that df = tabula.read_pdf(text, pages=1)[1] works fine. Please be careful this.
  • By above modification, the retrieved values valor, venc are appended to the columns "C" and "D" every run.

References:

Tanaike
  • 181,128
  • 11
  • 97
  • 165
  • Worked perfectly. I just need to make some adjustments, but that was exactly it. Friend, thank you very much for your help, I have been trying for a solution for days. – Lino Costa May 03 '21 at 03:05
  • 1
    @Lino Costa Thank you for replying and testing it. I'm glad your issue was resolved. Thank you, too. – Tanaike May 03 '21 at 03:11