Questions tagged [petl]

petl is a general-purpose ETL package (extract, transform, load) for Python.

petl supports extracting data from XML, HTML, JSON, CSV and text files, databases, Excel, as well as Python objects, arrays, Pandas DataFrames, pickle files and other sources.

It offers various ways of transforming and combining extracted data.

38 questions
0 votes, 0 answers

How to append a petl table with uppercase headers to a PostgreSQL table with lowercase headers

I have this petl table in Python that I want to connect to a local PostgreSQL database. But some petl table headers are in uppercase, e.g., ID not id. I realised that when I tried to append the table to PostgreSQL, the headers were not…
David Obembe
0 votes, 1 answer

How to Load Text Data into Data Frame from Response Object

I am attempting to convert a curl request into a get-request to pull some data for work and transfer it to a local folder with a parameterized file name. One issue is that the data is only in text format and will not convert to JSON, even after…
Owen
0 votes, 1 answer

Python or PETL Parsing XML

I have been playing with PETL and seeing if I could extract multiple XML files and combine them into one. I have no control over the structure of the XML files. Here are the variations I am seeing and which are giving me trouble. XML File 1…
0 votes, 2 answers

Want to create a key value pair list from csv file but unable to

I want to create a list of key-value pairs with the output from the /getService route. I am able to filter the data that I wanted (Suburb and Services) from the CSV file vet_service_locations but want to have it as key-value pairs. Where keys are suburbs…
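A minimal sketch of one way to build such a mapping with the standard library only. The column names Suburb and Services come from the question; the sample rows and the helper name `services_by_suburb` are hypothetical, and the real vet_service_locations file may be laid out differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for vet_service_locations; the real values are not shown in the question.
SAMPLE_CSV = """Suburb,Services
Northside,Vaccination
Northside,Dental
Riverton,Surgery
"""


def services_by_suburb(csv_file):
    """Map each suburb (key) to the list of services found for it (value)."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row["Suburb"]].append(row["Services"])
    return dict(grouped)


# io.StringIO stands in for an open file handle so the example is self-contained.
result = services_by_suburb(io.StringIO(SAMPLE_CSV))
print(result)
```

Grouping into lists keeps every service when a suburb appears on more than one row; a plain `dict` comprehension would silently keep only the last one.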
0 votes, 1 answer

Using Python with semi-structured data, how to add a column value based on text encountered in preceding row

I am trying to transform some data into a structured format and do a minor transformation. The source is a .csv file that is actually semi-structured that looks like this: I would like the resulting data from output to look like this, and it is ok…
Mike G
0 votes, 0 answers

How to extract a table from any file using Python?

I'm writing a Python program to extract tables from Excel sheets and PDFs. Currently, I'm using different libraries for each file type: xlrd for Excel sheets, pdfminer for PDFs. I'm wondering if there is a generic approach to extract tables from any…
Parag
0 votes, 1 answer

How to get strptime from row and compare to current time?

First off, I am very new to all of Python. I am now trying to figure out how to replace a time string in a certain column (CSV) when that time is greater than the current time. The script I am building from is relying on petl, so that is what I am…
flonuts
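A rough sketch of the parse-and-compare step using only the standard library. The time format, the column name `when`, the sample rows and the helper `cap_future_time` are all assumptions made for illustration, since the question does not show the actual data.

```python
from datetime import datetime

# Assumed format; the question does not show what the time strings look like.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def cap_future_time(value, now):
    """Return `now` as a string if `value` is later than `now`, else `value` unchanged."""
    parsed = datetime.strptime(value, TIME_FORMAT)
    if parsed > now:
        return now.strftime(TIME_FORMAT)
    return value


# A fixed "current time" keeps the example repeatable; real code would use datetime.now().
now = datetime(2024, 1, 1, 12, 0, 0)

# Hypothetical rows in the list-of-lists shape that a CSV reader produces.
rows = [
    ["id", "when"],
    ["a", "2023-12-31 08:00:00"],
    ["b", "2024-06-01 09:30:00"],
]

header, body = rows[0], rows[1:]
col = header.index("when")
updated = [header] + [
    row[:col] + [cap_future_time(row[col], now)] + row[col + 1:] for row in body
]
print(updated)
```

In a petl pipeline, a function like this could probably be applied to a single column with `petl.convert(table, 'when', lambda v: cap_future_time(v, now))`, again assuming that column name.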
0 votes, 0 answers

mysql.connector.errors.ProgrammingError: 1064 (42000):

I'm trying to insert some records into the MySQL database using the petl module, but it returns the error: mysql.connector.errors.ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL…
0 votes, 1 answer

PETL python removing rows

How can I delete rows using the petl library? I have loaded the data using: self.tab = petl.fromcsv(self.filename, delimiter=self.delimiter, encoding=self.source_encoding) Now how can I delete rows in self.tab with conditions? I think in pandas you…
Derek Lee
0 votes, 1 answer

Pythonic syntax for extended variable transformation (multiple lengthy method calls)

Trying to seek some guidance on the best way of curating an extensive ETL process. My pipeline has a reasonably sleek extract section, and loads into a designated file in a succinct manner; but the only way I can think to do transformation steps is…
jukedl
0 votes, 1 answer

Get list of columns in which data differs for consecutive rows

I have a table with duplicate rows on consecutive lines. Rows with the same 'id' should have identical data in the other columns, but in a few rows the data does not match. E.g.: id Name Age 1 Ram 12 1 Ram 10 2 Shyam 11 2 Yam…
TeeKay
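One possible way to report the mismatching columns, sketched in plain Python rather than petl. The first two data rows are taken from the question; the rows with id 3 and the helper name `differing_columns` are hypothetical, added only so the example shows a second mismatch.

```python
def differing_columns(header, rows, key="id"):
    """For each pair of consecutive rows sharing `key`, list the columns whose values differ."""
    key_idx = header.index(key)
    report = []
    for prev, curr in zip(rows, rows[1:]):
        if prev[key_idx] != curr[key_idx]:
            continue  # different ids, so the rows are not expected to match
        diff = [name for name, a, b in zip(header, prev, curr) if a != b]
        if diff:
            report.append((curr[key_idx], diff))
    return report


header = ["id", "Name", "Age"]
rows = [
    [1, "Ram", 12],
    [1, "Ram", 10],
    [3, "Sita", 9],  # hypothetical row, not from the question
    [3, "Gita", 9],  # hypothetical row, not from the question
]

mismatches = differing_columns(header, rows)
print(mismatches)
```

This compares each row only with the one directly before it, which matches the "consecutive rows" framing; ids that repeat non-adjacently would need the rows sorted or grouped first.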
0 votes, 1 answer

How to join two tables from different databases with petl

I am using the petl Python package to perform some queries on tables that are stored in SQL Server databases. I now need to do a JOIN between 2 tables that are in different databases. The function petl.fromdb only accepts, as far as I know, one…
Pitrako Junior
0 votes, 1 answer

How to fix invalid source argument when using JSON from API

I'm trying to pull financial data from the polygon.io API to organize it and later inject it into an Azure database. In order to accomplish this I have been using the "petl" Python package but I am having issues starting with an example table…
0 votes, 1 answer

Loading JSON, HTML, XML, or Text into PETL from memory rather than file

The PETL documentation states that in order to load JSON, HTML, XML, or text the data can only originate from a file. How can I load data into PETL in any of these formats from memory such as a string variable rather than a file? This would be…
Bosco
0 votes, 1 answer

Reading XML files with Petl

I'm trying to parse information from an XML file into a table that has already been created from another CSV file with Petl and am having trouble with the syntax of the fromxml() function. The XML file contains:
JackDG