1

I have data which is stored in dataframe with one of the column (query column) contains sql queries. I want to remove all sql comments from all these sql queries in data frame and retain the query column along with other columns in data frame.

Basically I want to modify existing data frame which contains query column. Although I know sqlparse library and implemented it on a single cell element.

Using this:

print (sqlparse.format(data['Query'][0], strip_comments=True).strip())

It is applying only to one element, but I want to modify all the sql query values in query column of that data frame.

Please note I also used loop concept to get removed all the sql comments from that query column using this below:

list=[]
for i in range(columns):
    list.append(sqlparse.format(data['Query'][i], strip_comments=True).strip())
c=pd.DataFrame(list)

This is working fine and I will get removed all the comments and get all query stored in this data frame c.

But the point is I want to modify in that original data frame so that I can use other columns (which has some other data stored) along with query column.

@ArchAngelPwn Sure. Let say Data Frame df with column Name and SQL Query. As you see for each Name there is SQL Query with some comments. I want to remove such sql comments using python, so that data frame of sql query has query only.

Name SQL Query

A
SELECT * FROM sales

-- JOIN marketing on sales.id=marketing .Id

B
--SELECT * FROM C

SELECT * FROM Employee

C

SELECT * FROM

  Customer c

 JOIN Order o on c.id=o.id

/* This query give me data from 1 jan 2020 to 30 jan 2021 */

Nitesh Jha
  • 11
  • 2
  • Can you please provide some sample data that is not a screenshot so we can look at your data directly? – ArchAngelPwn Jun 22 '22 at 15:12
  • I general I think you want to replace the old values with the new values (i.e., the original dataframe column is a list, and you replace those values with the new list - possibly something like `data['Query'].values = theNewList` but I am not sure if this is the correct syntax exactly). – topsail Jun 22 '22 at 15:13
  • @ArchAngelPwn Sure. Let say Data Frame df with column Name and SQL Query. As you see for each Name there is SQL Query with some comments. I want to remove such sql comments using python, so that data frame of sql query has query only. Name SQL Query A) SELECT * FROM A -- JOIN B on A.id=B.Id B) --SELECT * FROM C SELECT * FROM Employee C) SELECT * FROM Customer c JOIN Order o on c.id=o.id /* This query give me data from 1 jan 2020 to 30 jan 2021 */ – Nitesh Jha Jun 23 '22 at 12:01
  • data[''Query] = sqlparse.format(data['Query'], strip_comments=True).strip()) should do the trick – rayad Jun 25 '22 at 16:03

0 Answers0