I have data which is stored in dataframe with one of the column (query column) contains sql queries. I want to remove all sql comments from all these sql queries in data frame and retain the query column along with other columns in data frame.
Basically I want to modify existing data frame which contains query column. Although I know sqlparse library and implemented it on a single cell element.
Using this:
print (sqlparse.format(data['Query'][0], strip_comments=True).strip())
It is applying only to one element, but I want to modify all the sql query values in query column of that data frame.
Please note I also used loop concept to get removed all the sql comments from that query column using this below:
list=[]
for i in range(columns):
list.append(sqlparse.format(data['Query'][i], strip_comments=True).strip())
c=pd.DataFrame(list)
This is working fine and I will get removed all the comments and get all query stored in this data frame c.
But the point is I want to modify in that original data frame so that I can use other columns (which has some other data stored) along with query column.
@ArchAngelPwn Sure. Let say Data Frame df with column Name and SQL Query. As you see for each Name there is SQL Query with some comments. I want to remove such sql comments using python, so that data frame of sql query has query only.
Name SQL Query
A
SELECT * FROM sales
-- JOIN marketing on sales.id=marketing .Id
B
--SELECT * FROM C
SELECT * FROM Employee
C
SELECT * FROM
Customer c
JOIN Order o on c.id=o.id
/* This query give me data from 1 jan 2020 to 30 jan 2021 */