8

I'm using python 3.9 to insert a list of multiple news from google rss news to SQL table with parameter using pyobc but always getting programming error below:

cursor.execute(query) pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Incorrect syntax near 'cò' . (102) (SQLExecDirectW)")

I checked the sql table and found out actually some of records had been imported to SQL successfully (15 records ) but not all of its (30 records)

Below its all of my codes pls help !

import bs4
from bs4 import BeautifulSoup as soup
from urllib.request import urlopen
import pyodbc

news_url="https://news.google.com/rss?hl=vi&gl=VN&ceid=VN:vi"
Client=urlopen(news_url)
xml_page=Client.read()
Client.close()
soup_page=soup(xml_page,"xml")
news_list=soup_page.findAll("item")

cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=ADMIN;DATABASE=NewsCollect2')
cursor = cnxn.cursor()

for news in news_list:
    query = f"insert into news2(Title,Source1,Time1) values (N'"+news.title.text+"',N'"+news.source.text+"',N'"+news.pubDate.text+"')"
    cursor.execute(query)
    cursor.commit()
cursor.close()
cnxn.close()

p/s I tried to extract to txt file and it worked totally fine

Parfait
  • 104,375
  • 17
  • 94
  • 125
Ng Hà
  • 81
  • 1
  • 1
  • 4
  • 4
    Guess what happens if someone enters `'); DROP TABLE Users; --` in one of those fields. Don't create queries by concatenating user data. No amount of quoting or sanitization will protect you from SQL injection. It's far safer and *easier* to use [parameterized queries](https://github.com/mkleehammer/pyodbc/wiki/Getting-started#parameters) – Panagiotis Kanavos Mar 04 '21 at 17:05
  • @Panagiotis Kanavos It worked. You saved my day. Thank you very much ! – Ng Hà Mar 04 '21 at 17:27

2 Answers2

6

As commented by @PanagiotisKanavos, use the industry recommended best practice of SQL parameterization which goes beyond Python and SQL Server but any application layer code and any SQL-compliant database.

Not only does this method safely escape user-submitted values, you also avoid breakage with special characters such as accent marks per your case and even quotes within the strings. Additionally, you enhance code readability, maintainability, and arguably efficiency. Even consider executemany:

# PREPARED STATEMENT (NO DATA)
query = "insert into news2 (Title, Source1, Time1) values (?, ?, ?)"

# LIST OF TUPLES FOR PARAMS
data = [(news.title.text, news.source.text, news.pubDate.text) for news in newslist]

# EXECUTE STATEMENT AND BIND PARAMS
cursor.executemany(query, data)
cursor.commit()
Parfait
  • 104,375
  • 17
  • 94
  • 125
6

in python3, you need to add two lines after your conn

import pyodbc as db # forgot the imports
conn = pyodbc.connect(driver=driver, server=serv, database=db,port = prt,
                  uid=usr, pwd=passwd)
conn.setdecoding(db.SQL_CHAR, encoding='latin1')
conn.setencoding('latin1')
Stella Fredo
  • 81
  • 1
  • 2