
I'm working on a Python script that extracts data from a sqlite3 database for the XBMC media application.

The data comes back as unicode objects, so when I print it the strings show the markers u' and u, and the numbers show a trailing L.

I want to convert the unicode objects back to normal UTF-8 strings.

Here is the code:

import datetime

# cur is assumed to be a cursor on an already-open sqlite3 connection.
programs = None
daysLimit = 14
start = datetime.datetime.now()
end = start + datetime.timedelta(days=daysLimit)
cur.execute('SELECT channel, title, start_date, stop_date FROM programs WHERE channel')
programs = cur.fetchall()

print(programs)
cur.close()

Here is the xbmc log:

03:49:03 T:3628  NOTICE: [(u'101 ABC FAMILY ', u'The Middle -  The Ditch',
20140520170000L, 20140520173000L), (u'101 ABC FAMILY ', u'The Goonies', 
20140520173000L, 20140520200000L), (u'101 ABC FAMILY ', u'Pirates of the Caribbean: On Stranger Tides', 
20140520200000L, 20140520230000L), (u'101 ABC FAMILY ', u'The 700 Club', 
20140520230000L, 20140521000000L), (u'101 ABC FAMILY ', u'The Fresh Prince of Bel-Air -  Day Damn One', 
20140521000000L, 20140521003000L), (u'101 ABC FAMILY ', u'The Fresh Prince of Bel-Air -  Lucky Charm', 
20140521003000L, 20140521010000L), (u'101 ABC FAMILY ', u'The Fresh Prince of Bel-Air -  The Ethnic Tip', 
20140521010000L, 20140521013000L), (u'101 ABC FAMILY ', u'The Fresh Prince of Bel-Air -  The Young and the Restless', 
20140521013000L, 20140521020000L), (u'101 ABC FAMILY ', u'Summer Sexy With T25!', 
20140521020000L, 20140521023000L), (u'101 ABC FAMILY ', u'Paid Programming', 
20140521023000L, 20140521030000L)

I want to get rid of the u', u and L markers so that the output looks like this:

'101 ABC FAMILY ', 'The Middle -  The Ditch', 20140520170000, 20140520173000, 
'101 ABC FAMILY ', 'The Goonies', 20140520173000, 20140520200000, 
'101 ABC FAMILY ', 'Pirates of the Caribbean: On Stranger Tides', 20140520200000, 20140520230000, 
'101 ABC FAMILY ', 'The 700 Club', 20140520230000, 20140521000000, 
'101 ABC FAMILY ', 'The Fresh Prince of Bel-Air -  Day Damn One', 20140521000000, 20140521003000,
and so on...

Can you please tell me how I can convert from unicode objects to UTF-8 using Python 2.6?

2 Answers

  • The L postfixes signify long integers. They are the same thing as (short) integers really; there really is no need to convert these. It is only their repr() output that includes the L; print the value directly or write it to a file and the L postfix is not included.

  • Unicode values can be encoded to UTF-8 with the unicode.encode() method:

    encoded = unicodestr.encode('utf8')
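
To connect this to the rows from the question, here is a minimal sketch that encodes only the text values in a row and leaves the integers alone. The helper name `encode_row` and the sample title are made up for illustration, and the code is written so that it should run on both Python 2.6 and Python 3 (the encoded values are `str` on Python 2 and `bytes` on Python 3).

```python
# -*- coding: utf-8 -*-

def encode_row(row, encoding='utf8'):
    """Return a copy of row with every unicode value encoded to bytes.

    encode_row is a hypothetical helper, not part of sqlite3 or XBMC.
    """
    text_type = type(u'')  # unicode on Python 2, str on Python 3
    return tuple(
        value.encode(encoding) if isinstance(value, text_type) else value
        for value in row
    )


# Hypothetical sample row; the title is invented to include a non-ASCII character.
sample = (u'101 ABC FAMILY ', u'Caf\xe9', 20140520173000, 20140520200000)
encoded = encode_row(sample)
print(encoded[1] == b'Caf\xc3\xa9')
```

Whether this step is needed at all depends on what consumes the values next; many APIs accept Unicode directly.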
    

Your beef is with the list representation here; you logged all rows, and Python containers represent their contents by calling repr() on each value. These representations are great for debugging as their types are made obvious.
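
As a small illustration of that difference, the sketch below prints one row from the question twice: once as a container, and once built from the individual values. The function name `format_row` is hypothetical, and the code is meant to run on both Python 2.6 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# One row copied from the log in the question.
row = (u'101 ABC FAMILY ', u'The Goonies', 20140520173000, 20140520200000)


def format_row(row):
    """Build the line from the values themselves instead of repr(row)."""
    channel, title, start, stop = row
    return u"'%s', '%s', %d, %d" % (channel, title, start, stop)


# Container representation: on Python 2 this shows the u prefix and,
# for long integers, the trailing L.
print(repr(row))

# Individual values: no type markers on either Python version.
print(format_row(row))
```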

It depends on what you do with these values next. It is generally a good idea to use Unicode throughout your code, and only encode at the last moment (when writing to a file, or printing or sending over the network). A good many methods handle this for you. Printing will encode to your terminal codec automatically, for example. When adding to an XML file, most XML libraries handle Unicode for you. Etc.
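
One way to encode only at the last moment is to let the file object do it. The sketch below keeps the rows as Unicode and writes them through `io.open` with an explicit encoding; `io.open` is available from Python 2.6 on. The function name `write_rows` and the file name are assumptions made for this example.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Two rows copied from the log in the question.
rows = [
    (u'101 ABC FAMILY ', u'The Middle -  The Ditch', 20140520170000, 20140520173000),
    (u'101 ABC FAMILY ', u'The Goonies', 20140520173000, 20140520200000),
]


def write_rows(path, rows):
    """Write one line per row; io.open does the UTF-8 encoding on write."""
    with io.open(path, 'w', encoding='utf-8') as handle:
        for channel, title, start, stop in rows:
            handle.write(u"'%s', '%s', %d, %d\n" % (channel, title, start, stop))


# Hypothetical output location, used only for this example.
path = os.path.join(tempfile.mkdtemp(), 'programs.txt')
write_rows(path, rows)
```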

Martijn Pieters
  • Thank you very much for this. So how can I use `encoded = unicodestr.encode('utf8')` with my code? –  Jun 17 '14 at 16:22
  • @user3667173: Yes, you can encode Unicode values to UTF-8 bytes like that. **Just make sure you don't have better options**, like not encoding manually. – Martijn Pieters Jun 17 '14 at 16:25

Your problem is that you want to display the data, but you are instead displaying the Python representation of the object.

That representation contains metadata like u, L, etc. If you want to display the data the way you want, you should write code to format it yourself.

For example:

for row in cur.fetchall():
    print u"'{row[0]}', '{row[1]}', '{row[2]}', '{row[3]}'".format(row=row)

So it will look like

'1', '2', '3', '4'
'1', '2', '3', '4'
'1', '2', '3', '4'

But it looks like you are building something like a CSV (comma-separated values) file, aren't you? If so, maybe you should read about Python's csv module.
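
A rough sketch of what that could look like follows. The function name `rows_to_csv` is hypothetical. Python 2's csv module works on byte strings while Python 3's works on text, so the sketch branches on the interpreter version and encodes to UTF-8 only on Python 2.

```python
# -*- coding: utf-8 -*-
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def rows_to_csv(rows):
    """Return the rows as CSV data (bytes on Python 2, text on Python 3)."""
    buffer = io.BytesIO() if PY2 else io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    for row in rows:
        if PY2:
            # Python 2's csv module expects byte strings, so encode text first.
            row = [v.encode('utf-8') if isinstance(v, type(u'')) else v
                   for v in row]
        writer.writerow(row)
    return buffer.getvalue()


# One row copied from the log in the question.
rows = [(u'101 ABC FAMILY ', u'The Goonies', 20140520173000, 20140520200000)]
print(rows_to_csv(rows))
```

The csv writer also takes care of quoting, so a title that contains a comma would not break the column layout.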

dt0xff
  • Thank you very much, I can see it is working now. When I try to use `print row[0], row[1], row[2], row[3].format(row=row)`, I get an error: AttributeError: 'long' object has no attribute 'format'. Any idea? –  Jun 17 '14 at 16:31
  • You should use `u"'{row[0]}', '{row[1]}', '{row[2]}', '{row[3]}'".format(row=row)` – dt0xff Jun 17 '14 at 16:39
  • The logic is: you are using a **string** pattern to **format** your data, so you should use a string object and call its `format` method. – dt0xff Jun 17 '14 at 16:40
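
To make that last point concrete, here is a short sketch using one row from the question. The variable names `template` and `line` are only illustrative. The numbers are left unquoted here to match the output the question asks for.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# One row copied from the log in the question.
row = (u'101 ABC FAMILY ', u'The Goonies', 20140520173000, 20140520200000)

# format is a method of the template string, so it is called on the string.
template = u"'{row[0]}', '{row[1]}', {row[2]}, {row[3]}"
line = template.format(row=row)
print(line)

# Integers have no format method, which is why calling .format() on
# row[3] raises the AttributeError quoted in the comment above.
print(hasattr(row[3], 'format'))
```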