
From a different account, I sent myself an email with the subject "Test de réception en local". Now, using IMAP, I want to find that email by searching on its subject.

When doing a search for ALL and finding the email among the output, I see:
Subject: =?ISO-8859-1?Q?Test_de_r=E9ception_en_local?=

So now, searching over IMAP, I try:

import imaplib
from email.header import Header

M = imaplib.IMAP4_SSL('imap.gmail.com', 993)
M.login('user@gmail.com', 'password')
M.select('[Gmail]/All Mail')

subject = Header(email_model.subject, 'iso-8859-1').encode() #email_model.subject is in unicode, utf-8 encoded
typ, data = M.search('iso-8859-1', '(SUBJECT "%s")' % subject)
for num in data[0].split():
    typ, data = M.fetch(num, '(RFC822)')
    print 'Message %s\n%s\n' % (num, data[0][1])
M.close()
M.logout()

print 'Fin'

If you print out subject, you see that the result appears just the same as what I'm getting from the IMAP server on my prior, more-broad search. Yet, it doesn't seem to make a match when doing this more specific search.

For the search, I have tried everything I can think of:

typ, data = M.search('iso-8859-1', '(HEADER subject "%s")' % subject)
typ, data = M.search('iso-8859-1', 'ALL (SUBJECT "%s")' % subject)

And others that I can't recall at the moment, all without any luck.

I can search (and match) for emails that have subjects that only use ASCII, but it doesn't work with any subject that has an encoding applied. So...

With IMAP, what is the proper way to search for an email using a subject that has an encoding applied?

Thanks

rfadams

2 Answers


When talking to IMAP servers, check the IMAP RFC (RFC 3501).

You must remove the extra quotes, and you must not encode the search string as a MIME header (the Header(...).encode() step). Also, the charset argument specifies the charset of the search query, not the charset of the message header. This should work (works for me):

M.search("utf-8", "(SUBJECT %s)" % u"réception".encode("utf-8"))
# this also works:
M.search("iso8859-1", "(SUBJECT %s)" % u"réception".encode("iso8859-1"))
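
To see why the header-encoded form never matches, it may help to compare the two strings side by side. The following is a minimal Python 3 sketch using only the standard `email.header` module; the variable names are illustrative and not part of the original code. It assumes, as described above, that the server matches against the decoded header text rather than the raw wire form.

```python
from email.header import Header, decode_header, make_header

plain_subject = "Test de réception en local"

# What the question's code sent as the search key:
# the RFC 2047 "encoded-word" wire form of the subject.
wire_form = Header(plain_subject, "iso-8859-1").encode()

# What is stored in the message header, as shown in the question.
stored_header = "=?ISO-8859-1?Q?Test_de_r=E9ception_en_local?="

# Decoding the stored header recovers the plain text,
# which is the string to pass as the search key.
decoded_subject = str(make_header(decode_header(stored_header)))

print(wire_form)
print(decoded_subject)
```

The search key should therefore be the plain text, encoded only into the charset named in the SEARCH command.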

Edit:

Apparently some servers (at least Gmail as of August 2013) support UTF-8 strings only when they are sent as literals. Python's imaplib has very limited support for literal arguments; the best one can do is something like:

term = u"réception".encode("utf-8")
M.literal = term
M.search("utf-8", "SUBJECT")
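
The same idea can be wrapped in a small function. This is only a sketch for Python 3: `search_subject` is a hypothetical helper name, and it relies on the `literal` attribute of the imaplib connection object exactly as the snippet above does. It assumes `conn` is an already logged-in connection with a mailbox selected.

```python
import imaplib


def search_subject(conn, text):
    """Return the message numbers whose subject contains `text`.

    Hypothetical helper: `conn` is assumed to be a logged-in
    imaplib connection with a mailbox already selected.
    """
    # imaplib appends conn.literal to the next command as an IMAP literal.
    conn.literal = text.encode("utf-8")
    typ, data = conn.search("utf-8", "SUBJECT")
    if typ != "OK":
        raise imaplib.IMAP4.error("SEARCH failed: %r" % (data,))
    # data[0] is a space-separated byte string of message numbers.
    return data[0].split()
```

With a connection `M` as above, `search_subject(M, "réception")` would return a list of message numbers as bytes, ready to pass to `M.fetch`.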
abbot
  • I will give this a try. It def seems to make sense. Thanks for the reply! – rfadams May 12 '11 at 04:04
  • I'm getting `SEARCH command error: BAD ['Could not parse command']` error with this. Even tried using the same string `u'réception'`. What gives? – john2x Aug 07 '13 at 03:33
  • @john2x, could be an IMAP server issue. Which software is running on the server side? – abbot Aug 07 '13 at 12:09
  • @john2x, it used to work some time ago, but now fails. Apparently they changed the charset handling, will have a look at this later. – abbot Aug 08 '13 at 18:05
  • @john2x, OK, Gmail IMAP now allows UTF-8 strings in search only if they are in literal form. I've updated the answer to reflect this, but in general this is bad news: Python's imaplib can pass only one (the last) literal argument in an IMAP command. The original code still works fine with other IMAP servers, e.g. with Dovecot. – abbot Aug 09 '13 at 17:14
  • Thanks @abbot. So if I were to search for more than one field, how would I do that? e.g. searching for subject + datesent + recipients. (http://stackoverflow.com/questions/18054337/python-imapclient-imaplib-search-unicode-issue/18080640?noredirect=1#comment26465268_18080640) – john2x Aug 11 '13 at 05:49
  • @john2x, as noted above, imaplib's support for literal arguments is very limited. So I would recommend looking for an IMAP implementation other than imaplib (maybe even writing your own), or, if you only need to work with Gmail, you may try to trick it by using X-GM-RAW with a single argument. – abbot Aug 11 '13 at 23:48
  • @abbot, sorry, but I'm not sure how/where to specify the X-GM-RAW attribute. Do I do it in `search()`? Thanks again. – john2x Aug 12 '13 at 02:00
  • Something like this: `M.literal = u"in:inbox after:2013/08/12 réception".encode("utf-8") ; M.search("utf-8", "X-GM-RAW")`. Google for X-GM-RAW and gmail advanced search syntax. – abbot Aug 12 '13 at 20:43
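
The X-GM-RAW approach from the comments can be sketched for several criteria at once. This is a rough Python 3 illustration, not a tested Gmail client: `build_gmail_query` and `gmail_raw_search` are hypothetical helper names, the `subject:`, `after:` and `to:` operators are taken from Gmail's search syntax, and the whole query is sent as the single literal that imaplib allows.

```python
def build_gmail_query(subject=None, after=None, to=None):
    """Join optional criteria into one Gmail search string (hypothetical helper)."""
    parts = []
    if subject:
        parts.append('subject:"%s"' % subject)
    if after:
        parts.append("after:%s" % after)  # Gmail date form, e.g. 2013/08/12
    if to:
        parts.append("to:%s" % to)
    return " ".join(parts)


def gmail_raw_search(conn, query):
    """Run `query` through Gmail's X-GM-RAW search key.

    `conn` is assumed to be a logged-in imaplib connection to Gmail
    with a mailbox already selected.
    """
    # The whole query travels as the one literal imaplib can send.
    conn.literal = query.encode("utf-8")
    typ, data = conn.search("utf-8", "X-GM-RAW")
    return data[0].split() if typ == "OK" else []
```

For example, `gmail_raw_search(M, build_gmail_query(subject="réception", after="2013/08/12"))` combines a subject and a date in one command.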

This code works as of 2021-2022. Try it with other subjects to count the matching emails, and work with mails_list if you need the email content.

import imaplib

user = 'your@email.com'
password = 'secure_password'
imap_url = 'imap.gmail.com'

M = imaplib.IMAP4_SSL(imap_url)
M.login(user, password)

M.select()

term = u"Test results".encode("utf-8")
M.literal = term
typ, data = M.search("utf-8", "SUBJECT")

mails_list = data[0].split()  # list of matching message numbers

print(len(mails_list))  # number of messages matching the search query

# close connection
M.close()
M.logout()
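
To go from mails_list to the email content, the messages can be fetched and parsed with the standard `email` package. This is a minimal Python 3 sketch: `fetch_subjects` is a hypothetical helper name, and it would need to be called before `M.close()`, while the mailbox is still selected.

```python
import email
from email import policy


def fetch_subjects(conn, message_numbers):
    """Return the decoded Subject header of each message (hypothetical helper)."""
    subjects = []
    for num in message_numbers:
        typ, data = conn.fetch(num, "(RFC822)")
        if typ != "OK":
            continue
        # data[0] is a (response header, raw message bytes) pair.
        raw_message = data[0][1]
        # policy.default decodes RFC 2047 encoded headers into text.
        msg = email.message_from_bytes(raw_message, policy=policy.default)
        subject = msg["Subject"]
        subjects.append("" if subject is None else str(subject))
    return subjects
```

Calling `fetch_subjects(M, mails_list)` would give the subjects as ordinary strings, with any header encoding already removed.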