i just started using mongodb and set up a test database to hold web scraping results from a couple of scripts i created. right now, date_found is being stored as a string. when i run this in mongohub:

{"date_found" : /.*2015-05-02.*/}

i get all the documents with '2015-05-02'. awesome!

however, when i run:

for item in collection.find({"date_found": "/.*2015-05-02.*/"}):
    print item

i get nothing.

also, this:

for item in collection.find():
    print item

gives me all the documents, so it seems everything works to the extent that i can query the database.

any chance someone can tell me what bonehead mistake i'm making (or what i'm missing)?

Silas

1 Answer

In pymongo, to include a regular expression you could try something like this:

import re
# Compile the pattern in Python; PyMongo sends it to the server as a regex.
regx = re.compile(".*2015-05-02.*")
for item in collection.find({"date_found": regx}):
    print(item)

Or using the $regex operator:

import re
# Same query, written with an explicit $regex operator.
regx = re.compile(".*2015-05-02.*")
for item in collection.find({"date_found": {"$regex": regx}}):
    print(item)
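
As for why the query in the question returns nothing: in the shell, `/.*2015-05-02.*/` is a regex literal, but in Python `"/.*2015-05-02.*/"` is an ordinary string, so the query asks for documents whose date_found equals that exact text, slashes included. Here is a minimal sketch of the difference that runs without a database. The sample documents and the `find_matching` helper are hypothetical stand-ins for the real collection and for `find()`, not part of PyMongo.

```python
import re

# Hypothetical sample documents standing in for the scraped collection.
docs = [
    {"date_found": "2015-05-02 10:15:00"},
    {"date_found": "2015-05-03 08:00:00"},
]


def find_matching(documents, field, condition):
    """In-memory stand-in for find(): compiled patterns are searched
    anywhere in the value, anything else is compared for equality."""
    if hasattr(condition, "search"):  # a compiled regular expression
        return [d for d in documents if condition.search(d[field])]
    return [d for d in documents if d[field] == condition]


# A quoted string is compared literally, slashes and all.
as_string = find_matching(docs, "date_found", "/.*2015-05-02.*/")

# A compiled pattern is treated as a regex.
as_regex = find_matching(docs, "date_found", re.compile("2015-05-02"))

print(as_string)  # []
print(as_regex)   # [{'date_found': '2015-05-02 10:15:00'}]
```

The sketch drops the leading and trailing `.*` because an unanchored regex already matches anywhere in the string, so they should not be needed in the real query either.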
chridam
  • both of those are throwing an "sre_constants.error: nothing to repeat" at the regx line... – Silas May 04 '15 at 15:09
  • Looks like the leading `.` is missing in those regex strings. – JohnnyHK May 04 '15 at 15:11
  • I have updated the answer to include the leading `.`. Thanks to @JohnnyHK for the heads-up. – chridam May 04 '15 at 15:18
  • works like a charm. do you mind shedding some light on the difference? i'm assuming it has something to do with pymongo but i'm not really sure...just trying to understand. thanks again! – Silas May 04 '15 at 15:24
  • Basically the leading `.` matches any char except newline, best explained by this [**Python 2.7 Regular Expressions Cheat Sheet**](https://github.com/tartley/python-regex-cheatsheet/blob/master/cheatsheet.rst). – chridam May 04 '15 at 15:31
  • gotcha. i was talking more about the difference between querying mongodb directly and using pymongo. – Silas May 04 '15 at 15:38
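
The "nothing to repeat" error from the comments above can be reproduced with plain `re`, independent of pymongo. This is a small sketch that assumes the earlier version of the answer used a pattern starting with a bare `*`, which is what the comments suggest. The `try_compile` helper is hypothetical and only there to capture the error message.

```python
import re


def try_compile(pattern):
    """Return the compiled pattern, or the error message if it is invalid."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        return str(exc)


# A leading `*` has nothing before it to repeat, so compilation fails.
broken = try_compile("*2015-05-02.*")

# With the leading `.`, the `*` repeats "any character" and compiles fine.
fixed = try_compile(".*2015-05-02.*")

print(broken)
print(fixed.search("found on 2015-05-02 at noon") is not None)
```

The error is raised by Python when the pattern is compiled, before any query reaches the database.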