Consider the following:

I have a MongoDB collection named C_a. It contains a very large number of documents (e.g., more than 50,000,000).

For the sake of simplicity, let's assume that each document has the following schema:

{
    "username": "Aventinus",
    "text": "I love StackOverflow!",
    "tags": [
      "programming",
      "mongodb"
    ]
}

Using a text index, I can return all documents that contain the keyword StackOverflow like this:

db.C_a.find({$text:{$search:"StackOverflow"}})

My question is the following:

Considering that the query above may return hundreds of thousands of documents, what is the easiest/fastest way to directly save the returned results into another collection named C_b?

Note: This post explains how to use aggregate to find exact matches (e.g., specific dates). I'm interested in using a text index to save all the posts that include a specific keyword.

Aventinus

1 Answer

The referenced answer is correct. The example query from that answer can be updated to use your criteria:

db.C_a.aggregate([
  {$match: {$text: {$search:"StackOverflow"}}},
  {$out:"C_b"}
]);
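
If the same copy needs to run for several keywords, the pipeline above could be wrapped in a small helper. This is only a sketch: the function name `buildCopyPipeline` and its parameters are hypothetical, and it assumes a text index already exists on the source collection. Note that `$out` replaces the target collection if it already exists.

```javascript
// Hypothetical helper (not part of MongoDB): builds the two-stage pipeline
// that copies every text-search match into another collection.
function buildCopyPipeline(keyword, targetCollection) {
  return [
    // $text must appear in the first $match stage of the pipeline.
    { $match: { $text: { $search: keyword } } },
    // $out writes the results to targetCollection, replacing it if it exists.
    { $out: targetCollection }
  ];
}

// Usage in the mongo shell (assumes a text index on C_a):
// db.C_a.aggregate(buildCopyPipeline("StackOverflow", "C_b"));
```

Because the pipeline runs entirely on the server, the matching documents are never transferred to the client.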

From the MongoDB documentation for $text:

If using the $text operator in aggregation, the following restrictions also apply.

  • The $match stage that includes a $text must be the first stage in the pipeline.
  • A text operator can only occur once in the stage.
  • The text operator expression cannot appear in $or or $not expressions.
  • The text search, by default, does not return the matching documents in order of matching scores. Use the $meta aggregation expression in the $sort stage.
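
To illustrate the first and last restrictions together, here is a hedged sketch of a pipeline that keeps `$text` in the first stage, sorts by relevance with `$meta`, and saves only the best matches. The helper name `buildTopMatchesPipeline`, the limit of 1000, and the target collection `C_top` are assumptions made up for this example; the original question wants all matches, for which the two-stage pipeline is enough.

```javascript
// Hypothetical helper: saves only the `limit` highest-scoring matches.
function buildTopMatchesPipeline(keyword, limit, targetCollection) {
  return [
    // The $text $match has to be the first stage.
    { $match: { $text: { $search: keyword } } },
    // Text search results are unsorted by default; sort by matching score.
    { $sort: { score: { $meta: "textScore" } } },
    // Keep only the top `limit` documents.
    { $limit: limit },
    // $out must be the last stage of the pipeline.
    { $out: targetCollection }
  ];
}

// Usage in the mongo shell (collection name C_top is illustrative):
// db.C_a.aggregate(buildTopMatchesPipeline("StackOverflow", 1000, "C_top"));
```
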
Jason Cust