I'm trying to understand the concept of Documents on Google App Engine's Search API. The concept I'm having trouble with is the idea behind storing documents. So for example, say in my database I have this:

class Business(ndb.Model):
   name = ndb...
   description = ndb...

For each business, I am storing a document so I can do full-text searches on the name and description. My questions are:

  1. Is this right? Does this mean we are essentially storing each entity TWICE, in two different places, just to make it searchable?

  2. If the answer to the above is yes, is there a better way to do it?

  3. And again, if the answer to number 1 is yes, where do the documents get stored? In the high-replication datastore?

I just want to make sure I am thinking about this concept correctly. Storing entities in docs means I have to maintain each entity in two separate places... doesn't seem very optimal just to keep it searchable.

flynn

1 Answer

You have it worked out already.

Full Text Search Overview

The Search API allows your application to perform Google-like searches over structured data. You can search across several different types of data (plain text, HTML, atom, numbers, dates, and geographic locations). Searches return a sorted list of matching text. You can customize the sorting and presentation of results.

Because you can't search "inside" the contents of models in the datastore, the Search API provides that ability for text and HTML.

So to link a searchable text document (e.g. a product description) to a model in the datastore (e.g. that product's price), you have to "manually" make that link between the documents and the datastore objects they relate to, for example by storing a datastore key or URL in an extra field on each document. The Search API and the datastore can be used totally independently of each other, so you have to build that link in yourself. AFAIK there is no automatic linkage between them.
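To make the "manual link" concrete, here is a minimal sketch of the pattern. It deliberately does NOT use the App Engine APIs: the two dicts are in-memory stand-ins for the datastore and a search index, and `put_business`, `search_businesses`, the `rating` field and the `biz-1` style keys are all hypothetical names made up for illustration. In a real app the indexed side would be something like a `search.Document` whose `doc_id` (or an extra field) carries the entity's key.

```python
# In-memory stand-ins for illustration only; these are NOT the App Engine APIs.
datastore = {}     # entity key -> full entity (stand-in for ndb entities)
search_index = {}  # doc_id -> searchable text fields (stand-in for a search index)


def put_business(key, name, description, rating):
    """Store the full entity, then index only the text worth searching."""
    datastore[key] = {"name": name, "description": description, "rating": rating}
    # The doc_id is the entity key, so every document links back to its entity.
    search_index[key] = {"name": name, "description": description}


def search_businesses(term):
    """Return the datastore entities whose indexed text contains the term."""
    term = term.lower()
    hits = [
        doc_id
        for doc_id, fields in sorted(search_index.items())
        if any(term in value.lower() for value in fields.values())
    ]
    # Follow the link: each matching doc_id is looked up in the datastore.
    return [datastore[doc_id] for doc_id in hits]


# Hypothetical sample data.
put_business("biz-1", "Acme Bakery", "Fresh bread and pastries", 5)
put_business("biz-2", "Bolt Hardware", "Tools, nails and fresh paint", 4)
```

Calling `search_businesses("bread")` finds the document by its text and then returns the full entity, including the non-indexed `rating`. Only the searchable text is duplicated, and the write path has to update both places.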

Paul Collingwood
  • Thanks Paul, but isn't that what I'm doing? I'm creating documents based on my models, hence I'm storing data in the datastore and also storing it in a Search API document, making it searchable, right? – flynn Mar 07 '13 at 15:34
  • Ah, I see. I had misinterpreted your question. What you can do is store the text in the full text search system and, for each document, also store an additional field (e.g. a URL or datastore key) which links to the actual product (or whatever) in the datastore. So you only ever store the text in the searchable text index, but once a document has been found you can examine its (say) product code field to determine what product that document links to. – Paul Collingwood Mar 07 '13 at 17:40
  • Okay, that's basically what I am doing, so I guess it confirms that I'm thinking about it correctly. Intuitively, I would have assumed you could use the document to describe the data stored in the datastore without actually storing datastore entities in documents. What I mean is that, instead of actually STORING the data in documents, I thought the document was there just to describe a datastore Model/entity. – flynn Mar 07 '13 at 17:52
  • Yeah, you have to "manually" make that link between the documents and the datastore objects they relate to, since you can use the Search API and the datastore totally independently of each other. AFAIK there is no automatic linkage between them. – Paul Collingwood Mar 07 '13 at 18:05
  • Thanks Paul - if you update your answer with that I will mark it as correct. – flynn Mar 08 '13 at 16:00