10

I am struggling to find good material on best practices for filtering data with Firebase Firestore. I want to filter my data based on the categories selected by the user. I have a collection of documents in my Firestore database, and each document has an array that lists all the categories that apply to it. For filtering, I also keep a local array of the user's preferred categories. All I want to do is filter the documents by those preferred categories.

firestore categories field

Suppose the user's preferred categories are stored as an array of strings (["Film", "Music"]). I was planning to use Firestore's 'array-contains' operator like this:

db.collection(collectionName)
.where('categoriesArray', 'array-contains', ["Film", "Music"])

Later I found out that 'array-contains' cannot take an array as its comparison value. After looking into this issue, I decided to change my data structure as mentioned here.

categories changed to Map

Once I changed the categories from an array to a map, I thought I could chain multiple where conditions to filter the documents:

let query = db.collection(collectionName)
      .where(somefield, '==', true)

this.props.data.filterCategories.forEach((val) => {
  query = query.where(`categories.${val}`, '==', true);
});

query = query
        .orderBy(someOtherField, "desc")
        .limit(itemsPerPage)

const snapshot = await query.get()

Now problem number 2: Firestore requires composite indexes for compound queries. The categories saved in each document are dynamic, so there is no way I can create these indexes in advance. What would be the ideal solution in such cases? Any help would be deeply appreciated.

nithinpp
  • What exactly do you mean when you say that you cannot make indexes in advance? – andresmijares Jun 11 '19 at 13:18
  • In my use case, the categories field in each document is different, so I can't simply define a master set of these categories beforehand. When creating a new document, the user can choose the categories that suit the current context from a list of categories. This list comes from another collection, say categories, and the documents in this collection could be different each time. That means new categories might get added to this collection or existing ones might get deleted. In such a case I won't be able to keep up with the whole indexing thing. – nithinpp Jun 11 '19 at 13:30
  • Are you saying that Firestore rejects the query? Can you be more specific about this? Try writing your query without any loops (your current forEach loop looks like it wouldn't work - it's not actually building a query object properly). – Doug Stevenson Jun 11 '19 at 15:18
  • Is this an OR query or an AND query? Do you want to fetch documents where the category is music or film or documents where the categories include music and film? And, yes, the composite index limitation is a real hurdle but should not get in the way if you properly denormalize your data. – trndjc Jun 11 '19 at 15:42
  • @bsod I'm looking to fetch all the documents where the categories include either music or film or both along with few other filtering conditions and a setup to paginate my data. Could you please guide me a lil more detail about how I can overcome such a limitation? – nithinpp Jun 11 '19 at 16:21
  • @DougStevenson The above forEach loop snippet is just for the sake of explaining the problem. I was able to construct a query with my conditions, and upon executing the query, Firestore asked me to create a composite index. In that case I was using the second approach mentioned in my question (using a map). Now, when I executed the query db.where('categories.Film', '==', true).where('categories.Music', '==', true), Firestore threw an error saying I need to create composite indexes for 'categories.Film', 'categories.Music', etc. – nithinpp Jun 11 '19 at 16:29
  • Could you edit the question to show the actual code, and not some simulation of the code? It's generally expected on Stack Overflow that the question provide an MCVE. https://stackoverflow.com/help/minimal-reproducible-example – Doug Stevenson Jun 11 '19 at 16:36
  • I will point out that simply filtering on two values should not require you to make an index. If you add a range filter to that, then you will have to create an index. But since we can't see your actual query, we wouldn't know for sure. – Doug Stevenson Jun 11 '19 at 16:51
  • @DougStevenson I've updated my question with the query that I'm using right now. When I executed the query, firebase threw me the link to create an index with all the values I have in my categories map field – nithinpp Jun 12 '19 at 05:03
  • You will have to remove the orderBy part of your query in order to have flexible filters for equality. Consider sorting the results on the client. – Doug Stevenson Jun 12 '19 at 05:25
  • @DougStevenson So I'll have to get all my documents at once and then handle my sorting and pagination stuff from the client side? – nithinpp Jun 12 '19 at 05:43

4 Answers

6

This is a new feature of the Firebase JavaScript SDK, released on November 7, 2019:

"array-contains-any operator to combine up to 10 array-contains clauses on the same field with a logical OR. An array-contains-any query returns documents where the given field is an array that contains one or more of the comparison values"

citiesRef.where('regions', 'array-contains-any',
    ['west_coast', 'east_coast']);
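
As a rough sketch of how this operator could be applied to a user's preferred categories: since the quoted limit is 10 comparison values per query, a longer list would need to be split into groups. The helper names `chunkValues` and `buildCategoryQueries` below are hypothetical, and `collectionRef` is assumed to be a Firestore collection reference (only its `where` method is used).

```javascript
// Hypothetical helper: split a list into groups of at most `size` values.
function chunkValues(values, size = 10) {
  const groups = [];
  for (let i = 0; i < values.length; i += size) {
    groups.push(values.slice(i, i + size));
  }
  return groups;
}

// Hypothetical helper: build one array-contains-any query per group.
// `collectionRef` is assumed to expose Firestore's where(field, op, value).
function buildCategoryQueries(collectionRef, field, categories, size = 10) {
  const unique = [...new Set(categories)]; // drop repeated categories first
  return chunkValues(unique, size).map((group) =>
    collectionRef.where(field, 'array-contains-any', group)
  );
}

// Usage (assumes `db` is an initialized Firestore instance and
// 'items' is a placeholder collection name):
// const queries = buildCategoryQueries(
//   db.collection('items'), 'categoriesArray', ['Film', 'Music']);
// const snapshots = await Promise.all(queries.map((q) => q.get()));
```

When the list fits in one group, a single query is enough. With several groups, the results of the separate queries would still need to be merged and de-duplicated on the client.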
narcello
  • This seems like a clean solution. I'm glad that Firebase actually added such a feature. Will try this one out for sure. – nithinpp Nov 28 '19 at 11:05
2

Instead of iterating through each category that you wish to query and appending clauses to a single query object, each iteration should be its own independent query. And you can keep the categories in an array.

<document>
    - itemId: abc123
    - categories: [film, music, television]

If you wish to perform an OR query, you would run n queries, one per category, each using array-contains for that category. Then, on the client, you would dedup (remove duplicates from) the results based on the item's identifier. So if you wanted to query film or music, you would run 2 queries: the first for documents where categories array-contains film and the second for documents where it array-contains music. The results would be placed into the same collection and then you would simply remove all duplicates with the same itemId.

This also does not pose a problem with the composite-index limit because categories is a static field. The real problem comes with pagination, because you would need to keep a record of every fetched itemId in case a future page of results returns an item that was already fetched. If each new item is checked against a plain list of everything fetched so far, this becomes an O(N^2) scenario (more on big-o notation: https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/). And because you're deduping locally, pagination blocks as the user sees them are not guaranteed to be even. If each pagination block is set to 25 documents, for example, some pages may end up displaying 24, some 21, others 14, depending on how many duplicates were removed from each block.
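
The merge-and-dedup step described above could be sketched roughly as follows. This is only an illustration: `mergeUnique` is a hypothetical helper, each result is assumed to be a plain object with an `itemId` field (as in the document layout above), and a `Set` is used for the record of fetched ids so that each duplicate check is a constant-time lookup on average.

```javascript
// Hypothetical helper: merge several result lists into one, skipping any
// item whose itemId has been seen before. Passing the same `seenIds` set
// on every call carries the record of fetched ids across pages.
function mergeUnique(resultSets, seenIds = new Set()) {
  const merged = [];
  for (const docs of resultSets) {
    for (const doc of docs) {
      if (seenIds.has(doc.itemId)) continue; // already shown on this or an earlier page
      seenIds.add(doc.itemId);
      merged.push(doc);
    }
  }
  return merged;
}

// Usage with made-up data: one list per category query.
const seen = new Set();
const filmPage = [{ itemId: 'a' }, { itemId: 'b' }];
const musicPage = [{ itemId: 'b' }, { itemId: 'c' }];
const firstPage = mergeUnique([filmPage, musicPage], seen); // items a, b, c
```

Because duplicates are dropped after fetching, a merged page can still come out shorter than the requested page size, as noted above.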

trndjc
  • Thanks for the detailed information. So I'll have to read the same documents from Firestore a couple of times and do all the business logic on the client side? – nithinpp Jun 12 '19 at 11:20
  • In this case, yes, because Firestore is not capable of performing an OR query. – trndjc Jun 12 '19 at 14:59
  • I think I'm gonna keep my pagination and ordering stuff as it is and filter data as per user preferred categories from the client side. This seems the better approach for me at this stage. Thanks for the support btw – nithinpp Jun 13 '19 at 09:41
0

Are you planning on retrieving documents with the exact category array? Say your user preference is listed as ["Film", "Music"]. Do you wish to retrieve only those documents with Film AND Music, or do you wish to retrieve documents having Film OR Music?

If it's the latter, then maybe you can query for all documents with "Film", then query for all documents with "Music", and merge the results. However, the drawback here is some redundant document reads when a document has both "Film" and "Music" in its categories array.

You can also explore using Algolia to enable full-text search. In this case, you'd probably store the category list as a string maybe separated by commas, then update the whole string when the user changes their preferences.

For the former case, I have not come across a workable solution other than maybe storing it as a concatenated string in alphabetical order. Others might have a more solid solution than mine.

Hope this helps!

johnsing
  • Hey, Thanks for the info. I'm looking to retrieve all the documents having either Film or Music or Both along with some other filtering conditions and a setup to paginate my data. – nithinpp Jun 11 '19 at 16:24
0

Your query includes an orderBy clause. This, in combination with any equality filter, requires that you create an index to support that query. There is no way to avoid this.

If you remove the orderBy, you will be able to have flexible, dynamic filters for equality using the map properties in the document. This is the only way you will be able to have a dynamic filter without creating an index. This of course means that you will have to order and page the query results on the client.
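
A minimal sketch of ordering and paging on the client, assuming the query results have already been converted to plain objects (for example with `snapshot.docs.map((d) => d.data())`). The helper `sortAndPage` and the field name `createdAt` are hypothetical, and the sort is descending to mirror the "desc" ordering in the question.

```javascript
// Hypothetical helper: sort documents by `sortField` in descending order
// and return the page at `pageIndex` (0-based) with `itemsPerPage` items.
function sortAndPage(docs, sortField, itemsPerPage, pageIndex) {
  const sorted = [...docs].sort((a, b) => {
    if (a[sortField] > b[sortField]) return -1; // larger values first
    if (a[sortField] < b[sortField]) return 1;
    return 0;
  });
  const start = pageIndex * itemsPerPage;
  return sorted.slice(start, start + itemsPerPage);
}

// Usage with made-up data.
const fetched = [
  { id: 'x', createdAt: 1 },
  { id: 'y', createdAt: 3 },
  { id: 'z', createdAt: 2 },
];
const pageOne = sortAndPage(fetched, 'createdAt', 2, 0); // y, then z
```

The trade-off is that every matching document has to be read before the first page can be shown, since the ordering is no longer done by the query.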

Doug Stevenson
  • Well, I was kinda hoping for a better thing there. Now that I know I can't do all these stuffs altogether, Filtering the data on client side seems like a good option for me and I'm gonna go with that. – nithinpp Jun 13 '19 at 05:54
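
The client-side filtering mentioned in that last comment might look roughly like the sketch below. It assumes each fetched document is a plain object whose `categories` field is an array of strings, and that a document should be kept when it has at least one of the user's preferred categories (the OR behaviour asked for in the comments). `filterByCategories` is a hypothetical helper name.

```javascript
// Hypothetical helper: keep documents that have at least one preferred category.
function filterByCategories(docs, preferred) {
  const wanted = new Set(preferred);
  return docs.filter(
    (doc) =>
      Array.isArray(doc.categories) &&
      doc.categories.some((category) => wanted.has(category))
  );
}

// Usage with made-up data.
const pageOfDocs = [
  { id: 1, categories: ['Film', 'Television'] },
  { id: 2, categories: ['Sports'] },
  { id: 3, categories: ['Music', 'Film'] },
];
const visible = filterByCategories(pageOfDocs, ['Film', 'Music']); // ids 1 and 3
```

If the filter runs on pages that were already limited by the query, each page can end up showing fewer items than `itemsPerPage`.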