2

I have following documents in my Azure Cosmos DB:

{
    "id": "token",
    "User": {
        "UserToken": "token",
        "Email": "test@email.com"
    },
    "_ts": 1541493290
}

When I run the following query:

SELECT * FROM root  
WHERE ((root["User"]["UserToken"] = "token")  
OR CONTAINS(root["User"]["Email"], "token"))  
ORDER BY root["_ts"] DESC 

Nothing is returned. But when I change it a bit. For example byconverting Email to email:

SELECT * FROM root  
WHERE ((root["User"]["UserToken"] = "token")  
OR CONTAINS(root["User"]["email"], "token"))  
ORDER BY root["_ts"] DESC 

The result is found. Moreover when I remove ORDER BY clause, also query returns me a result. So the query is like following

SELECT * FROM root  
WHERE ((root["User"]["UserToken"] = "token")  
OR CONTAINS(root["User"]["Email"], "token"))  

Moreover, when I edit the document (like open it, add an empty line and save), some magic happens in the background and the document is found. For quite "new" documents (less than 1-3 months), I can search them without my "magic" trick.

Indexes definition is:

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {
                    "kind": "Range",
                    "dataType": "Number",
                    "precision": -1
                },
                {
                    "kind": "Hash",
                    "dataType": "String",
                    "precision": 3
                }
            ]
        }
    ],
    "excludedPaths": []
}

What I did wrong?

UPDATE the answer is not a full explanation but it helps a lot. Full explanation is in my blog (https://stapp.space/ridiculous-bug-in-azure-cosmos-db/)

Piotr Stapp
  • 19,392
  • 11
  • 68
  • 116

1 Answers1

1

CONTAINS(root["User"]["Email"], "token") won't work if you have strings indexed as Hash. They need to be Range with -1 precision. Hash only works for equality checks.

That's why the lowercase one is working. Because it cannot find the property and it just ignores it, falling back to the equality check. The first one finds it, sees that it's not indexed as Range and it just fails to return.

Changing indexing to this, will work:

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {
                    "kind": "Range",
                    "dataType": "Number",
                    "precision": -1
                },
                {
                    "kind": "Range",
                    "dataType": "String",
                    "precision": -1
                }
            ]
        }
    ],
    "excludedPaths": []
}

On a side note, the _ts field is not the best way to do ordering based on creation. It is a unix timestamp in seconds, so any documents created in the same second won't be properly ordered.

Nick Chapsas
  • 6,872
  • 1
  • 20
  • 29
  • How to force "re-index"? I did above change, nothing happened. – Piotr Stapp Nov 07 '18 at 13:23
  • And one more thing. The reason for the second is 100% understable. But still I don't know what is the difference between 1st and 3rd? – Piotr Stapp Nov 07 '18 at 13:35
  • Re indexing will kick in automatically upon saving the changes. You can track the progress via code. However ordering with the _ts field might not work because it has its own indexing policy as it is a system defined meta property. I had many issues with it myself and I ended up making my own property for creation time with millisecond precision. – Nick Chapsas Nov 07 '18 at 13:47