1

I'm trying to write a dynamodb expression to filter certain things out. I've noticed that the following partiql is significantly more expensive than I thought it would be.

 aws dynamodb execute-statement --return-consumed-capacity INDEXES --statement 'SELECT * FROM "hit_counts"."as_of_when-the_url-index-2" where "as_of_when" = ? and not contains("the_url", ?)' --parameters "[{\"S\": \"2022-06-06\"},{\"S\":\".html-dev\"}]"

 *snip*

 "ConsumedCapacity": {
    "TableName": "hit_counts",
    "CapacityUnits": 66.0,
    "Table": {
        "CapacityUnits": 0.0
    },
    "GlobalSecondaryIndexes": {
        "as_of_when-the_url-index-2": {
            "CapacityUnits": 66.0
        }
    }
}

However if I omit the contains portion in a standard query (It constantly says I can't use a primary key attribute in a filter expression), it uses significantly less capacity.

aws dynamodb query --table-name hit_counts \                                                                                               
    --index-name as_of_when-the_url-index-2 \                                                                    
    --key-condition-expression 'as_of_when=:foo' --return-consumed-capacity INDEXES \
    --expression-attribute-values "{\":foo\": {\"S\": \"2022-06-06\"}}" \
    --select ALL_PROJECTED_ATTRIBUTES

*snip* 

"ConsumedCapacity": {
    "TableName": "hit_counts",
    "CapacityUnits": 1.0,
    "Table": {
        "CapacityUnits": 0.0
    },
    "GlobalSecondaryIndexes": {
        "as_of_when-the_url-index-2": {
            "CapacityUnits": 1.0
        }
    }
}

What's going on here, and how does partiql actually create a query plan?

HSchmale
  • 1,838
  • 2
  • 21
  • 48

1 Answers1

1

Your only indexed attributes are the partition key and sort key, and to use these indexes your partition key has to be provided as an exact value and the sort key as either a value, range, or prefix. Your contains query doesn't express a constraint that can be index optimized, so DynamoDB has to filter to limit the results, and when filtering your consumption is based on the pre-filtered data size.

There is no straightforward way to do an index-driven contains constraint in DynamoDB.

Look at the vanilla non-PartiQL interface to DynamoDB. Behind the scenes these are the primitives that PartiQL has to work with.

hunterhacker
  • 6,378
  • 1
  • 14
  • 11
  • `as_of_when` is the partition key, and that is nailed down tight, and typically my sort key is scanned, but when I ask for it exclude the keys that end with a string it starts to use a lot more read request units. The prefiltered data is still tiny, a lot less than 66 RCUs would be. – HSchmale Jun 06 '22 at 23:42
  • OK, how big is it? – hunterhacker Jun 07 '22 at 00:15
  • The consumed capacity is 1 RCU (37 items) when I'm not filtering anything on the sort key. I've just tried to convert the PartiQL into a boto3 expression and it's not liking my expression saying you can't use a primary key in the FilterExpression, yet what could partiql possibly be doing? – HSchmale Jun 07 '22 at 00:19