
I think I have tried every combination of annotations, but I keep getting one of these errors: "Sorting not supported for scan expressions", "no HASH key for GSI", or "Both the Hash Key and the Range Key element in the KeySchema have the same name".

Expectation: Get values with pagination and sort by createdAt field descending.

Repository:

@Repository
@EnableScan
@EnableScanCount
interface StatementRepository : DynamoDBPagingAndSortingRepository<Statement?, String?> {
    override fun findById(id: String): Optional<Statement?>
}

Entity:

@DynamoDBTable(tableName = "statements")
class Statement {
    @get:DynamoDBAutoGeneratedKey
    @get:DynamoDBHashKey
    var id: String? = null

    @get:DynamoDBAttribute
    var statementText: String? = null

    @get:DynamoDBAttribute
    var source: String? = null

    @get:DynamoDBAttribute
    @get: DynamoDBIndexHashKey(globalSecondaryIndexName = "id")
    @get: DynamoDBIndexRangeKey(globalSecondaryIndexNames = ["id-createdAt-index"])
    var createdAt: Long? = null
}

Service:

statementRepository.findAll(PageRequest.of(page, size, Sort.by("createdAt").descending())).forEach {...

I tried almost every solution on the internet and every combination of the DynamoDBIndexHashKey and DynamoDBIndexRangeKey annotations, but I keep getting different errors.

Note: Paging was working; the problem started when I applied sorting with Sort.by("createdAt").descending().

Index on the table (created myself by trial and error):

Snapshot of index created with Partition Key id and Sort Key createdAt

Please check the annotations and help me to place them right or let me know if I am missing anything.

I am also new to Kotlin but I have knowledge of Java.

Prem
  • DynamoDB scans do not support sorting. – Seth Geoghegan Jun 21 '21 at 12:51
  • Hi Seth Geoghegan, can you please suggest a way to achieve the expected result? – Prem Jun 21 '21 at 13:11
  • I have not worked with DynamoDB via Java annotations, so I'm afraid I won't be much help there. If you want to get data out of DynamoDB sorted by time, you'll need to define a primary key that has a time component in the sort key. From there, you'll use the `query` operation (not `scan`) to fetch the data. – Seth Geoghegan Jun 21 '21 at 14:22
  • In the table/entity, createdAt is of type Long (milliseconds). Do I still need to create a PK that includes time? – Prem Jun 21 '21 at 14:47
  • Items in DynamoDB are uniquely identified by their primary key. Primary keys consist of a partition key and an optional sort key. You'd use a sort key whenever you need to apply sorting to your query results. Since you want to sort by date, you need to define a primary key for your item that includes the `createdAt` value as the sort key. – Seth Geoghegan Jun 21 '21 at 15:26
  • Okay, are you suggesting creating an index in DynamoDB? What would the partition key and sort key be? I think id and createdAt respectively, because using createdAt as both the partition key and the sort key gives an error that they have the same name. Right? – Prem Jun 21 '21 at 15:53
  • The correct partition key/sort key for your item depends on how your application needs to use the data. Since you didn't describe your access patterns, it's hard to suggest how you should define your primary key. You might want to post another StackOverflow question about selecting a primary key and include specifics about how your application will use the data. – Seth Geoghegan Jun 21 '21 at 16:00
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/234036/discussion-between-prem-and-seth-geoghegan). – Prem Jun 21 '21 at 16:08
  • Hi Seth, the current use case is pretty simple. I have a table statement with fields id,..., statement, createdAt. I want to get the elements sorted by createdAt desc, along with pagination. – Prem Jun 21 '21 at 18:46

1 Answer


Learn DynamoDB. I think this is an issue many developers have, especially those coming from a JPA/Spring background: they find the DynamoDB implementation of a JPA-style repository and have at it. But DynamoDB is more of a key-value store than a SQL database.

In DynamoDB you partition your data (whether it is in a GSI, LSI, etc. doesn't matter). Within a partition, data can optionally be sorted. So if you wanted to find all Statements from oldest to newest globally, you are out of luck, my friend: you cannot do this. You can only do this within a partition.
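To make the partition rule concrete, here is a minimal in-memory sketch in Kotlin. It is not the DynamoDB API: `PartitionedStore`, `Item`, and the key values are hypothetical names used only for illustration. It models a query as reading one partition in sort-key order (the real Query operation has a ScanIndexForward flag for the direction and an ExclusiveStartKey for paging) and a scan as walking every partition with no createdAt ordering.

```kotlin
import java.util.NavigableMap
import java.util.TreeMap

// Hypothetical item: partitionKey groups items, createdAt plays the role of the sort key.
data class Item(val partitionKey: String, val createdAt: Long, val text: String)

// Hypothetical in-memory stand-in for a table or index (NOT the DynamoDB API).
// Simplification: one item per (partitionKey, createdAt) pair.
class PartitionedStore {
    private val partitions = HashMap<String, TreeMap<Long, Item>>()

    fun put(item: Item) {
        val partition = partitions.getOrPut(item.partitionKey) { TreeMap() }
        partition[item.createdAt] = item
    }

    // Models a query: reads ONE partition in sort-key order.
    // scanIndexForward = false returns newest first; exclusiveStartKey resumes after the previous page.
    fun query(
        partitionKey: String,
        limit: Int,
        scanIndexForward: Boolean = true,
        exclusiveStartKey: Long? = null
    ): List<Item> {
        val partition = partitions[partitionKey] ?: return emptyList()
        val ordered: NavigableMap<Long, Item> =
            if (scanIndexForward) partition else partition.descendingMap()
        val remaining =
            if (exclusiveStartKey == null) ordered else ordered.tailMap(exclusiveStartKey, false)
        return remaining.values.take(limit)
    }

    // Models a scan: walks every partition; there is no global createdAt order to rely on.
    fun scan(): List<Item> = partitions.values.flatMap { it.values }
}
```

In this model, `query("STATEMENT", limit = 2, scanIndexForward = false)` returns the two newest items of that one partition, and passing the last item's `createdAt` as `exclusiveStartKey` fetches the next page. Nothing gives `scan()` a global createdAt order, which is consistent with the "Sorting not supported for scan expressions" error in the question.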

So this seemingly trivial thing you want to do is not so trivial in DynamoDB. If you wanted, for example, the oldest 10 Statements, the correct way to do this is to have a Lambda on a DynamoDB Stream that, as data comes in, pre-calculates anything you want to read later, such as the oldest/newest/average, etc.
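The pre-calculation idea can be sketched as a pure function that folds newly inserted items into a small summary. This is only an illustration of the logic such a stream consumer might run: `StatementStats`, `applyInsert`, and `applyBatch` are hypothetical names, and the wiring to an actual Lambda and stream is left out.

```kotlin
// Hypothetical pre-computed summary that a stream consumer would keep up to date
// and store somewhere cheap to read (for example, a single well-known item).
data class StatementStats(
    val count: Long = 0,
    val oldestCreatedAt: Long? = null,
    val newestCreatedAt: Long? = null
)

// Folds one newly inserted statement's createdAt into the summary.
fun applyInsert(stats: StatementStats, createdAt: Long): StatementStats =
    stats.copy(
        count = stats.count + 1,
        oldestCreatedAt = minOf(stats.oldestCreatedAt ?: createdAt, createdAt),
        newestCreatedAt = maxOf(stats.newestCreatedAt ?: createdAt, createdAt)
    )

// Folds a batch of inserts (as a stream consumer would receive them) into the summary.
fun applyBatch(stats: StatementStats, insertedCreatedAt: List<Long>): StatementStats =
    insertedCreatedAt.fold(stats, ::applyInsert)
```

Reads then fetch the summary directly instead of sorting the whole table at query time.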

If this sounds like a massive pain, then probably you don't want to use DynamoDB. The reason you would want to use it is its performance is off the charts amazing and it's cheap. It isn't a swappable SQL database.

Derrops
  • Also, I noted that there was a suggestion to maybe partition by date, so that you could just go to the partition that was oldest/newest. This seems like a good idea at first, but depending on your app you could then suffer from hot partitions, whereby you may have 1,000s of partitions but only be writing to 1, which can cause throughput/capacity problems. – Derrops Jul 07 '21 at 03:23
  • Hi @Derrops, thanks for the reply. Can you please suggest any specific source where I can get knowledge about my specific requirement/problem. – Prem Jul 07 '21 at 19:47
  • Hi @Derrops, I understand that I am not an expert in using DynamoDB in spring boot even new to this Sorting in the spring repository. Can you please help me to learn? Please share some sources where I can understand it better and resolve the problem. – Prem Jul 07 '21 at 20:04
  • I really like this https://www.youtube.com/watch?v=DIQVJqiSUkE video showing how to design tables. But finding the min/max/avg globally is not straightforward, and you would need to use DynamoDB Streams. Otherwise you can do what you are doing, but only within one partition at a time. That's key. – Derrops Jul 07 '21 at 23:50