Update: Now it's possible
AWS documentation:
https://docs.aws.amazon.com/personalize/latest/dg/export-data.html
You can use this AWS CLI command to export only the data that was added through PutEvents/PutUsers/PutItems API calls:
aws personalize create-dataset-export-job \
--job-name "job name" \
--dataset-arn "dataset ARN" \
--job-output "{\"s3DataDestination\":{\"kmsKeyArn\":\"kms key ARN\",\"path\":\"s3://bucket-name/folder-name/\"}}" \
--role-arn "role ARN" \
--ingestion-mode PUT
Here, --ingestion-mode PUT is the option that restricts the export. Per the documentation:
Specify PUT to export only data that you imported incrementally using the console or the PutEvents, PutUsers, or PutItems operations.
So I believe it covers your use case.
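If you would rather start the export from code, here is a minimal Python sketch of the same request. It assumes you pass in a boto3 client created with boto3.client("personalize"); the helper names build_export_job_request and start_put_only_export are my own, not part of any AWS library, and the ARNs and S3 path are placeholders you supply.

```python
from typing import Any, Dict, Optional


def build_export_job_request(
    job_name: str,
    dataset_arn: str,
    role_arn: str,
    s3_path: str,
    kms_key_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the arguments for CreateDatasetExportJob with ingestion mode PUT."""
    destination: Dict[str, str] = {"path": s3_path}
    if kms_key_arn:
        # The KMS key is optional; include it only when one was given.
        destination["kmsKeyArn"] = kms_key_arn
    return {
        "jobName": job_name,
        "datasetArn": dataset_arn,
        "roleArn": role_arn,
        # PUT exports only incrementally imported data (console or Put* calls).
        "ingestionMode": "PUT",
        "jobOutput": {"s3DataDestination": destination},
    }


def start_put_only_export(personalize_client: Any, **kwargs: Any) -> str:
    """Start the export job and return its ARN.

    personalize_client is assumed to be boto3.client("personalize").
    """
    request = build_export_job_request(**kwargs)
    response = personalize_client.create_dataset_export_job(**request)
    return response["datasetExportJobArn"]
```

The export runs asynchronously, so the returned ARN only tells you the job was accepted; the files appear in the S3 path once the job completes.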
Original answer (written before the export feature existed): No, it's not possible
At the time, it was simply impossible to export this data.
There was no API to retrieve a dump of your Interactions dataset in Personalize.
I believe the Lambda + Firehose workaround is the correct approach for this.
But how can you test whether PutEvents works?
To verify that interactions are actually added through PutEvents, you can use the Filters feature:
https://docs.aws.amazon.com/personalize/latest/dg/filter-expressions.html
Create a new filter with an expression similar to this one:
EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("your_event_type_name")
This excludes from recommendations any item that the user has previously interacted with through that event type.
Then you can test whether events added through the PutEvents API are recognized correctly:
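If you need the same filter for more than one event type, the expression can be generated instead of typed by hand. This is a small sketch; build_exclude_expression is a hypothetical helper of mine, and the event type names are whatever you use in your own dataset.

```python
from typing import Iterable


def build_exclude_expression(event_types: Iterable[str]) -> str:
    """Return a filter expression excluding items the user interacted with."""
    names = list(event_types)
    if not names:
        raise ValueError("at least one event type is required")
    for name in names:
        # Keep the sketch simple: refuse names that would break the quoting.
        if '"' in name:
            raise ValueError(f"event type must not contain a double quote: {name!r}")
    quoted = ", ".join(f'"{name}"' for name in names)
    return f"EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ({quoted})"
```

The resulting string is what you would paste into the console, or pass as filterExpression to the CreateFilter operation (create_filter in boto3) together with your dataset group ARN.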
1. Create the filter expression as described above.
2. Create any campaign for simple recommendations (User-Personalization recipe).
3. Connect the filter to the campaign.
4. Get recommendations for any user and save them somewhere.
5. Call the PutEvents API with one of the recommended items returned in step 4 and the user ID from step 4.
6. Get recommendations again for the same user as in step 4.
If the item that you added with the PutEvents call is no longer recommended, then you have proof that events sent through PutEvents are correctly added to the Interactions dataset.
What if the PutEvents call doesn't affect the recommendations?
Then you are most likely providing incorrect values in the API call. Personalize might return a 200 response even if the event you provided was invalid.
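The check above can be scripted. Below is a rough Python sketch, assuming runtime_client is boto3.client("personalize-runtime") and events_client is boto3.client("personalize-events"). The function names and the wait_seconds parameter are my own additions; I am not certain how quickly a new event is reflected in filtered results, so you may need to raise the pause.

```python
import time
from datetime import datetime, timezone
from typing import Any, List


def recommended_item_ids(
    runtime_client: Any, campaign_arn: str, filter_arn: str, user_id: str
) -> List[str]:
    """Return the item IDs currently recommended to the user, filter applied."""
    response = runtime_client.get_recommendations(
        campaignArn=campaign_arn, filterArn=filter_arn, userId=user_id
    )
    return [item["itemId"] for item in response["itemList"]]


def put_events_reaches_dataset(
    runtime_client: Any,
    events_client: Any,
    *,
    campaign_arn: str,
    filter_arn: str,
    tracking_id: str,
    user_id: str,
    session_id: str,
    event_type: str,
    wait_seconds: float = 0.0,  # assumption: adjust to your own observations
) -> bool:
    """Return True if an item sent via PutEvents disappears from recommendations."""
    before = recommended_item_ids(runtime_client, campaign_arn, filter_arn, user_id)
    if not before:
        raise RuntimeError("no recommendations returned, nothing to test with")
    item_id = before[0]

    # Record an interaction with one of the recommended items.
    events_client.put_events(
        trackingId=tracking_id,
        userId=user_id,
        sessionId=session_id,
        eventList=[
            {
                "eventType": event_type,  # must match the filter expression
                "itemId": item_id,
                "sentAt": datetime.now(timezone.utc),
            }
        ],
    )

    time.sleep(wait_seconds)
    after = recommended_item_ids(runtime_client, campaign_arn, filter_arn, user_id)
    return item_id not in after
```

A False result does not say why the event was ignored; it only tells you the item is still being recommended after the call.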
To fix that, try:
- Make sure the date is in the correct format. Personalize might ignore events with very old timestamps if there are many newer events (this can be configured in the Solution config).
- Check that you are not passing strange values like "null" or "undefined" for sessionId, userId, or trackingId in the PutEvents params. This might cause Personalize to ignore the event (https://github.com/aws/aws-sdk-js/issues/3371).
- Make sure you are passing the correct eventType value (it should match the eventType in the Solution and the Filter).
- If it still doesn't work, raise a support ticket with AWS and include an example of your PutEvents API call params.
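Since Personalize may accept a bad event silently, it can help to check the params yourself before sending them. The sketch below is only a heuristic based on the list above, not Personalize's actual validation rules. find_suspicious_params is a hypothetical helper; it treats userId as required (the recommendation test needs it) and expects sentAt to be a datetime, both of which are my assumptions.

```python
from datetime import datetime
from typing import Any, Dict, List

# Strings that usually mean a missing value was stringified by accident.
SUSPICIOUS_STRINGS = {"", "null", "undefined", "none", "nan"}


def _looks_invalid(value: Any) -> bool:
    """True if the value is not a string or is a stringified empty value."""
    return not isinstance(value, str) or value.strip().lower() in SUSPICIOUS_STRINGS


def find_suspicious_params(params: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in PutEvents params."""
    problems: List[str] = []
    for key in ("trackingId", "userId", "sessionId"):
        if _looks_invalid(params.get(key)):
            problems.append(f"{key} looks invalid: {params.get(key)!r}")

    events = params.get("eventList") or []
    if not events:
        problems.append("eventList is empty")
    for index, event in enumerate(events):
        if _looks_invalid(event.get("eventType")):
            problems.append(f"eventList[{index}].eventType looks invalid")
        if not isinstance(event.get("sentAt"), datetime):
            problems.append(f"eventList[{index}].sentAt is not a datetime")
    return problems
```

An empty list means none of these particular mistakes were found; the event can still be ignored for other reasons, such as an eventType that does not match the Solution or the Filter.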
Are there any simpler solutions?
Maybe there are, but in our project we use this approach, and it also tests whether the filtering feature works correctly. You will probably use filtering in the future anyway, so I believe this method is good enough.