Following the AWS Personalize documentation, I imported my datasets (User, Item, Interaction) from S3, created an event tracker, trained the model, and deployed the campaign. The solution works without any issues and I get recommendations.

I rely on PutEvents to add new user-item interaction events. I also dump those interaction events to my S3 bucket using Lambda + Firehose. But I am wondering whether AWS Personalize internally creates or augments the original user-item interaction dataset. How can I access and download the revised version of the dataset? I cannot see any new dataset in "Dataset groups > Datasets" other than my original 3 datasets.

I would prefer to export it regularly from AWS Personalize to my S3 storage rather than maintain my own Lambda + Firehose solution.

This is the output of my PutEvents call. I see a 200, but I am not sure whether it actually worked. Should I see a new dataset created by PutEvents in "Dataset groups > Datasets"?

    {
        "ResponseMetadata": {
            "RequestId": "a6c96496-cbd6-4ad8-9183-371d1794cbd8",
            "HTTPStatusCode": 200,
            "HTTPHeaders": {
                "content-type": "application/json",
                "date": "Mon, 04 Jan 2021 18:04:28 GMT",
                "x-amzn-requestid": "a6c96496-cbd6-4ad8-9183-371d1794cbd8",
                "content-length": "0",
                "connection": "keep-alive"
            },
            "RetryAttempts": 0
        }
    }
user2867237

1 Answer

Update: Now it's possible

AWS documentation: https://docs.aws.amazon.com/personalize/latest/dg/export-data.html

You can use this AWS CLI command to export only the records that were added by PutEvents/PutUsers/PutItems API calls:

aws personalize create-dataset-export-job \
  --job-name <job-name> \
  --dataset-arn <dataset-arn> \
  --job-output "{\"s3DataDestination\":{\"kmsKeyArn\":\"<kms-key-arn>\",\"path\":\"s3://<bucket-name>/<folder-name>/\"}}" \
  --role-arn <role-arn> \
  --ingestion-mode PUT

Here --ingestion-mode PUT restricts the export to incrementally added records. Quoting the documentation:

"Specify PUT to export only data that you imported incrementally using the console or the PutEvents, PutUsers, or PutItems operations."

So I believe it covers your use case.
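
If you would rather start the export from Python than from the CLI, the same request can be sketched with boto3. This is a minimal sketch, not a tested production script: the helper names build_export_job_request and start_export are my own, and it assumes boto3 is installed and credentials are configured. The request fields mirror the CLI flags above.

```python
def build_export_job_request(job_name, dataset_arn, role_arn, s3_path,
                             kms_key_arn=None, ingestion_mode="PUT"):
    """Build keyword arguments for create_dataset_export_job.

    Hypothetical helper: it only assembles and sanity-checks the request.
    """
    if ingestion_mode not in ("BULK", "PUT", "ALL"):
        raise ValueError("ingestion_mode must be BULK, PUT or ALL")
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must look like s3://bucket-name/folder-name/")

    destination = {"path": s3_path}
    if kms_key_arn:
        # The KMS key is optional; include it only when one is given.
        destination["kmsKeyArn"] = kms_key_arn

    return {
        "jobName": job_name,
        "datasetArn": dataset_arn,
        "ingestionMode": ingestion_mode,
        "roleArn": role_arn,
        "jobOutput": {"s3DataDestination": destination},
    }


def start_export(**kwargs):
    """Submit the export job and return its ARN (needs boto3 and AWS access)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("personalize")
    response = client.create_dataset_export_job(**build_export_job_request(**kwargs))
    return response["datasetExportJobArn"]
```

Calling start_export(job_name=..., dataset_arn=..., role_arn=..., s3_path=...) on a schedule (for example from a scheduled Lambda) would give you the regular dump you asked about, without the Firehose pipeline.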

Original answer: No, it's not possible

At the time this answer was first written, it was simply impossible to export this data.

There was no API to retrieve a dump of your Interactions dataset from Personalize.

I believe the Lambda + Firehose workaround was the correct approach for this.

But how do you test whether PutEvents works?

To make sure that interactions are actually added through PutEvents, you can use the Filters feature: https://docs.aws.amazon.com/personalize/latest/dg/filter-expressions.html

Create a new filter with an expression similar to this one:

EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("your_event_type_name")

This excludes from recommendations any item that the user has previously interacted with through that event type.
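
If you prefer to create the filter from code instead of the console, a minimal boto3 sketch could look like this. The helper names are hypothetical, and the code assumes boto3 is installed and that you know your dataset group ARN.

```python
def build_exclude_filter_expression(event_types):
    """Return an EXCLUDE expression for the given event type names."""
    if not event_types:
        raise ValueError("at least one event type is required")
    for event_type in event_types:
        if '"' in event_type:
            # A double quote would break the quoted list below.
            raise ValueError("event type must not contain a double quote")
    quoted = ", ".join('"{}"'.format(t) for t in event_types)
    return "EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ({})".format(quoted)


def create_exclude_filter(name, dataset_group_arn, event_types):
    """Create the filter in Personalize and return its ARN (needs AWS access)."""
    import boto3  # imported lazily so the expression builder works without boto3

    client = boto3.client("personalize")
    response = client.create_filter(
        name=name,
        datasetGroupArn=dataset_group_arn,
        filterExpression=build_exclude_filter_expression(event_types),
    )
    return response["filterArn"]
```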

Then you can test whether events added through the PutEvents API are recognized correctly:

  1. Create the filter expression as described above.
  2. Create any campaign for simple recommendations (User-Personalization recipe).
  3. Attach the filter to the campaign.
  4. Get recommendations for any user and save them somewhere.
  5. Call the PutEvents API with the user ID from step 4 and any one of the items recommended in step 4.
  6. Get recommendations again for the same user as in step 4.

If the item that you added with the PutEvents call is no longer recommended, you have proof that events sent through PutEvents are correctly added to the Interactions dataset.
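
The check in steps 4 to 6 can be sketched in Python roughly as follows. Treat it as an illustration under assumptions: check_put_events and make_boto3_callables are names I made up, the wait_seconds delay is a guess in case new events are not visible immediately, and the boto3 part assumes a deployed campaign, a filter ARN, and an event tracker ID.

```python
import time
import uuid
from datetime import datetime, timezone


def check_put_events(get_recommended_ids, send_event, user_id, wait_seconds=0):
    """Return True if an item sent via PutEvents disappears from recommendations."""
    before = get_recommended_ids(user_id)          # step 4
    if not before:
        raise RuntimeError("no recommendations returned, nothing to test with")
    probe_item = before[0]
    send_event(user_id, probe_item)                # step 5
    time.sleep(wait_seconds)                       # assumed propagation delay
    after = get_recommended_ids(user_id)           # step 6
    return probe_item not in after


def make_boto3_callables(campaign_arn, filter_arn, tracking_id, event_type):
    """Wire the check to real AWS clients (needs boto3 and AWS access)."""
    import boto3  # imported lazily so check_put_events can be tested with fakes

    runtime = boto3.client("personalize-runtime")
    events = boto3.client("personalize-events")
    session_id = str(uuid.uuid4())

    def get_recommended_ids(user_id):
        response = runtime.get_recommendations(
            campaignArn=campaign_arn, userId=user_id, filterArn=filter_arn
        )
        return [item["itemId"] for item in response["itemList"]]

    def send_event(user_id, item_id):
        events.put_events(
            trackingId=tracking_id,
            userId=user_id,
            sessionId=session_id,
            eventList=[{
                "eventType": event_type,  # must match the filter's event type
                "itemId": item_id,
                "sentAt": datetime.now(timezone.utc),
            }],
        )

    return get_recommended_ids, send_event
```

Keeping the AWS calls behind two plain callables means you can run the same check against fakes in a unit test and against the real campaign in a smoke test.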


What if the PutEvents call doesn't affect recommendations in that case?

Then you are most likely providing incorrect values in the API call. Personalize might return a 200 response even if the event you provided was invalid.

To fix that, try the following:

  1. Make sure the date is in the correct format. Personalize might ignore events with very old timestamps if there are many newer events (this can be configured in the Solution config).
  2. Check that you are not passing strange values like "null" or "undefined" for sessionId, userId, or trackingId in the PutEvents params. This might cause Personalize to ignore the event (https://github.com/aws/aws-sdk-js/issues/3371).
  3. Make sure you are passing the correct eventType value (it should match the eventType in the Solution and the Filter).
  4. If it still doesn't work, raise a support ticket with AWS and include an example of your PutEvents API call params.
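
Points 1 to 3 can be checked on the client side before the call is made. Below is a rough sketch of such a check. The function name and the list of suspicious values are my own choices, and requiring sentAt to be a datetime is just this sketch's convention, not a rule taken from the Personalize docs.

```python
from datetime import datetime

# Strings that usually mean a missing value was stringified by mistake.
SUSPICIOUS_VALUES = {"", "null", "undefined", "none", "nan"}


def find_put_events_problems(params, expected_event_types):
    """Return a list of human-readable problems found in PutEvents params."""
    problems = []

    for key in ("trackingId", "userId", "sessionId"):
        value = params.get(key)
        if not isinstance(value, str) or value.strip().lower() in SUSPICIOUS_VALUES:
            problems.append("{} looks invalid: {!r}".format(key, value))

    events = params.get("eventList") or []
    if not events:
        problems.append("eventList is empty")

    for index, event in enumerate(events):
        if event.get("eventType") not in expected_event_types:
            problems.append(
                "event {}: unexpected eventType {!r}".format(index, event.get("eventType"))
            )
        if not isinstance(event.get("sentAt"), datetime):
            problems.append("event {}: sentAt is not a datetime".format(index))

    return problems
```

An empty list means none of these particular mistakes were found. It does not prove that Personalize accepted the event, so the filter test above is still worth running.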

Are there any simpler solutions?

Maybe there are, but in our project we use this approach, and it also tests whether the filtering feature is working correctly. You will probably use filtering in the future anyway, so I believe it's a good enough method.

PatrykMilewski