How to update/delete a record in a Hudi table on AWS?

Question

I have a requirement to update or delete a record in a Hudi table. One way is to do that with PySpark/Scala by following the steps in the guide below:

https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hudi-work-with-dataset.html

Is it also possible to do that with aws-cli?

Which would be the better way to run this: calling it through Lambda or through Glue?

score 1 · Answer 1 · answered Apr 09 '22 at 15:56

You can use aws-cli to submit Spark jobs as EMR steps, or use notebooks for ad hoc analysis. Submitting Spark jobs to EMR is the preferred approach.
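As a rough illustration of what such a Spark job might contain, here is a minimal PySpark-style sketch. The `hoodie.*` keys are standard Hudi datasource write options, but the helper names (`hudi_write_options`, `write_changes`), the table name, the field names, and the S3 path are all assumptions made up for this example, not something from the question or the linked guide.

```python
from typing import Dict


def hudi_write_options(
    table_name: str,
    record_key: str,
    precombine_field: str,
    partition_field: str,
    operation: str = "upsert",
) -> Dict[str, str]:
    """Build Hudi write options for an update ("upsert") or a delete.

    All argument values are supplied by the caller; nothing here is
    specific to any real table.
    """
    if operation not in ("upsert", "delete"):
        raise ValueError(f"unsupported operation: {operation}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": operation,
    }


def write_changes(df, base_path: str, options: Dict[str, str]) -> None:
    """Apply the rows of a Spark DataFrame to an existing Hudi table.

    With operation "upsert", rows whose record key already exists are
    updated; with "delete", rows with matching keys are removed.
    Append mode is used so the existing table is not overwritten.
    """
    df.write.format("hudi").options(**options).mode("append").save(base_path)


# Hypothetical usage inside a Spark job (names and path are placeholders):
#   opts = hudi_write_options("my_table", "id", "updated_at", "region", "delete")
#   write_changes(rows_to_delete_df, "s3://my-bucket/hudi/my_table", opts)
```

A script along these lines could then be submitted from aws-cli, for example as an EMR step with `aws emr add-steps` or as a Glue job run with `aws glue start-job-run`. The cluster or job also needs the Hudi libraries available, which depends on your EMR or Glue version.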

gbharat

Thanks for the answer. I used a similar approach: I developed a Glue job with PySpark code and then used aws-cli commands to trigger the job from Git. – GOPI M Apr 21 '22 at 15:07