I'm trying to read the timestamp of a Pub/Sub message in Apache Beam.

p.apply("Read PubSub messages", PubsubIO.readMessagesWithAttributes()
    .withIdAttribute("msg_id")
    .withTimestampAttribute("timestamp")
    .fromSubscription(options.getPubsubSubscription()))

Unfortunately, I get the following error, which really surprises me because I thought every message had a default timestamp.

An exception occured while executing the Java class. 
PubSub message is missing a value for timestamp attribute timestamp

Why is my message not timestamped? Is it because I published it via the Pub/Sub UI?

vdolez

1 Answer

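If you omit .withTimestampAttribute(), every Pub/Sub message is still assigned a default timestamp: the time at which it was published. Adding .withTimestampAttribute("timestamp") tells Beam to read the event time from the timestamp attribute of each message instead, so you have to supply that attribute yourself on every message. A message published without it has no value to read, hence the error. For example, when publishing from the UI, add an attribute with the key timestamp and the event time as its value.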
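
To make the expected attribute value concrete: as far as I know from the PubsubIO Javadoc, the attribute may hold either the number of milliseconds since the Unix epoch or an RFC 3339 string. The sketch below only mirrors that rule with the standard library. TimestampAttribute and parse are hypothetical names of my own, not Beam classes, and Beam's actual parsing may differ in details.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;

/** Hypothetical helper (not part of Beam) mirroring the two documented value formats. */
final class TimestampAttribute {
  private TimestampAttribute() {}

  /** Parses an attribute value given as epoch milliseconds or as an RFC 3339 string. */
  static Instant parse(String value) {
    if (value == null || value.isEmpty()) {
      // Comparable to the situation in the question: the attribute is absent.
      throw new IllegalArgumentException("message is missing a value for the timestamp attribute");
    }
    try {
      // First interpretation: milliseconds since the Unix epoch.
      return Instant.ofEpochMilli(Long.parseLong(value));
    } catch (NumberFormatException notMillis) {
      // Not a plain number, so try the string format next.
    }
    try {
      // Second interpretation: RFC 3339 date-time, e.g. 2019-05-14T13:14:00Z.
      return OffsetDateTime.parse(value).toInstant();
    } catch (DateTimeParseException e) {
      throw new IllegalArgumentException("unparseable timestamp attribute: " + value, e);
    }
  }

  public static void main(String[] args) {
    // Two spellings of the same instant.
    System.out.println(parse("1557839640000"));
    System.out.println(parse("2019-05-14T13:14:00Z"));
  }
}
```

Either spelling can be pasted as the value of the timestamp attribute when publishing a test message. An empty or missing value is rejected, which is the case the error in the question describes.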

Windowing will then be relative to these timestamps. If you need the timestamp inside the pipeline, you can read it with ProcessContext.timestamp() in a DoFn.

Guillem Xercavins
  • Thanks for the explanation! It seems you're right. I was a bit confused because I thought it worked the other way around. Also, there's no method (yet) to retrieve the messageId from a PubsubMessage unless it is explicitly set as an attribute: https://issues.apache.org/jira/browse/BEAM-3489 :( – vdolez May 14 '19 at 13:14