I've come across a very annoying bug in Google Dataprep.

According to this page: https://cloud.google.com/dataprep/docs/html/Window-Transform_57344658, it should be possible to reverse the order of sorting by adding a dash in front of the column name.

However, although the preview shows that the data is correctly sorted, the output will always be sorted in ascending order.

I have tested it in various ways and I'm sure it is a bug in the system.

The formula I'm trying to use is a PREV(column_name, 1) window function that is not grouped, but is ordered by column_name and -date.

I then want to deduplicate the dataset based on the resulting column, using a condition like: If(window==column_name)

Hopefully it will be fixed soon, but in the meantime I need a workaround. Does anyone know an elegant solution?

dsesto
B Delfos

1 Answer

I have been able to reproduce the issue you reported when using a Window function sorted in descending order by prefixing the order column with a dash (-OrderColumn).

I see that you have already created a Dataprep issue in the Public Issue Tracker, and that another member of the GCP Support team was also able to reproduce this behavior and reported it to the Dataprep team. Cloud Dataprep is a third-party product developed and managed by a company called Trifacta, so this issue requires support from their product team. The Trifacta Dataprep team is working on the reported issue, and their first response is that, in the meantime, a quick workaround is to sort in a separate step before applying the window function.
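To make the intended logic concrete, here is a minimal sketch in plain Python (not Dataprep recipe syntax) of the "sort in a separate step, then window" approach. The sample rows and the `keep_latest_per_key` helper are hypothetical. It also assumes that a row whose previous value equals its own column_name counts as a duplicate and is dropped, which is my reading of the If(window==column_name) condition in the question.

```python
from operator import itemgetter

# Hypothetical sample rows; the field names mirror those used in the question.
rows = [
    {"column_name": "a", "date": "2019-01-01"},
    {"column_name": "b", "date": "2019-01-03"},
    {"column_name": "a", "date": "2019-01-05"},
    {"column_name": "b", "date": "2019-01-02"},
]


def keep_latest_per_key(rows):
    # Step 1: sort as its own step (column_name ascending, date descending).
    # Python's sort is stable, so the secondary key is sorted first.
    ordered = sorted(rows, key=itemgetter("date"), reverse=True)
    ordered = sorted(ordered, key=itemgetter("column_name"))

    # Step 2: emulate PREV(column_name, 1) over the already sorted rows and
    # drop every row whose previous value equals its own column_name.
    result = []
    prev = None
    for row in ordered:
        if row["column_name"] != prev:
            result.append(row)
        prev = row["column_name"]
    return result


deduplicated = keep_latest_per_key(rows)
```

Because the descending sort is applied before the window step, the first row seen for each column_name is the one with the most recent date, so the window itself no longer needs the dash-prefixed order column.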

As a final note, maybe next time you can share screenshots in your question in order to provide an easier understanding of your issue.

dsesto