I've been struggling for a while to set up a Flink application that creates a DataStream<Tuple3<Integer, java.sql.Time, Double>> from a CSV file. The columns in this file (ID, dateTime and Result) are all read as String, but they should be converted to Integer, java.sql.Time and Double. I also want to create tumbling windows of one day each and average the values of the Result column within each window. The problem is that I don't know the exact syntax for this. My attempt is below. The last step currently uses sum(2), but I want to calculate the average per window instead. I did not find a built-in function for this in the documentation. Do I need to write one myself?


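// Attempt: parse each CSV line, key by ID, window per day, then sum the Result column
DataStream<Tuple3<Integer, java.sql.Time, Double>> dataStream = env
                .readTextFile(path)
                .map(new CsvLineParser())   // hypothetical MapFunction<String, Tuple3<Integer, java.sql.Time, Double>>
                .keyBy(0)
                .timeWindow(Time.days(1))
                .sum(2);                    // should become an average instead of a sum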
user7432713

1 Answer

You can use your own logic to read the CSV file, or use a library such as univocity_parsers. Then, instead of env.readTextFile, you can pass the parsed rows to env.fromCollection(list).

Here is the link to the library, in case you want it: https://www.univocity.com/pages/univocity_parsers_tutorial#using-annotations-to-map-your-java-beans

You can supply your own converter with the annotation @Convert(conversionClass = YourDateTimeConverter.class)
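
If you prefer your own parsing logic over a library, a minimal plain-Java sketch could look like the following. It converts each "ID,dateTime,Result" line into typed values. The class and method names (CsvRowParser, Row, parseLine, parseAll) are hypothetical, and the sketch assumes there is no header row and that the dateTime column uses the hh:mm:ss format accepted by java.sql.Time.valueOf; a column that also contains a date would need a different parser.

```java
import java.sql.Time;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: parses CSV lines of the form "ID,dateTime,Result". */
final class CsvRowParser {

    /** Typed counterpart of Tuple3<Integer, java.sql.Time, Double>. */
    static final class Row {
        final Integer id;
        final Time time;
        final Double result;

        Row(Integer id, Time time, Double result) {
            this.id = id;
            this.time = time;
            this.result = result;
        }
    }

    /** Converts one CSV line into a Row; throws if the line is malformed. */
    static Row parseLine(String line) {
        String[] cols = line.split(",", -1);
        if (cols.length != 3) {
            throw new IllegalArgumentException("Expected 3 columns: " + line);
        }
        Integer id = Integer.valueOf(cols[0].trim());
        // Assumption: the column holds a time of day formatted as hh:mm:ss.
        Time time = Time.valueOf(cols[1].trim());
        Double result = Double.valueOf(cols[2].trim());
        return new Row(id, time, result);
    }

    /** Parses all non-blank lines (assumes the file has no header row). */
    static List<Row> parseAll(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                rows.add(parseLine(line));
            }
        }
        return rows;
    }
}
```

In a Flink job, each Row could then be turned into a tuple with Tuple3.of(row.id, row.time, row.result) and the resulting list handed to env.fromCollection(list).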

For the average, refer to the following Flink documentation, which includes an example:

https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#aggregatefunction
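
To illustrate the idea behind that documentation section, here is a plain-Java sketch of the sum-and-count accumulator logic. It mirrors the four methods of Flink's AggregateFunction interface (createAccumulator, add, getResult, merge) but does not depend on Flink, so the names AverageSketch and Acc are hypothetical. Returning 0.0 for an empty accumulator is an assumption of this sketch, not documented Flink behavior.

```java
/** Hypothetical plain-Java sketch of an averaging accumulator. */
final class AverageSketch {

    /** Accumulator holding the running sum and the number of values seen. */
    static final class Acc {
        double sum;
        long count;
    }

    /** Starts a new, empty accumulator (one per window). */
    static Acc createAccumulator() {
        return new Acc();
    }

    /** Folds one Result value into the accumulator. */
    static Acc add(double value, Acc acc) {
        acc.sum += value;
        acc.count++;
        return acc;
    }

    /** Produces the average; 0.0 for an empty accumulator is an assumption. */
    static double getResult(Acc acc) {
        return acc.count == 0 ? 0.0 : acc.sum / acc.count;
    }

    /** Combines two partial accumulators into one. */
    static Acc merge(Acc a, Acc b) {
        a.sum += b.sum;
        a.count += b.count;
        return a;
    }
}
```

In an actual job, the same logic would go into a class implementing AggregateFunction<Tuple3<Integer, java.sql.Time, Double>, ACC, Double>, with add reading the third tuple field, and that class would be passed to .aggregate(...) on the windowed stream in place of sum(2).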

Shishal