Has anyone had problems connecting to Cassandra from a Flink job when the connection is made directly, outside of Flink's DataStream operators?

    Session session = clusterBuilder.getCluster().connect();
    ResultSet resultSet = session.execute(resultStatement.getQuery());

I am not seeing this locally, only in a dev environment; locally the connection works fine. Even with the same clusterBuilder settings, when I put this piece of code inside a DataStream processElement, the connection is established in dev.

I am getting a ProgramInvocation error in main, and I cannot see the whole error because of a limitation in Flink 1.7: the dashboard does not show the full exception trace. The job is not getting submitted.

Does anybody have a clue about this, or has anyone faced something similar?

Samik

1 Answer

The most probable cause (I'm not a Flink expert, but I have seen this problem with Spark) is that the Session object is not Serializable and so cannot be sent to the executors/workers.

To work around this, there is usually an API with explicit open/close calls that lets you initialize non-serializable classes on the worker. As far as I can see, Flink has a notion of Asynchronous I/O for External Data Access, which could potentially be used for accessing Cassandra.
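To illustrate the serialization point, here is a minimal sketch in plain Java, with no Flink or Cassandra dependencies. Everything in it is a hypothetical stand-in: `FakeSession` plays the role of the driver's Session, and `LazyFunction.open()`/`close()` mimic the kind of lifecycle hooks a framework calls on the worker (in Flink, I believe the rich function classes expose `open` and `close` for this purpose). It only shows why a function that holds a live session fails to serialize, while one that keeps the field `transient` and creates it in `open()` survives the trip.

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.NotSerializableException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    public class OpenCloseSketch {

        // Hypothetical stand-in for a driver session: deliberately NOT Serializable.
        static class FakeSession {
            private final String contactPoint;
            FakeSession(String contactPoint) { this.contactPoint = contactPoint; }
            String execute(String query) { return contactPoint + ":" + query; }
            void close() { /* a real session would release sockets here */ }
        }

        // Problematic variant: the session is created eagerly and becomes part of
        // the object state that has to be serialized and shipped.
        static class EagerFunction implements Serializable {
            private final FakeSession session;
            EagerFunction(String contactPoint) { this.session = new FakeSession(contactPoint); }
        }

        // Workaround: ship only the serializable settings, create the session in open().
        static class LazyFunction implements Serializable {
            private final String contactPoint;
            private transient FakeSession session; // skipped by Java serialization

            LazyFunction(String contactPoint) { this.contactPoint = contactPoint; }

            void open() { session = new FakeSession(contactPoint); }

            String map(String query) { return session.execute(query); }

            void close() {
                if (session != null) {
                    session.close();
                    session = null;
                }
            }
        }

        // Serializes and deserializes a value, imitating shipping it to a worker.
        static <T extends Serializable> T roundTrip(T value)
                throws IOException, ClassNotFoundException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        }

        public static void main(String[] args) throws Exception {
            try {
                roundTrip(new EagerFunction("127.0.0.1"));
                System.out.println("unexpected: eager function was serialized");
            } catch (NotSerializableException e) {
                System.out.println("eager function could not be shipped: " + e.getMessage());
            }

            LazyFunction shipped = roundTrip(new LazyFunction("127.0.0.1"));
            shipped.open();   // the session is created only after "arriving" on the worker
            System.out.println(shipped.map("SELECT now()"));
            shipped.close();
        }
    }

In a real job the same idea would mean keeping only the ClusterBuilder (or plain connection settings) as a field and calling `connect()` inside the function's open hook, not in the constructor or in main. This is an assumption on my side about how it maps to your setup, not something I have tested on Flink 1.7.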

Alex Ott