
I am trying to create a data flow pipeline with Spring Cloud Data Flow using the shell (not the UI). The source is twitterstream and the sink is file. Here is what I did to configure the file sink:

dataflow:>stream create demo --definition "twitterstream --credentials | file --dir=/opt/datastream --mode=APPEND --filename=tweets.txt"

I can consume data from the Kafka topic, but I am unable to write to the sink location above; the file is not even created. There is no error in the log while deploying the stream. Eventually I will switch from the local file system to HDFS. Is anything missing?

PS: I also tried the default file sink (without any properties), which is supposed to create a default file under /tmp/xd/output; that didn't happen either.

Dheeraj

1 Answer


With the latest 1.0.0.RELEASE (GA), the following stream definition works.

dataflow:>stream create demo --definition "twitterstream | file --directory=/someFolder --mode=APPEND --name=demo.txt"

A couple of things to point out:

1) The twitterstream source does not support --credentials as an OOTB property. See here.

2) The file sink does not support --filename as an OOTB property; you'd have to use --name instead (see the session sketch below). See here.
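Putting both corrections together, a minimal end-to-end session might look like the following sketch. The target directory /someFolder and the stream name demo are placeholders, and the --deploy flag creates and deploys the stream in one step:

dataflow:>stream create demo --definition "twitterstream | file --directory=/someFolder --mode=APPEND --name=demo.txt" --deploy

Once tweets start flowing through the topic, you can verify the sink from an OS shell with `tail -f /someFolder/demo.txt`.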

Sabby Anandan
  • Can you please suggest a demo for the hdfs sink? I can't get it working with `dataflow:>stream create demo --definition "twitterstream | hdfs --directory=hdfs://ipaddress:9000/user/result --file-name=demo`. – Dheeraj Jul 18 '16 at 15:38
    You need to use the `--fs-uri=hdfs://ipaddress:9000` property to pass the `NameNode` URI, and `--directory` should point to the target folder; in your case, `--directory=/user/result`. Please follow the [docs](http://docs.spring.io/spring-cloud-stream-app-starters/docs/1.0.0.RC1/reference/htmlsingle/#spring-cloud-stream-modules-hdfs-sink). – Sabby Anandan Jul 18 '16 at 21:29
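Putting that comment's advice together, a hedged sketch of the HDFS variant would look like this (the NameNode host/port and the target folder are placeholders to substitute for your cluster):

dataflow:>stream create demo --definition "twitterstream | hdfs --fs-uri=hdfs://ipaddress:9000 --directory=/user/result --file-name=demo"

Once deployed, the hdfs sink should produce rolling files (e.g. demo-0.txt) under /user/result.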