I am trying to build a pipeline in StreamSets that invokes a REST API with just the file name whenever a file arrives in a directory. I don't want StreamSets to read the file or do any processing on it.

But whatever I try, the pipeline tries to send the whole file to the destination.

The file is in a special SEGD format, which is a kind of binary file.

StreamSets tries to read the file and fails.

My requirement is to invoke a REST API as soon as a file comes to a folder.

metadaddy
  • 4,234
  • 1
  • 22
  • 46

1 Answer

As you've discovered, by default, StreamSets Data Collector's Directory origin parses the contents of the file as JSON, delimited data, etc. If you use the Whole File data format, though, the origin instead reads only the file metadata and passes a special record along the pipeline, with the following fields:

[Screenshot: fields of the whole file record, including /fileInfo/filename]

You can then use the HTTP Client processor or destination, referencing the filename with the expression ${record:value('/fileInfo/filename')}.
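To make the idea concrete outside of Data Collector, here is a minimal Python sketch of what the stage effectively does: it takes the filename out of the record and puts it in a request, without ever opening the file. This is not StreamSets code. The endpoint URL, the `build_request` helper, the `filename` key in the request body and the sample record values are all assumptions made for illustration; only the `/fileInfo/filename` path comes from the answer above.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the REST API you actually need to call.
API_URL = "http://localhost:8080/files"


def build_request(record: dict, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying only the file name."""
    # Python equivalent of ${record:value('/fileInfo/filename')}
    filename = record["fileInfo"]["filename"]
    body = json.dumps({"filename": filename}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Illustrative stand-in for a whole file record; the values are made up.
record = {"fileInfo": {"filename": "survey_001.segd", "size": 1048576}}

req = build_request(record)
# The file contents are never read; only the metadata reaches the request.
# To actually send it: urllib.request.urlopen(req)
```

The request body here is an assumed shape; what your REST API expects may differ.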

metadaddy
  • Hi metadaddy, thanks for the help. I am getting the file name now, but I have a different problem: if I use HTTP Client as a processor, I can create a request JSON that contains the file name, but if I use HTTP Client as a destination, there is no config where I can set the JSON to send to the given resource URL. – Shashank Rawat Dec 12 '19 at 06:52
  • You should really ask a new question, but... Set the data format in the HTTP Client destination to JSON and it sends the JSON encoding of the record to the URL. So, you need to make the content of the record match what your REST API is expecting. You can use processors such as Field Renamer and Field Remover to do this. – metadaddy Dec 12 '19 at 16:19
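
The reshaping described in the last comment can be sketched in plain Python. This only models the effect of renaming and removing fields on a record represented as a dict; it is not how Data Collector implements those processors. The helper names, the target field `name` and the sample values are hypothetical, chosen on the assumption that the REST API expects a body like `{"name": "<file name>"}`.

```python
import json


def rename_field(record: dict, source_path: list, target: str) -> dict:
    """Copy the value at a nested path to a new top-level field."""
    value = record
    for key in source_path:
        value = value[key]
    reshaped = dict(record)
    reshaped[target] = value
    return reshaped


def remove_fields(record: dict, names: list) -> dict:
    """Return a copy of the record without the named top-level fields."""
    return {k: v for k, v in record.items() if k not in names}


# Illustrative stand-in for a whole file record; the values are made up.
record = {"fileInfo": {"filename": "survey_001.segd", "size": 1048576}}

# Step 1: expose the file name under the (assumed) field the API expects.
renamed = rename_field(record, ["fileInfo", "filename"], "name")
# Step 2: drop everything the API does not expect.
final_record = remove_fields(renamed, ["fileInfo"])

# With the JSON data format, the destination sends the encoded record.
payload = json.dumps(final_record)
```

After both steps the record holds only the one field, so its JSON encoding is the whole request body.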