Questions tagged [streamsets]

Use the streamsets tag for questions regarding the StreamSets DataOps Platform, which includes Data Collector, Transformer, and Control Hub.

StreamSets DataOps Platform empowers your whole team, from highly skilled data engineers to visual ETL developers, to do powerful data engineering work. Only StreamSets makes it both simple to get started building pipelines quickly with intent-driven design and easy to extend to meet complex enterprise needs.

Useful Resources:

Initial Release: June 27th, 2014 - StreamSets Data Collector – the First Four Years

183 questions
1 vote, 1 answer

Can't add external JS script to JavaScript Evaluator in StreamSets

I am using external JavaScript inside the JavaScript Evaluator in StreamSets, but when I try to load the external code I get the following error. How should I resolve this? Thanks. ERROR SafeScheduledExecutorService - Uncaught throwable from …
Tamizharasan
  • 293
  • 1
  • 5
  • 18
1 vote, 1 answer

Getting error with Oracle 11g CDC in StreamSets

com.streamsets.datacollector.util.PipelineException: PREVIEW_0003 - Encountered error while previewing : com.streamsets.pipeline.api.StageException: JDBC_87 - Interrupted while waiting to read data at…
1 vote, 1 answer

StreamSets CDC origin: MySQL Binary Log unable to get driver instance

I am trying to set up the MySQL Binary Log origin in StreamSets, but it complains that it cannot load the driver instance. my.cnf: [mysqld] server-id = 223344 log_bin = mysql-bin binlog_format = row binlog_row_image = full …
bsd
  • 1,207
  • 4
  • 15
  • 28
1 vote, 1 answer

Credentials in StreamSets

In my current project I'm working with StreamSets and I would like to use HashiCorp Vault as my credentials store. However, I'm not able to use the credential:get() function wherever I want to, e.g. in the Shared Access Key of the Azure IoT Hub Producer block. I…
mrudzis
  • 13
  • 4
1 vote, 1 answer

Can StreamSets Data Collector CDC read from and write to multiple tables?

I have a MSSQL database whose structure is replicated over a Postgres database. I've enabled CDC in MSSQL and I've used the SQL Server CDC Client in StreamSets Data Collector to listen for changes in that db's tables. But I can't find a way to write…
bsd
  • 1,207
  • 4
  • 15
  • 28
1 vote, 1 answer

StreamSets JDBC Query Consumer - Undefined column. columnName=0

I need to convert records in a Phoenix table into a JSON file using StreamSets. For initial POC purposes, I am trying to do a simple fetch from Phoenix into a file. The origin is a JDBC Query Consumer that points to Phoenix and, for now, it is…
pallupz
  • 793
  • 3
  • 9
  • 25
1 vote, 1 answer

StreamSets Design Of Ingestion

Dear all, I am considering options for how to use StreamSets properly in a given generic Data Hub architecture: I have several data types (CSV, TSV, JSON, binary from IoT) that need to be captured by CDC and saved into a Kafka topic in as-is format…
Cengiz
  • 303
  • 2
  • 9
1 vote, 1 answer

Can StreamSets Data Collector automatically create tables in the destination database?

Is there a way for StreamSets Data Collector to automatically create tables in the destination database based on the origin database in the case of CDC? I am reading data from a source, MSSQL, and writing to a destination, PostgreSQL. If I am…
bsd
  • 1,207
  • 4
  • 15
  • 28
1 vote, 0 answers

Writing data to MySQL from Kafka consumer using StreamSets

I am trying to write data from a Kafka consumer to MySQL using JDBC. I am able to read data from databases like MySQL and PostgreSQL using JDBC, but I am not able to write the data to a database. However, I am able to write the data to a text file. What am I…
1 vote, 2 answers

StreamSets converting text to JSON

I am trying to ingest text data from a local directory into HDFS. Before ingesting, I need to convert the text into valid JSON. For that, I am using the JavaScript Evaluator processor. In the JavaScript Evaluator I am unable to read any record. Here is my sample…
user6325753
  • 585
  • 4
  • 10
  • 33
1 vote, 2 answers

How to request Authentication Token from StreamSets Control Hub API?

I am trying to build a Java client to POST to a REST API. However, while doing so I am getting the error "User not authenticated". When going through the documentation for the API service, I found I have to obtain an auth token before I make the call to…
Vijay Shekhawat
  • 149
  • 2
  • 13
1 vote, 0 answers

Connecting to Google Analytics with StreamSets

I am trying to connect StreamSets to Google Analytics. However, I am having trouble setting it up. With a regular curl request I would do the following: Step 1) Go to the following link to get the authorization…
Adriaan
  • 13
  • 1
  • 4
1 vote, 0 answers

Jersey Java REST client giving Error 500 "BAD Request" for POST request, while Postman is able to POST to the same RESTful API

I am trying to post form data through a Java Jersey REST client, but I receive the response code 500 and a corresponding exception: java.lang.RuntimeException: Failed with HTTP error code : 500 The same request from Postman (Chrome Extension) works…
Vijay Shekhawat
  • 149
  • 2
  • 13
1 vote, 2 answers

Kafka: Cannot retrieve metadata for topic when changing active controller

I have a Cloudera cluster with a clustered Kafka service. I have two instances of Kafka controllers, let's say C1 and C2. When C1 is the active controller, everything seems to work fine. When, for some reason, C2 becomes the active controller, some of…
dhalfageme
  • 1,444
  • 4
  • 21
  • 42
1 vote, 1 answer

How can I differentiate the data coming from multiple HTTP Client origins in StreamSets

I have 6 pipelines, each with an HTTP Client origin connected to an SDC RPC destination. My plan is to make another pipeline with an SDC RPC origin and Hive tables as the destination. My question is: after connecting to the SDC RPC origin, how can I…
roh
  • 1,033
  • 1
  • 11
  • 19