Questions tagged [streamsets]

Use the streamsets tag for questions regarding the StreamSets DataOps Platform, which includes Data Collector, Transformer, and Control Hub.

StreamSets DataOps Platform empowers your whole team, from highly skilled data engineers to visual ETL developers, to do powerful data engineering work. Only StreamSets makes it both simple to get started building pipelines quickly with intent-driven design and easy to extend to meet complex enterprise needs.

Useful Resources:

Initial Release: June 27th, 2014 - StreamSets Data Collector – the First Four Years

183 questions
1 vote, 1 answer

Scheduling the JDBC consumer job in StreamSets

I need to schedule the JDBC consumer job to run every morning at 5 am. As far as I know, I can make the job run at 5 am if I start it at 5 am and set the query interval to 24 hours. But I need to schedule the first instance to start at 5…
roh
1 vote, 2 answers

StreamSets Data Collector reading incorrect times from Kafka

It seems that StreamSets Data Collector reads incorrect datetime values. I tried reading a simple topic from Confluent: when I check the datetime value in milliseconds with Landoop Kafka Topics, it shows the correct datetime, but when I read it with Kafka…
1 vote, 1 answer

Can I map table columns in StreamSets using any of its APIs?

I need to map a 10000-column table in a StreamSets pipeline and send data to it from a CSV file. Mapping each column in the StreamSets application by specifying column names is a very big task for 10000 columns. So can anyone…
1 vote, 1 answer

Is it possible to create Kafka topics through StreamSets Data Collector (SDC)?

I am using the StreamSets Data Collector (SDC) web tool to create a pipeline that transfers data from my local system to Kafka through a Kafka producer. However, I first have to manually create the topic in which I want to store my data. Is it possible…
prachi
1 vote, 2 answers

Issues with JDBC Producer in StreamSets

I was trying to migrate data from a local directory to a MySQL database using a StreamSets pipeline. In preview, the data appears on the console, but it is not written to the MySQL database. The pipeline shows no error, yet the data is still not written to the database. If anyone worked on…
1 vote, 1 answer

StreamSets preview data from MySQL error

I was attempting to use StreamSets to query a MySQL database and publish the data into Elasticsearch (localhost). I downloaded StreamSets' tarball on my Mac and unzipped it into my home directory. Running StreamSets dc started up on my first try,…
1 vote, 0 answers

How to ingest data in real time from Oracle to Elasticsearch

I am using a loop in Scala to query an Oracle table every 10 seconds, since rows are continuously inserted into the table. I create a select request, then I create n JSON strings containing n lines from Oracle that I push into Elasticsearch. After that I…
a.moussa
1 vote, 0 answers

Field Type Conversion - Conversion Tab - StreamSets

I am trying to do a type conversion using the "by field name" conversion method. When I click on the field to convert, an automatic drop-down list shows the data content from the source (CSV file). The list shows both /0 [0] and /1 [1]. Why it…
1 vote, 2 answers

How to parse multi-line log file records using StreamSets?

I'm using StreamSets to parse a log file. The problem is that StreamSets parses line by line, while each of my log records spans multiple lines, something like this: 00:01:03.930 [WebContainer : 41] Outbound message: 00:01:03.930 [WebContainer : 41] Values to hide…
Mohamed Seif
1 vote, 1 answer

How do you add routing for the Elasticsearch destination in 2.5?

I'm using StreamSets (2.5.1.1) to pipe data to Elasticsearch (5.4.1). My index requires routing but I do not see how to add routing to the Elasticsearch destination in my pipeline. I thought I could just add a "routing" http param but it needs to…
eze
1 vote, 1 answer

Importing a Python module in Jython StreamSets - ImportError: No module named

I'm running StreamSets in Docker on CentOS. When I try to import a Python package in Jython, it returns the following error: SCRIPTING_05 - Script error while processing record: javax.script.ScriptException: ImportError: No module named pandas in…
Zahra
1 vote, 1 answer

StreamSets class not found exception

I built a pipeline in StreamSets to read data from MySQL and do change data capture. When I started the execution of the pipeline, I got the following error. Pipeline Status: START_ERROR: java.lang.NoClassDefFoundError:…
srini
1 vote, 0 answers

HTTP_21 - OAuth2 authentication failed. Please make sure the credentials are valid

I'm implementing the Google Analytics API with StreamSets to stream real-time stats. I have provided the private key and JWT token correctly but always get the same error: "HTTP_21 - OAuth2 authentication failed. Please make sure the credentials are…
Naveen
1 vote, 1 answer

StreamSets upgrade and LDAP authentication

I just upgraded StreamSets from 2.1.0.2 to 2.4.0.0 using Cloudera Manager (5.8.2). I can no longer log in to StreamSets; I get "login failed". The new version seems to be using a different LDAP lookup method. My logs before the update look as below: Mar…
1 vote, 1 answer

StreamSets DC and Crate exception. ERROR: SQLParseException: line 1:13: no viable alternative at input 'CHARACTERISTICS'

I am trying to connect to Crate as a StreamSets Data Collector pipeline origin (JDBC Consumer). However, I get this error: "JDBC_00 - Cannot connect to specified database: com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize…
gashey