Questions tagged [streamsets]

Use the streamsets tag for questions regarding the StreamSets DataOps Platform, which includes Data Collector, Transformer, and Control Hub.

StreamSets DataOps Platform empowers your whole team, from highly skilled data engineers to visual ETL developers, to do powerful data engineering work. Only StreamSets makes it both simple to get started building pipelines quickly with intent-driven design and easy to extend to meet complex enterprise needs.

Useful Resources:

Initial Release: June 27th, 2014 - StreamSets Data Collector – the First Four Years

183 questions
0
votes
1 answer

Framing JSON object in groovy - Streamsets

I am pretty new to StreamSets and I am finding it a little confusing and challenging to frame a JSON object inside my Groovy Evaluator object. I need to frame the below JSON: { "filter": "(equals(type,'my/specific/Type') and…
Mike
  • 721
  • 1
  • 17
  • 44
0
votes
1 answer

Groovy script to stop StreamSets pipeline based on http response

I want to stop a StreamSets pipeline by creating a custom event using the Groovy Evaluator. I have an HTTP origin stage which gives a JSON response. I need a Groovy script that reads the JSON response and creates a custom event when the final record is…
AJ20
  • 1
0
votes
1 answer

How to enable custom stage library for streamsets data collector?

I have a custom stage library that I want to use in a StreamSets Data Collector pipeline. I have followed all the steps given in the link below to install the custom stage library, but I am still not able to search for the stage library in Data Collector. Could you please help…
saw
  • 7
  • 3
0
votes
1 answer

Create New Parameter

I am trying to create a new parameter in StreamSets and I am getting this error. I am following the official tutorial exactly, but it seems this validation is new!
essamSALAH
  • 641
  • 2
  • 8
  • 15
0
votes
1 answer

StreamSets pipeline doesn't finish if the Kafka consumer has 0 messages in the topic

I have developed a StreamSets pipeline that uses a Kafka consumer as its origin. The pipeline works fine if the Kafka consumer has messages in it. But when the Kafka consumer has 0 messages in it, the pipeline went into a loop, kept running continuously, and didn't…
CMK
  • 40
  • 2
  • 10
0
votes
1 answer

Has Streamsets / datacollector-oss gone proprietary after SDC version 4.x?

I am unable to find any link to download the open-source version of StreamSets SDC. It looks like they will no longer release an open-source version of Streamsets/datacollector-oss. The last version of datacollector-oss was released on Apr 27, 2021 on GitHub and there are no…
ebeb
  • 429
  • 3
  • 12
0
votes
1 answer

Streamsets Using Direct Engine REST APIs. This Data Collector is not accessible

I am getting an error when accessing the StreamSets Connector. How do I change back to "Using Websocket Tunnelling"? I've been using StreamSets version 4.1 for a short time. The sales tech switched a configuration to use the "Direct Engine REST…
user361446
  • 96
  • 4
0
votes
0 answers

StreamSets - How to call JDBC Query Executor after the JDBC Producer

I am building a pipeline where I am loading data from a Kafka Consumer stage to a JDBC Producer (SQL Server). After this insertion I want to run a stored procedure that applies some logic to the inserted data; however, I am not seeing an option to…
0
votes
1 answer

Read hive table (or HDFS data in parquet format) in Streamsets DC

Is it possible to read a Hive table (or HDFS data in Parquet format) in StreamSets Data Collector? I don't want to use Transformer for this.
0
votes
1 answer

How to install Streamsets on aws?

I am trying to install StreamSets on AWS. What is an easy way to set up StreamSets on AWS? Is there a preferred AMI I can use, or do I need to set it up from scratch?
0
votes
1 answer

BIT column from SQL source converted to Boolean by data collector

I'm pulling data from SQL Server, which has a BIT data type column with values 0 & 1. We are using StreamSets to load the data from SQL Server into Databricks. What happens is that 0 and 1 from the source server are getting converted to False and True while loading…
0
votes
1 answer

StreamSets Ignoring File in HDFS

I am getting the error below while running my pipeline. The pipeline takes files from HDFS, merges them, and stores the files on HDFS again. Error: WARN Ignoring file 'Filename and Location' in spool directory as is lesser than offset file. Category :…
Rangan Roy
  • 71
  • 1
  • 1
  • 5
0
votes
1 answer

Problem with importing external Java library in Groovy Evaluator of Streamsets DataCollector

I am trying to configure the Groovy Evaluator properly in StreamSets Data Collector. I am using the third-party Java library geohash-java (see https://github.com/kungfoo/geohash-java). I downloaded the library from…
Maxim
  • 11
  • 2
0
votes
1 answer

How to capture error on JDBC connection failure

I am trying to send an email notification whenever a JDBC connection fails in my StreamSets pipeline. I am able to send an email notification when a JDBC query has encountered an error, but not when the JDBC connection has encountered an error. Went…
Mike
  • 721
  • 1
  • 17
  • 44
0
votes
1 answer

Error - Cannot recognize package: go.opencensus.io while building datacollector-edge

I have created directory $GOPATH/src/github.com/streamsets . Then cloned https://github.com/streamsets/datacollector-edge.git into it. When I run './gradlew goClean dist publishToMavenLocal', after resolving many dependencies I get the below…