Apache Gobblin is a distributed data integration framework. It simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
Questions tagged [gobblin]
44 questions
0
votes
1 answer
Trying to use Apache Gobblin to read Salesforce data using SOAP API(s) instead of REST API
I am working on an existing tool (heavily based on Apache Gobblin) to import data from customers' Salesforce tables into local MySQL databases (one database per customer).
The tool works (as is) for customers who have enabled the Salesforce REST…

Marc K.
0
votes
1 answer
Gobblin Error:- java.io.IOException: java.lang.ClassNotFoundException:
I'm new to Gobblin and am trying to ingest data from Kafka to HDFS. I was able to run the Kafka-HDFS ingestion example successfully, but now I need to add a time-based writer partition option to my job. I did go through the…

GihanDB
0
votes
1 answer
How to debug Gobblin standalone?
How can I run Gobblin in debug mode from the IntelliJ IDE using the bin/gobblin-standalone.sh command?
The getting started tutorial shows how to run a sample Gobblin job, but it is unclear how to debug it.

alex
0
votes
1 answer
Error: Could not find or load main class org.apache.gobblin.runtime.cli.GobblinCli
I am new to Gobblin. I built Gobblin from the master branch of the incubator-gobblin GitHub repository. Now I am trying the Wikipedia example from the getting started guide but am getting the following error.
WARN: HADOOP_HOME is not defined. Gobblin Hadoop libs will be used in…

Chhaya Vankhede
0
votes
1 answer
Could not determine the dependencies of task ':gobblin-distribution:buildDistributionTar'
I am new to Gobblin. I have downloaded incubator-gobblin-gobblin_0.11.0. While installing Gobblin on Windows 10 by following the instructions given here, on executing ./gradlew :gobblin-distribution:buildDistributionTar
I am getting below…

Chhaya Vankhede
0
votes
1 answer
Build Failed while installing gobblin
I am very new to Gobblin.
I am getting a build failure while installing Gobblin.
The following is the terminal output:
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to…

Phani Dutt
0
votes
1 answer
Issue with a custom systemd service when starting Apache Gobblin
Running /opt/gobblin/bin/gobblin-standalone.sh start directly, everything works and the log output is fine.
Running it through a systemd service does not work. Nothing is written to the logs.
[vagrant@localhost ~]$ sudo systemctl start…

Bruno Wego
0
votes
1 answer
How to partition Gobblin output to 30 min partitions?
We are planning to migrate from Camus to Gobblin. In Camus we were using below mentioned…

mukul
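
For readers wondering what "30 min partitions" means in practice, here is a minimal sketch of the underlying arithmetic: flooring a record timestamp to the start of its half-hour bucket and turning that into a directory-style path. It is plain Java and does not use Gobblin's own partitioner classes; the class name `HalfHourPartitioner`, the method `partitionPath`, the UTC time zone, and the `yyyy/MM/dd/HH/mm` layout are all assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Gobblin: maps a timestamp to a 30-minute bucket path.
public class HalfHourPartitioner {
    // Assumed layout and zone; adjust both to match the existing directory scheme.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy/MM/dd/HH/mm").withZone(ZoneOffset.UTC);

    private static final long BUCKET_MILLIS = 30L * 60L * 1000L;

    // Floors the timestamp to the start of its 30-minute bucket and formats it.
    static String partitionPath(long epochMillis) {
        long floored = Math.floorDiv(epochMillis, BUCKET_MILLIS) * BUCKET_MILLIS;
        return FMT.format(Instant.ofEpochMilli(floored));
    }

    public static void main(String[] args) {
        // A record stamped 45 minutes after the epoch falls in the 00:30 bucket.
        System.out.println(partitionPath(45L * 60L * 1000L));
    }
}
```

Flooring on epoch milliseconds lines up with wall-clock half hours only when the zone offset is a multiple of 30 minutes, which holds for UTC as assumed here.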
0
votes
1 answer
Issue in running a Gobblin Job
I am new to Gobblin and I am trying to run a simple job in standalone mode, but I am getting the following error:
Task failed due to "com.google.gson.JsonSyntaxException:
com.google.gson.stream.MalformedJsonException: Expected name at line 1
column…

sahil gaur
0
votes
1 answer
NoSuchMethodError when trying to run Gobblin on Dataproc
I'm trying to run Gobblin on Google Dataproc but I'm getting this NoSuchMethodError and can't figure out how to solve it.
Waiting for job output...
...
Exception in thread "main" java.lang.reflect.InvocationTargetException
at…

Henrique G. Abreu
0
votes
1 answer
Gobblin Map-reduce job running successfully on EMR but no output in s3
I am running Gobblin to move data from Kafka to S3 using a 3-node EMR cluster. I am running on Hadoop 2.6.0 and I also built Gobblin against 2.6.0.
It seems like the MapReduce job runs successfully. On my HDFS I see the metrics and working directory. metrics…

user2942227
0
votes
1 answer
How do I execute Gobblin with Docker
I want to create 2 Docker containers: one with Hadoop 2.7.2 and the other with the latest Gobblin release. But I need to launch the job to run on Hadoop
"$HADOOP_BIN_DIR/hadoop jar \" from the Gobblin container. And I always received the same…
-1
votes
1 answer
Camus or Gobblin: which is preferable?
Can you please help me set up Camus or Gobblin to store messages from Kafka in HDFS? A working example would be great.
Gobblin is still in the incubation phase and Camus has been phased out, so which one is preferable to use?
I downloaded Gobblin and…

VIJ
-3
votes
1 answer
Spark as Data Ingestion/Onboarding to HDFS
While exploring various tools like [NiFi, Gobblin etc.], I have observed that Databricks is now promoting the use of Spark for data ingestion/on-boarding.
We have a Spark [Scala] based application running on YARN. So far we are working on a Hadoop and…

Chauhan B