Apache Gobblin is a distributed data integration framework. It simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
Questions tagged [gobblin]
44 questions
1 vote, 1 answer
Apache Gobblin build failed
I'm new to Gobblin. I am trying to build a distribution from the master branch of the project, and I get the error below while following the instructions.
FAILURE: Build failed with an exception.
* Where:
Script…

GihanDB
1 vote, 1 answer
Gobblin ERROR: Unable to convert field:derivedwatermarkcolumn for value:"abc" for record:
I am trying to ingest data from a MySQL table into HDFS, but it gives me the error below:
IST ERROR [TaskExecutor-0] org.apache.gobblin.runtime.Task [demo_user_1582873318919_0] 504 - Processing record incurs an unexpected…

Chhaya Vankhede
1 vote, 1 answer
Gobblin: java.lang.ClassNotFoundException: org.apache.gobblin.source.extractor.extract.jdbc.MysqlSource
I am trying MySQL-to-HDFS data ingestion using Gobblin. While running mysql-to-gobblin.pull using the steps below:
1) start hadoop:
sbin\start-all.cmd
2) start mysql service:
sudo service mysql start
3) set GOBBLIN_WORK_DIR:
export…

Chhaya Vankhede
1 vote, 1 answer
Error with KafkaHDFS example: java.lang.NoSuchMethodError
I am having trouble trying out the Kafka-HDFS data ingestion example.
I have tried both the 0.10.0 and 0.14.0 versions.
For the 0.10.0 version I use the ready distribution, and for the 0.14.0 version I made a build myself following the instructions in the…

user3061395
1 vote, 0 answers
Gobblin MapReduce convert from protobuf to Parquet
Trying to find an example of how to convert protobuf messages to parquet using Gobblin. Unable to find any.
Scenario:
- Kafka messages are in Protobuf
- Gobblin consumer: consumes Protobuf messages from Kafka and writes them as Parquet into HDFS
Gobblin…

Pritam
1 vote, 0 answers
Kafka to Kafka using Gobblin behind krb5 security
Everything works if I run a simple Kafka-to-Kafka job without Kerberos security. I need to do the same but behind Kerberos security. Take a look at my job code below:
job.name=Kafka2KafkaSimple
job.group=Kafka
job.description=This is a job that runs…

Bruno Wego
1 vote, 2 answers
Gobblin Kafka to HDFS gobblin-api-***.jar FileNotFoundException
I want to collect Kafka messages and store them in HDFS using Gobblin.
When I run gobblin-mapreduce.sh, the script throws an exception:
2017-10-19 11:49:18 CST ERROR [main] gobblin.runtime.AbstractJobLauncher 442 - Failed to launch and run job…

user1978965
1 vote, 2 answers
Gobblin QuickStart sample exception: ClassNotFoundException: org.apache.gobblin.example.wikipedia.WikipediaSource
I'm learning Gobblin by following the quickstart, subsection "Running Gobblin as a Daemon".
I follow the guide step by step:
create config dir and set the environment variable GOBBLIN_JOB_CONFIG_DIR, and put wikipedia.pull in it;
create work dir…

user1978965
1 vote, 0 answers
Gobblin grouping workunits for Kafka source
In the https://gobblin.readthedocs.io/en/latest/case-studies/Kafka-HDFS-Ingestion/#grouping-workunits section of the Gobblin documentation, we can read about single-level packing with the following description:
The single-level packer uses a worst-fit-decreasing…

Purple
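For readers unfamiliar with the term in the excerpt above, here is a minimal, generic sketch of worst-fit-decreasing packing. It is not Gobblin's implementation: the function name `worst_fit_decreasing`, the fixed bin count, and the use of plain numbers as workunit size estimates are all assumptions made for illustration. Items are sorted by size in decreasing order, and each one is placed in the currently least-loaded bin.

```python
import heapq


def worst_fit_decreasing(sizes, num_bins):
    """Generic worst-fit-decreasing packing (illustrative, not Gobblin code).

    sizes: estimated sizes of the items to pack (hypothetical workunit sizes).
    num_bins: fixed number of bins (hypothetical, e.g. number of mappers).
    Returns a list of bins, each a list of the sizes assigned to it.
    """
    # Min-heap of (current_load, bin_index): the top is the emptiest bin.
    heap = [(0, i) for i in range(num_bins)]
    heapq.heapify(heap)
    bins = [[] for _ in range(num_bins)]

    # Largest items first; each goes to the bin with the most room left.
    for size in sorted(sizes, reverse=True):
        load, idx = heapq.heappop(heap)
        bins[idx].append(size)
        heapq.heappush(heap, (load + size, idx))
    return bins


if __name__ == "__main__":
    print(worst_fit_decreasing([5, 3, 8, 2, 7], 2))
```

Placing the largest items first and always choosing the emptiest bin tends to keep the bin loads close to each other, which is the usual motivation for this heuristic when balancing work across a fixed number of workers.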
1 vote, 1 answer
Gobblin - how to get posts from Facebook
I have been investigating Gobblin for a while, and I am currently having difficulty using Gobblin to get posts from Facebook. I could not find any connection example on the internet, or I may have searched incorrectly.
I am looking at…

Leo
1 vote, 1 answer
Gobblin Kafka to HDFS pull job error
I'm trying to pull data from Kafka to HDFS using Gobblin.
Gobblin version (compiled from github source code with command sudo ./gradlew clean build -PuseHadoop2 -PhadoopVersion=2.7.1 -x test):
0.6.2-546-g431188b
Hadoop version:
Hadoop…

Dmitry
1 vote, 1 answer
How do I use Java to read AVRO data in Spark 1.3.1?
I am trying to develop a Java Spark Application that reads AVRO records (https://avro.apache.org/) from HDFS put there by a technology called Gobblin (https://github.com/linkedin/gobblin/wiki).
A sample HDFS AVRO data…

Mark
0 votes, 1 answer
No AbstractFileSystem configured for scheme: gs
I am getting the error below while running a Gobblin job.
My core-site.xml looks fine and it has the required value.
core-site.xml
fs.AbstractFileSystem.gs.impl
com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS
…

1stenjoydmoment
0 votes, 1 answer
How to set up Gobblin on Windows? Which versions of Gradle and Gobblin should be used?
I am trying to set up Gobblin on my system, but I am facing an issue while building with Gradle.
Which versions of Gobblin and Gradle do I need to use?
Error:
Caused by: org.gradle.api.plugins.UnknownPluginException: Plugin with id 'pegasus' not found.

Rajesh dash
0 votes, 1 answer
Gradle sync failed : Cannot cast object 'main classesDirs' with class 'org.gradle.api.internal.file.collections.DefaultConfigurableFileCollection'
I am facing the error below while building with Gradle.
I am using Gradle 6.5 and Gobblin version apache-gobblin-incubating-sources-0.14.0.
I have added build.gradle file and idesSetup.gradle…

Rajesh dash