Questions tagged [flume-ng]

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Flume-NG is a refactoring of the first-generation Flume that addresses known issues and limitations of the original design.

Use this tag for questions about the Flume-NG API and features specific to the new-generation versions (e.g. the Flume HDFS Sink was introduced only in the NG version and cannot be used in previous releases).

397 questions
4
votes
1 answer

Remote debugging Flume's custom source and logging

I have a custom source for my Flume (version 1.5.0) agent and I want to debug it. It's actually a custom Twitter source, from Cloudera's example here. I have a number of questions: (1) Is it possible to remote debug the Flume source (written in Java)…
oikonomiyaki
4
votes
3 answers

Expected timestamp in the Flume event headers, but it was null

I am using the configuration below to push Twitter feeds into HDFS using Flume, but I am getting the error "Expected timestamp in the Flume event headers, but it was null". twitter.conf: TwitterAgent.sources = Twitter TwitterAgent.channels =…
Farooque
4
votes
2 answers

Can I use system properties in Flume configuration?

I have the following Flume config for a Flume sink: # Describe the sink a1.sinks.k1.type = file_roll a1.sinks.k1.sink.directory = ~/flume_file_sink a1.sinks.k1.rollInterval = 0 I want to make sink.directory, channels.c1.capacity, channels.c1.capacity…
Pushkar
4
votes
1 answer

Flume 1.6 kafka source

kafka_2.10-0.8.2.0 flume 1.6 This is my flume configuration: a1.sources = r1 a1.sinks = k1 a1.channels = c1 a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource a1.sources.r1.zookeeperConnect = a2:3181 …
satu phat
3
votes
2 answers

How to use Flume's Kafka Channel without specifying a source

I have an existing Kafka topic and a Flume agent that reads from it and writes to HDFS. I want to reconfigure my Flume agent so that it moves away from the existing setup (a Kafka Source and file Channel to an HDFS Sink) and uses a Kafka Channel instead. I read…
darkCode
3
votes
1 answer

Kafka partition leader election fails after uncontrolled broker shutdown

We have 3 Kafka brokers and a topic with 40 partitions and the replication factor set to 1. After an uncontrolled Kafka broker shutdown, we see that for some partitions it wasn't possible to elect a new leader (see logs below). Eventually we cannot read from…
3
votes
0 answers

Flume classpath contains multiple SLF4J bindings, fetching twitter data

When fetching Twitter data using the command ./bin/flume-ng agent -n TwitterAgent -c conf -f /usr/lib/apache-flume-1.4.0-bin/conf/flume.conf, a warning pops up in the terminal saying: SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding…
Anish Arya
3
votes
3 answers

Stopping Flume Agent

I have a requirement where I want to run a Flume agent with a spooling directory as the source. After all the files from the spool directory are copied to HDFS (the sink), I want the agent to stop, as I know all the files have been pushed to the channel. Also I want to run…
Aditya Calangutkar
3
votes
1 answer

Cloudera 5.4.2: Avro block size is invalid or too large when using Flume and Twitter streaming

There is a tiny problem when I try Cloudera 5.4.2. It is based on this article: Apache Flume - Fetching Twitter Data http://www.tutorialspoint.com/apache_flume/fetching_twitter_data.htm It tries to fetch tweets using Flume and Twitter streaming for data…
dong
3
votes
1 answer

Custom Flume interceptor: intercept() method called multiple times for the same Event

TL;DR When a Flume source fails to push a transaction to the next channel in the pipeline, does it always keep event instances for the next try? In general, is it safe to have a stateful Flume interceptor, where processing of events depends on…
Shadocko
3
votes
1 answer

FLUME IllegalStateException: begin() called when transaction is OPEN

I have written a custom Flume sink, named MySink, whose process method is shown in the first snippet below. I am getting an IllegalStateException as follows (the detailed stack trace is available in the second snippet below): Caused by:…
F. Aydemir
3
votes
0 answers

Custom Flume Sink Deployment in Flume 1.6

I am using Flume 1.6 and have a custom sink implementation. I have built a JAR file with all necessary dependencies and placed it under /plugins.d/MySink/lib/MySink.jar. As far as I can tell from reading the resources available, if I place…
F. Aydemir
3
votes
2 answers

Avro events from Kafka to HDFS with Flume

I have a Kafka cluster that receives Avro events from producers. I would like to use Flume to consume these events and put them as Avro files in HDFS. Is this possible with Flume? Does anyone have an example of a configuration file demonstrating…
yosi
3
votes
1 answer

Flume memory channel to HDFS sink

I'm facing an issue with Flume (1.5 on Cloudera CDH 5.3): spoolDir source -> memory channel -> HDFS sink. What I'm trying to do: every 5 minutes, about 20 files are pushed to the spooling directory (grabbed from a remote storage). Each file contains…
Adagyo
3
votes
1 answer

Flume 1.6.0 Agent start failed

I have an issue while starting an Apache Flume agent with the flume-ng file in the bin folder. I have no clue how to fix this. I just wanted to run a Flume example. I'm using CentOS (a Linux distribution), command line only. Below you can see my…
J. Doe