Questions tagged [apache-apex]

Apex is a YARN-native platform that unifies stream and batch processing. It processes big data in-motion in a way that is highly scalable, highly performant, fault tolerant, stateful, secure, distributed, and easily operable.

Apache Apex is an open source platform for unified stream and batch processing. It runs natively on YARN. Apache Apex itself is divided into two core modules

Developing applications on Apex

To start with, documentation is available at: http://apex.apache.org/docs.html

51 questions
1
vote
2 answers

InvocationTargetException in Yarn task with Hadoop

While running a Kafka -> Apache Apex -> HBase pipeline, the following exception appears in the YARN tasks: com.datatorrent.stram.StreamingAppMasterService: Application master, appId=4, clustertimestamp=1479188884109, attemptId=2 2016-11-15 11:59:51,068 INFO…
syam
  • 51
  • 2
  • 9
1
vote
1 answer

Unable to fetch Application Overview Page of a Running Application

Most of the time, I can't open the Application Overview on my application's Application Monitor page. Sometimes when I do get there, elements on the page such as the Logical Plan and Physical Plan tabs don't exist at all, and none of the stats show up. However,…
atchn
  • 53
  • 3
1
vote
1 answer

AbstractFileOutputWriter Generating duplicate tmp files

I have an Apache Apex application consuming Kafka logs and writing them to HDFS. The DAG is simple: there is a Kafka Consumer (20 partitions of 2 GB memory for operator) connected by a stream to a "MyWriter extends…
atchn
  • 53
  • 3
1
vote
2 answers

How to use datatorrent in a kappa architecture?

I read a lot about lambda and kappa architectures where we need to use either Apache Spark or Apache Storm. I just discovered a new tool called DataTorrent which can do batch and real-time processing. I was wondering if DataTorrent can do, at the same…
user6134689
1
vote
1 answer

How to renew a delegation token for long-running applications besides the time set in the Hadoop cluster

I have an Apache Apex application which runs on my Hadoop environment. I have no problem with the application except that it fails after 7 days. I realized that this is because of the cluster-level setting that applies to any application. Is there any…
Raja
  • 513
  • 5
  • 18
1
vote
0 answers

Apache Apex - Custom written Kafka Offset Manager

If I write my own implementation of OffsetManager for the Kafka input operator, how should I configure the Kafka input operator to use my custom-built offset manager? I know about the property “OffsetManager”, but can someone please share some working…
1
vote
1 answer

How to get ApplicationID from inside Apache Apex application?

Is it possible to retrieve the Apex application ID (e.g. application_1463594017097_0024) within the Apex program? For example, from the DAG object or some other object?
user6147934
1
vote
0 answers

What do the numbers "total processed" & "total emitted" mean exactly for an Apex application

An Apache Apex application reports a few metrics while it is running, such as "total processed" and "total emitted". What do these numbers mean exactly? Are they the number of records processed/emitted, till now, by the corresponding…
PradeepKumbhar
  • 3,361
  • 1
  • 18
  • 31
1
vote
2 answers

How to use Apache Apex to ingest data in batch from DB2 to Vertica

Use Case: Ingest transaction data (e.g. rows = 10,000) in a single batch from DB2 and insert it into a Vertica database. Question: Should I get a single row from the database or a batch of 10k rows, process them, and then insert them into the destination database? Is…
user6147934
1
vote
1 answer

Apache Apex - kryo ArrayList Exception

I've implemented an operator to deserialize from avro byte[] to Object. After that I sent the object to ConsoleOutputOperator. public final transient DefaultInputPort input = new DefaultInputPort() { @Override public void…
Vic
  • 95
  • 1
  • 7
1
vote
3 answers

Error : HDFS Not Ready (Data-torrent RTS sandbox)

I was trying the DataTorrent sandbox but was getting this error: HDFS Not Ready HDFS may still be starting up, or there may be other configuration issues with your hadoop services. The console checks for changes in the status of these services every…
Rahul Shukla
  • 505
  • 7
  • 20
1
vote
2 answers

Can I get example code to consume avro kafka message?

I just set up the DataTorrent RTS (Apache Apex) platform and ran the pi demo. I want to consume "avro" messages from Kafka and then aggregate and store the data in HDFS. Can I get example code for this or Kafka?
Vic
  • 95
  • 1
  • 7
0
votes
1 answer

Does apache apex have some Web UI

I'm quite new to the Apache Apex platform. Does it have a web UI? I was able to run the Docker sandbox and an example app. Nevertheless, the YARN tracking URL points to 404 pages. For example…
Matzz
  • 670
  • 1
  • 7
  • 17
0
votes
0 answers

Failed to execute goal org.apache.maven.plugins:maven-resources

I am executing the mvn clean install -DskipTests -X command in the apex-core folder under my work directory to run an Apache Apex application. But there is a build error which says: Failed to execute goal …
0
votes
2 answers

IBM Websphere MQ to Apache Apex Operator Stream?

I have been searching the internet extensively for a way to use WebSphere MQ in Apache Apex to stream MQ messages into a DAG. However, there seems to be no IBM documentation on the matter. I know it might be similar to ActiveMQ and I…
RandomUser
  • 15
  • 1
  • 7