Questions tagged [beam]

This tag should be used for questions about the BEAM, the Erlang virtual machine.

The BEAM (Bogdan/Björn's Erlang Abstract Machine) is the Erlang virtual machine. Besides Erlang, other languages also target the BEAM, including Joxa, LFE, and Elixir.

Disambiguation

Use apache-beam for questions related to Apache Beam, an SDK for batch and stream processing.
Use android-beam for questions related to Android Beam, the NFC peer-to-peer mode NDEF message exchange mechanism in Android.
Use beam-search for questions related to the heuristic search algorithm beam search.

106 questions

vote

3 answers

Reloading/Recompiling/Refreshing .beam files inside a terminal

I use Eclipse and Erlide to develop in Erlang. To run the software, I enter the ebin/ directory in my terminal, since I don't like the console Eclipse provides. However, after each change I have to exit and re-enter erl in the terminal to reload the…

eclipse makefile erlang beam erlide

asked Feb 21 '14 at 23:50

danihodovic

vote

2 answers

How can I get beam size for Erlang?

I have a legacy Erlang program that needs optimization. This piece of code uses up to 20 GB of memory at run time. I'm wondering if there is a way to get the Erlang BEAM size of the process itself at run time. If that is possible, then I can do something…

erlang gen-server beam

asked Dec 03 '12 at 22:53

Jian Wang

vote

2 answers

tsung ts_config_server Can't start newbeam on host (reason: timeout) Aborting

I am currently in the midst of doing distributed load testing on Amazon's EC2 services and have diligently followed all documentation/forum/support on how to get things to work, but unfortunately find myself stuck at this point. No one in any of the…

erlang timeout load-testing tsung beam

asked Jul 20 '12 at 13:00

ikosuave

votes

0 answers

Spark Task Data loss after worker dies in Java

I have a Java program in which I use Spark as the runner for a Beam pipeline. There is a Spark task that collects some data. It finished correctly, but after that its worker died and the task was assigned to another worker. Why doesn't it recover…

java apache-spark beam

asked Aug 27 '23 at 08:11

Ahmed Hesham

votes

0 answers

Dataflow - process a single input to multiple outputs using a PTransform

I read data from Pub/Sub, and there are different types of data. I would provide a runtime argument, based on which the pipeline has to create multiple outputs (using branching). The idea is that I would get multiple PCollections after the PTransform, but should I…

google-cloud-dataflow apache-beam google-cloud-pubsub beam

asked Aug 22 '23 at 10:38

Anesh P

votes

0 answers

Apache Beam dataflow combine per key

I have a problem with my pipeline. My goal is to read around 4k Parquet files, read each one as a numpy array, and then make some aggregations, e.g. one file can produce 100 keys, each with some amount of data. Then I have combine-per-key logic, and my goal…

python google-cloud-dataflow apache-beam beam

asked Aug 17 '23 at 17:24

Dawid

votes

1 answer

Java - Apache Beam - Control number of connections when writing in MongoDB

I'm currently working on a streaming pipeline in Apache Beam (v2.43) to insert data into MongoDB. It runs on Dataflow quite well, but I'm not able to control the number of connections: in case of an input peak (Pub/Sub), Dataflow scales up and…

java mongodb google-cloud-dataflow beam

asked Jun 07 '23 at 18:18

Thibault Coudert

votes

0 answers

Does GroupIntoBatches guarantee that code is run only once per batch?

Reading this article https://cloud.google.com/blog/products/data-analytics/after-lambda-exactly-once-processing-in-google-cloud-dataflow-part-1 The side effects section says Cloud Dataflow does not guarantee that this code is run only once per…

java google-cloud-dataflow apache-beam beam

asked Jun 02 '23 at 21:13

knoeh

votes

1 answer

Apache Beam : java.lang.IllegalStateException when reading from MSSQL table

I have a Beam pipeline that reads from an MSSQL table using a simple query: return "SELECT " + "U.ID as userid, " + "U.firstname as firstname, " + "U.lastname as lastname, " + "email as email, " + "U.IP…

sql apache-beam beam

asked May 21 '23 at 14:55

ah_ben

votes

1 answer

How to use PCollection as a sideinput in Beam?

I am working on a Beam (Dataflow) pipeline, where the task is to read messages from Pub/Sub and then perform some transformations. If any of these transformations fails, I want to send the message to the dead letter…

google-cloud-dataflow apache-beam google-cloud-pubsub apache-beam-io beam

asked May 18 '23 at 21:11

trougc

votes

1 answer

How to cancel a GCP Dataflow job programmatically using Beam and just the job ID

We have a GCP Dataflow project which requires us to cancel a running Dataflow job. All we have at the time of cancellation is the job ID. From other posts on Stack Overflow, I learned we can cancel a job using something like this: PipelineResult…

java google-cloud-platform google-cloud-dataflow beam

asked May 16 '23 at 11:51

ZZZ

votes

0 answers

Apache Beam: write Kafka records to an Avro file

I would like to read a couple of rows from a Kafka topic and create an Avro file. I have partial code working that reads from the Kafka topic and prints to the console. What I would like to know is how to use AvroIO to write the generic record…

apache-kafka beam

asked May 04 '23 at 16:08

developer2015

votes

1 answer

How does Apache Beam give an exactly-once guarantee and do stateful calculation without checkpoints or fault tolerance?

Things like groupby or combine need an exactly-once guarantee even for trivial calculations like sum. But Apache Beam does not seem to have checkpointing baked into the library. Does it rely on Flink or Spark to manage fault tolerance and consistency in state?

apache-beam beam

asked Apr 20 '23 at 20:24

olaf

votes

1 answer

How to limit throughput on an Apache Beam pipeline in Go?

I wrote a basic pipeline in Go running on Google Dataflow. Basically, it transforms Pub/Sub events into Elastic documents and then updates the Elastic documents in bulk. I need to find a way to limit the number of bulk requests per second. Because when my…

google-cloud-dataflow apache-beam apache-beam-io beam

asked Apr 19 '23 at 19:11

boolangery

votes

0 answers

Apache Beam IO ElasticsearchIO.read() method (Java), which expects a PBegin input, and a means to handle a collection of queries

I'm running into an issue using ElasticsearchIO.read() to handle more than one instance of a query. My queries are being dynamically built as a PCollection based on an incoming group of values. I'm trying to see how to load the .withQuery()…

google-cloud-dataflow beam

asked Apr 12 '23 at 17:05

user21627820
