In signal processing, windowing makes it possible to work on overlapping segments of a signal.
Questions tagged [windowing]
224 questions
4
votes
3 answers
Clean way to identify runs on a PySpark DF ArrayType column
Given a PySpark DataFrame of the form:
+----+--------+
|time|messages|
+----+--------+
| t01|    [m1]|
| t03|[m1, m2]|
| t04|    [m2]|
| t06|    [m3]|
| t07|[m3, m1]|
| t08|    [m1]|
| t11|    [m2]|
| t13|[m2, m4]|
| t15|    [m2]|
| t20|    [m4]|
|…

Jedi
4
votes
1 answer
Flink Tumble Window Trigger time
I am using Flink to aggregate data from Kafka topics. I am using a tumble window of 1 hour, with the time characteristic set to Event Time. I am also using AscendingTimestampExtractor and assigning watermarks to the input based on a particular…

Madhu Kumar Tallapalli
4
votes
1 answer
Apache Flink: Watermarks, Dropping Late Events, and Allowed Lateness
I am having trouble understanding the concept of watermarks and allowed lateness.
The following is an excerpt from the [mail archive|https://www.mail-archive.com/user@flink.apache.org/msg08758.html] that talks about watermarks, but I have a couple of…

Sheel Pancholi
4
votes
2 answers
SQL - Flattening a table using windowing and case statements
First-time asker: I'm having some problems combining case logic and windowing in SQL Server 2012. I need to flatten the data structure shown below, so I'll be running MAX statements against these results afterwards. I'm using case/when logic to…

Brian Blanton
4
votes
1 answer
How to use a context window to segment a whole log Mel-spectrogram (ensuring the same number of segments for all the audios)?
I have several audio files with different durations, so I don't know how to ensure the same number N of segments for each one. I'm trying to implement an existing paper, which says that first a log Mel-spectrogram is computed over the whole audio with 64…

user2687945
4
votes
3 answers
sql windowing to persist a record on a given condition
I have some data about a website that has different shop sections, but when the user checks out at the end, we only know which shop section it was by looking for their most recent section hit.
For example, if I have data that looks…

shecode
3
votes
2 answers
Apache beam windowing: consider late data but emit only one pane
I would like to emit a single pane when the watermark reaches x minutes past the end of the window. This lets me ensure I handle some late data, but still only emit one pane. I am currently working in Java.
At the moment I can't find proper…

Joe Stoker
3
votes
2 answers
how to find number of active users for say 1 day,2 days, 3 days.....postgreSQL
A distribution of the number of days active within a week: I am trying to find how many members are active for 1 day, 2 days, 3 days, … 7 days during a specific week, 3/1-3/7.
Is there any way to use an aggregate function on top of PARTITION BY?
If not, what can be used…

KK44
3
votes
1 answer
Find the first index for which an array goes below a certain threshold (and stay below for some time)
Let A be a 1D numpy array, a threshold t, and a window length K.
How to find the minimal index j, such that A[j:j+K] < t? (i.e. the first time A stays below the threshold on a full window of width K).
I've tried (unfinished) things with a loop, but…
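
One possible single-pass approach to the question above, as a minimal sketch rather than the accepted answer: track the length of the current streak of below-threshold samples and stop as soon as it reaches K. The function name `first_run_below` and the sample data are hypothetical, and the code is plain Python, although it also iterates over a 1D NumPy array unchanged.

```python
def first_run_below(values, threshold, window):
    """Return the smallest j such that every element of values[j:j+window]
    is below threshold, or -1 if no such j exists."""
    if window < 1:
        raise ValueError("window must be at least 1")
    run = 0  # length of the current streak of below-threshold samples
    for i, v in enumerate(values):
        run = run + 1 if v < threshold else 0
        if run >= window:
            # The streak ends at index i, so it started window - 1 samples earlier.
            return i - window + 1
    return -1


# Hypothetical sample data: the first full window below 3 is A[3:6] == [1, 2, 1].
A = [5, 1, 6, 1, 2, 1, 7]
print(first_run_below(A, threshold=3, window=3))  # 3
```

The loop visits each element once, so the cost is linear in the length of A regardless of K.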

Basj
3
votes
0 answers
Apache Beam: Custom Windowing (windowfn)
Gurus - I am new to Apache Beam and trying to implement what seems to be a pretty straightforward use case. I have stock data and I need to find a rolling average price of the stock over the past 10 transactions.
Now since there is no fixed time…

hpep
3
votes
1 answer
Apache Flink: Skewed data distribution on KeyedStream
I have this Java code in Flink:
env.setParallelism(6);
//Read from Kafka topic with 12 partitions
DataStream line = env.addSource(myConsumer);
//Filter half of the records
DataStream> line_Num_Odd =…

froblesmartin
3
votes
1 answer
TSQL Replacing "Quirky Update" Calculations with Windowing and CTE
I am trying to come up with an alternative to using the "Quirky Update" using windowing and CTE for a somewhat complex calculation of "running performance". The quick math for running performance is ((1 + Running) * (1 + Daily)) - 1. This running…
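
The recurrence quoted above, ((1 + Running) * (1 + Daily)) - 1, can be illustrated outside T-SQL. This is a minimal Python sketch of the arithmetic only, assuming the running value starts at 0 and that daily returns are given as fractions; the function name `running_performance` and the sample values are hypothetical, and it does not reproduce the asker's full calculation.

```python
def running_performance(daily_returns):
    """Compound daily returns: running = (1 + running) * (1 + daily) - 1."""
    running = 0.0  # assumption: no performance before the first day
    results = []
    for daily in daily_returns:
        running = (1 + running) * (1 + daily) - 1
        results.append(running)
    return results


# Hypothetical sample: two consecutive days of +10% compound to roughly +21%.
print(running_performance([0.10, 0.10]))
```

Because each step multiplies (1 + Daily) into the previous total, the result on any row equals the running product of (1 + Daily) up to that row, minus 1.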

user789221
3
votes
5 answers
How to get a negative rownumber in sql server
I have a set of data with a DateTime column, say CalculatedOn. What I would like is to start at the current date, getdate(), and get x records from before the current date and the same number from after.
If x = 50 then 50 prior to now and 50…

sprocket12
2
votes
1 answer
Rolling window on timestamped DataFrame with a custom step?
I have been fiddling about with pandas.DataFrame.rolling for some time now and I haven't been able to achieve the result that I am looking for, so before I write a custom windowing function I figured I would ask if I'm missing something.
I have…

abscond
2
votes
1 answer
StructuredStreaming withWatermark - TypeError: 'module' object is not callable
I have a Structured Streaming PySpark program running on GCP Dataproc, which reads data from Kafka
and does some data massaging and aggregation.
I'm trying to use withWatermark(), and it is giving an error.
Here is the code:
df_stream =…

Karan Alang