Questions tagged [online-algorithm]

20 questions
23
votes
3 answers

Possibility to apply online algorithms on big data files with sklearn?

I would like to apply fast online dimensionality reduction techniques such as (online/mini-batch) Dictionary Learning on big text corpora. My input data naturally do not fit in memory (this is why I want to use an online algorithm) so I am…
register
9
votes
2 answers

VowpalWabbit: Differences and scalability

I am trying to ascertain how VowpalWabbit's "state" is maintained as the size of our input set grows. In a typical machine learning environment, if I have 1000 input vectors, I would expect to send all of those at once, wait for a model building…
3
votes
1 answer

Eligibility Traces: On-line vs Off-line λ-return algorithm

I have some problems with figuring out why you need to revisit all time steps from an episode on each horizon advance for the On-Line version of the λ-return algorithm from the book: Reinforcement Learning: An Introduction, 2nd Edition, Chapter 12,…
3
votes
1 answer

Online time series algorithms implemented in R/python/MOA

I am looking for implementations of online learning algorithms for time series. Do R, Python, MOA, or any other tools have these kinds of algorithms implemented? TIA!
brock
3
votes
1 answer

Online algorithm for computing average and variance from a subset of data

I took this as a reference for computing the variance and mean online from a variable-length array of data: http://www.johndcook.com/standard_deviation.html. The data is a set of 16-bit unsigned values, which may have any number of samples…
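For readers unfamiliar with the technique this question refers to, here is a minimal sketch of a Welford-style running mean and variance. The class name `RunningStats`, the method names, and the sample values are assumptions made for illustration; they are not taken from the question or the linked page.

```python
class RunningStats:
    """Running mean and variance that keeps three numbers instead of the samples."""

    def __init__(self):
        self.n = 0       # number of samples seen so far
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # running sum of squared deviations from the mean

    def push(self, x):
        """Fold one new sample into the running statistics."""
        self.n += 1
        delta = x - self.mean            # deviation from the old mean
        self.mean += delta / self.n      # update the mean
        self.m2 += delta * (x - self.mean)  # old deviation times new deviation

    def variance(self, ddof=1):
        """Return the variance; ddof=1 gives the sample variance, ddof=0 the population variance."""
        if self.n <= ddof:
            return float("nan")
        return self.m2 / (self.n - ddof)


# Hypothetical sample values, small enough to check by hand.
stats = RunningStats()
for sample in (2, 4, 4, 4, 5, 5, 7, 9):
    stats.push(sample)
```

Because each sample is folded in one at a time, the array length does not need to be known in advance, which is the situation the question describes.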
3
votes
2 answers

Online Algorithm for Standard Deviation Proof

I saw this algorithm in an answer to this question. Does this correctly calculate standard deviation? Can someone walk me through why this works mathematically? Preferably working back from this formula: public class Statistics { private int…
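The code in this excerpt is cut off, so the following is only a sketch under the assumption that it implements the usual Welford recurrence. The symbols $M_n$ (running mean) and $S_n$ (running sum of squared deviations) are introduced here for illustration and do not come from the question.

```latex
% Definitions (assumed notation):
%   M_n = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
%   S_n = \sum_{i=1}^{n} (x_i - M_n)^2 = \sum_{i=1}^{n} x_i^2 - n M_n^2 .
\begin{align*}
n M_n &= (n-1) M_{n-1} + x_n
  \quad\Longrightarrow\quad
  M_n = M_{n-1} + \frac{x_n - M_{n-1}}{n}, \\
S_n - S_{n-1} &= x_n^2 - n M_n^2 + (n-1) M_{n-1}^2 , \\
(x_n - M_{n-1})(x_n - M_n)
  &= x_n^2 - x_n (M_n + M_{n-1}) + M_{n-1} M_n \\
  &= x_n^2 - \bigl(n M_n - (n-1) M_{n-1}\bigr)(M_n + M_{n-1}) + M_{n-1} M_n \\
  &= x_n^2 - n M_n^2 + (n-1) M_{n-1}^2 .
\end{align*}
% Hence S_n = S_{n-1} + (x_n - M_{n-1})(x_n - M_n), and the sample standard
% deviation after n \ge 2 values is \sqrt{S_n / (n-1)}.
```

The last line of the derivation matches the second line, which is why updating the sum of squared deviations with the product of the old and new deviations gives the same result as recomputing it from scratch.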
3
votes
2 answers

How to handle new data for recommendation system?

Here's a theoretical question. Let's assume that I have implemented two types of collaborative filtering: user-based CF and item-based CF (in the form of Slope One). I have a nice data set for these algorithms to run on. But then I want to do two…
3
votes
2 answers

Online algorithm for calculating standard deviation

Normally, I have a more technical problem but I will simplify it for you with an example of counting balls. Assume I have balls of different colors and one index of an array (initialized to all 0's) reserved for each color. Every time I pick a ball,…
Erol
2
votes
2 answers

Online (as opposed to bulk processed) data mining packages

By "bulk processed" I mean a static data set of facts (as in a CSV) processed all at once to extract knowledge. By "online" I mean that it uses a live backing store: facts are added as they happen ("X buys Y") and queries happen on this live data ("what…
Jesvin Jose
2
votes
1 answer

Welford's online variance algorithm, but for Interquartile Range?

Short version: Welford's Online Algorithm lets you keep a running value for variance, meaning you don't have to keep all the values (e.g. in a memory-constrained system). Is there something similar for Interquartile Range (IQR)? An online algorithm…
Ian Boyd
1
vote
0 answers

significant difference detection on a stream of data

There are two groups of users. Based on their query, I return some search results to them (a1, a2, a3). The search results could vary based on either the group the user belongs to or some user-specific parameter. I want to measure whether the search…
1
vote
1 answer

Online Algorithm approach for alternating subsequence

Consider a sequence A = a1, a2, a3, ..., an of integers. A subsequence B of A is a sequence B = b1, b2, ..., bn which is created from A by removing some elements while keeping the order. Given an integer sequence A, the goal is to compute an…
1
vote
1 answer

What is the difference between an online sorting algorithm and an external sorting algorithm?

What is the difference between an online sorting algorithm and an external sorting algorithm? Are they the same or different?
1
vote
1 answer

Gain maximization on trees

Consider a tree in which each node is associated with a system state and contains a sequence of actions that are performed on the system. The root is an empty node associated with the original state of the system. The state associated with a node n…
0
votes
1 answer

Efficient algorithm for online Variance over image batches

I have a large number of images and want to calculate the variance (of each channel) across all of them. I'm having problems finding an efficient algorithm / setup for that. I read about Welford's online algorithm, but it is way too slow as it is…
Daraan