Questions tagged [large-data-volumes]
302 questions
11 votes · 4 answers
javascript to find memory available
Let's make it immediately clear: this is not a question about a memory leak!
I have a page that allows the user to enter some data, and JavaScript code that handles this data and produces a result.
The JavaScript produces incremental outputs in a DIV,…

— Zo72 (14,593 rep)
10 votes · 10 answers
What if 2^32 is just not enough?
What if you have so many entries in a table that 2^32 is not enough for your auto_increment ID within a given period (day, week, month, ...)?
What if the largest datatype MySQL provides is not enough?
I'm wondering how should I solve a situation…

— mike (5,047 rep)
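A back-of-the-envelope calculation makes the headroom concrete (the 100,000 rows/day insert rate below is an assumed figure, not from the question; MySQL's BIGINT UNSIGNED tops out at 2^64 - 1):

```python
# Capacity of MySQL's unsigned integer auto_increment types.
INT_UNSIGNED = 2**32 - 1      # 4,294,967,295 (~4.3 billion)
BIGINT_UNSIGNED = 2**64 - 1   # ~1.8 * 10^19

# Hypothetical insert rate: 100,000 rows per day.
rows_per_day = 100_000

days_until_int_exhausted = INT_UNSIGNED // rows_per_day
days_until_bigint_exhausted = BIGINT_UNSIGNED // rows_per_day

print(days_until_int_exhausted)     # 42949 days (~117 years)
print(days_until_bigint_exhausted)  # effectively never at this rate
```

At much higher rates (millions of rows per hour) INT runs out in years, which is why switching the column to BIGINT UNSIGNED, or composing IDs from a period prefix plus a counter, are the usual answers.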
10 votes · 1 answer
Custom ObservableCollection or BindingList with support for periodic notifications
Summary
I have a large and rapidly changing dataset which I wish to bind to a UI (DataGrid with grouping). The changes happen on two levels:
Items are frequently added to or removed from the collection (500 a second each way)
Each item has 4 properties…

— CityView (657 rep)
10 votes · 7 answers
Copy one column to another for over a billion rows in SQL Server database
Database: SQL Server 2005
Problem: Copy values from one column to another column in the same table with a billion+ rows.
test_table (int id, bigint bigid)
Things tried 1: update query
update test_table set bigid = id
fills up the transaction…

— Adi Pandit (689 rep)
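The usual workaround for a transaction log that fills up is to split the update into small keyed batches and commit each one. A minimal sketch of the chunking logic (the table and column names echo the question; the actual UPDATE would run through your DB driver of choice):

```python
def batch_ranges(min_id, max_id, batch_size):
    """Yield inclusive (lo, hi) id ranges covering [min_id, max_id]."""
    lo = min_id
    while lo <= max_id:
        hi = min(lo + batch_size - 1, max_id)
        yield lo, hi
        lo = hi + 1

# Each range becomes one small, separately committed transaction, e.g.:
#   UPDATE test_table SET bigid = id WHERE id BETWEEN ? AND ?
for lo, hi in batch_ranges(1, 1_000_000_000, 500_000):
    pass  # execute the UPDATE for [lo, hi], then commit
```

Combined with the SIMPLE or BULK_LOGGED recovery model, this keeps log growth bounded by one batch rather than a billion rows.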
10 votes · 12 answers
Best way to store/retrieve millions of files when their meta-data is in a SQL Database
I have a process that's going to initially generate 3-4 million PDF files, and continue at the rate of 80K/day. They'll be pretty small (50K each), but what I'm worried about is how to manage the total mass of files I'm generating for easy lookup…

— SqlRyan (33,116 rep)
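A common pattern here is to shard the files across directories keyed by a hash of the database id, so no single directory accumulates millions of entries. A sketch under assumed naming conventions (md5 prefix, two levels of two hex characters):

```python
import hashlib
from pathlib import Path

def shard_path(root, doc_id, levels=2, width=2):
    """Map a document id to a sharded directory path so that files
    spread evenly and no directory holds millions of entries."""
    digest = hashlib.md5(str(doc_id).encode()).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root, *parts, f"{doc_id}.pdf")

print(shard_path("/pdfs", 123456))
# /pdfs/e1/0a/123456.pdf
```

The SQL row then only needs to store the id; the path is derivable, and two levels of 256 buckets each cap any directory at a few dozen files even at 4 million documents.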
8 votes · 1 answer
Can Apache Solr Handle TeraByte Large Data
I have been an Apache Solr user for about a year. I used Solr for simple search tools, but now I want to use it with 5TB of data. I assume that the 5TB of data will become 7TB once Solr indexes it, given the filters I use. And then I will add nearly 50MB of data per…

— Mustafa (146 rep)
8 votes · 1 answer
Java implementation of singular value decomposition for large sparse matrices
I'm just wondering if anyone out there knows of a Java implementation of singular value decomposition (SVD) for large sparse matrices? I need this implementation for latent semantic analysis (LSA).
I tried the packages from UJMP and JAMA but they…

— jake (1,405 rep)
8 votes · 3 answers
How to plot large data vectors accurately at all zoom levels in real time?
I have large data sets (10 Hz data, so 864k points per 24 hours) which I need to plot in real time. The idea is that the user can zoom and pan into highly detailed scatter plots.
The data is not very continuous and there are spikes. Since the data set…

— Pyrolistical (27,624 rep)
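One standard technique for spike-preserving plots at every zoom level is min/max decimation: keep the minimum and maximum of each pixel-wide bucket instead of a naive stride, so spikes survive downsampling. A sketch over a plain list of y-values:

```python
def minmax_decimate(points, n_buckets):
    """Downsample to at most 2 points per bucket (the min and the max),
    preserving spikes that plain striding would silently drop."""
    if len(points) <= 2 * n_buckets:
        return list(points)
    out = []
    size = len(points) / n_buckets
    for b in range(n_buckets):
        bucket = points[int(b * size):int((b + 1) * size)]
        out.append(min(bucket))
        out.append(max(bucket))
    return out
```

Recomputing the decimation per zoom window keeps the drawn point count proportional to screen width (say, 2 points per pixel) rather than to the 864k-point data set.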
8 votes · 4 answers
Alternatives to huge drop down lists (24,000+ items)
In my admin section, when I edit items, I have to attach each item to a parent item. I have a list of over 24,000 parent items, which are listed alphabetically in a drop down list (a list of music artists).
The edit page that lists all these items…
— user15063
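A typical replacement for a 24,000-item drop down is a type-ahead (autocomplete) field backed by a prefix search over the sorted artist list. A sketch of the server-side lookup (the artist names are made-up examples):

```python
import bisect

def prefix_matches(sorted_names, prefix, limit=10):
    """Return up to `limit` names starting with `prefix`, using binary
    search on the sorted list instead of scanning all 24,000 entries."""
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = bisect.bisect_right(sorted_names, prefix + "\uffff")
    return sorted_names[lo:min(hi, lo + limit)]

artists = sorted(["abba", "ac/dc", "beatles", "beck", "bjork"])
print(prefix_matches(artists, "be"))  # ['beatles', 'beck']
```

The edit page then only transfers the handful of matches for what the user has typed, instead of rendering the entire parent list.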
8 votes · 4 answers
What technology for large scale scraping/parsing?
We're designing a large scale web scraping/parsing project. Basically, the script needs to go through a list of web pages, extract the contents of a particular tag, and store it in a database.
What language would you recommend for doing this on a…

— Jonathan Knight (135 rep)
8 votes · 3 answers
Which data structure should I use for geocoding?
I am trying to create a Python script which will take an address as input and will spit out its latitude and longitude, or latitudes and longitudes in case of multiple matches, quite like Nominatim.
So, the possible inputs and outputs could be:
In:…

— AppleGrew (9,302 rep)
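One simple structure that naturally handles multiple matches is an inverted index from address tokens to candidate records, intersected across the query's tokens. A toy sketch (the two addresses and their coordinates are illustrative sample data):

```python
from collections import defaultdict

# Toy gazetteer: normalized address -> (lat, lon).
places = {
    "10 downing street london": (51.5034, -0.1276),
    "downing college cambridge": (52.1988, 0.1225),
}

# Inverted index: token -> set of addresses containing it.
index = defaultdict(set)
for addr in places:
    for token in addr.split():
        index[token].add(addr)

def geocode(query):
    """Return (lat, lon) for every address matching all query tokens."""
    tokens = query.lower().split()
    candidates = set.intersection(*(index[t] for t in tokens))
    return [places[a] for a in sorted(candidates)]

print(geocode("downing street"))  # [(51.5034, -0.1276)]
```

Ambiguous queries ("downing") simply intersect to several candidates and return multiple coordinate pairs, which matches the Nominatim-style behaviour the question asks for.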
7 votes · 11 answers
Advice on handling large data volumes
So I have a "large" number of "very large" ASCII files of numerical data (gigabytes altogether), and my program will need to process the entirety of it sequentially at least once.
Any advice on storing/loading the data? I've thought of converting…

— Jake (15,007 rep)
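For a single sequential pass over gigabytes of ASCII numbers, a lazy generator keeps memory flat regardless of file size. A minimal sketch:

```python
def stream_values(path):
    """Lazily yield floats from a whitespace-separated ASCII file
    without ever loading the whole file into memory."""
    with open(path) as f:
        for line in f:
            for field in line.split():
                yield float(field)

# A running aggregate over gigabytes stays O(1) in memory:
#   total = sum(stream_values("data.txt"))
```

If the data will be re-read many times, converting it once to a packed binary format (e.g. via `struct` or NumPy) trades a one-off conversion pass for much faster subsequent reads.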
7 votes · 5 answers
How can I determine the difference between two large datasets?
I have large datasets with millions of records in XML format. These datasets are full data dumps of a database up to a certain point in time.
Between two dumps new entries might have been added and existing ones might have been modified or deleted.…

— NullUserException (83,810 rep)
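If each record carries a stable id, the diff reduces to comparing two keyed snapshots. A sketch using in-memory dicts (for dumps too large for RAM, the same three-way logic works as a merge over both files pre-sorted by id):

```python
def diff_dumps(old, new):
    """Three-way diff of two {id: record} snapshots."""
    added    = [k for k in new if k not in old]
    deleted  = [k for k in old if k not in new]
    modified = [k for k in new if k in old and new[k] != old[k]]
    return added, deleted, modified
```

Hashing each record (rather than storing it whole) is a common memory-saving variant when only change detection, not the changed content, is needed.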
7 votes · 6 answers
How to limit bandwidth used by mysqldump
I have to dump a large database over a network pipe that doesn't have much bandwidth and that other people need to use concurrently. If I try, it soaks up all the bandwidth, latency soars, and everyone else gets messed up.
I'm aware of the…

ʞɔıu
- 47,148
- 35
- 106
- 149
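The common answer is to pipe mysqldump through a rate limiter such as `pv --rate-limit` or `throttle`. The throttling idea itself is simple; a Python sketch of a rate-limited stream copy shows it:

```python
import time

def throttled_copy(src, dst, rate_bytes_per_s, chunk=64 * 1024):
    """Copy src to dst, sleeping between chunks so average throughput
    stays at or below rate_bytes_per_s (same idea as pv --rate-limit)."""
    start, sent = time.monotonic(), 0
    while True:
        data = src.read(chunk)
        if not data:
            break
        dst.write(data)
        sent += len(data)
        expected = sent / rate_bytes_per_s  # seconds this should have taken
        elapsed = time.monotonic() - start
        if expected > elapsed:
            time.sleep(expected - elapsed)
```

Because the limiter sits on the byte stream, it works unchanged whether the destination is a file, an SSH pipe, or a socket.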
7 votes · 2 answers
jQuery grid recommendations for large data sets?
I was looking around for jQuery grid recommendations and came across this question/answers:
https://stackoverflow.com/questions/159025/jquery-grid-recommendations
In looking through the many jQuery grid solutions out there, it seems they all want to…

— Ed Sinek (4,829 rep)