Questions tagged [large-data-volumes]

302 questions
11
votes
4 answers

javascript to find memory available

Let's make it immediately clear: this is not a question about memory leaks! I have a page which allows the user to enter some data, and JavaScript that handles this data and produces a result. The JavaScript produces incremental output in a DIV,…
Zo72
  • 14,593
  • 17
  • 71
  • 103
10
votes
10 answers

What if 2^32 is just not enough?

What if you have so many entries in a table that 2^32 is not enough for your auto_increment ID within a given period (day, week, month, ...)? What if the largest datatype MySQL provides is not enough? I'm wondering how I should solve a situation…
mike
  • 5,047
  • 2
  • 26
  • 32
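Before redesigning the schema, it is worth checking the arithmetic: an unsigned BIGINT already raises the ceiling from 2^32 - 1 to 2^64 - 1. A quick back-of-the-envelope sketch (the insert rate is a made-up illustration, not from the question):

```python
# Capacity check for MySQL integer auto_increment columns.
# UNSIGNED INT tops out at 2^32 - 1; UNSIGNED BIGINT at 2^64 - 1.
INT_UNSIGNED_MAX = 2**32 - 1
BIGINT_UNSIGNED_MAX = 2**64 - 1

def years_until_exhausted(rows_per_day: int, max_id: int) -> float:
    """Rough time to exhaust an auto_increment range at a constant insert rate."""
    return max_id / rows_per_day / 365

# Even at a billion inserts per day, unsigned BIGINT lasts tens of
# millions of years, so exhausting it is not a practical concern.
print(years_until_exhausted(1_000_000_000, BIGINT_UNSIGNED_MAX))
```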
10
votes
1 answer

Custom ObservableCollection or BindingList with support for periodic notifications

Summary: I have a large and rapidly changing dataset which I wish to bind to a UI (a DataGrid with grouping). The changes are on two levels: items are frequently added to or removed from the collection (500 a second each way), and each item has 4 properties…
10
votes
7 answers

Copy one column to another for over a billion rows in SQL Server database

Database: SQL Server 2005. Problem: copy values from one column to another column in the same table with a billion+ rows: test_table (int id, bigint bigid). Things tried: 1. A plain update query, update test_table set bigid = id, which fills up the transaction…
Adi Pandit
  • 689
  • 1
  • 6
  • 10
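The usual answer to the transaction-log blowup is to update in small keyed batches, committing after each one so the log never holds more than one batch. A sketch of that loop, demonstrated on SQLite (stdlib) rather than SQL Server; the table and column names are taken from the question, and on SQL Server the inner statement would use UPDATE TOP (n) instead of a LIMIT subquery:

```python
import sqlite3

# Keyed batch update: copy id into bigid a few thousand rows at a time,
# committing between batches so each transaction stays small.
def batch_copy(conn, batch_size=1000):
    while True:
        cur = conn.execute(
            "UPDATE test_table SET bigid = id WHERE id IN "
            "(SELECT id FROM test_table WHERE bigid IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()          # end the transaction after every batch
        if cur.rowcount == 0:  # nothing left to copy
            break

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (id INTEGER PRIMARY KEY, bigid INTEGER)")
conn.executemany("INSERT INTO test_table (id) VALUES (?)",
                 [(i,) for i in range(5000)])
batch_copy(conn)
```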
10
votes
12 answers

Best way to store/retrieve millions of files when their meta-data is in a SQL Database

I have a process that's going to initially generate 3-4 million PDF files, and continue at a rate of 80K/day. They'll be pretty small (50K each), but what I'm worried about is how to manage the total mass of files I'm generating for easy lookup.…
SqlRyan
  • 33,116
  • 33
  • 114
  • 199
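A common layout for this is to shard the files into nested directories derived from a hash of the database key, so no single directory grows huge and the path is computable from the key alone. A minimal sketch (the `pdfs` root and `.pdf` suffix are illustrative assumptions):

```python
import hashlib
from pathlib import PurePosixPath

# Derive a two-level directory shard from the document's database key.
# Two hex pairs give 256 * 256 = 65,536 leaf directories, so even
# 4 million files average only ~60 per directory.
def shard_path(doc_id: int, root: str = "pdfs") -> PurePosixPath:
    h = hashlib.md5(str(doc_id).encode()).hexdigest()
    return PurePosixPath(root, h[:2], h[2:4], f"{doc_id}.pdf")

print(shard_path(1234567))
```

Storing only the key in SQL and recomputing the path on demand keeps the database row small and the layout reorganizable.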
8
votes
1 answer

Can Apache Solr handle terabyte-scale data?

I have been an Apache Solr user for about a year. I have used Solr for simple search tools, but now I want to use it with 5TB of data. I assume the 5TB of data will become 7TB once Solr indexes it, given the filters I use. I will then add nearly 50MB of data per…
Mustafa
  • 146
  • 2
  • 7
8
votes
1 answer

Java implementation of singular value decomposition for large sparse matrices

I'm just wondering if anyone out there knows of a Java implementation of singular value decomposition (SVD) for large sparse matrices. I need this implementation for latent semantic analysis (LSA). I tried the packages from UJMP and JAMA, but they…
jake
  • 1,405
  • 3
  • 19
  • 33
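The reason sparse-aware SVD routines scale where JAMA does not is that iterative methods only ever multiply by the stored nonzeros, so cost grows with nnz rather than rows × columns. A pure-Python sketch of that idea, power iteration for the leading singular value (real LSA would use a truncated-SVD library routine, e.g. SVDLIBJ on the Java side):

```python
import math
import random

# Power iteration on A^T A touches only stored nonzeros, so each step
# costs O(nnz) instead of O(rows * cols).
def top_singular_value(rows, n_cols, iters=200):
    """rows: list of {col_index: value} dicts, one per matrix row."""
    random.seed(0)
    v = [random.random() for _ in range(n_cols)]
    for _ in range(iters):
        u = [sum(a * v[j] for j, a in r.items()) for r in rows]  # u = A v
        w = [0.0] * n_cols
        for r, ui in zip(rows, u):                               # w = A^T u
            for j, a in r.items():
                w[j] += a * ui
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]                                # renormalize
    # sigma_1 = ||A v|| once v has converged to the top right singular vector
    return math.sqrt(sum(sum(a * v[j] for j, a in r.items()) ** 2 for r in rows))

# Sanity check on diag(3, 1): the largest singular value is 3.
print(top_singular_value([{0: 3.0}, {1: 1.0}], 2))
```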
8
votes
3 answers

How to plot large data vectors accurately at all zoom levels in real time?

I have large data sets (10 Hz data, so 864k points per 24 hours) which I need to plot in real time. The idea is that the user can zoom and pan into highly detailed scatter plots. The data is not very continuous and there are spikes. Since the data set…
Pyrolistical
  • 27,624
  • 21
  • 81
  • 106
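Because the data has spikes, naive striding (every Nth point) will silently drop them. The standard fix is min/max decimation: one bucket per screen pixel, keeping each bucket's extremes so spikes survive at every zoom level. A minimal sketch:

```python
# Min/max decimation: for each pixel-width bucket keep the extreme samples,
# so spikes survive downsampling at any zoom level.
def minmax_decimate(samples, buckets):
    n = len(samples)
    out = []
    for b in range(buckets):
        chunk = samples[b * n // buckets:(b + 1) * n // buckets]
        if chunk:
            out.append(min(chunk))
            out.append(max(chunk))
    return out

data = [0.0] * 1000
data[500] = 9.9  # a spike that every-Nth-point striding would likely drop
print(9.9 in minmax_decimate(data, 50))
```

Zooming then re-runs the decimation over just the visible index range, so the plotted point count stays bounded by screen width regardless of data size.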
8
votes
4 answers

Alternatives to huge drop down lists (24,000+ items)

In my admin section, when I edit items, I have to attach each item to a parent item. I have a list of over 24,000 parent items, which are listed alphabetically in a drop down list (a list of music artists). The edit page that lists all these items…
user15063
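The usual replacement for a 24,000-item drop-down is a server-backed autocomplete: keep the artist list sorted and answer each keystroke with a prefix search. A sketch of the lookup side (the artist names are illustrative):

```python
import bisect

# Prefix search over a sorted list: two binary searches bound the matching
# range, so each autocomplete query is O(log n) regardless of list size.
def prefix_matches(sorted_names, prefix, limit=10):
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = bisect.bisect_right(sorted_names, prefix + "\uffff")
    return sorted_names[lo:min(hi, lo + limit)]

artists = sorted(["Aerosmith", "ABBA", "AC/DC", "Beatles", "Beck", "Bjork"])
print(prefix_matches(artists, "Be"))
```

The `limit` keeps the response small, so the page never ships the full list to the browser.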
8
votes
4 answers

What technology for large scale scraping/parsing?

We're designing a large scale web scraping/parsing project. Basically, the script needs to go through a list of web pages, extract the contents of a particular tag, and store it in a database. What language would you recommend for doing this on a…
Jonathan Knight
  • 135
  • 1
  • 5
8
votes
3 answers

Which data structure should I use for geocoding?

I am trying to create a Python script which will take an address as input and will spit out its latitude and longitude, or latitudes and longitudes in case of multiple matches, much like Nominatim. So the possible inputs and outputs could be: In:…
AppleGrew
  • 9,302
  • 24
  • 80
  • 124
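One simple structure that handles both exact and partial addresses is a dict for full normalized addresses plus an inverted index on tokens, intersecting posting sets for partial input. A sketch (addresses and coordinates below are made up for illustration):

```python
from collections import defaultdict

# Exact-match dict for full addresses, plus an inverted index on tokens
# so partial input ("12 Main St") still finds every candidate.
places = {
    "12 main st springfield": (39.80, -89.64),
    "12 main st shelbyville": (39.40, -88.79),
}

token_index = defaultdict(set)
for addr in places:
    for tok in addr.split():
        token_index[tok].add(addr)

def geocode(query):
    q = query.lower().strip()
    if q in places:                      # unambiguous full address
        return [places[q]]
    hits = None                          # otherwise intersect token postings
    for tok in q.split():
        hits = token_index[tok] if hits is None else hits & token_index[tok]
    return [places[a] for a in sorted(hits or [])]

print(geocode("12 Main St"))  # ambiguous: both towns match
```

For fuzzy matching on misspellings, the token index generalizes to n-grams; the intersection logic stays the same.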
7
votes
11 answers

Advice on handling large data volumes

So I have a "large" number of "very large" ASCII files of numerical data (gigabytes altogether), and my program will need to process the entirety of it sequentially at least once. Any advice on storing/loading the data? I've thought of converting…
Jake
  • 15,007
  • 22
  • 70
  • 86
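The convert-to-binary idea the question mentions pays off because later passes read fixed-size records instead of re-parsing text. A stdlib sketch of the one-time conversion and a chunked sequential reader (file layout and record format are assumptions for illustration):

```python
import os
import struct
import tempfile

# One-time conversion: parse the ASCII fields once, write little-endian
# doubles so every later pass is a straight binary read.
def ascii_to_binary(lines, out_path):
    with open(out_path, "wb") as f:
        for line in lines:
            for field in line.split():
                f.write(struct.pack("<d", float(field)))

# Sequential pass: read in large chunks, unpack many records at once.
def read_binary(path, chunk_records=1024):
    with open(path, "rb") as f:
        while True:
            buf = f.read(8 * chunk_records)
            if not buf:
                break
            yield from struct.unpack(f"<{len(buf) // 8}d", buf)

path = os.path.join(tempfile.mkdtemp(), "data.bin")
ascii_to_binary(["1.5 2.5", "3.0"], path)
print(list(read_binary(path)))
```

For gigabyte-scale files the same layout also works with memory mapping, letting the OS page data in as the sequential pass advances.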
7
votes
5 answers

How can I determine the difference between two large datasets?

I have large datasets with millions of records in XML format. These datasets are full data dumps of a database up to a certain point in time. Between two dumps new entries might have been added and existing ones might have been modified or deleted.…
NullUserException
  • 83,810
  • 28
  • 209
  • 234
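If each record carries a primary key, the comparison reduces to indexing both dumps by key and classifying each key as added, deleted, or modified. A minimal sketch of that classification (the toy records stand in for parsed XML rows):

```python
# Keyed diff of two dumps: classify every primary key as added, deleted,
# or modified. Streams fine as long as the key -> record maps fit in memory.
def diff_dumps(old, new):
    """old, new: dicts mapping primary key -> record."""
    added    = [k for k in new if k not in old]
    deleted  = [k for k in old if k not in new]
    modified = [k for k in new if k in old and new[k] != old[k]]
    return added, deleted, modified

old = {1: "alice", 2: "bob", 3: "carol"}
new = {2: "bob", 3: "caroline", 4: "dave"}
print(diff_dumps(old, new))
```

When the records themselves are too large to hold, comparing a hash of each record instead of the record body keeps the maps small without changing the logic.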
7
votes
6 answers

How to limit bandwidth used by mysqldump

I have to dump a large database over a network pipe that doesn't have much bandwidth, and other people need to use it concurrently. If I try it, it soaks up all the bandwidth, latency soars, and everyone else gets messed up. I'm aware of the…
ʞɔıu
  • 47,148
  • 35
  • 106
  • 149
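Since mysqldump writes to stdout, the standard trick is to insert a rate limiter into the pipeline, e.g. `pv -L` between mysqldump and the network hop. A Python sketch of the throttling logic such a tool applies (buffer sizes and rate are illustrative):

```python
import io
import time

# Token-bucket-style throttle: copy src to dst, sleeping whenever the
# bytes sent so far run ahead of the allowed rate.
def throttled_copy(src, dst, bytes_per_sec, chunk=64 * 1024):
    start, sent = time.monotonic(), 0
    while True:
        buf = src.read(chunk)
        if not buf:
            break
        dst.write(buf)
        sent += len(buf)
        ahead = sent / bytes_per_sec - (time.monotonic() - start)
        if ahead > 0:        # ahead of the allowed rate: pause to fall back
            time.sleep(ahead)

src, dst = io.BytesIO(b"x" * 200_000), io.BytesIO()
t0 = time.monotonic()
throttled_copy(src, dst, bytes_per_sec=1_000_000)
elapsed = time.monotonic() - t0
print(len(dst.getvalue()), elapsed)
```

Capping the dump below the link's capacity leaves headroom for everyone else, at the cost of a proportionally longer transfer.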
7
votes
2 answers

jQuery grid recommendations for large data sets?

I was looking around for jQuery grid recommendations and came across this question/answers: https://stackoverflow.com/questions/159025/jquery-grid-recommendations In looking through the many jQuery grid solutions out there, it seems they all want to…
Ed Sinek
  • 4,829
  • 10
  • 53
  • 81