Questions tagged [large-data-volumes]

302 questions
2 votes, 3 answers

How to process massive data-sets and provide a live user experience

I am a programmer at an internet marketing company that primarily makes tools. These tools have certain requirements: they run in a browser and must work in all of them. The user either uploads something (.csv) to process or they provide a URL and…
RachelD • 4,072 • 9 • 40 • 68
2 votes, 2 answers

Select the latest record in each 2-minute interval in MySQL

I want to read about 8000 files, each containing the daily stock prices of a distinct stock, into a single table, select the latest price in each 2-minute interval, and write NULL if no record is available in an interval. My idea is to add a column…
2 votes, 1 answer

Google BigQuery is running queries slowly

I'm running a simple BigQuery query over my dataset, which is about 84GB of log data. The query takes approximately 110 seconds to complete. Is this normal for a data set of this size?
aloo • 5,331 • 7 • 55 • 94
2 votes, 2 answers

Inner join and Split on large volume of data

We are working on a large volume of data (row counts given below): Table 1: 708408568 rows (708 million); Table 2: 1416817136 rows (1.4 billion). Table 1 schema: ID - Int PK, column2 - Int. Table 2…
Murtaza Mandvi • 10,708 • 23 • 74 • 109
2 votes, 1 answer

Space and time problems when using regular expressions on large data sets

I have a large (greater than 200K) array of Strings which I use to search for patterns in documents. I convert each entry in the array into a regular expression before I apply it to the document. When I do this, the amount of time it takes to…
Elliott • 5,523 • 10 • 48 • 87
2 votes, 2 answers

How to allocate 16GB of memory in Go?

I'm using the following simple Go code to allocate a 3D array of size 1024x1024x1024: grid = make([][][]TColor, 1024) for x = 0; x < 1024; x++ { grid[x] = make([][]TColor, 1024) for y = 0; y < 1024; y++ { grid[x][y] = make([]TColor,…
metaleap • 2,132 • 2 • 22 • 40
1 vote, 2 answers

large file through WCF service

Similar questions are floating around and I looked at all of them. It appears none solve my issue. -- UPDATE: -- I am trying to upload a document (pdf, doc, or whatever) to a database using a WCF service. The call to the service looks like this: using…
Dmitry Efimenko • 10,973 • 7 • 62 • 79
1 vote, 2 answers

Accessing large data sets and/or storing them

At the moment I am dealing with large float/double datasets to be used for calculation. I have a set of files to compare Data A to Data B, and I would like to compute the Euclidean distance / cosine similarity, i.e. Data A point 1 iterates…
natchan • 138 • 1 • 1 • 12
1 vote, 1 answer

How do I efficiently search a potentially large database?

This is more of a discussion. We have a multi-tenant system that will have tables with millions of rows. Our UI allows users to perform searches against these tables with many different search criteria -- so they can have any…
Amitesh • 677 • 1 • 10 • 24
1 vote, 3 answers

C# Charting - Reasonably Large Data Set and Real-time

I'm looking for a C# WinForms charting component, either commercial or open source, that can handle relatively large data sets and be reasonably scalable with regard to chart rendering and updates. The number of data sets to be displayed would be…
Suggan Buggan
1 vote, 1 answer

Optimizing a moving window MySQL query

I have a MEMORY MySQL database that contains a table with 200k+ rows. I use this database to test trading strategies, so it is queried repeatedly ad nauseam. No new data is added to the database. One column in this database's primary table is "Time…
Mike Furlender • 3,869 • 5 • 47 • 75
1 vote, 1 answer

Python: Print selected string in between two strings that call it

I have the following code, which prints out a certain string I want from a file; however, this script includes the line containing the strings used to call the line in the output. I want the script to only print the middle line (not include the…
1 vote, 1 answer

Datagrid with large number of rows

In my WPF application, I've got a screen with a tab control. Five of these tabs contain datagrids which need to display a large number of rows (at least 5000). The tables are bound to ObservableCollections of Part objects. Each row displays…
drowned • 530 • 1 • 12 • 31
1 vote, 0 answers

MongoDB : Maximum call stack size exceeded

I have built a web scraper that fetches multiple URLs at once, collects data, and pushes it to MongoDB. The total number of URLs is 400k plus, and it collects data from 100 URLs at once and then uploads it to MongoDB Atlas in a single collection, but…
1 vote, 3 answers

Sql query with joins between four tables with millions of rows

We have a Transact-SQL statement that queries 4 tables with millions of rows in each. It takes several minutes, even though it has been optimized with indexes and statistics according to TuningAdvisor. The structure of the query is like: SELECT…
Ole Lynge • 4,457 • 8 • 43 • 57