
Let us take this example scenario:

There exists a really complex function that involves mathematical square roots and cube roots (which are slow to compute) to produce its output. As an example, let us assume the function accepts two parameters a and b, and that the input range for both values is well-defined: a and b can each range from 0 to 100.

So essentially fn(a,b) can either be computed in real time, or its results can be pre-filled in a database and fetched as and when required.

Method 1: Compute in real time

def fn(a, b)
  # compute_using_cuberoots stands in for the expensive root-based math
  compute_using_cuberoots(a, b)
end

Method 2: Fetch the function result from a database

We have a database pre-filled with the input values mapped to the corresponding result:

a   |  b  | result
0   |  0  |   12.4
1   |  0  |   14.8
2   |  0  |   18.6
.   |  .  |    .
.   |  .  |    .
100 | 100 |  1230.1

And we can simply fetch the result:

def fn(a, b)
  # fetch_from_db stands in for a single-row lookup keyed on (a, b)
  fetch_from_db(a, b)
end
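
The pre-fill itself would be a one-time batch job, something along these lines (compute_using_cuberoots is the placeholder from Method 1, and store_in_db stands in for whatever insert call the database driver provides):

# One-time batch job: precompute fn(a, b) for every input pair and store it.
(0..100).each do |a|
  (0..100).each do |b|
    store_in_db(a, b, compute_using_cuberoots(a, b))
  end
end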

My question:

Which method would you advocate and why? Why do you think one method is more efficient than the other?

I believe this is a scenario most of us will face at some point in our programming lives, hence this question.

Thank you.

Question Background (might not be relevant)

Example: in scenarios like image processing it is possible to come across such situations quite often, where the range of values for the input (R, G, B) is known (0-255) and the mathematical computation of square roots and cube roots adds too much time for server requests to be completed quickly.

Take, for example, an app like Instagram: the time taken to process an image sent to the server by the user, and the time taken to return the processed image, must be kept minimal for an optimal user experience. Worse yet, scalability problems are introduced when the number of such processing requests grows large.

Hence it is necessary to choose whichever of the methods described above is the most optimal in such situations.
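
To make the 0-255 case concrete, the precomputed values for a single channel would amount to a tiny in-memory table, roughly like this (the cube-root transform is purely illustrative):

# 256 precomputed values per channel replace a root computation per pixel.
# The cube-root curve here is only an illustration of an "expensive" transform.
CHANNEL_LUT = (0..255).map { |v| (255 * (v / 255.0) ** (1.0 / 3)).round }

def transform_channel(v)
  CHANNEL_LUT[v]   # constant-time lookup instead of recomputing the root
end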

More details on my situation (if required):

Framework: Ruby on Rails, Database: MongoDB

dsignr

5 Answers

I wouldn't advocate either method, I'd test them both (if I thought they were both reasonable) and get some data.

Having written that, I'll rise to the bait: given the relative speed of computation vs I/O I would expect computation to be faster than retrieving the function values from a database. I'll acknowledge the possibility (and no more) that in some special cases an in-memory database will be able to outperform (re-)computation, but as a general rule, no.
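
If you do want numbers quickly, a rough first pass in the asker's Ruby environment might look something like this (fn_compute and fn_db stand in for Method 1 and Method 2 from the question):

require 'benchmark'

# Compare the two approaches over a batch of random inputs from the 0-100 range.
pairs = Array.new(10_000) { [rand(0..100), rand(0..100)] }

Benchmark.bm(10) do |x|
  x.report('compute')  { pairs.each { |a, b| fn_compute(a, b) } }
  x.report('database') { pairs.each { |a, b| fn_db(a, b) } }
end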

High Performance Mark

Computing the results in advance and reading them from a table can be a good solution if the inputs are fixed values. Computing in real time and caching the results for an appropriate length of time can be a good solution if the inputs vary between situations.
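
In the asker's Rails setup, that caching approach might look roughly like this (compute_using_cuberoots is the placeholder from the question; the one-hour expiry is an arbitrary example):

# Compute on the first request for a given (a, b), then serve from the
# Rails cache until the entry expires.
def fn(a, b)
  Rails.cache.fetch("fn/#{a}/#{b}", expires_in: 1.hour) do
    compute_using_cuberoots(a, b)
  end
end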

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil" Donald Knuth

Selman Ulug

"More efficient" is a fuzzy term. "Faster" is more concrete.

If you're talking about a few million rows in a SQL database table, then selecting a single row might well be faster than calculating the result. On commodity hardware, using an untuned server, I can usually return a single row from an indexed table of millions of rows in just a few tenths of a millisecond. But I'd think hard before installing a dbms server and building a database only for this one purpose.

To make "faster" a little less concrete, when you're talking about user experience, and within certain limits, actual speed is less important than apparent speed. The right kind of feedback at the right time makes people either feel like things are running fast, or at least makes them feel like waiting just a little bit is not a big deal. For details about exactly how to do that, I'd look at User Experience on the Stack Exchange network.

The good thing is that it's pretty simple to test both ways. For speed testing just this particular issue, you don't even need to store the right values in the database. You just need to have the right keys and indexes. I'd consider doing that if calculating the right values is going to take all day.
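
With the asker's MongoDB stack rather than SQL, "the right keys and indexes" might amount to something like this (collection and field names are invented for the example):

require 'mongo'

client  = Mongo::Client.new(['127.0.0.1:27017'], database: 'fn_cache')
results = client[:results]

# Compound unique index so a single (a, b) lookup stays fast.
results.indexes.create_one({ a: 1, b: 1 }, unique: true)

# Fetch one precomputed row; returns nil if the pair was never stored.
doc = results.find(a: 4, b: 4).first
doc && doc['result']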

You should probably test over an extended period of time. I'd expect there to be more variation in speed from the dbms. I don't know how much variation you should expect, though.

Mike Sherrill 'Cat Recall'
  • Thank you, very precisely what I wanted to know. Also, +1 for the User Experience guideline, makes a lot of sense. Thanks!! – dsignr Mar 03 '12 at 07:36

I'd consider using a hash as a combination of calculating and storing. With the really complex function represented as a**b:

# Lazily computes a**b on first access and memoizes it in the hash
lazy = Hash.new { |h, (a, b)| h[[a, b]] = a**b }
lazy[[4, 4]]
p lazy #=> {[4, 4]=>256}
steenslag
  • Thanks, this is a neat idea, but in my case this wouldn't work out I guess - For each request, this hash would be reset. Thanks for the suggestion though! – dsignr Mar 02 '12 at 20:10

I'd think about storing the values in the code itself:

class MyCalc
  # RESULTS[a][b] holds the precomputed value of fn(a, b)
  RESULTS = [
    [12.4, 14.8, 18.6, ...],
    ...
    [..., 1230.1]
  ]
  def self.fn(a, b)
    RESULTS[a][b]
  end
end

MyCalc.fn(0, 1) #=> 14.8
Sony Santos
  • Though this is a good idea, there are some problems with its practical implementation - if the values of both a and b range from 0 to 100, then listing all the values of fn(a,b) as an array or a hash is no easy feat - 100x100 = 10,000 array elements. Code readability could easily become a problem with this approach. But hey, thanks for the suggestion though! – dsignr Mar 02 '12 at 20:09
  • @imaginonic Thanks for your reply. I was thinking of making a code generator, saving the real `fn(a,b)` into a text file with the syntax of the arrays (not manually typing 10,000 different values!!!). Readability would be horrible though, I agree, but I would leave that piece of code in another file, which can be loaded with `require`, since in Ruby we can reopen a class elsewhere to change it. :) (A sketch of such a generator follows after the comments.) – Sony Santos Mar 02 '12 at 21:12
  • Thanks for the info, I honestly didn't know about this before. Cheers! – dsignr Mar 03 '12 at 07:38
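
A sketch of the code generator described in Sony Santos's comment, assuming fn is the real-time (Method 1) implementation from the question and my_calc_results.rb is an arbitrary file name:

# Write the full 0..100 x 0..100 table into a separate Ruby file that the app
# can load with `require`, instead of keeping ~10,000 literals in the main source.
lines = ['class MyCalc', '  RESULTS = [']
(0..100).each do |a|
  row = (0..100).map { |b| fn(a, b) }.join(', ')
  lines << "    [#{row}],"
end
lines << '  ]' << 'end'

File.write('my_calc_results.rb', lines.join("\n") + "\n")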