2

I'm trying to find the maximum value of a list (using `maximum`). I know that this function needs to process the entire list to reach its goal, so it obviously gets slower as the list gets bigger. Unfortunately, my list is huge (hundreds of millions of elements).

Is this the only way, or are there faster ways to do this? I have found Haskell to be fast, but now that this is getting slower, I'm wondering whether there is any other option for finding the maximum.

Sir DK
  • How would you find the maximum of a list of unrelated numbers without checking every number? There are only two options: (a) either the numbers have some correlation you can exploit; or (b) you can perhaps use an array since this usually is faster to traverse. But in case of (b), it will still take *O(n)* time. – Willem Van Onsem Sep 28 '17 at 13:16
  • How are you generating the list? – MathematicalOrchid Sep 28 '17 at 13:17
  • A maximum is a commutative reduction, so you could distribute the job, if you have suitable parallel nodes to work on. But that assumes your huge set of data is already available to be chunked up; if it's just a list you're generating lazily then it is still faster to simply consume it. https://stackoverflow.com/questions/4028210/how-do-i-write-a-parallel-reduction-using-strategies-in-haskell might have some clues. – Yann Vernier Sep 28 '17 at 13:48
  • Being associative is enough for the job to be cut up and solved in parallel. No need for commutativity. – gallais Sep 28 '17 at 15:38
  • I think you are likely to get a more helpful answer more quickly if you edit your question to include additional detail. Several comments you made (to an answer that has now been deleted) suggested that the list was in a large text file, and you needed to perform multiple maximum/minimum calculations over various groupings, but none of this information is in the original question. It would be helpful to see examples of what the records in your file look like and the kinds of "groupings" you are attempting to find mins/maxes for. – K. A. Buhr Sep 28 '17 at 17:57
  • The question covers just a small part of a bigger piece of work. I did not explain where this ASCII file comes from, the groupings, etc., because I think it's not necessary. Let's say you have a plain list `xs = [1..100000000]` and use `maximum xs` or `minimum xs`; it will take some time to get the result. I'm just asking whether there is a faster way than using those functions. – Sir DK Sep 29 '17 at 08:01
  • @SirDK Nope. Have you compiled with optimization, though (`ghc -O`)? That can make a huge difference. – luqui Sep 30 '17 at 07:19
  • Without further knowledge about the "structure" of the list, there is no way to get around the `O(n)` complexity of the problem. – mschmidt Oct 03 '17 at 07:33
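
The point made in the comments about associativity can be sketched as follows: because `max` is associative, the list can be cut into chunks, each chunk reduced on its own, and the partial results reduced again. This is only a sequential illustration using `base`; the names `chunksOf`, `maxStrict` and `chunkedMaximum` are hypothetical helpers, and the total work is still O(n).

```haskell
import Data.List (foldl')
import Data.Maybe (mapMaybe)

-- Split a list into chunks of at most n elements.
-- A non-positive n is treated as 1 so the recursion always makes progress.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt (max 1 n) xs in h : chunksOf n t

-- Maximum via a strict left fold; Nothing for an empty list.
maxStrict :: Ord a => [a] -> Maybe a
maxStrict []     = Nothing
maxStrict (x:xs) = Just (foldl' max x xs)

-- Reduce each chunk independently, then reduce the per-chunk maxima.
-- Associativity of max guarantees the same answer as one pass over xs.
chunkedMaximum :: Ord a => Int -> [a] -> Maybe a
chunkedMaximum n xs = maxStrict (mapMaybe maxStrict (chunksOf n xs))
```

For example, `chunkedMaximum 1000 [1..100000]` evaluates to `Just 100000`. The per-chunk reductions do not depend on each other, so they are the part that could be handed to separate workers; whether that pays off depends on the data already being available in chunks, as noted above.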

2 Answers

1

> I know that this function needs to process the entire list to reach its goal and this obviously gets slower as the list gets bigger.

Finding the maximum of an unordered list is O(n) in Haskell (and presumably in every other language as well). Unless you have some concrete benchmarks and code showing otherwise, this slowdown seems totally expected.
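
The linear bound cannot be beaten, but the number of passes can be kept at one. As a minimal sketch, assuming both the minimum and the maximum are wanted (as the question's comments suggest), a single strict fold can track both at once. The name `minMax` is a hypothetical helper, not a library function.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Minimum and maximum in a single O(n) traversal.
-- Returns Nothing for an empty list.
minMax :: Ord a => [a] -> Maybe (a, a)
minMax []     = Nothing
minMax (x:xs) = Just (foldl' step (x, x) xs)
  where
    -- The bangs force both components before the pair is built,
    -- so no chain of unevaluated min/max calls accumulates.
    step (lo, hi) y =
      let !lo' = min lo y
          !hi' = max hi y
      in (lo', hi')
```

For example, `minMax [3, 1, 4, 1, 5, 9, 2, 6]` evaluates to `Just (1, 9)`. This does the same amount of comparison work as calling `minimum` and `maximum` separately, but walks the list only once.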

0

Since you need to "look" at each value to pick the highest, O(n) is the best you can do (without any other information that could be exploited). If you have to perform the lookup multiple times, you might want to sort your list (ascending or descending) and use `head` or `last`, which have a time complexity of O(1), while Haskell's `sort` is a merge sort with O(n log n) worst-case and average-case performance.

madnight
  • Haskell's lists are lisp-style, behaving as possibly infinite singly linked lists. `last` may be O(n). Extracting the last element of an array is O(1). – Yann Vernier Dec 31 '17 at 14:14
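
Taking the comment above into account, a sort-based lookup on a plain list is better served by sorting in descending order and reading the first element than by calling `last`. A minimal sketch, assuming the list fits in memory; `largestViaSort` is a hypothetical name:

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- Sort in descending order, then take the first element.
-- The sort costs O(n log n); pattern matching on the head is O(1),
-- whereas `last` on a singly linked list would have to walk every cell.
largestViaSort :: Ord a => [a] -> Maybe a
largestViaSort xs =
  case sortOn Down xs of
    []      -> Nothing
    (y : _) -> Just y
```

For example, `largestViaSort [3, 1, 4, 1, 5, 9, 2, 6]` evaluates to `Just 9`. If only the maximum itself is needed repeatedly and the list does not change, computing it once with `maximum` and reusing the value is cheaper than sorting.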