4 bytes (int) * 1_000_000_000 ~ 4 GB
4 bytes (int) * 1_000_000_000 / 64 bytes per cache line = 62_500_000 line fetches (L1, L2 and L3 all use the same line size)
Take the latency numbers everyone should know: main memory is about 100 ns per access, so for 1_000_000_000 accesses we get 100 s.
If everything already sat in L1 cache (64-byte lines on Intel), it would be about 31.25 ms.
But the data first has to come through the L2/L3 caches (same line size), which would make it about 218.75 ms.
For sequential reads (in other words, the best case) the reference number is 250 µs per 1 MB, so for 4 GB it is 4_096 * 250 µs = 1_024_000 µs ~= 1 s.
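To put the arithmetic in one place (the latency constants come from the widely circulated "Latency numbers every programmer should know" list, so treat them as rough reference values):

main memory:    1_000_000_000 accesses x 100 ns     = 100 s
L1 cache:          62_500_000 lines    x 0.5 ns     ~= 31.25 ms
sequential RAM:         4_096 MB       x 250 µs/MB  ~= 1.02 s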
An SSD has lower latency, but it is not that simple. There is research (possibly outdated by now) showing that most consumer SSDs cannot sustain really high load rates (the reasons: they fail under it and, more interestingly, they have their own garbage collector, which can add significant latency). But there are also solutions adapted to the SSD environment, like Aerospike, and of course SSDs are generally faster than HDDs.
Just to get a feeling for the numbers: on a typical laptop (mine: Intel Core i5, x64, 16 GB RAM) I need roughly 580 ms to 875 ms to compute a long sum over 1 billion int elements.
I can also see ClickHouse reach 300 MB/s to 354.66 MB/s computing a sum over an Int32 column on my SSD.
(Note that the sum itself is meaningless in both cases because of type overflow.)
Of course, we also have CUDA as an option, or even plain threading (if multiple threads each sum a part, we can easily achieve a speed-up; see the sketch below).
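A minimal sketch of the threading variant, assuming the data fits in RAM as an int[] (the array size and class name here are only illustrative):

import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

public class ParallelSum {
    public static void main(String[] args) {
        // Scale up to 1 billion elements if you have the RAM for it.
        int[] data = new int[100_000_000];
        for (int i = 0; i < data.length; ++i) {
            data[i] = ThreadLocalRandom.current().nextInt();
        }
        long start = System.nanoTime();
        // Summing as long avoids the int overflow mentioned above;
        // the parallel stream splits the work over the common ForkJoinPool.
        long sum = Arrays.stream(data).asLongStream().parallel().sum();
        System.out.printf("sum = %d in %d ms%n", sum, (System.nanoTime() - start) / 1_000_000);
    }
}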
So... What can we do?
There are two types of scaling: vertical and horizontal. Most databases prefer horizontal scaling, and I suppose the reason is simple.
Horizontal scaling is simpler than vertical. For vertical scaling you need people with (or you need to have yourself) very good expertise in several different areas.
For example, from my own experience: I have to know a lot about Java/HotSpot/OS architectures and many, many technologies/frameworks/DBs to write, or to understand the benefits of, different decisions when creating high-performance applications/algorithms.
And that is only the beginning; there are experts far stronger than me.
Other databases use vertical scaling; more precisely, they use special optimizations for particular scenarios/queries.
All such decisions are compromises between different operations.
For example, for the Top N problem, Vertica and Druid have specific implementations which solve exactly that task.
In Cassandra, to make all selects fast you should create multiple tables (or multiple views of one table) with a representation efficient for each particular query; of course this costs extra storage because of data duplication.
One of the biggest real problems is that even if you can read 1 billion rows in one second, you likely cannot write into the same table at the same time.
In other words, the main problem is that it is hard to satisfy all of a user's requests for all of a user's tasks at the same time.
Is there a better, proven methodology to deal with billions of rows?
Some examples:
- RAM combined with memory-mapped files (to keep overhead down): with memory-mapped files (or random-access files) you can store more data than fits in the heap and, with good disks, still get high read/write rates (a minimal mapping sketch follows right after this list);
- Indexed memory segments: the idea is to split your data by an index associated with each segment and run queries inside segments, instead of processing all the data;
- Task-specific storage: when you know your data and your requirements, you can come up with a storage layout that is very efficient for them, though not for other workloads (in your "find one name" case you could index and partition the data by letters, prefixes, etc.);
- Doing heavy calculations in C/C++; sometimes this really is faster. :) It is a little bit funny, but a true story. By word of mouth C++ is more complex to program and support, but it is easier to write a fast application in C++ if you have enough expertise;
- Data duplication (store the data in multiple layouts for different queries);
- Code generation (generate code on the fly, optimized for each particular task);
- Multithreading, of course: run one task on multiple threads if you can share CPU resources effectively;
- Caching, of course: cache results, backed by disk/RAM/network (including external cache servers); a minimal LRU sketch appears just below.
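On the memory-mapped point: a minimal sketch of the idea; the file name "data.bin" and its layout (little-endian ints, length a multiple of 4) are assumptions for illustration only:

import java.io.IOException;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MappedSum {
    public static void main(String[] args) throws IOException {
        try (FileChannel channel = FileChannel.open(Paths.get("data.bin"), StandardOpenOption.READ)) {
            long sum = 0;
            long position = 0, size = channel.size();
            while (position < size) {
                // A single map() call is limited to ~2 GB, so map in chunks
                // (the mask keeps each chunk 4-byte aligned).
                long chunk = Math.min(size - position, Integer.MAX_VALUE & ~3L);
                MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, chunk);
                IntBuffer ints = buffer.order(ByteOrder.LITTLE_ENDIAN).asIntBuffer();
                while (ints.hasRemaining()) {
                    sum += ints.get();
                }
                position += chunk;
            }
            System.out.println("sum = " + sum);
        }
    }
}

The OS pages the file in lazily, so a file much larger than the heap can still be scanned, and with a good disk the read rates stay high.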
In some cases building your own solution can be more expensive (and more effective) than an off-the-shelf one; in other cases it is not...
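And on the caching bullet: a minimal in-process sketch built on LinkedHashMap's access order (the LRU policy and the capacity are arbitrary illustrative choices; it is not thread-safe as written):

import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // true = access order, which gives LRU eviction
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put(); evict the least recently used entry
        // once the cache grows past its capacity.
        return size() > capacity;
    }
}

Use it like any Map, e.g. new LruCache<String, Integer>(1_000); external cache servers apply the same principle at network scale.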
Comparing strings is relatively expensive, so I suggest you start by calculating how much time you need to compare two strings.
This simple example shows how much time that takes in Java:
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.SampleTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Threads(1)
public class StringEquals {

    @Param({"0", "5", "10"})
    int prefix;

    String theSamePart, theSamePartQuery;

    @Setup(Level.Invocation)
    public void setData() {
        // Build two strings from random ints; for prefix > 0 the query
        // string starts with theSamePart plus a random tail, so the two
        // strings share a leading part.
        String value = String.valueOf(ThreadLocalRandom.current().nextInt());
        theSamePart = prefix > 0 ? value.substring(Math.min(prefix, value.length())) : value;
        value = String.valueOf(ThreadLocalRandom.current().nextInt());
        theSamePartQuery = prefix > 0 ? theSamePart + value.substring(Math.min(prefix, value.length())) : value;
    }

    @Benchmark
    public boolean equals(StringEquals stringEquals) {
        return stringEquals.theSamePart.equals(stringEquals.theSamePartQuery);
    }

    public static void main(String[] args) throws Exception {
        new Runner(new OptionsBuilder()
                .include(StringEquals.class.getSimpleName())
                .measurementIterations(10)
                .warmupIterations(10)
                .build()).run();
    }
}
Results:
Benchmark (prefix) Mode Cnt Score Error Units
StringEquals.equals 0 sample 3482270 0,047 ± 0,011 us/op
StringEquals.equals:equals·p0.00 0 sample 0,022 us/op
StringEquals.equals:equals·p0.50 0 sample 0,035 us/op
StringEquals.equals:equals·p0.90 0 sample 0,049 us/op
StringEquals.equals:equals·p0.95 0 sample 0,058 us/op
StringEquals.equals:equals·p0.99 0 sample 0,076 us/op
StringEquals.equals:equals·p0.999 0 sample 0,198 us/op
StringEquals.equals:equals·p0.9999 0 sample 8,636 us/op
StringEquals.equals:equals·p1.00 0 sample 9519,104 us/op
StringEquals.equals 5 sample 2686616 0,037 ± 0,003 us/op
StringEquals.equals:equals·p0.00 5 sample 0,021 us/op
StringEquals.equals:equals·p0.50 5 sample 0,028 us/op
StringEquals.equals:equals·p0.90 5 sample 0,044 us/op
StringEquals.equals:equals·p0.95 5 sample 0,048 us/op
StringEquals.equals:equals·p0.99 5 sample 0,060 us/op
StringEquals.equals:equals·p0.999 5 sample 0,238 us/op
StringEquals.equals:equals·p0.9999 5 sample 8,677 us/op
StringEquals.equals:equals·p1.00 5 sample 1935,360 us/op
StringEquals.equals 10 sample 2989681 0,039 ± 0,001 us/op
StringEquals.equals:equals·p0.00 10 sample 0,021 us/op
StringEquals.equals:equals·p0.50 10 sample 0,030 us/op
StringEquals.equals:equals·p0.90 10 sample 0,049 us/op
StringEquals.equals:equals·p0.95 10 sample 0,056 us/op
StringEquals.equals:equals·p0.99 10 sample 0,074 us/op
StringEquals.equals:equals·p0.999 10 sample 0,222 us/op
StringEquals.equals:equals·p0.9999 10 sample 8,576 us/op
StringEquals.equals:equals·p1.00 10 sample 325,632 us/op
So, assuming you need to process 1_000_000_000 strings at roughly 8 µs each (the p0.9999 score above), sequential processing takes approximately 8_000_000_000 µs = 8000 s to cover the 99.99% case.
In contrast, we could try to do it in parallel:
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.concurrent.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.SampleTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Threads(1)
public class SearchBillionForkJoin {

    static final int availableProcessors = 4; // Runtime.getRuntime().availableProcessors()
    static final int size = 10_000_000, bucketSize = size / availableProcessors;
    static final int handlersCount = availableProcessors;

    @Param({"50", "100"})
    int spinner;

    String[] a;
    Callable<Integer>[] callables;
    ForkJoinTask<Integer>[] tasks;
    QueryHolder queryHolder;

    @Setup(Level.Trial)
    public void setup() {
        callables = new Callable[handlersCount];
        queryHolder = new QueryHolder();
        a = new String[size];
        // One callable per bucket; each scans its own slice of the array.
        for (int i = 0; i < callables.length; ++i) {
            switch (i) {
                case 0:
                    callables[i] = createForBucket(queryHolder, a, 0, bucketSize);
                    break;
                case 1:
                    callables[i] = createForBucket(queryHolder, a, bucketSize, bucketSize * 2);
                    break;
                case 2:
                    callables[i] = createForBucket(queryHolder, a, bucketSize * 2, bucketSize * 3);
                    break;
                case 3:
                    callables[i] = createForBucket(queryHolder, a, bucketSize * 3, size);
                    break;
            }
        }
        tasks = new ForkJoinTask[handlersCount];
    }

    @Setup(Level.Invocation)
    public void setData() {
        for (int i = 0; i < a.length; ++i) {
            a[i] = String.valueOf(ThreadLocalRandom.current().nextInt());
        }
        queryHolder.query = String.valueOf(ThreadLocalRandom.current().nextInt());
    }

    @Benchmark
    public Integer forkJoinPoolWithoutCopy() {
        try {
            for (int i = 0; i < tasks.length; ++i) {
                tasks[i] = ForkJoinPool.commonPool().submit(callables[i]);
            }
            Integer position = -1;
            boolean findMore = true;
            head:
            while (position == -1 && findMore) {
                findMore = false;
                for (int i = 0; i < tasks.length; ++i) {
                    if (tasks[i].isDone() && !tasks[i].isCancelled()) {
                        final Integer value = tasks[i].get();
                        if (value > -1) {
                            position = value;
                            // One bucket found the query: cancel the rest.
                            for (int j = 0; j < tasks.length; ++j) {
                                if (j != i && !tasks[j].isDone()) {
                                    tasks[j].cancel(true);
                                }
                            }
                            break head;
                        }
                    } else {
                        findMore = true;
                    }
                }
                // Busy-wait a little instead of blocking on get().
                int counter = spinner;
                while (counter > 0) --counter;
            }
            return position;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        new Runner(new OptionsBuilder()
                .include(SearchBillionForkJoin.class.getSimpleName())
                .jvmArgs("-Xmx10G")
                .measurementIterations(10)
                .warmupIterations(10)
                .build()).run();
    }

    static boolean isDone(ForkJoinTask[] tasks) {
        for (int i = 0; i < tasks.length; ++i) {
            if (!tasks[i].isDone()) {
                return false;
            }
        }
        return true;
    }

    static Callable<Integer> createForBucket(QueryHolder queryHolder, String[] a, int start, int end) {
        return new Callable<Integer>() {
            @Override
            public Integer call() throws Exception {
                for (int j = start; j < end; ++j) {
                    if (queryHolder.query.equals(a[j])) {
                        return j;
                    }
                }
                return -1;
            }
        };
    }

    static class QueryHolder {
        String query = null;
    }
}
I use 10_000_000 elements and 4 threads (for 4 CPU cores) because I don't have enough memory for 1 billion.
The results still don't look adequate:
Benchmark (spinner) Mode Cnt Score Error Units
SearchBillionForkJoin.forkJoinPoolWithoutCopy 50 sample 166 47,136 ± 1,989 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.00 50 sample 5,521 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.50 50 sample 47,055 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.90 50 sample 54,788 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.95 50 sample 56,653 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.99 50 sample 61,352 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.999 50 sample 63,635 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.9999 50 sample 63,635 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p1.00 50 sample 63,635 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy 100 sample 162 51,288 ± 4,031 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.00 100 sample 5,448 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.50 100 sample 49,840 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.90 100 sample 67,030 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.95 100 sample 90,505 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.99 100 sample 110,920 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.999 100 sample 121,242 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p0.9999 100 sample 121,242 ms/op
SearchBillionForkJoin.forkJoinPoolWithoutCopy:forkJoinPoolWithoutCopy·p1.00 100 sample 121,242 ms/op
In other words, extrapolating from 10 million to 1 billion elements: 63.635 ms * 100 = 6363.5 ms ~= 6.4 s.
These results could be improved, for example with affinity locks (one dedicated CPU per thread), but that may be too complex; a hedged sketch of the idea follows.
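Just to show the shape of it, a sketch using the third-party OpenHFT Java-Thread-Affinity library (net.openhft:affinity); the class name PinnedScanner is hypothetical and the API is assumed from that library's documentation, not something I benchmarked:

import net.openhft.affinity.AffinityLock;

public class PinnedScanner implements Runnable {
    @Override
    public void run() {
        // Pin this worker thread to a free CPU for its lifetime, so the OS
        // does not migrate it between cores and its caches stay warm.
        try (AffinityLock lock = AffinityLock.acquireLock()) {
            // ... scan this thread's bucket of the array here ...
        }
    }
}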
Let's try to use segments, just to show the idea:
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.SampleTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Threads(1)
public class SearchInMapBillionForkJoin {

    static final int availableProcessors = 8; // Runtime.getRuntime().availableProcessors()
    static final int size = 10_000_000, bucketSize = size / availableProcessors;
    static final int handlersCount = availableProcessors;

    Map<Integer, List<StringWithIndex>> strings;
    QueryHolder queryHolder;
    ForkJoinTask<Integer>[] tasks;
    Callable<Integer>[] callables;

    @Param({"50", "100"})
    int spinner;

    @Setup(Level.Trial)
    public void setup() throws Exception {
        queryHolder = new QueryHolder();
        strings = new ConcurrentHashMap<>();
        tasks = new ForkJoinTask[handlersCount];
        callables = new Callable[handlersCount];
        setData();
    }

    public void setData() throws Exception {
        // Fill the map in parallel: each generator produces callableBucket
        // random strings and indexes them by hash code.
        final int callableBucket = size / handlersCount;
        for (int i = 0; i < handlersCount; ++i) {
            callables[i] = createGenerateForBucket(strings, callableBucket);
            tasks[i] = ForkJoinPool.commonPool().submit(callables[i]);
        }
        while (!isDone(tasks)) {
            int counter = spinner;
            while (counter > 0) --counter;
        }
        // Collect some statistics about how long the per-hash lists get.
        Map<Integer, Integer> distribution = new HashMap<>();
        for (List<StringWithIndex> stringWithIndices : strings.values()) {
            distribution.compute(stringWithIndices.size(), (key, value) -> value == null ? 1 : value + 1);
        }
        int maxListSize = 0;
        for (int i = 0; i < handlersCount; ++i) {
            Integer max = tasks[i].get();
            if (max > maxListSize) {
                maxListSize = max;
            }
        }
        System.out.println("maxListSize = " + maxListSize);
        System.out.println("list size distribution = " + distribution);
        System.out.println("map size = " + strings.size());
        distribution = null;
        queryHolder.query = String.valueOf(ThreadLocalRandom.current().nextInt());
    }

    @Benchmark
    public Integer findInSegment() {
        final String query = this.queryHolder.query;
        final Integer hashCode = query.hashCode();
        final Map<Integer, List<StringWithIndex>> strings = this.strings;
        if (strings.containsKey(hashCode)) {
            List<StringWithIndex> values = strings.get(hashCode);
            if (!values.isEmpty()) {
                final int valuesSize = values.size();
                if (valuesSize > 100_000) {
                    // Huge segment: search it in parallel, one bucket per handler.
                    final int bucketSize = valuesSize / handlersCount;
                    for (int i = 0; i < handlersCount; ++i) {
                        final int start = i * bucketSize;
                        final int end = (i == handlersCount - 1) ? valuesSize : start + bucketSize;
                        callables[i] = createSearchForBucket(query, values, start, end);
                    }
                    try {
                        for (int i = 0; i < callables.length; ++i) {
                            tasks[i] = ForkJoinPool.commonPool().submit(callables[i]);
                        }
                        Integer position = -1;
                        boolean findMore = true;
                        head:
                        while (position == -1 && findMore) {
                            findMore = false;
                            for (int i = 0; i < tasks.length; ++i) {
                                if (tasks[i].isDone() && !tasks[i].isCancelled()) {
                                    final Integer value = tasks[i].get();
                                    if (value > -1) {
                                        position = value;
                                        for (int j = 0; j < tasks.length; ++j) {
                                            if (j != i && !tasks[j].isDone()) {
                                                tasks[j].cancel(true);
                                            }
                                        }
                                        break head;
                                    }
                                } else {
                                    findMore = true;
                                }
                            }
                            int counter = spinner;
                            while (counter > 0) --counter;
                        }
                        return position;
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                } else {
                    // Small segment: a plain linear scan is enough.
                    for (StringWithIndex stringWithIndex : values) {
                        if (query.equals(stringWithIndex.value)) {
                            return stringWithIndex.index;
                        }
                    }
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        new Runner(new OptionsBuilder()
                .include(SearchInMapBillionForkJoin.class.getSimpleName())
                .jvmArgs("-Xmx6G")
                .measurementIterations(10)
                .warmupIterations(10)
                .build()).run();
    }

    static class StringWithIndex implements Comparable<StringWithIndex> {
        final int index;
        final String value;

        public StringWithIndex(int index, String value) {
            this.index = index;
            this.value = value;
        }

        @Override
        public int compareTo(StringWithIndex o) {
            int a = this.value.compareTo(o.value);
            if (a == 0) {
                return Integer.compare(this.index, o.index);
            }
            return a;
        }

        @Override
        public int hashCode() {
            return this.value.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (obj instanceof StringWithIndex) {
                return this.value.equals(((StringWithIndex) obj).value);
            }
            return false;
        }
    }

    static class QueryHolder {
        String query = null;
    }

    static Callable<Integer> createSearchForBucket(String query, List<StringWithIndex> values, int start, int end) {
        return new Callable<Integer>() {
            @Override
            public Integer call() throws Exception {
                for (int j = start; j < end; ++j) {
                    StringWithIndex stringWithIndex = values.get(j);
                    if (query.equals(stringWithIndex.value)) {
                        return stringWithIndex.index;
                    }
                }
                return -1;
            }
        };
    }

    static Callable<Integer> createGenerateForBucket(Map<Integer, List<StringWithIndex>> strings,
                                                     int count) {
        return new Callable<Integer>() {
            @Override
            public Integer call() throws Exception {
                int maxListSize = 0;
                // Note: indexes are bucket-local in this sketch, so they repeat
                // across generator threads; a global offset would fix that.
                for (int i = 0; i < count; ++i) {
                    String value = String.valueOf(ThreadLocalRandom.current().nextInt());
                    // computeIfAbsent is atomic, but List.add is not, so use a
                    // synchronized list: several generators may hit the same hash.
                    List<StringWithIndex> values = strings.computeIfAbsent(
                            value.hashCode(), k -> Collections.synchronizedList(new ArrayList<>()));
                    values.add(new StringWithIndex(i, value));
                    if (values.size() > maxListSize) {
                        maxListSize = values.size();
                    }
                }
                return maxListSize;
            }
        };
    }

    static boolean isDone(ForkJoinTask[] tasks) {
        for (int i = 0; i < tasks.length; ++i) {
            if (!tasks[i].isDone()) {
                return false;
            }
        }
        return true;
    }
}
Results:
Benchmark (spinner) Mode Cnt Score Error Units
SearchInMapBillionForkJoin.findInSegment 50 sample 5164328 ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.00 50 sample ≈ 10⁻⁵ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.50 50 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.90 50 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.95 50 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.99 50 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.999 50 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.9999 50 sample 0.009 ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p1.00 50 sample 18.973 ms/op
SearchInMapBillionForkJoin.findInSegment 100 sample 4642775 ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.00 100 sample ≈ 10⁻⁵ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.50 100 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.90 100 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.95 100 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.99 100 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.999 100 sample ≈ 10⁻⁴ ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p0.9999 100 sample 0.005 ms/op
SearchInMapBillionForkJoin.findInSegment:findInSegment·p1.00 100 sample 0.038 ms/op
Before drawing any global conclusions, it is good to know some criticism of this example:
- because the benchmark data is artificial, it has a very good distribution among list sizes; one example: maxListSize = 3, list size distribution = {1=9954167, 2=22843, 3=49}, map size = 9977059 (maxListSize across all iterations was only 4);
- as a result, we never go inside the "if (valuesSize > 100_000)" branch;
- moreover, in most cases we probably don't even reach the "} else { for (StringWithIndex stringWithIndex : values) {" scan, because the "if (strings.containsKey(hashCode))" condition filters us out first;
- this test, in contrast to the previous ones, was run on a different PC (8 CPUs, 32 GB RAM, amd64);
Here you can see the idea: checking whether a key exists in a map (or memory segment) is obviously better than going over all the data. This topic is very broad. There are many people who work on performance and would say that "performance optimization is an infinite process". :)
I should also remind you that "premature optimization is bad", and add from myself that this does not mean you should design your system thoughtlessly, irrationally.
Disclaimer:
All of this information could be wrong. It is intended for informational purposes only and may not be incorporated into any contract. Before using it in production scenarios you should verify it yourself, and you should not use it in production code with reference to me. I am not responsible for any possible loss of money. None of this information refers to any company I have ever worked for. I am not affiliated with MySQL/MongoDB/Cassandra/BigTable/BigData, nor with Apache Ignite/Hazelcast/Vertica/ClickHouse/Aerospike or any other database.