4

I'm new with Aerospike. Is there any easy way to make a search with a part of text like Mysql. Example:

select * from test where column like '%hello%';

I find difficult to migrate to a NoSQL database if this common operations are not supported. Thanks.

Ronen Botzer
  • 6,951
  • 22
  • 41
karrtojal
  • 796
  • 7
  • 16
  • 2
    That common operation is one you shouldn't be using in an RDBMS either, at least if you expect to get back a result in a reasonable amount of time. LIKE can't use an index, so it has to scan the entire table. Scans on a very large set in Aerospike will be far faster than a scan of a similar sized table. – Ronen Botzer Apr 19 '16 at 16:30
  • Thanks rbotzer for your answer. It's a good technique what you propose. Although I'm still need to search for text. I have an application that search for words or parts of words. I think that a MongoDb suits better for this purpose and still be a NoSQL database. Best regards. – karrtojal Apr 22 '16 at 04:44

1 Answers1

3

Predicate filtering was added in release 3.12. You can use the stringRegex method of the PredExp class of the Java client to implement an equivalent to LIKE. Predicate filters also currently exists for the C, C# and Go clients.

This example in the Aerospike Java client shows something similar:

    Statement stmt = new Statement();
    stmt.setNamespace(params.namespace);
    stmt.setSetName(params.set);        
    stmt.setFilter(Filter.range(binName, begin, end));
    stmt.setPredExp(
        PredExp.stringBin("bin3"),
        PredExp.stringValue("prefix.*suffix"),
        PredExp.stringRegex(RegexFlag.ICASE | RegexFlag.NEWLINE)
        );

If you're using a language client that doesn't yet support predicate filtering, you'd implement this with a stream UDF attached to a scan or query. For example, in the Python client you would create an instance of class aerospike.Query with or without a predicate and call the aerospike.Query.apply() method.

Ideally you would accelerate this by bucketing and using a predicate to narrow down your search, rather than scanning the entire set. For example, you can create a startswith bin that holds the first letter, use the predicate to find that, then send the records matched through the stream UDF. Just note that a LIKE is a horribly slow operation on an RDBMS, as well, because it can't use an index.

local function bin_like(bin, val, plain)
    return function(rec)
        if rec[bin] and type(rec[bin]) == "string" then
            if string.find(rec[bin], val, 1, plain) then
                return true
            else
                return false
        else
            return false
        end
    end
end

local function map_record(rec)
  local ret = map()
  for i, bin_name in ipairs(record.bin_names(rec)) do
    ret[bin_name] = rec[bin_name]
  end
  return ret
end

function check_bins_match(stream, bin, val, plain)
  return stream : filter(bin_like(bin, val, plain)) : map(map_record)
end
Ronen Botzer
  • 6,951
  • 22
  • 41