WARNING
Allowing arbitrary Erlang functions via map-reduce is a security risk. Any valid Erlang can be executed, including sending your entire data set offsite or formatting the hard drive.
You have been warned.
However, if you implicitly trust any client that may connect to your cluster, you can allow Erlang source to be passed in a map-reduce request by setting {allow_strfun, true}
in the riak_kv section of app.config, (or in the advanced.config if you are using riak.conf).
Once you have allowed passing an Erlang function in a map-reduce phase, you need to pass in a function of the form fun(RiakObject,KeyData,Arg) -> [result] end.
Note that this must be an anonymous fun, so fun
is a keyword, not a name, and it must end with end.
Your function should handle the case where {error,notfound}
is passed as the first argument instead of an object. Simply adding a catch-all clause to the function could accomplish that.
Perhaps something like:
{
"inputs":{
"bucket":"test",
"key_filters":[["matches", ".*"]]
},
"query":[
{
"map":{
"language":"erlang",
"source":"fun(RiakObject, _KeyData, _Arg) ->
Key = riak_object:key(RiakObject),
Count = riak_kv_crdt:value(
RiakObject,
<<\"riak_kv_pncounter\">>),
[ {Key, Count} ];
(_,_,_) -> [{error,0}]
end."
}
}
]}
Allowing the source to be passed in the request is very useful while developing and debugging. For production, you really should put the functions in a dedicated pre-compiled module that you copy to the code path of each node so that the phase spec can specify the module and function by name instead of providing arbitrary code.
{"map":{
"language":"erlang",
"module":"yourprecompiledmodule",
"function":"functionname"}}