I have been running a Rails site for a couple of years, and some of its articles are pulled from the DB based on a weight field. The data structure is:
{name: 'Content Piece 1', weight: 50}
{name: 'Content Piece 2', weight: 25}
{name: 'Content Piece 3', weight: 25}
The Ruby code that I originally wrote looks like so:
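choices = []
# Total weight across all articles
sum = articles.inject(0.0) { |total, article|
  total + article['weight']
}
# Random point below the total weight
pick = rand(sum)
# The first article whose weight range contains the pick is selected
choices << articles.detect { |listing|
  if pick < listing['weight']
    true
  else
    pick -= listing['weight']
    false
  end
}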
This works well at picking a content piece while respecting the weights. Running the code 100 times over the data set, on several separate occasions, gives a distribution that tracks the weights fairly well:
choices = []
100.times do
  # Total weight across all articles
  sum = articles.inject(0.0) { |total, article|
    total + article['weight']
  }
  pick = rand(sum)
  # The first article whose weight range contains the pick is selected
  choices << articles.detect { |listing|
    if pick < listing['weight']
      true
    else
      pick -= listing['weight']
      false
    end
  }
end
{:total_runs=>100, "Content Piece 1"=>51, "Content Piece 2"=>22, "Content Piece 3"=>27}
{:total_runs=>100, "Content Piece 1"=>53, "Content Piece 2"=>30, "Content Piece 3"=>17}
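For anyone who wants to reproduce those numbers without the Rails app, here is a minimal self-contained sketch of the same idea. The `ARTICLES` constant and the `weighted_pick` and `tally_picks` helpers are hypothetical names used only for illustration, and the string keys are an assumption that mirrors the `listing['weight']` access above.

```ruby
# Hypothetical in-memory stand-in for the rows loaded from the DB.
ARTICLES = [
  { 'name' => 'Content Piece 1', 'weight' => 50 },
  { 'name' => 'Content Piece 2', 'weight' => 25 },
  { 'name' => 'Content Piece 3', 'weight' => 25 }
].freeze

# Pick one article with probability weight / total weight.
def weighted_pick(articles, rng = Random.new)
  total = articles.sum { |article| article['weight'] }
  pick = rng.rand * total # uniform float in [0, total)
  chosen = articles.detect do |article|
    next true if pick < article['weight']
    pick -= article['weight']
    false
  end
  # Guard against floating-point rounding leaving nothing selected.
  chosen || articles.last
end

# Count how often each article name is picked over a number of runs.
def tally_picks(articles, runs, rng = Random.new)
  counts = Hash.new(0)
  runs.times { counts[weighted_pick(articles, rng)['name']] += 1 }
  { total_runs: runs }.merge(counts)
end

p tally_picks(ARTICLES, 100)
```

Passing a seeded `Random.new(seed)` as the last argument makes a run repeatable, which helps when comparing against the Elasticsearch results.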
I am using Elasticsearch more and more at the moment, and I was hoping I could index the data in ES and pull the content out based on the weights.
I found an SO post discussing something very similar:
Weighted random sampling in Elasticsearch
I have taken the search query from that post and changed it to match my data structure:
{
  "sort": ["_score"],
  "size": 1,
  "query": {
    "function_score": {
      "functions": [
        {
          "random_score": {}
        },
        {
          "field_value_factor": {
            "field": "weight",
            "modifier": "none",
            "missing": 0
          }
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "replace"
    }
  }
}
This query definitely respects the weighting: it pulls out the content piece with weight 50 far more often than the two pieces with weight 25. However, it doesn't distribute the picks in proportion to each weight's share of the total of 100 (that is, roughly 50% / 25% / 25%). Running the query 100 times gives results like these:
{:total_runs=>100, "Content Piece 1"=>70, "Content Piece 2"=>22, "Content Piece 3"=>8}
{:total_runs=>100, "Content Piece 1"=>81, "Content Piece 2"=>7, "Content Piece 3"=>12}
{:total_runs=>100, "Content Piece 1"=>90, "Content Piece 2"=>3, "Content Piece 3"=>7}
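One way to reason about the skew is a rough Ruby model of the scoring. It assumes that `random_score` produces an independent uniform value in [0, 1) per document, that the final score is that value multiplied by the weight, and that `size: 1` returns the highest score. The `WEIGHTS` constant and the `top_scored` and `simulate` helpers are hypothetical names for illustration. Under those assumptions the top document is not chosen with probability weight / total: for weights 50/25/25 the model works out to about 2/3, 1/6 and 1/6. It is only a model, so it will not necessarily reproduce the exact counts above.

```ruby
# Hypothetical weights matching the three content pieces.
WEIGHTS = {
  'Content Piece 1' => 50,
  'Content Piece 2' => 25,
  'Content Piece 3' => 25
}.freeze

# Score every document as (uniform random in [0, 1)) * weight
# and return the name of the highest-scoring one.
def top_scored(weights, rng = Random.new)
  weights.max_by { |_name, weight| rng.rand * weight }.first
end

# Count how often each document comes out on top.
def simulate(weights, runs, rng = Random.new)
  counts = Hash.new(0)
  runs.times { counts[top_scored(weights, rng)] += 1 }
  counts
end

p simulate(WEIGHTS, 10_000)
```

The 2/3 figure follows from the model itself: half the time the weight-50 score lands above 25 and cannot be beaten, and the other half it competes on equal terms with the other two and wins a third of the time, so 1/2 + 1/6 = 2/3.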
As I am new to ES and still learning the ins and outs of querying, scoring, etc., I was wondering if anyone could help with a solution that more closely mimics the Ruby code above, so that content is distributed in proportion to the weights out of 100. Would Painless scripting work for porting the Ruby code?
I hope this makes sense; let me know if you have any questions that would help clarify what I am trying to achieve. Thanks!