
This comes up a lot and it's surprising there doesn't seem to be a standard solution. Say I have a bunch of numeric attributes -- you can imagine using this for ranking colleges or cities based on a bunch of component scores like student/teacher ratio or pollution or whatnot -- and want to turn them into a single score.

I'd like to take a bunch of examples and interpolate to get a consistent scoring function.

Maybe there are standard multidimensional curve-fitting or data-smoothing libraries or something that makes this straightforward?

More examples:

  • Turning the two blood pressure numbers into a single score for how close to optimal your blood pressure is
  • Turning body measurements into a single measure of how far you are from your ideal physique
  • Turning a set of times (100-meter dash, etc) into a fitness score for a certain sport
  • There is no one scoring function. You have to supply its definition. If there were such a function, everyone would agree on what's the best programming language, pop song, athlete, movie, car to buy, etc. – unutbu Dec 30 '14 at 20:41
  • Right, I failed to clarify: I want to supply it via examples that then get interpolated. I'm imagining an iterative process where you decide on scores for specific examples, then look at the scores for new examples. If you disagree with any, you adjust them, add them to the reference set, and compute a new scoring function. Iterate until happy. – dreeves Dec 30 '14 at 21:06
  • I think perhaps you are looking for [supervised machine learning regression algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/). – unutbu Dec 30 '14 at 21:30
  • Any single score is (and should be) biased toward its purpose, so you first need to know what the score will be used for, then select valid attributes, and only then start constructing the scoring function according to how the score depends on the attributes. For example, the BMI computation combines measurements in a specific way (but the formula varies by ethnicity or geographic location if you want the true BMI and not a crude average). The only universal solution I know of is a neural/fuzzy network trained with a bunch of examples of what is good and what is bad ... – Spektre Jan 02 '15 at 09:48
  • @dreeves, what do you know about what the score should look like? Do you know the correct ranking that the score should be consistent with, for example? – Artem Sobolev Jan 02 '15 at 11:51
  • Yes, we can assume that we (i.e., the human) know the ranking. In fact, let's assume we know everything about the score, so we can say exactly what it should be for any given example. Let's also assume, as in @aothman's answer, that every attribute is either good or bad -- i.e., contributes positively or negatively to the score. – dreeves Jan 02 '15 at 19:05

5 Answers


tl;dr: Check out HiScore. It will allow you to quickly write and maintain scoring functions that behave in sensible ways.

To instantiate your simple example, let's say you have an app that receives as input a set of distances and times, and you want to map them to a 1-100 score. For instance, you get (1.2 miles, 8:37) and you'd like to return, say, 64.

The typical approach is to pick several basis functions and then futz around with the coefficients of those basis functions to get scores that "look right". For instance, you may have a linear basis function on minutes-per-mile, with additional basis functions for distance (maybe both linear in distance and linear in the square root of distance). You could even use e.g., radial basis functions for more complex expressiveness across your range of inputs. (This is very similar to what other answers have suggested in terms of ML algorithms like SVMs and the like.)
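
For concreteness, here is a minimal sketch of that basis-function futzing in Python, fitting the coefficients by least squares. The choice of basis and all the numbers are made up for illustration:

```python
import numpy as np

# Made-up labeled examples: (miles, minutes) -> score on a 0-100 scale.
dist = np.array([1.2, 1.0, 3.1, 0.5, 2.0])
time = np.array([8.62, 7.05, 30.0, 4.5, 16.0])
score = np.array([64.0, 70.0, 75.0, 40.0, 62.0])

# Basis functions: pace (minutes per mile), distance, sqrt(distance), constant.
A = np.column_stack([time / dist, dist, np.sqrt(dist), np.ones_like(dist)])
coeffs, *_ = np.linalg.lstsq(A, score, rcond=None)

# Score an unseen run with the fitted coefficients.
d, t = 1.1, 7.02
print(np.array([t / d, d, np.sqrt(d), 1.0]) @ coeffs)
```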

This approach is typically pretty fast, but there are many downsides. First, you have to get the basis functions right, which can be hard for more abstract and expressive functions. Second, you'll find that your score will ossify quickly: if you find an input that you feel is mis-scored, figuring out how to change it while making sure the rest of the scoring function "looks right" will be a challenge. Third, adding another attribute to the score (e.g., if the runner is male or female) can be difficult, as you may find that you'll need to add many more terms to your basis. Finally, there's no explicit guarantee in this approach that your score will behave intelligently---depending on the basis functions and coefficients you select, someone running a mile in 7:03 could get a higher score than someone running 1.1 miles in 7:01.

A different approach exists in the form of HiScore, a python library I wrote when faced with a similar problem. With HiScore, you label a reference set of items with scores and then it generates a scoring function that intelligently interpolates through those scores. For instance, you could take the last 100 inputs to your app, combine them with a handful of your most extreme inputs (perhaps take the convex hull of your submitted inputs in (distance, time) space), label them, and use HiScore to produce a reasonable scoring function. And if it ever comes up with a score that you disagree with, just add it to the reference set with the correct label and re-create the scoring function, because HiScore guarantees interpolation through the reference set.
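
Here is a minimal sketch of that workflow on the (distance, time) example. The reference scores are made up, and the `hiscore.create(reference_set, monotone_relationship, minval, maxval)` entry point and `calculate` method follow my reading of the HiScore README, so verify the exact API against the project page:

```python
import hiscore  # pip install hiscore

# Made-up reference set: (miles, minutes) -> score on a 0-100 scale.
reference_set = {
    (0.0, 0.0): 0.0,
    (1.2, 8.62): 64.0,     # 1.2 miles in 8:37, the example above
    (3.1, 30.0): 75.0,
    (26.2, 180.0): 100.0,
}

# The score increases in distance (+1) and decreases in time (-1).
scorer = hiscore.create(reference_set, [1, -1], minval=0.0, maxval=100.0)

# Scores for unseen inputs; the reference points themselves interpolate exactly.
print(scorer.calculate([(1.0, 7.05), (1.1, 7.02)]))
```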

One property of HiScore is that your attributes need to be monotone: the score must always increase or always decrease in each attribute. This is not a problem for the running-times setting, because the score should go up as distance increases (for a fixed time) and down as time increases (for a fixed distance). HiScore's monotonicity gives you confidence that your score will behave as expected; it guarantees someone running a mile in 7:03 will score no higher than someone running 1.1 miles in 7:01.

The blood pressure setting you bring up is interesting because it's not monotone: low blood pressure is bad, but high blood pressure is bad too. You can still use HiScore here, though: just split each measurement into a "high blood pressure" and a "low blood pressure" component, where at least one of these is zero. For instance, a systolic reading of 160 would be mapped into a systolic+ attribute of 60 and a systolic- attribute of 0. The score should be decreasing in both of these new attributes, so this approach turns a non-monotone two-dimensional problem (with attributes systolic and diastolic) into a monotone four-dimensional one (with attributes systolic+, systolic-, diastolic+, diastolic-). (This trick is similar to one that helps get linear programs into canonical form.)
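
A minimal sketch of that split in plain Python. The function name is mine, and the optimal systolic value of 100 is inferred from the 160 -> 60 mapping in the example above:

```python
def split_monotone(reading, optimal):
    """Split a non-monotone attribute into two monotone attributes.

    Returns (excess above optimal, deficit below optimal); at least one
    of the two is always zero, and the score should be decreasing in both.
    """
    return max(reading - optimal, 0.0), max(optimal - reading, 0.0)

systolic_plus, systolic_minus = split_monotone(160, 100)   # (60.0, 0.0)
diastolic_plus, diastolic_minus = split_monotone(70, 80)   # (0.0, 10.0)
```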

aothman

You need to teach it what the correct values are; there is no other way to precisely determine what a correct solution is. So, as you said in the comments above, you need a human to tell it what the correct value (or at least the correct direction) is.

This is exactly what supervised machine learning is. You need a collection of labeled examples; you then train your algorithm on one subset of the collection and use the remaining subset to measure how accurate it is.

Examples of this are ANNs (artificial neural networks) and SVMs (support vector machines).

[Figure: an SVM separating two clusters of two-dimensional points]

Here we have an example of an SVM fitting a model to data with two attributes (represented as the X and Y axes) that form two clusters. You could think of red as high risk of heart disease and blue as low risk of heart disease, with the values being some sort of measurement.

Of course, in real-world examples you would have many more dimensions and perhaps more classes.

If you need the answers to then use for yourself, you can in some cases use the output values from the ANN directly.
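
To make this concrete: the figure shows classification, but since the question asks for a numeric score, the regression variant (SVR in scikit-learn) is the closer fit. A minimal sketch with made-up data, illustrating the train/test split described above:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Made-up examples: rows of numeric attributes with human-assigned scores.
X = np.array([[1.2, 8.62], [1.0, 7.05], [3.1, 30.0], [0.5, 4.5],
              [2.0, 16.0], [1.5, 10.0], [5.0, 50.0], [0.8, 6.0]])
y = np.array([64.0, 70.0, 75.0, 40.0, 62.0, 68.0, 80.0, 50.0])

# Train on one subset, measure accuracy on the held-out remainder.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
model = SVR(kernel="rbf").fit(X_train, y_train)

print(model.score(X_test, y_test))    # R^2 on the held-out subset
print(model.predict([[1.1, 7.02]]))   # score for a new, unseen example
```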

Ólafur Waage

If you are able to teach the system with score values for a number of numeric attribute combinations, then your problem is indeed multivariate interpolation. Most probably, your case is that of irregularly spaced data points.

If your distribution of sample points is sufficiently homogeneous, radial basis function interpolation is a good starting point.

Interpolation will let you compute a score from numeric attribute values not seen before. Make sure to provide enough training data to cover the whole domain, otherwise you can get meaningless estimates in places. Concretely, interpolation builds a function S(X; X0, X1, X2, ... Xn), where X is the unknown and the Xi are known samples with known scores Si, such that S(Xi; X0, X1, X2, ... Xn) = Si.

You may also consider approximation techniques, which build a function such that S(Xi; X0, X1, X2, ... Xn) ~ Si to some accuracy. The advantage is that these behave more smoothly and can actually "fix" errors in the input data.
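
For example, SciPy's `scipy.interpolate.Rbf` handles irregularly spaced points in any number of dimensions and covers both cases: the default is exact interpolation, while a nonzero `smooth` parameter gives an approximation. The data below is made up:

```python
import numpy as np
from scipy.interpolate import Rbf

# Made-up scored examples with three numeric attributes each.
attrs = np.array([[1.0, 2.0, 0.5],
                  [2.0, 1.0, 1.5],
                  [0.5, 3.0, 1.0],
                  [3.0, 0.5, 2.0]])
scores = np.array([40.0, 55.0, 35.0, 70.0])

# Exact interpolation: S(Xi; ...) == Si at every known sample.
interp = Rbf(attrs[:, 0], attrs[:, 1], attrs[:, 2], scores)

# Approximation: smooth > 0 relaxes this to S(Xi; ...) ~= Si.
approx = Rbf(attrs[:, 0], attrs[:, 1], attrs[:, 2], scores, smooth=1.0)

print(interp(1.5, 1.5, 1.0))  # score for an unseen attribute combination
```

Newer SciPy versions also offer `RBFInterpolator`, which supersedes `Rbf`.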

There is no standard solution for two reasons:

  • these techniques are difficult because of the nature of higher dimensional space,
  • there is no universal 'black box' technique; they all depend on the specifics of the data sets.
  • Radial basis functions are a great, smooth solution for approximation and interpolation generally. One of the very nice things about them is that they can scale up to arbitrary dimensions naturally. The issue is that if monotonicity is important in your application, which it typically is for scoring, their complex functional form makes them a poor fit. One approach is to try to induce monotonicity by generating row constraints by checking for monotonicity at a bunch of points. Unfortunately, this suffers from the curse of dimensionality, as the number of checked points grows exponentially. – aothman Jan 08 '15 at 06:26

If your label information is ordinal (i.e., ranking data), then you should use learning-to-rank approaches. One of them is SVM Rank.

It works like this: you put your dataset into a file in svmlight format and train a classifier via svm_rank_learn. You might want to tune the parameters, which could give you better accuracy. Then feeding svm_rank_classify another dataset (with unknown ranking) will give you scores, which you can use either for ranking or on their own.
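
Roughly like this; the feature values below are made up, and the commands and data layout follow the SVMrank documentation as I remember it, so check the SVMrank page for the full set of options:

```
# train.dat, svmlight format: <rank> qid:<query id> <feature>:<value> ...
3 qid:1 1:0.53 2:0.12
2 qid:1 1:0.13 2:0.31
1 qid:1 1:0.27 2:0.10

$ svm_rank_learn -c 20.0 train.dat model.dat    # -c trades off margin vs. training error
$ svm_rank_classify test.dat model.dat predictions
```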

Another thing to mention is that by default SVM Rank uses a linear kernel, which means the scoring function will be a weighted combination of attributes. You can try other kernels (like radial basis functions), but the authors of SVM Rank warn you:

> You can in principle use kernels in SVMrank using the '-t' option just like in SVMlight, but it is painfully slow and you are probably better off using SVMlight.

Artem Sobolev

Maybe you can use a probabilistic approach to combine the different measures. As an example, check the following 8-minute video, where Carl Sagan uses the Drake equation to estimate the probability of other advanced civilizations in the universe, based on several (and different) measurements/estimates.

You could similarly generate your estimate and then derive a score from it.
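
A minimal sketch of that idea: combine independently estimated component factors multiplicatively, Drake-style, and scale the product into a score. All factor names and values below are made-up placeholders:

```python
# Each factor estimates, on a 0-1 scale, how close one component is to ideal.
factors = {
    "systolic": 0.90,
    "diastolic": 0.95,
    "resting_heart_rate": 0.80,
}

score = 100.0
for value in factors.values():
    score *= value          # Drake-style: multiply the component estimates

print(round(score, 1))      # 68.4
```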

manei_cc