
I have the file data.txt with two columns and N rows, something like this:

0.009943796 0.4667975
0.009795735 0.46777886
0.009623984 0.46897832
0.009564759 0.46941447
0.009546991 0.4703958
0.009428543 0.47224948
0.009375241 0.47475737
0.009298249 0.4767201
[...]

Each pair of values in the file corresponds to the coordinates (x, y) of one point. When plotted, these points form a curve. I would like to calculate the area under this curve (AUC).

So I load the data:

data = load("data.txt");
X = data(:,1);
Y = data(:,2);

So, X contains all the x coordinates of the points, and Y all the y coordinates.

How can I calculate the area under the curve (AUC)?

Matt
DavideChicco.it

6 Answers


The easiest way is the trapezoidal rule function `trapz`.

If your data is known to be smooth, you could try Simpson's rule, but MATLAB has no built-in function for integrating numerical data with Simpson's rule. (I'm also not sure how to apply it to x/y data where x is not evenly spaced.)

Jason S
  • Thanks, I tried `trapz()`, but strangely it always gives me negative values. Why is that? An area should always be positive. Any ideas? – DavideChicco.it Dec 28 '11 at 20:02
  • If the curve goes below 0, the area will actually be decreased. This is just an integral, remember. To get a positive AUC you might need to change the baseline, for example by subtracting `min(Y)` from `Y`. Or you can use `abs(Y)` to add up the positive and negative areas. – yuk Dec 28 '11 at 20:20
  • Technically, if you use `trapz(x,y)`, the sign of the result depends on the sign of y and the sign of the change in x (remember: this is the integral of y dx). So if your y values are positive but x is decreasing, you'd get a negative number. It's actually a little more complicated than that: for closed curves, the sign should be positive for clockwise encircling and negative for counterclockwise encircling (see http://en.wikipedia.org/wiki/Green%27s_theorem#Area_Calculation ). – Jason S Dec 28 '11 at 22:34
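
The sign issue raised in these comments can be checked on the first eight rows posted in the question, where X happens to be decreasing. This is only a sketch: the variable names `A_raw`, `A_sorted` and `A_abs` are illustrative, and it assumes the rest of the file is also monotonic in x.

```matlab
% First eight rows from the question; X is decreasing, Y is positive.
X = [0.009943796; 0.009795735; 0.009623984; 0.009564759; ...
     0.009546991; 0.009428543; 0.009375241; 0.009298249];
Y = [0.4667975; 0.46777886; 0.46897832; 0.46941447; ...
     0.4703958; 0.47224948; 0.47475737; 0.4767201];

A_raw = trapz(X, Y);           % negative: Y > 0 but every step in X is < 0

% Option 1: sort the points by increasing x before integrating.
[Xs, idx] = sort(X);
A_sorted = trapz(Xs, Y(idx));  % positive

% Option 2: X is monotonic here, so the magnitude is the same either way.
A_abs = abs(A_raw);
```

Sorting is the safer choice if the points in the file may not be in order; taking `abs` of the result only makes sense when x is monotonic.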

Just add `AUC = trapz(X,Y)` to your program and you will get the area under the curve.

Simon

[~,~,~,AUC] = perfcurve(labels,scores,posclass);

% posclass might be 1

http://www.mathworks.com/matlabcentral/newsreader/view_thread/252131

Nucular

Source: Link

A MATLAB example to help you get your answer:

x=[3 10 15 20 25 30];
y=[27 14.5 9.4 6.7 5.3 4.5];
trapz(x,y)

If y contains negative values, you can clip them to zero first:

y=max(y,0)
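
As a rough illustration of how the choice of baseline changes the result, the sketch below compares the plain integral with clipping (`max(y,0)`), `abs(y)`, and subtracting `min(y)`. The data is made up for the example, and the variable names are illustrative only.

```matlab
% Made-up data that dips below zero in the middle.
x = 0:4;
y = [2 1 -1 1 2];

A_signed  = trapz(x, y);           % 3: negative part is subtracted
A_clipped = trapz(x, max(y, 0));   % 4: negative samples set to zero
A_abs     = trapz(x, abs(y));      % 5: negative samples flipped
A_shifted = trapz(x, y - min(y));  % 7: baseline moved down to min(y)
```

Note that clipping and `abs` act on the samples, not on the curve between them, so the zero crossings are only approximated.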
Faheem

You can do something like this:

AUC = sum((Y(1:end-1)+Y(2:end))/2.*...
  (X(2:end)-X(1:end-1)));
Oli

There are alternatives to `trapz` for anyone ready to do some coding themselves. This link shows an implementation of Simpson's rule, with Python code included. There is also a File Exchange submission on Simpson's rule.
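
For reference, a minimal sketch of composite Simpson's rule in MATLAB might look like the following. It is not the code from either linked resource. It assumes the x values are uniformly spaced and that the number of intervals is even, which may not hold for the data in the question; the test function `sin(x)` and the variable names are chosen only for illustration.

```matlab
% Composite Simpson's rule for uniformly spaced samples (sketch only).
x = linspace(0, pi, 101);      % 101 points -> 100 intervals (even)
y = sin(x);

n = numel(x) - 1;              % number of intervals
if mod(n, 2) ~= 0
    error('Simpson''s rule needs an even number of intervals.');
end
h = (x(end) - x(1)) / n;       % uniform step size

% Weights: 1 at the ends, 4 at odd interior points, 2 at even interior points.
I_simpson = h / 3 * (y(1) + 4 * sum(y(2:2:end-1)) ...
                          + 2 * sum(y(3:2:end-2)) + y(end));
```

The exact integral of `sin(x)` over [0, pi] is 2, so `I_simpson` should land very close to that value.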

patrik