3

Say I have two lists called x and y, both of which contain numbers (coordinates), for example:

x = [0, 1, 2, 3, 4, 4, 5]
y = [0, 1, 3, 3, 5, 6, 7]

I need to calculate the area under the curve formed by combining these two lists into (x, y) coordinates. I don't really understand how to write a function that calculates the area from this information.

Something like:

def integrate(x, y):
    """ x & y = lists, area_under_curve = float value of the area """
    area_under_curve = 0
    last_x = x[0]
    last_y = y[0]

    for cur_x, cur_y in list(zip(x, y))[1:]:
        pass  # some code here

    return area_under_curve
bharatk
  • The problem is a graph of discrete points rather than a continuous line, so technically there is no “curve” from a mathematical standpoint. The answer will vary with the technique used: the Trapezoid Rule, left Riemann sums, right Riemann sums, and midpoints will generally yield different approximations. Perhaps Python has a standard library for solving such a problem. – ManLaw Sep 20 '19 at 11:25
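
To make the point in the comment above concrete, here is a minimal sketch that applies three of these rules to the sample points from the question. The helper names (`left_riemann`, `right_riemann`, `trapezoid`) are made up for illustration. The midpoint rule is left out because it needs values between the given points, which discrete data does not provide.

```python
def left_riemann(x, y):
    """Rectangles whose height is the y value at the left end of each interval."""
    return sum((x[i] - x[i - 1]) * y[i - 1] for i in range(1, len(x)))


def right_riemann(x, y):
    """Rectangles whose height is the y value at the right end of each interval."""
    return sum((x[i] - x[i - 1]) * y[i] for i in range(1, len(x)))


def trapezoid(x, y):
    """Trapezoids: interval width times the average of the two end heights."""
    return sum((x[i] - x[i - 1]) * (y[i - 1] + y[i]) / 2 for i in range(1, len(x)))


x = [0, 1, 2, 3, 4, 4, 5]
y = [0, 1, 3, 3, 5, 6, 7]

print(left_riemann(x, y))   # 13
print(right_riemann(x, y))  # 19
print(trapezoid(x, y))      # 16.0
```

For this data the trapezoid result is the average of the left and right sums, which holds in general because each trapezoid is the mean of the two rectangles on the same interval.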

2 Answers

3

As was mentioned, with the Trapezoid Rule the function is rather simple:

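def integrate(x, y):
    sm = 0
    for i in range(1, len(x)):
        # Width of the interval between consecutive points
        h = x[i] - x[i-1]
        # Trapezoid area: width times the average of the two heights
        sm += h * (y[i-1] + y[i]) / 2

    return sm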

The idea is to interpolate linearly between consecutive points, which turns each interval into a trapezoid, and then sum the trapezoid areas.
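
As a sketch of how the same rule could fit the loop skeleton from the question (keeping its `last_x` / `last_y` variables; this is one way to fill it in, not necessarily what the asker ended up with):

```python
def integrate(x, y):
    """x & y = lists; returns the area under the curve as a float."""
    area_under_curve = 0.0
    last_x = x[0]
    last_y = y[0]

    for cur_x, cur_y in zip(x[1:], y[1:]):
        # Trapezoid between the previous point and the current one:
        # interval width times the average of the two heights.
        area_under_curve += (cur_x - last_x) * (last_y + cur_y) / 2
        last_x, last_y = cur_x, cur_y

    return area_under_curve


x = [0, 1, 2, 3, 4, 4, 5]
y = [0, 1, 3, 3, 5, 6, 7]
print(integrate(x, y))  # 16.0
```

Note that the repeated x value (4, 4) gives an interval of width zero, so the vertical jump from y = 5 to y = 6 adds nothing to the area.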

kuco 23
  • This generated a negative AUC value (-0.996) for my ROC curve (x=FAR, y=TAR). Do you have any idea why it is negative? @kuco 23 – Mas A Jan 08 '22 at 16:32
  • It's a signed area under the curve, so if y has negative values, the area will be negative. @MasA – kuco 23 Jan 09 '22 at 20:47
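
Another possible cause, not confirmed anywhere in this thread, is the order of the x values: if x is listed in decreasing order, every width `h` is negative and so is the sum, even when all y values are positive. The sketch below uses made-up ROC-like points (`far` and `tar` are hypothetical data, not the commenter's) and shows that sorting the points by x flips the sign.

```python
def integrate(x, y):
    sm = 0
    for i in range(1, len(x)):
        h = x[i] - x[i - 1]
        sm += h * (y[i - 1] + y[i]) / 2
    return sm


# Hypothetical points listed with x going from 1 down to 0
far = [1.0, 0.5, 0.2, 0.0]
tar = [1.0, 0.9, 0.6, 0.0]

print(integrate(far, tar))  # negative, because every h is negative

# Sort the (x, y) pairs by x so the widths become positive
xs, ys = zip(*sorted(zip(far, tar)))
print(integrate(xs, ys))    # positive, same magnitude
```

Taking the absolute value would also work for monotonic x, but sorting first makes the intent clearer.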
2
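Try this, using NumPy's trapezoidal-rule integration:

import numpy as np

x = [0, 1, 2, 3, 4, 4, 5]
y = [0, 1, 3, 3, 5, 6, 7]

def integrate(x, y):
    # np.trapz applies the trapezoidal rule to the sampled points.
    # NumPy 2.0 deprecated this name in favor of np.trapezoid.
    area = np.trapz(y=y, x=x)
    return area

print(integrate(x, y))  # 16.0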

Vashdev Heerani