I would like to fit a copula to a dataframe with two columns, a and b. Then I need to calculate the conditional probability that a < 0 given that b < -2, i.e. P(a < 0 | b < -2).

I have tried the following Python code using the copulas library. I am able to fit the copula to the data, but I am not sure how to calculate the CDF:

import pandas
import copulas.multivariate

df = pandas.read_csv("filename")
cop = copulas.multivariate.GaussianMultivariate()
cop.fit(df)

I know the cdf function can be used to calculate the conditional probability, but I am not sure how to use it here.

lsr729

1 Answer

The cdf method takes an array of input points (one row per point, one column per variable) and returns one value per row: the joint cumulative probability P(A <= a, B <= b) at that point.

Give this code a try:

import numpy as np

# rows of df where b < -2 and a < 0, as an (n_rows, 2) array with columns a, b
inputs = df.loc[(df['b'] < -2) & (df['a'] < 0), ['a', 'b']].to_numpy()

# joint CDF P(A <= a, B <= b) evaluated at each selected row (one value per row)
cdf_values = cop.cdf(inputs)
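
As a model-free sanity check on whatever the copula returns, the conditional probability can also be estimated directly from the data: it is simply the share of rows with a < 0 among the rows with b < -2. The sketch below assumes nothing about the copulas library; the function name `empirical_conditional_prob`, its parameters, and the toy data are made up for illustration.

```python
import numpy as np

def empirical_conditional_prob(a, b, a_max=0.0, b_max=-2.0):
    """Share of observations with a < a_max among those with b < b_max."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    in_condition = b < b_max          # rows that satisfy the conditioning event
    if not in_condition.any():
        return float("nan")           # no rows to condition on
    return float(np.mean(a[in_condition] < a_max))

# Made-up toy data, for illustration only
toy_a = [-1.0, 0.5, -0.3, 2.0, -2.0]
toy_b = [-3.0, -2.5, -1.0, -4.0, 0.0]
print(empirical_conditional_prob(toy_a, toy_b))
```

With the real data this would be called as `empirical_conditional_prob(df['a'], df['b'])`. If only a handful of rows have b < -2, the estimate will be noisy, which is one reason to prefer the fitted copula for tail events.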

Another possible approach (a bit more formal, but longer):

# Evaluation points, with columns in the same order as df (a, b).
# a_upper lies far above the observed range of a, so it no longer constrains the joint CDF.
a_upper = df['a'].max() + 10 * df['a'].std()
points = np.array([[0.0, -2.0], [a_upper, -2.0]])

# One call returns both P(A and B) = P(a <= 0, b <= -2) and P(B) = P(b <= -2)
p_a_and_b, p_b = cop.cdf(points)

# P(A|B) = P(A and B) / P(B), a single probability rather than one value per row
conditional_prob = p_a_and_b / p_b
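
To make the ratio P(A and B) / P(B) concrete, here is a minimal sketch of the same calculation written directly with SciPy for a two-dimensional Gaussian copula: P(A <= a | B <= b) = C(F_A(a), F_B(b)) / F_B(b), where C(u, v) is the bivariate standard-normal CDF evaluated at the normal quantiles of u and v. This is an illustration of the formula, not a description of how the copulas library is implemented; the function name `gaussian_copula_conditional`, the parameter `rho`, and the example numbers (which treat b as standard normal) are assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def gaussian_copula_conditional(u_a, u_b, rho):
    """P(A <= a | B <= b) under a bivariate Gaussian copula.

    u_a, u_b: marginal CDF values F_A(a) and F_B(b), each strictly in (0, 1)
    rho: correlation parameter of the copula, strictly in (-1, 1)
    """
    z = norm.ppf([u_a, u_b])                      # map to the standard-normal scale
    cov = np.array([[1.0, rho], [rho, 1.0]])
    joint = multivariate_normal.cdf(z, mean=[0.0, 0.0], cov=cov)  # C(u_a, u_b)
    return float(joint / u_b)                     # divide by P(B <= b)

# Hypothetical inputs: a = 0 is the median of A, and b = -2 on a standard-normal margin
u_b = norm.cdf(-2.0)
independent = gaussian_copula_conditional(0.5, u_b, rho=0.0)
dependent = gaussian_copula_conditional(0.5, u_b, rho=0.8)
print(independent, dependent)
```

With `rho=0.0` the condition on b carries no information, so the result stays close to the unconditional 0.5; with strong positive dependence the probability moves toward 1.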

Let us know if it works for you. Cheers.

Lorenzo Bassetti
  • The first code gives me the list of cdf as output, but I am looking for a value of probability. – lsr729 Jan 18 '23 at 21:52