
You can calculate skew and kurtosis with the pd.DataFrame.skew and pd.DataFrame.kurt methods.

However, there is no convenient way to calculate the coskew or cokurtosis between two variables, let alone the full coskew or cokurtosis matrix.


Consider the pd.DataFrame df

import pandas as pd
import numpy as np

np.random.seed([3,1415])
df = pd.DataFrame(np.random.rand(10, 2), columns=list('ab'))

df

          a         b
0  0.444939  0.407554
1  0.460148  0.465239
2  0.462691  0.016545
3  0.850445  0.817744
4  0.777962  0.757983
5  0.934829  0.831104
6  0.879891  0.926879
7  0.721535  0.117642
8  0.145906  0.199844
9  0.437564  0.100702
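
For reference, the built-in per-column methods mentioned above return a Series with one value per column, not a matrix relating the columns to each other:

df.skew()   # per-column skewness
df.kurt()   # per-column (excess) kurtosis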

How do I calculate the coskew and cokurtosis of a and b?

piRSquared
  • **the current best answer is not correct** since it calculates the coskewness and cokurtosis matrix as square matrices. The coskewness and cokurtosis matrices are both tensors, which even when flattened would be rectangular arrays – develarist Dec 06 '20 at 12:50

1 Answer


Calculating coskew

My interpretation of coskew is the "correlation" between one series and the variance of another. As such, you can actually have two types of coskew depending on which series we calculate the variance of. Wikipedia shows these two formulas:

'left':  S(X, Y) = E[(X - E[X])^2 (Y - E[Y])] / (σ_X^2 σ_Y)
'right': S(X, Y) = E[(X - E[X]) (Y - E[Y])^2] / (σ_X σ_Y^2)

Fortunately, when we calculate the coskew matrix, one is the transpose of the other.

def coskew(df, bias=False):
    v = df.values
    s1 = sigma = v.std(0, keepdims=True)
    means = v.mean(0, keepdims=True)

    # means is 1 x n (n is the number of columns);
    # the difference below broadcasts appropriately
    v1 = v - means

    s2 = sigma ** 2

    v2 = v1 ** 2

    m = v.shape[0]

    skew = pd.DataFrame(v2.T.dot(v1) / s2.T.dot(s1) / m, df.columns, df.columns)

    if not bias:
        skew *= ((m - 1) * m) ** .5 / (m - 2)

    return skew

demonstration

coskew(df)

          a         b
a -0.369380  0.096974
b  0.325311  0.067020

We can compare this to df.skew() and check that the diagonal entries match:

df.skew()

a   -0.36938
b    0.06702
dtype: float64
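
The off-diagonal entries can also be sanity-checked element-wise. A minimal sketch with plain NumPy, compared against the biased estimate (coskew(df, bias=True)) so the small-sample correction stays out of the way; a and b are the columns of the df above:

a, b = df['a'].values, df['b'].values

# 'left' coskew of (a, b): E[(a - mean_a)^2 (b - mean_b)] / (std_a^2 * std_b)
num = ((a - a.mean()) ** 2 * (b - b.mean())).mean()
den = a.var() * b.std()          # population moments (ddof=0), matching v.std(0) above
print(num / den)                 # should match coskew(df, bias=True).loc['a', 'b']

# the 'right' variant is simply the transpose: coskew(df).T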

Calculating cokurtosis

My interpretation of cokurtosis is one of two things:

  1. "correlation" between a series and the skew of another
  2. "correlation" between the variances of two series

For option 1 we again have 'left' and 'right' variants that, in matrix form, are transposes of one another, so we will only focus on the left variant. That leaves a total of two variations to calculate.

'left':   K(X, Y) = E[(X - E[X])^3 (Y - E[Y])] / (σ_X^3 σ_Y)
'middle': K(X, Y) = E[(X - E[X])^2 (Y - E[Y])^2] / (σ_X^2 σ_Y^2)

def cokurt(df, bias=False, fisher=True, variant='middle'):
    v = df.values
    s1 = sigma = v.std(0, keepdims=True)
    means = v.mean(0, keepdims=True)

    # means is 1 x n (n is the number of columns);
    # the difference below broadcasts appropriately
    v1 = v - means

    s2 = sigma ** 2
    s3 = sigma ** 3

    v2 = v1 ** 2
    v3 = v1 ** 3

    m = v.shape[0]

    if variant in ['left', 'right']:
        kurt = pd.DataFrame(v3.T.dot(v1) / s3.T.dot(s1) / m, df.columns, df.columns)
        if variant == 'right':
            kurt = kurt.T
    elif variant == 'middle':
        kurt = pd.DataFrame(v2.T.dot(v2) / s2.T.dot(s2) / m, df.columns, df.columns)

    if not bias:
        kurt = kurt * (m ** 2 - 1) / (m - 2) / (m - 3) - 3 * (m - 1) ** 2 / (m - 2) / (m - 3)
    if not fisher:
        kurt += 3

    return kurt

demonstration

cokurt(df, variant='middle', bias=False, fisher=False)

          a        b
a  1.882817  0.86649
b  0.866490  1.63200

cokurt(df, variant='left', bias=False, fisher=False)

          a        b
a  1.882817  0.19175
b -0.020567  1.63200

The diagonal should equal the kurtosis of each column (df.kurtosis() returns excess kurtosis, hence the + 3):

df.kurtosis() + 3

a    1.882817
b    1.632000
dtype: float64
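
As with coskew, the off-diagonals can be sanity-checked element-wise against the formulas above. A minimal sketch with plain NumPy, compared against the biased, unshifted estimates (cokurt(df, bias=True), which skips both the small-sample correction and the Fisher adjustment):

a, b = df['a'].values, df['b'].values

# 'middle' cokurtosis of (a, b): E[(a - mean_a)^2 (b - mean_b)^2] / (var_a * var_b)
middle = ((a - a.mean()) ** 2 * (b - b.mean()) ** 2).mean() / (a.var() * b.var())
print(middle)   # should match cokurt(df, bias=True, variant='middle').loc['a', 'b']

# 'left' cokurtosis of (a, b): E[(a - mean_a)^3 (b - mean_b)] / (std_a^3 * std_b)
left = ((a - a.mean()) ** 3 * (b - b.mean())).mean() / (a.std() ** 3 * b.std())
print(left)     # should match cokurt(df, bias=True, variant='left').loc['a', 'b']
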
piRSquared
  • Thank you very much for the detailed answer! worth an upvote :) Could you point to a real-world application of those that maybe doesn't have to do with high-level finance analysis, or at least something that isn't very abstract? I would be very interested in expert insights to get an intuition for those magnitudes – fr_andres Nov 30 '17 at 06:03
  • Skew of a dataset is a measure of how non-symmetrical its distribution is (leans to the left or right). You can envision this as the mean of the distribution being pulled to the right by an outlier while the median is not influenced; the mean would be to the right of the median in a right-skewed distribution. It ends up being a measure of how related the difference of a datum from the mean is to the square of that difference. Co-skew, in turn, is a measure of how related the difference of a datum from its mean is to the square of the difference of a datum in another dataset from its – piRSquared Nov 30 '17 at 06:11
  • mean. Co-kurtosis has 2 interpretations. How related is one series' squared difference from its mean to another series squared difference to its mean. Or, the relationship of difference from mean relative to skew of another dataset. I apologize as I'm aware that this probably doesn't help clarify much. – piRSquared Nov 30 '17 at 06:13
  • I guess it doesn't get less abstract than that haha thank you anyway for your quick answer and insights – fr_andres Nov 30 '17 at 06:18
  • Well I guess I’ll have to go read ;-) and find out – piRSquared Dec 05 '20 at 15:15
  • **the code in this answer is not correct** since it calculates the coskewness and cokurtosis matrix as square matrices. The coskewness and cokurtosis matrices are both tensors, which even when flattened would be rectangular arrays, not square arrays – develarist Dec 06 '20 at 12:50
  • Feel free to add your own answer. – piRSquared Dec 06 '20 at 16:39