I have an Excel file that is 199 x 15 (199 rows and 15 columns). My job is to compute a statistic for every pair of the 15 columns: (1,2), (1,3), (1,4), ..., (14,15), which comes to 105 (15×14/2) pairs of variables. Every column contains only 0s and 1s. The statistic I want to calculate is the chi-square. Here is what I have so far:

import itertools
import matplotlib.pyplot as plt
import numpy
import pandas as pd
import numpy as np
from scipy.stats import chi2_contingency, chi2
from bioinfokit.analys import stat, get_data, chisq
from itertools import combinations

df = pd.read_csv(r'C:\Users\ether\OneDrive\Desktop\info.csv')


data = np.random.randint(0, 5, (199, 15))
p_value_matrix = np.zeros((15, 15))
for i, j in combinations(range(15), 2):
    _, p_val = chi2(data[:, i][:, None], data[:, j][:, None])
    p_value_matrix[i, j] = p_value_matrix[j, i] = p_val
    if p_val < 0.05:
        print('possibly dependent: {} -- {}'.format(i, j))

However, when I run this, I keep getting this error:

_, p_val = chi2(data[:, i][:, None], data[:, j][:, None])
TypeError: cannot unpack non-iterable rv_frozen object

I do not know how to fix it. Can someone help me fix it?

hershey10
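A note on the error quoted in the question: `scipy.stats.chi2` is the chi-square distribution object, not a test function. Calling it with arguments returns a frozen distribution (`rv_frozen`), which is why it cannot be unpacked into two values. The test of independence is `scipy.stats.chi2_contingency`, which expects a contingency table rather than two raw columns. Below is a minimal sketch of one way to run it over all 105 pairs, assuming each pair is first cross-tabulated with `pd.crosstab`. The helper name `pairwise_chi2`, the column names `v0`..`v14`, and the random 0/1 demo frame are hypothetical stand-ins for the real CSV; for 2x2 tables `chi2_contingency` applies Yates' continuity correction by default.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def pairwise_chi2(df):
    """Return a symmetric DataFrame of chi-square p-values for every column pair.

    Hypothetical helper for illustration; the diagonal is left as NaN.
    """
    cols = list(df.columns)
    p_values = pd.DataFrame(np.nan, index=cols, columns=cols)
    for a, b in combinations(cols, 2):
        # Build the contingency table of observed counts (2x2 for 0/1 data).
        table = pd.crosstab(df[a], df[b])
        # A constant column gives a one-row or one-column table; skip it.
        if table.shape[0] < 2 or table.shape[1] < 2:
            continue
        # chi2_contingency returns (statistic, p-value, dof, expected counts).
        _, p_val, _, _ = chi2_contingency(table)
        p_values.loc[a, b] = p_values.loc[b, a] = p_val
    return p_values


# Demo data: random 0/1 values standing in for the real file (assumption).
# With the real data, replace this with df = pd.read_csv(<path to the CSV>).
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.integers(0, 2, size=(199, 15)),
    columns=[f"v{k}" for k in range(15)],
)
demo["v1"] = demo["v0"]  # force one clearly dependent pair for the demo

p_values = pairwise_chi2(demo)
for a, b in combinations(demo.columns, 2):
    if p_values.loc[a, b] < 0.05:
        print("possibly dependent: {} -- {}".format(a, b))
```

With 105 tests at a 0.05 threshold, a few pairs can fall below the cutoff by chance alone, so a multiple-comparison correction may be worth considering before reading too much into individual hits.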