I am implementing an EM algorithm. The model should select a random subset of columns and then calculate the likelihood. For example, if the dataset has 39 attributes, the model selects only 19 of them and prints the likelihood.
I defined several functions: initialize_model_params, calculate_expected_hidden_vars, update_model_params, and calc_likelihood.
The problem occurs when calculating the likelihood. I get this error:
KeyError: 0
It seems to be related to indexing.
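The post does not say how data is loaded, so the following is only a minimal reproduction under the assumption that data is a pandas DataFrame with named columns (the column names and values here are made up). In that case data[i] is a column lookup by label, not a row lookup by position, which would explain KeyError: 0:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real dataset: 5 rows, 4 named columns.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(5, 4)), columns=["a", "b", "c", "d"])

# data[0] looks for a column labelled 0; no such column exists here.
try:
    data[0]
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# Positional row access goes through .iloc or the underlying NumPy array.
first_row_iloc = data.iloc[0].to_numpy()
first_row_array = data.to_numpy()[0]
```

If data is already a NumPy array, this explanation does not apply and the cause lies elsewhere.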
The likelihood function:
import numpy as np
from scipy.stats import multivariate_normal

def calc_likelihood(data, means, covariances, weights):
    n_samples, n_features = data.shape
    n_components = len(means)
    # Initialize the log-likelihood to 0
    likelihood = 0
    # Loop over each sample in the data
    for i in range(n_samples):
        sample_likelihood = 0
        # Loop over each component in the mixture
        for j in range(n_components):
            # Weighted probability of the sample under component j
            component_prob = weights[j] * multivariate_normal.pdf(data[i], mean=means[j], cov=covariances[j])
            # Add the component probability to the sample likelihood
            sample_likelihood += component_prob
        # Add the log of the sample likelihood to the overall log-likelihood
        likelihood += np.log(sample_likelihood)
    return likelihood
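For comparison, here is a sketch of the same mixture log-likelihood computed on a NumPy array in log space, which avoids both label-based indexing and underflow when the densities are tiny. The name calc_log_likelihood is hypothetical, and the sketch assumes means, covariances, and weights are sequences with one entry per component:

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal


def calc_log_likelihood(X, means, covariances, weights):
    """Total log-likelihood of the rows of X under a Gaussian mixture."""
    # np.asarray also accepts a DataFrame and returns its values.
    X = np.asarray(X, dtype=float)
    # Column j holds log(weights[j]) + log N(X[i] | means[j], covariances[j]).
    log_probs = np.column_stack([
        np.log(weights[j])
        + multivariate_normal.logpdf(X, mean=means[j], cov=covariances[j])
        for j in range(len(means))
    ])
    # Sum over components inside the log, then over samples outside it.
    return float(logsumexp(log_probs, axis=1).sum())
```

The array X must have the same number of columns as each entry of means, so if the parameters were fitted on a subset of columns, only that subset should be passed in.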
The EM function:
import random

def em(data, columns, num_iterations):
    # Initialize the model parameters
    model_params = initialize_model_params(data, columns)
    # Select a random subset containing half of the columns
    selected_columns = random.sample(list(columns), k=len(columns) // 2)
    print(selected_columns)
    for i in range(num_iterations):
        # E step: calculate the expected value of the hidden variables using the selected columns
        expected_hidden_vars = calculate_expected_hidden_vars(data, selected_columns, model_params)
        # M step: update the estimates of the model parameters using the selected columns and the expected hidden variables
        model_params = update_model_params(data, selected_columns, expected_hidden_vars)
        means, covariances, weights = model_params
        print('params', model_params)
        # Calculate the likelihood of the data given the current model parameters
        likelihood = calc_likelihood(data, means, covariances, weights)
        print('likelihood : ', likelihood)
    # Return the maximum likelihood estimates of the model parameters
    return model_params
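Note that em passes the full data to calc_likelihood, while the parameters are estimated from selected_columns only. A minimal sketch of extracting just those columns as a NumPy array before the likelihood call, assuming data is a pandas DataFrame and columns holds its column labels (the helper name select_as_array and the sample data are hypothetical):

```python
import random

import numpy as np
import pandas as pd


def select_as_array(data, selected_columns):
    """Return the chosen columns of a DataFrame as a 2-D float array."""
    return data[list(selected_columns)].to_numpy(dtype=float)


# Hypothetical data standing in for the real 39-attribute dataset.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(10, 6)), columns=list("abcdef"))

random.seed(0)
selected_columns = random.sample(list(data.columns), k=len(data.columns) // 2)
X = select_as_array(data, selected_columns)

# Rows can now be indexed by position, e.g. X[0], and the number of
# columns matches the number of selected attributes.
print(X.shape)
```

Passing X instead of data to calc_likelihood would then make data[i] a positional row lookup with the same dimensionality as the fitted means.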
How can I fix this? Can anyone help me?