I am trying to cluster my data with a Gaussian mixture model in Octave. As a first step, I am fitting the data with the `fitgmdist` function from the statistics package, but I get the following error:

> error: matrix cannot be indexed with .  
> error: called from fitgmdist at line 486 column 14

whenever my source data gets bigger than a certain limit.

Attached below is a sample of the code I am using:

    clear; close all; clc
    pkg load statistics

    k = 30;                  % target number of clusters
    X;                       % placeholder: X is the input data to be clustered
    X_final = normalize(X);  % normalize the input data

    gm = fitgmdist( X_final, k, 'start', 'plus',
                    'CovarianceType', 'diagonal',
                    'RegularizationValue', 0.0001
                  );
Amr Amer
  • I doubt this has anything to do with your problem, but there is a typo in the code above: you are calling fitgmdist with `x_final` instead of `X_final`. If this typo does indeed exist in your original code, perhaps you are calling fitgmdist with something that has the wrong type. Otherwise, there is no obvious error here, you would have to debug and see why fitgmdist fails at that line, and work your way up to see in what way the input was bad to cause that error. – Tasos Papastylianou Oct 21 '20 at 07:31
  • I checked again, and I've deleted my answer below, because I realised that it had nothing to do with what you're experiencing. My guess is that you probably have invalid data (e.g. -Inf or NaN values), so line 467, `if (log_likeli > best)`, always fails because `nan > number` always gives false. As a result you never create the `best_params` struct, leaving it uninitialised as `[]`. The Octave error you get tells you that you tried to access a matrix (i.e. `[]`) as if it were a struct. Therefore this isn't an Octave bug after all. (Though I agree the error could be more informative.) – Tasos Papastylianou Oct 22 '20 at 10:02
  • Many thanks, Tasos. I checked the input data file in Excel using the ISNUMBER() function, and all the inputs appear to be numbers (i.e. there are no -Inf or NaN values). Oddly, it looks like a size issue: when I reduce the data size or the target number of clusters, the code runs with no errors. – Amr Amer Oct 30 '20 at 12:34
  • Are you able to edit your post above to add the relevant data (in copy-pastable form)? – Tasos Papastylianou Oct 30 '20 at 14:43
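
The NaN hypothesis raised in the comments can be checked directly in Octave rather than in Excel. The following is only a minimal sketch: the small matrix `X` is hypothetical stand-in data (not the asker's data), and the variable names `bad_per_column` and `nan_beats_number` are made up for illustration.

```octave
% Hypothetical stand-in data; replace with the real X from the question.
X = [1 2; 3 4; 5 NaN; 7 8];

% Count non-finite entries (NaN, Inf, -Inf) in each column.
bad_per_column = sum(~isfinite(X), 1);

if any(bad_per_column)
  printf("Non-finite values in column(s): %s\n", mat2str(find(bad_per_column)));
else
  disp("All entries are finite.");
end

% NaN compares false against any number, which is the failure mode
% suggested in the comments for the `log_likeli > best` test.
nan_beats_number = (NaN > -Inf);
```

Running the same check on `X_final` after the normalization step may also be worthwhile, since it would catch any non-finite values introduced by the scaling itself rather than present in the raw file.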
