
I have started working with Gaussian mixture models (GMM) in the sklearn library. I have 1D data like this:

import numpy as np

np.random.seed(2)
x = np.concatenate([np.random.normal(0, 2, 2000),
                    np.random.normal(5, 5, 2000),
                    np.random.normal(3, 0.5, 600)])

I would like to use sklearn's GaussianMixture class to fit a mixture of 4 Gaussians, so I tried:

from sklearn.mixture import GaussianMixture
clf = GaussianMixture(n_components=4, max_iter=500, random_state=3).fit(x)

Problem

When I run the above code, I receive an error:

Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

The traceback is:

Traceback (most recent call last):
  File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\IPython\core\interactiveshell.py", line 2869, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-44-7de666249812>", line 1, in <module>
    clf= GaussianMixture(n_components = 4, max_iter=500, random_state=3).fit(x)
  File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\sklearn\mixture\base.py", line 194, in fit
    self.fit_predict(X, y)
  File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\sklearn\mixture\base.py", line 220, in fit_predict
    X = _check_X(X, self.n_components, ensure_min_samples=2)
  File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\sklearn\mixture\base.py", line 55, in _check_X
    ensure_min_samples=ensure_min_samples)
  File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\sklearn\utils\validation.py", line 552, in check_array
    "if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[-0.03338572  0.3163226  -1.94596018 ...  2.93448979  2.77931282
  3.28590084].

Question

Is it not possible to fit a GMM to 1D data? I am not sure what mistake I have made. Kindly clarify.

Mari

1 Answer


What you posted tells you how to proceed:

Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

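Your dataset is 1D, so it has a single feature and each value is one sample. sklearn estimators expect a 2D array of shape (n_samples, n_features), therefore: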

x = x.reshape(-1, 1)

and the rest of the code should work.
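For completeness, here is a minimal sketch of the whole thing end to end, reusing the data and parameters from the question. The name X for the reshaped array is my own choice, and the printed checks are only there to illustrate the shapes involved:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Same synthetic 1D data as in the question.
np.random.seed(2)
x = np.concatenate([np.random.normal(0, 2, 2000),
                    np.random.normal(5, 5, 2000),
                    np.random.normal(3, 0.5, 600)])

# scikit-learn expects X with shape (n_samples, n_features).
# reshape(-1, 1) turns the flat array into one column: 4600 samples, 1 feature.
X = x.reshape(-1, 1)

clf = GaussianMixture(n_components=4, max_iter=500, random_state=3).fit(X)

print(X.shape)             # (4600, 1)
print(clf.means_.shape)    # (4, 1): one mean per component, one feature
print(clf.weights_.sum())  # mixture weights sum to 1 (up to floating point)
```

The fitted means come back as a (4, 1) array for the same reason: sklearn keeps the feature axis even when there is only one feature, so use clf.means_.ravel() if you want a flat list of the four means.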

sentence