auto-keras model optimization crash

Question

I am using autokeras (version 0.3.6) to find the best model for my classification problem. The search can take a long time, and giving it more time generally produces better results. The problem is that it sometimes crashes. I am using Ubuntu 16.04 LTS with Python 3.6.

I am using the following code:

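import autokeras
from autokeras import ImageClassifier

# Assumed: the classifier is created like this (this line was missing from the snippet)
clf = ImageClassifier(verbose=True)
clf.fit(trdata_X, trdata_Y, time_limit=1 * 60)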

trdata_X and trdata_Y are my data, provided as NumPy arrays.

I get errors like this:

    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [64, 192, 1, 1], expected input[128, 256, 32, 32] to have 192 channels, but got 256 channels instead

Here are the full logs:

Preprocessing the images.
Preprocessing finished.
Initializing search.
Initialization finished.
+----------------------------------------------+
|               Training model 0               |
+----------------------------------------------+

No loss decrease after 5 epochs.
Saving model.
+--------------------------------------------------------------------------+
|        Model ID        |          Loss          |      Metric Value      |
+--------------------------------------------------------------------------+
|           0            |   0.6538878202438354   |         0.6875         |
+--------------------------------------------------------------------------+
+----------------------------------------------+
|               Training model 1               |
+----------------------------------------------+
Epoch-1, Current Metric - 0:   0%|                                        | 0/2 [00:00<?, ? batch/s]
Current model size is too big. Discontinuing training this model to search for other models.

+----------------------------------------------+
|               Training model 2               |
+----------------------------------------------+
Epoch-1, Current Metric - 0:   0%|                                        | 0/2 [00:00<?, ? batch/s]
Current model size is too big. Discontinuing training this model to search for other models.

+----------------------------------------------+
|               Training model 3               |
+----------------------------------------------+

No loss decrease after 5 epochs.
Saving model.
+--------------------------------------------------------------------------+
|        Model ID        |          Loss          |      Metric Value      |
+--------------------------------------------------------------------------+
|           3            |   0.6291704177856445   |         0.6875         |
+--------------------------------------------------------------------------+
+----------------------------------------------+
|               Training model 4               |
+----------------------------------------------+

No loss decrease after 5 epochs.
Saving model.
+--------------------------------------------------------------------------+
|        Model ID        |          Loss          |      Metric Value      |
+--------------------------------------------------------------------------+
|           4            |   0.6316376447677612   |         0.6875         |
+--------------------------------------------------------------------------+
+----------------------------------------------+
|               Training model 5               |
+----------------------------------------------+

No loss decrease after 5 epochs.
Saving model.
+--------------------------------------------------------------------------+
|        Model ID        |          Loss          |      Metric Value      |
+--------------------------------------------------------------------------+
|           5            |    0.62800053358078    |         0.6875         |
+--------------------------------------------------------------------------+
+----------------------------------------------+
|               Training model 6               |
+----------------------------------------------+

No loss decrease after 5 epochs.
Saving model.
+--------------------------------------------------------------------------+
|        Model ID        |          Loss          |      Metric Value      |
+--------------------------------------------------------------------------+
|           6            |   0.6313011765480041   |         0.6875         |
+--------------------------------------------------------------------------+
+----------------------------------------------+
|               Training model 7               |
+----------------------------------------------+

No loss decrease after 5 epochs.
Saving model.
+--------------------------------------------------------------------------+
|        Model ID        |          Loss          |      Metric Value      |
+--------------------------------------------------------------------------+
|           7            |   0.632081127166748    |         0.6875         |
+--------------------------------------------------------------------------+
+----------------------------------------------+
|               Training model 8               |
+----------------------------------------------+
Epoch-1, Current Metric - 0:   0%|                                        | 0/2 [00:00<?, ? batch/s]Process ForkProcess-9:
Traceback (most recent call last):
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/site-packages/autokeras/search.py", line 350, in train
    raise e
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/site-packages/autokeras/search.py", line 343, in train
    verbose=verbose).train_model(**trainer_args)
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/site-packages/autokeras/nn/model_trainer.py", line 137, in train_model
    self._train()
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/site-packages/autokeras/nn/model_trainer.py", line 173, in _train
    outputs = self.model(inputs)
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/site-packages/autokeras/nn/graph.py", line 686, in forward
    temp_tensor = torch_layer(edge_input_tensor)
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/rapsodo/workspace_mike3352/anaconda2/envs/ali_tf_py36/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [64, 192, 1, 1], expected input[128, 256, 64, 64] to have 192 channels, but got 256 channels instead
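
To make the message easier to read: as far as I understand it, a 2-D convolution weight has shape [out_channels, in_channels / groups, kH, kW], and the input's channel dimension must equal weight[1] * groups. The sketch below is plain Python that only mimics that check with the shapes from the traceback; `check_conv2d_channels` is a hypothetical helper of mine, not a PyTorch or autokeras function.

```python
def check_conv2d_channels(weight_shape, input_shape, groups=1):
    """Mimic the channel check a 2-D convolution performs before running.

    weight_shape: [out_channels, in_channels // groups, kH, kW]
    input_shape:  [batch, channels, height, width]
    """
    expected = weight_shape[1] * groups  # channels the layer was built for
    actual = input_shape[1]              # channels the input actually has
    if actual != expected:
        raise RuntimeError(
            "Given groups={}, weight of size {}, expected input{} to have "
            "{} channels, but got {} channels instead".format(
                groups, list(weight_shape), list(input_shape), expected, actual))
    return True


# Shapes copied from the traceback: a 1x1 conv built for 192 input
# channels receives a tensor with 256 channels.
try:
    check_conv2d_channels([64, 192, 1, 1], [128, 256, 64, 64])
except RuntimeError as err:
    message = str(err)
    print(message)
```

So the generated graph seems to contain a layer whose expected input width (192) no longer matches what the previous layer produces (256).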

For now I rerun the code multiple times until I get results. I would appreciate it if anyone could help me find a solution to this problem.
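
For reference, the manual rerunning could be automated roughly as follows. This is only a sketch of the workaround, not a fix: `run_with_retries` and `flaky_search` are hypothetical names, and it assumes the failure reaches the calling process as a RuntimeError, which may not be the case here because the traceback comes from a ForkProcess.

```python
import time


def run_with_retries(task, max_attempts=3, delay_seconds=0.0):
    """Call task() until it succeeds or max_attempts is reached."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except RuntimeError as err:
            last_error = err
            print("Attempt {} failed: {}".format(attempt, err))
            time.sleep(delay_seconds)
    # Every attempt failed: re-raise the last error seen
    raise last_error


# Stand-in for the real search: fails twice, then succeeds.
calls = {"count": 0}


def flaky_search():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated crash")
    return "finished"


result = run_with_retries(flaky_search, max_attempts=5)
print(result, "after", calls["count"], "attempts")
```

In my case the task would be something like `lambda: clf.fit(trdata_X, trdata_Y, time_limit=1 * 60)`.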
