I have been running fine-tuning experiments for a while, starting from some pre-trained models. I did not develop these pre-trained models; they were made available to me as h5 files. My workflow is to load them and then fine-tune (retrain) them according to my own criteria.

from tensorflow.keras.models import load_model  # assumed import; not shown in my original snippet

pretrained_model = load_model('pretrained.h5', compile=False)
pretrained_model.fit(train, steps_per_epoch=len(train), epochs=num_epochs, shuffle=True,...)

However, since moving to a new computer (which has a GPU available through Apple Metal), I get an exception whenever I try to fit the model. What is the problem here, and is there any way to make this work without redoing the pre-training?

Note 1: In the similar problems I have seen on GitHub, the users always have access to the original code, which is not my case. I only have access to the pre-trained h5 model files.

Note 2: I know that the original code makes use of custom classes. I am leaving this information here because I don't know whether it affects this issue.

Exception has occurred: InvalidArgumentError
Cannot assign a device for operation model_1/conv_1_3x3_conv/Conv2D/ReadVariableOp: Could not satisfy explicit device specification '' because the node {{colocation_node model_1/conv_1_3x3_conv/Conv2D/ReadVariableOp}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0]. 
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
Equal: CPU 
AssignSubVariableOp: GPU CPU 
GreaterEqual: GPU CPU 
FloorDiv: CPU 
Sqrt: GPU CPU 
NoOp: GPU CPU 
Pow: GPU CPU 
Mul: CPU 
AssignVariableOp: GPU CPU 
Cast: GPU CPU 
SelectV2: GPU CPU 
ReadVariableOp: GPU CPU 
_Arg: GPU CPU 
Square: GPU CPU 
RealDiv: GPU CPU 
Sub: GPU CPU 
AddV2: GPU CPU 
Const: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  model_1_conv_1_3x3_conv_conv2d_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  lookahead_lookahead_update_mul_2_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  lookahead_lookahead_update_mul_4_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  lookahead_lookahead_update_sub_12_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  model_1/conv_1_3x3_conv/Conv2D/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/Const (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_4 (Cast) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Pow (Pow) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Pow_1 (Pow) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_1/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_1 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_1 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_2/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_2 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_1 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_3 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_2/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/mul_2 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_4/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_4 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_3 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_1 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_5/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_5 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_1 (ReadVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_2 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_4/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/mul_4 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_6/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_6 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Square (Square) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_5 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_2 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp_1 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_7/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_7 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_2 (ReadVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_3 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Sqrt (Sqrt) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_8/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_8 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_9/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_9 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_4 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_10/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_10 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_6 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_11/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/sub_11 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_5 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_7 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_6 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Sqrt_1 (Sqrt) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/GreaterEqual (GreaterEqual) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_8 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_3 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/truediv_7 (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/SelectV2 (SelectV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_9 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignSubVariableOp (AssignSubVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/group_deps (NoOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_4/y (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/add_4 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_6/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_6 (Cast) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Cast_7/x (Const) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_4 (ReadVariableOp) 
  Lookahead/Lookahead/update/sub_12/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/sub_12 (Sub) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_10 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/ReadVariableOp_5 (ReadVariableOp) 
  Lookahead/Lookahead/update/add_5 (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/floordiv (FloorDiv) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/mul_11 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/Equal (Equal) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/SelectV2_1/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/SelectV2_1 (SelectV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp_2 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/SelectV2_2/ReadVariableOp (ReadVariableOp) 
  Lookahead/Lookahead/update/SelectV2_2 (SelectV2) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/AssignVariableOp_3 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/group_deps_1 (NoOp) /job:localhost/replica:0/task:0/device:GPU:0
  Lookahead/Lookahead/update/group_deps_2 (NoOp) /job:localhost/replica:0/task:0/device:GPU:0

[[{{node model_1/conv_1_3x3_conv/Conv2D/ReadVariableOp}}]] [Op:__inference_train_function_29742]
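
To narrow down which op types in the colocation group cannot be placed on the GPU, the "types and supported devices" list in the log above can be filtered for entries that list only CPU. The following is a minimal sketch, assuming an excerpt of the log is pasted in as a string; the helper name `cpu_only_ops` is hypothetical and is not part of TensorFlow.

```python
import re

# Excerpt of the "Colocation Debug Info" op list copied from the exception above.
COLOCATION_LOG = """\
Equal: CPU
AssignSubVariableOp: GPU CPU
GreaterEqual: GPU CPU
FloorDiv: CPU
Sqrt: GPU CPU
Mul: CPU
ReadVariableOp: GPU CPU
"""

# One log line looks like "<OpType>: <device> [<device> ...]".
LINE_PATTERN = re.compile(r"^\s*(\w+):\s*((?:GPU|CPU)(?:\s+(?:GPU|CPU))*)\s*$")


def cpu_only_ops(log_text):
    """Return the op types whose supported-device list does not include GPU."""
    ops = []
    for line in log_text.splitlines():
        match = LINE_PATTERN.match(line)
        if not match:
            continue  # skip lines that are not "<OpType>: <devices>" entries
        op_type, devices = match.group(1), match.group(2).split()
        if "GPU" not in devices:
            ops.append(op_type)
    return ops


print(cpu_only_ops(COLOCATION_LOG))
```

For the excerpt above this reports `Equal`, `FloorDiv` and `Mul`. All three also appear under `Lookahead/Lookahead/update/` in the colocation members list, which suggests (but does not prove) that the optimizer update step is where the incompatible placement comes from.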
MrT77
  • Can you share standalone code to replicate your issue, so that we can try to help you? –  Sep 13 '21 at 02:35
  • @TFer2: Sorry for the late reply. I decided to stop reusing the old models and tried to recreate them from scratch, but unfortunately the errors are the same, so I guess there is an incompatibility somewhere that I can't find. I submitted a new question on Stack Overflow (https://stackoverflow.com/questions/69393273/errors-while-using-blstm-in-tensorflow-cannot-assign-a-device-for-operation), which contains the most stripped-down version of the code I could produce. – MrT77 Sep 30 '21 at 13:44
