
I was working through the image classifier ML.NET sample code at https://learn.microsoft.com/en-US/dotnet/machine-learning/tutorials/image-classification

The classification there uses the following Inception settings:

private struct InceptionSettings
{
    public const int ImageHeight = 224;    // input height expected by the model
    public const int ImageWidth = 224;     // input width expected by the model
    public const float Mean = 117;         // mean pixel value subtracted during pixel extraction
    public const float Scale = 1;          // pixel scaling factor
    public const bool ChannelsLast = true; // interleaved pixel colors (NHWC layout)
}
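
For context, this is roughly how the tutorial wires those settings into the pipeline; the column names and the _imagesFolder / _inceptionTensorFlowModel variables are the tutorial's, reproduced from memory:

var pipeline = mlContext.Transforms.LoadImages(
        outputColumnName: "input",
        imageFolder: _imagesFolder,
        inputColumnName: nameof(ImageData.ImagePath))
    // Source images of any size are scaled to ImageWidth x ImageHeight here,
    // before pixel extraction.
    .Append(mlContext.Transforms.ResizeImages(
        outputColumnName: "input",
        imageWidth: InceptionSettings.ImageWidth,
        imageHeight: InceptionSettings.ImageHeight,
        inputColumnName: "input"))
    .Append(mlContext.Transforms.ExtractPixels(
        outputColumnName: "input",
        interleavePixelColors: InceptionSettings.ChannelsLast,
        offsetImage: InceptionSettings.Mean))
    .Append(mlContext.Model.LoadTensorFlowModel(_inceptionTensorFlowModel)
        .ScoreTensorFlowModel(
            outputColumnNames: new[] { "softmax2_pre_activation" },
            inputColumnNames: new[] { "input" },
            addBatchDimensionInput: true));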

All of this runs against the TensorFlow inception5h model and appears to work. What is unclear to me, however, is what breaks when I change Height and Width from 224 to, say, 64 to reduce the load and the precision of the prediction: the reuse-and-tune-Inception-model part crashes almost instantly with

System.InvalidOperationException: Splitter/consolidator worker encountered exception while consuming source data ---> Microsoft.ML.Transforms.TensorFlow.TFException: Computed output size would be negative: -4 [input_size: 2, effective_filter_size: 7, stride: 1]
     [[{{node avgpool0}}]]
   at Microsoft.ML.Transforms.TensorFlow.TFStatus.CheckMaybeRaise(TFStatus incomingStatus, Boolean last)
   at Microsoft.ML.Transforms.TensorFlow.TFSession.Run(TFOutput[] inputs, TFTensor[] inputValues, TFOutput[] outputs, TFOperation[] targetOpers, TFBuffer runMetadata, TFBuffer runOptions, TFStatus status)
   at Microsoft.ML.Transforms.TensorFlow.TFSession.Runner.Run(TFStatus status)
   at Microsoft.ML.Transforms.TensorFlowTransformer.Mapper.UpdateCacheIfNeeded(Int64 position, ITensorValueGetter[] srcTensorGetters, String[] activeOutputColNames, OutputCache outputCache)
   at Microsoft.ML.Transforms.TensorFlowTransformer.Mapper.<>c__DisplayClass8_0`1.<MakeGetter>b__3(VBuffer`1& dst)
   at Microsoft.ML.Data.DataViewUtils.Splitter.InPipe.Impl`1.Fill()
   at Microsoft.ML.Data.DataViewUtils.Splitter.<>c__DisplayClass5_1.<ConsolidateCore>b__2()
   --- End of inner exception stack trace ---
   at Microsoft.ML.Data.DataViewUtils.Splitter.Batch.SetAll(OutPipe[] pipes)
   at Microsoft.ML.Data.DataViewUtils.Splitter.Cursor.MoveNextCore()
   at Microsoft.ML.Data.RootCursorBase.MoveNext()
   at Microsoft.ML.Trainers.TrainingCursorBase.MoveNext()
   at Microsoft.ML.Trainers.LbfgsTrainerBase`3.TrainCore(IChannel ch, RoleMappedData data)
   at Microsoft.ML.Trainers.LbfgsTrainerBase`3.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at D:\My\MLTrainer.Program.ReuseAndTuneInceptionModel(MLContext mlContext, TrainerData trainerData, String dataLocation, String inputModelLocation, String outputModelLocation) in MLTrainer\Program.cs:line 66
   at MLTrainer.Program.Main(String[] args) in D:\My\MLTrainer\Program.cs:line 29
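
If I read the numbers in the message correctly, avgpool0 is a 7x7 pooling window with (presumably) VALID padding, so its output size is input_size - effective_filter_size + 1 = 2 - 7 + 1 = -4. That would mean the feature map reaching avgpool0 from my 64x64 input is only 2x2, i.e. the network has downsampled it by a factor of 32 (and 224/32 = 7 would fit the pooling window exactly), but I may be misreading this.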

Now I don't understand what I can do and where in the details the issue is buried. Is the pre-trained model already fixed to this somewhat strange resolution? The resolution itself does not seem to be used anywhere else, nor do I get why the splitter doesn't like it.

Do I hit some sort of minimum size condition I am not aware of? If so, what are the boundaries? I tried 1024x1024, for instance, which failed with another error.

Any hints on that are appreciated :)

Samuel

1 Answer


You could install Netron to look at your model.

You'll see that the first layer declares an input shape of something like Nx224x224x3.

The model is fixed to this input resolution because it was trained on a large dataset at exactly this resolution. You could change the input layer with e.g. Keras, but I have never tried to change the input layer in ML.NET.
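
As a minimal sketch (untested; it assumes the Microsoft.ML.TensorFlow package and the tensorflow_inception_graph.pb file from the tutorial), you can also dump the graph's schema from ML.NET itself instead of installing Netron:

using System;
using Microsoft.ML;

var mlContext = new MLContext();
var tfModel = mlContext.Model.LoadTensorFlowModel("tensorflow_inception_graph.pb");

// Print every column the graph exposes; the input column should report
// a vector shaped like 224x224x3.
foreach (var column in tfModel.GetModelSchema())
    Console.WriteLine($"{column.Name}: {column.Type}");

Since that shape is baked into the graph, the usual fix in ML.NET is to leave ResizeImages at 224x224 and let it shrink your source images, rather than shrinking the model's input.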

Alex_lit