I have implemented a few algorithms for multi-class semantic segmentation and am now at the stage of comparing them against each other and evaluating the results. Are there any best-practice metrics or formulae for comparing segmentation results, and what are their respective advantages?
So far I have looked into the problem of class imbalance, which points to the need for something more robust than simple pixel counting. That led me to the Sørensen-Dice coefficient. While it seems appropriate for single-class scenarios (which I could apply to my current problem), I am looking for something more directly suited to the multi-class task at hand. Thanks.
Note: The evaluation method does not need to be fast or efficient, or to run in real time; I am only interested in the results.
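
For concreteness, here is a minimal sketch of what applying the Dice coefficient per class and then averaging might look like. It assumes the ground truth and the prediction are integer label maps of the same shape with labels in `0..num_classes-1`; the function name `per_class_dice` and its parameters are illustrative and not taken from any library.

```python
import numpy as np

def per_class_dice(y_true, y_pred, num_classes):
    """Return an array of Dice scores, one per class (NaN if a class is absent)."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()

    # confusion[i, j] = number of pixels with true label i predicted as label j
    confusion = np.bincount(
        num_classes * y_true + y_pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    tp = np.diag(confusion).astype(float)   # correctly labelled pixels per class
    fp = confusion.sum(axis=0) - tp         # predicted as the class, but wrong
    fn = confusion.sum(axis=1) - tp         # belong to the class, but missed

    # Dice = 2*TP / (2*TP + FP + FN); left as NaN where the class never occurs
    denom = 2 * tp + fp + fn
    dice = np.full(num_classes, np.nan)
    np.divide(2 * tp, denom, out=dice, where=denom > 0)
    return dice

# Tiny imbalanced example: class 0 dominates, class 2 never appears.
y_true = np.array([[0, 0, 0, 0], [0, 0, 1, 1]])
y_pred = np.array([[0, 0, 0, 0], [0, 0, 0, 1]])

dice = per_class_dice(y_true, y_pred, num_classes=3)
pixel_accuracy = np.mean(y_true == y_pred)
macro_dice = np.nanmean(dice)  # unweighted mean over the classes that occur
print(dice, pixel_accuracy, macro_dice)
```

In this toy case the pixel accuracy is 7/8, while the per-class Dice scores are 12/13 for the dominant class and 2/3 for the minority class, so the unweighted mean is pulled down by the weaker minority-class result. Whether absent classes should be skipped (as here) or counted as a score of 0 or 1 is a design choice that would need to be fixed before comparing algorithms.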