I am hoping to design a single end-to-end CNN that extracts three features: segmentation and bifurcation for thing A, and detection for thing B. The network will have a common trunk of shared weights, followed by three branches with their own weights, one per feature type, whose outputs are then merged. I am hoping to train this with a custom stochastic gradient descent update.
I need to combine different datasets of the same subject matter, where every image contains both A and B, but each dataset provides a different subset of the three ground truths. My plan is to attach an extra vector to each image indicating which ground truths are available (e.g. [0 0 1]). The idea is that the common weights w_0 always update, while each branch's weights w_t ignore an image whose ground truth is missing, or even skip suitable images if too few appear within a batch.
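To make the intent concrete, here is a minimal NumPy sketch of the update I have in mind (the linear "trunk" w0 and "branches" wt, the targets, and the helper names are all hypothetical stand-ins, not my actual network): each branch's gradient is masked by its column of the availability vector g, so unlabeled samples contribute nothing to that branch, while the trunk accumulates gradient from whichever tasks are labeled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy batch: 4 samples, 3 tasks. g[i] marks which ground truths exist.
n, d = 4, 5
x = rng.normal(size=(n, d))
g = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 0, 1],
              [1, 1, 1]], dtype=float)

# Hypothetical linear stand-ins for the shared weights w_0 and the
# three branch weights w_t; y[t] are fake per-task targets.
w0 = rng.normal(size=(d, d))
wt = [rng.normal(size=(d, 1)) for _ in range(3)]
y = [rng.normal(size=(n, 1)) for _ in range(3)]

h = x @ w0                                      # shared representation
total_grad_w0 = np.zeros_like(w0)
for t in range(3):
    pred = h @ wt[t]
    err = (pred - y[t]) * g[:, t:t+1]           # mask: unlabeled samples -> 0
    m = max(g[:, t].sum(), 1.0)                 # average over labeled samples only
    grad_wt = h.T @ err / m                     # branch grad: labeled samples only
    total_grad_w0 += x.T @ (err @ wt[t].T) / m  # trunk accumulates every task
```

Because the mask zeroes the error before backpropagation, a branch whose ground truth is absent for the whole batch simply receives a zero gradient, which is exactly the "know to ignore an unsuitable image" behaviour described above.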
The problem is I'm not sure how to handle this.
- Does the ground truth availability vector need to be the same size as the image, passed as an extra channel padded with redundant 0s? If so, how do I ensure it is not treated like a normal channel?
- Can I pass it in separately, e.g. as [x_train, y_train, g_train]? How would the rest of the pipeline handle this, particularly compilation and validation?
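On the second option, what I have in mind is treating each column of g_train as per-sample 0/1 weights for the corresponding branch's loss, so the vector never touches the image channels at all (I believe Keras's `fit` accepts a `sample_weight` array per output for multi-output models, which would serve this purpose, though I have not verified the details). A small NumPy sketch with made-up numbers, where `weighted_mse` is a hypothetical helper:

```python
import numpy as np

def weighted_mse(y_true, y_pred, w):
    """MSE averaged only over the samples whose weight is nonzero."""
    return float((w * (y_true - y_pred) ** 2).sum() / max(w.sum(), 1.0))

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 0.0, 2.0, 0.0])
g_col  = np.array([1.0, 0.0, 1.0, 0.0])   # this branch's column of g_train

# Only samples 0 and 2 count: (0.25 + 1.0) / 2
loss = weighted_mse(y_true, y_pred, g_col)
```

The same weighting would presumably have to be applied during validation so that images without a given ground truth do not distort that branch's metrics.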
I am considering implementing this in Lasagne (on Theano) rather than my original choice of Keras, because of Keras's higher level of abstraction. Detection of thing B can be dropped if it overcomplicates things.