I am currently working on a convolutional neural network for detecting pathological changes on X-ray images. It is a simple binary classification task. At the beginning of the project we gathered around 6000 X-rays and asked 3 different doctors (domain experts) to label them. Each of them got around 2000 randomly selected images, and those 3 sets were disjoint (each image was labelled by only one doctor). After the labelling was finished I wanted to check how many cases per doctor were labelled as having and as not having the changes, and this is what I got:
# A tibble: 3 x 3
  doctor `no_changes (%)` `changes (%)`
   <int>            <dbl>         <dbl>
1      1             15.9          84.1
2      2             54.1          45.9
3      3             17.8          82.2
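(Just for reference, a per-doctor summary like this can be produced with a few lines of dplyr/tidyr, assuming a data frame labels_df with one row per image and columns doctor and label; the names are only illustrative.)

library(dplyr)
library(tidyr)

labels_df %>%
  count(doctor, label) %>%
  group_by(doctor) %>%
  mutate(pct = 100 * n / sum(n)) %>%   # percentage of each label per doctor
  select(-n) %>%
  pivot_wider(names_from = label, values_from = pct)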
From my perspective, if each doctor got a randomly sampled set of X-rays, the percentage of cases with and without changes should be pretty much the same for each of them, assuming they are "thinking similarly", which clearly isn't the case here.
We talked with one of the doctors and he told us that it is entirely possible for one doctor to say there are changes on an X-ray while another says the opposite, because they typically don't view changes in a binary way: for example, the amount/size of the changes could decide the label, and each doctor could have a different cutoff in mind.
Knowing that, I started thinking about removing/centering the label bias. This is what I came up with:
- Because I know doctor 1 (let's say he is the best expert), I decided to "move" the labels of doctors 2 and 3 towards doctor 1.
- I gathered 300 new images and asked all 3 of them to label them (so this time each image was labelled by all 3 doctors). Then I checked the distribution of labels between doctor 1 and doctors 2/3. For example, for doctors 1 and 2 I got something like:
                       doctor2
doctor1        no_changes  changes  all
  no_changes           15        3   18
  changes             154      177  331
  all                 169      180  349
From this I can see that doctor 2 had 169 cases that he labelled as not having changes, and doctor 1 agreed with him in only 15 of them. Knowing that, I changed the labels (probabilities) for doctor 2 in the no-changes case from [1, 0] to [15/169, 1 - 15/169]. Similarly, doctor 2 had 180 cases labelled as having changes and doctor 1 agreed with him in 177 of them, so I changed the labels (probabilities) for doctor 2 in the changes case from [0, 1] to [1 - 177/180, 177/180].
- I did the same thing for doctor 3 (a sketch of this re-labelling step is shown right after this list).
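To make the re-labelling concrete, here is a minimal sketch in R, assuming a data frame overlap_df that holds the doubly-labelled images with columns doctor1 and doctor2 taking the values "changes"/"no_changes" (all of these names are placeholders, not my actual data):

# Contingency table of the two doctors' labels on the shared images
conf <- table(overlap_df$doctor1, overlap_df$doctor2)  # rows: doctor 1, cols: doctor 2

# P(doctor 1 says "changes" | doctor 2's label)
p_no <- conf["changes", "no_changes"] / sum(conf[, "no_changes"])  # 154/169 here
p_ch <- conf["changes", "changes"]    / sum(conf[, "changes"])     # 177/180 here

# Soft labels assigned to every image that only doctor 2 saw,
# ordered as [P(no_changes), P(changes)]
soft_labels_d2 <- rbind(
  no_changes = c(1 - p_no, p_no),  # [15/169, 154/169] instead of [1, 0]
  changes    = c(1 - p_ch, p_ch)   # [ 3/180, 177/180] instead of [0, 1]
)
colnames(soft_labels_d2) <- c("p_no_changes", "p_changes")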
With these soft labels I retrained the neural network with a cross-entropy loss.
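For completeness, this is roughly what that looks like with keras in R; the architecture and names such as x_train/y_train are placeholders rather than my real code, the point is only that categorical cross-entropy accepts soft targets like the ones above:

library(keras)

# Placeholder architecture, not the actual model
model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(256, 256, 1)) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(units = 2, activation = "softmax")

model %>% compile(
  optimizer = "adam",
  loss = "categorical_crossentropy",  # works with soft targets such as [15/169, 154/169]
  metrics = "accuracy"
)

# x_train: image array, y_train: n x 2 matrix of soft labels built as above
model %>% fit(x_train, y_train, epochs = 10, batch_size = 32)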
My question is: is my solution correct, or should I do something differently? Are there any other solutions to this problem?