4

I'm implementing a loss function that will use a mask tensor(M) consisting of 0s and 1s to eliminate some loss values given predictions(P) and ground-truth(G) tensors.

So, I have 2 possible ways:

Element-wise multiplication:

loss = K.sum(M * K.binary_crossentropy(G, P))

Conditional selection:

bin_ce = K.binary_crossentropy(G, P)
loss = K.sum(tf.where(tf.equal(M, 1), bin_ce, 0))

So, which will be more efficient in terms of running time?

mkocabas
  • 703
  • 6
  • 19

1 Answers1

4

I did the benchmark and it is quite clear that multiplication is far better than conditional selection.

Here is the results:

A chart is worth a thousand words..

A chart is worth a thousand words.

Benchmark code:

import keras.backend as K
import tensorflow as tf
import numpy as np
import sys
import time
import matplotlib.pyplot as plt


def elm(G, P, M):
        return K.sum(M * K.binary_crossentropy(G, P))

def cond(G, P, M, t):
        C = K.variable(np.zeros((t, t)))
        bin_ce = K.binary_crossentropy(G, P)
        return K.sum(tf.where(tf.equal(M, 1), bin_ce, C))


s = [100, 1000, 10000, 100000]
elms = []
conds = []

for t in s:
        print t
        t = int(t)
        # number of 1s in mask
        n = int(t/2)

        M = np.zeros((t,t))
        P = np.random.rand(t, t)
        G = np.random.rand(t, t)

        for i in range(n):
                r = np.random.randint(0, t)
                c = np.random.randint(0, t)
                M[r,c] = 1

        M = K.variable(M)
        P = K.variable(P)
        G = K.variable(G)

        start_time = time.time()
        elm(G, P, M)
        elms.append(time.time() - start_time)

        start_time = time.time()
        cond(G, P, M, t)
        conds.append(time.time() - start_time)

print elms
print conds

# create plot
fig, ax = plt.subplots()
index = np.arange(n_groups)
bar_width = 0.35
opacity = 0.8

rects1 = plt.bar(index, elms, bar_width,
                 alpha=opacity,
                 color='b',
                 label='Element-wise')

rects2 = plt.bar(index + bar_width, conds, bar_width,
                 alpha=opacity,
                 color='g',
                 label='Conditional')

plt.xlabel('Input tensor size')
plt.ylabel('Execution time (s)')
plt.title('')
plt.xticks(index + bar_width, ('100', '10e3', '10e4', '10e5'))
plt.legend()

plt.tight_layout()
plt.show()
mkocabas
  • 703
  • 6
  • 19