The code is below:

import tensorflow as tf

tf.enable_eager_execution()  # TF 1.x: needed so that .numpy() works


def categorical_column_with_vocabulary_list():
    print('------categorical_column_with_vocabulary_list------')
    # Dense int feature: 1, 2, 3 are in the vocabulary; -1, -5, -6 are not.
    feature = {
        'price': [1, 2, 3, -1, -5, -1, -6, -1]
    }
    price_column = tf.feature_column.categorical_column_with_vocabulary_list(
        key='price',
        vocabulary_list=[1, 2, 3],
        num_oov_buckets=1,  # a single bucket shared by out-of-vocabulary values
        dtype=tf.dtypes.int64
    )
    # Map each category id to a 5-dimensional embedding.
    price_column = tf.feature_column.embedding_column(price_column, 5)
    feature_columns = [price_column]
    inputs = tf.feature_column.input_layer(feature, feature_columns)
    print(inputs.numpy())
    print('------categorical_column_with_vocabulary_list------')


categorical_column_with_vocabulary_list()

The output:

------categorical_column_with_vocabulary_list------
[[-0.37697318   0.5353571    0.055607256  0.34294307   0.20049882 ]
 [-0.6880904    0.10378731   0.016472543 -0.32594556   0.19428569 ]
 [-0.20143655  -0.13469279   0.031137802 -0.009433172 -0.19912559 ]
 [ 0.           0.           0.           0.           0.         ]
 [ 0.59028286  -0.7852301    0.8745925   -0.23695591  -0.08997129 ]
 [ 0.           0.           0.           0.           0.         ]
 [ 0.59028286  -0.7852301    0.8745925   -0.23695591  -0.08997129 ]
 [ 0.           0.           0.           0.           0.         ]]
------categorical_column_with_vocabulary_list------

What's the difference between key = -1 and the others? I think -1, -5 and -6 are all out-of-vocabulary keys, so their embeddings should be the same. But the results differ: -5 and -6 share one embedding, while -1 maps to all zeros. Why?

My TensorFlow version is 1.15.
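
A possible explanation, based on the documented behavior of categorical_column_with_vocabulary_list: when the feature is passed as a dense Tensor, missing values are represented by -1 for int and '' for string, and the column drops them. So -1 would never reach the vocabulary lookup at all, and a row with no ids would come out of the embedding as zeros, while -5 and -6 both fall into the single OOV bucket. Below is a minimal pure-Python sketch of that reading. It is not TensorFlow code: lookup_ids and MISSING_INT are hypothetical names used only for illustration, and the modulo on Python's hash stands in for whatever hashing the library uses internally.

```python
# Pure-Python model of the lookup (no TensorFlow needed).
# lookup_ids and MISSING_INT are hypothetical names, not TensorFlow APIs.
MISSING_INT = -1  # assumption: dense int inputs equal to -1 are dropped as missing


def lookup_ids(values, vocabulary, num_oov_buckets):
    """Return one id per value, or None where the value is dropped."""
    index = {v: i for i, v in enumerate(vocabulary)}
    ids = []
    for v in values:
        if v == MISSING_INT:
            # Dropped before the lookup: no id, hence an all-zero embedding row.
            ids.append(None)
        elif v in index:
            ids.append(index[v])
        else:
            # Out-of-vocabulary values are hashed into the OOV buckets, which
            # start right after the vocabulary ids. With a single bucket,
            # every OOV value gets the same id.
            ids.append(len(vocabulary) + hash(v) % num_oov_buckets)
    return ids


ids = lookup_ids([1, 2, 3, -1, -5, -1, -6, -1], [1, 2, 3], 1)
print(ids)  # [0, 1, 2, None, 3, None, 3, None]
```

The None entries line up with the all-zero rows in the output above, and the shared id 3 lines up with the identical rows for -5 and -6. If -1 is a real value in the data, remapping it to another integer outside the vocabulary before building the feature dict should send it to the OOV bucket like the others.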

Ocxs