The code is below:
import tensorflow as tf
tf.enable_eager_execution()

def categorical_column_with_vocabulary_list():
    print('------categorical_column_with_vocabulary_list------')
    # Dense integer input; only 1, 2 and 3 are in the vocabulary.
    feature = {
        'price': [1, 2, 3, -1, -5, -1, -6, -1]
    }
    price_column = tf.feature_column.categorical_column_with_vocabulary_list(
        key='price',
        vocabulary_list=[1, 2, 3],
        num_oov_buckets=1,
        dtype=tf.dtypes.int64
    )
    # Wrap the categorical column in a 5-dimensional embedding.
    price_column = tf.feature_column.embedding_column(price_column, 5)
    feature_columns = [price_column]
    inputs = tf.feature_column.input_layer(feature, feature_columns)
    print(inputs.numpy())
    print('------categorical_column_with_vocabulary_list------')

categorical_column_with_vocabulary_list()
The output:
------categorical_column_with_vocabulary_list------
[[-0.37697318 0.5353571 0.055607256 0.34294307 0.20049882 ]
[-0.6880904 0.10378731 0.016472543 -0.32594556 0.19428569 ]
[-0.20143655 -0.13469279 0.031137802 -0.009433172 -0.19912559 ]
[ 0. 0. 0. 0. 0. ]
[ 0.59028286 -0.7852301 0.8745925 -0.23695591 -0.08997129 ]
[ 0. 0. 0. 0. 0. ]
[ 0.59028286 -0.7852301 0.8745925 -0.23695591 -0.08997129 ]
[ 0. 0. 0. 0. 0. ]]
------categorical_column_with_vocabulary_list------
What is the difference between key = -1 and the other keys? I expected -1, -5 and -6 to all be out-of-vocabulary keys, so their embeddings should be the same. But in the output, -5 and -6 share one non-zero embedding, while every -1 row is all zeros. Why?
My TensorFlow version is 1.15.
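To make the pattern explicit, here is a small sketch in plain Python (no TensorFlow needed) that groups the printed rows by value. The rows are copied by hand from the output above; the names `keys`, `rows` and `groups` are just ones I picked for this check.

```python
# Input keys, in the same order as the 'price' feature.
keys = [1, 2, 3, -1, -5, -1, -6, -1]

# Embedding rows copied by hand from the printed output.
rows = [
    (-0.37697318, 0.5353571, 0.055607256, 0.34294307, 0.20049882),
    (-0.6880904, 0.10378731, 0.016472543, -0.32594556, 0.19428569),
    (-0.20143655, -0.13469279, 0.031137802, -0.009433172, -0.19912559),
    (0.0, 0.0, 0.0, 0.0, 0.0),
    (0.59028286, -0.7852301, 0.8745925, -0.23695591, -0.08997129),
    (0.0, 0.0, 0.0, 0.0, 0.0),
    (0.59028286, -0.7852301, 0.8745925, -0.23695591, -0.08997129),
    (0.0, 0.0, 0.0, 0.0, 0.0),
]

# Group the input keys by the embedding row they produced.
groups = {}
for key, row in zip(keys, rows):
    groups.setdefault(row, []).append(key)

for row, grouped_keys in groups.items():
    label = "all zeros" if not any(row) else "non-zero"
    print(grouped_keys, label)
```

This shows five distinct rows: one each for 1, 2 and 3, one shared by -5 and -6, and an all-zero row for the three -1 inputs. So -1 does not appear to be treated like the other out-of-vocabulary keys.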