
I am attempting to design a neural network model using TensorFlow with the following specifications:

The model accepts two inputs: X, a list of n 3-dimensional vectors, and Y, a list of n non-decreasing natural numbers starting from 0 (one segment ID per vector in X). It produces an output Z consisting of m 3-dimensional vectors.

Y contains m unique numbers, each representing a class of 3-dimensional input vectors. The number of input vectors per class may vary.

The model's architecture consists of three layers. The first layer transforms each vector in X into a 2-dimensional vector and applies the 'gelu' activation function. The second layer performs a 'segment_sum' operation to condense the n 2-dimensional vectors into m 2-dimensional vectors, using Y as the segment IDs. The third layer transforms the m 2-dimensional vectors into the desired m 3-dimensional output vectors.
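For reference, here is a toy illustration of how tf.math.segment_sum condenses rows according to sorted segment IDs (the values here are made up for the example):

import tensorflow as tf

# n = 6 rows pooled into m = 3 segment sums
data = tf.constant([[1., 1.], [2., 2.], [3., 3.], [4., 4.], [5., 5.], [6., 6.]])
segment_ids = tf.constant([0, 0, 1, 1, 1, 2])  # non-decreasing, starting at 0
print(tf.math.segment_sum(data, segment_ids))
# [[ 3.  3.]
#  [12. 12.]
#  [ 6.  6.]]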

I train the model with a cosine dissimilarity loss (Keras' CosineSimilarity, which returns the negative similarity) and the Adam optimizer.
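As a small sanity check of the sign convention (the values here are my own illustration): Keras' CosineSimilarity loss returns the negative mean cosine similarity, so a perfect directional match yields -1 and minimizing the loss maximizes similarity.

import tensorflow as tf

loss_fn = tf.keras.losses.CosineSimilarity(axis=1)
y_true = tf.constant([[0., 1., 0.], [1., 0., 0.]])
print(loss_fn(y_true, y_true).numpy())  # -1.0, identical directions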

Below is the code I've developed for this purpose:

import numpy as np
import tensorflow as tf
from tensorflow import keras

# Prepare the input and output data (example)
n = 10
m = 4
X = np.random.random((n, 3)).astype('float32')                # shape (n, 3)
Y = np.array([0, 0, 1, 1, 2, 2, 2, 3, 3, 3]).astype('int32')  # shape (n,), segment IDs
Z = np.random.random((m, 3)).astype('float32')                # shape (m, 3)


class CustomModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(2, activation='gelu')
        self.dense2 = keras.layers.Dense(3)

    def call(self, inputs):
        X, Y = inputs
        X = self.dense1(X)             # (n, 3) -> (n, 2), with gelu
        X = tf.math.segment_sum(X, Y)  # (n, 2) -> (m, 2), pooled per class
        Z = self.dense2(X)             # (m, 2) -> (m, 3)
        return Z


model = CustomModel()

model.compile(loss=tf.keras.losses.CosineSimilarity(axis=1), optimizer=tf.keras.optimizers.Adam())

model.fit([X, Y], Z, epochs=10)  # raises the ValueError shown below

The design is motivated by the Deep Sets paper: the model is meant to learn a permutation-invariant aggregation function.

Let f be a permutation-invariant aggregation function over a set of vectors A. Then f can be written as: f(A) = \rho\left(\sum_{a \in A} \phi(a)\right)

So, by learning \rho and \phi we learn f, which is essentially what the model above is expected to do.
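To make the invariance concrete, here is a toy check with stand-in functions for \rho and \phi (these placeholders are mine, not the learned layers):

import numpy as np

phi = lambda a: a ** 2       # placeholder for the learned phi
rho = lambda s: 2 * s + 1    # placeholder for the learned rho

A = np.random.random((5, 3))
perm = np.random.permutation(5)

f_A = rho(phi(A).sum(axis=0))
f_A_perm = rho(phi(A[perm]).sum(axis=0))
assert np.allclose(f_A, f_A_perm)  # same output regardless of element order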

However, I get the following error:

Traceback (most recent call last):
  File "/home/nitesh/PycharmProjects1/pythonProject/research/reasoning_with_vectors/custom_model.py", line 31, in <module>
    model.fit([X, Y], Z, epochs=10)
  File "/home/nitesh/miniconda3/envs/relbert/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/nitesh/miniconda3/envs/relbert/lib/python3.10/site-packages/keras/engine/data_adapter.py", line 1852, in _check_data_cardinality
    raise ValueError(msg)
ValueError: Data cardinality is ambiguous:
  x sizes: 10, 10
  y sizes: 4
Make sure all arrays contain the same number of samples.

As I understand it, Keras' data adapter requires every input and target array to share the same first (sample) dimension, which an n-to-m model violates by construction. I tried a lot but couldn't find a way to express this model in TensorFlow 2.x; I also asked GPT-4 and Bard without getting a satisfactory answer. One direction I tried is sketched below.
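One workaround I experimented with is to treat the whole set as a single sample of batch size 1, so every array shares a leading dimension of 1 and the cardinality check passes. I'm not sure this is the idiomatic way to express the model; note it also moves the loss axis to -1 because of the extra batch dimension.

import numpy as np
import tensorflow as tf
from tensorflow import keras

n, m = 10, 4
X = np.random.random((1, n, 3)).astype('float32')              # (1, n, 3)
Y = np.array([[0, 0, 1, 1, 2, 2, 2, 3, 3, 3]]).astype('int32') # (1, n)
Z = np.random.random((1, m, 3)).astype('float32')              # (1, m, 3)

class CustomModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(2, activation='gelu')
        self.dense2 = keras.layers.Dense(3)

    def call(self, inputs):
        X, Y = inputs
        X = self.dense1(X[0])             # drop the batch dim: (n, 3) -> (n, 2)
        X = tf.math.segment_sum(X, Y[0])  # (n, 2) -> (m, 2)
        Z = self.dense2(X)                # (m, 2) -> (m, 3)
        return Z[None]                    # restore the batch dim: (1, m, 3)

model = CustomModel()
model.compile(loss=tf.keras.losses.CosineSimilarity(axis=-1),
              optimizer=tf.keras.optimizers.Adam())
model.fit([X, Y], Z, epochs=10, batch_size=1)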

  • You are literally creating your data with different values for the first dimension, which makes no sense, why are you doing that? – Dr. Snoopy May 08 '23 at 07:25
  • Please remember that Stack Overflow is not your favourite Python forum, but rather a question and answer site for all programming related questions. Thus, always include the tag of the language you are programming in, that way other users familiar with that language can more easily find your question. Take the [tour] and read up on [ask] to get more information on how this site works, then [edit] the question with the relevant tags. – Adriaan May 08 '23 at 09:58
  • The model is learning an aggregation function described in the Deep Sets paper. Check my original post. Now I have added the motivation. – Nitesh Kumar May 08 '23 at 09:59

0 Answers