
I am working on a knowledge graph dataset whose triples have the format (s, p, o), i.e., subject, predicate, and object.

Say, the knowledge graph looks as follows:

import numpy as np

X = np.array([['a', 'y', 'b'],
              ['b', 'y', 'a'],
              ['a', 'y', 'c'],
              ['c', 'y', 'a'],
              ['a', 'y', 'd'],
              ['c', 'y', 'd'],
              ['b', 'y', 'c'],
              ['f', 'y', 'e']])

The loss function iterates over the triples, fetching each one from a dataset iterator:

x_pos_tf = tf.cast(dataset_iterator.get_next(), tf.int32)

where dataset_iterator is defined as:

dataset_iterator = dataset.make_one_shot_iterator()

Now, for each triple, I wish to fetch all the triples from the knowledge graph that have the same subject as the triple in question. For example, for the triple (a,y,b) I wish to fetch ((a,y,c),(a,y,d)). [Note: the triple being evaluated is not included.]

I have carried out this operation using numpy lists, by making a dictionary data structure for the knowledge graph as follows:

d = {s: [tuple(x) for x in X if x[0] == s] for s in np.unique(X[:, 0])}

This returns a dictionary of the format:

d={'f': [('f', 'y', 'e')],
   'c': [('c', 'y', 'a'), ('c', 'y', 'd')],
   'a': [('a', 'y', 'b'), ('a', 'y', 'c'), ('a', 'y', 'd')],
   'b': [('b', 'y', 'a'), ('b', 'y', 'c')]}

And then, I do a simple lookup for any triple as follows:

return list({triple for x in x_to_score for triple in self.d[x[0]]} - set(x_to_score))

where x_to_score holds the triple (or triples) being evaluated, as a collection of tuples.

This returns a list [('a','y','c'), ('a','y','d')] for the sample triple ('a','y','b'), as an example.
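For reference, the dictionary build and lookup described above can be run end to end as below. This is only a minimal sketch in plain Python: it assumes x_to_score is a list of tuples, and the function name same_subject is a hypothetical stand-in for the method the return statement belongs to.

```python
from collections import defaultdict

# Same toy knowledge graph as above, written as plain tuples.
X = [('a', 'y', 'b'), ('b', 'y', 'a'), ('a', 'y', 'c'), ('c', 'y', 'a'),
     ('a', 'y', 'd'), ('c', 'y', 'd'), ('b', 'y', 'c'), ('f', 'y', 'e')]

# Index the triples by subject in a single pass over the graph.
d = defaultdict(list)
for s, p, o in X:
    d[s].append((s, p, o))


def same_subject(x_to_score):
    """Return triples sharing a subject with any triple in x_to_score,
    excluding the triples in x_to_score themselves (sorted for stable output)."""
    found = {t for x in x_to_score for t in d.get(x[0], [])}
    return sorted(found - set(x_to_score))


result = same_subject([('a', 'y', 'b')])
print(result)
```

The set difference is what removes the triple being evaluated, so passing several triples at once removes all of them from the result.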

However, the problem now is that, as I iterate over the triples (i.e. a batch of triples processed at once), they need to be passed as tensors, so I cannot use numpy operations or list comprehensions to do this.

I need to process the triple to be evaluated as a tensor, and then return a list of tensors for the result as well.

As I am new to tensorflow, I am unable to figure out how to go about this.

Further, this needs to work for evaluating a batch of triples.

I have tried the tf.slice() operation for fetching the subject, and some of the tf.sets functions, but have not been able to figure it out.
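One possible direction is to replace the dictionary with a broadcast comparison between the batch and the whole graph. The sketch below is written in NumPy rather than TensorFlow, and it assumes the entities and predicates have already been mapped to integer ids (the encoding a=0, b=1, c=2, d=3, e=4, f=5, y=0 and the function name same_subject_mask are hypothetical). Each step has a TensorFlow counterpart (tf.equal, tf.reduce_all, tf.logical_and, tf.logical_not, tf.boolean_mask), but whether that translation fits this particular loss function is untested here.

```python
import numpy as np

# Integer-encoded version of the toy graph (hypothetical encoding:
# a=0, b=1, c=2, d=3, e=4, f=5; predicate y=0).
X = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 0, 2],
              [2, 0, 0],
              [0, 0, 3],
              [2, 0, 3],
              [1, 0, 2],
              [5, 0, 4]])


def same_subject_mask(batch, graph):
    """Boolean mask of shape [batch_size, num_triples]: True where the graph
    triple shares the batch triple's subject but is not that triple itself."""
    # Compare subjects pairwise via broadcasting: [B, 1] against [1, N].
    same_subj = batch[:, None, 0] == graph[None, :, 0]
    # A graph triple is identical when all three components match.
    identical = (batch[:, None, :] == graph[None, :, :]).all(axis=-1)
    return same_subj & ~identical


batch = np.array([[0, 0, 1],    # (a, y, b)
                  [2, 0, 3]])   # (c, y, d)
mask = same_subject_mask(batch, X)

# The number of matches differs per batch row, so the result is ragged;
# here it is collected as one array per batch triple.
neighbours = [X[row] for row in mask]
```

Because the number of matching triples varies from one batch row to the next, it may be simpler to keep the fixed-shape mask inside the graph computation and only apply it where the matching triples are actually consumed.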

Any help will be appreciated! Thank you.

snelzb
  • I am not sure where you want to go with it, but if you want you can use the index in the matrix instead of the list itself... Or am I missing something? – Maayao Jul 30 '19 at 15:47
  • @Maayao could you please give some more details on that? It would be a great help! – snelzb Jul 30 '19 at 22:02
  • Again, I am not sure I understand you right, but if you want to keep the data in a matrix and use a hash table as well, you can store something like "index a"->(1,2), which would be the index of the item marked as 1 in the matrix [[0,0,0],[0,0,1],[0,0,0]]. That way you can use hash tables and still keep the original data structure. You can even wrap it all in a class or whatever you want. – Maayao Jul 31 '19 at 12:31
