
I've been using the set method `symmetric_difference` on two ndarray matrices in the following way:

x_set = list(set(tuple(i) for i in x_spam_matrix.tolist()).symmetric_difference(
                 set(tuple(j) for j in partitioned_x[i].tolist())))

x = np.array([list(i) for i in x_set])

This works fine for me, but it feels a little clumsy. Is there a more elegant way to do this?

  • Do you have both an outer `i` and another `i` inside your set comprehension? Also, splitting the x_set line might make it look less clumsy. – Moberg Jun 20 '18 at 06:42
  • Yes, this block lives inside a for loop (it's part of a cross-validation process), so the `i` in `partitioned_x[i]` is the loop index and unrelated to the comprehension variable. By "more elegant" I meant less complicated, perhaps a built-in function that converts a set of tuples into a list of lists, or something of that sort. Thanks though! – Tomer Daloomi Jun 20 '18 at 06:46

2 Answers


The Zen of Python:

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
[...]

There is nothing wrong with your code. If I had to code-review it, though, I would suggest naming each intermediate step:

spam_matrix_set = set(tuple(item) for item in x_spam_matrix.tolist())
partitioned_set = set(tuple(item) for item in partitioned_x[index].tolist())
disjunctive_union = spam_matrix_set.symmetric_difference(partitioned_set)

x = np.array([list(item) for item in disjunctive_union])
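
For anyone who wants to run the refactored version end to end, here is a minimal sketch. The sample values are made up, `partition` stands in for `partitioned_x[index]`, and the helper `rows_as_set` is a hypothetical name introduced only for this illustration.

```python
import numpy as np


def rows_as_set(matrix):
    """Return the rows of a 2-D array as a set of tuples (hypothetical helper)."""
    return set(tuple(row) for row in matrix.tolist())


# Made-up sample data: the names mirror the question, the values do not.
x_spam_matrix = np.array([[1, 2], [3, 4], [5, 6]])
partition = np.array([[3, 4], [7, 8]])  # stands in for partitioned_x[index]

# Rows that appear in exactly one of the two matrices.
disjunctive_union = rows_as_set(x_spam_matrix).symmetric_difference(
    rows_as_set(partition)
)

x = np.array([list(item) for item in disjunctive_union])

# Set iteration order is arbitrary, so sort before inspecting the rows.
print(sorted(map(tuple, x.tolist())))  # [(1, 2), (5, 6), (7, 8)]
```

Because a set has no defined order, the rows of `x` can come out in any order. Sort them if the order matters downstream.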
Sebastian Loehner

A simple list of tuples:

In [146]: alist = [(1,2),(3,4),(2,1),(3,4)]

put it in a set:

In [147]: aset = set(alist)
In [148]: aset
Out[148]: {(1, 2), (2, 1), (3, 4)}

np.array just wraps that set in a 0-d array with object dtype:

In [149]: np.array(aset)
Out[149]: array({(1, 2), (3, 4), (2, 1)}, dtype=object)

but make it into a list, and get a 2d array:

In [150]: np.array(list(aset))
Out[150]: 
array([[1, 2],
       [3, 4],
       [2, 1]])
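
Applied to the symmetric difference from the question, this is a rough sketch of how the shortcut compares with the list comprehension. The arrays `a` and `b` are made-up stand-ins for the question's matrices. The last lines show one edge case worth knowing about: an empty set yields a 1d array of shape `(0,)`, which can be reshaped if a 2d result is required.

```python
import numpy as np

# Made-up stand-ins for x_spam_matrix and partitioned_x[i].
a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[3, 4], [7, 8]])

# The ^ operator is the set symmetric difference.
diff = set(map(tuple, a.tolist())) ^ set(map(tuple, b.tolist()))

via_comprehension = np.array([list(row) for row in diff])
via_list = np.array(list(diff))

# Both iterate over the same unmodified set, so the rows match in order too.
print(np.array_equal(via_comprehension, via_list))  # True

# Edge case: an empty set gives shape (0,), not (0, n_columns).
empty = np.array(list(set()))
empty_2d = empty.reshape(-1, a.shape[1])
print(empty.shape, empty_2d.shape)  # (0,) (0, 2)
```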

Since it is a list of tuples, it can also be made into a structured array:

In [151]: np.array(list(aset),'i,f')
Out[151]: array([(1, 2.), (3, 4.), (2, 1.)], dtype=[('f0', '<i4'), ('f1', '<f4')])
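
As a small illustration of what the structured array buys you, the sketch below reads each column back by its auto-generated field name. Sorting the set first is my own addition, only to make the row order predictable.

```python
import numpy as np

aset = {(1, 2), (3, 4), (2, 1)}

# sorted() fixes the row order; 'i,f' means an int32 field and a float32 field.
records = np.array(sorted(aset), dtype='i,f')

# NumPy names the fields 'f0' and 'f1' automatically.
first = records['f0']
second = records['f1']

print(records.dtype.names)  # ('f0', 'f1')
print(first.tolist())       # [1, 2, 3]
print(second.tolist())      # [2.0, 1.0, 4.0]
```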

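If the tuples varied in length, the list of tuples would be turned into a 1d array of tuples (object dtype). Recent NumPy versions (1.24 and later) raise a ValueError for such ragged input unless dtype=object is passed explicitly:

In [152]: np.array([(1,2),(3,4),(5,6,7)], dtype=object)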
Out[152]: array([(1, 2), (3, 4), (5, 6, 7)], dtype=object)
In [153]: _.shape
Out[153]: (3,)
hpaulj
  • Beautiful, thank you! I was missing the fact that `np.array(list(aset))` gives the same result as `np.array([list(i) for i in x_set])`. This was exactly what I was looking for. – Tomer Daloomi Jun 20 '18 at 08:21