NumPy summarizes large arrays, which is convenient when working in an interactive session. Unfortunately, structured arrays and recarrays are not summarized very well by default. Is there a way to change this?
By default, the full array is displayed if there are 1000 or fewer items; when there are more items than this, the array is summarized. This can be configured with np.set_printoptions(threshold=<number of items that triggers summarization>, edgeitems=<number of items to show in the summary>).
This works fine for standard datatypes, for example:
np.set_printoptions(threshold=3, edgeitems=1)
print(np.zeros(3))
print(np.zeros(4))
results in
[ 0. 0. 0.]
[ 0. ..., 0.]
However, when more complex datatypes are used, the summarization is less helpful:
print(np.zeros(4, dtype=[('test', 'i4', 3)]))
print(np.zeros(4, dtype=[('test', 'i4', 4)]))
[([0, 0, 0],) ..., ([0, 0, 0],)]
[([0, 0, 0, 0],) ..., ([0, 0, 0, 0],)]
The array is summarized, but the sub-datatypes are not. This becomes a problem with large arrays using complex datatypes. For instance, the array
np.zeros(1000, dtype=[('a', float, 3000), ('b', float, 10000)])
hangs my IPython instance.
There are a couple of workarounds. Rather than using the np.array type directly, it's possible to subclass ndarray and write a custom __repr__. This would work for big projects, but it doesn't solve the underlying issue and isn't convenient for quick exploration of data in an interactive Python session.
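To make the subclassing workaround concrete, here is a minimal sketch of what I mean. The class name `SummarizedArray` and the size cutoff are made up for illustration; it just replaces the repr of large arrays with a short summary line:

```python
import numpy as np

class SummarizedArray(np.ndarray):
    """Illustrative subclass: summarize the repr of large arrays."""

    def __repr__(self):
        if self.size > 1000:
            # For big arrays, report only shape and dtype instead of contents.
            return f"SummarizedArray(shape={self.shape}, dtype={self.dtype})"
        # Small arrays fall back to the normal ndarray repr.
        return super().__repr__()

a = np.zeros(5).view(SummarizedArray)     # small: printed in full
b = np.zeros(2000).view(SummarizedArray)  # large: shape/dtype summary only
```

The `.view()` call is needed on every array you create, which is exactly why this is awkward for ad-hoc interactive work.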
I've also implemented a custom filter in my editor that truncates very long console output. This is a bit of a hack and doesn't help when I fire up a Python session elsewhere.
Is there a NumPy setting I'm unaware of, or a Python or IPython setting, that could fix this?