
If I have a dtype like

foo = numpy.dtype([('chrom1', '<f4', (100,)), ('chrom2', '<f4', (13,))])

how can I create an instance of that dtype as a scalar?

Background, in case There's A Better Way:

I want to efficiently represent arrays of scalars mapping directly to the bases in a genome, chromosome by chromosome. I don't want arrays of these genomic arrays; each one is simply a structured set of scalars that I want to reference by name/position and be able to add, subtract, etc.

It appears that dtype.type() is maybe the path forward, but I haven't found useful documentation for correctly calling this function yet.

So suppose I have:

chrom1_array = numpy.arange(100)
chrom2_array = numpy.arange(13)
genomic_array = foo.type([chrom1_array, chrom2_array])

That last line isn't right, but hopefully it conveys what I'm currently attempting.

Is this a horrible idea? If so, what's the right idea? If not, what's the correct way to implement it?

This sort of works, but is terrible:

 bar = np.zeros(1, dtype=[('chrom1', 'f4', 100), ('chrom2', 'f4', 13)])[0]
traeki
  • I think the closest you can get to that is a "scalar array": `bar = np.array((chrom1_array, chrom2_array), dtype=foo)`. `bar` is an array with shape `()`. – Warren Weckesser Nov 05 '14 at 00:21
  • How many of these 'genomic_array's are there? What kinds of math operations are you doing with them? So far your description does not make a good case for using structured arrays. Multidimensional arrays are your best choice for efficient math, and class/objects best for defining complex objects. – hpaulj Nov 05 '14 at 05:47

1 Answer

Try this, passing an empty tuple as the shape:

foo = np.dtype([('chrom1', '<f4', (100,)), ('chrom2', '<f4', (13,))])
t = np.zeros((), dtype=foo)
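
As a minimal sketch of how the result might then be filled and read back: the `np.arange` inputs are only placeholders for real per-base data, and `rec` is just an illustrative name, not something from the question.

```python
import numpy as np

foo = np.dtype([('chrom1', '<f4', (100,)), ('chrom2', '<f4', (13,))])

# Zero-dimensional structured array: shape (), holding exactly one record.
t = np.zeros((), dtype=foo)

# Fill each field by name (placeholder data; values are cast to float32).
t['chrom1'] = np.arange(100)
t['chrom2'] = np.arange(13)

# Indexing with an empty tuple extracts the record as a numpy.void scalar.
rec = t[()]

print(t.shape)               # shape of the 0-d array
print(type(rec))             # scalar type of the record
print(rec['chrom2'][:3])     # fields remain addressable by name
```

Whether you keep working with the 0-d array `t` or the extracted scalar `rec` is a matter of taste; both allow access by field name.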
HYRY
  • Ah! I didn't realize you could pass the empty tuple as a shape. Cool, thanks. – traeki Nov 05 '14 at 05:38
  • You can then fill values with `t['chrom1']=chrom1_array`. Or you can create and fill in one step: `t1=np.array((chrom1_array, chrom2_array),dtype=foo)`. – hpaulj Nov 05 '14 at 08:01
  • But how useful is this 'scalar' array? Any more or less useful than `d={'chrom1':chrom1_array, 'chrom2':chrom2_array}`? – hpaulj Nov 05 '14 at 08:06
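
On the add/subtract part of the question: arithmetic operators are generally not defined for structured dtypes as a whole, so one option is to work field by field. A rough sketch, assuming both records share the same dtype; `make_record` and `add_records` are hypothetical helper names, not NumPy functions.

```python
import numpy as np

foo = np.dtype([('chrom1', '<f4', (100,)), ('chrom2', '<f4', (13,))])

def make_record(chrom1, chrom2):
    """Hypothetical helper: build a 0-d structured array from two sequences."""
    rec = np.zeros((), dtype=foo)
    rec['chrom1'] = chrom1
    rec['chrom2'] = chrom2
    return rec

def add_records(a, b):
    """Hypothetical helper: add two records field by field."""
    out = np.zeros((), dtype=a.dtype)
    for name in a.dtype.names:
        # Each field is an ordinary float32 array, so normal arithmetic applies.
        out[name] = a[name] + b[name]
    return out

x = make_record(np.arange(100), np.arange(13))
y = make_record(np.ones(100), np.ones(13))
total = add_records(x, y)
```

The same loop would work over the keys of a plain dict of arrays, which is roughly the trade-off the last comment is pointing at.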