I'm working with CERN's PyROOT module and trying to store an array of strings as a leaf in a ROOT tree (TTree). To do that, I apparently have to pass it an array, using not lists or dictionaries but the array module. That module supports standard C arrays of characters, integers, and so forth, but does anyone know a way to nest them so that I get an array of strings or, effectively, an array of character arrays? Or have I gone too far and need to step back from the keyboard for a while :)?

Code:

import ROOT

rowtree = ROOT.TTree("rowstor", "rowtree")

ROOT.gROOT.ProcessLine(
    "struct runLine {\
    Char_t test[20];\
    Char_t test2[20];\
    };" );
from ROOT import runLine
newline = runLine()
rowtree.Branch("test1", newline, "test/C:test2")

newline.test = ["AbcDefgHijkLmnOp","aaaaaaaaaaaaaaaaaaa"]

rowtree.Fill()

Error:

python branchtest
Traceback (most recent call last):
  File "branchtest", line 14, in <module>
    newline.test = ["AbcDefgHijkLmnOp","aaaaaaaaaaaaaaaaaaa"]
TypeError: expected string or Unicode object, list found

I'm wondering if it's possible to turn the list shown in this example into an array of strings.

ndawe
ecapstone
  • don't know ROOT, but ctypes should be able to do it... quick search shows this: http://stackoverflow.com/questions/4101536/multi-dimensional-char-array-array-of-strings-in-python-ctypes – Corley Brigman Oct 30 '13 at 19:03
  • The line of code in your traceback doesn't match the code you posted. – abarnert Oct 30 '13 at 19:09
  • Fixed it, sorry. The traceback shown was from the original piece of code, not the demo I put together for this post. – ecapstone Oct 30 '13 at 19:17

2 Answers

A char array and a Python list of Python strings are two very different things.

If you want a branch containing a char array (one string) then I suggest using Python's built-in bytearray type:

import ROOT
# create an array of bytes (chars) and reserve the last byte for null
# termination (last byte remains zero)
char_array = bytearray(21)
# all bytes of char_array are zeroed by default here (all b'\x00')

# create the tree
tree = ROOT.TTree('tree', 'tree')
# add a branch for char_array
tree.Branch('char_array', char_array, 'char_array[21]/C')
# set the first 20 bytes to the characters of a string of length 20;
# slicing with [:20] keeps the buffer at 21 bytes and leaves the last byte alone
char_array[:20] = b'a' * 20
# important to keep the last byte zeroed for null termination!
tree.Fill()
tree.Scan('', '', 'colsize=21')

The output of tree.Scan('', '', 'colsize=21') is:

************************************
*    Row   *            char_array *
************************************
*        0 *  aaaaaaaaaaaaaaaaaaaa *
************************************

So we know the tree is accepting the bytes correctly.
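
The null-termination rule above can be wrapped in a small helper. This is a minimal sketch in plain Python, not part of PyROOT: the name set_c_string is hypothetical, and it assumes ASCII text. It overwrites the whole buffer with a same-length slice assignment, so the bytearray keeps its 21 bytes and always ends in a zero byte.

```python
def set_c_string(buf, text):
    """Copy text into a fixed-size bytearray, padding with zero bytes."""
    data = text.encode('ascii')
    # Reserve at least one byte for the terminating null.
    if len(data) >= len(buf):
        raise ValueError('string of length %d does not fit in %d bytes'
                         % (len(data), len(buf)))
    # Same-length slice assignment: the bytearray is never resized.
    buf[:] = data + b'\x00' * (len(buf) - len(data))


char_array = bytearray(21)
set_c_string(char_array, 'a' * 20)
```

Passing a string of 21 or more characters raises ValueError instead of silently dropping the terminator.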

If you want to store a list of strings, then I suggest using a std::vector<std::string>:

import ROOT

strings = ROOT.vector('string')()

tree = ROOT.TTree('tree', 'tree')
tree.Branch('strings', strings)
strings.push_back('Hello')
strings.push_back('world!')
tree.Fill()
tree.Scan()

The output of tree.Scan() is:

***********************************
*    Row   * Instance *   strings *
***********************************
*        0 *        0 *     Hello *
*        0 *        1 *    world! *
***********************************

In a loop, call strings.clear() before filling the vector with the list of strings for the next entry.
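
Such a loop could look like the following sketch. The helper name fill_entries and its rows argument are hypothetical; tree and strings are assumed to be the TTree and the std::vector<std::string> bound to it in the example above. Only clear(), push_back(), and Fill() are called, as in the code already shown.

```python
def fill_entries(tree, strings, rows):
    """Write one tree entry per list of Python strings in rows."""
    for row in rows:
        # The branch keeps reading from this same vector, so empty it
        # first; otherwise strings from earlier entries would pile up.
        strings.clear()
        for text in row:
            strings.push_back(text)
        tree.Fill()


# Assumed usage with the objects created above:
# fill_entries(tree, strings, [['Hello', 'world!'], ['foo', 'bar', 'baz']])
```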

Now, the rootpy package (see also its repository on GitHub) provides a better way of creating trees in Python. Here is an example of how you can use char arrays in a "friendlier" way with rootpy:

from rootpy import stl
from rootpy.io import TemporaryFile
from rootpy.tree import Tree, TreeModel, CharArrayCol

class Model(TreeModel):
    # define the branches you want here
    # with branchname = branchvalue
    char_array = CharArrayCol(21)
    # the dictionary is compiled and cached for later
    # if not already available
    strings = stl.vector('string')

# create the tree inside a temporary file
with TemporaryFile():
    # all branches are created automatically according to your model above
    tree = Tree('tree', model=Model)

    tree.char_array = 'a' * 20
    # attempting to set char_array with a string of length 21 or longer will
    # result in a ValueError being raised.
    tree.strings.push_back('Hello')
    tree.strings.push_back('world!')
    tree.Fill()
    tree.Scan('', '', 'colsize=21')

The output of tree.Scan('', '', 'colsize=21') is:

***********************************************************************
*    Row   * Instance *            char_array *               strings *
***********************************************************************
*        0 *        0 *  aaaaaaaaaaaaaaaaaaaa *                 Hello *
*        0 *        1 *  aaaaaaaaaaaaaaaaaaaa *                world! *
***********************************************************************

See another example of using TreeModels with rootpy here:

https://github.com/rootpy/rootpy/blob/master/examples/tree/model_simple.py

ndawe
  • just a comment: `strings = ROOT.vector('string')()` worked for the new `BTagCalibrationReader`, which takes 3rd argument of type `const std::vector & otherSysTypes={}` as additional systematics of the calibration. Like in: `ROOT.BTagCalibrationReader( ROOT.BTagEntry.OP_MEDIUM, "central", bCalib_systs)`, where systs should be `["up", "down"]`. – xealits Aug 29 '17 at 17:49

You've defined the test member of a runLine as an array of 20 chars:

Char_t test[20];\

But then you're trying to pass it a list of two strings:

newline.test = ["AbcDefgHijkLmnOp","aaaaaaaaaaaaaaaaaaa"]

This doesn't make any sense in C (or CINT) or in Python, so of course it doesn't make any sense in PyROOT either.

Also, there seems to be a lot of confusion in your question. You say you need to pass PyROOT "an array, obviously, using not lists or dictionaries, but the array module"… but PyROOT doesn't particularly care about the Python array module. You've tagged your question numpy, which suggests you may be thinking of numpy rather than array as "the array module", but last time I checked (which, admittedly, was quite some time ago), numpy and PyROOT didn't interact at all; you had to explicitly ask numpy to export a buffer if you wanted something you could pass to PyROOT.

abarnert
  • I'm trying to use the test leaf to store an array of 20-character strings. If that's even possible, I would imagine I'd need some sort of nested array structure similar to python's nested lists, or a two dimensional array, although I don't know how ROOT would feel about being passed that. The numpy tag is there because I've seen multiple other questions mention it for similar problems, but I've been unable to find anything really relating in the documentation. – ecapstone Oct 30 '13 at 19:35
  • @user221884: First, what does the word `leaf` mean in that comment? Second, do you know basic C? In C, `Char_t[20]` is just an array of 20 characters. You need a second `[]` in there to make it an array of arrays (or a `*` to make it a pointer to arrays stored out-of-line). If you can show us some simple C for what you wanted to do, maybe we can explain how to do the same thing in Python, but as long as you're asking for something that doesn't mean anything sensible nobody can tell you how to do it. – abarnert Oct 30 '13 at 19:54
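
For illustration, the "second []" can be written from Python with the standard ctypes module, along the lines of the ctypes suggestion in the first comment on the question. This is only a sketch of the memory layout Char_t test[2][20]; the names Row and Table are made up here, and nothing in this thread confirms that a TTree branch accepts such a buffer.

```python
import ctypes

Row = ctypes.c_char * 20   # one fixed-size string: Char_t[20]
Table = Row * 2            # two of them: Char_t[2][20]

table = Table()  # zero-initialised, 40 contiguous bytes
for i, text in enumerate([b"AbcDefgHijkLmnOp", b"a" * 19]):
    # Keep at least one byte free for the terminating null.
    if len(text) >= ctypes.sizeof(Row):
        raise ValueError("string too long for a 20-byte row")
    table[i].value = text

# Reading .value stops at the first null byte of each row.
first, second = table[0].value, table[1].value
```

Each row is a separate 20-byte slot inside one contiguous block, which is what a two-dimensional C char array looks like in memory.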