I'm looking for a way -- using the LLVM API -- to obtain an identifier for a BasicBlock which I can use to look up (again via the API) the same block later.

Whatever this ID is, I need it to be "stable over serialisation" (remain valid and refer to the same block after a bitcode serialise/deserialise cycle).

The block ID needn't necessarily be globally unique: if the ID is unique to a function, I can make a globally unique pair by combining the block ID with the function's symbol name.

Candidates:

  • Index of the block in order of iteration (over the parent function's blocks). But is the order of iteration defined and stable over serialisation?
  • The string printed by `Node->printAsOperand()` (it writes to a `raw_ostream` rather than returning a StringRef). But can I query a function for the block using this as a key, or would I have to do a search with lots of string comparisons? And is this stable over serialisation?
  • Use `BasicBlock::setName()` to assign each block my own ID. This will work, but will bloat the bitcode.

Thank you.

Edd Barrett
  • Function name + delimiter + block name is unique and stable, if the delimiter doesn't occur anywhere. This is your chance to add obscenity or profanity. But pay attention to [these utility functions](https://llvm.org/doxygen/BasicBlockUtils_8h_source.html), which are called quite often. And [these](https://llvm.org/doxygen/Cloning_8h.html), BTW. – arnt Apr 21 '21 at 18:11
  • I was hoping to find a solution which didn't involve adding names to blocks, as it'd make the bitcode larger than it needs to be. Someone on discord has just told me that the iteration order of a function's blocks is stable, so something like `funcname_blockidx` should be possible... – Edd Barrett Apr 21 '21 at 18:20
  • You don't have to add names to blocks, you just use the names that are there already. The iteration order is indeed stable and may be used, but adding or removing blocks will then shift the indices of every block after the added or removed one. – arnt Apr 21 '21 at 19:26
  • The problem is that the blocks are all named "" by default (i.e. the empty string). I'm not planning on mutating the IR, so the block indices are probably what I need. Thanks. – Edd Barrett Apr 22 '21 at 08:59
  • Oh, I forgot that. Sorry. (All basic blocks have names in the small part of the universe where I am.) – arnt Apr 22 '21 at 09:02
  • Just to be sure, did you turn off discarding of value names? – Andrea Apr 23 '21 at 14:24
