
LLVM does not distinguish between signed and unsigned integers, as described here. There are situations where you need to know whether a particular variable should be interpreted as signed or unsigned, though, for instance when it is extended to a wider type or used in a division. My solution is to keep separate type information for every variable, describing whether it is an integer or a cardinal type.

However, I am wondering: isn't there a way to "attribute" a type in LLVM that way? I was looking for some sort of "user data" that could be attached to a type, but there seems to be nothing. It would have to be attached when the type is created, since LLVM creates each distinct type only once.

My question therefore is:

Is there a way to track whether an integer variable should be interpreted as signed or unsigned within the LLVM infrastructure, or is the only way indeed to keep separate information like I do?

Thanks

Jens
Rick
    If you're writing a compiler, this is typically information you'd maintain yourself in the frontend, independently from LLVM. – Ismail Badawi May 29 '15 at 01:42

1 Answer


First of all, make sure that you actually need to insert extra type metadata: Clang already handles signed integer operations appropriately, for example by emitting sdiv and srem rather than udiv and urem.
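
To illustrate the point about choosing the instruction in the front-end, here is a minimal C++ sketch of a front-end that keeps its own signedness flag per variable and picks the LLVM IR opcode from it. The names `IntInfo`, `SymbolTable`, `divOpcode`, `remOpcode`, `extOpcode`, and `emitDiv` are hypothetical and not part of LLVM; the sketch only builds textual IR and does not call the LLVM API.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical front-end record: LLVM's iN types carry no sign,
// so the front-end remembers it per source-level variable.
struct IntInfo {
    unsigned bits;   // width of the LLVM integer type, e.g. 32 for i32
    bool isSigned;   // true for an "integer", false for a "cardinal"
};

// Hypothetical symbol table: variable name -> front-end type information.
using SymbolTable = std::unordered_map<std::string, IntInfo>;

// The signedness decides which IR instruction is emitted.
std::string divOpcode(const IntInfo& t) { return t.isSigned ? "sdiv" : "udiv"; }
std::string remOpcode(const IntInfo& t) { return t.isSigned ? "srem" : "urem"; }
std::string extOpcode(const IntInfo& t) { return t.isSigned ? "sext" : "zext"; }

// Build the textual IR for `dst = lhs / rhs`.
// Assumption of this sketch: both operands already have the same type.
std::string emitDiv(const SymbolTable& syms, const std::string& dst,
                    const std::string& lhs, const std::string& rhs) {
    const IntInfo& l = syms.at(lhs);
    const IntInfo& r = syms.at(rhs);
    assert(l.bits == r.bits && l.isSigned == r.isSigned);
    return "%" + dst + " = " + divOpcode(l) + " i" + std::to_string(l.bits) +
           " %" + lhs + ", %" + rhs;
}
```

With `a` and `b` registered as signed 32-bit variables, `emitDiv(syms, "q", "a", "b")` yields `%q = sdiv i32 %a, %b`; the same call on unsigned variables yields a `udiv`. After this point the signedness lives only in the chosen opcode, not in the type.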

Additionally, it is possible to use that to implement some lightweight type inference based on how the variables are accessed in the IR. Note that an operation like add doesn't need signedness information, since it is based on the two's-complement representation.
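
The difference between sign-agnostic and sign-dependent operations can be checked on plain 8-bit values. The following C++ sketch is only an illustration: the helper names (`addBits`, `udiv8`, `sdiv8`, `zext16`, `sext16`) are made up here to mirror the IR instructions and are not LLVM functions. It assumes a two's-complement target, which C++20 guarantees and earlier compilers provide in practice.

```cpp
#include <cstdint>

// All helpers take and return raw bit patterns (unsigned storage), the way
// an LLVM i8/i16 value is just bits without a sign.

// add: one instruction serves both interpretations, the result wraps mod 256.
std::uint8_t addBits(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// udiv: treat both patterns as unsigned. The divisor must be non-zero.
std::uint8_t udiv8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a / b);
}

// sdiv: reinterpret the same patterns as signed before dividing.
// The divisor must be non-zero.
std::uint8_t sdiv8(std::uint8_t a, std::uint8_t b) {
    int q = static_cast<std::int8_t>(a) / static_cast<std::int8_t>(b);
    return static_cast<std::uint8_t>(q);
}

// zext: widen by filling the new high bits with zeros.
std::uint16_t zext16(std::uint8_t a) {
    return a;
}

// sext: widen by copying the sign bit into the new high bits.
std::uint16_t sext16(std::uint8_t a) {
    return static_cast<std::uint16_t>(
        static_cast<std::int16_t>(static_cast<std::int8_t>(a)));
}
```

Take the pattern 0xF0, which reads as 240 unsigned and as -16 signed. `addBits(0xF0, 0x20)` gives 0x10 under either reading, while `udiv8(0xF0, 0x02)` gives 0x78 (120) and `sdiv8(0xF0, 0x02)` gives 0xF8 (-8). Likewise `zext16(0xF0)` is 0x00F0 and `sext16(0xF0)` is 0xFFF0, which is why division and extension need the signedness and add does not.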

Otherwise, I think that the best way to do that is to modify the front-end (Clang) to add some custom DWARF debug info. Here is a link that might get you started.

UPDATE: If your goal is to implement static analysis directly on LLVM IR, this paper offers a thorough discussion:

Navas, J.A., Schachte, P., Søndergaard, H., Stuckey, P.J.: Signedness-agnostic program analysis: Precise integer bounds for low-level code. In: Jhala, R., Igarashi, A. (eds.) APLAS 2012. LNCS, vol. 7705, pp. 115–130. Springer, Heidelberg (2012)

Codoka
    Thanks. I'm writing my own frontend though, and not for C either, so I'm not using Clang. I will check out the idea about the metadata; I haven't paid much attention to it so far, but I will read up on it to see whether I can use it that way. – Rick May 30 '15 at 23:12