
How can I change the Erlang VM to use a random 128-bit value for one of its pid values?

It seems the largest value at this time that I can set is:

32> pid(1, 32767, 8191).
** exception error: bad argument
     in function  list_to_pid/1
        called as list_to_pid("<1.32767.8191>")
     in call from c:pid/3 (c.erl, line 419)
33> pid(0, 32767, 8191).
<0.32767.8191>

It looks like pid generation comes down to something like this in erts/emulator/beam/erl_ptab.h:283:

ERTS_GLB_INLINE Eterm
erts_ptab_make_id(ErtsPTab *ptab, Eterm data, Eterm tag)
{
    HUint huint;
    Uint32 low_data = (Uint32) data;
    low_data &= (1 << ERTS_PTAB_ID_DATA_SIZE) - 1;
    low_data <<= ERTS_PTAB_ID_DATA_SHIFT;
    huint.hval[ERTS_HUINT_HVAL_HIGH] = erts_ptab_data2pix(ptab, data);
    huint.hval[ERTS_HUINT_HVAL_LOW] = low_data | ((Uint32) tag);
    return (Eterm) huint.val;
}
Seki
Eric des Courtis

1 Answer


Why do you want to do this? Creating a pid does not guarantee that there is a process with that pid, or that there ever will be one; only a return from spawn ensures that. Read the answers to "Can someone explain the structure of a Pid in Erlang?" to get an explanation of what the various fields mean. It will help explain why you can't just set it to any value.

You can set the size of the process table when you start Erlang with the '+P Number' option. This gives the maximum value of the second field.

EDIT: Just some more comments about the question and the comments below.

Note that a pid (Process Identifier) is just a reference to a process; it is not the process itself. When you spawn a process you get both a new process and a new pid referring to it. When you create a pid with either pid/3 in the shell or list_to_pid/1, you get just a pid, which may or may not refer to a process.

There is today no way in the BEAM to control which pid you get when you create a process. If you really need this functionality you would have to go in and modify the BEAM internally to do that. Considering how the BEAM is structured internally (with a process table) and how a pid is structured, that could be very difficult to do. For example, one field in a pid is the index of the process in the process table, so it is illegal to have two different pids with the same table index.
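To make the fixed-width point concrete, here is a minimal C sketch. It is not the ERTS layout: the field widths (15 and 13 bits) are assumptions taken only from the largest values the shell accepted in the question (32767 and 8191), and all the toy_pid_* names are hypothetical. Under those assumptions the identifier has room for 28 bits of data, so a 128-bit random value cannot be stored in it without changing the representation itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed widths, inferred from pid(0, 32767, 8191) being the largest
 * value the shell accepted. Not taken from the ERTS sources. */
#define TOY_NUMBER_BITS 15u
#define TOY_SERIAL_BITS 13u
#define TOY_NUMBER_MAX ((1u << TOY_NUMBER_BITS) - 1u) /* 32767 */
#define TOY_SERIAL_MAX ((1u << TOY_SERIAL_BITS) - 1u) /* 8191 */

/* True when both fields fit in their assumed widths. */
static bool toy_pid_fields_fit(uint32_t number, uint32_t serial)
{
    return number <= TOY_NUMBER_MAX && serial <= TOY_SERIAL_MAX;
}

/* Pack both fields into one 28-bit value: serial in the high bits,
 * number in the low bits. The caller must pass fields that fit. */
static uint32_t toy_pid_pack(uint32_t number, uint32_t serial)
{
    assert(toy_pid_fields_fit(number, serial));
    return (serial << TOY_NUMBER_BITS) | number;
}

/* Recover the low (number) field from a packed value. */
static uint32_t toy_pid_number(uint32_t packed)
{
    return packed & TOY_NUMBER_MAX;
}

/* Recover the high (serial) field from a packed value. */
static uint32_t toy_pid_serial(uint32_t packed)
{
    return (packed >> TOY_NUMBER_BITS) & TOY_SERIAL_MAX;
}
```

With these widths, toy_pid_pack(32767, 8191) already uses every available bit (0x0FFFFFFF), and toy_pid_fields_fit(32768, 0) is false, which matches the kind of limit the shell reported.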

Wouldn't a better solution be to create an identifier/pid table instead?
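A minimal sketch of such a table, written in C to match the excerpt in the question, might look like the following. Everything here is hypothetical (the cap_* names, the 64-slot capacity, and the uint32_t standing in for a real pid), and the caller is assumed to supply each 128-bit token from a cryptographically secure random source, which the sketch does not implement. In an Erlang system the same idea would more naturally live in an ETS table or in a map held by a gatekeeper process.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CAP_TOKEN_BYTES 16 /* 128 bits */
#define CAP_TABLE_SLOTS 64 /* arbitrary capacity for this sketch */

typedef struct {
    bool in_use;
    uint8_t token[CAP_TOKEN_BYTES]; /* unguessable identifier handed out */
    uint32_t target;                /* stand-in for the real pid */
} cap_entry;

typedef struct {
    cap_entry slots[CAP_TABLE_SLOTS];
} cap_table;

/* Mark every slot as free. */
static void cap_table_init(cap_table *t)
{
    memset(t, 0, sizeof *t);
}

/* Store token -> target in the first free slot. Returns false if full.
 * The token must come from a secure random source (not shown here). */
static bool cap_table_grant(cap_table *t, const uint8_t *token, uint32_t target)
{
    for (size_t i = 0; i < CAP_TABLE_SLOTS; i++) {
        if (!t->slots[i].in_use) {
            memcpy(t->slots[i].token, token, CAP_TOKEN_BYTES);
            t->slots[i].target = target;
            t->slots[i].in_use = true;
            return true;
        }
    }
    return false;
}

/* Compare two tokens without exiting early on the first mismatch. */
static bool cap_token_equal(const uint8_t *a, const uint8_t *b)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < CAP_TOKEN_BYTES; i++) {
        diff |= (uint8_t)(a[i] ^ b[i]);
    }
    return diff == 0;
}

/* Look up a token. On success writes the target and returns true. */
static bool cap_table_resolve(const cap_table *t, const uint8_t *token,
                              uint32_t *target_out)
{
    for (size_t i = 0; i < CAP_TABLE_SLOTS; i++) {
        if (t->slots[i].in_use && cap_token_equal(t->slots[i].token, token)) {
            *target_out = t->slots[i].target;
            return true;
        }
    }
    return false;
}
```

The point of the design is that the pid keeps whatever value the system assigns, while only the 128-bit token is handed to outside parties; a request carrying an unknown token simply fails to resolve.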

rvirding
  • I want to create a process with a pid containing a random 128-bit value, so that I can use it as a capability. http://en.wikipedia.org/wiki/Capability-based_security – Eric des Courtis May 30 '13 at 04:09
  • @EricdesCourtis when you create a process you have no control over which pid it has; the system takes the "next" free one. All you can be certain of is that it is unique. – rvirding May 30 '13 at 12:13
  • Understood, but my question is specifically how to modify the Beam VM to get this property. – Eric des Courtis May 30 '13 at 12:16
  • If I can change the pid so that it is unpredictable and impossible to brute-force, then I can expose my VM directly to the internet without worrying about getting hacked. The only thing that can access it would be someone or something that was given the pid to begin with. – Eric des Courtis May 30 '13 at 12:23
  • Today the BEAM cannot do this and it is designed in such a way that it would not be easy to do. You would have to hack the BEAM internals to do this. I would never trust a BEAM exposed to the internet. – rvirding May 30 '13 at 12:44
  • Interesting, could you shed some light as to why this would be difficult? Also why wouldn't you trust it? – Eric des Courtis May 30 '13 at 12:46
  • The second question is easy to answer: just being generally mistrustful and not wanting to connect anything directly to the internet. Also, the BEAM is totally insecure in that once you are in you can do anything. It was designed to be used behind a firewall, where you only let trusted systems access it directly, without control of what they do. – rvirding May 30 '13 at 13:14
  • For the first question I would just say that the structure and fields of a pid directly mirror the internal handling of processes: for example, a node field, a field that is an index into the process table, and a counter field. – rvirding May 30 '13 at 13:16
  • Understood. There is this interesting master's thesis on the subject: http://www.erlang.se/publications/xjobb/0109-naeser.pdf I am curious about the implementation challenges. – Eric des Courtis May 30 '13 at 13:17
  • Which field do you think would be the easiest to change? – Eric des Courtis May 30 '13 at 13:19
  • By the way, have you verified that the second field can be larger than 32767 when +P is added? It doesn't seem to work for me. – Eric des Courtis May 30 '13 at 13:22
  • Then I will have to check exactly what it means. – rvirding May 30 '13 at 13:33