
I'm working on a simple compile-to-C language with Python-like syntax. Here is some sample source code:

# All comments start with pound signs

# Global variable declarations
speed = 4
motor = 69.5
text = "hey +  guys!"
junk =   5    +4

# Move function
def move():
  speed = speed + 1
  print speed

# Main function (program entry)
def main():
  localvar = 43.2
  move()
  if true:
    print localvar

Like Python, the language emphasizes readability through significant indentation. It also has a very loose type system: variables are never declared with an explicit type, and each type is determined from context.

object = 5            # Creates an integer
object_two = "stuff"  # Creates a string
object_three = 5.23   # Creates a float
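
To make "determined by the context" concrete, here is a minimal sketch in Python of how the type of a simple initializer might be guessed. The function name `infer_literal_type` and the `UNKNOWN` tag are hypothetical, and the sketch only handles literal-only expressions; anything that mentions another variable is left unresolved.

```python
import re

def infer_literal_type(expr):
    """Guess INT, FLOAT, or STRING for a literal-only right-hand side."""
    expr = expr.strip()
    # A right-hand side wrapped in double quotes is treated as a string.
    if len(expr) >= 2 and expr[0] == '"' and expr[-1] == '"':
        return "STRING"
    # Any identifier means the type depends on something declared elsewhere.
    if re.search(r"[A-Za-z_]\w*", expr):
        return "UNKNOWN"
    # Collect numeric literals; a single decimal point promotes to FLOAT.
    numbers = re.findall(r"\d+\.\d+|\d+", expr)
    if not numbers:
        return "UNKNOWN"
    return "FLOAT" if any("." in n for n in numbers) else "INT"

for rhs in ['5', '"stuff"', '5.23', '5    +4', 'speed + 1']:
    print(rhs, "->", infer_literal_type(rhs))
```

The `UNKNOWN` case is exactly where the difficulty starts: `speed + 1` can only be typed once the type of `speed` is known.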

The sample source code I have above is internally represented as such:

[
  [
    "GLOBAL",
    [
      "speed = 4",
      "motor = 69.5",
      "text = \"hey +  guys!\"",
      "junk =   5    +4"
    ],
    [
      "SCOPE",
      [
        "speed",
        "motor",
        "text",
        "junk"
      ],
      [
        "INT",
        "FLOAT",
        "STRING",
        "INT"
      ],
      [
        0,
        1,
        2,
        3
      ]
    ]
  ],
  [
    "def move():",
    [
      "  speed = speed + 1",
      "  print speed"
    ],
    [
      "SCOPE",
      [
        "speed",
        "motor",
        "text",
        "junk"
      ],
      [
        "GLOBAL",
        "GLOBAL",
        "GLOBAL",
        "GLOBAL"
      ],
      [
        0,
        1,
        2,
        3
      ]
    ]
  ],
  [
    "def main():",
    [
      "  localvar = 43.2",
      "  move()",
      "  if true:",
      "    print localvar"
    ],
    [
      "SCOPE",
      [
        "speed",
        "motor",
        "text",
        "junk",
        "localvar"
      ],
      [
        "GLOBAL",
        "GLOBAL",
        "GLOBAL",
        "GLOBAL",
        "FLOAT"
      ],
      [
        0,
        1,
        2,
        3,
        0
      ]
    ]
  ]
]

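Every function is packed into this representation as a block of the form [header, body lines, scope]. The scope lists the visible variables, their types (or GLOBAL for variables declared in the global block), and the index of the line on which each variable is declared, relative to the block that declares it.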
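
As a sketch of how this structure can be queried, the Python snippet below looks up a variable's type in one block and follows a GLOBAL marker back to the global block. `SAMPLE_IR` is a trimmed copy of the representation above (body lines omitted, only two globals kept), and `resolve_type` is a hypothetical helper name.

```python
# Trimmed copy of the representation shown above; body lines are left empty.
SAMPLE_IR = [
    ["GLOBAL", [],
     ["SCOPE", ["speed", "motor"], ["INT", "FLOAT"], [0, 1]]],
    ["def move():", [],
     ["SCOPE", ["speed", "motor"], ["GLOBAL", "GLOBAL"], [0, 1]]],
    ["def main():", [],
     ["SCOPE", ["speed", "motor", "localvar"],
      ["GLOBAL", "GLOBAL", "FLOAT"], [0, 1, 0]]],
]

def resolve_type(ir, block_header, name):
    """Return the type of `name` as seen from the block with this header."""
    blocks = {block[0]: block for block in ir}
    _, names, types, _ = blocks[block_header][2]
    if name not in names:
        return None  # not visible in this block
    kind = types[names.index(name)]
    if kind == "GLOBAL":
        # Follow the marker to the global block's own scope table.
        _, g_names, g_types, _ = blocks["GLOBAL"][2]
        return g_types[g_names.index(name)]
    return kind

print(resolve_type(SAMPLE_IR, "def main():", "localvar"))
print(resolve_type(SAMPLE_IR, "def move():", "speed"))
```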

I'm trying to convert this intermediate representation into actual C code (strictly speaking it is NXC code, which differs slightly from C).
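
For variables whose types are already known, the conversion step could look roughly like the Python sketch below. The `C_TYPES` mapping and the function name `emit_declarations` are assumptions made for illustration, as is the choice to copy the initializer text through unchanged.

```python
# Assumed mapping from the representation's type tags to NXC type names.
C_TYPES = {"INT": "int", "FLOAT": "float", "STRING": "string"}

def emit_declarations(body_lines, scope):
    """Turn each variable declared in this block into a typed declaration."""
    _, names, types, line_indexes = scope
    declarations = []
    for name, kind, index in zip(names, types, line_indexes):
        if kind == "GLOBAL":
            continue  # already declared by the global block
        # Reuse the original initializer text to the right of the first '='.
        initializer = body_lines[index].split("=", 1)[1].strip()
        declarations.append(f"{C_TYPES[kind]} {name} = {initializer};")
    return declarations

GLOBAL_LINES = ["speed = 4", "motor = 69.5",
                "text = \"hey +  guys!\"", "junk =   5    +4"]
GLOBAL_SCOPE = ["SCOPE", ["speed", "motor", "text", "junk"],
                ["INT", "FLOAT", "STRING", "INT"], [0, 1, 2, 3]]

for line in emit_declarations(GLOBAL_LINES, GLOBAL_SCOPE):
    print(line)
```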

My question is: how can I determine variable types, particularly the types of function parameters? The only approach I can think of is to guess from the context in which the function is called.

On top of that, I'm building the intermediate representation in a single linear pass. What happens if a function is defined but not called until later on? Will I have to make several passes over the intermediate representation, refining it each time, until I have all the necessary type information?
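
To illustrate what "several runs" might mean, here is a rough Python sketch that repeats passes over every call site until no parameter type changes. Everything in it is hypothetical: the `functions` layout, the names `scale` and `show`, and the `CONFLICT` tag used when the same parameter receives different types from different callers. It is not the representation shown above, which has no parameters yet.

```python
def infer_param_types(functions):
    """Propagate argument types to parameters until a pass changes nothing.

    `functions` maps a function name to a dict with:
      "params": list of parameter names
      "locals": dict of variable name -> already-known type
      "calls":  list of (callee name, [argument variable names])
    """
    param_types = {name: {} for name in functions}
    changed = True
    while changed:
        changed = False
        for caller, info in functions.items():
            known = {**info["locals"], **param_types[caller]}
            for callee, args in info["calls"]:
                for param, arg in zip(functions[callee]["params"], args):
                    arg_type = known.get(arg)
                    if arg_type is None:
                        continue  # not known yet; a later pass may resolve it
                    old = param_types[callee].get(param)
                    if old is None:
                        param_types[callee][param] = arg_type
                        changed = True
                    elif old != arg_type and old != "CONFLICT":
                        # Called with different types in different places.
                        param_types[callee][param] = "CONFLICT"
                        changed = True
    return param_types

# Functions are listed before their callers, so one pass is not enough.
EXAMPLE = {
    "show":  {"params": ["v"], "locals": {}, "calls": []},
    "scale": {"params": ["x"], "locals": {}, "calls": [("show", ["x"])]},
    "main":  {"params": [], "locals": {"localvar": "FLOAT"},
              "calls": [("scale", ["localvar"])]},
}
print(infer_param_types(EXAMPLE))
```

In this example the first pass can only type `x` in `scale`; `v` in `show` is resolved on the second pass, and a third pass confirms that nothing else changes. Each parameter can only move from unknown to a type to `CONFLICT`, so the loop terminates.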

turnt
  • You may have to do multiple passes, yes, if you are going to allow functions to be used before they are defined. You pretty much need to either allow that or provide some kind of prototype syntax. Build up an abstract syntax tree in the parser and traverse that as needed. But, for God's sake, don't make whitespace significant. – Charlie Burns Oct 06 '13 at 22:12
  • Also, think about what you can do at compile time versus what HAS to be done at runtime with this type of language. It's not going to be easy to compile a Python-like language into C without some kind of VM behind it. – Charlie Burns Oct 06 '13 at 22:15
  • Have you thought about what to do if the same function is called with _different_ argument types in different places in your program? There are many cases where this makes sense. – alexis Oct 06 '13 at 22:53
