Accessing an "out-of-bounds" index in an interpreted versus a compiled language

Question

What is the difference between accessing an out-of-bounds (negative, or otherwise inaccessible) index in a compiled programming language (such as C) versus an interpreted language (such as MATLAB)?

As per the recommendation of this site, I have researched a number of threads concerning out-of-bounds index access. Most of these threads, however, focus only on resolving an issue with a specific piece of source code. That said, I was able to gather from this site that accessing an out-of-bounds index in C results in undefined behavior. From experimenting with MATLAB, my guess is that interpreted languages test whether an index is accessible and "catch" poorly written code before an out-of-bounds index is accessed. Is this actually the case with interpreted languages in general, or do they, like C (a compiled language), allow some degree of undefined behavior? Does accessing an out-of-bounds index cause undefined behavior in every compiled language?

It depends on the language. Java and C# are compiled, yet they still perform checks for indices. It is not about compiled/interpreted. — Sami Kuhmonen, Jun 27 '15 at 06:45
Ah, so MATLAB does, in fact, check for out-of-bounds indices, then? That being the case, I suppose the "out-of-bounds" memory is never actually accessed? — Whippy_Raton, Jun 27 '15 at 06:52
Many languages check for out of bounds access and refuse to abuse memory (Pascal, for instance). It is a compiled language. Other languages simply don't have out of bounds elements; if you try to access a non-existent element, the element is created (think `awk`, Perl, ...). Other languages simply leave it as undefined behaviour (C, C++). You get what you get, which may or may not be what you deserve or expect — it usually isn't what you intended, anyway. — Jonathan Leffler, Jun 27 '15 at 06:57
I suppose that the handling of out-of-bounds access is simply intrinsic to the language itself then, and not to the way it is implemented. I believe that nullifies my question, and I'm grateful for the insight. Thank you. Out of pure interest, however, how do compiled languages check for out-of-bounds access? Is this done before compiling? — Whippy_Raton, Jun 27 '15 at 07:04
It's almost always done at run time. For tests like bounds checking to be performed at compile time you need a very "rich" type system, e.g. dependent types. In languages without automatic bounds checking you can usually implement it yourself, or use library functions such as C++'s `vector::at`: http://en.cppreference.com/w/cpp/container/vector/at — Daniel Jour, Jun 27 '15 at 07:36

Gil · Answer 1 · 2015-06-27T16:03:26.630

Some languages leave it as an implementation detail, while others clearly specify the expected behavior. For several programming languages this has changed over time.

Regarding C, it is legitimate (and useful) to use a negative index on a pointer that points into the middle of an array, because C tries not to limit your capabilities as a programmer. An index that lands outside the underlying object, however, is undefined behavior as far as the language standard is concerned, and in practice it may lead to crashes or to code/data corruption (intended or not). If you know how a particular C implementation lays out memory, there is not much uncertainty about what will usually happen with mis-addressed stack-based or malloc-based memory blocks, although the standard itself guarantees nothing. C compilers may issue warnings during compilation to help prevent errors such as unintended negative array indexes.
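To make the negative-index point concrete, here is a minimal sketch. The function names `previous_element` and `demo_negative_index` are made up for illustration; the only assumption is that the pointer really does point far enough into an array for the negative offset to stay inside it.

```c
/* Hypothetical helper: returns the element just before `mid`.
   Only valid when `mid` points at least one element past the
   start of the array it belongs to. */
int previous_element(const int *mid)
{
    return mid[-1];              /* same as *(mid - 1) */
}

/* Hypothetical demo: index relative to the middle of an array. */
int demo_negative_index(void)
{
    int data[5] = {10, 20, 30, 40, 50};
    const int *mid = &data[2];   /* points at 30 */

    /* mid[-2] is data[0] and mid[2] is data[4]: both inside the array. */
    return mid[-2] + mid[2];
}
```

Here `demo_negative_index()` returns 60. By contrast, `data[-1]` or `mid[-3]` would step outside the array, and that is where the undefined behavior starts.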

Other languages decide for the programmer and try to block these actions, either at compile time (Pascal is a good old example) or at run time (JIT compilers, VMs, etc.). There is no general rule: the behavior is defined only where the language specification defines it.

Even in C, there are many ways to limit unintended damage, such as guard memory areas (inaccessible pages) surrounding the array's memory block. The resulting fault can then be handled by a signal handler.
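A rough sketch of the guard-area idea, assuming a POSIX system (`mmap`, `mprotect`, `sigaction`); nothing here is standard C, and the names `on_fault` and `guard_page_catches_overrun` are hypothetical. It places one inaccessible page right after a one-page buffer and recovers from the fault with `sigsetjmp`/`siglongjmp`. Jumping out of a fault handler like this is fine for a demonstration, but I would treat it as an assumption rather than a recommended pattern.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* assumption: glibc, exposes MAP_ANONYMOUS */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_jump;   /* where the handler returns to */

/* Signal handler: abandon the faulting access and jump back. */
static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(fault_jump, 1);
}

/* Hypothetical demo. Returns 1 if a write one byte past the buffer was
   stopped by the guard page, 0 if it went through, -1 on setup failure. */
int guard_page_catches_overrun(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t size = (size_t)page;

    /* Map two pages: the first is the usable buffer, the second the guard. */
    char *base = mmap(NULL, 2 * size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;
    if (mprotect(base + size, size, PROT_NONE) != 0) {
        munmap(base, 2 * size);
        return -1;
    }

    /* Install the handler, remembering the previous dispositions. */
    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);   /* some systems raise SIGBUS instead */

    int caught;
    if (sigsetjmp(fault_jump, 1) == 0) {
        volatile char *buf = base;
        buf[size - 1] = 'x';   /* last valid byte: fine */
        buf[size] = 'y';       /* first guard byte: faults */
        caught = 0;            /* only reached if no fault occurred */
    } else {
        caught = 1;            /* we got here via the handler */
    }

    /* Restore the old handlers and release the mapping. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(base, 2 * size);
    return caught;
}
```

Note that the guard only catches accesses that actually land on the protected page. An overrun that stays within the same page as the buffer, or that jumps past the guard, goes unnoticed.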

Since the runtimes of many other languages are themselves implemented in C or C++, this is also how 'more modern' programming languages handle these issues (such as a negative array index), whether as part of the specification or of the implementation. Explicit tests for negative or too-large array indexes may also be used, at the cost of some performance.
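The kind of explicit test mentioned above can be sketched by hand in C. This is only an illustration of the idea, not how any particular language runtime does it; `checked_get` is a hypothetical name, and the caller has to pass the array length because a plain C pointer does not carry it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checked read. Stores array[index] in *out and returns true
   when 0 <= index < length; otherwise leaves *out untouched and returns
   false, without ever touching memory outside the array. */
bool checked_get(const int *array, size_t length, ptrdiff_t index, int *out)
{
    if (index < 0 || (size_t)index >= length) {
        return false;            /* out of bounds: refuse the access */
    }
    *out = array[index];
    return true;
}
```

The performance penalty is the extra comparison and branch on every access, which is the price bounds-checked languages pay unless their compiler can prove the check redundant.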

C# or Java variables take more space than their C counterparts because the runtime stores additional information with them (locks, garbage-collection data, guard areas, etc.), while wasted space for C variables usually results only from alignment, unless the programmer replaces the default behavior with something more sophisticated.
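For the C side of that comparison, the alignment overhead can be observed directly with `sizeof` and `offsetof`. A small sketch, where the struct `sample` and the two helper functions are made up for illustration:

```c
#include <stddef.h>

/* Hypothetical record: a 1-byte tag followed by an int. */
struct sample {
    char tag;
    int  value;
};

/* Bytes in the struct that hold no member data (alignment padding). */
size_t padding_bytes(void)
{
    return sizeof(struct sample) - (sizeof(char) + sizeof(int));
}

/* Where the compiler actually placed `value` inside the struct. */
size_t value_offset(void)
{
    return offsetof(struct sample, value);
}
```

On common platforms with a 4-byte, 4-byte-aligned `int`, the padding is typically 3 bytes and `value` sits at offset 4, but the exact numbers are implementation-defined.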
