You're misunderstanding things a bit. Let's start by explaining the foundation of how computers work internally. I'll use simple and practical concepts here. For the underlying theories, read about Turing machines. So, what's your machine made up of? All computers have two basic components: a processor and a memory.
The memory is a sequential group of "cells" that works sort of like a table. If you "write" a value into the Nth cell, you can later retrieve that same value by "reading" from the Nth cell. This is how computers "remember" things. If a computer is to perform a calculation, it needs to fetch the input data from somewhere and put the output data somewhere. That somewhere is the memory. In practice, the memory is what we call RAM, short for random-access memory.
Then we have the processor. Its job is to perform the actual calculations on memory. Which operations it performs is dictated by a program, that is, a series of instructions the processor is able to understand and execute. The processor decodes and executes one instruction, then the next one, and so on until the program halts (stops) the machine. If the instruction is "add cell #1 and cell #2 and store the result in cell #3", the processor will grab the values at cells 1 and 2, add them together, and store the result into cell 3.
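If it helps to make that concrete, here is a minimal C sketch of the same idea, with an ordinary array playing the role of the memory cells (the cell numbers are just the ones from the example above, nothing more):

#include <stdio.h>

int main(void)
{
    /* A toy "memory": a sequential row of numbered cells. */
    int memory[8] = {0};

    memory[1] = 2;   /* write the value 2 into cell #1 */
    memory[2] = 3;   /* write the value 3 into cell #2 */

    /* "add cell #1 and cell #2 and store the result in cell #3" */
    memory[3] = memory[1] + memory[2];

    printf("cell #3 = %d\n", memory[3]);   /* reads back 5 */
    return 0;
}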
Now we come to an interesting question: where is the program itself stored, if anywhere? It can't simply be hardwired into the circuitry; otherwise, the system would be no more of a computer than your microwave. There are two classic approaches to this problem: the Harvard architecture and the von Neumann architecture.
Basically, in the Harvard architecture, the data is stored in the memory, as usual, while the code (or program) is stored somewhere else, usually in read-only memory. In the von Neumann architecture, code is stored in memory and is just another form of data. As a result, code is data, and data is code. It's worth noting that most modern systems use the von Neumann architecture for several reasons, including the fact that it is the only practical way to implement just-in-time compilation, an essential part of the runtime systems for modern bytecode-based programming languages such as Java.
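To illustrate the "code is data" point, here is a small sketch that reads a function's own machine code back as plain bytes. It is not strictly portable C (casting a function pointer to a data pointer is outside the standard), but on typical desktop von Neumann systems it works, precisely because the compiled code of add() sits in ordinary, readable memory:

#include <stdio.h>

static int add(int a, int b) { return a + b; }

int main(void)
{
    /* Reinterpret the function's address as a pointer to raw bytes.
       On a Harvard machine with separate code memory, this kind of
       access wouldn't even make sense. */
    const unsigned char *p = (const unsigned char *)add;

    /* Dump the first few bytes of add()'s machine code. */
    for (int i = 0; i < 8; i++)
        printf("%02x ", p[i]);
    putchar('\n');
    return 0;
}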
We now know what the machine does and how it does it. But how are both data and code actually stored? What's the "underlying format", and how is it to be interpreted? You've probably heard of this thing called the binary numeral system. In our usual decimal numeral system, we have ten digits, zero through nine. But why exactly ten digits? Couldn't there be eight, or sixteen, or sixty, or even just two? (A unary, base-1 system isn't practical for building a computational machine, so two is as low as you can reasonably go.)
Have you heard that computers are "logical and cold"? Both are true... unless your machine has an AMD processor or a special kind of Pentium. Logic theory states that every logical predicate can be reduced to either "true" or "false"; that is to say, "true" and "false" are the basis of logic. Besides, computers are made up of electrical cruft, no? A light switch is either on or off, no? So, at the electrical level, we can easily distinguish two voltage levels, right? And we want to handle logical stuff, such as numbers, in computers, right? So zero and one it is; not only do they work, they're the only really feasible choice.
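As a tiny illustration, here is a C sketch that prints the binary digits of an ordinary decimal number, just to show that the two notations are the same value written with different sets of digits:

#include <stdio.h>

int main(void)
{
    unsigned int n = 13;   /* thirteen, written in our usual decimal notation */

    /* Print the same value using only the digits 0 and 1:
       13 = 8 + 4 + 1, so this prints 00001101. */
    for (int bit = 7; bit >= 0; bit--)
        putchar(((n >> bit) & 1u) ? '1' : '0');
    putchar('\n');
    return 0;
}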
Now, with all that theory in mind, let's talk about programming languages and assembly languages. An assembly language is a way to express binary instructions in a form that is (supposedly) readable to human programmers. For instance, something like this...
ADD 0, 1 # Add cells 0 and 1 together and store the result in cell 0
Could be translated by an assembler into something like...
110101110000000000000001
Both are equivalent, but humans will only understand the former, and processors will only understand the latter.
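If you want to see what such a translation could look like in code, here is a toy C sketch of an "assembler" for that single instruction. The instruction format is completely made up (an 8-bit opcode followed by two 8-bit cell numbers, with 0xD7 as a hypothetical opcode for ADD), chosen only so the output matches the 24-bit string above:

#include <stdio.h>

/* Made-up 24-bit instruction format: 8-bit opcode, then two 8-bit cell numbers. */
enum { OP_ADD = 0xD7 };   /* hypothetical encoding for ADD */

static unsigned int assemble_add(unsigned int dst, unsigned int src)
{
    return (OP_ADD << 16) | (dst << 8) | src;
}

int main(void)
{
    unsigned int word = assemble_add(0, 1);   /* "ADD 0, 1" */

    /* Dump the 24 bits our toy "assembler" produced:
       110101110000000000000001 */
    for (int bit = 23; bit >= 0; bit--)
        putchar(((word >> bit) & 1u) ? '1' : '0');
    putchar('\n');
    return 0;
}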
A compiler is a program that translates input data that is expected to conform to the rules of a given programming language into another, usually lower-level form. For instance, a C compiler may take this code...
x = some_function(y + z);
And translate it into assembly code such as (of course this is not real assembly, BTW!)...
# Assume x is at cell 1, y at cell 2, and z at cell 3.
# Assume that, when calling a function, the first argument
# is at cell 16, and the result is stored in cell 0.
MOVE 16, 2
ADD 16, 3
CALL some_function
MOVE 1, 0
And the assembler will spit out something like (no, this is not random)...
11101001000100000000001001101110000100000000001110111011101101111010101111101111110110100111010010000000100000000
Now, let's talk about another language, namely Java. Java's compiler does not give you assembly or raw machine code; it gives you bytecode. Bytecode is... something like a generic, higher-level form of assembly language that the CPU itself can't understand (there are exceptions), but that another program running directly on the CPU does. This is why the claim that some badly informed people spread around, that "both interpreted and compiled programs ultimately boil down to machine code", is false. If, for example, the interpreter is written in C and contains this line of code...
Bytecode some_bytecode;
/* ... */
execute_bytecode(&some_bytecode);
(Note: I won't translate that into assembly/binary again!) The processor executes the interpreter, and the interpreter executes the bytecode by performing the actions the bytecode specifies. This indirection can severely degrade performance if not optimized well, but that isn't the real problem per se; the real problem is that things such as reflection, garbage collection, and exceptions add quite a bit of overhead. For embedded systems, whose memories are small and whose processors are slow, that is the last thing you want: you'd be wasting precious system resources on things you don't need. If C programs are already slow on your Arduino, imagine a full-blown Java/Python program with all sorts of bells and whistles! Even if you translated the bytecode into machine code before putting it on the device, support for all that extra machinery (reflection, exceptions, garbage collection, and so on) would still have to be there, resulting in basically the same unwanted overhead and waste.
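To make the "another program executes the bytecode" part concrete, here is a minimal, hypothetical sketch of what such an interpreter's core loop could look like. The opcodes and the execute_bytecode signature are invented for illustration (they're not from any real JVM); the point is that the CPU only ever runs the C code, while the bytecode is just data being read:

#include <stdio.h>

/* A made-up bytecode: each instruction is one byte. */
enum { OP_PUSH = 0, OP_ADD = 1, OP_PRINT = 2, OP_HALT = 3 };

typedef unsigned char Bytecode;

static void execute_bytecode(const Bytecode *code)
{
    int stack[16];
    int sp = 0;      /* stack pointer */
    int pc = 0;      /* program counter: index of the next instruction */

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:  stack[sp++] = code[pc++]; break;           /* push the next byte */
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;   /* add the two top values */
        case OP_PRINT: printf("%d\n", stack[sp - 1]); break;      /* print the top value */
        case OP_HALT:  return;                                    /* stop the program */
        }
    }
}

int main(void)
{
    /* "push 2, push 3, add, print, halt" */
    const Bytecode program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
    execute_bytecode(program);   /* prints 5 */
    return 0;
}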
In most other environments, this is not a big deal, as memory is cheap and abundant, and processors are fast and powerful. Embedded systems have special needs; they're special by themselves, and things are not free in that land.