
I have this piece of code:

constexpr static VOID fStart()
{
    auto a = 3;
    a++;
}

__declspec(naked) 
constexpr static VOID fEnd() {};

static constexpr auto getFSize()
{
    return (SIZE_T)((PBYTE)fEnd - (PBYTE)fStart);
}

static constexpr auto fSize = getFSize();
static BYTE func[fSize];

Is it possible to declare "func[fSize]" array size as the size of "fStart()" during compilation without using any std library? It is necessary in order to copy the full code of fStart() into this array later.

DBenson
  • I'm sorry, but functions don't work that way, my friend. – D-RAJ Jan 29 '21 at 18:54
  • Even with standard library features, functions do not have sizes as far as the language is concerned. It is necessarily an implementation detail. – François Andrieux Jan 29 '21 at 18:55
  • Most probably it's impossible in standard C++, even at runtime. There is no guarantee that fEnd will be placed right after fStart in the executable, so the difference computed by getFSize() may not be the size of fStart. – pts Jan 29 '21 at 18:55
  • What do you need that for? Maybe it's an XY problem, and we can propose a different solution for it. – πάντα ῥεῖ Jan 29 '21 at 18:56
  • @TedKleinBergman, I think RVA are assigned at linking time. – DBenson Jan 29 '21 at 18:57
  • @pts, if we turn off optimizations and several other options it does. – DBenson Jan 29 '21 at 18:58
  • @πάνταῥεῖ, I need to copy function fStart() into func[fSize] array – DBenson Jan 29 '21 at 18:59
  • @David: Maybe there is a solution that works with the C++ compiler in Visual Studio 2019 with some specific flags, but there is definitely no solution in standard C++, and no portable solution supported by many versions of many C++ compilers. – pts Jan 29 '21 at 19:00
  • @David So you have an array of function pointers or what? I still don't get it. Also, you didn't say what you really want to solve. What is a `func[]` array? – πάντα ῥεῖ Jan 29 '21 at 19:00
  • This is a classic example of an [XY problem](http://xyproblem.info/). Please explain WHY. Why do you need this strange thing? Explain the functionality your code is supposed to provide, not what you think is needed to provide that functionality. The classic way to overcome this issue is to start the explanation like this: "As an end user I want .... ". – Marek R Jan 29 '21 at 19:02
  • @πάνταῥεῖ, nope. I need to copy fStart function (not pointer, the whole function with all opcodes) into func array later. I plan to modify fStart () many times, so it is inconvenient to calculate its size every time. – DBenson Jan 29 '21 at 19:03
  • Most C++ compilers (most probably including Visual Studio 2019) haven't even started code generation before deciding about the size of static arrays, thus they are genuinely unable to tell the size of the function at that time. It's highly unlikely that you can do this in a single run of the compiler. – pts Jan 29 '21 at 19:05
  • It is possible he needs just `std::function`. – Marek R Jan 29 '21 at 19:06
  • @David _"... I plan to modify fStart () many times, ..."_ What exactly do you need to modify there that can't be done via parametrization (be it constexpr / templates, or at runtime)? How would you know the opcodes which should be changed there? – πάντα ῥεῖ Jan 29 '21 at 19:09
  • **Why do you ask?** I am extremely curious – Basile Starynkevitch Jan 29 '21 at 19:27
  • @BasileStarynkevitch, oh.. I'm just too lazy to increase the BYTE array size by hand each time in order to copy the code of function fStart(). Later it is going to be a struct like {BYTE code[fSize]; DWORD a...}. I know that I could do it much more easily with Alloc + a BYTE* pointer. – DBenson Jan 29 '21 at 19:33
  • **But why do you need to copy or move machine code?** In many cases, it is not position independent! I strongly believe your approach could be wrong, but I cannot guess what actual problem you are trying to solve.... – Basile Starynkevitch Jan 29 '21 at 19:36
  • @David If the question is about VC++ and Windows specifically, then you should [edit](https://stackoverflow.com/posts/65960187/edit) the question and state so, and also add the appropriate tags. These clarifications belong in the question, not in comments. – dxiv Jan 29 '21 at 20:04
  • related: https://www.quora.com/How-do-I-get-the-address-of-an-instruction-in-C – loa_in_ Jan 29 '21 at 20:16
  • Just a reminder that the code generated from one function might not even be placed in a single contiguous block of memory. For an explanation read https://easyperf.net/blog/2019/03/27/Machine-code-layout-optimizatoins#function-splitting – Ben Voigt Jan 29 '21 at 20:56

3 Answers


There is no method in standard C++ to get the length of a function.

You'll need to use a compiler specific method.

One method is to have the linker create a segment, and place your function in that segment. Then use the length of the segment.

You may be able to use some assembly language constructs to do this; depends on the assembler and the assembly code.

Note: in embedded systems, there are reasons to move function code, such as to On-Chip memory or swap to external memory, or to perform a checksum on the code.
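The linker-section approach mentioned above could look roughly like the following sketch. It assumes GCC or Clang producing ELF (for example on Linux) and linking with GNU ld, gold, or lld, which define `__start_<name>` and `__stop_<name>` symbols for a section whose name is a valid C identifier. The section name `myfunc` and the helper names `myfunc_section_size` and `copy_myfunc_section` are made up for illustration; MSVC needs a different mechanism.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumption: ELF target with GNU ld, gold, or lld. These symbols are
// provided by the linker and mark the bounds of the "myfunc" section.
extern "C" const unsigned char __start_myfunc[];
extern "C" const unsigned char __stop_myfunc[];

// Put the function alone in its own section. "used" and "noinline" ask the
// compiler to keep an out-of-line copy even though nothing calls it.
__attribute__((section("myfunc"), used, noinline))
static int fStart() {
    int a = 3;
    a++;
    return a;
}

// Size of the whole section in bytes. It is fixed at link time, so it is an
// ordinary runtime value here, not a constant expression.
std::size_t myfunc_section_size() {
    return static_cast<std::size_t>(
        reinterpret_cast<std::uintptr_t>(__stop_myfunc) -
        reinterpret_cast<std::uintptr_t>(__start_myfunc));
}

// Copy the raw bytes of the section into a buffer sized at runtime.
std::vector<unsigned char> copy_myfunc_section() {
    std::vector<unsigned char> bytes(myfunc_section_size());
    std::memcpy(bytes.data(), __start_myfunc, bytes.size());
    return bytes;
}
```

Because the size is only known after linking, it cannot serve as the bound of a static array; a runtime-sized buffer such as `std::vector` is used instead. The measured size is that of the section, which may include padding, and the copied bytes are only meaningful elsewhere if the code is position independent.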

Thomas Matthews

The following calculates the "byte size" of the fStart function. However, the size cannot be obtained as a constexpr this way, because casting loses the compile-time const'ness (see for example "Why is reinterpret_cast not constexpr?"), and the difference of two unrelated function pointers cannot be evaluated without some kind of casting.

#pragma runtime_checks("", off)
__declspec(code_seg("myFunc$a")) static void fStart()
{   auto a = 3; a++; }
__declspec(code_seg("myFunc$z")) static void fEnd(void)
{   }
#pragma runtime_checks("", restore)

constexpr auto pfnStart = fStart;                               // ok
constexpr auto pfnEnd = fEnd;                                   // ok
// constexpr auto nStart = (INT_PTR)pfnStart;                   // error C2131

const auto fnSize = (INT_PTR)pfnEnd - (INT_PTR)pfnStart;        // ok
// constexpr auto fnSize = (INT_PTR)pfnEnd - (INT_PTR)pfnStart; // error C2131
dxiv

On some processors and with some known compilers and ABI conventions, you could do the opposite: generate machine code at runtime.

For x86/64 on Linux, I know of GNU lightning, asmjit, and libgccjit, which do so.

The elf(5) format knows the size of functions.

On Linux, you can generate shared libraries (perhaps by generating C or C++ code at runtime, like RefPerSys does and GCC MELT did, then compiling it with gcc -fPIC -shared -O) and later dlopen(3) / dlsym(3) them. dladdr(3) is also very useful. You'll use function pointers.
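A minimal sketch of that generate, compile, and load cycle follows. It assumes a POSIX system with a C compiler reachable as `cc` and a writable `/tmp`; the helper `run_generated_plugin` and the generated function `generated_answer` are hypothetical names chosen for illustration. On older glibc versions the program may need to be linked with `-ldl`.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose
#include <unistd.h>  // getpid
#include <cstdlib>   // std::system
#include <fstream>
#include <string>

// Generates a tiny C function, compiles it into a shared object, loads it,
// calls it through a function pointer, and returns its result.
// Returns -1 if any step fails (for example, when no compiler is installed).
int run_generated_plugin() {
    const std::string base = "/tmp/gen_plugin_" + std::to_string(getpid());
    const std::string src = base + ".c";
    const std::string lib = base + ".so";

    // Step 1: emit the C source at runtime.
    std::ofstream out(src);
    out << "int generated_answer(void) { return 6 * 7; }\n";
    out.close();
    if (!out) return -1;

    // Step 2: compile it as position-independent code into a shared library.
    const std::string cmd = "cc -fPIC -shared -O -o " + lib + " " + src;
    if (std::system(cmd.c_str()) != 0) return -1;

    // Step 3: load the library and look up the generated function.
    void* handle = dlopen(lib.c_str(), RTLD_NOW);
    if (!handle) return -1;

    using fn_t = int (*)(void);
    fn_t fn = reinterpret_cast<fn_t>(dlsym(handle, "generated_answer"));

    // Step 4: call it like any other function, then unload the plugin.
    const int result = fn ? fn() : -1;
    dlclose(handle);
    return result;
}
```

The design point is that the loader performs all relocation, so the caller never needs to know the size of the function or copy its bytes; it only holds a function pointer.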

Read also a book on linkers and loaders.

But you usually cannot move machine code without doing some relocation, unless that machine code is position-independent code (quite often PIC is slower to run than ordinary code).

A related topic is garbage collection of code (or even of agents). You need to read the garbage collection handbook and take inspiration from implementations like SBCL.

Remember also that a good optimizing C++ compiler is allowed to unroll loops, inline expand function calls, remove dead code, do function cloning, etc... So it may happen that machine code functions are not even contiguous: two C functions foo() and bar() could share dozens of common machine instructions.

Read the Dragon book, and study the source code of GCC (and consider extending it with your own GCC plugin). Look also into the assembler code produced by gcc -O2 -Wall -fverbose-asm -S. Some experimental variants of GCC might be able to generate OpenCL code running on your GPGPU (and then the very notion of a function's end does not make sense).

With plugins generated through C and C++, you could carefully remove them using dlclose(3) if you use Ian Taylor's libbacktrace and dladdr to explore your call stack. In 99% of cases it is not worth the trouble, since in practice a Linux process (on current x86-64 laptops in 2021) can do perhaps a million dlopen(3) calls, as my manydl.c program demonstrates (it generates "random" C code, compiles it into a unique /tmp/generated123.so, dlopens that, and repeats many times).

The only reason (on desktop and server computers) to overwrite machine code is for long lasting server processes generating machine code every second. If this was your scenario, you should have mentioned it (and generating JVM bytecode by using Java classloaders could make more sense).

Of course, on 16-bit microcontrollers things are very different.

Is it possible to calculate function length at compile time in C++?

No, because at runtime some functions do not exist anymore.

The compiler has somehow removed them. Or cloned them. Or inlined them.

And for C++ it is practically important with standard containers: a lot of template expansion occurs, including for useless code which has to be removed by your optimizing compiler at some point.

(Think, in 2021, of compiling and linking everything with a recent GCC 10.2 or 11, using gcc -O3 -flto -fwhole-program everywhere: a function foo23 might be defined but never called, and then it is not inside the ELF executable.)

Basile Starynkevitch
  • Thank you for your kind reply. Yeah, it is very easy to do using assembly, JIT and so on. When you are working with MSVC x86, assembly is fine. However, x64 does not allow inserting assembly in a convenient way, so sometimes it is easier to write C code and copy it to some buffer for further needs. – DBenson Jan 29 '21 at 19:44
  • No: generate C code, and compile it at runtime as a plugin, then load that plugin. This is doable on Linux, Windows, MacOSX. Use function pointers. – Basile Starynkevitch Jan 29 '21 at 19:45
  • Why did you vote to close a question that you answered yourself? Did you change your mind about how clear the question is? – cigien Jan 29 '21 at 19:50
  • Because the question is so unclear that I had to make several guesses about it, and because the question did not provide, as it should, any [mre], and has *no motivation* – Basile Starynkevitch Jan 29 '21 at 19:51
  • In that case, why did you answer it? Either action is fine, i.e. answering/voting to close. I'm just confused why you chose to do both. – cigien Jan 29 '21 at 20:36