
Is there an API call, or any other similar way, that uses only ntdll.dll to allocate memory on the stack?

I know alloca() does that, but I can't use it because I can only use functions from ntdll.dll.

Thanks!

macro_controller
  • There is no such function. You can write it yourself easily. – David Heffernan Dec 13 '16 at 18:33
  • Stack allocation isn't provided by the OS, it's something that every single user function does in its prologue. Your compiler should translate `alloca()` into the same instructions found in the prologue, no function call required. If it doesn't, look at inline assembly. – Ben Voigt Dec 13 '16 at 18:34
  • @DavidHeffernan - it does exist; you're wrong. – RbMm Dec 13 '16 at 18:34
  • @RbMm can you give me a link to the doc – David Heffernan Dec 13 '16 at 18:37
  • @DavidHeffernan - no, I can't. I've been using this for many years :) ntdll.dll exports `_alloca_probe` (`__chkstk`). Yes, of course this is undocumented, but I'm not the one asking this question – RbMm Dec 13 '16 at 18:39
  • @RbMm: `_alloca_probe()` or `__chkstk()` may be used together with allocation on the stack, but it does not do the stack allocation. – Ben Voigt Dec 13 '16 at 18:41
  • @BenVoigt - no, they do exactly this task. What do they do, in your version? I have used these functions for *many years* – RbMm Dec 13 '16 at 18:43
  • You will probably get better answers if you explain why you "can only use functions from `ntdll.dll`." I can think of at least two good reasons why you might need to do that -- but both of them involve doing things that only Microsoft's employees have all the necessary information to do properly, and if you worked at Microsoft you wouldn't be asking this question here. – zwol Dec 13 '16 at 18:45
  • @RbMm: They stride through the new allocation, making sure that every page is accessed so that the guard page exception occurs, triggering the OS paging logic to force pages to be committed for this area of the stack if it was formerly only reserved. – Ben Voigt Dec 13 '16 at 18:45
  • @zwol: Well, I'm pretty sure what he actually meant was "do not use functions except those found in `ntdll.dll`" and that is entirely possible, because `sub esp, NNN` doesn't require any functions at all. – Ben Voigt Dec 13 '16 at 18:46
  • @BenVoigt - yes, maybe, and this says that `_alloca` is partially a compiler intrinsic which calls this function internally – RbMm Dec 13 '16 at 18:46
  • @BenVoigt It is my understanding that you cannot normally avoid having `kernel32.dll` loaded into your process, so "only ntdll" doesn't actually make any sense as a design constraint. Unless you're writing CSRSS.EXE or similar - but then you're Microsoft. – zwol Dec 13 '16 at 18:51
  • @zwol - maybe a boot-execute app – RbMm Dec 13 '16 at 18:52
  • @zwol: Ok, but maybe he doesn't want his code to call into `kernel32.dll`. Perhaps because of address space layout randomization? – Ben Voigt Dec 13 '16 at 18:52
  • @BenVoigt Why would one care about ASLR, unless they're writing a shellcode or exploit? – Mark Segal Dec 13 '16 at 18:57
  • @Mark: Hotpatching is used by defenders as well as attackers. – Ben Voigt Dec 13 '16 at 18:59
  • @BenVoigt Still doesn't explain the unnecessary fear from ASLR. – Mark Segal Dec 13 '16 at 19:00
  • @BenVoigt - have you heard of boot-execute apps, started by `smss.exe`, like `chkdsk.exe`, which can use only `ntdll.dll`? But this is only my guess – RbMm Dec 13 '16 at 19:00
  • @RbMm: Yes, that's another possibility. – Ben Voigt Dec 13 '16 at 19:07
  • Use `alloca` in your own code! The linker may complain about an unresolved external (`_alloca_probe_16` or `__chkstk`), but you can find these symbols in `alloca16.obj` and `chkstk.obj`, which are in the `VC` subfolder, and also in `ntdllp.lib` (but not in `ntdll.lib`) – RbMm Dec 13 '16 at 19:35
  • @RbMm That is the _other_ possibility (besides subsystem servers) that I was thinking of. It, too, requires documentation that I was under the impression was not available outside Microsoft. – zwol Dec 13 '16 at 19:44
  • @zwol - what counts as "documented" or "documentation" is very volatile. Formally, one could say that using `ntdll.dll` is taboo altogether, but in fact it can be used. – RbMm Dec 13 '16 at 19:51
  • @BenVoigt - and about "actual allocation": on x86, `_alloca_probe_16`, `_alloca_probe` and `_chkstk` (the last two are the same) do the *actual* allocation, not only stack checking, despite their names. On x64 they only check the stack, and the compiler emits `sub rsp,rax` itself (all of this applies to the `CL` compiler, to be absolutely exact) – RbMm Dec 13 '16 at 20:32

3 Answers


alloca is a partially intrinsic function, implemented by the compiler. Internally, however, it calls _alloca_probe_16 (on x86) or __chkstk (on x64) to move the guard page down the stack. The implementations of these functions are in alloca16.obj and chkstk.obj, which can be found in the VC subfolder (exactly where depends on the VC version). You can add these obj files to the link, or first convert them to a lib. The latest WDK libs also include ntdllp.lib (not to be confused with ntdll.lib), which likewise contains everything needed (ntdll.dll exports _chkstk on x86 and __chkstk on x64).


In more detail:

When you write alloca(cb) in source code, the CL compiler generates the following on x86:

mov eax,cb
call _alloca_probe_16 ; do actual stack allocation and probe

and on x64:

mov         ecx,eax 
add         rcx,0Fh 
cmp         rcx,rax 
ja          @@0
mov         rcx,0FFFFFFFFFFFFFF0h 
@@0:
and         rcx,0FFFFFFFFFFFFFFF0h 
mov         rax,rcx 
call        __chkstk ; probe only
sub         rsp,rax ; actual stack allocation
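The size arithmetic at the start of the x64 listing can be read as "round the request up to a multiple of 16, and clamp it if the addition wraps around". Below is a minimal C sketch of that reading. The function name `round_alloca_size` is hypothetical, and the sketch takes a 64-bit size so that the overflow branch can actually be exercised, whereas the listing zero-extends a 32-bit value first.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the arithmetic in the listing:
 *   add rcx,0Fh / cmp rcx,rax / ja / mov rcx,0FFFFFFFFFFFFFF0h / and rcx,-16
 * This is an illustration, not code taken from any compiler or runtime. */
static uint64_t round_alloca_size(uint64_t cb)
{
    uint64_t rounded = cb + 15;              /* add rcx,0Fh (may wrap around) */

    if (rounded <= cb) {                     /* the "ja" branch was not taken */
        rounded = UINT64_C(0x0FFFFFFFFFFFFFF0);
    }
    return rounded & ~UINT64_C(15);          /* and rcx,0FFFFFFFFFFFFFFF0h */
}
```

The result is what the listing passes to `__chkstk` in `rax` and then subtracts from `rsp`.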

Either way, _alloca_probe_16 and/or __chkstk must be implemented somewhere, or the link fails with an unresolved external symbol.

The latest WDK builds include ntdllp.lib (note the p; this is not ntdll.lib), which contains this implementation. In this case your PE will import __chkstk or _alloca_probe from ntdll.dll (these functions have been exported since at least XP; both point to the same code and are simply aliases).

Another solution: the VC folders contain alloca16.obj and chkstk.obj. You can use these obj files as link input (or merge alloca16.obj and chkstk.obj into a single lib file). In this case your PE will import nothing.
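Once the helper symbols resolve, alloca can be called repeatedly inside one function, for example to retry a call with a larger buffer. The following is a minimal sketch of that pattern, not the answer's own code: `fill_message`, `query_on_stack` and the `STACK_ALLOC` macro are hypothetical names, and the header selection assumes either MSVC (`_alloca` from `<malloc.h>`) or a platform that provides `<alloca.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef _MSC_VER
#include <malloc.h>                 /* _alloca */
#define STACK_ALLOC(n) _alloca(n)
#else
#include <alloca.h>                 /* alloca (glibc and similar) */
#define STACK_ALLOC(n) alloca(n)
#endif

/* Hypothetical callee: reports the required size through *needed and
 * returns -1 when the supplied buffer is too small, 0 on success. */
static int fill_message(char *buf, size_t cb, size_t *needed)
{
    static const char msg[] = "allocated on the stack";

    *needed = sizeof msg;
    if (cb < sizeof msg) {
        return -1;
    }
    memcpy(buf, msg, sizeof msg);
    return 0;
}

/* Retry loop: each STACK_ALLOC call takes a fresh block from the current
 * frame; earlier blocks are only released when this function returns. */
static size_t query_on_stack(char *out, size_t out_cb)
{
    char *buf = NULL;
    size_t cb = 0;
    size_t needed = 16;             /* first guess, deliberately too small */
    int status;

    do {
        if (cb < needed) {
            buf = (char *)STACK_ALLOC(needed);
            cb = needed;
        }
        status = fill_message(buf, cb, &needed);
    } while (status != 0);

    assert(out_cb >= needed);       /* the caller's buffer must be big enough */
    memcpy(out, buf, needed);
    return needed;
}
```

The stack memory disappears when `query_on_stack` returns, which is why the result is copied into a caller-supplied buffer before returning.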

RbMm

You don't need something architecture dependent because allocation on the stack is (generally) architecture independent.

If you're using C99 you have a standard way of doing this, using Variable Length Arrays: https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html

You'd quite simply write something like this:

char mybuffer[my_size];

And it will be allocated on the stack.
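A slightly fuller sketch of the same idea, assuming a compiler with C99 VLA support (per the comments below, Visual C++ does not have it). The function name `sum_of_squares` is made up for illustration.

```c
#include <stddef.h>

/* Hypothetical example: the array length is only known at run time,
 * and the array lives in this function's stack frame.
 * n must be greater than zero and small enough to fit on the stack. */
static unsigned long sum_of_squares(size_t n)
{
    unsigned long squares[n];       /* C99 variable length array */
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        squares[i] = (unsigned long)(i + 1) * (unsigned long)(i + 1);
    }
    for (i = 0; i < n; i++) {
        total += squares[i];
    }
    return total;
}
```

The array is released automatically when the function returns, so a pointer to it must not escape.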

Mark Segal
  • what if `my_size` is *unknown* at compile time? This is *not a solution* – RbMm Dec 13 '16 at 18:41
  • @RbMm well, VLAs are explicitly designed for that scenario – David Heffernan Dec 13 '16 at 18:43
  • @RbMm: That's what makes it a variable-length array. This IS a solution, if you have a C99 compiler. Most Windows compilers aren't. – Ben Voigt Dec 13 '16 at 18:43
  • @RbMM VLA is the definition of "when `my_size` is unknown in compile time" – Mark Segal Dec 13 '16 at 18:53
  • Does this also work on Windows? What if I want to use CL rather than gcc? – RbMm Dec 13 '16 at 19:02
  • @RbMm: "Variable length arrays are not currently supported in Visual C++." – Ben Voigt Dec 13 '16 at 19:11
  • @BenVoigt - but `_alloca` works here, and I know this for certain because I use it very actively in my own code. The compiler implements it partially internally, but to move the guard page it calls `__chkstk` or `_alloca_probe_16` (on x86 only). We can get the implementation of these functions from, say, the obj files shipped with VC or from ntdllp.lib. I'm curious: what is wrong with my answer? – RbMm Dec 13 '16 at 19:15
  • Also, a VLA cannot be used in a loop to allocate additional space, so it is a much less functional solution – RbMm Dec 13 '16 at 23:41
  • I believe this solution will call __chkstk or another C runtime function, so it isn't really any different to using _alloca. – Harry Johnston Dec 14 '16 at 01:41
  • @HarryJohnston - yes, this really does call an analog of __chkstk internally, *but* because of syntax limitations it cannot be used in a loop for additional allocations, whereas _alloca can. I use this feature very frequently in my own code – RbMm Dec 14 '16 at 02:00
  • I mean the following pattern: `do { if (cb < rcb) cb = RtlPointerToOffset(buf = alloca(rcb - cb), stack); status = SomeFunc(buf, cb, &rcb); } while(status == );` - how would you implement this with a VLA? – RbMm Dec 14 '16 at 02:04

Because alloca manipulates the stack pointer, it isn't a "real" function, it's a "compiler intrinsic". If you compile a function that uses alloca to assembly language, you should see that it is translated directly to sub esp, NNN rather than call alloca. (There might be a call to a function in addition to the sub esp, NNN. In that case you need to find out what that function does, where it's normally defined, and arrange for your application to provide a substitute. You're already jumping through all sorts of unusual hoops to use nothing but NTDLL, this is just one more.)

If you do see call alloca and no sub esp, NNN, that is very likely to mean that your compiler has only a fake implementation of alloca that is not giving you memory allocated from the stack, and you shouldn't use it at all.
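For the case where a helper call accompanies the stack-pointer adjustment, the comments on the question describe what that helper typically does on Windows: it strides through the new allocation and touches every page so the guard page mechanism can commit the memory. The following is a rough C sketch of that striding logic only. The name `probe_region` and the 4096-byte page size are assumptions, it operates on an ordinary buffer rather than the real stack, and it is not the actual `__chkstk` implementation.

```c
#include <stddef.h>

#define ASSUMED_PAGE_SIZE 4096u     /* assumption: 4 KiB pages */

/* Touch one byte per page-sized stride of [low, low + size), walking
 * from the high end down, the direction in which a stack grows.
 * Returns the number of bytes touched. */
static size_t probe_region(volatile unsigned char *low, size_t size)
{
    size_t touched = 0;
    size_t offset = size;

    while (offset > 0) {
        size_t step = offset > ASSUMED_PAGE_SIZE ? ASSUMED_PAGE_SIZE : offset;

        offset -= step;
        (void)low[offset];          /* volatile read forces a real access */
        touched++;
    }
    return touched;
}
```

A real substitute would have to run before (or as part of) the stack-pointer adjustment and follow the compiler's calling convention for the helper, which is why it is normally written in assembly.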

zwol
  • In addition to adjusting the stack pointer, having `alloca` or a VLA used inside the function body forces the compiler to use the long prolog and epilog, e.g. `push ebp; mov ebp, esp; sub esp, NNN` .... `mov esp, ebp; pop ebp` instead of simply `sub esp, NNN` ... `add esp, NNN` – Ben Voigt Dec 13 '16 at 18:49
  • We need to check the stack allocation, which `CL` does by calling `__chkstk`, and that is not a "compiler intrinsic". You are forgetting about the guard pages in the stack – RbMm Dec 13 '16 at 18:51
  • @zwol - do you use stack allocations in your own code? Do you have experience with this? – RbMm Dec 13 '16 at 18:58
  • @RbMm Yes. Not specifically in the environment OP is asking about, but a stack is a stack and there's only a few ways `alloca` can be implemented. I believe this answer is essentially in agreement with yours; I didn't give the name of the function that might be called, because it's totally compiler-dependent what its name is or whether there even _is_ a function call, but I'm thinking of the same thing you are. – zwol Dec 13 '16 at 19:41
  • @RbMm if you've never heard of the `alloca` implementation being fake, see http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/alloca.c – zwol Dec 13 '16 at 19:43
  • I am assuming `CL`; I may be wrong, but most likely the OP is using it. Simply moving the stack pointer down is not a problem, but we also need to check and move the guard page; this is a Windows OS requirement. `CL` calls `_alloca_probe_16` (on x86) or `__chkstk` (on x64). The implementation of these small, independent functions can be found in the `VC` folder and also in `ntdllp.lib` (the natural choice if the OP uses ntdll), so I gave an absolutely concrete answer on how to use this – RbMm Dec 13 '16 at 19:48