6

Is there a relationship between DLL size in memory and size on the hard disk?

I ask because I am using the Task Manager Extension (MS): I can right-click an EXE in the list and choose Module to see all the DLLs that EXE is using. The list has a Length column, but is it in bytes? The Length value of a DLL also seems to differ from that DLL's size on the hard disk. Why?

Craig McQueen
RoundPi
  • If the DLL performs dynamic allocations, are you trying to include that? – Ben Voigt Mar 19 '12 at 00:43
  • Can you give an example? Is the "length" column smaller than the file size? Larger? Rounded up to the next multiple of 4K? What kind of relationship is there between the two numbers? – André Caron Mar 19 '12 at 01:30
  • @Andre: for example, MSVCR80 is 632565 bytes on disk, but TMExtension shows 634880. TMExtension also showed two copies of MSVCP71.dll with different lengths: one is 503808 and the other is 352256, while on disk it is actually 503808 (the same as one of those listed in TMExtension). – RoundPi Mar 19 '12 at 18:34
  • So MSVCR80's size is rounded up to the next 4K boundary (ceil(632565/4096)*4096 == 634880). The other two sizes are also integer multiples of 4K. In the case of MSVCP71, you might have more than one copy on disk (e.g. through multiple installations of Visual Studio redistributables). Does the TMExtension show the full file paths of the two MSVCP71.dll files loaded into memory? – André Caron Mar 19 '12 at 19:03
  • @Andre: yes, I think it does show the path; I will have a look tomorrow. So you mean it's always a multiple of 4K on a 32-bit system, and should always be close to the DLL's file size on disk? – RoundPi Mar 19 '12 at 21:37
  • @Gob00st: note that the actual disk space consumption is also rounded up, to a multiple of the file system's cluster size (4K is a popular choice there too). If you check the file properties, you should see "file size" and "file size on disk". There are some important remarks in most of the answers below. These may explain any extra difference between the "length" field in the TMExtensions and the file size. – André Caron Mar 19 '12 at 22:02
  • @Andre: I have just checked and, strangely, MSVCR71D.dll is listed twice with different lengths but with the SAME path (c:\windows\systems32\)! One is about the same size as the DLL on disk (rounded up), but the other one is higher, 770048... Could this be some dynamically allocated memory? – RoundPi Mar 20 '12 at 16:23
  • @Gob00st: I strongly doubt this is due to dynamic memory allocation, or all of the libraries would have unexplained sizes. Perhaps it uses [thread-local storage](http://en.wikipedia.org/wiki/Thread-local_storage) and the second process uses sufficiently many threads, resulting in the module being extended or something. Maybe the second process is 64-bit and some re-mapping is being done. You'd have to dig deep into the Microsoft PE format and the library loader to get an actual list of factors that can influence this reading. – André Caron Mar 20 '12 at 20:42
  • @Andre: thanks for all your input ! I am reviewing the PE file format/trying to manually write one so I may understand it better later. Cheers. – RoundPi Mar 21 '12 at 00:37

5 Answers

5

There's a relationship, but it's not entirely direct or straightforward.

When your DLL is first used, it gets mapped to memory. That doesn't load it into memory, just allocates some address space in your process where it can/could be loaded when/if needed. Then, individual pages of the DLL get loaded into memory via demand paging -- i.e., when you refer to some of the address space that got allocated, the code (or data) that's mapped to that/those address(es) will be loaded if it's not already in memory.

Now, the address mapping itself does take up a little space for page tables (on x86, roughly one 4K page-table page per 4 MB of mapped address space, or per 2 MB with PAE or on x64). Of course, when a page of the DLL is actually loaded, that uses up physical memory too.

Note, however, that most pages can/will be shared between processes too, so if your DLL was used by 5 different processes at once, it would be mapped 5 times (i.e., once to each process that used it) but there would still only be one physical copy in memory (at least normally).
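As a rough illustration of the sharing described above, here is a toy accounting model in Python. It is only a sketch: the function name, the 4 KB page size, and the idea of counting a fixed number of private (copy-on-write) pages per process are assumptions made for the example, and it treats every page as resident, which demand paging does not guarantee.

```python
PAGE_SIZE = 4096  # assumed page size


def mapping_totals(image_size, process_count, private_pages_per_process=0):
    """Return (total_virtual_bytes, total_physical_bytes) for one DLL.

    Toy model: every process maps the whole image into its own address
    space, but unmodified pages are backed by a single shared physical copy.
    Pages a process has written to are counted as private copies.
    """
    total_virtual = image_size * process_count
    private_bytes = private_pages_per_process * PAGE_SIZE * process_count
    total_physical = image_size + private_bytes
    return total_virtual, total_physical


# Five processes mapping a 634880-byte image, no private pages
print(mapping_totals(634880, 5))
# Same, but each process has modified two pages
print(mapping_totals(634880, 5, private_pages_per_process=2))
```

The address space consumed grows with the number of processes, while the physical cost stays close to one copy plus whatever each process has modified.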

Between those, it can be a little difficult to even pin down exactly what you mean by the memory consumption of a particular DLL.

Jerry Coffin
  • @HansPassant: "in memory == in random access memory"? What are you trying to clarify? – André Caron Mar 19 '12 at 01:25
  • "in memory" is a quite ambiguous term. "In RAM" is one interpretation, "in virtual memory / address space" another. The problem here is that Jerry contrasts the two. – MSalters Mar 19 '12 at 09:52
2

There are two parts that come into play in determining the size of a dll in memory:

  1. As everyone else pointed out, DLLs get memory mapped, which leads to their size being page aligned (one of the reasons preferred load addresses back in the day had to be page aligned). Generally, the page size is 4 KB on both 32-bit and 64-bit x86 Windows (8 KB on Itanium); for a more in-depth look at this on Windows, see this.
  2. DLLs contain a segment for uninitialized data (.bss). On disk this segment is stored compactly, generally as just a base and a size; when the DLL is loaded and initialized, the space for the .bss segment gets allocated, increasing the in-memory size. Generally this is small and will be absorbed by the page alignment, but if a DLL contains huge static buffers, it can balloon the DLL's virtual size.
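The two effects above can be sketched with a small Python model. This is a simplification under stated assumptions: the section names and sizes are hypothetical, the alignments (512 bytes in the file, 4096 bytes in memory) are common defaults rather than fixed values, and sections are assumed to be laid out back to back.

```python
from dataclasses import dataclass


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return -(-value // alignment) * alignment


@dataclass
class Section:
    name: str
    virtual_size: int  # bytes the section occupies once loaded
    raw_size: int      # bytes actually stored in the file


def file_and_image_size(headers_size, sections,
                        file_alignment=512, section_alignment=4096):
    """Return (approximate file size, approximate in-memory image size)."""
    file_size = align_up(headers_size, file_alignment)
    image_size = align_up(headers_size, section_alignment)
    for section in sections:
        file_size += align_up(section.raw_size, file_alignment)
        image_size += align_up(section.virtual_size, section_alignment)
    return file_size, image_size


# Hypothetical DLL with a large uninitialized static buffer
sections = [
    Section(".text", virtual_size=10_000, raw_size=10_000),
    Section(".data", virtual_size=2_000, raw_size=2_000),
    Section(".bss", virtual_size=1_000_000, raw_size=0),  # nothing stored on disk
]
file_size, image_size = file_and_image_size(1024, sections)
print(file_size, image_size)
```

In this made-up layout the uninitialized section costs nothing in the file but about 1 MB of address space, so the in-memory size ends up far larger than the file.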
Necrolis
  • Point #2 is interesting. I'd like to know more; do you have a reference to read up on this? – André Caron Mar 19 '12 at 01:29
  • @AndréCaron: it's slightly compiler/implementation specific, so the best I can give is a Wikipedia link: http://en.wikipedia.org/wiki/.bss – Necrolis Mar 19 '12 at 13:08
1

The memory footprint will usually be bigger than the on-disk size because, when the DLL is mapped into memory, it is page aligned. Standard page sizes are 4KB and 8KB, so if your DLL is 1KB of code, it's still going to use 4KB in memory.
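The rounding can be checked with a few lines of Python. This is a minimal sketch that assumes a 4 KB page size; the helper name is made up for the example, and the 632565-byte figure is the MSVCR80 size quoted in the comments on the question.

```python
PAGE_SIZE = 4096  # assumed page size (4 KB)


def round_up_to_page(size, page_size=PAGE_SIZE):
    """Round size up to the next multiple of page_size."""
    pages = -(-size // page_size)  # ceiling division without floats
    return pages * page_size


print(round_up_to_page(632565))  # MSVCR80 on disk -> 634880
print(round_up_to_page(1024))    # 1 KB of code    -> 4096
```

The first result matches the Length value reported for MSVCR80, which suggests the column is a byte count rounded up to whole pages.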

0-0
  • Not entirely true, hard drives also have sectors of about 4KB, it's just less obvious. – Thomas Mar 18 '12 at 20:42
  • @Thomas: I assume OP is talking about the real file size, not the "size used on disk". – André Caron Mar 19 '12 at 01:27
  • @AndréCaron there is no difference. It's the same concept for both memory and disk, they both allocate data in pages/sectors. So the "real file size" would be equivalent to the "memory used without taking into account pages", which is not very useful. – Thomas Mar 19 '12 at 01:38
  • @Thomas: Windows reports two different file sizes: the "real file size" and the "file size used on disk" (the "real file size" rounded up to the next integer multiple of the sector size). My comment was there to point out that we don't know which one the poster is using. – André Caron Mar 19 '12 at 01:41
  • @AndréCaron ah, I see. Of course, one can use either of those, but we have to stay consistent with how we define "size" on computer media. Point taken. – Thomas Mar 19 '12 at 02:19
1

Don't think of a .dll or a .exe as something that gets copied into memory to be executed.

Think of it as a set of instructions for the loader. Sure it contains the program and static data text. More importantly, it contains all the information allowing that text to be relocated, and to have all its unsatisfied references hooked up, and to export references that other modules may need.

Then if there's symbol and line number information for debugging, that's still more text.

So in general you would expect it to be larger than the memory image.

Mike Dunlavey
0

It all depends on what you call "memory", and what exactly your Task Manager extension shows.

Every executable module (EXE/DLL) is mapped into an address space. The size of this mapping equals the module's image size. And, I guess, this is what your "extension" shows you.
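One way to test that guess is to read the image size recorded in the DLL's own header and compare it with the Length column. The Python sketch below assumes the extension reports the SizeOfImage field of the PE optional header, which the source does not confirm; the function name is made up for the example.

```python
import struct


def read_size_of_image(data):
    """Return the SizeOfImage field from the raw bytes of a PE file (EXE/DLL)."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the PE signature, stored at 0x3C in the DOS header
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Skip the 4-byte signature and the 20-byte COFF file header
    optional_header = pe_offset + 4 + 20
    # SizeOfImage sits at offset 56 of the optional header (PE32 and PE32+)
    (size_of_image,) = struct.unpack_from("<I", data, optional_header + 56)
    return size_of_image
```

To use it, read a DLL's bytes with `open(path, "rb").read()` and pass them in, where `path` is whichever file the extension lists. If the result matches the Length column, the column is the mapped image size rather than the file size.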

valdo