Trimming executables Part 1: Getting rid of the C runtime
In a previous post I mentioned my problems with the Visual C Runtimes past version 6.0. This post is post 1 of n, with n having a high probability of being 2, on how to trim executable / download size. I will explain how to get rid of the beast that is the CRT, while retaining functionality.
The simplest way: Don’t use it at all
For a very small project, the only reason you use the CRT is usually that you don’t know you can do without it. But you can tell the compiler that you don’t want it, and here’s how.
The first thing to do is to turn on the /Zl (omit default library names) compiler option in your project / makefile. I find this cleaner than using the /NODEFAULTLIB linker option, because it only affects the CRT default libraries. At this point you’re already done: the CRT will no longer be linked into your binary.
But suddenly, your project will not link anymore. One of the first reasons is that the CRT supplies its own entry point and the linker expects to find it. Since you have removed the CRT, you’ll have to supply your own version. Fortunately, the names of the entry points don’t change much, so it’s just a matter of picking the right one:
    // __DllMainCRTStartup@12
    extern "C" BOOL WINAPI _DllMainCRTStartup(HINSTANCE, DWORD, LPVOID);

    // _WinMainCRTStartup
    extern "C" __declspec(noreturn) void __cdecl WinMainCRTStartup(void);

    // _mainCRTStartup
    extern "C" __declspec(noreturn) void _cdecl mainCRTStartup(void);
You’ll notice right away that the WinMain and main entry points are declared with __declspec(noreturn) and have a void return type. That is not the whole truth. You can return from these functions, at least on Windows NT, and BaseProcessStart will happily terminate the thread. But only the thread. That is why I prefer to end these functions with a call to ExitProcess instead.
As for the DllMain entry point, there is a good chance that you won’t need one. The CRT absolutely needs it to do some initialization, but if you’re like me and try to do as little as possible in this function, you probably just have a call to DisableThreadLibraryCalls followed by return TRUE. If that is the case, you can use the /NOENTRY linker option, which will link the DLL with no entry point at all.
Next, you will encounter link errors because several “special” CRT symbols are referenced by code that the compiler adds. Fortunately, these are easy to weed out:
| Symbol reference | How to get rid of it |
|---|---|
| ___security_cookie, @__security_check_cookie@4 | Setting the /GS- compiler option. (VS: set “Buffer Security Check” to No) |
| @_RTC_CheckStackVars@8, __RTC_CheckEsp, __RTC_Shutdown, __RTC_InitBase | Removing all the /RTCxx compiler options. (VS: set “Basic Runtime Checks” to Default) |
| __chkstk | Not using more than 4k of stack variables. |
| _memset, _memcpy | Not using the assignment operator to initialize structs inside functions, or to copy between structs. |
Note that if you are doing operations on __int64’s or using structured exception handlers (SEH), you will need the CRT. More on this later.
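To illustrate the _memset / _memcpy entry in the table above, here is a minimal sketch of filling and copying a struct by hand instead of letting the compiler do it. It is not taken from the original post: zero_bytes, copy_bytes and struct point are hypothetical names, and the use of volatile is an assumption about how to discourage an optimizer from turning the loops back into memset / memcpy calls. Check the linker output to confirm it behaves that way with your compiler and options.

```c
#include <stddef.h>

/* Hypothetical example struct, for illustration only. */
struct point { int x; int y; };

/* Hypothetical helper: clears n bytes at dst one byte at a time.
   The volatile qualifier is meant to keep the optimizer from
   replacing the loop with a call to memset (an assumption). */
static void zero_bytes(void *dst, size_t n)
{
    volatile unsigned char *d = (volatile unsigned char *)dst;
    while (n--)
        *d++ = 0;
}

/* Hypothetical helper: copies n bytes from src to dst.
   The two regions must not overlap. */
static void copy_bytes(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = (volatile unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    while (n--)
        *d++ = *s++;
}
```

Inside a function, you would then write zero_bytes(&p, sizeof p) instead of struct point p = {0}, and copy_bytes(&q, &p, sizeof q) instead of q = p. For small structs, assigning the fields one by one works just as well.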
The last part of getting rid of the CRT is to replace CRT calls with equivalent API calls. I have put together a quick little table mapping the most-used functions:
| CRT function | Win32 API function |
|---|---|
| strxxx | lstrxxx |
| sprintf | wsprintf |
| vsprintf | wvsprintf |
| malloc | HeapAlloc |
| realloc | HeapReAlloc |
| free | HeapFree |
I’m not going to list every function out there, but the point is that for the simplest functions of the CRT, there is often a straightforward Win32 API equivalent. When there is not, it is usually simple to code one (such as atol). But, as you can see, as soon as the project grows a little, this approach quickly becomes impractical.
Also, note that wsprintf and wvsprintf do not support floating point.
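As an illustration of how little code a hand-written replacement can take, here is a minimal sketch of an atol-like function. It is not the CRT’s implementation: my_atol is a hypothetical name, it only skips spaces and tabs (not every whitespace character), and it does no overflow checking.

```c
/* Hypothetical stand-in for atol: parses an optionally signed
   decimal number and stops at the first non-digit character.
   Returns 0 if no digits are found. No overflow checking. */
static long my_atol(const char *s)
{
    long value = 0;
    int negative = 0;

    /* Skip leading spaces and tabs. */
    while (*s == ' ' || *s == '\t')
        s++;

    /* Optional sign. */
    if (*s == '-') {
        negative = 1;
        s++;
    } else if (*s == '+') {
        s++;
    }

    /* Accumulate decimal digits. */
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + (*s - '0');
        s++;
    }

    return negative ? -value : value;
}
```

It depends on nothing but the language itself, so it links without any CRT symbol.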
The fraught-with-peril way: Use it a little
The problem with the CRT is the lengthy initialization code it adds to an EXE, even when linked dynamically. It seems to want to call every KERNEL32 API under the sun, even if in the end you will just use it for a string compare or two.
But sometimes you absolutely need to link in some stand-alone code from the CRT, such as the __int64 support code (__xllmul and __xlldiv, for multiplication and division respectively). One quick way is to skip the initialization code by redirecting the entry point with the /ENTRY linker option. That way, execution starts right in your code, and (in your release build at least) the CRT initialization code won’t be linked in at all.
I have to take a break and post a disclaimer here: skipping the CRT initialization code is dangerous. Lots of CRT functions, even simple ones such as memcpy or strcmp, reference global CRT variables which need to be initialized. Use this at your own risk, and in case of doubt, double-check the CRT source code.
This implies that you can’t really call CRT functions anyway, apart from the standalone ones. So the same rules apply as if you were not using the CRT at all, except that accidentally referencing a CRT symbol will be much harder to spot because it won’t cause a link error anymore. So, obviously, this is not a method I use often.
The safest way: Use another CRT
This is the method I use for KoroIRC. Basically, it means using another set of header files and library files when compiling the program.
I found that the safest way to do it is not to change VS’s library and include paths, but rather to build my release binaries using a special script, and build my debug builds using the regular CRT with the /MT compiler option. I don’t care if my debug binaries weigh 1MB and have tons of dead code in them, as I never ship them.
The trick is to make a little batch file like this:
    @echo off
    set INCLUDE=altcrt\include;%INCLUDE%
    set LIB=altcrt\lib;%LIB%
    devenv solution.sln /rebuild "Release|Win32" /useenv
Replace altcrt with the proper path to your alternate CRT, run this in a Visual Studio Command Prompt, and let it churn.
I will now finish this post by presenting two alternate CRTs you can use.
Lightweight but undocumented: NTDLL’s mini-CRT
A little-known fact is that NTDLL.DLL embeds its own CRT, probably for use by operating system components and native applications. Of course, you won’t find complex functions such as rand or time but all string functions, and more importantly, SEH support functions, are there. Since it is always loaded in all processes anyway, it costs nothing to use, except Windows 9x compatibility of course.
To use it, you’ll need the Windows DDK, which includes all the header files and library files needed in order to import from NTDLL. However, in order to be able to use it with PSDK-compiled applications, you will need to extract only the right files, that is, the whole contents of the inc\crt directory and the lib\wxp\i386\ntdll.lib file for x86 or lib\wnet\amd64\ntdll.lib for x64. Then it is only a matter of using them with the little script I presented earlier.
Harder but safer: Good old MSVCRT
To link against MSVCRT.DLL, you will need two things: an old Visual Studio 6.0 CD, and a copy of the Windows Server 2003 SP1 Platform SDK that was released just before Visual Studio 2005.
On the Visual Studio 6.0 CD, you will be able to extract the VS6 CRT headers from the VC98\INCLUDE directory, and the libraries from the VC98\LIB directory. The only problem is that they are mixed with Platform SDK files, so you’ll have to pick them one by one. Luckily, I have thrown together this list of what files belong to the CRT.
The Platform SDK release mentioned above is special because it contained a prerelease version of the x64 compiler that came with Visual Studio 2005. As such it also contains everything necessary in order to link with Windows x64’s MSVCRT.DLL. Fortunately, this time, the headers are in their own subdirectory, Include\crt. The libraries are in Lib\amd64 and the PDB files are in NoRedist\Win64\AMD64.
With all this in hand, you should be able to link with MSVCRT.DLL on x86 and x64 using the script.
Conclusion
With these methods, it is possible to reduce either the executable size or the download size (by not having to redistribute more recent CRTs) of your program. Of course, this only applies if you program mostly in C, because when you start using C++ or worse, the STL, these approaches stop working well.