Trimming executables Part 1: Getting rid of the C runtime

Tuesday, September 9th, 2008

In a previous post I mentioned my problems with the Visual C Runtimes past version 6.0. This post is post 1 of n, with n having a high probability of being 2, on how to trim executable / download size. I will explain how to get rid of the beast that is the CRT, while retaining functionality.

The simplest way: Don’t use it at all

For a very small project, most of the time the only reason you use the CRT is that you don’t know you can do without it. But you can tell the compiler that you don’t want it, and here’s how.

The first thing to do is to turn on the /Zl (omit default library names) compiler option in your project / makefile. I find this cleaner than using the /NODEFAULTLIB linker option, because it only affects the CRT default libraries. At this point you’ve already done it: the CRT will no longer be linked into your binary.

But suddenly, your project will not link anymore. One of the first reasons is that the CRT supplies the entry point that the linker expects to find; since you have removed the CRT, you’ll have to supply your own. Fortunately, the names of the entry points don’t change much, so it’s just a matter of picking the right one:

// DLL entry point (linker symbol: __DllMainCRTStartup@12)
extern "C" BOOL WINAPI _DllMainCRTStartup(HINSTANCE, DWORD, LPVOID);
// GUI program entry point (linker symbol: _WinMainCRTStartup)
extern "C" __declspec(noreturn) void __cdecl WinMainCRTStartup(void);
// Console program entry point (linker symbol: _mainCRTStartup)
extern "C" __declspec(noreturn) void __cdecl mainCRTStartup(void);

You’ll instantly notice that the WinMain and main entry points are declared with __declspec(noreturn) and have a void return type. That is not the whole truth. You can return from that function, at least on Windows NT, and BaseProcessStart will happily terminate the thread. But only the thread. That is why I prefer to end these functions with a call to ExitProcess instead.

As for the DllMain entry point, there is a good chance that you won’t need one. The CRT absolutely needs it to do some initialization, but if you’re like me, and try to do as little as possible in this function, you probably just have a call to DisableThreadLibraryCalls and then return TRUE. If that is the case, you can use the /NOENTRY linker option, which will link the DLL with no entry point at all.

Next, you will encounter linking errors because several “special” CRT symbols are referenced by code that is added by the compiler. Fortunately, these are easy to weed out:

| Symbol reference | How to get rid of it |
|---|---|
| ___security_cookie, @__security_check_cookie@4 | Setting the /GS- compiler option. (VS: set “Buffer Security Check” to No) |
| @_RTC_CheckStackVars@8, __RTC_CheckEsp, __RTC_Shutdown, __RTC_InitBase | Removing all the /RTCxx compiler options. (VS: set “Basic Runtime Checks” to Default) |
| __chkstk | Not using more than 4k of stack variables. |
| _memset, _memcpy | Not using the assignment operator to initialize structs inside functions, or to copy between structs. |
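As a rough sketch of the _memset / _memcpy advice: instead of writing `struct settings s = {0};` or `*dst = *src;`, which the compiler may turn into calls to memset and memcpy, you can go through small helpers of your own. The names tiny_zero, tiny_copy and struct settings are made up for this example, and the volatile pointers are only a precaution (an assumption, not a guarantee) against the optimizer turning the loops back into memset/memcpy calls.

```c
#include <stddef.h>

/* Hypothetical helper: zero `size` bytes at `dst` without calling memset.
   The volatile pointer discourages the optimizer from recognizing the
   loop and replacing it with a memset call. */
void tiny_zero(void *dst, size_t size)
{
    volatile unsigned char *p = (volatile unsigned char *)dst;
    while (size--)
        *p++ = 0;
}

/* Hypothetical helper: copy `size` bytes from `src` to `dst`.
   The two buffers must not overlap. */
void tiny_copy(void *dst, const void *src, size_t size)
{
    volatile unsigned char *d = (volatile unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    while (size--)
        *d++ = *s++;
}

/* Illustrative struct, not taken from any real project. */
struct settings {
    int  width;
    int  height;
    char name[32];
};

/* Used instead of `struct settings s = {0};` */
void settings_init(struct settings *s)
{
    tiny_zero(s, sizeof *s);
    s->width  = 640;
    s->height = 480;
}

/* Used instead of `*dst = *src;` */
void settings_clone(struct settings *dst, const struct settings *src)
{
    tiny_copy(dst, src, sizeof *dst);
}
```

Byte-by-byte loops are slower than the real memset and memcpy, but for the small structs of a small program that rarely matters.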

Note that if you are doing operations on __int64 values or using structured exception handlers (SEH), you will need the CRT. More on this later.

The last part of getting rid of the CRT is to replace CRT calls with equivalent API calls. I have put together a quick little table mapping the most used functions:

| CRT function | Win32 API function |
|---|---|
| strxxx | lstrxxx |
| sprintf | wsprintf |
| vsprintf | wvsprintf |
| malloc | HeapAlloc |
| realloc | HeapReAlloc |
| free | HeapFree |

I’m not going to list every function out there, but the point is, for the simplest functions of the CRT, there is often a straightforward Win32 API equivalent. When there is not, it is usually simple to code one (such as atol). But, as you can see, as soon as the project grows a little, this approach quickly becomes impractical.

Also, note that wsprintf and wvsprintf do not support floating point.
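To give an idea of how little code a hand-rolled replacement takes, here is a minimal sketch of an atol stand-in. The name tiny_atol is made up for this example; it assumes plain decimal input, and it simply wraps around on overflow instead of reporting an error.

```c
/* Hypothetical stand-in for atol: skips leading whitespace, accepts an
   optional sign, then reads decimal digits until the first character
   that is not a digit. Returns 0 if no digits are found. */
long tiny_atol(const char *s)
{
    unsigned long value = 0;
    int negative = 0;

    /* Skip space, \t, \n, \v, \f and \r. */
    while (*s == ' ' || (*s >= '\t' && *s <= '\r'))
        s++;

    if (*s == '+' || *s == '-') {
        negative = (*s == '-');
        s++;
    }

    /* Accumulate in an unsigned type so that overflow wraps around
       instead of being undefined behavior. */
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + (unsigned long)(*s - '0');
        s++;
    }

    return negative ? (long)(0 - value) : (long)value;
}
```

For example, tiny_atol("  -17abc") stops at the first non-digit and returns -17.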

The fraught-with-peril way: Use it a little

The problem with the CRT is the lengthy initialization code it adds to an EXE, even when linked dynamically. It seems to want to call every KERNEL32 API under the sun, even if in the end you will just use it for a string compare or two.

But sometimes you absolutely need to link in some stand-alone code from the CRT, such as the __int64 support code (__allmul and __alldiv, for multiplication and division respectively). One quick way is to skip the initialization code by redirecting the entry point with the /ENTRY linker option. That way, execution starts right at your code, and (in your release build at least) the CRT initialization code won’t be linked in at all.

I have to take a break and post a disclaimer here: skipping the CRT initialization code is dangerous. Lots of CRT functions, even simple ones such as memcpy or strcmp, reference global CRT variables which need to be initialized. Use this at your own risk, and in case of doubt, double-check the CRT source code.

This therefore implies that you can’t really call CRT functions anyway, apart from standalone ones. So the same rules as if you were not using the CRT at all apply, except for the fact that accidentally referencing a CRT symbol will be much harder to spot because it won’t cause a link error anymore. So, obviously, this is not a way that I often use.

The safest way: Use another CRT

This is the method I use for KoroIRC. Basically, it means using another set of header files and library files when compiling the program.

I found that the safest way to do it is not to change VS’s library and include paths, but rather to build my release binaries using a special script, and build my debug builds using the regular CRT with the /MT compiler option. I don’t care if my debug binaries weigh 1MB and have tons of dead code in them, as I never ship them.

The trick is to make a little batch file like this:

@echo off
set INCLUDE=altcrt\include;%INCLUDE%
set LIB=altcrt\lib;%LIB%
devenv solution.sln /rebuild "Release|Win32" /useenv

Replace altcrt with the proper path to your alternate CRT, run this in a Visual Studio Command Prompt, and let it churn.

I will now finish this post by presenting two alternate CRTs you can use.

Lightweight but undocumented: NTDLL’s mini-CRT

A little-known fact is that NTDLL.DLL embeds its own CRT, probably for use by operating system components and native applications. Of course, you won’t find complex functions such as rand or time but all string functions, and more importantly, SEH support functions, are there. Since it is always loaded in all processes anyway, it costs nothing to use, except Windows 9x compatibility of course.

To use it, you’ll need the Windows DDK, which includes all the header files and library files needed in order to import from NTDLL. However, in order to be able to use it with PSDK-compiled applications, you will need to extract only the right files, that is, the whole contents of the inc\crt directory and the lib\wxp\i386\ntdll.lib file for x86 or lib\wnet\amd64\ntdll.lib for x64. Then it is only a matter of using them with the little script I presented earlier.

Harder but safer: Good old MSVCRT

In order to do that, you will need two things: an old Visual Studio 6.0 CD, and a copy of the Windows Server 2003 SP1 Platform SDK that was released just before Visual Studio 2005.

On the Visual Studio 6.0 CD, you will be able to extract the VS6 CRT headers from the VC98\INCLUDE directory, and the libraries from the VC98\LIB directory. The only problem is that they are mixed with Platform SDK files, so you’ll have to pick them one by one. Luckily, I have thrown together this list of what files belong to the CRT.

The Platform SDK release linked above is special because it contained a prerelease version of the x64 compiler that came with Visual Studio 2005. As such it also contains everything necessary in order to link with Windows x64’s MSVCRT.DLL. Fortunately, this time, the headers are in their own subdirectory, Include\crt. The libraries are in Lib\amd64 and the PDB files are in NoRedist\Win64\AMD64.

With all this in hand, you should be able to link with MSVCRT.DLL on x86 and x64 using the script.

Conclusion

With these methods, it is possible to reduce either the executable size or the download size (for not having to redistribute more recent CRTs) for your program. Of course, this only applies if you program mostly in C, because when you start using C++ or worse, the STL, these approaches stop working well.

Visual C++ woes

Tuesday, June 17th, 2008

It seems that with every new version of Visual Studio, Microsoft does something to alienate its users in regard to the C runtime, compiler and/or linker.

Back in the days of Visual Studio 6.0 (still, in my opinion, the best version to have ever existed, despite its age[1]), things were simple. There were no useless compiler flags to disable in every project, and the C runtime library was MSVCRT.DLL, which was distributed with the operating system. Actually, this is not fully the case, but MSVCRT.DLL has always been there due to the strong reliance of operating system components on it. It was therefore very simple to create small applications and to distribute them.

Then came Visual Studio 7.0, a.k.a. Visual Studio .NET 2002. Suddenly, someone somewhere decided that MSVCRT.DLL belonged “to the operating system” and not to Visual C++, and that therefore, its C runtime should now reside in a new, separate, redistributable DLL, MSVCR70.DLL. So suddenly, there was no real incentive to compile with /MD, as it would create the need to distribute a ~330k DLL along with the application. Personally, I would have remained with Visual Studio 6.0, but in the latest Platform SDK, which at the time was the then-new Windows XP SP2 Platform SDK, suddenly the format of the .lib files was just slightly incompatible with the aging linker.

I think it was in Visual Studio 7.1, which, by the way, I have never used personally, that the damned /GS compiler flag was introduced. What better way to inflate code size uselessly than to add a prolog and epilog in every function that checks for a “security cookie”? Personally, I never allocate buffers and strings on the stack anyway, and when I do, I very strictly validate their length, so the protection offered by this flag is quite nil to me.

Then, in Visual Studio 8.0 (2005), somebody sure got angered at all the developers that were copying MSVCR7x.DLL around freely, so it was spoken: “From now on, the C runtime will only be installed by our own installer”. So, to install MSVCR80.DLL on a system, it’s no longer as simple as including the file in your installer and dumping it in the application’s directory during installation: you have to have a freaking manifest to even be able to load it! And of course, since this uses Fusion, the files need to be installed into specific subdirectories of the WinSxS directory, and therefore the only “installer” qualified to do it correctly is the Visual C++ 2005 Redistributable Package, which weighs a whopping 2.6 MB! Quite a lot for what would have been an application of a few hundred kilobytes, now isn’t it? And to make sure you don’t ever try to bypass this “rule” and copy it in the application’s directory, the DLL does a check at startup to see if it was loaded “correctly”, and, if not, refuses to load.

Now, it isn’t that bad, I found a workaround to that (which I will explain in a future post), so I got used to that version and grew to like it.

I then recently got Visual Studio 9.0 (2008) and installed it. Not that it was really useful, since I already had the Windows Vista (February CTP) SDK installed over my Visual Studio 2005 installation, but I installed it just to be using the latest version. Actually, it’s not so bad, it’s even an improvement over the previous version. The /DYNAMICBASE and /NXCOMPAT flags are enabled by default (unless you migrate your project from Visual Studio 2005), and the operating system and subsystem version number in the PE headers were finally bumped to being set to 5.0 by default. That last point is also the frustrating point though. It was probably done that way because version 9.0 of the C runtime wouldn’t run correctly on anything lower than 5.0 anyway, but if you are not using the C runtime at all, there is no reason to set them to 5.0, other than artificially restricting the versions of Windows your program can run on. So, you might say, “let’s just tell the linker to set them back to 4.0 then”. Except, when you try to do that, you get this lovely little error:

LINK : warning LNK4010: invalid subsystem version number 4.0; default subsystem version assumed

Yes, you guessed right. The linker won’t let you set 4.0 as the subsystem version number. It looks quite like a useless restriction put there just to annoy those of us who still support, or just want to code for, Windows 9x/NT4.

But as always, something like that isn’t enough to stop me, I created my own workaround:

The fact that an executable linked with Visual Studio 2008, but not using its C runtime, runs fine on Windows NT 4.0 afterwards proves how much of an artificial restriction that is.
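For illustration only, and not necessarily how the tool mentioned here works: a post-link fix-up of this kind could simply overwrite the version fields in the PE optional header. The sketch below assumes the whole image has already been read into a memory buffer; pe_set_version and its helpers are hypothetical names, and the offsets follow the documented PE optional header layout, where these fields sit at the same offsets in PE32 and PE32+ images.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian accessors, so the sketch does not depend on host byte order. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void wr16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)(v >> 8);
}

/* Hypothetical helper: sets the operating system and subsystem version
   fields of a PE image held in memory to major.minor.
   Returns 1 on success, 0 if the buffer does not look like a PE image. */
int pe_set_version(unsigned char *image, size_t size,
                   uint16_t major, uint16_t minor)
{
    uint32_t pe_off;
    unsigned char *opt;
    uint16_t magic;

    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return 0;

    pe_off = rd32(image + 0x3C);                 /* e_lfanew */
    if (pe_off > size || size - pe_off < 24 + 52)
        return 0;
    if (image[pe_off] != 'P' || image[pe_off + 1] != 'E' ||
        image[pe_off + 2] != 0 || image[pe_off + 3] != 0)
        return 0;

    /* 4-byte signature + 20-byte COFF header precede the optional header. */
    opt = image + pe_off + 24;
    magic = rd16(opt);
    if (magic != 0x10B && magic != 0x20B)        /* PE32 or PE32+ */
        return 0;

    wr16(opt + 40, major);                       /* MajorOperatingSystemVersion */
    wr16(opt + 42, minor);                       /* MinorOperatingSystemVersion */
    wr16(opt + 48, major);                       /* MajorSubsystemVersion */
    wr16(opt + 50, minor);                       /* MinorSubsystemVersion */
    return 1;
}
```

Calling pe_set_version(buffer, length, 4, 0) and writing the buffer back to disk would set both versions to 4.0. Note that this sketch leaves the CheckSum field of the header untouched.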

The tool is up for download here.

  1. It’s not so bad once you install WndTabs and work around the few bugs in the resource editor.

Migrating a project from VS2005 to VS2008 disables ASLR

Friday, June 13th, 2008

I just spent about an hour trying to figure out why some of KoroIRC’s binaries did not have the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE and IMAGE_DLLCHARACTERISTICS_NX_COMPAT flags enabled in their PE header, while others did.
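As a hedged sketch of how one might check for these two flags programmatically, the function below reads the DllCharacteristics field from a PE image that has already been loaded into a memory buffer. The function name and the DLLCHAR_* macros are hypothetical; the macro values mirror the documented IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE (0x0040) and IMAGE_DLLCHARACTERISTICS_NX_COMPAT (0x0100) constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of the two documented flag values (hypothetical names). */
#define DLLCHAR_DYNAMIC_BASE 0x0040
#define DLLCHAR_NX_COMPAT    0x0100

/* Little-endian 32-bit read, independent of host byte order. */
static uint32_t pe_rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the 16-bit DllCharacteristics value of a
   PE image held in memory, or -1 if the buffer does not look like one. */
int pe_dll_characteristics(const unsigned char *image, size_t size)
{
    uint32_t pe_off;
    const unsigned char *opt;

    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;

    pe_off = pe_rd32(image + 0x3C);              /* e_lfanew */
    if (pe_off > size || size - pe_off < 24 + 72)
        return -1;
    if (image[pe_off] != 'P' || image[pe_off + 1] != 'E' ||
        image[pe_off + 2] != 0 || image[pe_off + 3] != 0)
        return -1;

    /* The optional header follows the 4-byte signature and the 20-byte
       COFF header; DllCharacteristics is at offset 70 within it, for
       both PE32 and PE32+ images. */
    opt = image + pe_off + 24;
    return opt[70] | (opt[71] << 8);
}
```

A binary is fine when the returned value has both DLLCHAR_DYNAMIC_BASE and DLLCHAR_NX_COMPAT set.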

Oddly, only the older projects in the solution had this problem; the newer ones were OK. I spent a while wondering if it might have been a linker bug, then I decided to look at the build log. Surprisingly, the affected projects had the /DYNAMICBASE:NO linker switch specified, even though I specified the opposite in the solution-global .vsprops files.

So I decided to take a look directly in the .vcproj files, and lo and behold, in the linker options, the following lines were there:

RandomizedBaseAddress="1"
DataExecutionPrevention="0"

I never add stuff directly to the .vcproj files, because I prefer .vsprops files, which allow sharing common settings between projects. So the migration from VS2005 to VS2008 must have “helpfully” inserted these for me.

To quote Stan from South Park, “I learned something today…”