
Reducing Executable Size In C/C++ Programs By Eliminating The C Runtime

Started by Frederick J. Harris, October 16, 2014, 03:25:27 PM


Frederick J. Harris

Hutch recently posted a Windows GUI template program in the PowerBASIC forums (a C program) which he says compiles to 1.5 K.  I wasn't able to achieve that, but I'm not sure what compiler switches and optimizations he used.  Anyway, his post caused me to look into the issue further.  I've always been intrigued by trying to get my programs as small and efficient as possible, but I've never gone to what I'd consider extraordinary pains to do so.  In my C or C++ work I usually just make sure I'm doing a Release build, I set the compiler to optimize for size, and I flick whatever switches are necessary to remove debug info from the executables.

In terms of exe sizes, I've always had better luck with the older GCC compiler suite than with Microsoft compilers.  Even with the old VC 6 it's hard to get GUI templates down below 30 K.  And of course every new edition of any of the compilers just makes things worse.  The latest Microsoft compiler I have is VC9, and the executables are pretty large.  However, with the technique I'm going to describe next, I've been able to get a VC6 or VC9 GUI template down to 3584 bytes (4096 bytes on disk).  I'm getting the same results whether I use my message cracker scheme with function pointers and the for loop in the WndProc, or the slightly less verbose standard WndProc with switch.  Here is my test program...


//Main.h
#ifndef Main_h
#define Main_h

#define dim(x) (sizeof(x) / sizeof(x[0]))

struct WndEventArgs
{
HWND                         hWnd;
WPARAM                       wParam;
LPARAM                       lParam;
HINSTANCE                    hIns;
};

long fnWndProc_OnCreate       (WndEventArgs& Wea);
long fnWndProc_OnDestroy      (WndEventArgs& Wea);

struct EVENTHANDLER
{
unsigned int                 iMsg;
long                         (*fnPtr)(WndEventArgs&);
};

const EVENTHANDLER EventHandler[]=
{
{WM_CREATE,                  fnWndProc_OnCreate},
{WM_DESTROY,                 fnWndProc_OnDestroy}
};
#endif



// Main.cpp
// cl Main.cpp libctiny.lib Kernel32.lib User32.lib Gdi32.lib /O1 /FeForm1.exe
#include <windows.h>
#include <tchar.h>
#include "AggressiveOptimize.h"
#include "Main.h"


long fnWndProc_OnCreate(WndEventArgs& Wea)
{
Wea.hIns=((LPCREATESTRUCT)Wea.lParam)->hInstance;
return 0;
}


long fnWndProc_OnDestroy(WndEventArgs& Wea)
{
PostQuitMessage(0);
return 0;
}



LRESULT CALLBACK fnWndProc(HWND hwnd, unsigned int msg, WPARAM wParam, LPARAM lParam)
{
WndEventArgs Wea;

for(unsigned int i=0; i<dim(EventHandler); i++)
{
     if(EventHandler[i].iMsg==msg)
     {
        Wea.hWnd=hwnd, Wea.lParam=lParam, Wea.wParam=wParam;
        return (*EventHandler[i].fnPtr)(Wea);
     }
}

return (DefWindowProc(hwnd, msg, wParam, lParam));
}


int WINAPI WinMain(HINSTANCE hIns, HINSTANCE hPrevIns, LPSTR lpszArgument, int iShow)
{
TCHAR szClassName[]=_T("Form4");
WNDCLASSEX wc;
MSG messages;
HWND hWnd;

wc.lpszClassName=szClassName;                wc.lpfnWndProc=fnWndProc;
wc.cbSize=sizeof (WNDCLASSEX);               wc.style=CS_BYTEALIGNCLIENT|CS_BYTEALIGNWINDOW;
wc.hIcon=LoadIcon(NULL,IDI_APPLICATION);     wc.hInstance=hIns;
wc.hIconSm=LoadIcon(NULL, IDI_APPLICATION);  wc.hCursor=LoadCursor(NULL,IDC_ARROW);
wc.hbrBackground=(HBRUSH)COLOR_BTNSHADOW;    wc.cbWndExtra=0;
wc.lpszMenuName=NULL;                        wc.cbClsExtra=0;
RegisterClassEx(&wc);
hWnd=CreateWindowEx(0,szClassName,szClassName,WS_OVERLAPPEDWINDOW,75,75,320,305,HWND_DESKTOP,0,hIns,0);
ShowWindow(hWnd,iShow);
while(GetMessage(&messages,NULL,0,0))
{
    TranslateMessage(&messages);
    DispatchMessage(&messages);
}

return (int)messages.wParam;
}
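
The table-driven dispatch used in fnWndProc above can be isolated from the Windows API. The following is a minimal, portable sketch, not the code from the post: the message ids, the EventArgs struct and the handler names are hypothetical stand-ins for WM_CREATE, WM_DESTROY and WndEventArgs, and kUnhandled stands in for whatever DefWindowProc would return.

```cpp
#include <cstddef>

// Hypothetical message ids standing in for WM_CREATE / WM_DESTROY.
constexpr unsigned kMsgCreate  = 1;
constexpr unsigned kMsgDestroy = 2;
constexpr long     kUnhandled  = -1;   // stands in for DefWindowProc's result

// Hypothetical argument bundle, playing the role of WndEventArgs.
struct EventArgs { unsigned msg; long param; };

constexpr long onCreate (const EventArgs& e) { return e.param + 100; }
constexpr long onDestroy(const EventArgs& e) { return e.param + 200; }

// One row per handled message: the id and the function to call for it.
struct Handler {
    unsigned id;
    long (*fn)(const EventArgs&);
};

constexpr Handler kHandlers[] = {
    {kMsgCreate,  onCreate},
    {kMsgDestroy, onDestroy},
};

// Linear search through the table; the first matching id wins.
constexpr long dispatch(unsigned msg, long param) {
    for (std::size_t i = 0; i < sizeof(kHandlers) / sizeof(kHandlers[0]); ++i) {
        if (kHandlers[i].id == msg) {
            return kHandlers[i].fn(EventArgs{msg, param});
        }
    }
    return kUnhandled;   // no handler registered for this message
}
```

Here dispatch(kMsgCreate, 5) finds the first row and returns onCreate's result, while an id that is not in the table falls through to kUnhandled. Handling another message means adding one row to the table rather than another case label.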


And here's a link to the article I used to manage this all...

http://www.catch22.net/tuts/reducing-executable-size

You'll note the "AggressiveOptimize.h" header file included above.  That is also available at that link, but here it is...



//////////////////////////////
// Version 1.10
// Jan 23rd, 2000
// Version 1.00
// May 20th, 1999
// Todd C. Wilson, Fresh Ground Software
// (todd@nopcode.com)
// This header file will kick in settings for Visual C++ 5 and 6 that will (usually)
// result in smaller exe's.
// The "trick" is to tell the compiler to not pad out the function calls; this is done
// by not using the /O1 or /O2 option - if you do, you implicitly use /Gy, which pads
// out each and every function call. In one single 500k dll, I managed to cut out 120k
// by this alone!
// The other two "tricks" are telling the Linker to merge all data-type segments together
// in the exe file. The relocation, read-only (constants) data, and code section (.text)
// sections can almost always be merged. Each section merged can save 4k in exe space,
// since each section is padded out to 4k chunks. This is very noticeable with smaller
// exes, since you could have only 700 bytes of data, 300 bytes of code, 94 bytes of
// strings - padded out, this could be 12k of runtime, for 1094 bytes of stuff!
// Note that if you're using MFC static or some other 3rd party libs, you may get poor
// results with merging the readonly (.rdata) section - the exe may grow larger.
// To use this feature, define _MERGE_RDATA_ in your project or before this header is used.
// With Visual C++ 5, the program uses a file alignment of 512 bytes, which results
// in a small exe. Under VC6, the program instead uses 4k, which is the same as the
// section size. The reason (from what I understand) is that 4k is the chunk size of
// the virtual memory manager, and that WinAlign (an end-user tuning tool for Win98)
// will re-align the programs on this boundary. The problem with this is that all of
// Microsoft's system exes and dlls are not tuned like this, and using 4k causes serious
// exe bloat. Very noticeable for smaller programs.
// The "trick" for this is to use the undocumented FILEALIGN linker parm to change the
// padding from 4k to 1/2k, which results in a much smaller exe - anywhere from 20%-75%
// depending on the size.


#ifdef NDEBUG
// /Og (global optimizations), /Os (favor small code), /Oy (no frame pointers)
#pragma optimize("gsy",on)

#pragma comment(linker,"/RELEASE")

// Note that merging the .rdata section will result in LARGER exe's if you are using
// MFC (esp. static link). If this is desirable, define _MERGE_RDATA_ in your project.
#ifdef _MERGE_RDATA_
#pragma comment(linker,"/merge:.rdata=.data")
#endif // _MERGE_RDATA_

#pragma comment(linker,"/merge:.text=.data")
#pragma comment(linker,"/merge:.reloc=.data")

#if _MSC_VER >= 1000
// Only supported/needed with VC6; VC5 already does 0x200 for release builds.
// Totally undocumented! And if you set it lower than 512 bytes, the program crashes.
// Either leave at 0x200 or 0x1000
#pragma comment(linker,"/FILEALIGN:0x200")
#endif // _MSC_VER >= 1000

#endif // NDEBUG
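
The padding arithmetic in the header comment can be checked directly. The sketch below is a simplification: it counts only section payload rounded up to an alignment boundary and ignores the PE headers, and the helper names are made up for illustration.

```cpp
#include <cstddef>

// Round `size` up to the next multiple of `alignment`.
constexpr std::size_t alignUp(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) / alignment * alignment;
}

// Figures quoted in the header comment: data, code and string bytes.
constexpr std::size_t kData    = 700;
constexpr std::size_t kCode    = 300;
constexpr std::size_t kStrings = 94;

// Three separate sections, each padded out to 4 KB on its own.
constexpr std::size_t separateAt4K() {
    return alignUp(kData, 4096) + alignUp(kCode, 4096) + alignUp(kStrings, 4096);
}

// Everything merged into one section, padded to the given file alignment.
constexpr std::size_t mergedAt(std::size_t fileAlign) {
    return alignUp(kData + kCode + kStrings, fileAlign);
}
```

With the quoted figures, three separate sections at 4 KB come to 12288 bytes, one merged section at 4 KB comes to 4096 bytes, and one merged section at a 512-byte file alignment comes to 1536 bytes. The 3584, 4096 and 5120 byte executables reported in this thread are all whole multiples of 512, which is consistent with a 0x200 file alignment.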


The original reference for all this work was Matt Pietrek who used to write for Microsoft Systems Journal.  His web page, and the link where you can get his libctiny.lib file needed for all this, is here ...

http://www.wheaty.net/

The secret to this rather severe and outrageous level of optimization is keeping msvcrt code from being linked into the exe.  Through the /nodefaultlib switch in Visual Studio one can tell the linker not to link with it, but then your program will no longer compile/link.  So you need to use several of Matt Pietrek's substitutes for the startup code that has to run before main() or WinMain() is called.  The article gives the details.
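
To make "startup code" concrete: before main or WinMain runs, something has to execute the program's initializers, and afterwards pass the return value back as the exit code. The sketch below simulates that sequence in portable C++. It is not Matt Pietrek's code; RuntimeState, the two initializers, userMain and tinyStartup are hypothetical names chosen for illustration.

```cpp
#include <cstddef>

// Hypothetical process state touched by the initializers (a stand-in for
// things like a heap handle or a parsed command line).
struct RuntimeState {
    int  initializersRun;
    bool heapReady;
};

using InitFn = void (*)(RuntimeState&);

constexpr void initHeap(RuntimeState& s)    { s.heapReady = true; ++s.initializersRun; }
constexpr void initGlobals(RuntimeState& s) { ++s.initializersRun; }

// Table of initializers that the stub walks before the program proper starts.
constexpr InitFn kInitTable[] = { initHeap, initGlobals };

// Stand-in for the user's main/WinMain: returns 0 only if setup was complete.
constexpr int userMain(const RuntimeState& s) {
    return (s.heapReady && s.initializersRun == 2) ? 0 : 1;
}

// The stub itself: run every initializer in order, then call the user's
// entry point and hand its result back as the exit code.
constexpr int tinyStartup() {
    RuntimeState state{0, false};
    for (std::size_t i = 0; i < sizeof(kInitTable) / sizeof(kInitTable[0]); ++i) {
        kInitTable[i](state);
    }
    return userMain(state);
}
```

A real replacement stub does the same job with operating system calls instead of a simulated state struct, which is why it can be so much smaller than the full runtime's startup path.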

I used the IDE for my VC6 build, but for VC9 I just used this command line...

cl Main.cpp libctiny.lib Kernel32.lib User32.lib Gdi32.lib /O1 /FeForm1.exe

It's libctiny.lib that you need to download from Matt's site.

I'm really tickled with that 3584 figure.  It's got me wondering if I might be able to make more use of this work.  If I make any more progress with this, I'll post about it here.  I'd be interested in looking at the 64 bit issue.  In Matt's download he has all the *.cpp and *.h files needed to create the lib, as well as a make file, but as yet I haven't tried building it myself in either 32 bit or 64 bit configuration.  I'm sure the 32 bit would work - don't know about 64 bit.


Frederick J. Harris

And I was able to find a link to Matt Pietrek's original Microsoft Systems Journal article ...

http://msdn.microsoft.com/library/bb985746.aspx


Patrice Terrier

Fred--

I don't think that would work with the current version(s) of C++ Visual Studio.

The only thing I have been able to do was to use the Multithread (/MT) option, which makes the compiler place the library name LIBCMT.lib into the .obj file so that the linker uses LIBCMT.lib to resolve external symbols.

...
Patrice Terrier
GDImage (advanced graphic addon)
http://www.zapsolution.com

Frederick J. Harris

My guess is it would, Patrice, at least in terms of something like the basic template program I posted above, but I'm not sure how extensible it is in terms of more complex programs.  Anyway, aren't you using VS 2005 for some of your projects because it was making smaller executables for you?  I just seem to recall you stating that somewhere.  The VC9 it worked for is the one that comes with VS 2008.

Patrice Terrier

No, I am not using VS2005 anymore, but VS2010 (I also have VS2013, but keep using VS2010 for the smallest size, as you stated).


James C. Fuller

I really don't care about size anymore but when I want to try for the smallest I use TinyC.
Using bc9Basic I get this translation using standard windowsx.h message crackers.
32-bit: 3584 bytes
64-bit: 5120 bytes


// *********************************************************************
//  Created with bc9Basic - BASIC To C/C++ Translator (V) 9.1.9.0 (2014/09/08)
//       The bc9Basic translator (bc9.exe) was compiled with
//                           g++ (tdm64-2) 4.8.1
// ----------------------------------------------------------------------
//                 BCX (c) 1999 - 2009 by Kevin Diggins
// *********************************************************************

/* -------------------------------*/
/* Tiny C support for LinkRes2Exe */
/* -------------------------------*/
int dummy __attribute__ ((section(".rsrc")));
/* -------------------------------*/

//              Translated for compiling with a C Compiler
//                           On MS Windows
// *********************************************************************

typedef char *PCHAR, *LPCH, *PCH, *NPSTR, *LPSTR, *PSTR;
#include <windows.h>
#include <windowsx.h>

// *************************************************
//        User's GLOBAL ENUM blocks
// *************************************************

// *************************************************
//            System Defined Constants
// *************************************************

typedef const char* ccptr;
#define CCPTR const char*
#define cfree free
#define WAITKEY system("pause")
#define cSizeOfDefaultString 2048

// *************************************************
//            User Defined Constants
// *************************************************

// *************************************************
//          User Defined Types And Unions
// *************************************************


// *************************************************
//            User Global Variables
// *************************************************



// *************************************************
//               Standard Macros
// *************************************************

#define BOR |


// *************************************************
//               User Prototypes
// *************************************************

int     WINAPI WinMain (HINSTANCE, HINSTANCE, LPSTR, int);
LRESULT CALLBACK WndProc (HWND, UINT, WPARAM, LPARAM);
void    WndProc_OnDestroy (HWND);

// *************************************************
//            User Global Initialized Arrays
// *************************************************



// *************************************************
//                 Runtime Functions
// *************************************************


// ************************************
//       User Subs and Functions
// ************************************

int WINAPI WinMain (HINSTANCE hInst, HINSTANCE hPrev, LPSTR CmdLine, int CmdShow)
{
    char     szClassName[] = "Form4";
    WNDCLASSEX  wcx = {0};
    HWND     hWnd = {0};
    MSG      uMsg = {0};
    wcx.cbSize = sizeof( wcx);
    wcx.style = CS_HREDRAW  BOR  CS_VREDRAW;
    wcx.lpfnWndProc = WndProc;
    wcx.cbClsExtra = 0;
    wcx.cbWndExtra = 0;
    wcx.hInstance = hInst;
    wcx.hIcon = LoadIcon( NULL, IDI_WINLOGO);
    wcx.hCursor = LoadCursor( NULL, IDC_ARROW);
    wcx.hbrBackground = ( HBRUSH) GetStockObject( WHITE_BRUSH);
    wcx.lpszMenuName = NULL;
    wcx.lpszClassName = szClassName;
    RegisterClassEx( &wcx);  // the class must be registered before CreateWindowEx can use it
    hWnd = CreateWindowEx( 0, szClassName, szClassName, WS_OVERLAPPEDWINDOW, 75, 75, 320, 305, HWND_DESKTOP, (HMENU) 0, hInst, 0);
    if(hWnd == 0 )
    {
        MessageBox (GetActiveWindow(), ("Problem"), "", 0 );
        return 0;
    }
    ShowWindow(hWnd, SW_SHOW);
    while(GetMessage( &uMsg, NULL, 0, 0))
    {
        TranslateMessage( &uMsg);
        DispatchMessage( &uMsg);
    }

    return uMsg.wParam;
}


LRESULT CALLBACK WndProc (HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch(msg)
    {
        HANDLE_MSG (hwnd, WM_DESTROY, WndProc_OnDestroy);
    default:
        return DefWindowProc (hwnd, msg, wParam, lParam);
    }
}

void WndProc_OnDestroy (HWND hWnd)
{
    PostQuitMessage(0);
}
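
The HANDLE_MSG line in WndProc relies on token pasting: windowsx.h defines one HANDLE_WM_xxx macro per message, and HANDLE_MSG glues the message name onto that prefix to pick the right one. The sketch below imitates the mechanism in portable C++ so it can be compiled anywhere. All names (CRACK_MSG, MSG_DESTROY, MSG_SIZE, Window) are hypothetical stand-ins, not the windowsx.h definitions, and the parameter unpacking is simplified.

```cpp
// Hypothetical stand-ins for a window and two message ids.
struct Window { int width; int height; unsigned state; bool quit; };
constexpr unsigned MSG_DESTROY = 2;
constexpr unsigned MSG_SIZE    = 5;

// One "cracker" per message: unpack wp/lp into typed arguments, call fn.
#define CRACK_MSG_DESTROY(wnd, wp, lp, fn) ((fn)(wnd), 0L)
#define CRACK_MSG_SIZE(wnd, wp, lp, fn) \
    ((fn)((wnd), (unsigned)(wp), (int)((lp) & 0xFFFF), (int)(((lp) >> 16) & 0xFFFF)), 0L)

// Pastes the message name onto CRACK_ to select the cracker above.
// Like HANDLE_MSG, it depends on the enclosing function naming its
// parameters exactly wp and lp.
#define CRACK_MSG(wnd, message, fn) \
    case (message): return CRACK_##message((wnd), (wp), (lp), (fn))

constexpr void onDestroy(Window& w) { w.quit = true; }
constexpr void onSize(Window& w, unsigned state, int cx, int cy) {
    w.state = state; w.width = cx; w.height = cy;
}

constexpr long wndProc(Window& wnd, unsigned msg, unsigned long wp, unsigned long lp) {
    switch (msg) {
        CRACK_MSG(wnd, MSG_DESTROY, onDestroy);
        CRACK_MSG(wnd, MSG_SIZE, onSize);
    default:
        return -1;   // stands in for DefWindowProc
    }
}

// Usage: pack 640x480 into lp (low word = width, high word = height).
constexpr int widthAfterResize() {
    Window w{0, 0, 0, false};
    wndProc(w, MSG_SIZE, 0, (480ul << 16) | 640ul);
    return w.width;
}

constexpr bool quitsOnDestroy() {
    Window w{0, 0, 0, false};
    return wndProc(w, MSG_DESTROY, 0, 0) == 0 && w.quit;
}
```

The benefit is that each handler receives typed arguments instead of raw wParam/lParam values, at the cost of one macro definition per message.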



Frederick J. Harris

Hi James!  I'll look into that.

I just took a few minutes to see if Matt Pietrek's make file worked, and it did - like a charm!  Here's my command line output from compiling a couple dozen source code files into objs and libctiny.lib with NMAKE.  I used Visual Studio 2008 (VC9), 32 bit...



c:\Program Files\Microsoft Visual Studio 9.0\VC>NMAKE /?

Microsoft (R) Program Maintenance Utility Version 9.00.21022.08
Copyright (C) Microsoft Corporation.  All rights reserved.

Usage:  NMAKE @commandfile
        NMAKE [options] [/f makefile] [/x stderrfile] [macrodefs] [targets]

Options:

/A Build all evaluated targets
/B Build if time stamps are equal
/C Suppress output messages
/D Display build information
/E Override env-var macros
/ERRORREPORT:{NONE|PROMPT|QUEUE|SEND} Report errors to Microsoft
/G Display !include filenames
/HELP Display brief usage message
/I Ignore exit codes from commands
/K Build unrelated targets on error
/N Display commands but do not execute
/NOLOGO Suppress copyright message
/P Display NMAKE information
/Q Check time stamps but do not build
/R Ignore predefined rules/macros
/S Suppress executed-commands display
/T Change time stamps but do not build
/U Dump inline files
/Y Disable batch-mode
/? Display brief usage message

c:\Program Files\Microsoft Visual Studio 9.0\VC>
c:\Program Files\Microsoft Visual Studio 9.0\VC>cd C:\Code\VStudio\VC++9\libctiny

C:\Code\VStudio\VC++9\libctiny>NMAKE libctiny.mak

Microsoft (R) Program Maintenance Utility Version 9.00.21022.08
Copyright (C) Microsoft Corporation.  All rights reserved.

        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 CRT0TCON.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

CRT0TCON.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 CRT0TWIN.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

CRT0TWIN.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 DLLCRT0.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

DLLCRT0.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 ARGCARGV.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

ARGCARGV.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 PRINTF.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

PRINTF.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 SPRINTF.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

SPRINTF.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 PUTS.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

PUTS.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 ALLOC.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

ALLOC.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 ALLOC2.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

ALLOC2.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 ALLOCSUP.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

ALLOCSUP.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 STRUPLWR.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

STRUPLWR.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 ISCTYPE.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

ISCTYPE.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 ATOL.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

ATOL.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 STRICMP.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

STRICMP.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 NEWDEL.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

NEWDEL.CPP
        CL /c /W3 /DWIN32_LEAN_AND_MEAN /O1 INITTERM.CPP
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

INITTERM.CPP
        LIB /OUT:LIBCTINY.LIB CRT0TCON.OBJ CRT0TWIN.OBJ DLLCRT0.OBJ ARGCARGV.OBJ PRINTF.OBJ SPRINTF.OBJ PUTS.OBJ ALLOC.OBJ ALLOC2.OBJ ALLOCSUP.OBJ STRUPLWR.OBJ  ISCTYPE.OBJ ATOL.OBJ STRICMP.OBJ NEWDEL.OBJ INITTERM.OBJ
Microsoft (R) Library Manager Version 9.00.21022.08
Copyright (C) Microsoft Corporation.  All rights reserved.


C:\Code\VStudio\VC++9\libctiny>cl Main.cpp libctiny.lib Kernel32.lib User32.lib Gdi32.lib /MT /O1 /FeForm1.exe
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

Main.cpp
Microsoft (R) Incremental Linker Version 9.00.21022.08
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:Form1.exe
Main.obj
libctiny.lib
Kernel32.lib
User32.lib
Gdi32.lib
libctiny.lib(CRT0TWIN.OBJ) : warning LNK4229: invalid directive '/OPT:NOWIN98' encountered; ignored
libctiny.lib(INITTERM.OBJ) : warning LNK4254: section '.CRT' (40000040) merged into '.data' (C0000040) with different attributes

C:\Code\VStudio\VC++9\libctiny>


I added the /MT switch to my command line string for Form1.exe, because I think my mistaken mention of msvcrt.dll led Patrice to think I was doing /MD...


cl Main.cpp libctiny.lib Kernel32.lib User32.lib Gdi32.lib /MT /O1 /FeForm1.exe


And I'm getting the same 3584 bytes you're getting with TinyC.

Tomorrow if I have time I might give it a go on x64.  I'm excited about this.  I think it's cool.

Frederick J. Harris

Instead of going to bed I just tried to compile libctiny for x64 and got it!!!  One little problem that was easy to fix.   The replacement for operator new had an unsigned int parameter that needed to be changed to size_t.  I recall having to make all those replacements in my String Class.  Anyway, got the lib compiled, and created the x64 GUI template at 5120 bytes!  I'll sleep better now! :)
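
The size_t change follows from the width of the types. With the 32 bit Microsoft compiler size_t is the same type as unsigned int, so a replacement operator new declared with unsigned int compiles; on x64 unsigned int stays at 32 bits while size_t widens to 64, and the language requires the first parameter of operator new to be size_t. Below is a minimal sketch of a replacement pair with the portable signature. It is not the libctiny source: it uses malloc/free so it compiles anywhere, whereas a runtime-free build would presumably call an operating system allocator instead, and holdsLargeSize is a hypothetical helper added only to show the width difference.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Replacement allocation function: the first parameter must be std::size_t.
void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);   // never request zero bytes
    if (!p) std::abort();                     // sketch only: no exception support assumed
    return p;
}

// Matching deallocation functions (plain and sized forms).
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical helper: true when a byte count of 4 GB survives a round
// trip through T, i.e. T is wide enough for 64-bit allocation sizes.
template <typename T>
constexpr bool holdsLargeSize() {
    return static_cast<unsigned long long>(static_cast<T>(0x100000000ull))
           == 0x100000000ull;
}
```

Declaring the parameter as size_t keeps one source file valid for both targets, which matches the experience described above of needing only that one change for the x64 build.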