Windows Programming/Handles and Data Types
One of the first things that is going to strike many first-time programmers of the Win32 API is that there are tons and tons of old data types to deal with. Sometimes, just keeping all the correct data types in order can be more difficult than writing a nice program. This page will talk a little bit about some of the data types that a programmer will come in contact with.
Hungarian Notation
[edit | edit source]First, let's make a quick note about the naming convention used for some data types, and some variables. The Win32 API uses the so-called "Hungarian Notation" for naming variables. Hungarian Notation requires that a variable be prefixed with an abbreviation of its data type, so that when you are reading the code, you know exactly what type of variable it is. The reason this practice is done in the Win32 API is because there are many different data types, making it difficult to keep them all straight. Also, there are a number of different data types that are essentially defined the same way, and therefore some compilers will not pick up on errors when they are used incorrectly. As we discuss each data type, we will also note the common prefixes for that data type.
Putting the letter "P" in front of a data type, or "p" in front of a variable usually indicates that the variable is a pointer. The letters "LP" or the prefix "lp" stands for "Long Pointer", which is exactly the same as a regular pointer on 32 bit machines. LP data objects are simply legacy objects that were carried over from Windows 3.1 or earlier, when pointers and long pointers needed to be differentiated. On modern 32-bit systems, these prefixes can be used interchangeably.
LPVOID
[edit | edit source]LPVOID data types are defined as being a "pointer to a void object". This may seem strange to some people, but the ANSI-C standard allows for generic pointers to be defined as "void*" types. This means that LPVOID pointers can be used to point to any type of object, without creating a compiler error. However, the burden is on the programmer to keep track of what type of object is being pointed to.
Also, some Win32 API functions may have arguments labeled as "LPVOID lpReserved". These reserved data members should never be used in your program, because they either depend on functionality that hasn't yet been implemented by Microsoft, or else they are only used in certain applications. If you see a function with an "LPVOID lpReserved" argument, you must always pass a NULL value for that parameter - some functions will fail if you do not do so.
LPVOID objects frequently do not have prefixes, although it is relatively common to prefix an LPVOID variable with the letter "p", as it is a pointer.
DWORD, WORD, BYTE
[edit | edit source]These data types are defined to be a specific length, regardless of the target platform. There is a certain amount of additional complexity in the header files to achieve this, but the result is code that is very well standardized, and very portable to different hardware platforms and different compilers.
DWORDs (Double WORDs), the most commonly occurring of these data types, are defined always to be unsigned 32-bit quantities. On any machine, be it 16, 32, or 64 bits, a DWORD is always 32 bits long. Because of this strict definition, DWORDS are very common and popular on 32-bit machines, but are less common on 16-bit and 64-bit machines.
WORDs (Single WORDs) are defined strictly as unsigned 16-bit values, regardless of what machine you are programming on. BYTEs are defined strictly as being unsigned 8-bit values. QWORDs (Quad WORDs), although rare, are defined as being unsigned 64-bit quantities. Putting a "P" in front of any of these identifiers indicates that the variable is a pointer. putting two "P"s in front indicates it's a pointer to a pointer. These variables may be unprefixed, or they may use any of the prefixes common with DWORDs. Because of the differences in compilers, the definition of these data types may be different, but typically these definitions are used:
#include <stdint.h>
typedef uint8_t BYTE; typedef uint16_t WORD; typedef uint32_t DWORD; typedef uint64_t QWORD;
Notice that these definitions are not the same in all compilers. It is a known issue that the GNU GCC compiler uses the long and short specifiers differently from the Microsoft C Compiler. For this reason, the windows header files typically will use conditional declarations for these data types, depending on the compiler being used. In this way, code can be more portable.
As usual, we can define pointers to these types as:
#include <stdint.h>
typedef uint8_t * PBYTE; typedef uint16_t * PWORD; typedef uint32_t * PDWORD; typedef uint64_t * PQWORD;
typedef uint8_t ** PPBYTE; typedef uint16_t ** PPWORD; typedef uint32_t ** PPDWORD; typedef uint64_t ** PPQWORD;
DWORD variables are typically prefixed with "dw". Likewise, we have the following prefixes:
Data Type | Prefix |
---|---|
BYTE | "b" |
WORD | "w" |
DWORD | "dw" |
QWORD | "qw" |
LONG, INT, SHORT, CHAR
[edit | edit source]These types are not defined to a specific length. It is left to the host machine to determine exactly how many bits each of these types has.
- Types
typedef long LONG; typedef unsigned long ULONG; typedef int INT; typedef unsigned int UINT; typedef short SHORT; typedef unsigned short USHORT; typedef char CHAR; typedef unsigned char UCHAR;
- LONG notation
- LONG variables are typically prefixed with an "l" (lower-case L).
- UINT notation
- UINT variables are typically prefixed with an "i" or a "ui" to indicate that it is an integer, and that it is unsigned.
- CHAR, UCHAR notation
- These variables are usually prefixed with a "c" or a "uc" respectively.
If the size of the variable doesn't matter, you can use some of these integer types. However, if you want to exactly specify the size of a variable, so that it has a certain number of bits, use the BYTE, WORD, DWORD, or QWORD identifiers, because their lengths are platform-independent and never change.
STR, LPSTR
[edit | edit source]STR data types are string data types, with storage already allocated. This data type is less common than the LPSTR. STR data types are used when the string is supposed to be treated as an immediate array, and not as a simple character pointer. The variable name prefix for a STR data type is "sz" because it's a zero-terminated string (ends with a null character).
Most programmers will not define a variable as a STR, opting instead to define it as a character array, because defining it as an array allows the size of the array to be set explicitly. Also, creating a large string on the stack can cause greatly undesirable stack-overflow problems.
LPSTR stands for "Long Pointer to a STR", and is essentially defined as such:
#define STR * LPSTR;
LPSTR can be used exactly like other string objects, except that LPSTR is explicitly defined as being ASCII, not unicode, and this definition will hold on all platforms. LPSTR variables will usually be prefixed with the letters "lpsz" to denote a "Long Pointer to a String that is Zero-terminated". The "sz" part of the prefix is important, because some strings in the Windows world (especially when talking about the DDK) are not zero-terminated. LPSTR data types, and variables prefixed with the "lpsz" prefix can all be used seamlessly with the standard library <string.h> functions.
TCHAR
[edit | edit source]TCHAR data types, as will be explained in the section on Unicode, are generic character data types. TCHAR can hold either standard 1-byte ASCII characters, or wide 2-byte Unicode characters. Because this data type is defined by a macro and is not set in stone, only character data should be used with this type. TCHAR is defined in a manner similar to the following (although it may be different for different compilers):
#ifdef UNICODE #define TCHAR WORD #else #define TCHAR BYTE #endif
TSTR, LPTSTR
[edit | edit source]Strings of TCHARs are typically referred to as TSTR data types. More commonly, they are defined as LPTSTR types as such:
#define TCHAR * LPTSTR
These strings can be either UNICODE or ASCII, depending on the status of the UNICODE macro. LPTSTR data types are long pointers to generic strings, and may contain either ASCII strings or Unicode strings, depending on the environment being used. LPTSTR data types are also prefixed with the letters "lpsz".
HANDLE
[edit | edit source]HANDLE data types are some of the most important data objects in Win32 programming, and also some of the hardest for new programmers to understand. Inside the kernel, Windows maintains a table of all the different objects that the kernel is responsible for. Windows, buttons, icons, mouse pointers, menus, and so on, all get an entry in the table, and each entry is assigned a unique address known as a HANDLE. If you want to pick a particular entry out of that table, you need to give Windows the HANDLE value, and Windows will return the corresponding table entry.
HANDLEs are defined as void pointers (void*). They are used as unique identifiers to each Windows object in our program such as a button, a window an icon, etc. Specifically their definition follows: typedef PVOID HANDLE; and typedef void *PVOID; In other words HANDLE = void*.
HANDLEs are generally prefixed with an "h". Handles are unsigned integers that Windows uses internally to keep track of objects in memory. Windows moves objects like memory blocks in memory to make room, if the object is moved in memory, the handles table is updated.
Below are a few special handles that are worth discussing:
HWND
[edit | edit source]HWND data types are "Handles to a Window", and are used to keep track of the various objects that appear on the screen. To communicate with a particular window, you need to have a copy of the window's handle. HWND variables are usually prefixed with the letters "hwnd", just so the programmer knows they are important.
Canonically, main windows are defined as:
HWND hwnd;
Child windows are defined as:
HWND hwndChild1, hwndChild2...
and Dialog Box handles are defined as:
HWND hDlg;
Although you are free to name these variables whatever you want in your own program, readability and compatibility suffer when an idiosyncratic naming scheme is chosen - or worse, no scheme at all.
HINSTANCE
[edit | edit source]HINSTANCE variables are handles to a program instance. Each program gets a single instance variable, and this is important so that the kernel can communicate with the program. If you want to create a new window, for instance, you need to pass your program's HINSTANCE variable to the kernel, so that the kernel knows what program instance the new window belongs to. If you want to communicate with another program, it is frequently very useful to have a copy of that program's instance handle. HINSTANCE variables are usually prefixed with an "h", and furthermore, since there is frequently only one HINSTANCE variable in a program, it is canonical to declare that variable as such:
HINSTANCE hInstance;
It is usually a benefit to make this HINSTANCE variable a global value, so that all your functions can access it when needed.
HMENU
[edit | edit source]If your program has a drop-down menu available (as most visual Windows programs do), that menu will have an HMENU handle associated with it. To display the menu, or to alter its contents, you need to have access to this HMENU handle. HMENU handles are frequently prefixed with simply an "h".
WPARAM, LPARAM
[edit | edit source]In the earlier days of Microsoft Windows, parameters were passed to a window in one of two formats: WORD-length (16-bit) parameters, and LONG-length (32-bit) parameters. These parameter types were defined as being WPARAM (16-bit) and LPARAM (32-bit). However, in modern 32-bit and 64-bit systems, both WPARAM and LPARAM are able to hold a pointer, so are 32 resp. 64 bits long. The names however have not changed, for legacy reasons.
WPARAM and LPARAM variables are generic function parameters, and are frequently type-cast to other data types including pointers and DWORDs.