TN3049C.txt How to Avoid Memory Corruption
Category :OTHER
Platform :All
Product :BC All
Description:
Memory Corruption
POINTERS:
A pointer is a memory location that holds a memory address as its contents.
When you declare a pointer (e.g. int *foo) the compiler will allocate the
necessary space to hold the appropriate memory address. If the pointer is
declared globally the value of its address will be 0000 in the case of a
near pointer and 0000:0000 in the case of a far pointer. If the pointer
is declared inside a function definition as an auto variable (default),
then it is created on the stack and will have a default address of
whatever value happened to be at that location on the stack when it is
created. In either case these default memory addresses are invalid. This
is referred to as an uninitialized pointer and should never be
dereferenced. To initialize a pointer you must either dynamically
allocate memory using malloc, farmalloc, calloc, farcalloc, realloc,
farrealloc, allocmem or new (C++ only) or you can associate the pointer
with a variable whose memory is allocated by the compiler at compile
time (e.g. int foo[100]).
Once the pointer has been initialized it is then safe to dereference the
pointer using the * operator.
int *p;                 OK     Uninitialized pointer.
*p = 4;                 WRONG  This puts a value of 4 at whatever address
                               happened to be stored in p when it was
                               created.
int i;                  OK     Statically declared variable. Static
                               meaning the memory is allocated by the
                               compiler at compile time.
p = &i;                 OK     Associating p with a statically declared
                               variable.
p = (int *) malloc(2);  OK     Associating p with dynamically allocated
                               memory. The parameter could also have been
                               sizeof(int).
Modifying the variable p (e.g. p = 0;) changes the memory address that
p holds, in this case to 0. Modifying the dereferenced pointer, denoted
by *p, changes the value stored at the memory address that p holds.
*p = 4;                 OK     p still holds the address that malloc
                               returned above, but that address in memory
                               now holds a value of 4.
p = 4;                  OK*    p now points to offset 4 from DS. It is
                               important to note that this type of
                               assignment should only be made if the
                               address (4 in this case) is valid.
                               Otherwise memory corruption will occur.
                               Also note that if a far memory model (see
                               Memory Models) is used the segment will
                               not be DS; whatever its initial value was,
                               it will remain unchanged.
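The two safe ways of initializing a pointer described above can be sketched in portable C as follows. This is a minimal illustration, not code from the compiler documentation: the function names and the -1 error convention are assumptions, and standard C declares malloc in stdlib.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Store value through a pointer bound to an ordinary variable. */
int store_via_variable(int value)
{
    int i = 0;
    int *p = &i;          /* p now holds a valid address */
    assert(p != NULL);
    *p = value;           /* safe: writes into i */
    return i;
}

/* Store value through a pointer bound to heap memory.
   Returns -1 if the allocation fails (hypothetical convention). */
int store_via_heap(int value)
{
    int result;
    int *p = (int *) malloc(sizeof(int));   /* sizeof, not a literal 2 */
    if (p == NULL)
        return -1;
    *p = value;           /* safe: p holds the address malloc returned */
    result = *p;
    free(p);
    p = NULL;             /* p no longer refers to freed memory */
    return result;
}
```

Using sizeof(int) rather than a literal size keeps the allocation correct if the size of int ever changes.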
There are 3 types of pointers. They are near, far, and huge.
Near pointers have a size of 2 bytes. They only store the offset of
the address the pointer is referencing. An address consisting of
only an offset has a range of 0 - 64K bytes starting from the
beginning of DGROUP. A near pointer can be incremented and
decremented using arithmetic operators (+, -, ++, and --) through the
entire address range. Any attempt to increment a near pointer that
has a value of 0xffff (64K - 1) will result in a value of 0. This is
referred to as wrapping the pointer. A corresponding result can be
expected when attempting to decrement a pointer that contains an
address of 0, except the result will be 0xffff instead of 0. In addition
to being incremented and decremented, near pointers can be compared to
one another using relational operators ( <, >, ==, >= and <= ).
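The wrapping behavior can be modeled with a 16-bit unsigned integer standing in for the offset. This is only a model written in standard C99; near_off and near_add are hypothetical names, and a real near pointer is a compiler type, not an integer.

```c
#include <assert.h>
#include <stdint.h>

/* A near pointer modeled as a bare 16-bit offset. */
typedef uint16_t near_off;

/* Add delta to the offset. Conversion back to a 16-bit unsigned type
   reduces the result modulo 65536, which is the wrap described above. */
near_off near_add(near_off off, long delta)
{
    assert(sizeof(near_off) == 2);    /* near pointers are 2 bytes */
    return (near_off)((long)off + delta);
}
```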
Far pointers have a size of 4 bytes. They store both the segment and
the offset of the address the pointer is referencing. A far pointer
has an address range of 0 - 1M bytes. It is important to understand
that an addressing range of 1M does not remove the 640K barrier from
the program. It means that the pointer can address the upper memory
area (640K - 1M) which typically contains video memory, ROM and anything
else that may be loaded high. A far pointer can be incremented
and decremented using arithmetic operators. When a far pointer is
incremented or decremented ONLY the offset of the pointer is actually
incremented or decremented. The segment is never incremented by the
arithmetic operators. This means that although a far pointer can
address up to 1Mb of memory, it can only be incremented through 64Kb
before the offset starts at zero again without changing the value
of the segment. This is referred to as "wrapping" the pointer (e.g.
0F3E:FFFF + 1 = 0F3E:0000). When a far pointer is decremented from
zero it will wrap the other way and become 0xffff. Far pointers are
not unique. It is possible to have two far memory addresses that
have different segments values and different offset values that
address the same memory location e.g. 0777:2222 has an absolute
address of 07770 + 2222 = 09992 and 0999:0002 has an absolute address
of 09990 + 0002 = 09992. When the ordering operators (<, >, <= and >=)
are used on far pointers only the offsets are compared, while == and !=
compare the segment and the offset together as a 32-bit value without
normalizing them. For example: if we let a = 0777:2222 and let
b = 0999:0002 then a == b would return false, even though both address
the same location, because the segment:offset pairs differ. In other
words relational operators will only work reliably on far pointers if
the segment values of the pointers being compared are the same.
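The segment:offset arithmetic above can be checked with a small model. The struct and function names below are assumptions made for illustration; the sketch reproduces the address calculation (segment * 16 + offset) and the offset-only wrap, not the compiler's actual far pointer type.

```c
#include <assert.h>
#include <stdint.h>

/* A far pointer modeled as a segment:offset pair. */
struct far_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Absolute address: segment * 16 + offset. */
uint32_t far_linear(struct far_ptr p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* Far pointer arithmetic changes only the offset, which wraps at 64K. */
struct far_ptr far_add(struct far_ptr p, long delta)
{
    p.off = (uint16_t)((long)p.off + delta);
    return p;
}

/* 1 if both pairs are identical, 0 otherwise. */
int far_same_pair(struct far_ptr a, struct far_ptr b)
{
    return a.seg == b.seg && a.off == b.off;
}

/* The worked examples from the text. */
void check_far_examples(void)
{
    struct far_ptr a = {0x0777, 0x2222};
    struct far_ptr b = {0x0999, 0x0002};
    assert(far_linear(a) == far_linear(b));   /* same location */
    assert(!far_same_pair(a, b));             /* different pairs */
}
```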
Huge pointers have a size of 4 bytes. They store both the segment and
the offset of the address the pointer is referencing. A huge pointer
has an address range of 0 - 1M bytes. A huge pointer can be
incremented and decremented using arithmetic operators. The only
difference between a far pointer and a huge pointer is that a huge
pointer is normalized by the compiler. A normalized pointer is one
that has as much of the address as possible in the segment,
meaning that the offset is never larger than 15. A huge pointer is
normalized only when pointer arithmetic is performed on it. It is
not normalized when an assignment is made. You can cause it to be
normalized without changing the value by incrementing and then
decrementing it. The offset must be less than 16 because the segment
can represent any multiple of 16 (e.g. absolute address 0x17 in
normalized form would be 0001:0007. While a far pointer could address
the absolute address 0x17 with 0000:0017, this is not a valid huge
(normalized) pointer because the offset is greater than 000F.). Since
huge pointers are normalized they will not wrap like far pointers
when they are incremented or decremented. Huge pointers can be reliably
used with relational operators because they are normalized. This
works because normalization of huge pointers ensures that every huge
pointer is unique. It is important to understand that huge pointers
are never the default pointer, even in the huge memory model.
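Normalization can be sketched as converting the pair to an absolute address and splitting it again so that the offset keeps only the low 4 bits. The names below are hypothetical and the model assumes the address stays below 1M; it illustrates the rule, not the compiler's generated code.

```c
#include <assert.h>
#include <stdint.h>

struct seg_off {
    uint16_t seg;
    uint16_t off;
};

/* Absolute address: segment * 16 + offset. */
uint32_t to_linear(struct seg_off p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* Split an absolute address so the offset is in the range 0..15.
   Assumes the address is below 1M so the segment fits in 16 bits. */
struct seg_off from_linear(uint32_t linear)
{
    struct seg_off n;
    assert(linear <= 0xFFFFFUL);
    n.seg = (uint16_t)(linear >> 4);
    n.off = (uint16_t)(linear & 0xFu);
    return n;
}

/* Same location, unique representation. */
struct seg_off normalize(struct seg_off p)
{
    return from_linear(to_linear(p));
}

/* Huge pointer arithmetic: the carry moves into the segment
   instead of wrapping the offset. */
struct seg_off huge_add(struct seg_off p, long delta)
{
    return from_linear((uint32_t)((long)to_linear(p) + delta));
}
```

With this model 0000:0017 normalizes to 0001:0007, and both 0777:2222 and 0999:0002 normalize to 0999:0002, which is why normalized pointers compare reliably.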
MEMORY MODELS:
The important differences between the memory models are the size of the
data and code pointers, the number of data and code segments and the
number and type of heaps available. For our purposes we will refer to the
tiny, small and medium memory models as near memory models and the
compact, large and huge as far memory models. We use this notation
because the near memory models have both a near and far heap while the
far memory models have only a far heap. The near memory models do not
have a separate stack segment like the far memory models do. This is
because the data segment, the near heap and the stack are all part of
DGROUP, meaning that the total size of these things must be less than or
equal to 64K bytes. The tiny model is an exception in that DGROUP also
includes the code segment and the PSP (256 bytes).
                        TINY  SML   MED   CMP   LRG   HUGE
Near Heap             : Yes   Yes   Yes   No    No    No
Far Heap              : Yes   Yes   Yes   Yes   Yes   Yes
Code pointers         : near  near  far   near  far   far
Data pointers         : near  near  near  far   far   far
Separate stack segment: No    No    No    Yes   Yes   Yes
Multiple code segments: No    No    Yes   No    Yes   Yes
Multiple data segments: No    No    No    No    No    Yes
The near and far heap fields above indicate whether or not
those heaps exist in the specified memory model. If a memory model is
used that doesn't contain a near heap and one or more of the near heap
functions (e.g. malloc, free, heapwalk) are used they are mapped by
the compiler into a special far version of the call that will allocate
the memory off the correct heap (far). All of the parameters of these
special near memory allocation functions are the same, but the pointer
returned by this special version of the near memory allocation function
will be a far pointer rather than a near pointer. The code and data
pointer fields show the default pointer size for each memory model. It
is important to note that none of the memory models use huge pointers by
default. These defaults can all be overridden using the modifiers near,
far or huge (e.g. int far *foo; or void near goo();). The separate
stack segment field indicates whether the stack is part of DGROUP or in
its own segment. Multiple code and data segments indicate if the
specified memory model can have more than one of each type. If more
than one of either type is allowed the segments are divided based on
their corresponding source files e.g. a 3 source file program in the
huge memory model would have 3 code segments and 3 data segments where
the code and global data declared in each source file would go into the
corresponding segment.
HEAP:
The global variable _heaplen only applies to the near heap. It is used to
specify the size of the heap. If no value is specified for _heaplen then
DGROUP will default to 64K bytes. If a value is specified then the size
of DGROUP will be computed by summing the size of everything in DGROUP
and then adding the size of _heaplen and _stklen. The global variable
_stklen is used to specify how much memory to set aside for the stack.
The near heap, when present, begins just after the global data and grows
up toward the stack (higher memory) and the stack is situated at the end
of DGROUP and grows down toward the heap (lower memory). The empty space
separating the near heap and the stack can be used by either the heap or
the stack regardless of whether it was reserved using _heaplen or
_stklen variables. It is possible to dynamically allocate memory that
is actually being used by the stack without generating a warning or an
error. This is a common cause of memory corruption and can be avoided by
calling coreleft before each allocation to confirm that enough free
memory is available.
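The idea of checking the remaining gap before allocating can be shown with a toy model of DGROUP. Everything here is an assumption made for illustration (the struct, the function names, the safety margin); it is not the run time library's allocator. In a real near-model program the same test would compare the value returned by coreleft against the requested size plus a margin before calling malloc.

```c
#include <assert.h>

/* Toy model of DGROUP: the near heap grows up from brk and the
   stack grows down from sp. All names are hypothetical. */
struct dgroup {
    unsigned brk;   /* first free offset above the near heap */
    unsigned sp;    /* current stack pointer offset */
};

/* Free bytes between the heap and the stack. */
unsigned gap_left(const struct dgroup *d)
{
    assert(d->brk <= d->sp);
    return d->sp - d->brk;
}

/* Allocate only if the request plus a safety margin fits in the gap.
   Returns the offset of the new block, or 0 if the request is refused. */
unsigned checked_near_alloc(struct dgroup *d, unsigned size, unsigned margin)
{
    unsigned start;
    unsigned left = gap_left(d);
    if (size > left || margin > left - size)
        return 0;
    start = d->brk;
    d->brk += size;
    return start;
}
```

The margin exists because the stack keeps moving after the check: a request that fits exactly leaves no room for the next function call.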
The far heap exists just above the stack in memory. The far heap's
initial size is zero. When the program processes a request for memory
from the far heap it requests DOS to reallocate the current program
size (minimum of 16 byte chunks) to include the requested size. DOS
maintains its own heap very similar to the heap used by a program.
Each block of memory that DOS allocates has a header called a memory
control block (MCB) which is used to manage the heap used by DOS.
When there is 1K bytes of free memory at the top of the far heap the
program again calls DOS to reallocate the program space giving the memory
back to DOS. All pointers used to reference the far heap should be either
far or huge since the far heap is in a separate segment. It is important
to note that if another process requests memory from DOS and the memory
is available then that other program will get the next free block DOS
owns which is the one immediately following the far heap in memory. This
means that the next time your program attempts to reallocate the current
program size the DOS allocate function will fail because your program has
now been blocked in by a DOS memory control block (MCB) other than your
program MCB. This is also the case if allocmem is called from within your
program.
STACK:
Stack checking is an option that can be turned on or off that will check
to see if the stack has overflowed. When this option is turned on the
compiler will generate code that will check, upon entering a new
function, to ensure there is adequate room on the stack to make the call.
If there isn't room the appropriate error message will be generated and
the program will exit. The checking is done in the near memory models by
comparing __brklvl, which marks the end of the near heap, and SP, which
marks the actual top of the stack. In the far memory models this checking
is done by comparing SP, which marks the actual top of the stack, and
_stklen, which is the size of the stack segment. Stack checking is not
foolproof. The functions of the Run Time Library (RTL) found in the
Library Reference Manual were not compiled with stack checking turned on.
This means there is a possibility of overflowing the stack when making a
call to an RTL function even though stack checking is turned on.
Functions that use a stack other than the program stack should not use
the stack checking option (e.g. asynchronous interrupt service routines).
When floating point is turned on in any memory model the floating point
emulator information is located in the first 416 bytes of the physical
stack (SS:0000 - SS:0415). This information must be transferred if you
are trying to switch stacks. Receiving false floating point error
messages is a good indication that something may be corrupting this
portion of the stack.
The size of the stack is determined by the startup code at run time. If
a link map is generated you will see a size specified for the stack that
may be different than the size you specified. This value you see in the
link map is a value that DOS is told, but the actual size requirement is
met at runtime.
In the near memory models the stack exists within the 64K byte limit
defined by DGROUP which also contains global data and the near heap. In
the tiny memory model it also contains the code segment. In the
near memory models, when the stack overflows the first thing corrupted is
the near heap followed by the global data.
In the far memory models the stack exists immediately following the
data segment and is just before the far heap. In the far memory models
the first thing that is corrupted when the stack overflows is the
floating point emulator followed by the far heap.
In order to avoid stack overflow it is important to understand what makes
up the stack. In the far memory model only, the emulator occupies the 416
bytes from SS:0000 to SS:0415 located at the top of the logical stack. As
each function is entered space is allocated for all its parameters. They
are created on the stack from right to left followed by the return address.
Space is then allocated for all auto (default for local variables)
variables in the order of declaration. All this space remains allocated
until the function returns to the calling function. In other words if you
then called another function, then the space allocated for the first
function would remain allocated until the first function regained control
and then returned control to its calling function. In the case of C++
programs, all class copies used by the compiler are implicitly created on
the stack unless a copy constructor is provided that uses the alternative
of dynamic memory allocation.
COMMON PROBLEMS LEADING TO MEMORY CORRUPTION:
Using an uninitialized pointer is probably the most common cause of
memory corruption. It is possible to use an uninitialized pointer and
have a program work. It just depends on what the default address is when
the pointer is created and if anything else is attempting to use that
address.
Failure to include ALLOC.H is a common mistake. When ALLOC.H is left out
the type checking is not performed. Since there is no prototype the
memory allocation functions are all thought by the compiler to be
returning integer types. A program will usually continue to work
without ALLOC.H in the near memory models, since an integer is the same
size as the near pointers used in these memory models. However in the
far memory models the segment half of the pointer is lost when the memory
allocation function attempts to return a far pointer. These programs may
or may not work. You should always include ALLOC.H.
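The loss of the segment can be modeled with plain integers. The sketch below assumes a 4-byte far pointer value with the segment in the high 16 bits and the offset in the low 16 bits, and shows what remains when only 16 bits are kept, as happens when the compiler believes the function returns an int. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack segment:offset into a 4-byte far pointer value. */
uint32_t far_pack(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 16) | off;
}

/* What survives when the result is treated as a 16-bit int. */
uint16_t seen_as_int(uint32_t far_value)
{
    return (uint16_t)(far_value & 0xFFFFu);
}

/* The segment is gone: only the offset remains. */
void check_truncation(void)
{
    assert(seen_as_int(far_pack(0x2A10, 0x0004)) == 0x0004);
}
```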
Failure to check the return value from a memory allocation call can
result in a NULL pointer. Technically a NULL pointer is defined as an
invalid pointer. In more practical use, however, a NULL pointer is
interpreted as meaning a pointer that holds 0 as its address (e.g.
p = 0;). NULL for our purposes will mean a pointer that holds an address
of 0. A memory allocation function can fail and return NULL if there is
no memory left, the program was unable to resize its MCB or if there is
some sort of heap corruption whether it be in the DOS heap or your program
heap. Your code should always check to ensure the pointer returned by a
memory allocation function is not NULL.
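A minimal sketch of the check, written in standard C (which declares malloc in stdlib.h). The helper name dup_string is an assumption used only for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate a string on the heap. Returns NULL if s is NULL or the
   allocation fails; the caller must test the result and free() it. */
char *dup_string(const char *s)
{
    size_t len;
    char *copy;
    if (s == NULL)
        return NULL;
    len = strlen(s) + 1;            /* room for the terminating NULL */
    copy = (char *) malloc(len);
    if (copy == NULL)               /* never use an unchecked result */
        return NULL;
    memcpy(copy, s, len);
    return copy;
}
```

The caller repeats the same test on the value it receives, so a failed allocation is reported once and never dereferenced.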
Stack overflow is a common problem. When making a function call the
stack can overflow and corrupt some other variables and return without
the user being the wiser. A later attempt to use the corrupted variable
can result in incorrect program results or hanging the system.
Indexing out of bounds occurs with both statically declared arrays and
dynamically declared arrays. An array declared as char foo[100] has a
valid index range of 0 - 99. Making an assignment to foo[100] is memory
corruption since foo[100] does not belong to the array foo. A non-huge
array will wrap if a value larger than the largest unsigned value
(0xffff) is used to index it.
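One way to guard against out-of-bounds writes is to route them through a helper that knows the array length. This is a sketch; set_checked is a hypothetical name, not a library function.

```c
#include <assert.h>
#include <stddef.h>

/* Store v at buf[idx] only if idx is inside the array.
   Returns 1 on success, 0 if the index is out of bounds. */
int set_checked(char *buf, size_t len, size_t idx, char v)
{
    assert(buf != NULL);
    if (idx >= len)
        return 0;
    buf[idx] = v;
    return 1;
}
```

For an array declared as char foo[100], passing sizeof foo as len makes index 99 succeed and index 100 fail without touching memory.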
The largest dynamically allocated array that can be addressed using a
far pointer is 65531 bytes because the pointer returned by farmalloc will
always have an offset of 0004. This offset is guaranteed because the far
heap is paragraph aligned. The missing 4 bytes are used by the program
heap manager as a block header to manage the heap. Of course a far pointer
has the potential to rival the power available with the huge pointers if
the user does the necessary normalizing of the pointer.
In a far memory model, a call to a near memory function will be mapped
into the corresponding special function call described above in the heap
section. In the near memory models all pointers declared are near by
default and should not be used with any of the far memory allocation
functions (e.g. farmalloc and farcalloc). The compiler will generate a
warning about a suspicious pointer conversion but it won't prevent it
from happening. The first time such a pointer is used it will address
an offset of 0004 in the DGROUP. Also be aware of the memory model that
the program is compiled in. If it is one of the far memory models then
all calls to the near memory allocation functions will be mapped into the
corresponding far functions behind the scenes by the compiler. In some
cases this also changes the parameter and return types. It is strongly
suggested that if you are using a far memory model you use only the far
memory allocation functions to avoid confusion at a later date and make
debugging easier.
The duration of a variable is very important. A global pointer should
never reference a local auto variable, because when that local variable
goes out of scope its memory is released to be reused by the stack for
something else. If the global pointer is later used it may be
erroneously referencing something else.
Receiving the message "Null pointer assignment" after your program has
completed its run is usually caused by the use of an uninitialized
pointer. You can track the cause of this message down by placing the
two watches ( (char *)4,s and *(char *) 0,4m ) on your program.
The first 47 bytes of the data segment are not valid addresses for any
variables in your program. When your program exits these 47 bytes are
checked to see if they have been modified. If they have the warning
message is printed on your screen. These two watches will allow you to
monitor the beginning of the data segment to identify the offending line
of code.
The memory allocation functions used to free dynamically allocated memory
(free, farfree, delete) should only be used on pointers that hold an
address returned by a dynamic memory allocation function (e.g. farmalloc,
malloc, new). Any attempt to free an array declared as int foo[100]
has undefined results and may corrupt memory. The memory
freeing functions should never be called twice with the same address as a
parameter unless that same address has been reallocated since the last
memory free function call. The results are undefined and can produce
memory corruption.
One very common problem is character arrays that are not NULL terminated.
Every character array must be terminated by a NULL or 0, e.g.
char foo[100];
foo[0] = 0;             This is a NULL terminated "empty" string.
foo[10] = '\0';         This is a NULL terminated string with 10
                        undefined characters preceding the NULL.
                        Remember indexing starts at 0 not 1.
foo[0] = '0';           Incorrect. This is not a NULL terminated
                        string.
All of the functions that are listed in string.h found in the include
directory rely on this NULL termination. The NULL is what they use to
tell them that they have reached the end of the string and should stop
processing the string. If the NULL is missing then these functions will
continue on processing, possibly corrupting memory as they go. For most
of the string family of functions listed in string.h you will see a
corresponding n family of functions, e.g. strcpy and strncpy, stricmp and
strnicmp, strcat and strncat. The corresponding n functions perform
the same tasks as their counterparts with the exception
that they take an extra parameter n which specifies how many characters
in the string to operate on. These n functions will be finished when
they have processed the string up to the NULL character or until they
have processed n characters in the string. It is strongly recommended
that the n family of functions be used rather than their counterparts
because if they do receive a string that is not NULL terminated they will
stop after they have processed n characters rather than continuing on and
corrupting memory. You will also see an _f family of functions in
string.h that have corresponding standard and n family of functions, e.g.
strcat and _fstrcat; strchr and _fstrchr; strncmp and _fstrncmp.
The _f family of functions is not limited to the string family of
functions and will be addressed later.
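One caveat when using the n family: strncpy does not write a terminating NULL when the source holds n or more characters, so the destination must be terminated explicitly. A minimal sketch follows; copy_bounded is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>

/* Copy src into dst, never writing more than dstsize bytes and always
   leaving dst NULL terminated. */
void copy_bounded(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL && dstsize > 0);
    strncpy(dst, src, dstsize - 1);   /* at most dstsize - 1 characters */
    dst[dstsize - 1] = '\0';          /* terminator strncpy may omit */
}
```

Copying "hello world" into a 6 byte buffer this way leaves "hello" followed by the NULL, instead of six unterminated characters.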
In general most functions that work with pointers have a corresponding _f
family functions. The _f family of functions were designed for use
exclusively in the small and medium memory models because they are the
only memory models that have both a near and a far heap. The _f family
of functions are designed for use with far pointers in the small and
medium memory models. They are not necessary in the far memory models
because those RTL functions have already been compiled using the default
far pointers as parameter and return types. Near pointers present no
special problems in the far memory models because they can be implicitly
converted by the compiler.
The C++ language presents some special problems due to constructors and
destructors which are called by the compiler automatically. The first
rule is that for every constructor call there must be a corresponding
destructor call. If no dynamic memory allocation and deallocation is
taking place in these constructors and destructors one may not even
notice if the destructor gets called twice, but if there is, the heap
will most likely become corrupted. One can check for violations of this
rule by placing simple print statements or breakpoints (if using a
debugger) in each of the constructors and destructors and checking that
the calls are paired.
If dynamic memory allocation is used in a constructor in C++ then a copy
constructor is almost always required. A copy constructor is a
constructor that will dynamically allocate the needed memory and then
copy what is stored in the corresponding memory in the class instance
being copied. The compiler will often need to generate temporaries
of the class instances e.g. when an instance is being passed as a
parameter. If a copy constructor is not defined it will use the default
memberwise copy resulting in two different pointers in two different
class instances pointing to the same dynamically allocated memory. This
means that when the first instance is destructed the memory addressed by
its pointer is freed leaving the remaining instance of the class with an
invalid pointer that if used will cause memory corruption. If a copy
constructor is defined the compiler will use it instead of the default
copy constructor.
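The difference between a memberwise copy and the deep copy a copy constructor performs can be shown with a plain C analog. This is not C++ and not the compiler's mechanism; struct buffer and its functions are hypothetical names, with buffer_copy playing the role of the copy constructor and buffer_free the role of the destructor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type that owns a heap buffer. */
struct buffer {
    char  *data;
    size_t size;
};

/* Allocate an owned, zero-filled buffer. Returns 0 on failure. */
int buffer_init(struct buffer *b, size_t size)
{
    assert(b != NULL && size > 0);
    b->data = (char *) calloc(size, 1);
    b->size = (b->data != NULL) ? size : 0;
    return b->data != NULL;
}

/* Deep copy: dst gets its own allocation instead of sharing src->data,
   which is what a memberwise copy (*dst = *src) would do. */
int buffer_copy(struct buffer *dst, const struct buffer *src)
{
    assert(dst != NULL && src != NULL && src->data != NULL);
    if (!buffer_init(dst, src->size))
        return 0;
    memcpy(dst->data, src->data, src->size);
    return 1;
}

/* Release the buffer and leave the struct in a safe, empty state. */
void buffer_free(struct buffer *b)
{
    free(b->data);
    b->data = NULL;
    b->size = 0;
}
```

Because each instance owns its own allocation, freeing one copy leaves the other valid, which is exactly what the memberwise copy fails to guarantee.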
DEBUGGING TIPS AND TECHNIQUES:
Once you are sure that the problem is not one of the common mistakes
listed you roll up your sleeves and prepare for a real debugging session.
The real trick to debugging is to understand how memory works and how a
program is structured in that memory. If you understand these things
then you can use things like different size pointers and different
segment organization to debug your program. Switching memory models
makes all these things happen. The hard part is knowing which one to
switch to and what to look for when you get there. In some cases you may
not be able to switch memory models due to program constraints and may be
forced to use other techniques. When none of the simpler tricks such as
changing memory models works, then the divide and conquer technique
should be used. The divide and conquer method is the most reliable
method, but it is often the most difficult to implement.
A good technique to employ in your programming is to initialize all
pointers to NULL when they are declared and again after they have
been freed. This way all invalid pointers will contain the same address
and can be tested prior to use and prior to being freed.
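A small macro can make the reset-after-free habit automatic. RELEASE is a hypothetical name; the sketch relies on the standard C rule that free(NULL) does nothing, so releasing the same variable twice is harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Free a pointer variable and reset it to NULL in one step. */
#define RELEASE(p) do { free(p); (p) = NULL; } while (0)

/* Returns 1 if the pointer is NULL after being released twice,
   0 if the initial allocation failed. */
int release_twice_is_safe(void)
{
    int *p = NULL;                    /* initialized when declared */
    p = (int *) malloc(sizeof *p);
    if (p == NULL)
        return 0;
    *p = 4;
    RELEASE(p);
    assert(p == NULL);                /* invalid pointers are all NULL */
    RELEASE(p);                       /* safe: free(NULL) is a no-op */
    return p == NULL;
}
```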
All compiler warnings should be turned on. It is important to note that
the compiler isn't shipped with all the warnings turned on. Inside the
IDE you have to turn each warning on explicitly and on the command line
you should give the -w+ option. You can find more information about how
to turn these on in the User's Guide.
In a near memory model the stack and the near heap grow towards each
other in memory. It is possible for either one to overstep its bounds
and corrupt the other. This condition is easy to test for by moving to a
far memory model and then increasing the stack size to 64K. If the program
then behaves differently this is a strong indication that one or the
other of these conditions existed. If this is the case, there are two
possible solutions. The first is to use the far memory allocation
functions and the second is to stay in the far memory model and adjust
the stack size as necessary.
In the far memory models the stack exists directly below the far heap in
memory. As the stack grows the stack pointer SP approaches zero. When
the stack overflows SP jumps from 0 to 0xffff. If the stack has been
declared to be 64K then the beginning of the logical stack will be
corrupted; otherwise the far heap at a point exactly 64K higher in memory
than SS will be corrupted. Note if you are using floating point the emulator
located on the top of the logical stack will become corrupted before SP
even wraps. This could result in a program running OK, but giving
incorrect results from computations or aborting the program. The stack
size can be increased to 64K by the line "extern unsigned _stklen
= 0xffffu;" at file scope (outside any function definition) inside any
source file being linked into your program.
A common mistake in writing an overloaded operator is to allocate
a new instance of a class, calculate the new value of this
instance, and return the pointer to this new instance. This will
work, but the compiler has no way of knowing that it should free
up the new data after it has finished with it. A program may work
"perfectly" on small amounts of data, but crash after it has
exhausted most of available memory. This is called a "memory
leak". Overloading new and delete to keep a log of used/freed
memory will assist in tracking these leaks.
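The same logging idea can be sketched in C by wrapping malloc and free with a counter. The trk_ names are assumptions for this example; a fuller version might also record the file and line of each allocation.

```c
#include <assert.h>
#include <stdlib.h>

static long live_blocks = 0;   /* allocations not yet freed */

/* malloc wrapper that counts successful allocations. */
void *trk_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        ++live_blocks;
    return p;
}

/* free wrapper that counts releases. */
void trk_free(void *p)
{
    if (p != NULL) {
        assert(live_blocks > 0);   /* more frees than allocations */
        --live_blocks;
        free(p);
    }
}

/* Number of blocks still allocated; nonzero at exit suggests a leak. */
long trk_live(void)
{
    return live_blocks;
}
```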
When debugging a live program, it is easy to display and verify an
integer, string or other simple data structure. Verification of
more complex structures may involve several layers of pointer
indirection. It may be tedious to examine all elements in a
linked list for example. If you write functions to check your
structures/classes, these can be called within your program at
debug time. For instances of classes, these should be virtual
member functions, so that the correct routine is called for
instances of derived classes accessed through base class pointers.
Call these functions at the start and end of all other member
functions to flag corrupted data and assist in isolating problems.
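As an illustration of such a checking function, the sketch below validates a singly linked list in C. The node layout and the max_nodes sanity limit are assumptions; the limit is what lets the check terminate when a corrupted next pointer has created a cycle.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int          value;
    struct node *next;
};

/* Walk the list and return its length, or -1 if more than max_nodes
   links would be followed, which suggests a cycle or a corrupted
   next pointer. */
long list_check(const struct node *head, long max_nodes)
{
    long count = 0;
    assert(max_nodes >= 0);
    while (head != NULL) {
        if (count >= max_nodes)
            return -1;
        ++count;
        head = head->next;
    }
    return count;
}
```

Calling a check like this at the start and end of each routine that modifies the list narrows down where the structure first becomes invalid.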
The divide and conquer method is implemented by manipulating the
different components composing the program. These components will most
likely be single functions or groups of functions that lend themselves to
being individually tested by a driver program.
A driver program is a small program used to test a particular module or
group of modules. It should set up the environment for and pass the
parameters required by the function being tested. Sometimes it may be
more desirable to test an entire program without a particular function.
This can be done by using a stub. A stub is a testing function designed
to take the same parameters and return the same type as the function it is
replacing. Once the offending function has been located the offending
line of code must be located by commenting out and changing various lines
within the function.
When you have identified what you suspect to be the offending line of
code you need to confirm this. This is done by writing a separate test
program to test the line of code all by itself. If it still doesn't work
correctly then you have found the problem. If it works in the example
program then you have most likely found a side effect caused by
something else that must be identified. For instance something else may
modify the value of a pointer that, when used by the suspect code, causes
the machine to hang. Enough information should be obtained from the code
producing the side effect to give you a new place to look. In our
example we would watch the address of the pointer that got corrupted by
using Turbo Debugger and setting a hardware breakpoint on the address of
the pointer that would stop program execution whenever the pointer value is
modified. It is rare that an identified problem needs more than 20 lines
of code to reproduce.
Once you feel that you have isolated it you should ask yourself whether
every piece of code in your example is necessary to reproduce the
problem. For instance if the problem requires a structure to manifest
itself then what the structure contains is probably not important. The
divide portion of the algorithm is usually the most difficult and the
most time consuming.
Now that you have located the problem you can usually determine what the
cause was. If you know what the cause of the problem was you should be
able to summarize the problem in a few sentences. If you cannot do this
you probably don't have a complete grasp of the problem. Borland
technical support will be able to provide you with the best support
when you have the problem isolated and have reproduced it in a small
example, whether you are just letting us know the problem is there or
you're not sure what is causing the problem.
A facility familiar to C programmers is the assert() macro. This
macro is also well suited for C++ programming. assert() takes an
expression as a parameter and, if the expression is false, usually
halts the program and prints a message giving the line number and
file where the problem occurred. At any point where an important
assumption is made about the correctness of calculated data,
insert an assert() call, which will display a message if the
assumption is invalid. For example, the default for a switch
statement is an ideal place to insert an assert() call. Be sure to
include ASSERT.H. Defining the symbol "NDEBUG" will cause the compiler
to ignore (i.e. not generate code for) all assert() calls. This way the
source code will not have to be modified to remove the excess debugging
code.
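A short sketch of assert() in a switch default. The enum and function are hypothetical names; the returned sizes follow the data pointer row of the memory model table in this document.

```c
#include <assert.h>

enum memory_model {
    MODEL_TINY, MODEL_SMALL, MODEL_MEDIUM,
    MODEL_COMPACT, MODEL_LARGE, MODEL_HUGE
};

/* Default data pointer size in bytes for each memory model. */
int data_pointer_size(enum memory_model m)
{
    switch (m) {
    case MODEL_TINY:
    case MODEL_SMALL:
    case MODEL_MEDIUM:
        return 2;               /* near data pointers */
    case MODEL_COMPACT:
    case MODEL_LARGE:
    case MODEL_HUGE:
        return 4;               /* far data pointers */
    default:
        assert(0 && "unknown memory model");
        return 0;               /* reached only when NDEBUG is defined */
    }
}
```

If a new model value is added later and this switch is not updated, the assert reports the file and line during testing instead of letting a wrong size propagate.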