Thursday, January 14, 2010

C -internals--Malloc


The following information can be useful if you want to write your wrappers for malloc and free.
With wrappers we have the problem about not knowing how many bytes we are going to free. This is because we only have pointer to the memory location which we are going to free.



But first, A small information about memory chunks
1. Memory is allocated from a chunk of fixed size
2. chunk sizes are spaced at 16 byte ( on 64 bit machine)
chunk sizes: 16 ,32,48 ,64, 80,96.....

3. Chunk contains, 2 chunk sizes-4 byte each + user requested size
(Please see the GLIBC explanation and the diagram below to get better understanding)
so for example, you request 40 bytes, through a malloc(40) call
8+40 =48 bytes are required.
So we will get a chunk of 48 bytes in size

Suppose if we had asked for 41 bytes
8+41 =49
We will get a chunk of size 64 bytes


This might explain as to why you may not get memory corruption issues some times, even if you have written few bytes beyond the memory allocation you have asked for.

Wrappers
Now Back to our wrappers for malloc and free.
Following program can be used to track, how many bytes we are actually using, by adding counters to hymalloc(), hyrealloc() and hyfree()

The point to be noted is, how can we calculate the chunk size from memory pointer we have with us.

#include
#include

/*
* Some System dependent memory calculation techniques
*/

#ifndef SIZE_SZ_C
#define SIZE_SZ_C sizeof(size_t)
#endif
#ifndef INTERNAL_SIZE_T_C
#define INTERNAL_SIZE_T_C size_t
#endif


#define PREV_INUSE_C 0x1
#define NON_MAIN_ARENA_C 0x4
#define IS_MMAPPED_C 0x2

#define SIZE_BITS_C (PREV_INUSE_C|IS_MMAPPED_C|NON_MAIN_ARENA_C)

/* Get size, ignoring use bits */
#define chunksize_c(p) ((p)->size & ~(SIZE_BITS_C))

struct malloc_chunk_c {

INTERNAL_SIZE_T_C prev_size; /* Size of previous chunk (if free). */
INTERNAL_SIZE_T_C size; /* Size in bytes, including overhead. */

struct malloc_chunk_c* fd; /* double links -- used only if free. */
struct malloc_chunk_c* bk;

/* Only used for large blocks: pointer to next larger size. */
struct malloc_chunk_c* fd_nextsize; /* double links -- used only if free. */
struct malloc_chunk_c* bk_nextsize;
};
typedef struct malloc_chunk_c* mchunkptrm;

int hymemsize(void *ptr)
{

char * temp ;
mchunkptrm p; /* chunk corresponding to mem */
if(ptr == NULL)
return 0;

p = ((mchunkptrm)((char*)(ptr) - 2*SIZE_SZ_C));

printf("Chunk :0x%x size:%d \n",ptr, chunksize_c((p)));

return chunksize_c((p));
}


void *hymalloc(size_t size, int lno)
{
void * ptr;
int chunk;

ptr = malloc(size);
chunk =hymemsize(ptr);

printf("0x%x malloc size:%d chunksize %d\n",ptr, size,chunk);
return ptr;
}




void * hyrealloc( void * ptr ,size_t size,int lno)
{
int chunk, rechunk;

chunk = hymemsize(ptr);

ptr =(void *) realloc(ptr,size);
rechunk = hymemsize(ptr);

printf("0x%x realloc size:%d chunk=%d, rechunk=%d\n",ptr, size,chunk,rechunk);
return ptr;
}
void hyfree(void *ptr)
{
int chunk;

chunk = hymemsize(ptr);
printf("0x%x Free size:%d \n",ptr, chunk );

free(ptr);

}
/*
* End of Memory calculation
*
*/

int main(int argc, char *argv[])
{

char * memtest=NULL;
int msize=0;
if(argc !=2)
{
printf("Usage: memorytest n\n Where n is the number of bytes to allocate \n");
return -1;
}
else
{
msize = atoi(argv[1]);
}

memtest = (char *) hymalloc(msize,__LINE__);

hyfree(memtest);

return 0;
}


Output:
./memorytest 50
Chunk :0xc631010 size:64
0xc631010 malloc size:50 chunksize 64
Chunk :0xc631010 size:64
0xc631010 Free size:64

./memorytest 40
Chunk :0x173a2010 size:48
0x173a2010 malloc size:40 chunksize 48
Chunk :0x173a2010 size:48
0x173a2010 Free size:48


Just for quick reference
Some cut paste about malloc from GLIBC malloc.c file itself
Rather than me writing absured about how malloc works, its better I leave it to the original implementor.


GLIBC
/* glibc-2.10.1/malloc/malloc.c:1803:
malloc_chunk details:

(The following includes lightly edited explanations by Colin Plumb.)

Chunks of memory are maintained using a `boundary tag' method as
described in e.g., Knuth or Standish. (See the paper by Paul
Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
survey of such techniques.) Sizes of free chunks are stored both
in the front of each chunk and at the end. This makes
consolidating fragmented chunks into bigger chunks very fast. The
size fields also hold bits representing whether chunks are free or
in use.

An allocated chunk looks like this:



chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk, if allocated | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk, in bytes |M|P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| User data starts here... .
. .
. (malloc_usable_size() bytes) .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


M - Mapped flage P - Previous in use Flag

Where "chunk" is the front of the chunk for the purpose of most of
the malloc code, but "mem" is the pointer that is returned to the
user. "Nextchunk" is the beginning of the next contiguous chunk.

Chunks always begin on even word boundries, so the mem portion
(which is returned to the user) is also on an even word boundary, and
thus at least double-word aligned.

Free chunks are stored in circular doubly-linked lists, and look like this:


chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk, in bytes |P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Forward pointer to next chunk in list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Back pointer to previous chunk in list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Unused space (may be 0 bytes long) .
. .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk, in bytes |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


The P (PREV_INUSE) bit, stored in the unused low-order bit of the
chunk size (which is always a multiple of two words), is an in-use
bit for the *previous* chunk. If that bit is *clear*, then the
word before the current chunk size contains the previous chunk
size, and can be used to find the front of the previous chunk.
The very first chunk allocated always has this bit set,
preventing access to non-existent (or non-owned) memory. If
prev_inuse is set for any given chunk, then you CANNOT determine
the size of the previous chunk, and might even get a memory
addressing fault when trying to do so.

Note that the `foot' of the current chunk is actually represented
as the prev_size of the NEXT chunk. This makes it easier to
deal with alignments etc but can be very confusing when trying
to extend or adapt this code.

The two exceptions to all this are

1. The special chunk `top' doesn't bother using the
trailing size field since there is no next contiguous chunk
that would have to index off it. After initialization, `top'
is forced to always exist. If it would become less than
MINSIZE bytes long, it is replenished.

2. Chunks allocated via mmap, which have the second-lowest-order
bit M (IS_MMAPPED) set in their size fields. Because they are
allocated one-by-one, each must contain its own trailing size field.

*/

Debugging memory issues
On your shell do this
export MALLOC_CHECK_=3

This will start the internal memory check of malloc
After this while you run your program on that shell, any memory access violations,corruptions will be reported.


Reference:


No comments:

Post a Comment

Followers