======================================================================
=                         X86_memory_models                          =
======================================================================

                            Introduction
======================================================================
In computing, the x86 memory models are a set of six different memory
models of the x86 CPU operating in real mode which control how the
segment registers are used and the default size of pointers.


                        Memory segmentation
======================================================================
Four registers refer to four segments on the 16-bit x86 segmented
memory architecture: DS (data segment), CS (code segment), SS (stack
segment), and ES (extra segment). Another 16-bit register can act as
an offset into a given segment, so a logical address on this platform
is written 'segment':'offset', typically in hexadecimal notation. In
real mode, to calculate the physical address of a byte of memory, the
hardware shifts the contents of the appropriate segment register 4
bits left (effectively multiplying by 16) and then adds the offset.

For example, the logical address 7522:F139 yields the 20-bit physical
address:

    75220    (segment 7522 shifted left by 4 bits)
  +  F139    (offset)
  -------
    84359
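
The same calculation can be sketched in Python. This is only an
illustrative model: the function name physical_address is
hypothetical, and masking the result to 20 bits is an assumption that
models a CPU with 20 address lines.

```python
def physical_address(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair to a physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left by 4 bits (multiply by 16), then add the offset.
    linear = (segment << 4) + offset
    # Assumption: keep only 20 address bits, as with 20 address lines.
    return linear & 0xFFFFF


if __name__ == "__main__":
    print(f"{physical_address(0x7522, 0xF139):05X}")  # prints 84359
```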

Note that this process leads to aliasing of memory, such that any
given physical address has up to 4096 corresponding logical addresses.
This complicates the comparison of pointers to different segments.
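
The aliasing can be made concrete by enumerating every logical
address that maps to one physical address. The sketch below is an
illustration only: logical_aliases is a hypothetical helper, and it
assumes no wrap-around past the top of the address space.

```python
def logical_aliases(physical: int) -> list:
    """List every (segment, offset) pair that maps to `physical`.

    Assumption: wrap-around past the top of memory is not modelled.
    """
    aliases = []
    # The highest usable segment leaves an offset in the range 0..15.
    highest = min(physical >> 4, 0xFFFF)
    for segment in range(highest, -1, -1):
        offset = physical - (segment << 4)
        if offset > 0xFFFF:
            break  # lower segments would need an offset wider than 16 bits
        aliases.append((segment, offset))
    return aliases


if __name__ == "__main__":
    pairs = logical_aliases(0x84359)
    print(len(pairs))                  # prints 4096
    print((0x7522, 0xF139) in pairs)   # prints True
```

Addresses near the bottom of memory have fewer aliases (physical
address 0 has only 0000:0000), which is why the figure is an upper
bound. Comparing the raw pairs is not enough to decide equality:
7522:F139 and 8435:0009 differ in both fields yet name the same byte.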


                           Pointer sizes
======================================================================
Pointer formats are known as 'near', 'far', or 'huge'. These examples
each load two adjacent integers into AX and DX from an address stored
at [reg].
* 'Near' pointers are 16-bit offsets within the reference segment,
i.e. DS for data and CS for code. They are the fastest pointers, but
can address only 64 KB of memory (the segment associated with the
data type). Near pointers can be held in registers (typically BX,
SI, and DI).

mov bx, word [reg]
mov ax, word [bx]
mov dx, word [bx+2]

* 'Far' pointers are 32-bit pointers containing a segment and an
offset. To dereference one, the pointer is typically loaded into ES
and a general register with the instruction LES reg, dword ptr [mem].
They may reference up to 1024 KiB of memory. Note that pointer
arithmetic (addition and subtraction) does not modify the segment
portion of the pointer, only its offset. Operations that go below
zero or above 65535 (0xFFFF) wrap modulo 64K, just like any other
16-bit operation. For example, if the segment is 0x5000 and the
offset is being incremented, the moment the offset would reach
0x10000 it wraps to zero, and the pointer rolls over to 0x5000:0000.

les bx,dword [reg]
mov ax,word [es:bx]
mov dx,word [es:bx+2]
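
The wrap-around behaviour of far pointer arithmetic can be modelled
in a few lines of Python. This is a sketch only; far_add is a
hypothetical name, and a far pointer is represented here simply as a
(segment, offset) tuple.

```python
def far_add(segment: int, offset: int, delta: int) -> tuple:
    """Add `delta` to a far pointer as 16-bit offset arithmetic would."""
    # Only the offset changes, and it wraps modulo 64K.
    # The segment is never touched, so the pointer stays in its segment.
    return segment, (offset + delta) & 0xFFFF


if __name__ == "__main__":
    seg, off = far_add(0x5000, 0xFFFF, 1)
    print(f"{seg:04X}:{off:04X}")  # prints 5000:0000
```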

* 'Huge' pointers are essentially far pointers, but are (mostly)
normalized every time they are modified so that they have the highest
possible segment for that address. This is very slow, but it allows
the pointer to span multiple segments (i.e. to cover more than 64 KB)
and allows for accurate pointer comparisons, as if the platform had a
flat memory model. Normalization removes the aliasing of memory
described above, so two huge pointers that reference the same memory
location are always equal.

les bx,dword [reg]
mov ax,word [es:bx]
add bx,2
test bx,0xfff0
jz lbl
sub bx,0x10
mov dx,es
inc dx
mov es,dx
lbl:
mov dx,word [es:bx]

In reality, a compiler may optimize the above 'Huge' pointer code so
that it uses no branches. This version decrements the segment
unconditionally and compensates by adding 0x10 (the 16 bytes between
consecutive segments) to the offset in BX. Unlike the previous 'Huge'
example, this version does not require the segment:offset to be
normalized.

les bx,dword [reg]
mov ax,word [es:bx]
mov dx,es
dec dx
mov es,dx
mov dx,word [es:bx+(2+0x10)]
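
Normalization and the segment-decrement trick can both be checked
with a small Python model. This is a sketch under stated assumptions:
the helper names physical, normalize and huge_add are hypothetical,
and the address space is assumed to be 20 bits wide.

```python
ADDRESS_MASK = 0xFFFFF  # assumption: 20-bit physical address space


def physical(segment: int, offset: int) -> int:
    """Physical address of segment:offset (segment * 16 + offset)."""
    return ((segment << 4) + offset) & ADDRESS_MASK


def normalize(segment: int, offset: int) -> tuple:
    """Return the alias with the highest segment (offset in 0..15)."""
    linear = physical(segment, offset)
    return linear >> 4, linear & 0xF


def huge_add(segment: int, offset: int, delta: int) -> tuple:
    """Advance a huge pointer and renormalize the result."""
    linear = (physical(segment, offset) + delta) & ADDRESS_MASK
    return linear >> 4, linear & 0xF


if __name__ == "__main__":
    # Two aliases of the same byte compare equal once normalized.
    print(normalize(0x7522, 0xF139) == normalize(0x8000, 0x4359))  # True
    # Decrementing the segment and adding 0x10 leaves the address unchanged.
    es, bx = 0x8435, 0x0009
    print(physical(es - 1, bx + 2 + 0x10) == physical(es, bx + 2))  # True
```

Unlike far pointer arithmetic, huge_add carries out of the offset into
the segment, so stepping past 5FFF:000F yields 6000:0000 instead of
wrapping within one segment.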


                           Memory models
======================================================================
The memory models are:
Model    Data   Code   Definition
Tiny     near   near   CS=DS=SS=ES
Small    near   near   DS=SS
Medium   near   far    DS=SS, multiple code segments
Compact  far    near   single code segment, multiple data segments
Large    far    far    multiple code and data segments
Huge     huge   far    multiple code and data segments; single array
                       may be >64 KB
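
The table can be read as a lookup from model to default pointer
kinds. The Python sketch below encodes it that way; the names
MEMORY_MODELS, POINTER_BYTES and default_pointer_sizes are
hypothetical, and the sizes follow from the pointer descriptions above
(near pointers are 16 bits, far and huge pointers are 32 bits).

```python
# Pointer widths in bytes, taken from the pointer descriptions above.
POINTER_BYTES = {"near": 2, "far": 4, "huge": 4}

# Model name -> (default data pointer kind, default code pointer kind).
MEMORY_MODELS = {
    "tiny": ("near", "near"),
    "small": ("near", "near"),
    "medium": ("near", "far"),
    "compact": ("far", "near"),
    "large": ("far", "far"),
    "huge": ("huge", "far"),
}


def default_pointer_sizes(model: str) -> tuple:
    """Return (data pointer bytes, code pointer bytes) for a model."""
    data_kind, code_kind = MEMORY_MODELS[model.lower()]
    return POINTER_BYTES[data_kind], POINTER_BYTES[code_kind]


if __name__ == "__main__":
    print(default_pointer_sizes("Compact"))  # prints (4, 2)
```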


                          Other platforms
======================================================================
In protected mode a segment cannot be both writable and executable.
Therefore, when implementing the Tiny memory model, the code segment
register must point to the same physical address and have the same
limit as the data segment register. This defeats one of the features
of the 80286, which ensures that data segments are never executable
and code segments are never writable (which means that self-modifying
code is never allowed). On the 80386, however, the paged memory
management unit makes it possible to protect individual memory pages
against writing.

Memory models are not limited to 16-bit programs. It is possible to
use segmentation in 32-bit protected mode as well (resulting in 48-bit
pointers), and there are C language compilers that support this.
However, segmentation in 32-bit mode does not allow access to a larger
address space than a single segment would cover, unless some segments
are not always present in memory and the linear address space is used
only as a 'cache' over a larger segmented virtual space.


x86-64
========
On the x86-64 platform, a total of seven memory models exist. They
arise because the majority of symbol references are only 32 bits
wide, and they differ in whether addresses are known at link time (as
opposed to position-independent code). This does not affect the
pointers used, which are always flat 64-bit pointers; it affects only
where values that have to be accessed via symbols can be placed.


                              See also
======================================================================
* Protected mode


                            Bibliography
======================================================================
* 'Turbo C++ Version 3.0 User's Guide'. Borland International,
Copyright 1992.


License
=========
All content on Gopherpedia comes from Wikipedia, and is licensed under CC-BY-SA
License URL: http://creativecommons.org/licenses/by-sa/3.0/
Original Article: http://en.wikipedia.org/wiki/X86_memory_models