Volume 0x0b, Issue 0x39, Phile #0x07 of 0x12

==Phrack Inc.==

Volume 0x0b, Issue 0x39, Phile #0x07 of 0x12

|=---=[ ICMP based remote OS TCP/IP stack fingerprinting techniques ]=---=|
|=-----------------------------------------------------------------------=|
|=---------------=[ Ofir Arkin & Fyodor Yarochkin ]=---------------------=|

--[ICMP based fingerprinting approach]--

TCP based remote OS fingerprinting is quite old(*1) and well-known
these days, here we would like to introduce an alternative method to
determine an OS remotely based on ICMP responses which are received
from the host. Certain accuracy level has been achieved with
different platforms, which, with some systems or or classes of
platforms (i.g. Win*), is significally more precise than
demonstrated with TCP based fingerprinting methods.

As mentioned above TCP based method, ICMP fingerprinting utilizes
several tests to perform remote OS TCP/IP stack probe, but unlike
TCP fingerprinting, a number of tests required to identify an OS
could vary from 1 to 4 (as of current development stage).

ICMP fingerprinting method is based on certain discoveries on
differencies of ICMP replies from various operating systems (mostly
due to incorrect, or inconsistant implementation), which were found
by Ofir Arkin during his "ICMP Usage in Scanning" research project.
Later these discoveries were summarised into a logical desicions
tree which Ofir entitled "X project" and practically implemented in
'Xprobe' tool.

--[Information/Noise ratio with ICMP fingerprints]--

As it's been noted, the number of datagrams we need to send and
receive in order to remotely fingerprint a targeted machine with
ICMP based probes is small. Very small. In fact we can send one
datagram and receive one reply and this will help us identify up to
eight different operating systems (or classes of operating systems).
The maximum datagrams which our tool will use at the current stage
of development, is four. This is the same number of replies we will
need to analyse. This makes ICMP based fingerprinting very
time-efficient.

ICMP based probes could be crafted to be very stealthy. As on the
moment, no maliformed/broken/corrupted datagrams are used to
identify remote OS type, unlike the common fingerprinting methods.
Current core analysis targets validation of received ICMP responses
on valid packets, rather than crafting invalid packets themselves.
Heaps of such packets appear in an average network on daily basis
and very few IDS systems are tuned to detect such traffic (and those
which are, presumably are very noisy and badly configured).

--[Why it still works?]--

Inheritable mess among various TCP/IP stack implementations with
ICMP handling implementations which implement different RFC
standards (original RFC 792, additional RFC 1122, etc), partial or
incomplete ICMP support (various ICMP requests are not supported
everywhere), low significance of ICMP Error messages data (who
verifies all the fields of the original datagram?!), mistakes and
misunderstanding in ICMP protocol implementation made our method
viable.

--[What do we fingerprint:]--

Several OS-specific differencies are being utilized in ICMP based
fingerprinting to identify remote operating system type:

IP fields of an 'offending' datagram to be examined:

* IP total length field

Some operating systems (i.g. BSD family) will add 20 bytes
(sizeof(ipheader)) to the original IP total length field (which
occures due to internal processing mistakes of the datagram, please
note when the same packet is read from SOCK_RAW the same behaviour
is seen: returned packet ip_len fiend is off by 20 bytes).

Some other operating systems will decrease 20 bytes from the
original IP total lenth field value of the offending packet.

Third group of systems will echo this field correctly.

* IP ID
some systems are seen not to echo this field correctly. (bit order
of the field is changed).

* 3 bits flags and offset

some systems are seen not to echo this field correctly. (bit order
of the field is changed).

* IP header checksum

Some operating systems will miscalculate this field, others just
zero it out. Third group of the systems echoes this field correctly.

* UDP header checksum (in case of UDP datagram)
The same thing could happen with UDP checksum header.

IP headers of responded ICMP packet:

* Precedence bits
Each IP Datagram has an 8-bit field called the 'TOS Byte', which
represents the IP support for prioritization and Type-of-Service
handling.

The 'TOS Byte' consists of three fields.

The 'Precedence field'\cite{rfc791}, which is 3-bit long, is intended to
prioritize the IP Datagram. It has eight levels of prioritization.

Higher priority traffic should be sent before lower priority traffic.

The second field, 4 bits long, is the 'Type-of-Service' field. It is
intended to describe how the network should make tradeoffs between
throughput, delay, reliability, and cost in routing an IP Datagram.

The last field, the 'MBZ' (must be zero), is unused and must be zero.
Routers and hosts ignore this last field. This field is 1 bit long.
The TOS Bits and MBZ fields are being replaced by the DiffServ
mechanism for QoS.

RFC 1812 Requires following for IP Version 4 Routers:

"4.3.2.5 TOS and Precedence

ICMP Source Quench error messages, if sent at all, MUST have their
IP Precedence field set to the same value as the IP Precedence field
in the packet that provoked the sending of the ICMP Source Quench
message. All other ICMP error messages (Destination Unreachable,
Redirect, Time Exceeded, and Parameter Problem) SHOULD have their
precedence value set to 6 (INTERNETWORK CONTROL) or 7 (NETWORK
CONTROL). The IP Precedence value for these error messages MAY be
settable".

Linux Kernel 2.0.x, 2.2.x, 2.4.x will act as routers and will set
their Precedence bits field value to 0xc0 with ICMP error messages.
Networking devices that will act the same will be Cisco routers
based on IOS 11.x-12.x and Foundry Networks switches.

* DF bits echoing
Some TCP/IP stacks will echo DF bit with ICMP Error datagrams,
others (like linux) will copy the whole octet completely, zeroing
certain bits, others will ignore this field and set their own.

* IP ID filend (linux 2.4.0 - 2.4.4 kernels)

Linux machines based on Kernel 2.4.0-2.4.4 will set the IP
Identification field value with their ICMP query request and reply
messages to a value of zero.

This was later fixed with Linux Kernels 2.4.5 and up.

* IP ttl field (ttl distance to the target has to be precalculated to
guarantee accuracy).

"The sender sets the time to live field to a value that represents
the maximum time the datagram is allowed to travel on the Internet".

The field value is decreased at each point that the IP header is
being processed. RFC 791 states that this field decreasement reflects
the time spent processing the datagram. The field value is measured
in units of seconds. The RFC also states that the maximum time to
live value can be set to 255 seconds, which equals to 4.25 minutes.
The datagram must be discarded if this field value equals zero -
before reaching its destination.

Relating to this field as a measure to assess time is a bit
misleading. Some routers may process the datagram faster than a
second, and some may process the datagram longer than a second.

The real intention is to have an upper bound to the datagram
lifetime, so infinite loops of undelivered datagrams will not jam the
Internet.

Having a bound to the datagram lifetime help us to prevent old
duplicates to arrive after a certain time elapsed. So when we
retransmit a piece of information which was not previously delivered
we can be assured that the older duplicate is already discarded and
will not interfere with the process.

The IP TTL field value with ICMP has two separate values, one for
ICMP query messages and one for ICMP query replies.

The IP TTL field value helps us identify certain operating systems
and groups of operating systems. It also provides us with the
simplest means to add another check criterion when we are querying
other host(s) or listening to traffic (sniffing).

TTL-based fingeprinting requires a TTL distance to the done to be
precalculated in advance (unless a fingerprinting of a local network
based system is performed system).

The ICMP Error messages will use values used by ICMP query request
messages.

A good statistics of ttl dependancy on OS type has been gathered at:
http://www.switch.ch/docs/ttl_default.html
(Research paper on default ttl values)

* TOS field

RFC 1349 defines the usage of the Type-of-Service field with the
ICMP messages. It distinguishes between ICMP error messages
(Destination Unreachable, Source Quench, Redirect, Time Exceeded,
and Parameter Problem), ICMP query messages (Echo, Router
Solicitation, Timestamp, Information request, Address Mask request)
and ICMP reply messages (Echo reply, Router Advertisement, Timestamp
reply, Information reply, Address Mask reply).

Simple rules are defined:
* An ICMP error message is always sent with the default TOS (0x0000)

* An ICMP request message may be sent with any value in the TOS
field. "A mechanism to allow the user to specify the TOS value to
be used would be a useful feature in many applications that
generate ICMP request messages".

The RFC further specify that although ICMP request messages are
normally sent with the default TOS, there are sometimes good
reasons why they would be sent with some other TOS value.

* An ICMP reply message is sent with the same value in the TOS
field as was used in the corresponding ICMP request message.

Some operating systems will ignore RFC 1349 when sending ICMP echo
reply messages, and will not send the same value in the TOS field as
was used in the corresponding ICMP request message.

ICMP headers of responded ICMP packet:

* ICMP Error Message Quoting Size:

All ICMP error messages consist of an IP header, an ICMP header
and certain amount of data of the original datagram, which triggered
the error (aka offending datagram).

According to RFC 792 only 64 bits (8 octets) of original datagram
are supposed to be included in the ICMP error message. However RFC
1122 (issued later) recommends up to 576 octets to be quoted.

Most of "older" TCP stack implementations will include 8 octets into
ICMP Errror message. Linux/HPUX 11.x, Solaris, MacOS and others will
include more.

Noticiably interesting is the fact that Solaris engineers probably
couldn't not read RFC properly (since instead of 64 bits Solaris
2.x includes 64 octets (512 bits) of the original datagram.

* ICMP error Message echoing integrity

Another artifact which has been noticed is that some stack
implementations, when sending back an ICMP error message, may alter
the offending packet's IP header and the underlying protocol data,
which is echoed back with the ICMP error message.

Since mistakes, made by TCP/IP stack programmers are different and
specific to an operating system, an analysis of these mistakes could
give a potential attacker a a possibilty to make assumptions about
the target operating system type.

Additional tweaks and twists:
* Using difererent from zero code fields in ICMP echo requests

When an ICMP code field value different than zero (0) is sent with
an ICMP Echo request message (type 8), operating systems that will
answer our query with an ICMP Echo reply message that are based on
one of the Microsoft based operating systems will send back an ICMP
code field value of zero with their ICMP Echo Reply. Other operating
systems (and networking devices) will echo back the ICMP code field
value we were using with the ICMP Echo Request.

The Microsoft based operating systems acts in contrast to RFC
792 guidelines which instruct the answering operating systems to
only change the ICMP type to Echo reply (type 0), recalculate the
checksums and send the ICMP Echo reply away.

* Using DF bit echoing with ICMP query messages

As in case of ICMP Error messages, some tcp stacks will respond
these queries, while the others: will not.

* Other ICMP messages:
* ICMP timestamp request
* ICMP Information request
* ICMP Address mask request

Some TCP/IP stacks support these messages and respond to some of
these requests.

--[Xprobe implementation]--

Currently Xprobe deploys hardcoded logic tree, developed by Ofir
Arkin in 'Project X'. Initially a UDP datagram is being sent to a
closed port in order to trigger ICMP Error message: ICMP
unreachable/port unreach. (this sets up a limitation of having at
least one port not filtered on target system with no service
running, generically speaking other methods of triggering ICMP
unreach packet could be used, this will be discussed further).
Moreover, a few tests (icmp unreach content, DF bits, TOS ...) could
be combined within a single query, since they do not affect results
of each other.
Upon the receipt of ICMP unreachable datagram, contents of the
received datagram is examined and a diagnostics decision is made, if
any further tests are required, according to the logic tree, further
queries are sent.

--[ Logic tree]---

Quickly recapping the logic tree organization:

Initially all TCP/IP stack implementations are split into 2 groups,
those which echo precedence bits back, and those which do not. Those
which do echo precendence bits (linux 2.0.x, 2.2.x, 2.4.x, cisco IOS
11.x-12.x, Extreme Network Switches etc), being differentiated
further based on ICMP error quoting size. (Linux sticks with RFC
1122 here and echoes up to 576 octets, while others in this subgroup
echo only 64 bits (8 octets)). Further echo integrity checks are
used to differentiate cisco routers from Extreme Network switches.

Time-to-live and IP ID fields of ICMP echo reply are being used to
recognize version of linux kernel.

The same approach is being used to recognize other TCP/IP stacks.
Data echoing validation (amounts of octets of original datagram
echoed, checksum validation, etc). If additional information is
needed to differ two 'similar' IP stacks, additional query is being
sent. (please refer to the diagram at
http://www.sys-security.com/html/projects/X.html for more detailed
explanation/graphical representation of the logic tree).

One of the serious problems with the logic tree, is that adding new
operating system types to it becomes extremely painful. At times
part of the whole logic tree has to be reworked to 'fit' a single
description. Therefore a singature based fingerprinting method took
our closer attention.

--[Sinature based approach]--

Singature based approach is what we are currently focusing on and
which we believe will be further, more stable, reliable and flexible
method of remote ICMP based fingerprints.

Signature-based method is currently based on five different tests,
which optionally could be included in each operating system
fingerprint. Initally the systems with lesser amount of tests are
being examined (normally starting with ICMP unreach test).

If no single OS stack found matching received signature, those
stacks which match a part, being grouped again, and another test
(based on lesser amounts of tests issued principle) is choosen and
executed. This verification is repeated until an OS stack,
completely matching the signature is found, or we run out of tests.

Currently following tests are being deployed:

* ICMP unreachable test (udp closed port based, host unreachable,
network unreachable (for systems which are believed to be gateways)
* ICMP echo request/reply test
* ICMP timestamp request
* ICMP information request
* ICMP address mask request

--[future implementations/development]--

Following issues are planned to be deployed (we always welcome
discussions/suggestions though):
* Fingerprints database (currently being tested)
* Dynamic, AI based logic (long-term project :))
* Tests would heavily dependent on network topology (pre-test
network mapping will take place).
* Path-to-target test (to calculate hops distance to the target)
filtering devices probes.
* Future implementations will be using packets with
actual application data to dismiss chances of being detected.
* other network mapping capabilities shall be included (
network role identification, search for closed UDP port, reachability
tests, etc).

--[code for kids]--

Currently implemented code and further documentation is available at
following locations:

http://www.sys-security.com/html/projects/X.html

http://xprobe.sourceforge.net

http://www.notlsd.net/xprobe/

Ofir Arkin <[email protected]>
Fyodor Yarochkin <[email protected]>

|=[ EOF ]=---------------------------------------------------------------=|