\" $NetBSD: __builtin_prefetch.3,v 1.4 2024/09/07 20:33:53 rillig Exp $
\"
\" Copyright (c) 2010 Jukka Ruohonen <[email protected]>
\" All rights reserved.
\"
\" Redistribution and use in source and binary forms, with or without
\" modification, are permitted provided that the following conditions
\" are met:
\" 1. Redistributions of source code must retain the above copyright
\"    notice, this list of conditions and the following disclaimer.
\" 2. Redistributions in binary form must reproduce the above copyright
\"    notice, this list of conditions and the following disclaimer in the
\"    documentation and/or other materials provided with the distribution.
\"
\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
\" POSSIBILITY OF SUCH DAMAGE.
\"
Dd December 22, 2010
Dt __BUILTIN_PREFETCH 3
Os
Sh NAME
Nm __builtin_prefetch
Nd GNU extension to prefetch memory
Sh SYNOPSIS
Ft void
Fn __builtin_prefetch "const void *addr" "..."
Sh DESCRIPTION
The
Fn __builtin_prefetch
function prefetches memory from
Fa addr .
The rationale is to minimize cache-miss latency by
trying to move data into a cache before accessing the data.
Possible use cases include frequently called sections of code
in which it is known that the data in a given address is likely
to be accessed soon.
Pp
In addition to
Fa addr ,
there are two optional
Xr stdarg 3
arguments,
Fa rw
and
Fa locality .
The value of the latter should be a compile-time
constant integer between 0 and 3.
The higher the value, the higher the temporal locality in the data.
When
Fa locality
is 0, it is assumed that there is little or no temporal locality in the data;
after access, it is not necessary to leave the data in the cache.
The default value is 3.
The value of
Fa rw
is either 0 or 1, corresponding with read and write prefetch, respectively.
The default value of
Fa rw
is 0.
Also
Fa rw
must be a compile-time constant integer.
Pp
The
Fn __builtin_prefetch
function translates into prefetch instructions
only if the architecture has support for these.
If there is no support,
Fa addr
is evaluated only if it includes side effects,
although no warnings are issued by
Xr gcc 1 .
Sh EXAMPLES
The following optimization appears in the heavily used
Fn cpu_in_cksum
function that calculates checksums for the
Xr inet 4
headers:
Bd -literal -offset indent
while (mlen >= 32) {
       __builtin_prefetch(data + 32);
       partial += *(uint16_t *)data;
       partial += *(uint16_t *)(data + 2);
       partial += *(uint16_t *)(data + 4);

       \&...

       partial += *(uint16_t *)(data + 28);
       partial += *(uint16_t *)(data + 30);

       data += 32;
       mlen -= 32;

       \&...
Ed
Sh SEE ALSO
Xr gcc 1 ,
Xr attribute 3
Rs
%A Ulrich Drepper
%T What Every Programmer Should Know About Memory
%D November 21, 2007
%U https://www.akkadia.org/drepper/cpumemory.pdf
Re
Sh CAVEATS
This is a non-standard, compiler-specific extension.