/*
*
* 23. TESTING SPEED (all processors)
* ===================================
* The Pentium family of processors has an internal 64-bit clock counter
* which can be read into EDX:EAX using the instruction RDTSC (read time
* stamp counter). This is very useful for measuring exactly how many clock
* cycles a piece of code takes.
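*
* A minimal C sketch of reading the counter, assuming GCC or Clang inline
* assembly on an x86 target. The names tsc_combine and read_tsc are
* hypothetical helpers for illustration and are not part of the program
* below:
*
```c
#include <stdint.h>

// Join the two 32-bit halves that RDTSC leaves in EDX (high) and EAX (low).
static inline uint64_t tsc_combine(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

// Read the 64-bit time stamp counter (GCC/Clang extended asm, x86 only).
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return tsc_combine(hi, lo);
}
```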
*
* The program below measures the number of clock cycles a piece of code
* takes. It executes the code to test 10 times and stores the 10 clock
* counts. The program can be used in both 16 and 32 bit mode on the
* PPlain and PMMX.
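*
* The same measure-and-subtract pattern can be sketched in plain C, assuming
* GCC or Clang on an x86 target. The helpers read_tsc, net_cycles and measure
* are hypothetical names, and the OVERHEAD value here is only a placeholder:
* a C harness adds call and loop overhead of its own, so the constant would
* have to be calibrated by timing an empty piece of code first.
*
```c
#include <stdint.h>

#define ITER 10      // number of repetitions, as in the program described above
#define OVERHEAD 17  // placeholder for the cost of the harness; calibrate it

// Read the 64-bit time stamp counter (GCC/Clang extended asm, x86 only).
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

// Clock cycles between two counter readings, minus the harness overhead.
static inline int64_t net_cycles(uint64_t start, uint64_t end, int64_t overhead)
{
    return (int64_t)(end - start) - overhead;
}

// Run code_to_test ITER times and store one net clock count per run.
static inline void measure(void (*code_to_test)(void), int64_t results[ITER])
{
    for (int i = 0; i < ITER; i++) {
        uint64_t start = read_tsc();
        code_to_test();
        uint64_t end = read_tsc();
        results[i] = net_cycles(start, end, OVERHEAD);
    }
}
```
*
* The first count typically differs from the later ones because the code and
* data are not yet in the cache, which is why all ITER counts are kept
* rather than averaged.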
*
* The 'filler' instructions before and after the piece of code to test are
* included in order to get consistent results on the PPlain.
* The CLD is a non-pairable instruction which has been inserted to
* make sure the pairing is the same the first time as the subsequent times.
* The eight NOP instructions are inserted to prevent any prefixes in the code
* to test from being decoded in the shadow of the preceding instructions on
* the PPlain. Single byte instructions are used here to obtain the same pairing
* the first time as the subsequent times. The CLC after the code to test is
* a non-pairable instruction which has a shadow under which the 0FH prefix
* of the RDTSC can be decoded so that it is independent of any shadowing
* effect from the code to test on the PPlain.
*
* On the PMMX you may want to insert XOR EAX,EAX / CPUID before the
* instructions to test if you want the FIFO instruction buffer to be
* empty, or some time-consuming instruction (e.g. CLI or AAD) if you
* want the FIFO buffer to be full.
*
* On the PPro and PII you have to put in a serializing instruction like
* CPUID before and after each RDTSC to prevent it from executing in parallel
* with anything else. (CPUID is a serializing instruction which means that
* it flushes the pipeline and waits for all pending operations to finish
* before proceeding. This is useful for testing purposes. CPUID has no
* shadow under which prefixes of subsequent instructions can decode.)
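*
* A minimal C sketch of a serialized counter read, assuming GCC or Clang
* inline assembly on an x86 target. The names serialize and serialized_rdtsc
* are hypothetical. CPUID itself takes many clock cycles, so the overhead
* to subtract has to be measured again whenever it is added:
*
```c
#include <stdint.h>

// Execute CPUID (function 0) purely for its serializing side effect.
static inline void serialize(void)
{
    uint32_t a = 0, b, c = 0, d;
    __asm__ __volatile__("cpuid"
                         : "+a"(a), "=b"(b), "+c"(c), "=d"(d)
                         :
                         : "memory");
    (void)a; (void)b; (void)c; (void)d;  // register contents are not needed
}

// RDTSC with CPUID before and after it, so that the counter read cannot
// execute in parallel with the surrounding instructions.
static inline uint64_t serialized_rdtsc(void)
{
    uint32_t lo, hi;
    serialize();
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    serialize();
    return ((uint64_t)hi << 32) | lo;
}
```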
*
* The RDTSC instruction cannot execute in virtual mode on the PPlain and
* PMMX, so if you are running DOS programs you must run in real mode. (Press
* F8 while booting and select 'safe mode command prompt only' or 'bypass
* startup files').
*
* The Pentium processors have special performance monitor counters which can
* count events such as cache misses, misalignments, AGI stalls, etc. Details
* about how to use the performance monitor counters are not covered by this
* manual but can be found in the MMX technology developer's manual.
*
*/
#include <stdio.h>
#include <stdlib.h>
#define ITER 10 /* number of iterations */
int counter = 0; /* loop counter */
int tics = 0; /* temporary storage of clock */
int resultlist[ITER]; /* list of test results */
int main(void)
{
int i;
asm("
.equ ITER, 10 # number of iterations
.equ OVERHEAD, 17 # 15 for PPlain, 17 for PMMX
#**************** Do any initializations here: ************************
#
# movl var3, %%edx # Ensure variable is in level 1 cache
#
#**************** End of initializations ************************