Linux SMP HOWTO
 David Mentré, [email protected]
 v1.9, 13 January 2000

 This HOWTO reviews the main issues (and, I hope, solutions) related to
 SMP configuration under Linux.
 ______________________________________________________________________

 Table of Contents



 1. Introduction

 2. Questions related to any architectures

    2.1 Kernel Side
    2.2 User Side
    2.3 SMP Programming
       2.3.1 Parallelization methods
       2.3.2 The C Library
       2.3.3 Languages, Compilers and debuggers
       2.3.4 Other libraries
       2.3.5 Other points about SMP Programming

 3. x86 architecture specific questions

    3.1 Why it doesn't work on my machine?
    3.2 Possible causes of crash
    3.3 Motherboard specific information
       3.3.1 Motherboards with known problems
    3.4 Low cost SMP Linux box (dual Celeron box)
       3.4.1 Is it possible to run a dual Intel Celeron box ?
       3.4.2 How does Linux behave on a dual Celeron system ?
       3.4.3 Celeron processors are known to be easily overclockable. And dual Celeron system ?
       3.4.4 And making a quad Celeron system ?
       3.4.5 What about mixing Celeron and Pentium II processor ?

 4. Sparc architecture specific questions

    4.1 Which Sparc machines are supported ?
    4.2 Specific problem related to Sparc SMP support
    4.3 SMP specific limit with current kernel (2.2)

 5. PowerPC architecture specific questions

    5.1 Which PPC machines are supported ?
    5.2 Specific problem related to PPC SMP support

 6. Alpha architecture specific questions

    6.1 Which Alpha machines are supported ?
    6.2 Specific problem related to Alpha SMP support

 7. Useful pointers

    7.1 Various
    7.2 Multithreaded programs and library
    7.3 SMP specific patches
    7.4 Parallelizing/Optimizing Compilers for 586/686 machines (

 8. Glossary

 9. What's new ?

 10. List of contributors



 ______________________________________________________________________

 1.  Introduction

 Linux works on SMP (Symmetric Multi-Processors) machines. SMP support
 was introduced with kernel version 2.0, and has improved steadily ever
 since.  The kernel locking granularity is much finer in 2.2.x than in
 2.0.x, which enables better performance when processes are accessing
 the kernel!

 HOWTO maintained by David Mentré ([email protected]). The latest
 edition of this HOWTO can be found at

 o  http://www.irisa.fr/prive/mentre/smp-howto/ (France)

 o  http://www.phy.duke.edu/brahma/smp-faq/ (USA)


 If you want to contribute to this HOWTO, I would prefer a diff against
 the SGML version <http://www.irisa.fr/prive/mentre/smp-howto/smp-
 howto.sgml> of this document, but any remarks (in plain text) will be
 greatly appreciated. If you send me an email about this HOWTO, please
 include a tag like [Linux SMP HOWTO] in the Subject: field of your e-
 mail. It helps me to automatically sort mails (and you will have a
 faster reply ;)).


 This HOWTO is an improvement of a first draft
 <http://www.ihoc.net/linux-smp-faq-draft.html> made by Chris Pirih.


 All information contained in this HOWTO is provided "as is." All
 warranties, expressed, implied or statutory, concerning the accuracy
 of the information or the suitability for any particular use are
 hereby specifically disclaimed. While every effort has been taken to
 ensure the accuracy of the information contained in this HOWTO, the
 authors assume no responsibility for errors or omissions, or for
 damages resulting from the use of the information contained herein.


 2.  Questions related to any architectures


 2.1.  Kernel Side



 1. Does Linux support multi-threading?  If I start two or more
    processes, will they be distributed among the available CPUs?

    Yes. Processes and kernel-threads are distributed among processors.
    User-space threads are not.


 2. What kind of architectures are supported in SMP?


    From Alan Cox:
       SMP is supported in 2.0 on the hypersparc (SS20, etc.) systems
       and Intel 486, Pentium or higher machines which are Intel
       MP1.1/1.4 compliant. Richard Jelinek adds: right now, systems
       have been tested up to 4 CPUs and the MP standard (and so Linux)
       theoretically allows up to 16 CPUs.

       SMP support for UltraSparc, SparcServer, Alpha and PowerPC
       machines is available in 2.2.x.


    From Ralf Bächle:
       MIPS, m68k and ARM do not support SMP; the latter two probably
       won't ever.

       That is, I'm going to hack on MIPS-SMP as soon as I get an SMP
       box ...

 3. How do I make a Linux SMP kernel?

    Most Linux distributions don't provide a ready-made SMP-aware
    kernel, which means that you'll have to make one yourself. If you
    haven't made your own kernel yet, this is a great reason to learn
    how.  Explaining how to make a new kernel is beyond the scope of
    this document; refer to the Linux Kernel Howto for more
    information. (C. Polisher)

    In kernel series 2.0 up to but not including 2.1.132, uncomment the
    SMP=1 line in the main Makefile (/usr/src/linux/Makefile).

    In the 2.2 version, configure the kernel and answer "yes" to the
    question "Symmetric multi-processing support" (Michael Elizabeth
    Chastain).

    AND

    enable real time clock support by configuring the "RTC support"
    item (from Robert G. Brown). Note that enabling RTC support does
    not, as far as I know, prevent the known problem with SMP clock
    drift, but it does prevent a lockup when the clock is read at boot
    time. A note from Richard Jelinek also says that activating the
    Enhanced RTC is necessary to get the second CPU working
    (identified) on some original Intel mainboards.

    AND

    (x86 kernel) do NOT enable APM (advanced power management)! APM and
    SMP are not compatible, and your system will almost certainly (or
    at least probably ;)) crash while booting if APM is enabled (Jakob
    Oestergaard). Alan Cox confirms this : 2.1.x turns APM off for SMP
    boxes.  Basically APM is undefined in the presence of SMP systems,
    and anything could occur.

    AND

    (x86 kernel) enable "MTRR (Memory Type Range Register) support".
    Some BIOSes are buggy and do not activate cache memory for the
    second processor. The MTRR support contains code that corrects
    this processor misconfiguration.



    You must rebuild your kernel and all kernel modules when changing
    to and from SMP mode. Remember to run make modules and make
    modules_install (from Alan Cox).


    If you get module load errors, you probably did not rebuild and/or
    re-install your modules.  Also with some 2.2.x kernels people have
    reported problems when changing the compile from SMP back to UP
    (uni-processor).  To fix this, save your .config file, do make
    mrproper, restore your .config file, then remake your kernel (make
    dep, etc.)  (Wade Hampton). Do not forget to run lilo after copying
    your new kernel.

    Recap:



    ___________________________________________________________________
    make config # or menuconfig or xconfig
    make dep
    make clean
    make bzImage # or whatever you want
    # copy the kernel image manually then RUN LILO
    # or make lilo
    make modules
    make modules_install
    ___________________________________________________________________



 4. How do I make a Linux non-SMP kernel?

    In the 2.0 series, comment the SMP=1 line in the main Makefile
    (/usr/src/linux/Makefile).

    In the 2.2 series, configure the kernel and answer "no" to the
    question "Symmetric multi-processing support" (Michael Elizabeth
    Chastain).



    You must rebuild your kernel and all kernel modules when changing
    to and from SMP mode. Remember to run make modules and make
    modules_install, and remember to run lilo.  See the notes above
    about possible configuration problems.



 5. How can I tell if it worked?


     cat /proc/cpuinfo



 Typical output (dual Pentium II):

 ______________________________________________________________________
 processor       : 0
 cpu             : 686
 model           : 3
 vendor_id       : GenuineIntel
 [...]
 bogomips        : 267.06

 processor       : 1
 cpu             : 686
 model           : 3
 vendor_id       : GenuineIntel
 [...]
 bogomips        : 267.06
 ______________________________________________________________________
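
 The check above can also be scripted. The following is only a minimal
 sketch and not part of the original HOWTO: count_processors is a
 hypothetical helper name, and it assumes that each CPU shows up as a
 line beginning with "processor", as in the typical output shown.

```shell
# count_processors [FILE]: print how many "processor" entries appear in
# a cpuinfo-style file (default: /proc/cpuinfo).
count_processors() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Report the count for the running machine, if /proc/cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
    echo "CPUs seen by the kernel: $(count_processors)"
fi
```

 On a dual-processor box, a count of 1 would suggest that the running
 kernel was not built with SMP support or that the second CPU was not
 detected.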



 6. What is the status of converting the kernel toward finer grained
    locking and multithreading?

    Linux kernel version 2.2 has signal handling, interrupts and some
    I/O stuff fine grain locked.  The rest is gradually migrating. All
    the scheduling is SMP safe.


    Kernel version 2.3 (the future 2.4) has really fine-grained
    locking. In the 2.3 kernels the usage of the big kernel lock has
    basically disappeared; all major Linux kernel subsystems are fully
    threaded: networking, VFS, VM, IO, block/page caches, scheduling,
    interrupts, signals, etc. (Ingo Molnar)


 7. Does Linux SMP support processor affinity?



    Standard kernel
       No and Yes.  There is no way to force a process onto a specific
       CPU, but the Linux scheduler has a processor bias for each
       process, which tends to keep processes tied to a specific CPU.


    Patch
       Yes. Look at PSET - Processor Sets for the Linux kernel
       <http://isunix.it.ilstu.edu/~thockin/pset/>:

         The goal of this project is to make a source compatible
         and functionally equivalent version of pset (as defined
         by SGI - partially removed from their IRIX 6.4 kernel)
         for Linux.  This enables users to determine which
         processor or set of processors a process may run on.
         Possible uses include forcing threads to separate
         processors, timings, security (a `root' only CPU?) and
         probably more.


    It is focused around the syscall sysmp().  This function takes a
    number of parameters that determine which function is requested.
    Functions include:

    o  binding a process/thread to a specific CPU

    o  restricting a CPU's ability to execute some processes

    o  restricting a CPU from running at all

    o  forcing a cpu to run _only_ one process (and its children)

    o  getting information about a CPU's state

    o  creating/destroying sets of processors, to which processes may
       be bound



 8. Where should one report SMP bugs to?

    Please report bugs to [email protected].


 9. What about SMP performance?

    If you want to gauge the performance of your SMP system, you can
    run some tests made by Cameron MacKinnon and available at
    http://www.phy.duke.edu/brahma/benchmarks.smp.


 2.2.  User Side


 1. Do I really need SMP?

    If you have to ask, you probably don't. :) Generally, multi-
    processor systems can provide better performance than uni-processor
    systems, but to realize any gains you need to consider many other
    factors besides the number of CPUs.  For instance, on a given
    system, if the processor is generally idle much of the time due to
    a slow disk drive, then this system is "input/output bound", and
    probably won't benefit from additional processing power. If, on the
    other hand, a system has many simultaneously executing processes,
    and CPU utilization is very high, then you are likely to realize
    increased system performance.  SCSI disk drives can be very
    effective when used with multiple processors, due to the way they
    can process multiple commands without tying up the CPU. (C.
    Polisher)


 2. Do I get the same performance from two 300 MHz processors as from
    one 600 MHz processor?

    This depends on the application, but most likely not.  SMP adds
    some overhead that a faster uniprocessor box would not incur (Wade
    Hampton).  :)


 3. How does one display multiple CPU performance?

    Thanks to Samuel S. Chessman, here are some useful utilities:

    Character based:
       http://www.cs.inf.ethz.ch/~rauch/procps.html

       Basically, it's procps v1.12.2 (top, ps, et al.) and some
       patches to support SMP.

       For 2.2.x, Gregory R. Warnes has made a patch available at
       http://queenbee.fhcrc.org/~warnes/procps


    Graphic:
       xosview-1.5.1 supports SMP, and kernels 2.1.85 and later
       provide the cpuX entries in the /proc/stat file.

       The official homepage for xosview is:
       http://lore.ece.utexas.edu/~bgrayson/xosview.html

       You'll find a version patched for 2.2.x kernels by Kumsup Lee:
       http://www.ima.umn.edu/~klee/linux/xosview-1.6.1-5a1.tgz

       Forissier's various kernel patches are at:
       http://www-isia.cma.fr/~forissie/smp_kernel_patch/

    By the way, you can't monitor processor scheduling precisely with
    xosview, as xosview itself causes a scheduling perturbation. (H.
    Peter Anvin)
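
    For a quick look without any of these tools, the cpuX entries of
    /proc/stat can be read directly. The following is only a sketch:
    per_cpu_busy is a hypothetical helper name, and it assumes that
    the first four numbers on each cpuX line are the user, nice,
    system and idle tick counters. The counters are cumulative since
    boot, so the figures are averages over the whole uptime rather
    than the current load.

```shell
# per_cpu_busy [FILE]: for every cpuN line of a /proc/stat-style file,
# print "cpuN <busy percentage>" computed from the cumulative counters
# (user + nice + system) against (user + nice + system + idle).
per_cpu_busy() {
    awk '/^cpu[0-9]+ / {
        busy = $2 + $3 + $4
        total = busy + $5
        if (total > 0) printf "%s %.1f\n", $1, 100 * busy / total
    }' "${1:-/proc/stat}"
}

# Show the figures for the running machine, if /proc/stat is readable.
if [ -r /proc/stat ]; then
    per_cpu_busy
fi
```

    Like xosview, such a script perturbs what it measures, though only
    briefly.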


 4. How can I enable more than 1 process for my kernel compile?

    use:



    ___________________________________________________________________
            # make [modules|zImage|bzImage] MAKE="make -jX"
            where X=max number of processes.
            WARNING: This won't work for "make dep".
    ___________________________________________________________________



 With a 2.2-series kernel, see also the file
 /usr/src/linux/Documentation/smp.txt for specific instructions.

 BTW, since running multiple compilers allows a machine with sufficient
 memory to use the otherwise wasted CPU time during I/O-caused delays,
 make MAKE="make -j 2" -j 2 actually helps even on uniprocessor boxes
 (from Ralf Bächle).
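
 One possible way to pick X is sketched below. It is not from the
 original text: jobs_for_make is a hypothetical helper, and "number of
 CPUs plus one" is an assumed rule of thumb, chosen only because it
 agrees with the -j 2 suggestion for uniprocessor boxes.

```shell
# jobs_for_make NCPU: suggest a value for make -j.
# Assumption: one job per CPU plus one extra to cover I/O waits.
jobs_for_make() {
    echo $(( $1 + 1 ))
}

# Count the CPUs; fall back to 1 if /proc/cpuinfo cannot be read.
ncpu=$(grep -c '^processor' /proc/cpuinfo 2>/dev/null) || ncpu=1

# Print the suggested command instead of running it.
echo "make bzImage MAKE=\"make -j$(jobs_for_make "$ncpu")\""
```

 The script only prints the command, so it can be tried safely outside
 a kernel source tree.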


 5. Why is the time given by the time command inaccurate?  (from Joel
    Marchand)

    In the 2.0 series, the result given by the time command is
    inaccurate. The sum user+system is right *but* the split between
    user and system time is wrong.

    More precisely: "The explanation is, that all time spent in
    processors other than the boot cpu is accounted as system time.  If
    you time a program, add the user time and the system time, then
    your timing will be almost right, except for also including the
    system time that is correctly accounted for" (Jakob Østergaard).

    This bug is corrected in 2.2 kernels.
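
    On an affected kernel, the sum can be computed mechanically. The
    following is only a sketch: cpu_total is a hypothetical helper,
    and it assumes timing output in the three-line "real/user/sys"
    form that the POSIX time -p option prints. The sample numbers are
    invented for illustration.

```shell
# cpu_total: read "real/user/sys" timing lines on standard input and
# print the sum of the user and sys values with two decimals.
cpu_total() {
    awk '$1 == "user" || $1 == "sys" { total += $2 }
         END { printf "%.2f\n", total }'
}

# Invented sample: 3.10 s of user time plus 6.50 s of system time.
printf 'real 10.02\nuser 3.10\nsys 6.50\n' | cpu_total   # prints 9.60
```

    With a time implementation that supports -p, the timing report of
    a real run (written to standard error) can be piped into cpu_total
    in the same way.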



 2.3.  SMP Programming

 Section by Jakob Østergaard.

 This section is intended to outline what works, and what doesn't when
 it comes to programming multi-threaded software for SMP Linux.


 2.3.1.  Parallelization methods


 1. POSIX Threads

 2. PVM / MPI Message Passing Libraries

 3. fork() -- Multiple processes

 Since both fork() and PVM/MPI processes usually do not share memory,
 but either communicate by means of IPC or a messaging API, they will
 not be described further in this section. They are not very specific
 to SMP, since they are used just as much - or more - on uniprocessor
 computers, and clusters thereof.


 Only POSIX Threads provide us with multiple threads sharing resources
 like - especially - memory. This is the thing that makes an SMP
 machine special, allowing many processors to share their memory. To
 use both (or more ;) processors of an SMP, use a kernel-thread
 library. A good library is LinuxThreads, a pthread library made by
 Xavier Leroy <http://pauillac.inria.fr/~xleroy/linuxthreads/> which is
 now integrated with glibc2 (aka libc6).  Newer Linux distributions
 include this library by default, hence you do not have to obtain a
 separate package to use kernel threads.

 There are implementations of threads (and POSIX threads) that are
 application-level, and do not take advantage of the kernel-threading.
 These thread packages keep the threading in a single process, hence do
 not take advantage of SMP.  However, they are good for many
 applications and tend to actually run faster than kernel-threads on
 single processor systems.

 Multi-threading has never been really popular in the UN*X world
 though. For some reason, applications requiring multiple processes or
 threads, have mostly been written using fork(). Therefore, when using
 the thread approach, one runs into problems of incompatible (not
 thread-ready) libraries, compilers, and debuggers.  GNU/Linux is no
 exception to this. Hopefully the next few sections will shed a little
 light on what is currently possible, and what is not.



 2.3.2.  The C Library

 Older C libraries are not thread-safe. It is very important that you
 use GNU LibC (glibc), also known as libc6. Earlier versions are, of
 course, possible to use, but they will probably cause you much more
 trouble than upgrading your system will :)

 If you want to use GDB to debug your programs, see below.


 2.3.3.  Languages, Compilers and debuggers

 There is a wealth of programming languages available for GNU/Linux,
 and many of them can be made to use threads one way or the other (some
 languages like Ada and Java even have threads as primitives in the
 language).

 This section will, however, currently only describe C and C++. If you
 have experience in SMP Programming with other languages, please
 enlighten us.

 GNU C and C++, as well as the EGCS C and C++ compilers work with the
 thread support from the standard C library (glibc). There are however
 a few issues:


 1. When compiling C or C++, use the -D_REENTRANT define in the
    compiler command line. This is necessary to make certain
    error-handling facilities, like the errno variable, work correctly
    with threads.

 2. When using C++, if two threads throw exceptions concurrently, the
    program will segfault.  The compiler does not generate thread-safe
    exception code.

    The workaround is to put a
    pthread_mutex_lock(&global_exception_lock) in the constructor(s) of
    every class you throw(), and to put the corresponding
    pthread_mutex_unlock(...) in the destructor.  It's ugly, but it
    works.  This solution was given by Markus Ferch.

 The GNU Debugger GDB, as of version 4.18, should handle threads
 correctly. Most Linux distributions offer a patched, thread-aware gdb.


 It is not necessary to patch glibc in any way just to make it work
 with threads. If you do not need to debug the software (this could be
 true for all machines that are not development workstations), there is
 no need to patch glibc.

 Note that core-dumps are of no use when using multiple threads.
 Somehow, the core dump is attached to one of the currently running
 threads, and not to the program as a whole. Therefore, whenever you
 are debugging anything, run it from the debugger.

 Hint: If you have a thread running haywire, like eating 100% CPU time,
 and you cannot seem to figure out why, here is a nice way to find out
 what's going on: Run the program straight from the shell, no GDB. Make
 the thread go haywire. Use top to get the PID of the process.  Run GDB
 like gdb program pid. This will make GDB attach itself to the process
 with the PID you specified, and stop the thread. Now you have a GDB
 session with the offending thread, and can use bt and the like to see
 what is happening.


 2.3.4.  Other libraries

 ElectricFence: This library is not thread safe. It should be possible,
 however, to make it work in SMP environments by inserting mutex locks
 in the ElectricFence code.



 2.3.5.  Other points about SMP Programming


 1. Where can I find more information about parallel programming?

    Look at the Linux Parallel Processing HOWTO
    <http://yara.ecn.purdue.edu/~pplinux/PPHOWTO/pphowto.html>

    Lots of useful information can be found at Parallel Processing
    using Linux <http://yara.ecn.purdue.edu/~pplinux/>

    Look also at the Linux Threads FAQ
    <http://linas.org/linux/threads-faq.html>


 2. Are there any threaded programs or libraries?

    Yes. For programs, you should look at: Multithreaded programs on
    linux <http://www.informatik.uni-bremen.de/~hollow/mthread.html> (I
    love hyperlinks, did you know that ? ;))

    As far as libraries are concerned, there are:


    OpenGL Mesa library
       Thanks to David Buccarelli, Andreas Schiffler and Emil Briggs,
       it exists in a multithreaded version (right now [1998-05-11],
       there is a working version that provides speedups of 5-30% on
       some OpenGL benchmarks). The multithreaded stuff is now included
       in the regular Mesa distribution as an experimental option.  For
       more information, look at the Mesa library
       <http://www.ssec.wisc.edu/~brianp/Mesa.html>


    BLAS
       Pentium Pro Optimized BLAS and FFTs for Intel Linux
       <http://www.cs.utk.edu/~ghenry/distrib/>

       Multithreaded BLAS routines are not available right now, but a
       dual proc library is planned for 1998-05-27, see Blas News
       <http://www.cs.utk.edu/~ghenry/distrib/blasnews> for details.


    The GIMP
       Emil Briggs, the same guy who is involved in multithreaded Mesa,
       is also working on multithreaded plugins for The GIMP. Look at
       http://nemo.physics.ncsu.edu/~briggs/gimp/index.html for more
       info.



 3.  x86 architecture specific questions


 3.1.  Why it doesn't work on my machine?


 1. Can I use my Cyrix/AMD/non-Intel CPU in SMP?

    Short answer: no.

    Long answer: Intel claims ownership of the APIC SMP scheme, and
    unless a company licenses it from Intel they may not use it. There
    are currently no companies that have done so.  (This of course can
    change in the future.)  FYI: both Cyrix and AMD support the
    non-proprietary OpenPIC SMP standard but currently there are no
    motherboards that use it.


 2. Why doesn't my old Compaq work?

    Put it into MP1.1/1.4 compliant mode.

    Check "Configure Hardware" -> "View / Edit details" -> "Advanced
    mode" (F7 I think) for a configuration option "APIC mode" and set
    this to "full Table mode". This is an official Compaq
    recommendation. (Daniel Roesen)

    (Adrian Portelli) To do this:

    a. Press F10 when the server boots to enter the System
       Configuration Utility

    b. Press Enter to dismiss the splash screen

    c. Immediately press CTRL+A

    d. A message will appear informing you that you are now in
       "Advanced Mode"

    e. Then select "Configure Hardware" -> "View / Edit details"

    f. You will then see the advanced settings (intermixed with the
       ordinary ones)

    g. Scroll down to "APIC Mode" and then select "Fully Mapped"

    h. Save changes and reboot



 3. Why doesn't my ALR work?

    From Robert Hyatt: ALR Revolution quad-6 seems quite safe, while
    some older Revolution quad machines without P6 processors seem
    "iffy"...


 4. Why does SMP go so slowly? or Why does one CPU show a very low
    bogomips value while the first one is normal?

    From Alan Cox: If one of your CPUs is reporting a very low
    bogomips value, the cache is not enabled on it. Your vendor
    probably provides a buggy BIOS. Get the patch to work around this
    or better yet send it back and buy a board from a competent
    supplier.

    A 2.0 kernel (> 2.0.36) contains the MTRR patch which should solve
    this problem (select option "Handle buggy SMP BIOSes with bad MTRR
    setup" in the "General setup" menu).

    I think buggy SMP BIOS handling is automatic in latest 2.2 kernels.


 5. I've heard IBM machines have problems


    Some IBM machines have the MP1.4 BIOS block in the EBDA, which is
    allowed but not supported by kernels before 2.2.

    There is an old 486SLC based IBM SMP box. Linux/SMP requires
    hardware FPU support.


 6. Is there any advantage of Intel MP 1.4 over 1.1 specification?

    Nope (according to Alan :) ), 1.4 is just a stricter version of the
    1.1 spec.


 7. Why does the clock drift so rapidly when I run Linux SMP?


    This is a known problem with IRQ handling and long kernel locks in
    the 2.0 series kernels.  Consider upgrading to a later 2.2 kernel.

    From Jakob Oestergaard: Or, consider running xntpd. That should
    keep your clock right on time.  (I think that I've heard that
    enabling RTC in the kernel also fixes the clock drift. It works for
    me! but I'm not sure whether that's general or I'm just being
    lucky)


    There are some kernel fixes in the later 2.2.x series that may fix
    this.



 8. Why are my CPUs numbered 0 and 2 instead of 0 and 1 (or some other
    odd numbering)?

    The CPU number is assigned by the motherboard manufacturer and
    doesn't mean anything.  Ignore it.



 9. My quad-Xeon system hangs as soon as it has decompressed the kernel

    (Doug Ledford) Try recompiling LILO with LARGE_EBDA support and
    then making sure to always use make bzImage when compiling the
    kernel.  That appears to have fixed the SMP boot hangs here on
    Intel multi-Xeon boards.  However, please note that this also
    appears to break LILO in that the root= option no longer works, so
    make sure you rdev your kernel image at the same time you run lilo
    to make sure that the kernel loads the correct root filesystem at
    boot.

    (Robert M. Hyatt) With 3 cpus, do you have a terminator in the 4th
    slot?


 10.
    During boot, the machine hangs, signaling an IOAPIC problem

    Try boot options "noapic" (John Aldrich) and/or "reboot=bios"
    (Terry Shull).


 11.
    My system locks up during heavy NFS traffic

    Try the later 2.2.x kernels and the knfsd patches.  This is
    currently under investigation. (Wade Hampton)



 12.
    My system locks up with no oops messages

    If you are using kernels 2.2.11 or 2.2.12, get the latest kernel.
    Several people have reported these two kernels to be unstable for
    SMP, and they may also have NFS problems that can cause lockups.
    2.2.13, for example, has a number of SMP fixes.  Also, use a serial
    console to capture your oops messages. (Wade Hampton)

    If the problem remains (and the other suggestions on this list
    didn't help either), then you could try the latest 2.3 kernels.
    They have more verbose (and more robust) SMP/APIC code, and
    automatic hard-lockup-prevention code which will produce meaningful
    oopses instead of a silent hang. (Ingo Molnar)

    (Osamu Aoki) You MUST also disable all BIOS related power save
    features.  Example of good configuration (Dual Celeron 466 Abit
    BP6):

    ___________________________________________________________________
     POWER MANAGEMENT SETUP.
       ACPI:              Disabled
       POWER MANAGEMENT:  Disabled
       PM CONTROL by APM: No
    ___________________________________________________________________


 If power management features are activated, random freezes can
 occur.


 13.
    Debugging lockups

    (item by Wade Hampton)

    A good means of debugging lockups is to get the ikd patch from
    Andrea Arcangeli:
    ftp://ftp.suse.com/pub/people/andrea/kernel-patches

    There are several debug options, but do NOT use the soft lockup
    option!  For newer SMP boxes, turn on kernel debugging, then turn
    on the NMI oopser.  To verify that the NMI oopser is working, after
    booting the new kernel, run cat /proc/interrupts and verify that
    you are getting NMIs.  When the box locks up, you should get an
    OOPS.

    You may also try the %eip option.  This allows the kernel to print
    on the console the %eip address every time a kernel function is
    called.  When the box locks up, write down the first column ordered
    by the second column, then look up the addresses in the System.map
    file.  This works only in console mode.

    Also note that the use of a serial console can greatly facilitate
    debugging kernel lockups, not just SMP kernel lockups!
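
    The NMI check mentioned above can be scripted. The following is
    only a sketch: nmi_count is a hypothetical helper, and it assumes
    that /proc/interrupts carries a line starting with "NMI:" followed
    by one or more counters. The exact layout of that file differs
    between kernel versions.

```shell
# nmi_count [FILE]: print the sum of the numeric counters on the "NMI:"
# line of a /proc/interrupts-style file (0 if there is no such line).
nmi_count() {
    awk '$1 == "NMI:" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^[0-9]+$/) sum += $i
    }
    END { print sum + 0 }' "${1:-/proc/interrupts}"
}

# Show the current total, if /proc/interrupts is readable.
if [ -r /proc/interrupts ]; then
    echo "NMIs so far: $(nmi_count)"
fi
```

    Running it twice a few seconds apart should show a growing number
    if NMIs are being delivered.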


 14.
    "APIC error interrupt on CPU#n, should never happen" messages in
    logs

    A message like:

    ___________________________________________________________________
    APIC error interrupt on CPU#0, should never happen.
    ... APIC ESR0: 00000002
    ... APIC ESR1: 00000000
    ___________________________________________________________________


 indicates a 'receive checksum error'. This cannot be caused by Linux
 as the APIC message checksumming part is completely in hardware. It
 might be marginal hardware. As long as you don't see any instability,
 these messages are not a problem - APIC messages are retried until
 delivered.  (Ingo Molnar)



 3.2.  Possible causes of crash

 In this section you'll find some possible reasons for a crash of an
 SMP machine (credits are due to Jakob Østergaard for this part). As
 far as I (David) know, these problems are Intel specific.



 o  Cooling problems

    From Ralf Bächle: [Related to case size and fans] It's important
    that the air is flowing.  It of course can't where cables etc. are
    preventing this, as in cases that are too small.  On the other
    hand, I've seen oversized cases causing big problems.  There are
    some tower cases on the market that actually are worse for cooling
    than desktops.  In short, the right thing is thinking about
    aerodynamics in the case.  Extra cases for hot peripherals are
    useful as well.

    Of course you can always go to Radio Shack (or similar) and get
    another fan.  You can use the lm_sensors to monitor the CPU
    temperature of newer PII and PIII processors.  This might help you
    to determine if heat is a problem. (Wade Hampton)



 o  Bad memory

    Don't buy cheap RAM and don't use mixed RAM modules on a
    motherboard that is picky about it.

    Especially Tyan motherboards are known to be picky about RAM speed
    (see the Tyan paragraph below for a possible solution).


    There have been some reports of 10ns PC100 RAM being sold with
    motherboards where the CPU really needs 8ns RAM. (Wade Hampton)



 o  Bad combination of different stepping CPUs

    Check /proc/cpuinfo to see that your CPUs are the same stepping.
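
    This comparison can be automated. The following is only a sketch:
    same_stepping is a hypothetical helper, and it assumes that each
    CPU entry in /proc/cpuinfo has a line of the form "stepping : N".

```shell
# same_stepping [FILE]: succeed (exit status 0) when every "stepping"
# line of a cpuinfo-style file carries the same value, fail otherwise.
same_stepping() {
    awk -F: '/^stepping/ { gsub(/[ \t]/, "", $2); seen[$2] = 1 }
             END { n = 0; for (s in seen) n++; exit (n > 1) }' \
        "${1:-/proc/cpuinfo}"
}

# Report on the running machine, if /proc/cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
    if same_stepping; then
        echo "all CPUs report the same stepping"
    else
        echo "CPUs report different steppings"
    fi
fi
```

    A file with no stepping lines at all also counts as matching, so a
    success here is worth confirming by reading the file.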


 o  If your system is unstable, then DON'T overclock it!

    ...and even if it is stable, DON'T overclock.

    From Ralf Bächle: Overclocking causes very subtle problems.  I have
    a nice example: one of my overclocked old machines miscomputes a
    couple of pixels of a 640 x 400 fractal.  The problem is only
    visible when comparing them using tools. So better say never,
    nunca, jamais, niemals overclock.



 o  2.0.x kernel and fast ethernet (from Robert G. Brown)

    2.0.x kernels on high performance fast ethernet systems have
    significant (and known) problems with a race/deadlock condition in
    the networking interrupt handler.

    The solution is to get the latest 100BT development drivers from
    CESDIS Linux Ethernet device drivers site
    <http://cesdis.gsfc.nasa.gov/linux/drivers/> (ones that define
    SMPCHECK).


 o  A bug in the 440FX chipset (from Emil Briggs)

    If you had a system using the 440FX chipset then your problem with
    the lockups was possibly due to a documented erratum in the
    chipset.  Here is a reference:

    References: Intel 440FX PCIset 82441FX (PMC) and 82442FX (DBX)
    Specification Update.  pg. 13

    http://www.intel.com/design/pcisets/specupdt/297654.htm

    The problem can be fixed with a BIOS workaround (or a kernel patch)
    and in fact David Wragg wrote a patch that's included with Richard
    Gooch's MTRR patch. For more information and a fix look here:

    http://nemo.physics.ncsu.edu/~briggs/vfix.html


 o  DON'T run emm386.exe before booting Linux SMP

    From Mark Duguid, dumb rule #1 with W6LI motherboards. ;)


 o  If the machine reboots/freezes after a while, there can be two good
    BIOS + memory related reasons (from Jakob Østergaard):

    o  If the BIOS has settings like "memory hole at 16M" and/or "OS/2
       memory > 64MB", try disabling them both. Linux does not always
       react well to these options.

    o  If you have more than 64 MB of memory in the machine, and you
       specified the exact number manually in the LILO configuration,
       you should specify one MB less than you actually have in the
       machine.  If you have 128 MB, your lilo.conf line looks like:
       append="mem=127M"
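
    The "one MB less" rule is simple arithmetic. The following is only
    a sketch: lilo_mem_line is a hypothetical helper that takes the
    installed memory in MB and prints the matching lilo.conf line.

```shell
# lilo_mem_line TOTAL_MB: print a lilo.conf append line declaring one
# megabyte less than the installed memory.
lilo_mem_line() {
    printf 'append="mem=%dM"\n' "$(( $1 - 1 ))"
}

# The 128 MB case from the text.
lilo_mem_line 128   # prints append="mem=127M"
```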



 o  Be aware of IRQ related problems

    Sometimes, some cards are not recognized or can trigger IRQ
    conflicts. Try shuffling cards on slots in different ways and
    possibly moving them to different IRQs.

    Contributed by hASCII: removing an append="hisax=9,2,3" line in
    lilo.conf allowed using a kernel from the 2.1.xx series with
    activated ISDN + Hisax support. Kernels from the 2.0.xx series
    do not cause problems like this.

    Try also to set BIOS setup options like "MP 1.4 mode" or "route
    PCI interrupts through IOAPIC", or to set "OS Type" to something
    other than DOS or Novell (Ingo Molnar).



 o  Floppy access while sound is active

    If the machine locks up when you access the floppy (for example
    while sound is playing), you may have to edit drivers/pci/quirks.c
    and set "int isa_dma_bridge_buggy = 1;". This is a problem with my
    Dell WS400 dual PII/300, 2.2.x, SMP (Wade Hampton).
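
 The LILO memory rule given in the list above (with more than 64 MB,
 declare one MB less than is installed) comes down to simple
 arithmetic. The following Python sketch is only an illustration: the
 function name lilo_mem_append is hypothetical, and returning None for
 machines with 64 MB or less is an assumption, since the rule above
 says nothing about that case.

```python
def lilo_mem_append(installed_mb):
    """Build the lilo.conf "append" line for a machine with installed_mb MB.

    Hypothetical helper: it only encodes the "one MB less" rule quoted
    in the text, nothing that LILO itself provides.
    """
    if installed_mb <= 64:
        # Assumption: the rule is stated only for more than 64 MB,
        # so no mem= override is suggested here.
        return None
    return 'append="mem=%dM"' % (installed_mb - 1)


if __name__ == "__main__":
    # The 128 MB example from the text.
    print(lilo_mem_append(128))
```

 For the 128 MB machine used as the example in the text, the function
 returns the same append line as the one shown there.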



 3.3.  Motherboard specific information

 Please note: more specific information can be found in the list of
 Motherboards rumored to run Linux SMP <http://www.nlug.org/smp/>.


 3.3.1.  Motherboards with known problems


 o  none right now



 3.4.  Low cost SMP Linux box (dual Celeron box)

 (Stéphane Écolivet)


 The lowest cost SMP Linux boxes that can be built with currently
 available processors are dual Celeron systems. According to Intel,
 such a system is officially not possible. Prefer the second generation
 of Celeron, those with 128 KB of L2 cache.



 3.4.1.  Is it possible to run a dual Intel Celeron box ?

 Official answer from Intel: no, Celeron cannot work in SMP mode.

 Practical answer: it is possible, but requires hardware alteration for
 Slot 1 processors.  The alteration is described by Tomohiro Kawada on
 his Dual Celeron System
 <http://kikumaru.w-w.ne.jp/pc/celeron/index_e.html> page.  Of course,
 this kind of modification voids the warranty.  Some versions of the
 Celeron processor are also available in Socket 370 format.  In that
 case, the alteration may be done on the Socket 370 to Slot 1 adapter
 alone, and some adapters are even sold pre-wired for SMP use. (Andy
 Poling, Hans - Erik Skyttberg, James Beard)

 There is also a motherboard (ABIT BP6) that accepts two Celerons in
 Socket 370 format (Martijn Kruithof, Ryan McCue). The ABIT BP6 has
 been verified and tested to work natively under Linux with dual PPGA
 Socket 370 processors (Andre Hedrick).


 3.4.2.  How does Linux behave on a dual Celeron system ?

 Fine, thank you.


 3.4.3.  Celeron processors are known to be easily overclockable. And
 dual Celeron system ?

 It may work. However, overclocking this kind of system is not as easy
 as overclocking a mono-processor one. It is definitely not a good idea
 for a production system. For personal use, dual Celeron 300A systems
 running rock-solid at 450 MHz have been reported. (numerous people)


 3.4.4.  And making a quad Celeron system ?

 It is impossible. Celeron processors have nearly the same features as
 basic Pentium II chips.  If you want more than 2 processors in your
 system, you'll have to look at Pentium Pro, Pentium II Xeon or Pentium
 III (?) boxes.



 3.4.5.  What about mixing Celeron and Pentium II processor ?

 A system using a "re-enabled" Celeron processor and a Pentium II
 processor with the same steppings may theoretically work.

 Alexandre Charbey has made such a system:

 o  Asus P2B-D motherboard, proc 1: Celeron 366, proc 2: Pentium II
    400@266

 o  66 MHz and 75 MHz bus frequencies were functional

 o  the fastest processor (in this case the Celeron) should be put in
    the second slot. Swapping processors (fastest first) leads to quick
    failure.


 4.  Sparc architecture specific questions



 4.1.  Which Sparc machines are supported ?

 Quoting the UltraLinux <http://ultra.linux.cz/> web page (only SMP
 systems):

 o  UltraSPARC PCI based workstations: Ultra60, Ultra450

 o  UltraSPARC SBUS based servers: Enterprise 1, 2, 150

 o  UltraSPARC SBUS based large servers: Enterprise 3000, 4000, 5000,
    6000, 10000

 o  UltraSPARC PCI based servers: Enterprise 250, 450

 o  SPARC sun4m SMP machines (Anton Blanchard)

 UltraLinux has run on a 14-CPU machine (see the dmesg output
 <http://lwn.net/1998/1210/a/dm-sparc.html>).


 4.2.  Specific problem related to Sparc SMP support

 (David Miller) There should not be any worries.

 The only known problem, and one we don't intend to fix, is that if you
 build an SMP kernel for 32-bit (i.e. non-UltraSPARC) systems, this
 kernel will not work on sun4c systems.


 4.3.  SMP specific limit with current kernel (2.2)

 (David Miller) There is a bug in the include/linux/tasks.h header
 file: it needs to define NR_CPUS as 64 on UltraSPARC, as this is the
 upper limit for the hardware we support :-)


 5.  PowerPC architecture specific questions



 5.1.  Which PPC machines are supported ?


 o  PowerSurge boards (including UMAX s900)

 o  PowerMac

 o  Motorola MTX: support under development. Patches not yet
    integrated into the main kernel (Troy Benjegerdes)

 (Cort Dougan) Not supported: PPC RS/6000 systems



 5.2.  Specific problem related to PPC SMP support

 Nothing specific. Use the usual SMP compilation procedure (see above).
 As usual, be aware that modules are specific to either UP or SMP
 kernels; recompile them. (Paul Mackerras)


 6.  Alpha architecture specific questions



 6.1.  Which Alpha machines are supported ?

 (Geerten Kuiper) SMP works for most, if not all, AXP servers.

 (Jay A Estabrook) SMP does seem to work on most of our [Compaq] boxes
 with 2 or more CPUs. That includes :

 o  AS2000/2100 (SABLE)

 o  AS4000/4100 (RAWHIDE)

 o  DS20 (DP264)

 It does not include :

 o  AS2100A (LYNX)

 o  TurboLaser bigboys (8200/8400)


 6.2.  Specific problem related to Alpha SMP support

 None (really ? :-)


 7.  Useful pointers


 7.1.  Various


 o  Parallel Processing using Linux
    <http://yara.ecn.purdue.edu/~pplinux/>

 o  Linux Parallel Processing HOWTO
    <http://yara.ecn.purdue.edu/~pplinux/PPHOWTO/pphowto.html>

 o  (outdated) Linux SMP home page
    <http://www.uk.linux.org/SMP/title.html>

 o  linux-smp mailing list

    To subscribe, send subscribe linux-smp in the message body to
    [email protected]

    To unsubscribe, send unsubscribe linux-smp in the message body to
    [email protected]

    Linux SMP archives <http://www.linuxhq.com/lnxlists/linux-smp/>

    Linux SMP archives at progressive-comp.com
    <http://www.progressive-comp.com/Lists/?l=linux-smp&r=1&w=2#linux-smp>



 o  pthread library made by Xavier Leroy
    <http://pauillac.inria.fr/~xleroy/linuxthreads/>

 o  Motherboards rumored to run Linux SMP <http://www.nlug.org/smp/>

 o  procps <http://www.cs.inf.ethz.ch/~rauch/procps.html>

 o  procps patch for 2.2.x <http://queenbee.fhcrc.org/~warnes/procps>

 o  xosview <http://lore.ece.utexas.edu/~bgrayson/xosview.html>

 o  xosview for 2.2.x
    <http://www.ima.umn.edu/~klee/linux/xosview-1.6.1-5a1.tgz>

 o  SMP Performance of Linux
    <http://www.phy.duke.edu/brahma/benchmarks.smp>

 o  CESDIS Linux Ethernet device drivers site
    <http://cesdis.gsfc.nasa.gov/linux/drivers/>

 o  Dual Celeron System
    <http://kikumaru.w-w.ne.jp/pc/celeron/index_e.html>



 7.2.  Multithreaded programs and library


 o  Linux Threads FAQ <http://linas.org/linux/threads-faq.html>

 o  Multithreaded programs on linux
    <http://www.informatik.uni-bremen.de/~hollow/mthread.html>

 o  Pentium Pro Optimized BLAS and FFTs for Intel Linux
    <http://www.cs.utk.edu/~ghenry/distrib/> (not available right now,
    but a dual proc library is planned for 5/27/98, see Blas News
    <http://www.cs.utk.edu/~ghenry/distrib/blasnews> for details)

 o  Mesa library <http://www.ssec.wisc.edu/~brianp/Mesa.html> (with
    experimental multi-threading)

 o  Parallel plugins for The GIMP
    <http://nemo.physics.ncsu.edu/~briggs/gimp/index.html>



 7.3.  SMP specific patches


 o  Forissier kernel patches
    <http://www-isia.cma.fr/~forissie/smp_kernel_patch/>

 o  Patch for a bug in the 440FX chipset
    <http://nemo.physics.ncsu.edu/~briggs/vfix.html>

 o  MTRR patch (latest version: 1.9)
    <http://www.atnf.csiro.au/~rgooch/kernel-patches.html>

 o  PSET - Processor Sets for the Linux kernel
    <http://isunix.it.ilstu.edu/~thockin/pset/>

 o  Ingo Molnar SMP patches <http://www.redhat.com/~mingo/> (for
    experts only, please read [email protected])



 7.4.  Parallelizing/Optimizing Compilers for 586/686 machines
 (Sumit Roy)


 o  Pentium Compiler Group <http://www.goof.com/pcg/> creators of pgcc

 o  Absoft <http://www.absoft.com/> , Fortran 90 and Fortran 77
    compilers

 o  The Portland Group, Inc. <http://www.pgroup.com/>, supports the
    OpenMP <http://www.openmp.org> standard for Fortran parallelization
    on Linux

 o  Pacific-Sierra Research Corporation <http://www.psrv.com/>, has a
    free F90 compiler for Linux, as well as parallelizing compilers for
    SMP Linux

 o  Applied Parallel Research <http://s006.infomall.org/index.html>,
    currently have parallelizing compilers for WinNT

 o  KAI <http://www.kai.com> has a C++ compiler for Linux that
    understands OpenMP. It is called Guide_OpenMP. More information at
    http://www.kai.com/parallel/kappro/guide. (Gero Wedemann)



 8.  Glossary


 o  SMP Symmetric Multi-Processors

 o  APIC Advanced Programmable Interrupt Controller

 o  thread A thread is a processor activity in a process. The same
    process can have multiple threads. Those threads share the process
    address space and can therefore share data.

 o  pthread POSIX thread, threads defined by the POSIX standard.

 o  APM Advanced Power Management
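
 As a small illustration of the "thread" entry above, the sketch below
 starts several threads inside one process and lets them update the
 same counter, which is possible because they share the process address
 space. It is written in Python only for brevity; the names
 count_in_threads and worker are made up for this example. Note that it
 shows data sharing, not an SMP speed-up: in the standard CPython
 interpreter a global lock usually keeps threads from running Python
 code on several processors at once.

```python
import threading


def count_in_threads(num_threads, increments):
    """Let num_threads threads each add `increments` to one shared counter."""
    counter = {"value": 0}      # lives in the address space shared by all threads
    lock = threading.Lock()     # serializes updates to the shared counter

    def worker():
        for _ in range(increments):
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                # wait until every thread has finished
    return counter["value"]


if __name__ == "__main__":
    print(count_in_threads(4, 1000))
```

 Because every update is made under the lock, the result is always
 num_threads * increments.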



 9.  What's new ?



    v1.9, 13 january 2000

    o  Remember to disable all BIOS power-save features (Osamu Aoki)

    o  Explain how to access the advanced configuration mode of Compaq
       servers (Adrian Portelli)


    v1.8, 8 november 1999

    o  quad-celeron motherboard was a hoax, restored old paragraph
       (Simen Timian Thoresen)


    v1.7, 6 november 1999

    o  new introduction (C. Polisher aka cp)

    o  numerous typo and grammatical fixes (cp)

    o  introductory paragraph on kernel compilation (cp)

    o  introductory paragraph on SMP need (cp)

    o  reference on KAI optimizing compiler (Gero Wedemann)

    o  quad-celeron motherboard exists (Jeffrey H. Ingber)

    v1.6, 21 october 1999

    o  added information on xosview scheduling perturbation

    o  added "APIC error interrupt on CPU#n" message information

    o  added information on hard lockup

    o  deleted section "How to obtain maximum performance" (was
       obsolete)

    o  added info on dual systems with different x86 procs (a Celeron
       and a P-II)


    v1.5, 4 october 1999

    o  more precision in PSET description


    v1.4, 30 september 1999

    o  specify that MTRR support must be enabled for an x86 SMP kernel
       (me)


    v1.3, 29 september 1999

    o  many many grammar and typographical fixes (Wade Hampton aka hww)

    o  added info in short introduction related to 2.2/2.4/2.0 diffs
       (hww)

    o  added step by step things to do to recompile a kernel (hww and
       me)

    o  added info related to SMP/UP modules problems  (hww)

    o  added precision in Posix Threads section related to user vs.
       kernel threads (hww)

    o  new item about NFS and kernel lock  (hww)

    o  new item about kernel lock without message  (hww)

    o  new item about debugging lockup problems  (hww)

    o  added info about heating problems  (hww)

    o  miscellaneous updates I've forgotten about (hww)

    o  new item about floppy access and sound  (hww)


    v1.2, 27 september 1999

    o  name change: this document is now a HOWTO. TWD, and fast!
       (Guylhem Aznar)


    v1.1, 26 september 1999

    o  added a link to first Chris Pirih FAQ draft

    o  expanded on IRQ related problems


    v1.00, 25 september 1999

    o  first upgrade in a long long time!

    o  reprocessed the whole FAQ: 2.2 is here and 2.4 soon

    o  added kernel locking information from Ingo Molnar

    o  deleted item "How will my application perform under SMP?":
       outdated

    o  deleted item "My SMP system is locking up all the time.":
       outdated

    o  deleted item "You are running 2.0.35 aren't you ?": outdated

    o  deleted item "Some hardware is also known to cause problems.":
       outdated

    o  blanked section "Motherboards with known problems". We should
       restart from scratch

    o  deleted section "Motherboards with NO known problems": outdated

    o  updated dual celeron section (numerous people)

    o  added "SPARC sun4m SMP machines" to supported SMP sparc machines
       (Anton Blanchard)

    o  added a "During boot machine hang signaling an IOAPIC problem"
       item in "Why it doesn't work on my machine?" section

    o  added a "What about SMP performances?" item

    o  updated "Why doesn't my old Compaq work?" item

    o  fixed an outdated pointer

    o  added a pointer to Ingo test SMP patches


    v0.54, 13 march 1999

    o  Added a section about SMP Alpha systems


    v0.53, 08 march 1999

    o  Added a section about SMP PowerPC systems


    v0.52, 07 march 1999

    o  Added a section about SMP Sparc systems


    v0.51, 06 march 1999

    o  Added a dual-celeron section

    o  Deleted Adaptec section

    o  Updated procps link

    o  Updated xosview link

    o  Added an answer for quad Xeon boot hang

    o  Updated item about glibc patch for gd: should be included in RH
       5.2


    v0.50, 03 february 1999

    o  Updated "Multithreaded programs on linux" link


    v0.49, 13 january 1999

    o  Update about CONFIG_SMP. Added .txt to Documentation/smp.
       (Michael Elizabeth Chastain)


    v0.48, 10 december 1998

    o  Misspelling corrected. Email address corrected.


    v0.47, 20 november 1998

    o  Added that 2.0.36 has the MTRR patch (related to the BogoMips
       problem)


    v0.46, 10 november 1998

    o  Update about Epox KP6-LS motherboards


    v0.45, 25 october 1998

    o  Corrected an error regarding /proc/stat file

    o  Added a pointer to CESDIS Ethernet Linux Drivers site


    v0.44, 14 october 1998

    o  Updated the link to the web page: Motherboards rumored to run
       Linux SMP

    o  Added Jakob explanation how to time SMP systems with 2.0 kernels


    v0.43, 9 september 1998

    o  Updated first question in section 3.1

    o  Updated mt-Mesa link: multi-threaded is now included as
       experimental in the Mesa distribution


    v0.42, 2 september 1998

    o  Minor cosmetic update in sect 3.3

    o  Two links (multithreaded Mesa and SMP performance) marked
       outdated

    o  Updated the item about threads and exceptions in C++ (sect 3.3)


    v0.41, 1 september 1998

    o  Added a major section: "3.3 SMP Programming" written by Jakob
       Østergaard

    o  moved some items of section "3.2 User side" to sect 3.3


    v0.40, 27 august 1998

    o  Updated section 3.1, item 7: processor affinity


    v0.39, 27 august 1998

    o  Updated needed Award BIOS version for Tyan motherboards (hASCII)

    o  Added an item on IRQ in the crash section (me and hASCII)

    o  Added good support of Asus P2B-DS (Ulf Rompe)

    o  Added another smp-list archive in pointer section (Hank
       Leininger)


    v0.38, 8 august 1998

    o  Added a pointer to the Linux Threads FAQ


    v0.37, 30 July 1998

    o  Emil Briggs is working on parallel plugins for Gimp (see "Is
       there any threaded programs or library?", sect. "User side")


    v0.36, 26 July 1998

    o  Thanks to Jakob Østergaard, two changes in "Possible causes of
       Crash"

    o  Changed 2.0.33 to 2.0.35 (latest stable)

    o  Added a "BIOS related causes of failure"


    v0.35, 14 July 1998

    o  Added N440BX Server Board in Motherboards with NO problems

    o  Added a success story for GigaByte motherboard with BIOS upgrade

    o  Added a "How to obtain maximum performance ?" section (waiting
       for your contributions ;)


    v0.34, 10 june 1998

    o  Added a "Parallelizing/Optimizing Compilers for 586/686
       machines" section in section "Useful Pointers", thanks to Sumit
       Roy

    o  Corrected a misspelling, "Asus P/I-UP5" is in fact "Asus P/I-
       P65UP5"


    v0.33, 3 june 1998

    o  Yet another success story for a GigaByte DLX Motherboard.

    o  A tip for Tyan motherboards, disable the "DRAM Fast Leadoff"
       BIOS option


    v0.32, 27 may 1998

    o  Asus P/I-UP5 added in the motherboard-with-NO-problem section


    v0.31, 18 may 1998

    o  Elitegroup P6LX2-A works with 2.1.100 and 101

    o  Bugs should be reported to [email protected]


    v0.30, 12 may 1998

    o  SuperMicro is now in the motherboard-with-NO-problem section


    v0.29, 11 may 1998

    o  A success story for a GigaByte 686 motherboard with 2.1.101

    o  Added a new item in the "User Side" section: "Is there any
       threaded programs or library?"

    o  OpenGL Mesa library is being multithreaded. Cool! See the new
       section for details.


    v0.28, 09 may 1998

    o  A US mirror of this FAQ is now available (see Introduction)

    o  Merge of the two confusing Gigabyte 686 entries


    v0.27, 05 may 1998

    o  New info for the Adaptec and TekRam drivers

    o  Micronics W6-LI motherboard works under SMP



 10.  List of contributors

 Many thanks to those who help me to maintain this HOWTO:


 1. Tigran A. Aivazian

 2. John Aldrich

 3. Niels Ammerlaan

 4. H. Peter Anvin


 5. Osamu Aoki

 6. Guylhem Aznar

 7. Ralf Bächle

 8. James Beard

 9. Troy Benjegerdes

 10. Anton Blanchard

 11. Emil Briggs

 12. Robert G. Brown

 13. Alexandre Charbey

 14. Michael Elizabeth Chastain

 15. Samuel S. Chessman

 16. Alan Cox

 17. Andrew Crane

 18. Cort Dougan

 19. Mark Duguid

 20. Stéphane Écolivet

 21. Jocelyne Erhel

 22. Jay A Estabrook

 23. Byron Faber

 24. Mark Garlanger

 25. hASCII

 26. Wade Hampton

 27. Andre Hedrick

 28. Claus-Justus Heine

 29. Benedikt Heinen

 30. Florian Hinzmann

 31. Moni Hollmann

 32. Robert M. Hyatt

 33. Jeffrey H. Ingber

 34. Richard Jelinek

 35. Tony Kocurko

 36. Geerten Kuiper

 37. Martijn Kruithof

 38. Doug Ledford

 39. Kumsup Lee

 40. Hank Leininger

 41. Ryan McCue

 42. Paul Mackerras

 43. Cameron MacKinnon

 44. Joel Marchand

 45. David Maslen

 46. Chris Mauritz

 47. Jean-Francois Micouleau

 48. David Miller

 49. Ingo Molnar

 50. Ulf Nielsen

 51. Jakob Oestergaard

 52. C Polisher

 53. Adrian Portelli

 54. Matt Ranney

 55. Daniel Roesen

 56. Ulf Rompe

 57. Jean-Michel Rouet

 58. Volker Reichelt

 59. Sean Reifschneider

 60. Sumit Roy

 61. Thomas Schenk

 62. Terry Shull

 63. Chris K. Skinner

 64. Hans - Erik Skyttberg

 65. Szakacsits Szabolcs

 66. Jukka Tainio

 67. Simen Timian Thoresen

 68. El Warren

 69. Gregory R. Warnes

 70. Gero Wedemann

 71. Christopher Allen Wing

 72. Leonard N. Zubkoff