Antares-RAID-sparcLinux-HOWTO
Thom Coates (
[email protected]), Carl Munio, Jim Ludemann
v0.1, 28 April 2000
This document describes how to install, configure, and maintain a
hardware RAID built around the 5070 SBUS host based RAID controller by
Antares Microsystems. Other topics of discussion include RAID levels,
the 5070 controller GUI, and 5070 command line. A complete command
reference for the 5070's K9 kernel and Bourne-like shell is included.
______________________________________________________________________
Table of Contents
1. Preamble
2. Acknowledgements and Thanks
3. New Versions
4. Introduction
4.1 5070 Main Features
5. Background
5.1 Raid Levels
5.2 RAID Linear
5.2.0.0.1 SUMMARY
5.3 Level 1
5.3.0.0.1 SUMMARY
5.4 Striping
5.5 Level 0
5.5.0.0.1 SUMMARY:
5.6 Level 2 and 3
5.6.0.0.1 SUMMARY
5.7 Level 4
5.7.0.0.1 SUMMARY
5.8 Level 5
5.8.0.0.1 SUMMARY
6. Installation
6.1 SBUS Controller Compatibility
6.2 Hardware Installation Procedure
6.2.0.0.1 GNOME:
6.2.0.0.2 KDE:
6.2.0.0.3 XDM:
6.2.0.0.4 Console Login (systems without X windows):
6.2.0.0.5 All Systems:
6.2.0.0.6 SPARCstation 4, 5, 10, 20 & UltraSPARC Systems:
6.2.0.0.7 Ultra Enterprise Servers, SPARCserver 1000 & 2000 Systems, SPARCserver 6XO MP Series:
6.2.0.0.8 All Systems:
6.2.0.0.9 Verifying the Hardware Installation:
6.3 Serial Terminal
6.4 Hard Drive Plant
7. 5070 Onboard Configuration
7.1 Main Screen Options
7.1.0.0.1 <Figure 1: Main Screen>
7.2 [Q]uit
7.3 [R]aidSets:
7.3.0.0.1 <Figure 2: RAIDSet Configuration Screen>
7.4 [H]ostports:
7.4.0.0.1 <Figure 3: Host Port Configuration Screen>
7.5 [S]pares:
7.5.0.0.1 <Figure 4: Spare Device Configuration Screen>
7.6 [M]onitor:
7.6.0.0.1 <Figure 5: SCSI Monitor Screen>
7.7 [G]eneral:
7.7.0.0.1 <Figure 6: General Screen>
7.8 [P]robe
7.9 Example RAID Configuration Session
8. Linux Configuration
8.1 Existing Linux Installation
8.1.1 QLogic SCSI Driver
8.1.2 Device mappings
8.1.3 Partitioning
8.1.4 Installing a filesystem
8.1.5 Mounting
8.2 New Linux Installation
9. Maintenance
9.1 Activating a spare
9.2 Re-integrating a repaired drive into the RAID (levels 3 and 5)
10. Troubleshooting / Error Messages
10.1 Out of band temperature detected...
10.2 ... failed ... cannot have more than 1 faulty backend.
10.3 When booting I see: ... Sun disklabel: bad magic 0000 ... unknown partition table.
11. Bugs
12. Frequently Asked Questions
12.1 How do I reset/erase the onboard configuration?
12.2 How can I tell if a drive in my RAID has failed?
13. Advanced Topics: 5070 Command Reference
13.1 AUTOBOOT - script to automatically create all raid sets and scsi monitors
13.2 AUTOFAULT - script to automatically mark a backend faulty after a drive failure
13.3 AUTOREPAIR - script to automatically allocate a spare and reconstruct a raid set
13.4 BIND - combine elements of the namespace
13.5 BUZZER - get the state or turn on or off the buzzer
13.6 CACHE - display information about and delete cache ranges
13.7 CACHEDUMP - Dump the contents of the write cache to battery backed-up ram
13.8 CACHERESTORE - Load the cache with data from battery backed-up ram
13.9 CAT - concatenate files and print on the standard output
13.10 CMP - compare the contents of 2 files
13.11 CONS - console device for Husky
13.12 DD - copy a file (disk, etc)
13.13 DEVSCMP - Compare a file's size against a given value
13.14 DFORMAT - Perform formatting functions on a backend disk drive
13.15 DIAGS - script to run a diagnostic on a given device
13.16 DPART - edit a scsihd disk partition table
13.17 DUP - open file descriptor device
13.18 ECHO - display a line of text
13.19 ENV - environment variables file system
13.20 ENVIRON - RaidRunner Global environment variables - names and effects
13.21 EXEC - cause arguments to be executed in place of this shell
13.22 EXIT - exit a K9 process
13.23 EXPR - evaluation of numeric expressions
13.24 FALSE - returns the K9 false status
13.25 FIFO - bi-directional fifo buffer of fixed size
13.26 GET - select one value from list
13.27 GETIV - get the value an internal RaidRunner variable
13.28 HELP - print a list of commands and their synopses
13.29 HUSKY - shell for K9 kernel
13.30 HWCONF - print various hardware configuration details
13.31 HWMON - monitoring daemon for temperature, fans, PSUs.
13.32 INTERNALS - Internal variables used by RaidRunner to change dynamics of running kernel
13.33 KILL - send a signal to the nominated process
13.34 LED - turn on/off LEDs on RaidRunner
13.35 LFLASH - flash an LED on RaidRunner
13.36 LINE - copies one line of standard input to standard output
13.37 LLENGTH - return the number of elements in the given list
13.38 LOG - like zero with additional logging of accesses
13.39 LRANGE - extract a range of elements from the given list
13.40 LS - list the files in a directory
13.41 LSEARCH - find a pattern in a list
13.42 LSUBSTR - replace a character in all elements of a list
13.43 MEM - memory mapped file (system)
13.44 MDEBUG - exercise and display statistics about memory allocation
13.45 MKDIR - create directory (or directories)
13.46 MKDISKFS - script to create a disk filesystem
13.47 MKHOSTFS - script to create a host port filesystem
13.48 MKRAID - script to create a raid given a line of output of rconf
13.49 MKRAIDFS - script to create a raid filesystem
13.50 MKSMON - script to start the scsi monitor daemon smon
______________________________________________________________________
1. Preamble
Copyright 2000 by Thomas D. Coates, Jr. This document's source is
licensed under the terms of the GNU General Public License agreement.
Permission to use, copy, modify, and distribute this document without
fee for any purpose commercial or non-commercial is hereby granted,
provided that the authors' names and this notice appear in all copies
and/or supporting documents; and that the location where a freely
available unmodified version of this document may be obtained is
given. This document is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY, either expressed or implied. While
every effort has been taken to ensure the accuracy of the information
documented herein, the
author(s)/editor(s)/maintainer(s)/contributor(s) assume NO
RESPONSIBILITY for any errors, or for any damages, direct or
consequential, as a result of the use of the information documented
herein. A complete copy of the GNU General Public License agreement
may be obtained from: Free Software Foundation, Inc., 59 Temple Place
- Suite 330, Boston, MA 02111-1307, USA. Portions of this document
are adapted and/or re-printed from the 5070 installation guide and
man pages with permission of Antares Microsystems, Inc., Campbell CA.
2. Acknowledgements and Thanks
· Carl and Jim at Antares for the hardware, man pages, and other
support/contributions they provided during the writing of this
document.
· Penn State University - Hershey Medical Center, Department of
Radiology, Section of Clinical Image Management (my home away from
my home away from home).
· The software-raid-HOWTO, Copyright 1997 by Linas Vepstas, under the
GNU public license agreement. The software-raid-HOWTO is available
from:
http://www.linuxdoc.org
3. New Versions
· The most recent version of this document can be found at my
homepage:
http://www.xray.hmc.psu.edu/~tcoates/
· Other versions may be found in different formats at the LDP
homepage:
http://www.linuxdoc.org and mirror sites.
4. Introduction
The Antares 5070 is a high performance, versatile, yet relatively
inexpensive host based RAID controller. Its embedded operating system
(K9 kernel) is modelled on the Plan 9 operating system whose design is
discussed in several papers from AT&T (see the "Further Reading"
section). K9 is a kernel targeted at embedded controllers of small to
medium complexity (e.g. ISDN-ethernet bridges, RAID controllers, etc).
It supports multiple lightweight processes (i.e. without memory
management) on a single CPU with a non-pre-emptive scheduler. Device
driver architecture is based on Plan 9 (and Unix SVR4) streams.
Concurrency control mechanisms include semaphores and signals.
The 5070 has three single ended ultra 1 SCSI channels and two onboard
serial interfaces, one of which provides command line access via a
connected serial terminal or modem. The other is used to upgrade the
firmware. The command line is robust, implementing many of the
essential Unix commands (e.g. dd, ls, cat, etc.) and a scaled down
Bourne shell for scripting. The Unix command set is augmented with
RAID specific configuration commands and scripts. In addition to the
command line interface, an ASCII text based GUI is provided to permit
easy configuration of level 0, 1, 3, 4, and 5 RAIDs.
4.1. 5070 Main Features
· RAID levels 0, 1, 3, 4, and 5 are supported.
· Text based GUI for easy configuration of all supported RAID
levels.
· A Multidisk RAID volume appears as an individual SCSI drive to the
operating system and can be managed with the standard utilities
(fdisk, mkfs, fsck, etc.). RAID volumes may be assigned to
different SCSI IDs, or to the same SCSI ID but different LUNs.
· No special RAID drivers required for the host operating system.
· Multiple RAID volumes of different levels can be mixed among the
drives forming the physical plant. For example, in a hypothetical
drive plant consisting of 9 drives:
· 2 drives form a level 3 RAID assigned to SCSI ID 5, LUN 0
· 2 drives form a level 0 RAID assigned to SCSI ID 5, LUN 1
· 5 drives form a level 5 RAID assigned to SCSI ID 6, LUN 0
· Three single ended SCSI channels which can accommodate 6 drives
each (18 drives total).
· Two serial interfaces. The first permits
configuration/control/monitoring of the RAID from a local serial
terminal. The second serial port is used to upload new programming
into the 5070 (using PPP and TFTP).
· Robust Unix-like command line and NVRAM based file system.
· Configurable ASCII SCSI communication channel for passing commands
to the 5070's command line interpreter. Allows programs running on
the host OS to directly configure/control/monitor all parameters of
the 5070.
5. Background
Much of the information/knowledge pertaining to RAID levels in this
section is adapted from the software-raid-HOWTO by Linas Vepstas. See
the acknowledgements section for the URL where the full document may
be obtained.
RAID is an acronym for "Redundant Array of Inexpensive Disks" and is
used to create large, reliable disk storage systems out of individual
hard disk drives. There are two basic ways of implementing a RAID,
software or hardware. The main advantage of a software RAID is low
cost. However, since the OS of the host system must manage the RAID
directly there is a substantial penalty in performance. Furthermore if
the RAID is also the boot device, a drive failure could prove
disastrous since the operating system and utility software needed to
perform the recovery are located on the RAID. The primary advantages
of hardware RAID are performance and improved reliability. Since all RAID
operations are handled by a dedicated CPU on the controller, the host
system's CPU is never bothered with RAID related tasks. In fact the
host OS is completely oblivious to the fact that its SCSI drives are
really virtual RAID drives. When a drive fails on the 5070 it can be
replaced on-the-fly with a drive from the spares pool and its data
reconstructed without the host's OS ever knowing anything has
happened.
5.1. Raid Levels
The different RAID levels have different performance, redundancy,
storage capacity, reliability and cost characteristics. Most, but not
all levels of RAID offer redundancy against drive failure. There are
many different levels of RAID which have been defined by various
vendors and researchers. The following describes the first 7 RAID
levels in the context of the Antares 5070 hardware RAID
implementation.
5.2. RAID Linear
RAID-linear is a simple concatenation of drives to create a larger
virtual drive. It is handy if you have a number of small drives, and wish
to create a single, large drive. This concatenation offers no
redundancy, and in fact decreases the overall reliability: if any one
drive fails, the combined drive will fail.
5.2.0.0.1. SUMMARY
· Enables construction of a large virtual drive from a number of
smaller drives
· No protection, less reliable than a single drive
· RAID 0 is a better choice due to better I/O performance
5.3. Level 1
Also referred to as "mirroring". Two (or more) drives, all of the same
size, each store an exact copy of all data, disk-block by disk-block.
Mirroring gives strong protection against drive failure: if one drive
fails, there is another with an exact copy of the same data.
Mirroring can also help improve performance in I/O-laden systems, as
read requests can be divided up between several drives. Unfortunately,
mirroring is also one of the least efficient in terms of storage: two
mirrored drives can store no more data than a single drive.
5.3.0.0.1. SUMMARY
· Good read/write performance
· Inefficient use of storage space (half the total space available
for data)
· RAID 6 may be a better choice due to better I/O performance.
5.4. Striping
Striping is the underlying concept behind all of the other RAID
levels. A stripe is a contiguous sequence of disk blocks. A stripe
may be as short as a single disk block, or may consist of thousands.
The RAID drivers split up their component drives into stripes; the
different RAID levels differ in how they organize the stripes, and
what data they put in them. The interplay between the size of the
stripes, the typical size of files in the file system, and their
location on the drive is what determines the overall performance of
the RAID subsystem.
5.5. Level 0
Similar to RAID-linear, except that the component drives are divided
into stripes and then interleaved. Like RAID-linear, the result is a
single larger virtual drive. Also like RAID-linear, it offers no
redundancy, and therefore decreases overall reliability: a single
drive failure will knock out the whole thing. However, the 5070
hardware RAID 0 is the fastest of any of the schemes listed here.
5.5.0.0.1. SUMMARY:
· Use RAID 0 to combine smaller drives into one large virtual drive.
· Best Read/Write performance of all the schemes listed here.
· No protection from drive failure.
· ADVICE: Buy very reliable hard disk drives if you plan to use this
scheme.
5.6. Level 2 and 3
RAID-2 is seldom used anymore, and to some degree has been made
obsolete by modern hard disk technology. RAID-2 is similar to RAID-4,
but stores ECC information instead of parity. Since all modern disk
drives incorporate ECC under the covers, this offers little additional
protection. RAID-2 can offer greater data consistency if power is lost
during a write; however, battery backup and a clean shutdown can offer
the same benefits. RAID-3 is similar to RAID-4, except that it uses
the smallest possible stripe size.
5.6.0.0.1. SUMMARY
· RAID 2 is largely obsolete
· Use RAID 3 to combine separate drives together into one large
virtual drive.
· Protection against single drive failure.
· Good read/write performance.
5.7. Level 4
RAID-4 interleaves stripes like RAID-0, but it requires an additional
drive to store parity information. The parity is used to offer
redundancy: if any one of the drives fails, the data on the remaining
drives can be used to reconstruct the data that was on the failed
drive. Given N data disks, and one parity disk, the parity stripe is
computed by taking one stripe from each of the data disks, and XOR'ing
them together. Thus, the storage capacity of an (N+1)-disk RAID-4
array is N, which is a lot better than mirroring (N+1) drives, and is
almost as good as a RAID-0 setup for large N. Note that for N=1, where
there is one data disk, and one parity disk, RAID-4 is a lot like
mirroring, in that each of the two disks is a copy of each other.
However, RAID-4 does NOT offer the read-performance of mirroring, and
offers considerably degraded write performance. In brief, this is
because updating the parity requires a read of the old parity, before
the new parity can be calculated and written out. In an environment
with lots of writes, the parity disk can become a bottleneck, as each
write must access the parity disk.
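The parity arithmetic is easy to verify for yourself. The following
is a minimal shell sketch (illustrative byte values only, nothing
5070 specific) showing that a lost stripe can be rebuilt by XOR'ing
the parity with the surviving stripes:

#!/bin/bash
# Three data "stripes" reduced to single bytes for illustration.
d0=0x3a; d1=0x5c; d2=0x07
# The parity stripe is the XOR of all the data stripes.
parity=$(( d0 ^ d1 ^ d2 ))
# If the drive holding d1 fails, XOR the parity with the survivors.
recovered=$(( parity ^ d0 ^ d2 ))
printf 'parity=0x%02x recovered=0x%02x expected=0x%02x\n' \
    "$parity" "$recovered" "$d1"

Running this prints a recovered byte equal to the original d1, which
is exactly the reconstruction a RAID-4/5 controller performs for a
failed drive.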
5.7.0.0.1. SUMMARY
· Similar to RAID 0
· Protection against single drive failure.
· Poorer I/O performance than RAID 3
· Less of the combined storage space is available for data [than RAID
3] since an additional drive is needed for parity information.
5.8. Level 5
RAID-5 avoids the write-bottleneck of RAID-4 by alternately storing
the parity stripe on each of the drives. However, write performance is
still not as good as for mirroring, as the parity stripe must still be
read and XOR'ed before it is written. Read performance is also not as
good as it is for mirroring, as, after all, there is only one copy of
the data, not two or more. RAID-5's principle advantage over mirroring
is that it offers redundancy and protection against single-drive
failure, while offering far more storage capacity when used with three
or more drives.
5.8.0.0.1. SUMMARY
· Use RAID 5 if you need to make the best use of your available
storage space while gaining protection against single drive
failure.
· Slower I/O performance than RAID 3
6. Installation
NOTE: The installation procedure given here for the SBUS controller is
similar to that found in the manual. It has been modified so that
minor variations in the SPARCLinux installation may be included.
6.1. SBUS Controller Compatibility
The 5070 / Linux 2.2 combination was tested on SPARCstation (5, 10, &
20), Ultra 1, and Ultra 2 Creator. The 5070 was also tested on Linux
with Symmetrical Multiprocessing (SMP) support on a dual processor
Ultra 2 Creator 3D with no problems. Other 5070 / Linux / hardware
combinations may work as well.
6.2. Hardware Installation Procedure
If your system is already up and running, you must halt the operating
system.
6.2.0.0.1. GNOME:
1. From the login screen right click the "Options" button.
2. On the popup menu select System -> Halt.
3. Click "Yes" when the verification box appears
6.2.0.0.2. KDE:
1. From the login screen right click shutdown.
2. On the popup menu select shutdown by right clicking its radio
button.
3. Click OK
6.2.0.0.3. XDM:
1. login as root
2. Left click on the desktop to bring up the pop-up menu
3. select "New Shell"
4. When the shell opens type "halt" at the prompt and press return
6.2.0.0.4. Console Login (systems without X windows):
1. Login as root
2. Type "halt"
6.2.0.0.5. All Systems:
Wait for the message "power down" or "system halted" before
proceeding. Turn off your SPARCstation system (Note: Your system may
have turned itself off following the power down directive), its video
monitor, external disk expansion boxes, and any other peripherals
connected to the system. Be sure to check that the green power LED on
the front of the system enclosure is not lit and that the fans inside
the system are not running. Do not disconnect the system power cord.
6.2.0.0.6. SPARCstation 4, 5, 10, 20 & UltraSPARC Systems:
1. Remove the top cover on the CPU enclosure. On a SPARCstation 10,
this is done by loosening the captive screw at the top right corner
of the back of the CPU enclosure, then tilting the top of the
enclosure forward while using a Phillips screwdriver to press the
plastic tab on the top left corner.
2. Decide which SBUS slot you will use. Any slot will do. Remove the
filler panel for that slot by removing the two screws and
rectangular washers that hold it in.
3. Remove the SBUS retainer (commonly called the handle) by pressing
outward on one leg of the retainer while pulling it out of the hole
in the printed circuit board.
4. Insert the board into the SBUS slot you have chosen. To insert the
board, first engage the top of the 5070 RAIDium backpanel into the
backpanel of the CPU enclosure, then rotate the board into a level
position and mate the SBUS connectors. Make sure that the SBUS
connectors are completely engaged.
5. Snap the nylon board retainers inside the SPARCstation over the
5070 RAIDium board to secure it inside the system.
6. Secure the 5070 RAIDium SBUS backpanel to the system by replacing
the rectangular washers and screws that held the original filler
panel in place.
7. Replace the top cover by first mating the plastic hooks on the
front of the cover to the chassis, then rotating the cover down
over the unit until the plastic tab in back snaps into place.
Tighten the captive screw on the upper right corner.
6.2.0.0.7. Ultra Enterprise Servers, SPARCserver 1000 & 2000 Systems,
SPARCserver 6XO MP Series:
1. Remove the two Allen screws that secure the CPU board to the card
cage. These are located at each end of the CPU board backpanel.
2. Remove the CPU board from the enclosure and place it on a static-
free surface.
3. Decide which SBUS slot you will use. Any slot will do. Remove the
filler panel for that slot by removing the two screws and
rectangular washers that hold it in. Save these screws and washers.
4. Remove the SBUS retainer (commonly called the handle) by pressing
outward on one leg of the retainer while pulling it out of the hole
in the printed circuit board.
5. Insert the board into the SBUS slot you have chosen. To insert the
board, first engage the top of the 5070 RAIDium backpanel into the
backpanel of the CPU enclosure, then rotate the board into a level
position and mate the SBUS connectors. Make sure that the SBUS
connectors are completely engaged.
6. Secure the 5070 RAIDium board to the CPU board with the nylon
screws and standoffs provided on the CPU board. The standoffs may
have to be moved so that they match the holes used by the SBUS
retainer, as the standoffs are used in different holes for an MBus
module. Replace the screws and rectangular washers that originally
held the filler panel in place, securing the 5070 RAIDium SBus
backpanel to the system enclosure.
7. Re-insert the CPU board into the CPU enclosure and re-install the
Allen-head retaining screws that secure the CPU board.
6.2.0.0.8. All Systems:
1. Mate the external cable adapter box to the 5070 RAIDium and gently
tighten the two screws that extend through the cable adapter box.
2. Connect the three cables from your SCSI devices to the three 68-pin
SCSI-3 connectors on the Antares 5070 RAIDium. The three SCSI
cables must always be reconnected in the same order after a RAID
set has been established, so you should clearly mark the cables and
disk enclosures for future disassembly and reassembly.
3. Configure the attached SCSI devices to use SCSI target IDs other
than 7, as that is taken by the 5070 RAIDium itself. Configuring
the target number is done differently on various devices. Consult
the manufacturer's installation instructions to determine the
method appropriate for your device.
4. As you are likely to be installing multiple SCSI devices, make sure
that all SCSI buses are properly terminated. This means a
terminator is installed only at each end of each SCSI bus daisy
chain.
6.2.0.0.9. Verifying the Hardware Installation:
These steps are optional but recommended. First, power-on your system
and interrupt the booting process by pressing the "Stop" and "a" keys
(or the "break" key if you are on a serial terminal) simultaneously as
soon as the Solaris release number is shown on the screen. This will
force the system to run the Forth Monitor in the system EPROM, which
will display the "ok" prompt. This gives you access to many useful
low-level commands, including:
ok show-devs
. . .
/iommu@f,e0000000/sbus@f,e0001000/SUNW,isp@1,8800000
. . .
The first line in the response shown above means that the 5070 RAIDium
host adapter has been properly recognized. If you don't see a line
like this, you may have a hardware problem.
Next, to see a listing of all the SCSI devices in your system, you can
use the probe-scsi-all command, but first you must prepare your system
as follows:
ok setenv auto-boot? false
ok reset
ok probe-scsi-all
This will tell you the type, target number, and logical unit number of
every SCSI device recognized in your system. The 5070 RAIDium board
will report itself attached to an ISP controller at target 0 with two
Logical Unit Numbers (LUNs): 0 for the virtual hard disk drive, and 7
for the connection to the Graphical User Interface (GUI). Note: the
GUI communication channel on LUN 7 is currently unused under Linux.
See the discussion under "SCSI Monitor Daemon (SMON)" in the "Advanced
Topics" section for more information.
REQUIRED: Perform a reconfiguration boot of the operating system:
ok boot -r
If no image appears on your screen within a minute, you most likely
have a hardware installation problem. In this case, go back and check
each step of the installation procedure. This completes the hardware
installation procedure.
6.3. Serial Terminal
If you have a serial terminal at your disposal (e.g. DEC-VT420) it may
be connected to the controller's serial port using a 9 pin DIN male to
DB25 male serial cable. Otherwise you will need to supplement the
above cable with a null modem adapter to connect the RAID controller's
serial port to the serial port on either the host computer or a PC.
The terminal emulators I have successfully used include Minicom (on
Linux), Kermit (on Caldera's Dr. DOS), and Hyperterminal (on a windows
CE palmtop), however, any decent terminal emulation software should
work. The basic settings are 9600 baud, no parity, 8 data bits, and 1
stop bit.
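For example, to talk to the 5070 from a Linux host with Minicom
(a sketch; the device name /dev/ttyS0 is an assumption, and older
Minicom versions may need the port configured via "minicom -s"
rather than command line flags):

minicom -b 9600 -D /dev/ttyS0    # 9600 baud; 8N1 is Minicom's default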
6.4. Hard Drive Plant
Choosing the brand and capacity of the drives that will form the hard
drive physical plant is up to you. I do have some recommendations:
· Remember, you generally get what you pay for. I strongly recommend
paying the extra money for better (i.e. more reliable) hardware,
especially if you are setting up a RAID for a mission critical
project. For example, consider purchasing drive cabinets with
redundant hot-swappable power supplies, etc.
· You will also want a UPS for your host system and drive cabinets.
Remember, RAID levels 3 and 5 protect you from data loss due to
drive failure NOT power failure.
· The drive cabinet you select should have hot swappable drive bays;
these cost more but are definitely worth it when you need to
add/change drives.
· Make sure the cabinet(s) have adequate cooling when fully loaded
with drives.
· Keep your SCSI cables (internal and external) as short as possible.
· Mark the drives/cabinet(s) in such a way that you will be able to
reconnect them to the controller in their original configuration.
Once the RAID is configured you cannot re-organize your drives
without re-configuring the RAID (and subsequently erasing the data
stored on it).
· Keep in mind that although it is physically possible to
connect/configure up to 6 drives per channel, performance will
sharply decrease for RAIDs with more than three drives per channel.
This is due to the 25 MHz bandwidth limitation of the SBUS.
Therefore, if read/write performance is an issue, go with a small
number of large drives. If you need a really large RAID (~ 1
terabyte) then you will have no other choice but to load the
channels to capacity and pay the performance penalty. NOTE: if you
are serving files over a 10/100 Base T network you may not notice
the performance decrease since the network is usually the
bottleneck, not the SBUS.
7. 5070 Onboard Configuration
Before diving into the RAID configuration I need to define a few
terms.
� "RaidRunner" is the name given to the the 5070 controller board.
� "Husky" is the name given to the shell which produces the ":raid;"
command prompt. It is a command language interpreter that executes
commands read from the standard input or from a file. Husky is a
scaled down model of Unix's Bourne shell (sh). One major difference
is that husky has no concept of current working directory. For more
information on the husky shell and command prompt see the "Advanced
Topics" section
� The "host port" is the SCSI ID assigned to the controller card
itself. This is usually ID 7.
� A "backend" is a drive attached to the controller on a given
channel.
� A "rank" is a collection of all the backends from each channel with
the same SCSI ID (i.e. rank 0 would consist of all the drives with
SCSI ID 0 on each channel)
� Each of the backends is identified by a three digit number where
the first digit is the channel, the second the SCSI ID of the
drive, and the third the LUN of the drive. The numbers are
separated by a period. The identifier is prefixed with a "D" if it
is a disk or "T" if it is a tape (e.g. D0.1.0). This scheme is
referred to as <device_type_c.s.l> in the following documentation.
� A "RAID set" consists of given number of backends (there are
certain requirements which I'll come to later)
� A "spare" is a drive which is unused until there is a failure in
one of the RAID drives. At that time the damaged drive is
automatically taken offline and replaced with the spare. The data
is then reconstructed on the spare and the RAID resumes normal
operation.
� Spares may either be "hot" or "warm" depending on user
configuration. Hot spares are spun up when the RAID is started,
which shortens the replacement time when a drive failure occurs.
Warm spares are spun up when needed, which saves wear on the drive.
The text based GUI can be started by typing "agui"
:raid; agui
at the husky prompt on the serial terminal (or emulator).
Agui is a simple ASCII based GUI that can be run on the RaidRunner
console port which enables one to configure the RaidRunner. The only
argument agui takes is the terminal type that is connected to the
RaidRunner console. Current supported terminals are dtterm, vt100 and
xterm. The default is dtterm.
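For example, to start agui on a VT100 compatible terminal:

:raid; agui vt100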
Each agui screen is split into two areas, data and menu. The data
area, which generally uses all but the last line of the screen,
displays the details of the information under consideration. The menu
area, which generally is the bottom line of the screen, displays a
strip menu with a title then list of options or sub-menus. Each option
has one character enclosed in square brackets (e.g. [Q]uit) which is
the character to type to select that option. Each menu line allows
you to refresh the screen data (in case another process on the
RaidRunner writes to the console). The refresh character may also be
used during data entry if the screen is overwritten. The refresh
character is either <Control-l> or <Control-r>.
When agui starts, it reads the configuration of the RaidRunner and
probes for every possible backend. As it probes for each backend, its
"name" is displayed in the bottom left corner of the screen.
7.1. Main Screen Options
7.1.0.0.1. <Figure 1: Main Screen>
The Main screen is the first screen displayed. It provides a summary
of the RaidRunner configuration. At the top is the RaidRunner model,
version and serial number. Next is a line displaying, for each
controller, the SCSI IDs for each host port (labeled A, B, C, etc)
and total and currently available amounts of memory. The next set of
lines display the ranks of devices on the RaidRunner. Each device
follows the nomenclature of <device_type_c.s.l> where device_type_ can
be D for disk or T for tape, c is the internal channel the device is
attached to, s is the SCSI ID (Rank) of the device on that channel,
and l is the SCSI LUN of the device (typically 0).
The next set of lines provide a summary of the Raid Sets configured on
the RaidRunner. The summary includes the raid set name, its type,
its size, the amount of cache allocated to it and a comma separated
list of its backends. See rconf in the "Advanced Topics" section for
a full description of the above.
Next are the spare devices configured. Each spare is named
(device_type_c.s.l format), followed by its size (in 512-byte
blocks), its spin state (Hot or Warm), its controller allocation,
and finally its current status (Used/Unused, Faulty/Working). If
used, the raid set that uses it is nominated.
At the bottom of the data area, the number of controllers, channels,
ranks and devices are displayed.
The menu line allows one to quit agui or select further actions or
sub-menus.
· [Q]uit: Exit the main screen and return to the husky prompt.
· [R]aidSets: Enter the RaidSet configuration screen.
· [H]ostports: Enter the Host Port configuration screen.
· [S]pares: Enter the Spare Device configuration screen.
· [M]onitor: Enter the SCSI Monitor configuration screen.
· [G]eneral: Enter the General configuration/information screen.
· [P]robe: Re-probe the device backends on the RaidRunner. As each
backend is probed its "name" (c.s.l format) is displayed in the
bottom left corner of the screen.
These selections are described in detail below.
7.2. [Q]uit
Exit the agui main screen and return to the husky ( :raid; ) prompt.
7.3. [R]aidSets:
7.3.0.0.1. <Figure 2: RAIDSet Configuration Screen>
The Raid Set Configuration screen displays a Raid Set in the data area
and provides a menu which allows you to Add, Delete, Modify, Install
(changes) and Scroll through all other raid sets (First, Last, Next
and Previous). If no raid sets have been configured, only the screen
title and menu is displayed. All attributes of the raid set are
displayed. For information on each attribute of the raid set, see the
rconf command in the "Advanced Topics" section. The menu line allows
one to leave the Raid Set Configuration screen or select further
actions:
· [Q]uit: Exit the Raid Set Configuration screen and return to the
Main screen. If you have modified, deleted or added a raid set and
have not installed the changes you will be asked to confirm this.
If you select Yes to continue the exit, all changes made since the
last install action will be discarded.
· [I]nst: This action installs (into the RaidRunner configuration
area) any changes that may have been made to raid sets, be that
deletion, addition or modification. If you exit prior to
installing, all changes made since the last installation will be
discarded. The installation process takes time. It is complete once
the typed "i" character is cleared from the menu line.
· [M]od: This action allows you to modify the displayed raid set.
You will be prompted for each Raid Set attribute that can be
changed. The prompt includes allowable options or formats required.
If you don't wish to change a particular attribute, then press the
RETURN or TAB key. The attributes you can change are the raid set
name, I/O mode, status (Active to Inactive), bootmode, spares
usage, backend zone table usage, IO size (if raid set has never
been used - i.e. just added), cache size, I/O queues length, host
interfaces and additional stargd arguments. If you wish to change a
single attribute then use the RETURN or TAB key to skip all other
options. The changed attribute will be re-displayed as soon as you
press the RETURN key. When specifying cache size, you may suffix
the number with 'm' or 'M' to indicate the number is in Megabytes
or with 'k' or 'K' to indicate the number is in Kilobytes. Note you
can only enter whole integer values. When specifying io size, you
may suffix the number with 'k' or 'K' to indicate the number is in
Kilobytes. When you enter data, it is checked for correctness and
if incorrect, a message is displayed and all changes are discarded
and you will have to start again. Remember you must install
([I]nst.) any changes.
· [A]dd: When this option is selected you will be prompted for
various attributes of the new raid set. These attributes are the
raid set name, the raid set type, the initial host interface the
raid set is to appear on (in c.h.l format where c is the controller
number, h is the host port (0, 1, 2 etc) and l is the SCSI LUN) and
finally a list of backends. When backends are to be entered, the
screen displays a list of available backends, each with a numeric
index (commencing at 0). You select each backend by entering the
index and once complete enter q for Quit. As each backend index is
entered, its backend name is displayed in a comma separated list.
When you enter data, it is checked for correctness and if
incorrect, a message is displayed and the addition will be ignored
and you will have to start again. Once the backends are complete,
the newly created raid set will be displayed on the screen with
supplied and default attributes. You can then modify the raid set
to change other attributes. Remember you must install ([I]nst.) any
new raid sets.
· [D]elete: This action will delete the currently displayed raid set.
If this raid set is Active, then you will not be allowed to delete
it. You will have to make it Inactive (via the [M]od. option) then
delete it. You will be prompted to confirm the deletion. Once you
confirm the deletion, the screen will be cleared and the next raid
set will be displayed, if configured. Remember you must install
([I]nst.) any changes.
· [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the
configured raid sets.
7.4. [H]ostports:
7.4.0.0.1. <Figure 3: Host Port Configuration Screen>
The Host Port Configuration screen displays for each controller, each
host port (labelled A, B, C, etc for port number 0, 1, 2, etc) and the
assigned SCSI ID. If the RaidRunner you use has external switches for
host port SCSI ID selection, you may only exit ([Q]uit) from this
screen. If the RaidRunner you use does NOT have external switches for
host port SCSI ID selection, then you may modify (and hence install)
the SCSI ID for any host port. The menu line allows one to leave the
Host Port Configuration screen or select further actions (if there
are NO external host port SCSI ID switches):
· [Q]uit: Exit the Host Port Configuration screen and return to the
Main screen. If you have modified a host port SCSI ID assignment
and have not installed the changes you will be asked to confirm
this. If you select Yes to continue the exit, all changes made
since the last install action will be discarded.
· [I]nstall: This action installs (into the RaidRunner configuration
area) any changes that may have been made to host port SCSI ID
assignments. If you exit prior to installing, all changes made
since the last installation will be discarded. The installation
process takes time. It is complete once the typed "i" character is
cleared from the menu line.
· [M]odify: This action allows you to modify the host port SCSI ID
assignments for each host port on each controller (if NO external
host port SCSI ID switches). You will be prompted for the SCSI ID
for each host port. You can enter either a SCSI ID (0 thru 15),
the minus "-" character to clear the SCSI ID assignment or RETURN
to SKIP. As you enter data, it is checked for correctness and if
incorrect, a message will be printed although previously correctly
entered data will be retained. Remember you must install ([I]nst.)
any changes.
7.5. [S]pares:
7.5.0.0.1. <Figure 4: Spare Device Configuration Screen>
The Spare Device Configuration screen displays all configured spare
devices in the data area and provides a menu which allows you to Add,
Delete, Modify and Install (changes) spare devices. If no spare
devices have been configured, only the screen title and menu is
displayed. Each spare device displayed shows its name (in
device_type_c.s.l format), its size in 512-byte blocks, its spin
status (Hot or Warm), its controller allocation, and finally its
current status (Used/Unused, Faulty/Working). If used, the raid set
that uses it is nominated. For information on each attribute of a
spare device, see the rconf command in the "Advanced Topics" section.
The menu line allows one to leave the Spare Device Configuration
screen or select further actions:
· [Q]uit: Exit the Spare Device Configuration screen and return to
the Main screen. If you have modified, deleted or added a spare
device and have not installed the changes you will be asked to
confirm this. If you select Yes to continue the exit, all changes
made since the last install action will be discarded.
· [I]nstall: This action installs (into the RaidRunner configuration
area) any changes that may have been made to the spare devices, be
that deletion, addition or modification. If you exit prior to
installing, all changes made since the last installation will be
discarded. The installation process takes time. It is complete once
the typed "i" character is cleared from the menu line.
· [M]odify: This action allows you to modify the unused spare
devices. You will be prompted for each spare device attribute that
can be changed. The prompt includes allowable options or formats
required. If you don't wish to change a particular attribute, then
press the RETURN key. The attributes you can change are the new
size (in 512-byte blocks), the spin state (H for Hot or W for
Warm), and the controller allocation (A for any, 0 for controller
0, 1 for controller 1, etc). If you wish to change a single
attribute of a spare device, then use the RETURN key to skip all
other attributes for each spare device. The changed attribute will
not be re-displayed until the last prompted attribute is entered
(or skipped). When you enter data, it is checked for correctness
and if incorrect, a message is displayed and all changes are
discarded and you will have to start again. Remember you must
install ([I]nstall) any changes.
· [A]dd: When adding a spare device, the list of available devices is
displayed and you are required to type in the device name. Once
entered, the spare is added with defaults which you can change, if
required, via the [M]odify option. Remember you must install
([I]nstall) any changes.
· [D]elete: When deleting a spare device, the list of spare devices
allowed to be deleted is displayed and you are required to type in
the required device name. Once entered, the spare is deleted from
the screen. Remember you must install ([I]nstall) any changes.
7.6. [M]onitor:
7.6.0.0.1. <Figure 5: SCSI Monitor Screen>
The SCSI Monitor Configuration screen displays a table of SCSI
monitors configured for the RaidRunner. Up to four SCSI monitors may
be configured. The table columns are entitled Controller, Host Port,
SCSI LUN and Protocol and each line of the table shows the appropriate
SCSI Monitor attribute. For details on SCSI Monitor attributes, see
the rconf command in the "Advanced Topics" section. The menu line
allows one to leave the SCSI Monitor Configuration screen or modify
and install the table.
· [Q]uit: Exit the SCSI Monitor Configuration screen and return to
the Main screen. If you have made changes and have not installed
them you will be asked to confirm this. If you select Yes to
continue the exit, all changes made since the last install action
will be discarded.
· [I]nstall: This action installs (into the RaidRunner configuration
area) any changes that may have been made to SCSI Monitor
configuration. If you exit prior to installing, all changes made
since the last installation will be discarded. The installation
process takes time. It is complete once the typed "i" character is
cleared from the menu line.
· [M]odify: This action allows you to modify the SCSI Monitor
configuration. The cursor will be moved around the table,
prompting you for input. If you do not want to change an attribute,
enter RETURN to skip. If you want to delete a SCSI monitor then
enter the minus "-" character when prompted for the controller
number. If you want to use the default protocol list, then enter
RETURN at the Protocol List prompt. As you enter data, it is
checked for correctness and if incorrect, a message will be printed
and any previously entered data is discarded; you will have to
re-enter the data. Remember you must install ([I]nstall) any
changes.
7.7. [G]eneral:
7.7.0.0.1. <Figure 6: General Screen>
The General screen has a blank data area and a menu which allows one
to Quit and return to the main screen, or to select further sub-menus
which provide information about Devices, the System Message Logger,
Global Environment variables and throughput Statistics.
· [Q]uit: Exit the General screen and return to the Main screen.
· [D]evices: Enter the Device information screen. The Devices screen
displays the name of all devices on the RaidRunner. The menu line
allows one to Quit and return to the General screen or display
information about the devices.
<Figure 7: Devices Screen>
· [Q]uit: Exit the Devices screen and return to the General screen.
· [I]nformation: The Device Information screen displays information
about each device. You can scroll through the devices. For disks,
the information displayed includes the device name, serial number,
vendor name, product id, speed, version, sector size, sector count,
total device size in MB, number of cylinders, heads and sectors per
track and the zone/notch partitions. The menu line allows one to
leave the Device Information screen or browse through devices.
<Figure 8: Device Information Screen>
· [Q]uit: Exit the Device Information screen and return to the
Devices screen.
· [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the
devices and hence display their current data.
· Sys[L]og: Enter the System Logger Messages screen.
<Figure 9: System Logger Messages Screen>
· [Q]uit: Exit the System Logger Messages screen and return to the
General screen.
· [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the
system log.
· [E]nvironment: Enter the Global Environment Variable configuration
screen. The Environment Variable Configuration screen displays
all configured Global Environment Variables and provides a menu
which allows you to Add, Delete, Modify and Install (changes)
variables. Each variable name is displayed followed by an equals
"=" and the value assigned to that variable enclosed in braces -
"{" .. "}". The menu line allows you to Quit and return to the
General screen or select further actions.
<Figure 10: Environment Global Variable Configuration Screen>
· [Q]uit: Exit the Environment Variable Configuration screen and
return to the General screen. If you have modified, deleted or
added an environment variable and have not installed the changes
you will be asked to confirm this. If you select Yes to continue
the exit, all changes made since the last install action will be
discarded.
· [I]nst: This action installs (into the RaidRunner configuration
area) any changes that may have been made to environment variables,
be that deletion, addition or modification. If you exit prior to
installing, all changes made since the last installation will be
discarded. The installation process takes time. It is complete once
the typed "i" character is cleared from the menu line.
· [M]od: This action allows you to modify an environment variable's
value. You will be prompted for the name of the environment
variable and then prompted for its new value. If the environment
variable entered is not found, a message will be printed and you
will not be prompted for a new value. If you do not enter a new
value (i.e. just press RETURN) no change will be made. Remember
you must install ([I]nstall) any changes.
· [A]dd: When adding a new environment variable, you will be prompted
for its name and value. Providing the variable name is not already
used and you enter a value, the new variable will be added and
displayed. Remember you must install ([I]nstall) any changes.
· [D]elete: When deleting an environment variable, you will be
prompted for the variable name and if valid, the environment
variable will be deleted. Remember you must install ([I]nstall) any
changes.
· [S]tats: Enter the Statistics monitoring screen. The Statistics
screen displays various general and specific statistics about raid
sets configured and running on the RaidRunner. The first section of
the data area displays the current temperature in degrees Celsius
and the current speed of fans in the RaidRunner. The next section
of the data area displays various statistics about the named raid
set. The statistics are the current cache hit rate, the cumulative
number of reads, read failures, writes and write failures for each
backend of the raid set and finally the read and write throughput
for each stargd process (indicated by its process id) that fronts
the raid set. The menu line allows one to leave the Statistics
screen or select further actions.
<Figure 11: Statistics Monitoring Screen>
· [Q]uit: Exit the Statistics screen and return to the General
screen.
· [F]irst, [L]ast, [N]ext and [P]rev allow you to scroll through the
statistics.
· [R]efresh: This option will get the statistics for the given raid
set and re-display the current statistics on the screen.
· [Z]ero: This option will zero the cumulative statistics for the
currently displayed raid set.
· [C]ontinuous: This option will start a background process that
will update the statistics of the currently displayed raid set
every 2 seconds. A loop counter is created and updated every 2
seconds also. To interrupt this continuous mode of gathering
statistics, just press any character. If you need to refresh the
display, then press the refresh characters - <Control-l> or
<Control-r>.
7.8. [P]robe
The probe option re-scans the SCSI channels and updates the backend
list with the hardware it finds.
7.9. Example RAID Configuration Session
The generalized procedure for configuration consists of three steps
arranged in the following order:
1. Configuring the Host Port(s)
2. Assigning Spares
3. Configuring the RAID set
Note that there is a minimum number of backends required for the
various supported RAID levels:
· Level 0 : 2 backends
· Level 3 : 2 backends
· Level 5 : 3 backends
In this example we will configure a RAID 5 using six 2.04 gigabyte
drives. The total capacity of the virtual drive will be 10 gigabytes
(the equivalent of one drive is used for redundancy). This same
configuration procedure can be used to configure other levels of RAID
sets by changing the type parameter.
1. Power on the computer with the serial terminal connected to the
RaidRunner's serial port.
2. When the husky ( :raid; ) prompt appears, Start the GUI by typing
"agui" and pressing return.
3. When the main screen appears, select "H" for [H]ostport
configuration
4. On some models of RaidRunner the host port is not configurable. If
you have only a [Q]uit option here then there is nothing further to
be done for the host port configuration, note the values and skip
to step 6. If you have add/modify options then your host port is
software configurable.
5. If there is no entry for a host port on this screen, add an entry
with the parameters: controller=0, hostport=0 , SCSI ID=0. Don't
forget to [I]nstall your changes. If there is already an entry
present, note the values (they will be used in a later step).
6. From this point onward I will assume the following hardware
configuration:
a. There are seven 2.04 gigabyte drives connected as follows:
i. 2 drives on SCSI channel 0 with SCSI IDs 0 and 1 (backends
0.0.0 and 0.1.0, respectively).
ii. 3 drives on SCSI channel 1 with SCSI IDs 0, 1 and 5 (backends
1.0.0, 1.1.0, and 1.5.0).
iii. 2 drives on SCSI channel 2 with SCSI IDs 0 and 1 (backends
2.0.0 and 2.1.0).
b. Therefore:
i. Rank 0 consists of backends 0.0.0, 1.0.0, 2.0.0
ii. Rank 1 consists of backends 0.1.0, 1.1.0, 2.1.0
iii. Rank 5 contains only the backend 1.5.0
c. The RaidRunner is assigned to controller 0, hostport 0
c. The RaidRunner is assigned to controller 0, hostport 0
7. Press Q to [Q]uit the hostports screen and return to the Main
screen.
8. Press S to enter the [S]pares screen.
9. Select A to [A]dd a new spare to the spares pool. A list of
available backends will be displayed and you will be prompted for
the following information:
Enter the device name to add to spares - from above: D1.5.0
10. Select I to [I]nstall your changes.
11. Select Q to [Q]uit the spares screen and return to the Main
screen.
12. Select R from the Main screen to enter the [R]aidsets screen.
13. Select A to [A]dd a new RAID set. You will be prompted for each of
the RAID set parameters. The prompts and responses are given below.
1. Enter the name of Raid Set: cim_homes (or whatever you want to call
it).
2. Raid set type [0,1,3,5]: 5
3. Enter initial host interface - ctlr,hostport,scsilun: 0.0.0
Now a list of the available backends will be displayed in the form:
0 - D0.0.0 1 - D1.0.0 2 - D2.0.0 3 - D0.1.0 4 - D1.1.0 5 - D2.1.0
4. Enter index from above - Q to Quit: enter 0, 1, 2, 3, 4 and 5 in
turn (pressing return after each index) to select all six backends,
then enter Q to finish.
After pressing Q you will be returned to the Raid Sets screen. You
should see the newly configured Raid set displayed in the data area.
Press I to [I]nstall the changes.
<Figure 12: The RaidSets screen of the GUI showing the newly
configured RAID 5>
Press Q to exit the RaidSet screen and return to the Main screen.
Press Q to [Q]uit agui and exit to the husky prompt.
Type "reboot" and press return. This will reboot the RaidRunner (not
the host machine).
When the RaidRunner reboots it will prepare the drives for the newly
configured RAID. NOTE: Depending on the size of the RAID this could
take a few minutes to a few hours. For the above example it takes the
5070 approximately 10 - 20 minutes to stripe the RAID set.
Once you see the husky prompt again the RAID is ready for use. You can
then proceed with the Linux configuration.
8. Linux Configuration
These instructions cover setting up the virtual RAID drives on RedHat
Linux 6.1. Setting it up under other Linux distributions should not be
a problem. The same general instructions apply.
If you are new to Linux you may want to consider installing Linux from
scratch since the RedHat installer will do most of the configuration
work for you. If so, skip to the section titled "New Linux
Installation." Otherwise, go to the "Existing Linux Installation"
section (next).
8.1. Existing Linux Installation
Follow these instructions if you already have Redhat Linux installed
on your system and you do not want to re-install. If you are
installing the RAID as part of a new RedHat Linux installation (or are
re-installing) skip to the "New Linux Installation" section.
8.1.1. QLogic SCSI Driver
The driver can either be loaded as a module or compiled into your
kernel. If you want to boot from the RAID then you may want to use a
kernel with compiled in QLogic support (see the kernel-HOWTO available
from
http://www.linuxdoc.org). To use the modular driver, become the
superuser and add the following line to /etc/conf.modules:
alias qlogicpti /lib/modules/preferred/scsi/qlogicpti
Change the above path to wherever your SCSI modules live. Then add
the following line to your /etc/fstab (with the appropriate changes
for device and mount point; see the fstab man page if you are unsure):
/dev/sdc1 /home ext2 defaults 1 2
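With the module alias and fstab entry in place you can verify the
setup by hand before rebooting (a sketch; the device /dev/sdc1 and
mount point /home are carried over from the example above and may
differ on your system):

/sbin/modprobe qlogicpti        # load the QLogic driver
grep ANTARES /proc/scsi/scsi    # the virtual RAID drive should be listed
mount /home                     # mount using the new fstab entry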
Or, if you prefer to use a SYSV initialization script, create a file
called "raid" in the /etc/rc.d/init.d directory with the following
contents (NOTE: while there are a few good reasons to start the RAID
using a script, one of the aforementioned methods would be
preferable):
#!/bin/bash
# SYSV-style init script for the Antares 5070 hardware RAID.
case "$1" in
start)
    echo "Loading raid module"
    /sbin/modprobe qlogicpti
    echo
    echo "Checking and Mounting raid volumes..."
    mount -t ext2 -o check /dev/sdc1 /home
    touch /var/lock/subsys/raid
    ;;
stop)
    echo "Unmounting raid volumes"
    umount /home
    echo "Removing raid module(s)"
    /sbin/rmmod qlogicpti
    rm -f /var/lock/subsys/raid
    echo
    ;;
restart)
    $0 stop
    $0 start
    ;;
*)
    echo "Usage: raid {start|stop|restart}"
    exit 1
    ;;
esac
exit 0
You will need to edit this example and substitute your device name(s)
in place of /dev/sdc1 and mount point(s) in place of /home. The next
step is to make the script executable by root by doing:
chmod 0700 /etc/rc.d/init.d/raid
Now use your run level editor of choice (tksysv, ksysv, etc.) to add
the script to the appropriate run level.
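If you prefer to create the run level links by hand instead,
something like the following should work (the S60/K40 sequence
numbers are arbitrary choices, and runlevel 3 is assumed to be your
default):

ln -s ../init.d/raid /etc/rc.d/rc3.d/S60raid    # start in runlevel 3
ln -s ../init.d/raid /etc/rc.d/rc0.d/K40raid    # stop on halt
ln -s ../init.d/raid /etc/rc.d/rc6.d/K40raid    # stop on reboot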
8.1.2. Device mappings
Linux uses dynamic device mappings. You can determine whether the
drives were found by typing:
more /proc/scsi/scsi
One or more of the entries should look something like this:
Host: scsi1 Channel: 00 Id: 00 Lun: 00
Vendor: ANTARES Model: CX106 Rev: 0109
Type: Direct-Access ANSI SCSI revision: 02
There may also be one which looks like this:
Host: scsi1 Channel: 00 Id: 00 Lun: 07
Vendor: ANTARES Model: CX106-SMON Rev: 0109
Type: Direct-Access ANSI SCSI revision: 02
This is the SCSI monitor communications channel which is currently
unused under Linux (see SMON in the advanced topics section below).
To locate the drives (following reboot) type:
dmesg | more
Locate the section of the boot messages pertaining to your SCSI
devices. You should see something like this:
qpti0: IRQ 53 SCSI ID 7 (Firmware v1.31.32)(Firmware 1.25 96/10/15)
[Ultra Wide, using single ended interface]
QPTI: Total of 1 PTI Qlogic/ISP hosts found, 1 actually in use.
scsi1 : PTI Qlogic,ISP SBUS SCSI irq 53 regs at fd018000 PROM node ffd746e0
This indicates that the SCSI controller was properly recognized.
Below this, look for the disk section:
Vendor ANTARES Model: CX106 Rev: 0109
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdc at scsi1, channel 0, id 0, lun 0
SCSI device sdc: hdwr sector= 512 bytes. Sectors= 20971200 [10239
MB] [10.2 GB]
Note the line that reads "Detected scsi disk sdc ..."; this tells you
that this virtual disk has been mapped to device /dev/sdc. Following
partitioning, the first partition will be /dev/sdc1, the second will be
/dev/sdc2, etc. There should be one of the above disk sections for
each virtual disk that was detected. There may also be an entry like
the following:
Vendor ANTARES Model: CX106-SMON Rev: 0109
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdd at scsi1, channel 0, id 0, lun 7
SCSI device sdd: hdwr sector= 512 bytes. Sectors= 20971200 [128 MB]
[128.2 MB]
BEWARE: this is not a drive. DO NOT try to fdisk, mkfs, or mount it!
Doing so WILL hang your system.
8.1.3. Partitioning
A virtual drive appears to the host operating system as a large but
otherwise ordinary SCSI drive. Partitioning is performed using fdisk
or your favorite utility. You will have to give the virtual drive a
disk label when fdisk is started. Using the choice "Custom with
autoprobed defaults" seems to work well. See the man page for the
given utility for details.
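For illustration, a session creating a single partition spanning the
whole of the virtual drive mapped to /dev/sdc (your device name may
differ) might proceed as follows; this is only a sketch, and the exact
prompts depend on your fdisk version:
fdisk /dev/sdc
(give the drive a disk label, choosing "Custom with autoprobed defaults")
Command (m for help): n     (create a new partition)
Command (m for help): w     (write the partition table and exit)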
8.1.4. Installing a filesystem
Installing a filesystem is no different from any other SCSI drive:
mkfs -t <filesystem_type> /dev/<device>
for example:
mkfs -t ext2 /dev/sdc1
8.1.5. Mounting
If QLogic SCSI support is compiled into your kernel OR you are loading
the "qlogicpti" module at boot from /etc/conf.modules, then add the
following line(s) to /etc/fstab:
/dev/<device> <mount point> ext2 defaults 1 1
If you are using a SystemV initialization script to load/unload the
module you must mount/unmount the drives there as well. See the
example script above.
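For example, using the device and mount point from the earlier
sections (substitute your own), the fstab entry and a quick test
would be:
/dev/sdc1 /home ext2 defaults 1 1
mount /home
The mount command reads the new /etc/fstab entry, so an error here
usually points to a typo in that entry.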
8.2. New Linux Installation
This is the easiest way to install the RAID since the RedHat installer
program will do most of the work for you.
1. Configure the host port, RAID sets, and spares as outlined in
"Onboard Configuration." Your computer must be on to perform this
step since the 5070 is powered from the SBUS. It does not matter
whether the computer has an operating system installed at this
point; all we need is power to the controller card.
2. Begin the RedHat SparcLinux installation
3. The installation program will auto detect the 5070 controller and
load the Qlogic driver
4. Your virtual RAID drives will appear as ordinary SCSI hard drives
to be partitioned and formatted during the installation. NOTE: When
using the graphical partitioning utility during the RedHat
installation DO NOT designate any partition on the virtual drives
as type RAID since they are already hardware managed virtual RAID
drives. The RAID selection on the partitioning utilities screen is
for setting up a software RAID. IMPORTANT NOTE: you may see a
small SCSI drive (usually ~128 MB) on the list of available
drives. DO NOT select this drive for use. It is the SMON
communication channel, NOT a drive. If setup tries to use it, it
will hang the installer.
5. That's it; the installation program takes care of everything else!
9. Maintenance
9.1. Activating a spare
When running a RAID 3 or 5 (if you configured one or more drives to be
spares) the 5070 will detect when a drive goes offline and
automatically select a spare from the spares pool to replace it. The
data will be rebuilt on-the-fly. The RAID will continue operating
normally during the reconstruction process (i.e. it can be read from
and written to just as if nothing had happened). When a backend fails
you will see messages similar to the following displayed on the 5070
console:
930 secs: Redo:1:1 Retry:1 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection
Time-out @682400+16
932 secs: Redo:1:1 Retry:2 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection
Time-out @682400+16
933 secs: Redo:1:1 Retry:3 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection
Time-out @682400+16
934 secs: CIO_cim_homes_q3 R5_W(3412000, 16): Pre-Read drive 4 (D1.1.0)
fails with result "Re-/Selection Time-out"
934 secs: CIO_cim_homes_q2 R5: Drained alternate jobs for drive 4 (D1.1.0)
934 secs: CIO_cim_homes_q2 R5: Drained alternate jobs for drive 4 (D1.1.0)
RPT 1/0
934 secs: CIO_cim_homes_q2 R5_W(524288, 16): Initial Pre-Read drive 4 (D1.1.0)
fails with result "Re-/Selection Time-out"
935 secs: Redo:1:0 Retry:1 (DIO_cim_homes_D1.0.0_q1) CDB=28(Read_10)SCSI
Bus ~Reset detected @210544+16
936 secs: Failed:1:1 Retry:0 (rconf) CDB=2A(Write_10)Re-/Selection Time-out
@4194866+128
Then you will see the spare being pulled from the spares pool, spun
up, tested, engaged, and the data reconstructed.
937 secs: autorepair pid=1149 /raid/cim_homes: Spinning up spare device
938 secs: autorepair pid=1149 /raid/cim_homes: Testing spare device/dev/hd/1.5.0/data
939 secs: autorepair pid=1149 /raid/cim_homes: engaging hot spare ...
939 secs: autorepair pid=1149 /raid/cim_homes: reconstructing drive 4 ...
939 secs: 1054
939 secs: Rebuild on /raid/cim_homes/repair: Max buffer 2800 in 7491 reads,
priority 6 sleep 500
The rebuild script will print out its progress after every 10% of the
job is completed:
939 secs: Rebuild on /raid/cim_homes/repair @ 0/7491
1920 secs: Rebuild on /raid/cim_homes/repair @ 1498/7491
2414 secs: Rebuild on /raid/cim_homes/repair @ 2247/7491
2906 secs: Rebuild on /raid/cim_homes/repair @ 2996/7491
9.2. Re-integrating a repaired drive into the RAID (levels 3 and 5)
After you have replaced the bad drive you must re-integrate it into
the RAID set using the following procedure.
1. Start the text GUI
2. Look at the list of backends for the RAID set(s).
3. Backends that have been marked faulty will have a (-) to the right
of their ID (e.g. D1.1.0-).
4. If you set up spares, the ID of the faulty backend will be followed
by the ID of the spare that has replaced it (e.g. D1.1.0-D1.5.0).
5. Write down the ID(s) of the faulty backend(s) (NOT the spares).
6. Press Q to exit agui.
7. At the husky prompt type:
replace <name> <backend>
Where <name> is whatever you named the raid set and <backend> is the
ID of the backend that is being re-integrated into the RAID. If a
spare was in use it will be automatically returned to the spares pool.
Be patient; reconstruction can take anywhere from a few minutes to
several hours depending on the RAID level and the size. Fortunately,
you can use the RAID as you normally would during this process.
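For example, using the raid set and backend names from the console
messages shown earlier (substitute your own names):
: raid; replace cim_homes D1.1.0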
10. Troubleshooting / Error Messages
10.1. Out of band temperature detected...
· Probable Cause: The 5070 SBUS card is not adequately cooled.
· Solution: Try to improve cooling inside the case. Clean dust from
the fans, re-organize the cards so the RAID card is closest to the
fan, etc. On some of the "pizza box" Sun cases (e.g. SPARC 20) you
may need to add supplementary cooling fans, especially if the case
is fully loaded with cards.
10.2. ... failed ... cannot have more than 1 faulty backend.
· Cause: More than one backend in the RAID 3/4/5 has failed (i.e.
there is no longer sufficient redundancy to enable the lost data to
be reconstructed).
· Solution: You're hosed ... sorry. If you did not assign spares
when you configured your RAID 3/4/5, now may be a good time to
reconsider the wisdom of that decision. Hopefully you have been
making regular backups, since now you will have to replace the
defective drives, re-configure the RAID, and restore the data from
a secondary source.
10.3. When booting I see: ... Sun disklabel: bad magic 0000
... unknown partition table.
· Suspected Cause: Incorrect settings in the disk label set by fdisk
(or whatever partitioning utility you used). This message seems to
happen when you choose one of the preset disk labels rather than
"Custom with autoprobed defaults."
· Solution: Since this error does not seem to affect the operation of
the drive, you can choose to do nothing and be fine. If you want to
correct it you can try re-labeling the disk, or re-partitioning the
disk and choosing "Custom with autoprobed defaults." If you are
installing RedHat Linux from scratch the installer will get all of
this right for you.
11. Bugs
None yet! Please send bug reports to
[email protected]
12. Frequently Asked Questions
12.1. How do I reset/erase the onboard configuration?
At the husky prompt issue the following command:
rconf -init
This will delete all of the RAID configuration information but not the
global variables and scsi monitors. To remove ALL configuration
information type:
rconf -fullinit
Use these commands with caution!
12.2. How can I tell if a drive in my RAID has failed?
In the text GUI faulty backends appear with a (-) to the right of
their ID. For example the list of backends:
D0.0.0,D1.0.0-,D2.0.0,D0.1.0,D1.1.0,D2.1.0
Indicates that backend (drive) D1.0.0 is either faulty or not present.
If you assigned spares (RAID 3 or 5) then you should also see that one
or more spares are in use. Both the main and the RaidSets screens
will show information on faulty/not-present drives in a RAID set.
13. Advanced Topics: 5070 Command Reference
In addition to the text-based GUI, the RAID configuration may also be
manipulated from the husky prompt (the ": raid;" prompt) of the
onboard controller. This section describes commands that a user can input
interactively or via a script file to the K9 kernel. Since K9 is an
ANSI C Application Programming Interface (API) a shell is needed to
interpret user input and form output. Only one shell is currently
available and it is called husky. The K9 kernel is modelled on the
Plan 9 operating system whose design is discussed in several papers
from AT&T (See the "Further Reading" section for more information).
K9 is a kernel targeted at embedded controllers of small to medium
complexity (e.g. ISDN-ethernet bridges, RAID controllers, etc). It
supports multiple lightweight processes (i.e. without memory
management) on a single CPU with a non-pre-emptive scheduler. Device
driver architecture is based on Plan 9 (and Unix SVR4) STREAMS.
Concurrency control mechanisms include semaphores and signals. The
husky shell is modelled on a scaled down Unix Bourne shell.
Using the built-in commands the user can write new scripts, thus
extending the functionality of the 5070. The commands (adapted from
the 5070 man pages) are extensive and are described below.
13.1. AUTOBOOT - script to automatically create all raid sets and
scsi monitors
· SYNOPSIS: autoboot
· DESCRIPTION: autoboot is a husky script which is typically executed
when a RaidRunner boots. The following steps are taken:
1. Start all configured scsi monitor daemons (smon).
2. Test to see if the total cache required by all the raid sets
that are to boot is not more than 90% of available memory.
3. Start all the scsi target daemons (stargd) and set each
daemon's mode to "spinning-up" which enables it to respond to
all non medium access commands from the host. This is done
to allow hosts to gain knowledge about the RaidRunner's scsi
targets as quickly as possible.
4. Bind into the root (ram) filesystem all unused spare backend
devices.
5. Build all raid sets.
6. If battery backed-up ram is present, check for any saved
writes and restore them into the just built raid sets.
7. Finally, set the state of all scsi target daemons to "spun-up",
enabling hosts to fully access the raid sets behind them.
13.2. AUTOFAULT - script to automatically mark a backend faulty
after a drive failure
· SYNOPSIS: autofault raidset
· DESCRIPTION: autofault is a husky script which is typically
executed by a raid file system upon the failure of a backend of
that raid set, when that raid file system cannot use spare backends
or has been configured not to use spare backends. After parsing
its arguments (command and environment) autofault issues an rconf
command to mark the given backend as faulty.
· OPTIONS:
· raidset: The bind point of the raid set whose backend failed.
· $DRIVE_NUMBER: The index of the backend that failed. The first
backend in a raid set is 0. This option is passed as an environment
variable.
· $BLOCK_SIZE: The raid set's io block size in bytes. (Ignored).
This option is passed as an environment variable.
· $QUEUE_LENGTH: The raid set's queue length. (Ignored). This option
is passed as an environment variable.
· SEE ALSO: rconf
13.3. AUTOREPAIR - script to automatically allocate a spare and
reconstruct a raid set
· SYNOPSIS: autorepair raidset size
· DESCRIPTION: autorepair is a husky script which is typically
executed by either a raid type 1, 3 or 5 file system upon the
failure of a backend of that raid set.
After parsing its arguments (command and environment) autorepair
gets a spare device from the RaidRunner's spares pool. It
then engages it in write-only mode and reads the complete raid
device, which reconstructs the data on the spare. The read is from
the raid file system repair entrypoint. Reading from this
entrypoint causes a read of a block immediately followed by a
write of that block. The read/write sequence is atomic (i.e. is not
interruptible). Once the reconstruction has completed, a check is
made to ensure the spare did not fail during reconstruction and, if
not, the access mode of the spare device is set to the access mode
of the raid set. The process that reads the repair entrypoint
is rebuild.
This device reconstruction will take anywhere from 10 minutes to one
and a half hours depending on both the size and speed of the backends
and the amount of activity the host is generating.
During device reconstruction, pairs of numbers will be printed
indicating each 10% of data reconstructed. The pairs of numbers
are separated by a slash character, the first number being the number
of blocks reconstructed so far and the second being the number
of blocks to be reconstructed. Further status about the rebuild can
be gained from running rebuild.
When the spare is allocated, both the number of spares currently used
on the backend and the spare device name are printed. The number of
spares on a backend is referred to as the depth of spares on the
backend. Thus, prior to re-engaging the spare after a reconstruction,
a check can be made to see if the depth is the same. If it is not,
then the spare reconstruction failed and reconstruction using another
spare is underway (or no spares are available), and hence we don't
re-engage the drive.
· OPTIONS:
· raidset: The bind point of the raid set whose backend failed.
· size: The size of the raid set in 512 byte blocks.
· $DRIVE_NUMBER: The index of the backend that failed. The
first backend in a raid set is 0. This option is passed as an
environment variable.
· $BLOCK_SIZE: The raid set's io block size in bytes. This option is
passed as an environment variable.
· $QUEUE_LENGTH: The raid set's queue length. This option is passed
as an environment variable.
· SEE ALSO: rconf, rebuild
13.4. BIND - combine elements of the namespace
· SYNOPSIS: bind [-k] new old
· DESCRIPTION: Bind replaces the existing old file (or directory)
with the new file (or directory). If the "-k" switch is given then
new must be a kernel-recognized device (file system). Section 7k of
the manual pages documents the devices (sometimes called file
systems) that can be bound using the "-k" switch.
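For example, the env and cons file systems described elsewhere in
this reference are attached with the "-k" switch:
bind -k env /env
bind -k cons /dev/cons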
13.5. BUZZER - get the state or turn on or off the buzzer
· SYNOPSIS: buzzer or buzzer on|off|mute
· DESCRIPTION: Buzzer will either print the state of the buzzer, turn
the buzzer on or off, or mute it. If no arguments are given then
the state of the buzzer is printed, that is, on or off will be
printed if the buzzer is currently on or off respectively. If the
buzzer has been muted, then you will be informed of this. If the
buzzer has not been used since the RaidRunner booted then the
special state, unused, is printed. If the argument on is given the
buzzer is turned on; if off, the buzzer is turned off. If the
argument mute is given then the muted state of the buzzer is
changed.
· SEE ALSO: warble, sos
13.6. CACHE - display information about and delete cache ranges
· SYNOPSIS: cache [-D moniker] [-I moniker] [-F] [-g moniker
first|last] lastoffset
· DESCRIPTION: cache will print (to standard output) information
about the given cache range, delete a given cache range, flush the
cache or return the last offset of all cache ranges.
· OPTIONS
· -F: Flush all cache buffers to their backends (typically raid
sets).
· -D moniker: Delete the cache range with moniker (name) moniker.
· -I moniker: Invalidate the cache for the given cache range
(moniker). This is only useful for debugging or elaborate
benchmarks.
· -g moniker first|last: Print either the first or last block number
of a cache range with moniker (name) moniker.
· lastoffset: Print the last offset of all cache ranges. The last
offset is the last block number of all cache ranges.
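For example, to flush all cache buffers to their backends, print the
first block number of a (hypothetical) cache range named cim_homes,
and print the last offset of all ranges:
cache -F
cache -g cim_homes first
cache lastoffset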
13.7. CACHEDUMP - Dump the contents of the write cache to battery
backed-up ram
· SYNOPSIS: cachedump
· DESCRIPTION: cachedump causes all unwritten data in the
RaidRunner's cache to be written out to the battery backed-up ram.
No data will be written to battery backed-up ram if there is
currently valid data already stored there. This command is
typically executed when there is something wrong with the data (or
its organization) in battery backed-up ram and you need to
re-initialize it. cachedump will always return a NULL status.
· SEE ALSO: showbat, cacherestore
13.8. CACHERESTORE - Load the cache with data from battery backed-up
ram
· SYNOPSIS: cacherestore
· DESCRIPTION: cacherestore will check the RaidRunner's battery
backed-up ram for any data it has stored as a result of a power
failure. It will copy any data directly into the cache. This
command is typically executed automatically at boot time, prior
to the RaidRunner making its data available to a host. Having
successfully copied any data from battery backed-up ram into the
cache, it flushes the cache and then re-initializes battery
backed-up ram to indicate it holds no data. cacherestore will
return a NULL status on success or 1 if an error occurred during
the loading (with a message written to standard error).
· SEE ALSO: showbat
13.9. CAT - concatenate files and print on the standard output
· SYNOPSIS: cat [ file... ]
· DESCRIPTION: cat writes the contents of each given file, or
standard input if none are given or when a file named `-' is given,
to standard output. If the nominated file is a directory then the
filenames contained in that directory are sent to standard out (one
per line). More information on a file (e.g. its size) can be
obtained by using stat. The script file ls uses cat and stat to
produce directory listings.
· SEE ALSO: echo, ls, stat
13.10. CMP - compare the contents of 2 files
· SYNOPSIS: cmp [-b blockSize] [-c count] [-e] [-x] file1 file2
· DESCRIPTION: cmp compares the contents of the 2 named files. If
file1 is "-" then standard input is used for that file. If the
files are the same length and contain the same values then
nothing is written to standard output and the exit status NIL (i.e.
true) is set. Where the 2 files differ, the first bytes that
differ and the position are output to standard out and the exit
status is set to "differ" (i.e. false). The position is given by a
block number (origin 0) followed by a byte offset within that block
(origin 0). The optional "-b" switch allows the blockSize of each
read operation to be set. The default blockSize is 512 (bytes). For
big compares involving disks a relatively large blockSize may be
useful (e.g. 64k). See suffix for allowable suffixes. The optional
"-c" switch allows the count of blocks read to be fixed. A value of 0
for count is interpreted as read to the end of file (EOF). To
compare the first 64 Megabytes of 2 files the switches "-b 64k -c
1k" could be used. See suffix for allowable suffixes. The optional
"-e" switch instructs cmp to output to standard out (usually
overwriting the same line) the count of blocks compared, each time
a multiple of 100 is reached. The final block count is also output.
The optional "-x" switch instructs cmp to continue after a
comparison error (but not a file error) and keep a count of blocks
in error. If any errors are detected only the last one will be
output when the command exits. If the "-e" switch is also given
then the current count of blocks in error is output to the right of
the multiple of 100 blocks compared. This command is designed to
compare very large files. Two buffers of blockSize are allocated
dynamically, so their size is bounded by the amount of memory (i.e.
RAM in the target) available at the time of command execution. The
count could be up to 2G. The number of bytes compared is the
product of blockSize and count (i.e. big enough).
· SEE ALSO: suffix
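For example, to compare the first 64 Megabytes of two (hypothetical)
backend data files in 64k reads, printing progress and counting,
rather than stopping on, mismatches:
cmp -b 64k -c 1k -e -x /dev/hd/1.0.0/data /dev/hd/1.1.0/data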
13.11. CONS - console device for Husky
· SYNOPSIS: bind -k cons bind_point
· DESCRIPTION: cons allows an interpreter (e.g. Husky) to route
console input and output to an appropriate device. That console
input and output is available at bind_point in the K9 namespace.
The special file cons should always be available.
· EXAMPLES: Husky does the following in its initialisation:
bind -k cons /dev/cons
On a Unix system this is equivalent to:
bind -k unixfd /dev/cons
On a DOS system this is equivalent to:
bind -k doscon /dev/cons
On target hardware using a SCN2681 chip this is equivalent to:
bind -k scn2681 /dev/cons
SEE ALSO: unixfd, doscon, scn2681
13.12. DD - copy a file (disk, etc)
· SYNOPSIS: dd [if=file] [of=file] [ibs=bytes] [obs=bytes] [bs=bytes]
[skip=blocks] [seek=blocks] [count=blocks] [flags=verbose]
· DESCRIPTION: dd copies a file (from the standard input to the
standard output, by default) with a user-selectable blocksize.
· OPTIONS
· if=file: Read from file instead of the standard input.
· of=file: Write to file instead of the standard output.
· ibs=bytes: Read given number of bytes at a time.
· obs=bytes: Write given number of bytes at a time.
· bs=bytes: Read and write given number of bytes at a time. Overrides
ibs and obs.
· skip=blocks: Skip ibs-sized blocks at start of input.
· seek=blocks: By-pass obs-sized blocks at start of output.
· count=blocks: Copy only ibs-sized input blocks.
· flags=verbose: Print (to standard output) the number of blocks
copied every ten percent of the copy. The output is of the form X/T
where X is the number of blocks copied so far and T is the total
number of blocks to copy. This option can only be used if both the
count= and of= options are also given.
The decimal numbers given to "ibs", "obs", "bs", "skip", "seek" and
"count" must not be negative. These numbers can optionally have a
suffix (see suffix). dd outputs to standard out in all cases. A
successful copy of 8 (full) blocks would cause the following
output:
8+0 records in
8+0 records out
The number after the "+" is the number of fractional blocks (i.e.
blocks that are less than the block size) involved. This number will
usually be zero (and is non-zero when physical media with alignment
requirements are involved).
A write failure outputting the last block on the previous example
would cause the following output:
Write failed
8+0 records in
7+0 records out
SEE ALSO: suffix
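For example, a verbose copy of the first 64 Megabytes of one
(hypothetical) backend data partition to another, in 64k blocks:
dd if=/dev/hd/1.0.0/data of=/dev/hd/1.1.0/data bs=64k count=1k flags=verbose
Note that flags=verbose is accepted here only because both count= and
of= are also given.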
13.13. DEVSCMP - Compare a file's size against a given value
· SYNOPSIS: devscmp filename size
· DESCRIPTION: devscmp will find the size of the given file and
compare its size in 512-byte blocks to the given size (also in
512-byte blocks). If the size of the file is less than the given
value then -1 is printed, if equal to then 0 is printed, and if
the size of the given file is greater than the given size then 1 is
printed. This routine is used in internal scripts to ensure that
backends of raid sets are of an appropriate size.
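For example, to check whether a (hypothetical) backend data partition
holds at least 2097152 512-byte blocks (1 GB):
devscmp /dev/hd/1.0.0/data 2097152
A printed -1 means the file is smaller than the given size, 0 that it
is equal, and 1 that it is larger.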
13.14. DFORMAT- Perform formatting functions on a backend disk drive
· SYNOPSIS
· dformat -p c.s.l -R bnum
· dformat -p c.s.l -pdA|-pdP|-pdG
· dformat -p c.s.l -S [-v] [-B firstbn]
· dformat -p c.s.l -F
· dformat -p c.s.l -D file
· DESCRIPTION: In its first form dformat will reassign a
block on a nominated disk drive via the SCSI-2 REASSIGN BLOCKS
command. The second form will allow you to print out the current
manufacturer's defect list (-pdP), the grown defect list (-pdG)
or both defect lists (-pdA). Each printed list is sorted with one
defect per line in Physical Sector Format - Cylinder Number, Head
Number and Defect Sector Number. The third form causes the drive to
be scanned in a destructive write/read/compare manner. If a read
or write or data comparison error occurs then an attempt is made
to identify the bad sector(s). Typically the drive is scanned from
block 0 to the last block on the drive. You can optionally give an
alternative starting block number. The fourth form causes a low
level format on the specified device. The fifth form allows you
to download a device's microcode into the device.
· OPTIONS:
· -R bnum: Specify a logical block number to reassign to the drive's
grown defect list.
· -pdA: Print both the manufacturer's and grown defect lists.
· -pdP: Print the manufacturer's defect list.
· -pdG: Print the grown defect list.
· -S: Perform a destructive scan of the disk reporting I/O errors.
· -B firstbn: Specify the first logical block number to start a scan
from.
· -v: Turn on verbose mode - which prints the current block number
being scanned.
· -F: Issue a low-level SCSI format command to the given device. This
will take some time.
· -D file: Download the given file into the specified device. The
download is effected by a single SCSI Write-Buffer command in save
microcode mode. This allows users to update a device's
microcode. Use this command carefully as you could destroy the
device by loading an incorrect file.
· -p c.s.l: Identify the disk device by specifying its channel,
SCSI ID (rank) and SCSI LUN in the format "c.s.l".
· SEE ALSO: Product manual for the disk drives used in your RAID.
13.15. DIAGS - script to run a diagnostic on a given device
· SYNOPSIS: diags disk -C count -L length -M io-mode -T io-type -D
device
· DESCRIPTION: diags is a husky script which is used to run the
randio diagnostic on a given device. When randio is executed, it is
executed in verbose mode.
· OPTIONS:
· disk: This is the device type of diagnostic we are to run.
· -C count: Specify the number of times to execute the diagnostic.
· -L length: Specify the "length" of the diagnostic to execute.
This can be either short, medium or long, specified with the
letters s, m or l respectively. In the case of a disk, a short
test will exercise the first 10% of the device, a medium test the
first 50%, and a long test the whole (100%) of the disk.
· -M io-mode: Specify a destructive (read-write) or non-destructive
(read-only) test. Use either read-write or read-only.
· -T io-type: Specify a type of io - either sequential or random.
· -D device: Specify the device to test.
· SEE ALSO: randio, scsihdfs
13.16. DPART - edit a scsihd disk partition table
· SYNOPSIS:
· dpart -a|d|l|m -D file [-N name] [-F firstblock] [-L lastblock]
· dpart -a -D file -N name -F firstblock -L lastblock
· dpart -d -D file -N name
· dpart -l -D file
· dpart -m -D file -N name -F firstblock -L lastblock
· DESCRIPTION: Each scsihd device (typically a SCSI disk drive) can
be divided up into eight logical partitions. By default, when a
scsihd device is bound into the RaidRunner's file system, it has
four partitions: the whole device (raw), typically named
bindpoint/raw; the partition file (bindpoint/partition); the
RaidRunner backup configuration file (bindpoint/rconfig); and the
"data" portion of the disk (bindpoint/data), which represents the
whole device less the backup configuration area and partition file.
For more information, see scsihdfs. If other partitions are added,
then they will appear as bindpoint/partitionname. dpart allows you
to edit or list the partition table on a scsihd device (typically a
disk).
· OPTIONS:
· -a: Add a partition. When adding a partition, you need to specify
the partition name (-N) and the partition range from the first
block (-F) to the last block (-L).
· -d: Delete a named (-N) partition.
· -l: List all partitions.
· -m: Modify an existing partition. You will need to specify the
partition name (-N) and BOTH its first (-F) and last (-L)
block numbers, even if you are just modifying the last block number.
· -D file: Specify the partition file to be edited. Typically, this
is the bindpoint/partition file.
· -N name: Specify the partition name.
· -F firstblock: Specify the first block number of the partition.
· -L lastblock: Specify the last block number of the partition.
· SEE ALSO: scsihd
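For example, to list the partitions on a (hypothetical) backend and
then add a 1024-block partition named scratch covering blocks 2048
through 3071:
dpart -l -D /dev/hd/1.0.0/partition
dpart -a -D /dev/hd/1.0.0/partition -N scratch -F 2048 -L 3071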
13.17. DUP - open file descriptor device
· SYNOPSIS: bind -k dup bind_point
· DESCRIPTION: The dup device makes a one level directory with an
entry in that directory for every open file descriptor of the
invoking K9 process. These directory "entries" are the numbers.
Thus a typical process (script) binding a dup device would at least
make these files in the namespace: "bind_point/0", "bind_point/1"
and "bind_point/2". These would correspond to its open standard in,
standard out and standard error file descriptors. A dup device
allows other K9 processes to access the open file descriptors of
the invoking process. To do this the other processes simply
"open" the required dup device directory entry whose name (a
number) corresponds to the required file descriptor.
13.18. ECHO - display a line of text
· SYNOPSIS: echo [string ...]
· DESCRIPTION: echo writes each given string to the standard output,
with a space between them and a newline after the last one. Note
that all the string arguments are written in a single write kernel
call. The following backslash-escaped characters in the strings are
converted as follows:
\b backspace
\c suppress trailing newline
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\\ backslash
\nnn the character whose ASCII code is nnn (octal)
· SEE ALSO: cat
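For example, using the conversions above, a single echo can emit two
lines and suppress the final newline:
echo line1\nline2\c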
13.19. ENV- environment variables file system
· SYNOPSIS: bind -k env bind_point
· DESCRIPTION: The env file system associates a one level directory
with the bind_point in the K9 namespace. Each file name in that
directory is the name of an environment variable, while the
contents of the file is that variable's current value. Conceptually,
each process sees its own copy of the env file system. This copy
is either empty or inherited from the process's parent at spawn
time (depending on the flags to spawn).
13.20. ENVIRON - RaidRunner Global environment variables - names and
effects
· DESCRIPTION: The RaidRunner uses GLOBAL environment variables to
control the functionality of automatic actions. GLOBAL environment
variables are saved in the Raid configuration area so they retain
their values between reboots/power downs. Certain RaidRunner
internal run-time variables can also be set as GLOBAL environment
variables. See the internals manual entry for details. The table
below describes those GLOBAL environment variables that are used by
the RaidRunner in its normal operation.
· RebuildPri
This variable, if set, controls the priority used when drive
reconstruction occurs via the rebuild program. If the variable is
not set then the default rebuild priority would be used. The
variable is to be a comma separated list of raid set names and
their associated rebuild priorities and sleep periods (colon
separated). The form is
Rname_1:Pri_1:Sleep_1,Rname_2:Pri_2:Sleep_2,...,Rname_N:Pri_N:Sleep_N
where Pri_1 is to be the priority the rebuild program runs with
when run on raid set Rname_1, Sleep_1 is the period, in milliseconds,
to sleep between each rebuild action on the raid set, Pri_2 is to be
the priority for raid set Rname_2, and so forth. For example, if
the value of RebuildPri is
R:5:30000
then if a rebuild occurs (via replace, repair or autorepair) on raid
set R then the rebuild will run with priority 5 (via the -p rebuild
option) and will sleep 30000 milliseconds (30 seconds) between each
rebuild action (specified via the -S rebuild option). The priority
given must be valid for the rebuild program.
· BackendRanks
On certain RaidRunners where multiple controllers may exist, you
can restrict a controller's access to the backend ranks of devices
available. For example, you may have 2 controllers and 4 ranks of
backend devices. You can specify that the first controller can
only access the first two ranks and the second controller the
second two ranks. This variable, along with other associated
commands, allows you to set up this restriction. Additionally, you
may have a single controller RaidRunner which is in an
enclosure with multiple ranks. By default the controller will
attempt to probe for all devices on all ranks. If you have only
populated the RaidRunner with, say, half its possible complement of
backend devices, then the RaidRunner will still probe for the other
half. Setting this variable appropriately will prevent this
unneeded (and on occasion time consuming) process. This variable
takes the form
controller_id:ranklist controller_id:ranklist ...
where controller_id is the controller number (from 0 upwards) and
ranklist is a comma-separated list of backend ranks which the given
controller will access. Note that the backend rank is the scsi-id of
that rank. For example, on a 2 rank (rank 1 and 2 - i.e. scsi id 1
for the first rank and scsi id 2 for the second), 1 controller
RaidRunner where only the first rank has devices, you could prevent
the controller from attempting to access the (empty) second rank by
setting BackendRanks to
0:1
Typically, you would not set this variable directly, but use
supporting commands to set it. These commands are pranks and sranks.
See these manual entries for details.
· RAIDn_reference_PBUFS
Raid types 3, 4 and 5 all make use of memory for temporary parity
buffers when they need to create parity data. This memory is in
addition to that allocated to a raid set's cache. When a raid set
is created, it will also create a default number of parity
buffers (which are the same size as the raid set's iosize).
Sometimes, if the iosize of the raid set is large, there will not be
enough memory to create this default number of parity buffers. To
overcome this situation, you can set GLOBAL environment variables
to over-ride the default number of parity buffers that all raid
sets of a particular type, or a specific raid set, will use. You need
to set these variables before you define the raid set via agui,
and if you delete them and not the raid set, then the affected raid
sets may not boot and hence will not be accessible by a host. The
variables are of the form RAIDn_reference_PBUFS where n is the raid
type (3, 4 or 5), and reference is the raid set's name or the
string 'Default'. You use the reference 'Default' to specify all
raid sets of a particular type. For example, to over-ride the
number of parity buffers for a raid 5 named 'FRED' (and set 64
parity buffers):
: raid ; setenv RAID5_FRED_PBUFS 64
To over-ride the number of parity buffers for ALL raid 3's (and set
128 parity buffers):
: raid ; setenv RAID3_Default_PBUFS 128
If you set a default for all raid sets of a particular type, but want
ONE of them to be different, then set up a variable for that
particular raid set, as its value will over-ride the default. In the
above example, where all raid type 3 sets have 128 parity buffers, you
could set the variable
: raid ; setenv RAID3_Dbase_PBUFS 56
which will allow the raid 3 raid set named 'Dbase' to have 56 parity
buffers, but all other raid 3's defined on the RaidRunner will have
128.
· SEE ALSO: setenv, printenv, rconf, rebuild, internals
13.21. EXEC - cause arguments to be executed in place of this shell
· SYNOPSIS: exec [ arg ... ]
· DESCRIPTION: exec causes the command specified by the first arg to
be executed in place of this shell without creating a new process.
Subsequent args are passed to the command specified by the first
arg as its arguments. Shell redirection may appear and, if no other
arguments are given, causes the shell input/output to be modified.
13.22. EXIT - exit a K9 process
· SYNOPSIS: exit [string]
· DESCRIPTION: exit has an optional string argument. If the optional
argument is given the current K9 process is terminated with the
given string as its exit value. (If the string has embedded spaces
then the whole string should be a quoted_string). If no argument is
given then the shell gets the string associated with the
environment variable "status" and returns that string as the exit
value. If the environment variable "status" is not found then the
"true" exit status (i.e. NIL) is returned.
· SEE ALSO: true, K9exit
13.23. EXPR - evaluation of numeric expressions
· SYNOPSIS: expr numeric_expr ...
· DESCRIPTION: expr evaluates each numeric_expr command line argument
as a separate numeric expression. Thus a single expression cannot
contain unescaped whitespaces or needs to be placed in a quoted
string (i.e. between "{" and "}"). Arithmetic is performed on
signed integers (currently numbers in the range from -2,147,483,648
to 2,147,483,647). Successful calculations cause no output (to
either standard out/error or environment variables). So each useful
numeric_expr needs to include an assignment (or op-assignment).
Each numeric_expr argument supplied is evaluated in the order given
(i.e. left to right) until they all evaluate successfully
(returning a true status). If evaluating a numeric_expr fails
(usually due to a syntax error) then the expr command fails with
"error" as the exit status and the error message is written to the
environment variable "error".
· OPERATORS: The precedence of each operator is shown following the
description in square brackets. "0" is the highest precedence.
Within a single precedence group evaluation is left-to-right except
for assignment operators which are right-to-left. Parentheses have
higher precedence than all operators and can be used to change the
default precedence shown below.
UNARY OPERATORS
+
Does nothing to expression/number to the right.
-
negates expression/number to the right.
!
logically negate expression/number to the right.
~
Bitwise negate expression/number to the right.
BINARY ARITHMETIC OPERATORS
*
Multiply enclosing expressions [2]
/
Integer division of enclosing expressions
%
Modulus of enclosing expressions.
+
Add enclosing expressions
-
Subtract enclosing expressions.
<<
Shift left expression _left_ by number in right expression. Equivalent
to: left * (2 ** right)
>>
Shift left expression _right_ by number in right expression.
Equivalent to: left / (2 ** right)
&
Bitwise AND of enclosing expressions
^
Bitwise exclusive OR of enclosing expressions. [8]
|
Bitwise OR of enclosing expressions. [9]
BINARY LOGICAL OPERATORS
These logical operators yield the number 1 for a true comparison and 0
for a false comparison. For logical ANDs and ORs their left and right
expressions are assumed to be false if 0, otherwise true. Both logical
ANDs and ORs evaluate both their left and right expressions in all
cases (cf. C's short-circuit action).
<=
true when left less than or equal to right. [5]
>=
true when left greater than or equal to right. [5]
<
true when left less than right. [5]
>
true when left greater than right. [5]
==
true when left equal to right. [6]
!=
true when left not equal to right. [6]
&&
logical AND of enclosing expressions [10]
||
logical OR of enclosing expressions [11]
ASSIGNMENT OPERATORS
In the following descriptions "n" is an environment variable while
"r_exp" is an expression to the right. All assignment operators have
the same precedence which is lower than all other operators. N.B.
Multiple assignment operators group right-to-left (i.e. same as C
language).
=
Assign right expression into environment variable on left.
*=
n *= r_exp is equivalent to: n = n * r_exp
/=
n /= r_exp is equivalent to: n = n / r_exp
%=
n %= r_exp is equivalent to: n = n % r_exp
+=
n += r_exp is equivalent to: n = n + r_exp
-=
n -= r_exp is equivalent to: n = n - r_exp
<<=
n <<= r_exp is equivalent to: n = n << r_exp
>>=
n >>= r_exp is equivalent to: n = n >> r_exp
&=
n &= r_exp is equivalent to: n = n & r_exp
|=
n |= r_exp is equivalent to: n = n | r_exp
· NUMBERS: All numbers are signed integers in the range stated in the
description above. Numbers can be input in base 2 through to base
36. Base 10 is the default base. The default base can be overridden
by:
1. a leading "0": implies octal or hexadecimal
2. a number of the form _base_#_num_
Numbers prefixed with "0" are interpreted as octal. Numbers
prefixed with "0x" or "0X" are interpreted as hexadecimal. For
numbers using the "#" notation the _base_ must be in the range 2
through to 36 inclusive. For bases greater than 10 the letters "a"
through "z" are utilised for the extra "digits". Upper and lower
case letters are acceptable. Any single digit that exceeds (or is
equal to) the base is considered an error. Base 10 numbers only may
have a suffix. See suffix for a list of valid suffixes. Also note
that since expr uses signed integers, "1G" is the largest magnitude
number that can be represented with the "Gigabyte" suffix
(assuming 32 bit signed integers; -2G is invalid due to the order
of evaluation).
· VARIABLES: The only symbolic variables allowed are K9 environment
variables. Regardless of whether they are being read or written,
they should never appear preceded by a "$". Environment variables
that didn't previously exist and appear as the left argument of an
assignment are created. When a non-existent environment variable is
read, it is interpreted as the value 0.
· EXAMPLES: Some simple examples:
expr {n = 1 + 2} # create n
echo $n
3
expr {n*=2} # 3 * 2 result back into n
echo $n
6
expr { k = n > 5 } # 6 > 5 is true so create k = 1
echo $k
1
· NOTE: expr is a Husky "built-in" command. See the "Note" section in
"set" to see the implications.
· SEE ALSO: husky, set, suffix, test
13.24. FALSE - returns the K9 false status
· SYNOPSIS: false
· DESCRIPTION: false does nothing other than return a K9 false
status. K9 processes return a pointer to a C string (null
terminated array of characters) on termination. If that pointer is
NULL then a true exit value is assumed while all other returned
pointer values are interpreted as false (with the string being
some explanation of what went wrong). This command returns a
pointer to the string "false" as its return value.
· EXAMPLE: The following script fragment will print "got here" to
standard out:
if false then
echo impossible
else
echo got here
end
· SEE ALSO: true
13.25. FIFO - bi-directional fifo buffer of fixed size
· SYNOPSIS:
· bind -k {fifo size} bind_point
· cat bind_point
· bind_point/data
· bind_point/ctl
· DESCRIPTION: The fifo file system associates a one level directory
with the bind_point in the K9 namespace, with a buffer size of size
bytes. bind_point/data and bind_point/ctl are the data and control
channels for the fifo. Data written to the bind_point/data file is
available for reading from the same file on a first-in first-out
basis. A write of x bytes to the bind_point/data file will either
complete and transfer all the data, or will transfer
sufficient bytes until the fifo buffer is full and then block until
data is removed from the fifo buffer by reading. A read of x bytes
from the bind_point/data file will transfer the lesser of the
current amount of data in the fifo buffer or x bytes. A read from
bind_point/ctl will return the size of the fifo buffer and the
current usage. The number of opens (# Opens) is the number of
processes that currently have the bind_point/data file open.
· EXAMPLE
> /buffer
bind -k {fifo 2048} /buffer
ls -l /buffer
/buffer:
/buffer/ctl fifo 2 0x00000001 1 0
/buffer/data fifo 2 0x00000002 1 0
cat /buffer/ctl
Max: 2048 Cur: 0, # Opens: 0
echo hello > /buffer/data
cat /buffer/ctl
Max: 2048 Cur: 6, # Opens: 0
dd if=/buffer/data bs=512 count=1
hello
0+1 records in
0+1 records out
cat /buffer/ctl
Max: 2048 Cur: 0, # Opens: 0
· SEE ALSO: pipe
13.26. GET - select one value from list
· SYNOPSIS: get number [ value ... ]
· DESCRIPTION: get uses the given number to select one value from the
given list. Indexing is origin 0 (e.g. "get 0 aaa bb c" returns
"aaa"). If the number is out of range for an index on the given
list of values then nothing is returned.
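For example, a script might pick apart its argument list (see the
argv variable under husky) with get; an illustrative fragment,
assuming get writes its selection to standard output so that it can
be captured with backquote substitution:
set first `get 0 $argv'
echo $first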
13.27. GETIV - get the value of an internal RaidRunner variable
· SYNOPSIS:
· getiv
· getiv name
· DESCRIPTION: getiv prints the current value of an internal
RaidRunner variable or prints a list of all variables. When a
variable name is given, its current value is printed. If no name
is given then all available internal variables are listed.
· NOTES: As different models of RaidRunners have different internal
variables, see your RaidRunner's Hardware Reference manual for a
list of variables together with the meaning of their values. These
variables are run-time variables and hence revert to their default
values whenever the RaidRunner is booted.
· SEE ALSO: setiv
13.28. HELP - print a list of commands and their synopses
· SYNOPSIS: help or ?
· DESCRIPTION: help, or the question mark character "?", will print a
list of all commands available to the command interpreter. Along
with each command, its synopsis is printed.
13.29. HUSKY - shell for K9 kernel
· SYNOPSIS
· husky [-c command] [ file [ arg ... ] ]
· hs [-c command] [ file [ arg ... ] ]
· DESCRIPTION: husky and hs are synonyms. husky is a command language
interpreter that executes commands read from the standard input or
from a file. husky is a scaled down model of Unix's Bourne shell
(sh). One major difference is that husky has no concept of current
working directory. If the "-c" switch is present then the following
command is interpreted by husky in a newly thrown shell nested in
the current environment. This newly thrown shell exits back to the
current environment when the command finishes. Otherwise if
arguments are given the first one is assumed to be a file
containing husky commands. Again a new shell is thrown to execute
these commands. husky script files can access their command line
arguments and the 2nd and subsequent arguments to husky (if
present) are passed to the file for that purpose. If no arguments
are given to husky then commands are read from standard in (and the
shell is considered interactive).
· RETURN STATUS: husky places the K9 return status of a process (NIL
if ok, otherwise a string explaining the error) in the file
"/env/status".
An example:
dd if=/xx
dd: could not open /xx
cat /env/status
open failed
cat /env/status
# empty because previous "cat" worked
As the file "/env/status" is an environment variable the return status
of a command is also available in the variable $status. The exit
status of a pipeline is the exit status of the last command in the
pipeline.
· SIGNALS: If an interactive shell receives an interrupt signal (i.e.
K9_SIGINT - usually a control-C on the console) then the shell
exits. The "init" process will then start a new instance of the
husky shell with all the previously running processes (with the
exception of the just-killed shell) still running. This allows the
user to kill the process that caused the previous shell problems.
Alternatively, a process that is accidentally run in the foreground
is effectively put in the background by sending an interrupt signal
to the shell. Note that this is quite different to Unix shells,
which would forward the signal onto the foreground process.
· QUOTES, ESCAPING, STRING CONCATENATION, ETC: A quoted_string (as
defined in the grammar) commences with a "{" and finishes with the
matching "}". The term "matching" implies that all embedded "{"
must have a corresponding embedded "}" before the final "}" is said
to match the original "{". A quoted_string can be spread across
several lines. No command line substitution occurs within
quoted_strings. The character for escaping the following character
is "\". If a "{" needs to be interpreted literally then it can be
represented by "\{". If a string containing spaces (whitespace)
needs to be interpreted as a single token then each space
(whitespace) can be escaped (i.e. "\ "). If a "\" itself needs to be
interpreted literally then it can be represented by "\\". The
string concatenation character is "^". This is useful when a token
such as "/d4" needs to be built up by a script when "/d" is fixed
and the "4" is derived from some variable:
set n 4
> /d^$n
This example would create the file "/d4".
The output of another husky command or script can be made available
inline by starting the sequence with "`" and finishing it with a "'".
For example:
echo {ps output follows:
} `ps'
This prints the string "ps output follows:" followed on the next line
by the current output from the command "ps". That output from "ps"
would have its embedded newlines replaced by whitespaces.
COMMAND LINE FILE REDIRECTION:
· Redirection should appear after a command and its arguments in a
line to be interpreted by husky. A special case is a line that just
contains "> filename", which creates the filename with zero length
if it didn't previously exist or truncates it to zero length if it
did.
· Redirection of standard in to come from a file uses the token "<"
with the filename appearing to its right. The default source of
standard in is the console.
· Redirection of standard out to go to a file uses the token ">" with
the filename appearing to its right. The default destination of
standard out is the console.
· Redirection of standard error to go to a file uses the token ">[2]"
with the filename appearing to its right. The default destination
of standard error is the console.
· Redirection of writes from within a command which uses a known file
descriptor number (say "n") to go to a file uses the token ">[n]"
with the filename appearing to its right.
· Redirection of reads from within a command which uses a known file
descriptor number (say "n") to come from a file uses the token
"<[n]" with the filename appearing to its right.
· Redirection of reads and writes from within a command which uses a
known file descriptor number (say "n") to a file uses the token
"<>[n]" with the filename appearing to its right. In order to
redirect both standard out and standard error to the one file the
form "> filename >[2=1]" can be used. This sequence first
redirects standard out (i.e. file descriptor 1) to filename and
then redirects what is written to file descriptor 2 (i.e. standard
error) to file descriptor 1, which is now associated with filename.
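For example, to collect both standard out and standard error of a
command in one (hypothetical) file /log:
dd if=/xx > /log >[2=1]
Standard out (descriptor 1) is first redirected to /log, then
descriptor 2 is joined to descriptor 1, so the error message from the
failed open lands in /log as well.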
ENVIRONMENT VARIABLES: Each process can access the name it was invoked
by via the variable "arg0". The command line arguments (excluding
the invocation name) can be accessed as a list in the variable "argv".
The number of elements in the list "argv" is placed in "argc". The
get command is useful for fetching individual arguments from this
list. The pid of the current process can be fetched from the
variable "pid". When a script launches a new process in the
background then the child's pid can be accessed from the variable
"child". The variable "ControllerId" is set to the RaidRunner
controller number husky is running on. Environment variables are a
separate "space" for each process. Depending on the way a process was
created, its initial set of environment variables may be copied from
its parent process at the "spawn" point.
SEE ALSO: intro
13.30. HWCONF - print various hardware configuration details
· SYNOPSIS: hwconf [-D] [-M] [-I] [-d [-n]] [-f] [-h] [-i -p c.s.l]
[-m] [-p c.s.l] [-s] [-S] [-t] [-T] [-P] [-W]
· DESCRIPTION: hwconf prints details about the RaidRunner hardware
and devices attached.
· OPTIONS:
· -h: Print the number of controllers, host interfaces per
controller, the number of disk channels per controller, the number
of ranks of disks and details of the memory (in bytes) on each
controller. Four memory figures are printed: the first is the
total memory in the controller, next is the amount of memory at
boot time, next is the amount currently available and last is the
largest available contiguous area of memory. This is the default
option.
· -f: Print the number of fans in the RaidRunner and then the
speed of each fan in the system. The speed values are in
revolutions per minute (rpm). The fans in the system are labeled
in the hardware specification sheet for your RaidRunner. The
first speed printed by this command corresponds to fan number 0
on your specification sheet, the second to fan 1, and so forth.
· -d: Print out information on all the disk drives on the RaidRunner.
For each disk on the RaidRunner, print out: the device name, in
the format c.s.l where c is the channel, s is the SCSI ID (or rank)
and l is the SCSI LUN of the device; the manufacturer's name
(vendor id); the disk's model name (product id); the disk's version
id; the disk serial number; the disk geometry (number of
cylinders, heads and sectors); the last block number on the disk
and the block size in bytes; the disk revolution count per minute
(rpm); and the number of notches/zones available on the drive (if
any).
· -n: Print out the disk drive notch/zone tables if available. This
is a sub-option to the -d option. Not all disks appear to
correctly report the notch/zone partition tables. For each
notch/zone, the following is printed: the zone number, the zone's
starting cylinder, the zone's starting head, the zone's ending
cylinder, the zone's ending head, the zone's starting logical block
number, the zone's ending logical block number, and the zone's
number of sectors per track.
· -D: Print out the device names for all disk drives on the system.
· -I: Initialize back-end NCR SCSI chips. This flag may be used in
conjunction with any other option and will be done first. It has
an effect only on the first call to hwconf that has not yet used
the -d, -D or -I options, or on those chips that have not yet had a
-p on the channel associated with that chip.
· -m: Print out major flash and battery backed-up ram addresses
(in hex). Additionally print out the size of the RaidRunner
configuration area. Eight (8) addresses are printed, in order:
RaidRunner configuration area start and end addresses (FLASH RAM),
RaidRunner Husky Scripts area start and end addresses (FLASH RAM),
RaidRunner Binary Image area start and end addresses (FLASH RAM),
and RaidRunner Battery Backed-up area start and end addresses. The
size of the RaidRunner configuration area (in bytes) is then
printed.
· -p c.s.l: Probe a single device specified by the given channel,
SCSI ID (rank) and SCSI LUN provided in the format "c.s.l". The
output of this command is the same as the "-d" option but just for
the given device. If the device is not present then nothing will be
output and the exit status of the command will be 1.
· -i -p c.s.l: Re-initialize the SCSI device driver specified by the
given channel, SCSI ID (rank) and SCSI LUN provided in the format
"c.s.l". Typically this command is used when, on a running
RaidRunner, a new drive is plugged in, and it will be used prior
to the RaidRunner's next reboot.
· -M: Set the boot-time memory. This option is executed internally
by the controller at boot time and has no function (or effect) when
executed at any other time.
· -s: Print the 12 character serial number of the RaidRunner.
· -S: Issue SCSI spin-up commands to all backends as quickly as
possible. This option is intended for use at the power-on stage
only.
· -t: Probe the temperature monitor, returning the internal
temperature of the RaidRunner in degrees Celsius.
· -T: Print the temperatures being recorded by the hardware
monitoring daemon (hwmon).
· -P: For both AC and DC power supplies, print the number of each
present and the state of each supply. The state will be printed as
ok or flt depending on whether the PSU is working or faulty.
· -W: This option will wait until all possible backends have spun
up. It is used in conjunction with
· NOTES: The order of printing the disk information is by SCSI ID
(rank), by channel, by SCSI LUN.
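For example, to probe the backend at channel 1, SCSI ID 0, LUN 0 (the
device named D1.0.0 in earlier examples) and then read the
controller's internal temperature:
hwconf -p 1.0.0
hwconf -t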
13.31. HWMON - monitoring daemon for temperature, fans, PSUs.
· SYNOPSIS: hwmon [-t seconds] [-d]
· DESCRIPTION: hwmon is a hardware monitoring daemon. It
periodically probes the status of certain elements of a RaidRunner
and if an out-of-band occurrence happens, will cause the alarm to
sound or light up fault leds as well as saving a message in the
system log. Depending on the model of RaidRunner, the elements
monitored are temperature, fans and power supplies. When an out-of-
band occurrence is found, hwmon will reduce the time between probes
to 5 seconds. If a buzzer is the alarm device, then the buzzer
will turn on for 5 seconds then off for 5 seconds and repeat this
cycle until the buzzer is muted or the occurrence is corrected.
If the RaidRunner model supports a buzzer muting switch, then the
buzzer will be muted if the switch is pressed during a cycle
change as per the previous paragraph. When hwmon recognizes the
mute switch it will beep twice.
Certain out-of-band occurrences can be considered to be catastrophic,
meaning that if the occurrence remains uncorrected, the RaidRunner's
hardware is likely to be damaged. Occurrences such as total fan
failure, or sustained high temperature along with total or partial
fan failure, are considered catastrophic. hwmon has a means of
automatically placing the RaidRunner into a "shutdown" or quiescent
state where minimal power is consumed (and hence less heat is
generated). This is done by executing the shutdown command after a
period of time during which catastrophic out-of-band occurrences are
sustained. This process is enabled via the AutoShutdownSecs internal
variable. See the internals manual for use of this variable. hwmon
can be prevented from starting at boot time by creating the global
environment variable NoHwmon and setting any value to it. A warning
message will be stored in the syslog.
· OPTIONS:
  · -t seconds: Specify the number of seconds to wait between probes
    of the hardware elements. If this option is not specified, the
    default period is 300 seconds.
  · -d: Turn on debugging mode, which can produce debugging output.
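For example, to run hwmon with a 60 second probe period and debugging
output, or to stop it starting at the next boot (hwmon is normally
started automatically at boot time):

     hwmon -t 60 -d      # probe every 60 seconds, with debug output
     setenv NoHwmon 1    # prevent hwmon from starting at next boot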
· SEE ALSO: hwconf, pstatus, syslogd, shutdown, internals
13.32. INTERNALS - Internal variables used by RaidRunner to change
dynamics of running kernel
· DESCRIPTION: Certain run-time features of the RaidRunner can be
  manipulated by changing internal variables via the setiv command.
  The table below describes each changeable variable, its effect,
  its default value and the range of values it can be set to. The
  variables below are run-time features of a RaidRunner and hence
  are always set to their default values when a RaidRunner boots.
  Certain variables can be stored as a global environment variable
  and will override the defaults at boot time. If you create a
  global environment variable of that variable's name with an
  appropriate value, its default value will be overridden the next
  time the RaidRunner is re-booted. Note that the values of these
  variables ARE NOT CHECKED when set in the global environment
  variable tables and, if incorrectly set, will generate errors at
  boot until deleted or corrected. In the table below, any variable
  that can have a value stored as a global environment variable is
  marked with (GEnv).
· write_limit: This variable is the maximum number of 512-byte
  blocks the cache filesystem will buffer for writes. If this limit
  is reached, all writes to the cache filesystem will be blocked
  until the cache filesystem has written out (to its backend) enough
  blocks to reach a low water mark - write_low_tide. This variable
  cannot be changed if battery backed-up RAM is available, as it is
  tied to the amount of battery backed-up RAM available. The value
  of this variable is calculated when the cache is initialized. Its
  value depends on whether battery backed-up RAM is installed in the
  RaidRunner. If installed, the number of blocks of data that can be
  saved into the battery backed-up RAM is calculated. If no battery
  backed-up RAM is present, its value is set to 75% of the
  RaidRunner's memory (expressed as a count of 512-byte blocks),
  then adjusted to reflect the amount of cache requested by
  configured raid sets. When write_limit is changed, both
  write_high_tide and write_low_tide are automatically changed to
  their default values (a function of the value of write_limit).
· write_high_tide: This variable is a high water mark for the number
  of written-to 512-byte blocks in the cache. When the number of
  data blocks exceeds this value, to prevent the cache filesystem
  from blocking its front end, the cache flushing mechanism
  continually flushes the cache buffer until the amount of unwritten
  (to the backend) cache buffers is below the low water mark
  (write_low_tide). This value defaults to 75% of write_limit. This
  variable can have values ranging from write_limit down to
  write_low_tide. It is recommended that this variable not be
  changed.
· write_low_tide: This variable is a low water mark for when the
  cache flushing mechanism is continually flushing data to its
  backend. Once the number of written-to cache blocks yet to be
  flushed equals or falls below this value, the sustained flushing
  is stopped. This value defaults to 25% of write_limit. This
  variable can have values ranging from write_high_tide-1 down to
  zero (0). It is recommended that this variable not be changed.
· cache_nflush: This variable is the number of cache buffers (not
  512-byte data blocks) that the cache flushing mechanism will
  attempt to write out in one flush cycle. Adjusting this value may
  improve performance on writes depending on the size of the cache
  buffers and the type of disk drives used in the raid set backends.
  The default value is 128. Its value can range from 2 to 128.
· cache_nread: This variable is the number of cache buffers (not
  512-byte data blocks) that the cache reading mechanism will
  attempt to read in one read cycle. Adjusting this value may
  improve performance on reads depending on the size of the cache
  buffers and the type of disk drives used in the raid set backends.
  The default value is 128. Its value can range from 2 to 128.
· cache_wlimit: This variable is the number of cache buffers (not
  512-byte data blocks) that the cache flushing mechanism will
  attempt to coalesce into a single sequential write. It differs
  from cache_nflush in that cache_nflush is the total number of
  cache buffers that can be written in a single cache flush cycle,
  and those buffers can be non-sequential, whereas cache_wlimit is a
  limit on the number of sequential cache buffers that can be
  written with one write. Adjusting this value may improve
  performance on writes depending on the size of the cache buffers
  and the type of disk drives used in the raid set backends. The
  default value is 128. Its value can range from 2 to 128.
· cache_fperiod (GEnv): By default, the cache flushes any data to be
  written every 1000 milliseconds (unless it is forced to earlier
  because the cache is getting full, in which case it flushes the
  cache and resets the timer). You can vary this flushing period by
  setting this variable. If you have a large number of sustained
  reads and minimal writes, you may want to delay the writes out of
  cache to the backends as long as possible. Note that by setting
  this to a high value, you run the risk of losing what you have
  written. The default value is 1000 milliseconds (i.e. 1 second).
  Its value can range from 500ms to 300000ms.
· scsi_write_thru (GEnv): By default all writes (from a host) are
  buffered in the RaidRunner's cache and are flushed to the backend
  disks periodically. When battery backed-up RAM is available this
  results in the most efficient write throughput. If no battery
  backed-up RAM is available, or you do not want to depend on writes
  being saved in battery backed-up RAM in the event of a power
  failure, you can force the RaidRunner to write data straight
  through to the backends prior to returning an OK status to the
  host. This essentially provides a write-thru cache. The default
  value of this variable is 0 - write-thru mode is DISABLED. The
  values this variable can take are
  · 0 - DISABLE write-thru mode, or
  · 1 - ENABLE write-thru mode.
· scsi_write_fua (GEnv): This variable affects what is done when the
  FUA (Force Unit Access) bit is set on a SCSI WRITE-10 command.
  When this variable is enabled and a SCSI WRITE-10 command with the
  FUA bit set is processed, the data is written directly through the
  cache to the backend disks. If the variable is disabled, then the
  setting of the FUA bit on SCSI WRITE-10 commands is ignored. The
  default value for this variable is disabled (0) if battery backed-
  up RAM is present, or enabled (1) if battery backed-up RAM is NOT
  present. The values this variable can take are
  · 0 - IGNORE the FUA bit on SCSI WRITE-10 commands, or
  · 1 - ACT on the FUA bit on SCSI WRITE-10 commands.
· scsi_ierror (GEnv): This variable controls what is done when the
  RaidRunner receives an Initiator Detected Error message on a SCSI
  host channel. If set (1), cause a Check Condition. If NOT set
  (0), follow the SCSI-2 standard and re-transmit the Data In / Out
  phase. The default value is 0. The values this variable can take
  are
  · 0 - follow the SCSI-2 standard
  · 1 - ignore the SCSI-2 standard and cause a Check Condition.
· scsi_sol_reboot (GEnv): Determines whether to auto-detect a
  Solaris reboot and then clear any wide mode negotiations. If set
  (1), detect a Solaris reboot and clear wide mode. If NOT set (0),
  follow the SCSI-2 standard and do not clear wide mode. The default
  value is 0. The values this variable can take are
  · 0 - follow the SCSI-2 standard
  · 1 - ignore the SCSI-2 standard and clear wide mode.
· scsi_hreset (GEnv): Determines whether to issue a SCSI bus reset
  on host ports after power-on. If set (1), then a SCSI bus reset is
  done on the host port when starting the first smon/stargd process
  on that port. If NOT set (0), nothing is done. The default value
  is 0. The values this variable can take are
  · 0 - don't issue SCSI bus resets on power-on.
  · 1 - issue SCSI bus resets on power-on when the first smon/stargd
    process is started.
· scsi_full_log (GEnv): Determines whether or not stargd reports,
  via syslog, a Reset Check condition on Read, Write, Test Unit
  Ready and Start Stop commands. This reset check condition is
  always set when a RaidRunner boots or the raid detects a SCSI-bus
  reset. Note that this variable only suppresses the logging of this
  Check condition into syslog; it does not affect the response to
  the host of this or any Check condition. If set (1), then all
  stargd-detected reset Check condition error messages are logged.
  If NOT set (0), these messages are suppressed. The default value
  is 0. The values this variable can take are
  · 0 - suppress logging of these messages
  · 1 - log all messages.
· scsi_ms_badpage (GEnv): Determines whether or not stargd reports,
  via syslog, that it has received a non-supported page number in a
  MODE SENSE or MODE SELECT command it receives from a host. Note
  that stargd will issue the appropriate Check condition to the host
  ("Invalid Field in CDB") irrespective of the value of this
  variable. If set (1), then all stargd-detected non-supported page
  numbers in MODE SENSE and MODE SELECT commands will be logged. If
  NOT set (0), these messages are suppressed. The default value is
  0. The values this variable can take are
  · 0 - suppress logging of these messages
  · 1 - log all messages.
· scsi_bechng (GEnv): Determines whether or not the raid reports
  backend device parameter change errors. In a multi-controller
  environment, backends are probed and some of their parameters are
  changed by a booting controller. This will generate parameter
  change mode sense errors. If cleared (0), then parameter change
  errors will NOT be logged. If set (1), these messages are logged
  like any other backend error. The default value is 0. The values
  this variable can take are
  · 0 - suppress logging of these messages
  · 1 - log all messages.
· scsi_dnotch (GEnv): Some disk drives take an inordinate amount of
  time to perform mode select commands. One set of information a
  RaidRunner will obtain from a device backend is the disk notch
  pages (if present). As this is for information only, you can
  request that disk notches not be obtained, to reduce the boot time
  of a RaidRunner. If cleared (0), backend disk notch information is
  not probed for. If set (1), then backend disk notch information is
  probed for. The default value is 1. The values this variable can
  take are:
  · 0 - don't probe for notch pages
  · 1 - probe for notch pages
· scsi_rw_retries (GEnv): Specify the number of read or write
  retries to perform on a device backend before reporting an error
  on the given operation. Note that ALL retries are reported via
  syslog. The default value is 3. Its value can range from 1 to 9.
· scsi_errpage_r (GEnv): Specify the number of internal read retries
  that a disk backend is to perform before reporting an error (to
  the raid). Setting this variable causes the Read Retry Count field
  in the Read-Write Error Recovery mode sense page to be set. A
  value of -1 will cause the drive's default to be used. The default
  value is -1. Its value can range from -1 (use disk's default) or
  from 0 to 255.
· scsi_errpage_w (GEnv): Specify the number of internal write
  retries that a disk backend is to perform before reporting an
  error (to the raid). Setting this variable causes the Write Retry
  Count field in the Read-Write Error Recovery mode sense page to be
  set. A value of -1 will cause the drive's default to be used. The
  default value is -1. Its value can range from -1 (use disk's
  default) or from 0 to 255.
· BackFrank: Specify the SCSI-ID of the first rank of backend disks
  on a RaidRunner. This variable should never be changed and is for
  informative purposes only. The default value is dependent on the
  model of RaidRunner being run. The values this variable can take
  are
  · 0 - the first rank SCSI-ID will be 0
  · 1 - the first rank SCSI-ID will be 1
· raid_drainwait (GEnv): Specify the number of milliseconds a
  raidset is to delay before draining all backend I/Os when a
  backend fails. Setting this variable to a lower value will speed
  up the commencement of any error recovery procedures that would be
  performed on a raid set when a backend fails. The default value is
  500 milliseconds. Its value can range from 50 to 10000
  milliseconds.
· EnclosureType: Specify the enclosure type a raid controller is
  running within. This variable should never be changed and is for
  informative purposes only. The default value is dependent on the
  model of RaidRunner being run. The values this variable can take
  are integers starting from 0.
· fmt_idisc_tmo (GEnv): Specify the SCSI command timeout (in
  milliseconds) when a SCSI FORMAT command is issued on a backend.
  Disk drives take different amounts of time to perform a SCSI
  FORMAT command and hence a timeout is required to be set when the
  command is issued. As certain drives may take longer to format
  than the default timeout, you can change it. The default value is
  720000 milliseconds. Its value can range from 200000 to 1440000
  milliseconds.
· AutoShutdownSecs (GEnv): Specify the number of seconds the
  RaidRunner should monitor catastrophic hardware failures before
  deciding to automatically shut down. A catastrophic failure is one
  which will cause damage to the RaidRunner's hardware if not fixed
  immediately. Failures like all fans failing would be considered
  catastrophic. A value of 0 seconds (the default) will disable this
  feature; that is, with the exception of logging the errors, no
  action will occur. See the shutdown and hwmon commands for further
  details. The default value is 0 seconds. Its value can range from
  20 to 125 seconds.
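As an illustrative sketch (the exact prompt and output may differ),
write-thru mode could be enabled for the current session with setiv,
and made to persist across reboots by creating the corresponding
global environment variable:

     : raid; getiv scsi_write_thru     # show the current value
     : raid; setiv scsi_write_thru 1   # enable write-thru mode now
     : raid; setenv scsi_write_thru 1  # override the default at boot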
· SEE ALSO: setiv, getiv, syslog, setenv, printenv, hwmon, shutdown
13.33. KILL - send a signal to the nominated process
· SYNOPSIS: kill [-sig_name] pid
· DESCRIPTION: kill sends a signal to the process nominated by pid.
  If the pid is a positive number then only the nominated process is
  signaled. If the pid is a negative number then the signal is sent
  to all processes in the same process group as the process with the
  id of -pid. The sig_name switch is optional; if it is not given, a
  SIGTERM (software termination signal) is sent. If the sig_name
  switch is given then it should be one of the following (lower
  case) abbreviations. Only the first 3 letters need to be given for
  the signal name to be recognized. Following each abbreviation is a
  brief explanation and the signal number in brackets:
null - unused signal [0]
hup - hangup [1]
int - interrupt (rubout) [2]
quit - quit (ASCII FS) [3]
kill - kill (cannot be caught or ignored) [4]
pipe - write on a pipe with no one to read it [5]
alrm - alarm clock [6]
term - software termination signal [7]
cld - child process has changed state [8]
nomem - could not obtain memory (from heap) [9]
You cannot kill processes whose process id is between 0 and 5
inclusive. These are considered sacrosanct - hyena, init and console
reader/writers.
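For example (the process id 42 is hypothetical):

     kill 42         # no switch: send SIGTERM to process 42
     kill -hup 42    # send a hangup signal
     kill -kil 42    # only the first 3 letters of a name are needed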
· SEE ALSO: K9kill
13.34. LED - turn on/off LEDs on RaidRunner
· SYNOPSIS:
  · led
  · led led_id led_function
· DESCRIPTION: led uses the given led_id to identify the LED to
  manipulate based on the led_function. When no arguments are given,
  an internal LED register is printed along with the current
  function the onboard LEDs, led1 and led2, are tracing. If an
  undefined led_id is given, the led command silently does nothing
  and returns NULL. If an incorrect number of arguments or an
  invalid led_function is given, a usage message is printed.
  Depending on the RaidRunner model the led_id can be one of
  · led1 - LED1 on the RaidRunner controller itself
  · led2 - LED2 on the RaidRunner controller itself
  · Dc.s.l - Device on channel c, scsi id s, scsi lun l
  · status - the status LED on the RaidRunner
  · io - the io LED on the RaidRunner
and led_function can be one of
· on - turn on the given LED
· off - turn off the given LED
· ok - set the given LED to the defined OK state
· faulty - set the given LED to the defined FAULTY state
· warning - set the given LED to the defined WARNING state
· rebuild - set the given LED to the defined REBUILD state
· tprocsw - set the given LED to trace kernel process switching
· tparity - set the given LED to trace I/O parity generation
· tdisconn - set the given LED to trace host interface disconnect
  activity
· pid - set the given LED to trace the process pid as it runs
Different models of RaidRunner differ in the number of LEDs and their
functionality. Depending on the type of LED, the ok, faulty, warning
and rebuild functions perform different functions. See your
RaidRunner's Hardware Reference manual to see what LEDs exist and
what the different functions do.
NOTES: Tracing activities can only occur on the `onboard' LEDs (LED1,
LED2).
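For example (the device id D0.1.0 is illustrative; the LEDs available
vary by model):

     led                 # print the LED register and traced functions
     led led1 faulty     # set LED1 to the defined FAULTY state
     led D0.1.0 rebuild  # mark the device at 0.1.0 as rebuilding
     led led1 off        # turn LED1 off again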
SEE ALSO: lflash
13.35. LFLASH - flash a LED on RaidRunner
· SYNOPSIS: lflash led_id period
· DESCRIPTION: lflash uses the given led_id to identify the LED to
  flash every period seconds. If an undefined led_id is given, the
  lflash command silently does nothing and returns NULL. Depending
  on the RaidRunner model the led_id can be one of:
led1 - LED1 on the RaidRunner controller itself
led2 - LED2 on the RaidRunner controller itself
Dc.s.l - Device on channel c, scsi id s, scsi lun l
status - the status LED on the RaidRunner
io - the io LED on the RaidRunner
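For example, to flash the status LED every 2 seconds (the minimum
period, per the note below):

     lflash status 2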
· NOTE: The number of seconds must be greater than or equal to 2.
· SEE ALSO: led
13.36. LINE - copies one line of standard input to standard output
· SYNOPSIS: line
· DESCRIPTION: line accomplishes the one-line copy by reading up to
  a newline character followed by a single K9write.
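A minimal sketch, assuming husky's backquote substitution as used in
the llength examples below:

     set first `line'    # read one line from standard input
     echo $first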
· SEE ALSO: K9read, K9write
13.37. LLENGTH - return the number of elements in the given list
· SYNOPSIS: llength list
· DESCRIPTION: llength returns the number of elements in a given
  list.
· EXAMPLES: Some simple examples:
     set list D1 D2 D3 D4 D5 # create the list
     set len `llength $list' # get its length
     echo $len
     5
     set list {D1 D2 D3 D4 D5} {D6 D7} # create the list
     set len `llength $list' # get its length
     echo $len
     2
     set list {} # create an empty list
     set len `llength $list' # get its length
     echo $len
     0
13.38. LOG - like zero with additional logging of accesses
· SYNOPSIS: bind -k {log fd error_rate tag} bind_point
· DESCRIPTION: log is a special file that, when written to, is an
  infinite sink of data (i.e. anything can be written to it and it
  will be disposed of quickly). When log is read it is an infinite
  source of zeros (i.e. the byte value 0). The log file will appear
  in the K9 namespace at the bind_point. Additionally, ASCII log
  data is written to the file associated with file descriptor fd.
  error_rate should be a number between 0 and 100 and is the
  percentage of errors (randomly distributed) that will be reported
  (as an EIO error) to the caller. Each line written to fd will have
  tag appended to it. There is one line output to fd for each IO
  operation on the log special file. The first character output is
  "R" or "W", indicating a read or a write. The second character is
  blank if no error was reported and "*" if one was reported. Next
  (after white space) is a (64-bit integer) offset into the file of
  the start of the operation, followed by the size (in bytes) of
  that operation. The line finishes with the tag.
· EXAMPLE: Bind a log special file at "/dev/log" that writes log
  information to standard error. Each line written to standard error
  has the tag string "scsi" appended to it. Approximately 30% of
  reads and writes (randomly distributed) return an EIO error to the
  caller. This is done as follows:
bind "log 2 30 scsi" /dev/log
dd if=/dev/zero of=/dev/log count=5 bs=512
W 0000000000 512 scsi
W 0000000200 512 scsi
W 0000000400 512 scsi
W* 0000000600 512 scsi
Write failed.
4+0 records in
3+0 records out
SEE ALSO: zero
13.39. LRANGE - extract a range of elements from the given list
· SYNOPSIS: lrange first last list
· DESCRIPTION: lrange returns a list consisting of elements first
  through last of list. 0 refers to the first element in the list.
  If first is greater than last then the list is extracted in
  reverse order.
· EXAMPLES: Some simple examples:
set list D1 D2 D3 D4 D5 # create the list
set subl `lrange 0 3 $list' # extract from indices 0 to 3
echo $subl
D1 D2 D3 D4
set subl `lrange 3 1 $list' # extract from indices 3 to 1
echo $subl
D4 D3 D2
set subl `lrange 4 4 $list' # extract the element at index 4
echo $subl                  # equivalent to get 4 $list
D5
set subl `lrange 3 100 $list'
echo $subl
D4 D5
13.40. LS - list the files in a directory
· SYNOPSIS: ls [ -l ] [ directory... ]
· DESCRIPTION: ls lists the files in the given directory on standard
  out. If no directory is given then the root directory (i.e. "/")
  is listed. Each file name contained in a directory is put on a
  separate line. Each listing has a lead-in line stating which
  directory is being shown. If there is more than one directory then
  they are listed sequentially, separated by a blank line. If the
  "-l" switch is given then every listed file has data such as its
  length and the file system it belongs to shown on the same line as
  its name. See the stat command for more information. ls is not an
  inbuilt command but a husky script which utilizes cat and stat.
  The script source can be found in the file "/bin/ls".
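For example (directory names depend on your configuration):

     ls /bin        # list the files in /bin
     ls -l /dev/hd  # also show each file's length and filesystem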
· SEE ALSO: cat, stat
13.41. LSEARCH - find a pattern in a list
· SYNOPSIS: lsearch pattern list
· DESCRIPTION: lsearch returns the index of the first element in
  list that matches pattern, or -1 if none. 0 refers to the first
  element in the list.
· EXAMPLES: Some simple examples:
set list D1 D2 D3 D4 D5 # create the list
set idx `lsearch D4 $list' # get index of D4 in list
echo $idx
3
set idx `lsearch D1 $list' # get index of D1 in list
echo $idx
0
set idx `lsearch D8 $list' # get index of D8 in list
echo $idx                  # D8 is not in the list, so -1
-1
13.42. LSUBSTR - replace a character in all elements of a list
· SYNOPSIS: lsubstr find_char replacement_char list
· DESCRIPTION: lsubstr returns a list replacing every find_char
  character found in any element of the list with the
  replacement_char character. replacement_char can be NULL which
  effectively deletes all find_char characters in the list.
· EXAMPLES: Some simple examples:
set list D1 D2 D3 D4 D5 # create the list
set subl `lsubstr D x $list' # replace all D's with x's
echo $subl
x1 x2 x3 x4 x5
set subl `lsubstr D {} $list' # delete all D's
echo $subl
1 2 3 4 5
set list {{-L}} {{-16}}        # create a list with embedded braces
set subl `lsubstr { {} $list'  # delete all open braces
set subl `lsubstr } {} $subl'  # delete all close braces
echo $subl
-L -16
13.43. MEM - memory mapped file (system)
· SYNOPSIS: bind -k {mem first last [ r ]} bind_point
· DESCRIPTION: mem allows machine memory to be accessed as a single
  K9 file (rather than a file system). The host system's memory is
  used, starting at the first memory location up to and including
  the last memory location. Both first and last need to be given in
  hexadecimal. If successful the mem file will appear in the K9
  namespace at the bind_point. The stat command will show it as a
  normal file with the appropriate size (i.e. last - first + 1). If
  the optional "r" is given then only read-only access to the file
  is permitted. In a target environment mem can usefully associate
  battery backed-up RAM (or ROM) with the K9 namespace. In a Unix
  environment it is of limited use (see unixfd instead). In a DOS
  environment it may be useful to access memory directly (IO space),
  but for accessing the DOS console see doscon. When mem is
  associated with the partition of Flash RAM that stores the husky
  scripts (which are stored compressed), reading from that page will
  automatically decompress the data as it is read. When mem is
  associated with one of the writable partitions of Flash RAM
  (configuration partition, husky script partition and main binary
  partition), a write to the start of the partition will erase that
  partition.
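For example, to make a region of memory readable as a K9 file (the
addresses and bind point are purely illustrative):

     bind -k {mem a0000 a7fff r} /dev/bbram  # read-only access
     stat /dev/bbram                         # a normal file, size 0x8000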
· SEE ALSO: ram
· BUGS: Only a single file rather than a file system can be bound.
13.44. MDEBUG - exercise and display statistics about memory
allocation
· SYNOPSIS: mdebug [off|on|trace|p|m size|f ptr|c nel elsize|r ptr
  size]
· DESCRIPTION: mdebug can be used to directly allocate and free
  memory. mdebug will also print (to standard output) information
  about the current state of memory allocation. Without any given
  options a brief five line summary of memory usage is printed, e.g.
: raid; mdebug
Mdebug is off
nreq-nfree=87096-82951=4145(13905745)
size=15956672/16150000
waste=1%/2%
list=4251/8396
: raid;
The first line indicates the debug mode, either off, on or trace.
The second line indicates the number of times a request for memory
has been made (to Mmalloc() or Mcalloc() and related functions) and
the number of times the memory allocator has been called to free
memory (via Mfree()). The difference between these first two numbers
is the total number of currently allocated blocks of memory, with the
number between the '(' and ')' being the total memory requested. Note
that the amount of memory actually assigned may be more than
requested. The third line indicates the amount of memory being
managed. The second number is the total memory managed (i.e. left
over after loading the statically allocated text, data and bss
space). The first number is what is left over after various memory
allocation tables have been subtracted from that aforementioned
number. The fourth line is the total amount of extra memory assigned
to requests in excess of the actual requested memory, as compared
with the totals on line 3. The fifth line relates to the list of
currently allocated memory. The first number is the number of free
entries left and the second is the maximum table size. Note that the
number of currently allocated blocks (third number on line 2) when
added to the first number on line 5 gives the second number on line
5.
· OPTIONS:
  · p: Prints the above mentioned five line summary and then the
    free list.
  · P: Prints all the above plus dumps the list of currently
    allocated memory.
  · PP: Prints all the above plus the free bitmap.
    The above three options can generate copious output and require
    a detailed knowledge of the source to understand their meaning.
  · off: Turns off memory allocation debugging. This is the default
    condition after booting.
  · on: Turns on memory allocation assertion checking.
  · trace: Turns on memory allocation assertion checking and traces
    every memory allocation / deallocation.
  · m: Uses Mmalloc() to allocate a block of memory of size bytes.
  · f: Uses Mfree() to de-allocate a block of memory addressed by
    ptr.
  · c: Uses Mcalloc() to allocate a contiguous block of memory
    consisting of nel elements each of elsize bytes.
  · r: Uses Mrealloc() to re-allocate a block of previously
    allocated memory, ptr, changing the allocated size to be size
    bytes.
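For instance, the allocator can be exercised directly as follows (a
sketch; the address passed to f must be one previously returned by an
allocation, and is shown here purely illustratively):

     : raid; mdebug on          # turn on assertion checking
     : raid; mdebug m 1024      # allocate a 1024-byte block
     : raid; mdebug f 0x123456  # free the block at that address
     : raid; mdebug off         # return to the default mode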
SEE ALSO: Unix man pages on malloc()
13.45. MKDIR - create directory (or directories)
· SYNOPSIS: mkdir [ directory_name ... ]
· DESCRIPTION: mkdir creates the given directory (or directories).
  If all the given directories can be created then NIL is returned
  as the status; otherwise the first directory that could not be
  created is returned (and this command will continue trying to
  create directories until the list is exhausted). A directory
  cannot be created with a file name that already exists in the
  enclosing directory.
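For example (the directory names are arbitrary):

     mkdir /scratch /logs  # create two directories
     ls /                  # both now appear in the root listing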
13.46. MKDISKFS - script to create a disk filesystem
· SYNOPSIS: mkdiskfs disk_directory_root disk_name
· DESCRIPTION: mkdiskfs is a husky script which is used to perform
  all the necessary commands to create a disk filesystem given the
  root of the disk file system and the name of the disk.
· OPTIONS:
  · disk_directory_root: Specify the directory root under which the
    disk filesystems are bound. This is typically /dev/hd.
  · disk_name: Specify the name of the disk in the format Dc.s.l
    where c is the channel, s is the scsi id (or rank) and l is the
    scsi lun of the disk.
After parsing its arguments mkdiskfs creates the disk filesystem's
bind point and binds in the disk at that point.
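For example, to create the filesystem for the disk at channel 0, SCSI
ID 1, LUN 0 under the usual root:

     mkdiskfs /dev/hd D0.1.0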
SEE ALSO: rconf, scsihdfs
13.47. MKHOSTFS - script to create a host port filesystem
· SYNOPSIS: mkhostfs controller_number host_port host_bus_directory
· DESCRIPTION: mkhostfs is a husky script which is used to perform
  all the necessary commands to create a host port filesystem on the
  given RaidRunner controller, given the root of the host port file
  systems and the host port number.
· OPTIONS:
  · controller_number: Specify the controller on which the host port
    filesystem is to be created.
  · host_port: Specify the host port number to create the filesystem
    for.
  · host_bus_directory: Specify the directory root under which host
    filesystems are bound. This is typically /dev/hostbus.
After parsing its arguments mkhostfs finds out what SCSI ID the host
port is to present (see hconf) and then binds in the host filesystem.
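For example, to create the filesystem for host port 0 on controller 0
under the usual root:

     mkhostfs 0 0 /dev/hostbus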
· SEE ALSO: hconf, scsihpfs
13.48. MKRAID - script to create a raid given a line of output of
rconf
· SYNOPSIS: mkraid `rconf -list RaidSetName'
· DESCRIPTION: mkraid is a husky script which is used to perform all
  the necessary commands to create and enable host access to a given
  Raid Set. The arguments to mkraid are a line of output from a
  rconf -list command. After parsing its arguments mkraid checks to
  see if a reconstruction was being performed when the RaidRunner
  was last operating, and if so, notes this. It then creates the
  raid filesystem (see mkraidfs) and adds a cache frontend to the
  raid filesystem. It then creates the required host filesystems
  (see mkhostfs) and finally, if a reconstruction had been taking
  place when the RaidRunner was last operating, it restarts the
  reconstruction.
· NOTE: This husky script DOES NOT enable target access (stargd) to
  the raid set it creates.
· SEE ALSO: rconf, mkraidfs, mkhostfs
13.49. MKRAIDFS - script to create a raid filesystem
· SYNOPSIS: mkraidfs -r raidtype -n raidname -b backends [-c iosize]
  [-i iomode] [-q qlen] [-v] [-C capacity] [-S]
· DESCRIPTION: mkraidfs is a husky script which is used to perform
  all the necessary commands to create a Raid filesystem.
· OPTIONS:
  · -r raidtype: Specify the raid type as raidtype for the raid set.
    Must be 0, 1, 3 or 5.
  · -n raidname: Specify the name of the raid set as raidname.
  · -b backends: Specify the comma-separated list of the raid set's
    backends in the format used by rconf.
  · -c iosize: Optionally specify the IOSIZE (in bytes) of the raid
    set.
  · -i iomode: Optionally specify the raid set's iomode - read-
    write, read-only or write-only.
  · -q qlen: Optionally specify the raid set's queue length for each
    backend.
  · -v: Enable verbose mode which prints out the main actions
    (binding, engage commands) as they are performed.
  · -C capacity: Optionally specify the raid set's size in 512-byte
    blocks.
  · -S: Optionally specify that spares pool access is required
    should a backend fail.
After parsing its arguments mkraidfs creates the Raid Set's backend
filesystems, typically disks (see mkdiskfs), taking care of failed
backends. It then binds in the raid filesystem and engages the
backends into the filesystem. If spares access is requested, it
enables the autorepair feature of the raid set.
SEE ALSO: rconf, mkraidfs, mkhostfs, mkdiskfs, raid[0135]fs
13.50. MKSMON - script to start the scsi monitor daemon smon
· SYNOPSIS: mksmon controllerno hostport scsi_lun protocol_list
· DESCRIPTION: mksmon is a husky script which is used to perform all
  the necessary commands to start the scsi monitor daemon smon,
  given the controller number, host port, scsi lun, and the block
  protocol list. Typically, mksmon is run with its arguments taken
  from the output of a mconf -list command.
· OPTIONS:
  · controllerno: Specify the controller on which the scsi monitor
    daemon is to be run.
  · hostport: Specify the host port through which the scsi monitor
    daemon communicates.
  · scsi_lun: Specify the SCSI LUN the scsi monitor daemon is to
    respond to.
  · protocol_list: Specify the comma-separated block protocol list
    the scsi monitor daemon is to implement.
After parsing its arguments mksmon checks to see if it is already
running; if so, it issues a message and exits. Otherwise, it creates
the host filesystem (mkhostfs), creates a memory file and a set of
fifos for smon to use, and finally starts smon.
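Following the pattern of mkraid's synopsis, the arguments would
typically be taken straight from the monitor configuration (a sketch;
the exact form of mconf's output may differ):

     mksmon `mconf -list'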