Past Echos the Future (Part II) - O350 / Chimera Stuff...

Notes on vintage SGI computers specific to Chimera Architecture machines including: Origin / Onyx 350 (O350), Onyx4, Tezro & Fuel

Status: 29 July 2021 - Split Chimera material from "Past Echos the Future (Part I)".

In "playing" about with vintage SGI computers, I used my blog "Past Echos the Future (Part I) - Notes on SGI / IRIX Stuff" to record various software setup and hardware modifications and fixes.

This blog entry has become long and sprawling .... so I am splitting it into two or more parts so it is easier to find my own notes.

As per the title, this part is focused on Origin / Onyx 350 (O350), Tezro and Fuel Chimera architecture based machines. I will continue to update the original blog for more general and IRIX related material.


L3 Controller & L2 Emulator Linux Setup

The SGI Origin and Onyx 3000 / 300 series machines have L1 (level 1) controllers on each compute node. These can be connected (via USB) to a single L2 (level 2) controller, and L3 (level 3) controllers can access the L2 via the network.

For remote booting and configuration of a set of nodes you can use a hardware L3 controller (SGI console), a Linux L3 controller, or simply telnet to the L2 controller.

The Linux L3 software is available at a number of places on the web, including here.

The Linux controller requires a very early Linux version (I used Fedora 1, Yarrow) with kernel version 2.4 for USB to work.

I had no success getting the USB to work with VMware, and you need the USB to run the L2 emulator as this expects to connect to machines via USB. As I could not get this running on VMware I have installed it on an old PC (which must have a USB port).

One day I will try with KVM / QEMU ... (see here for progress on this)

NOTE: All of the L2 & host L1 RS-232 serial ports are set to: 38,400 - 8 bits, No Parity, 1 stop bit (38,400 8-N-1)


Recovering a bricked L2 Controller following IRIX 6.5.30 Update

As described earlier, the SGI "big iron" machines have L1 & L2 controllers. There is a bug in IRIX 6.5.30 (L2 firmware version 1.44) which means that if you do the IRIX update and include the L2 update then it will brick the L2 controller. Connecting to the bricked L2 you will see that it is failing because it cannot find a file.

This information has been updated after checking it against a working L2 controller. I have also recently re-verified it on a "bricked" L2 from another SGI enthusiast. If you have problems with the instructions then please let me know.

Here is the way to recover the bricked L2:

1. Connect to the L2 via its serial (console) port.

2. Get into the L2 OS (which is a PowerPC BusyBox implementation) by using the "shell" command or just "!".

3. Change directory into /tmp (which is writable) and create a new mount point: "cd /tmp" & "mkdir TMPLIB"

4. Create a new in-memory (temporary) file system and mount it on your new directory: "mount -t tmpfs -o size=800k tmpfs /tmp/TMPLIB"

5. Copy the required sub-directory contents onto the temporary mounted file system: "cp /stand/sysco/lib/* /tmp/TMPLIB"

6. Assign (ifconfig) your L2 an IP address and, if required, a default route (depending on whether your http server is on the local subnet or not): "ifconfig eth0 XXX.XXX.XXX.XXX netmask 255.255.255.0 broadcast XXX.XXX.XXX.255" (assuming /24) & "route add default gw XXX.XXX.XXX.XXX eth0"

7. Get the missing file from a web server via http (or from a tftp server) and save it into your new temporary lib directory. The missing file is libscan.ppclinux.so, which is here. Change into the directory first so the download lands there: "cd /tmp/TMPLIB" & "wget http://XXX.XXX.XXX.XXX/<LOC>/libscan.ppclinux.so" (note I am using IP addresses as I have not configured DNS in this case).

8. Create another in-memory (temporary) file system and mount it on top of the existing "/stand/sysco/lib" directory: "mount -t tmpfs -o size=800k tmpfs /stand/sysco/lib", then copy all the TMPLIB files over to this version: "cp /tmp/TMPLIB/* /stand/sysco/lib"

9. Unmount "TMPLIB" to free memory and exit the shell: "cd /" & "umount /tmp/TMPLIB"

10. You should now be able to proceed with fixing your L2 by doing a flashsc reflash from a USB-connected IRIX host using PATCH SG0007149, which has L2 firmware 1.48.

NOTE 1: See man flashsc for flashing instructions

NOTE 2: My other posting on this, which outlines the basic steps

NOTE 3: There might be more efficient ways of getting an updated mount on "/stand/sysco/lib" using mount / remount options. If you know of any then please provide feedback so I can update this.

NOTE 4: This "hack" was originally documented on nekochan by Pymble Software.

NOTE 5: L2 Console port is: 38400,8,N,1
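
Since the sequence above is easy to mistype at a serial console, here is a minimal sketch of a helper to run on your workstation (not on the L2) which prints the commands with your own addresses filled in, ready to paste at the L2 shell prompt. The function name l2_recovery_commands and the example addresses are my own placeholders. It assumes a /24 network as in the ifconfig step, changes into /tmp/TMPLIB before the wget so the file lands in the right place, and changes out again before the umount so the mount point is not busy.

```shell
#!/bin/sh
# Print the L2 recovery command sequence for pasting at the L2 shell prompt.
# Usage: l2_recovery_commands <l2_ip> <gateway_ip> <url_of_libscan>
# Assumption: /24 network, so the broadcast address is the L2 IP ending in .255.
l2_recovery_commands() {
    l2_ip=$1
    gateway=$2
    url=$3
    broadcast="${l2_ip%.*}.255"   # replace the last octet with 255 (/24 only)

    # The here-document only substitutes the variables; the "*" globs are
    # printed literally so that they expand on the L2, not on the workstation.
    cat <<EOF
cd /tmp
mkdir TMPLIB
mount -t tmpfs -o size=800k tmpfs /tmp/TMPLIB
cp /stand/sysco/lib/* /tmp/TMPLIB
ifconfig eth0 $l2_ip netmask 255.255.255.0 broadcast $broadcast
route add default gw $gateway eth0
cd /tmp/TMPLIB
wget $url
mount -t tmpfs -o size=800k tmpfs /stand/sysco/lib
cp /tmp/TMPLIB/* /stand/sysco/lib
cd /
umount /tmp/TMPLIB
exit
EOF
}
```

For example, "l2_recovery_commands 192.168.1.50 192.168.1.1 http://192.168.1.10/sgi/libscan.ppclinux.so" prints the thirteen commands for an L2 at 192.168.1.50. Drop the "route add" line if your web server is on the same subnet.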


Recovering an L2 Controller with an L3 Controller

Another potential failure with L2 is that the image gets corrupted. This happened to me when the L2 power supply failed part way through doing a recovery from 6.5.30 bricked L2 (as per above procedure).

In this case, when you connect to the L2 console port it will boot and indicate that it is in recovery mode, and that you need to connect it to an L3 controller via the Console port and run the l2recover program:

Validating L2 Controller Flash image....FAILED!
INVALID IMAGE HEADER CHECKSUM (C55DDEBE)


FATAL ERROR!!!   Your L2 controller binary image is corrupted!!

You must perform the L2 Controller flash recovery sequence, which
allows you to download a new L2 controller image via the Console
serial connection on the L2 controller

The typical recovery sequence is:
1) Attach a serial (null modem) cable from the L2 Controller serial port
   marked "Console", to the serial port on the L3 Controller.
2) Disconnect any terminal program that may be connected to the
   serial port.
3) Execute the l2recover command on the L3 controller:
     /usr/cpu/firmware/sysco/l2recover /usr/cpu/firmware/sysco/l2.bin
4) When the command completes, the L2 should reboot.  If it does not
   then power cycle the L2 to reboot it.

So what is an L3 controller ??

It can just be your Tezro, Fuel or O350 IRIX machine, a null modem cable and the flashsc program.

I did this via Fuel using Serial Port #2 (/dev/ttyd2) plugged into the L2 serial port: "cd /usr/cpu/firmware/sysco" & "flashsc -l2recover --dev /dev/ttyd2 l2.bin"


IRIX / L1 Software Versions

The following log captures the version of L1 software that is provided by the particular IRIX release (note that L2 versions follow L1 versions):

---
--- IRIX 6.5.21
---
# ./flashsc -v l1.bin
./flashsc: (System Controller Flash Utility) - Version 1.0.7
Multi-image binary contains 2 flash images.
Image 0: L1 version 1.22.2, Built 06/17/2003 10:58:26  [1MB image]
Image 1: L1 version 1.22.2, Built 06/17/2003 10:59:41  [2MB image]
---
--- IRIX 6.5.22
---
# ./flashsc -v l1.bin
./flashsc: (System Controller Flash Utility) - Version 1.0.7
Multi-image binary contains 3 flash images.
Image 0: L1 version 1.24.8, Built 09/15/2003 17:07:44  [Base 1MB image]
Image 1: L1 version 1.24.8, Built 09/15/2003 17:08:18  [Fuel/PE 1MB image]
Image 2: L1 version 1.24.8, Built 09/15/2003 17:08:38  [2MB image]
---
--- IRIX 6.5.25
---
# ./flashsc -v l1.bin
./flashsc: (System Controller Flash Utility) - Version 1.2.1
Multi-image binary contains 3 flash images.
Image 0: L1 version 1.30.6, Built 06/16/2004 14:54:58  [Base 1MB image]
Image 1: L1 version 1.30.6, Built 06/16/2004 14:56:19  [Fuel/PE 1MB image]
Image 2: L1 version 1.30.6, Built 06/16/2004 14:56:38  [2MB image]
---
--- IRIX 6.5.29
---
# ./flashsc -v l1.bin
./flashsc: (System Controller Flash Utility) - Version 1.3.8
Multi-image binary contains 5 flash images.
Image 0: L1 version 1.40.5, Built 12/05/2005 14:00:44  [Base 1MB image]
Image 1: L1 version 1.40.5, Built 12/05/2005 14:01:22  [Fuel/PE/O300 1MB image]
Image 2: L1 version 1.40.5, Built 12/05/2005 14:01:32  [MIPS 2MB image]
Image 3: L1 version 1.40.5, Built 12/05/2005 14:01:53  [2MB image]
Image 4: L1 version 1.40.5, Built 12/05/2005 14:03:27  [Linux L1 image]
---
--- IRIX 6.5.30 (Avoid ...)
---
# ./flashsc -v l1.bin
./flashsc: (System Controller Flash Utility) - Version 1.4.1
Multi-image binary contains 7 flash images.
Image 0: L1 version 1.44.0, Built 07/17/2006 18:19:54  [Base 1MB image]
Image 1: L1 version 1.44.0, Built 07/17/2006 18:20:38  [Fuel/PE/O300 1MB image]
Image 2: L1 version 1.44.0, Built 07/17/2006 18:20:50  [MIPS 2MB image]
Image 3: L1 version 1.44.0, Built 07/17/2006 18:21:13  [Legacy 2MB image]
Image 4: L1 version 1.44.0, Built 07/17/2006 18:23:16  [Legacy Linux L1 image]
Image 5: L1 version 1.44.0, Built 07/17/2006 18:21:46  [2MB image]
Image 6: L1 version 1.44.0, Built 07/17/2006 18:23:57  [Linux L1 image]
pink 13# l1cmd ver
L1 1.48.1 (Image A), Built 01/22/2007 11:34:20    [Fuel/PE/O300 1MB image]
---
--- PATCH SG0007149
---
# flashsc -v l1.bin
flashsc: (System Controller Flash Utility) - Version 1.4.1
Multi-image binary contains 7 flash images.
Image 0: L1 version 1.48.1, Built 01/22/2007 11:33:34  [Base 1MB image]
Image 1: L1 version 1.48.1, Built 01/22/2007 11:34:20  [Fuel/PE/O300 1MB image]
Image 2: L1 version 1.48.1, Built 01/22/2007 11:34:34  [MIPS 2MB image]
Image 3: L1 version 1.48.1, Built 01/22/2007 11:34:57  [Legacy 2MB image]
Image 4: L1 version 1.48.1, Built 01/22/2007 11:36:34  [Legacy Linux L1 image]
Image 5: L1 version 1.48.1, Built 01/22/2007 11:35:27  [2MB image]
Image 6: L1 version 1.48.1, Built 01/23/2007 10:17:58  [Linux L1 image]

NOTE: The IRIX 6.5.30 L2 firmware is known to brick the L2 controller, and it has been reported (not confirmed) that the 6.5.30 L1 firmware has issues with Fuel.
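
To compare releases at a glance, the "flashsc -v" output above can be boiled down to one line per image. This is only a sketch: the function name l1_image_versions is my own, and it assumes the "Image N: L1 version X.Y.Z, Built ... [label]" layout shown in the logs above and an awk that supports sub() (on some systems that means nawk).

```shell
#!/bin/sh
# Summarise "flashsc -v l1.bin" output read from stdin.
# Prints one line per flash image: <L1 version><TAB><image label>
l1_image_versions() {
    awk '/^Image [0-9]+: L1 version/ {
        ver = $5                      # fifth field, e.g. "1.48.1,"
        sub(/,$/, "", ver)            # drop the trailing comma
        label = $0
        sub(/^.*\[/, "", label)       # keep the text after "["
        sub(/\].*$/, "", label)       # drop "]" and anything after it
        print ver "\t" label
    }'
}
```

Typical use would be "flashsc -v l1.bin | l1_image_versions", or pasting a captured log into it on another machine.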


Causes of "TLB Refill Exception" Error

The "TLB Refill Exception" error results in a system crash and leaves the machine in POD/CAC mode on the console. The error looks like this:

A 000 001c01: *** TLB Refill Exception on node 0
A 000 001c01: *** EPC: 0xa80000000129c874 (0xa80000000129c874)
A 000 001c01: *** Press ENTER to continue.

There appear to be three causes for this alarming failure:

  • Booting with an old IRIX release - some SGI machines were introduced after the initial and earlier IRIX 6.5 releases. You need to make sure you are using an IRIX release that includes support for your machine type. If you use IRIX 6.5.22 or higher then all machine types should be covered.
  • Using a newer O350 IP59_4CPU system board with an older L1 version - to address this you will need to put in an older system board (IP53_4CPU for example), boot, update the L1 SW version and then put in the new system board (see below for more information).
  • Trying to do a SW install on Fuel via the system console port - in this case you can get to the PROM and start the install, but on starting "inst" the machine crashes. The remedy is to first go into the PROM command prompt and set the console environment variable to "d" (instead of "g" for "graphics"), for example with "setenv console d". You can then do the install via the console. Also be sure not to have a keyboard/mouse plugged in, as the PROM appears to take this as an indication of "server" rather than "workstation". Once you have a working system installed you will likely need to do a fuller installation to get the graphics system running, and remember to reset the NVRAM variable back to "console=g".

NOTE #1: See this thread on Irix Network for discussion of the Fuel console "TLB Refill Exception" case

NOTE #2: Here is the nekochan thread on the 1.44.0 (6.5.30) L1 issue with Fuel


L1 & L2 Chimera & Dallas Chip Tips

The SGI Origin 350, Onyx 350, Onyx 4, (Fuel ?) & Tezro all use a variation of the "Chimera" system board. The non-graphics servers were code named "Chimera Server" and the Onyx and Tezro graphics machines "Chimera Blade".

This means that they all have L1 controllers and can be managed via an L2 controller.

In fact you can flip a machine's identity by changing its serial number, turning rackmount Tezros into Origins and Origins into rackmount Tezros.

You can also swap "Dallas" chips across from Origins / Tezros into Numalink Routers to get around the security of Numalink Router serials. This allows you to create new and consistent serial numbers across a set of "Chimera" hosts to build up an Origin/Onyx 350 multi-chassis Numalink'ed machine.

To do this you will need to be willing to test various L1 / L2 controller serial and configuration options, some of which might cause a problem with your machine. Most of these are recoverable, but the majority of the information covering the various failure and fix scenarios was documented on "Nekochan".

Here are some clues to potential problems and fixes:

  • "TLB Refill Exception" - I got this when I was running an older version of L1 software with a newer (IP59_4CPU - 4 x 1 GHz) Chimera board. The resolution was to put in an older (IP53_4CPU - 4 x 700/800 MHz) version of the Chimera board and then update the L1 version, before putting back the newer board.
  • Disabled CPU - another problem is that the machine boots but has disabled CPUs, possibly due to the above problem, which results in the board being disabled, and you cannot re-enable it via the regular PROM monitor command "enableall". In this case you need to boot the machine into POD/DEX/CAC (Power-On Diagnostic / Dirty EXclusive / CAChed) mode by setting the debug flags via L1. This allows you to bypass the PROM boot and hence the disabled CPUs. The debug flag is: "debug 0x10d". This opens up a whole new arcane world of tweaking. The required sequence to revive the machine is:
      1. Make sure you are directly connected to the required machine's console serial port (38,400-8-N-1), as it is not possible to do this via L2
      2. Set the debug flag: "debug 0x10d" (see "More on L1 Debug Flags" below)
      3. Power up: "power up"
      4. Enter DEX mode: "go dex"
      5. Enter CAC mode: "go cac"
      6. Clear the logs: "clearalllogs"
      7. Reinitialise the logs: "initalllogs"
      8. Flush the buffers: "flush"
      9. Escape back to L1 (Ctl-T) and reset debug to 0: "debug 0"
      10. Return to the console and do a reset: "reset"

Essentially what this is doing is clearing the fault log which resulted in the CPU being disabled.

See example POD/DEX/CAC L1 session further below.

I think the only reference to this is in this "Nekonomicon" trace ... but some of this is in the following SGI document "Hardware Quick-reference Booklet (Origin and Onyx2 Series) - HMQ-380-C" see page 174 for "POD Mode Commands".

For tips on flipping Numalink Router serials see pymblesoft.com blog. I used the chip swapping method outlined there by swapping out Numalink Dallas chip and replacing it with one from a Tezro...

See here for pictures of the Dallas DS1742W-120 orientation in the O350 (IP53) and Numalink Router, which should come in handy for those having to do this.

SGI Numalink Router - Dallas DS1742W-120 Orientation
SGI Onyx4 / Origin 350 (Chimera IP53) - Dallas DS1742W-120 Orientation (Left Front / Right Back)
Tezro - Dallas DS1742W-120 (Top Front / Bottom Back)
Fuel - Dallas DS1724-120 (Left is Rear near PCI Slots)

More on L1 Debug Flags ..

The L1 "debug" flags provide a way to control the machine boot process. The original flag values come from a combination of physical and virtual "Dip Switch" settings.

For Chimera machines these are all virtual and can be controlled via the L1 "debug" command. To help with getting valid debug flags (and avoid need to always go to doco and then convert values to hex) I created a little "Dip Switch Calculator" with MS Excel:

Chimera "Dip Switch Calculator" - works for O350, Tezro and Fuel

The values are documented via "man prom" and more completely in the internal Origin 2000 hardware quick reference guide (see below).
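
If you do not have the spreadsheet to hand, the arithmetic is just OR-ing together one bit per virtual switch. The sketch below is my own illustration and not the Excel calculator itself: the function names debug_flags and debug_bits are hypothetical, bit 0 is taken to be the least significant bit, and which switch lives on which bit still has to come from "man prom" (nothing about switch meanings is assumed here).

```shell
#!/bin/sh
# Combine virtual DIP switch bit positions into an L1 "debug" value.
# Usage: debug_flags <bit> [<bit> ...]   (bit 0 is the least significant)
debug_flags() {
    flags=0
    for bit in "$@"; do
        flags=$(( flags | (1 << bit) ))
    done
    printf '0x%x\n' "$flags"
}

# List which bit positions are set in a debug value (hex or decimal).
# Usage: debug_bits 0x10d
debug_bits() {
    val=$(( $1 ))
    bit=0
    out=""
    while [ "$val" -ne 0 ]; do
        if [ $(( val & 1 )) -eq 1 ]; then
            out="${out:+$out }$bit"    # append, space separated
        fi
        val=$(( val >> 1 ))
        bit=$(( bit + 1 ))
    done
    echo "$out"
}
```

For the value used earlier, "debug_bits 0x10d" prints "0 2 3 8" and "debug_flags 0 2 3 8" prints "0x10d", so that flag corresponds to four switches being on.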

See below for an example of the difference in boot behaviour based on the "debug" settings.


As described above, it is possible to change a Numalink router by putting in an alternate Dallas chip (from a Tezro for instance) and, through some tricky tweaking with the L2, getting it to take on a new serial number.

It has also been observed that if you have a Dallas chip with a flat battery in your Numalink router then it will simply take on the serial number from the connected L2 on startup.

I have also done testing with a Numalink Router and various Dallas chips. The resulting behaviour of the Numalink is different depending on whether you put in a Dallas chip from a Fuel, a Tezro, another Numalink or a cleared / uninitialised chip.

Behaviour variants include: whether the Numalink Router comes up with its existing rack / slot configuration, whether serial security is enabled, or whether it takes on the serial number from the L2 controller.

Here is an example using a chip from another machine (this could be from either the Onyx4 or the original Numalink). The log shows that it auto-initialised and took on the serial from the L2:

?-192.168.XXX.XXX-L2>config
L2 192.168.XXX.XXX: - ---- (no rack ID set) (LOCAL)
L1 192.168.XXX.XXX:0:0   - ---r-- (no rack and slot ID set)
?-192.168.XXX.XXX-L2>serial
L2 system serial number: not set.
?-192.168.XXX.XXX-L2>192.168.XXX.XXX:0:0 brick rackslot 1 7
000r00:
brick rack set to 001 (takes effect on next L1 reboot/power cycle)
brick slot set to  07 (takes effect on next L1 reboot/power cycle)
?-192.168.XXX.XXX-L2>192.168.XXX.XXX:0:0 reboot_l1
?-192.168.XXX.XXX-L2>INFO: closed connection to 000r00
INFO: opened USB device at b1;p2/0;d6 (/dev/sgil1_0)
 
?-192.168.XXX.XXX-L2>config
L2 192.168.XXX.XXX: - ---- (no rack ID set) (LOCAL)
L1 192.168.XXX.XXX:0:0   - 001r07
?-192.168.XXX.XXX-L2>001r07
001r07 ATTN: FAN 0 warning limit reached @ 0 RPM.
001r07
001r07 ATTN: Environmental redundancy lost.
l1 log
001r07:
02/07/06 06:28:15 checksum Error - common header initialized
02/07/06 06:28:15 nvram checksum error - initializing core data.
02/07/06 06:28:15 nvram checksum error - initializing extended data.
02/07/06 06:28:15 nvram checksum error - log pointers invalid, log pointers reset
02/07/06 06:28:15 L1 booting 1.42.9
02/07/06 06:28:15 ** fixing invalid SSN value
02/07/06 06:28:15 ** fixing BSN mismatch
02/07/06 06:28:15 USB0: waiting on open
02/07/06 06:28:15 USB0: opened
02/07/06 06:28:15 USB0: registered for events
02/07/06 06:28:15 power up (PANEL)
02/07/06 06:28:15 FAN 0 warning limit reached @ 0 RPM.
02/07/06 06:28:15 Environmental redundancy lost.
02/07/06 06:28:15 L1 booting 1.42.9
02/07/06 06:28:15 USB0: waiting on open
02/07/06 06:28:15 USB0: opened
02/07/06 06:28:15 USB0: registered for events
02/07/06 06:28:15 FAN 0 warning limit reached @ 0 RPM.
02/07/06 06:28:15 Environmental redundancy lost.
?-192.168.XXX.XXX-L2>l1 serial
001r07:
BSN: NYY856    SSN: L0000000    Time: 02/07/2106 06:28:15    Security: OFF
?-192.168.XXX.XXX-L2>

If you look above at the log sequence: "checksum Error - common header initialized ...", you can see the Numalink L1 is reinitialising the Dallas chip. This is the exact same sequence observed on Fuel as well.

I have proved that it is possible to flip a Numalink router by just putting in a cleared Dallas and starting it up connected to an L2. Here is the log from putting in a cleared Dallas:

10/30/20 23:18:15 L1 booting 1.42.9
10/30/20 23:18:16 USB0: waiting on open
10/30/20 23:18:17 USB0: opened
10/30/20 23:18:16 USB0: registered for events
?-192.168.XXX.XXX-L2>serial all
001r07:
 
Data                            Location      Value
------------------------------  ------------  --------
Local System Serial Number      NVRAM          not set
Reference System Serial Number  NVRAM
Local Brick Serial Number       EEPROM        NYY856
Reference Brick Serial Number   NVRAM         NYY856
 
 
EEPROM      Product Name    Serial         Part Number           Rev  T/W
----------  --------------  -------------  --------------------  ---  ------
POWER       RPWR            NYY856         030_1631_004          C    00
LOGIC       ROUTER          NXA065         030_1634_004          C    00
 
?-192.168.XXX.XXX-L2>l1 power up
001r07 ERROR: SerNum:Invalid System Serial Number format. See log for details.
?-192.168.XXX.XXX-L2>l1 log
001r07:
10/30/20 23:16:22 checksum Error - common header initialized
10/30/20 23:16:22 nvram checksum error - initializing core data.
10/30/20 23:16:22 nvram checksum error - initializing extended data.
10/30/20 23:16:23 nvram checksum error - log pointers invalid, log pointers reset
10/30/20 23:16:23 L1 booting 1.42.9
10/30/20 23:16:24 USB0: waiting on open
10/30/20 23:16:24 USB0: opened
10/30/20 23:16:25 USB0: registered for events
10/30/20 23:18:15 L1 booting 1.42.9
10/30/20 23:18:16 USB0: waiting on open
10/30/20 23:18:17 USB0: opened
10/30/20 23:18:16 USB0: registered for events
10/30/20 23:19:44 Invalid SSN format.
10/30/20 23:19:45 SSN:
10/30/20 23:19:45 Numeric portion (last 7 chars) must be 0000000 through 3999999
?-192.168.XXX.XXX-L2>l1 help serial
001r07:
serial
        shows secure system serial numbering information only.
serial verify
        test the brick's readiness for secure serial numbering.
serial all
        show system and brick part/serial numbers.
serial all v|verbose
        show system and brick part/serial numbers with EEPROM indexes
serial dimm
        show dimm part/serial numbers.
serial dimm v|verbose
        show dimm part/serial numbers with extended data and EEPROM indexes.
serial clear
        clear the system serial number.
serial <str> <str> <str> <str>
        erases and reassigns system serial number using temporary authenticator.
serial security on
        enables system serial number security.
?-192.168.XXX.XXX-L2>l1 serial clear
001r01:
INFO: command not supported on bricks that enforce security.
?-192.168.XXX.XXX-L2>l1 serial verify
001r01:
ERROR: SerNum:No assigned System Serial Number. See log for details

In this case it appears to have enabled serial security. So likely the only way to get the machine to initialise with a new serial number is by plugging it into a compute node via a Numalink connection.

NOTE: For more testing, see my log on SGI NVRAM replacement, here, on using EEPROM Programmer to clear chip before putting it into SGI Chassis. This also shows how to disable the serial security on Numalink Router.


LSI SAS3442X-R in O350 & Fuel

Adding an LSI SAS3442X-R board in combination with a SATA SSD to your O350 or Fuel is by far and away the cheapest way to get a significant disk performance boost. Fuel installation is a snap if you buy a "new old stock" retail box, which comes with the board plus cabling with Molex power. Here is an area where the Fuel's "cheap" PC architecture wins hands down.

Within the O350 rack mount server things are a little more complicated, as it does not have Molex power connectors dangling all around the place or slots for disks inside, so you need to be a little more creative to get the power and SAS/SATA cabling sorted.

Here is the low down on these boards. Firstly they come in various configurations and the naming reflects this:

  • SAS3442X == Serial Attached SCSI, 3 Gbit/sec, 4 Internal, 4 External, 2 Connectors, PCI-X (vs PCIe)
  • SAS3080X == SAS, 3 Gbit/sec, 8 internal, PCI-X
  • SAS3800X == SAS, 3 Gbit/sec, 8 external (via 2 x SFF-8470 connectors), PCI-X
  • X == PCI-X, and there are also corresponding PCI-express variants
  • -R == RAID, but cards can be flashed to HBA (Initiator Target) mode.

The boards come in a number of versions (ReportsAs | -PartNo | FWVer):

  • 1068(A0) | -01A | A0 - avoid these as they do not have updated software available and they cannot be flashed for RAID or IT mode of operation
  • 1068(B0) | -01B | B0 - these can be cross flashed (RAID/IT) and should work in SGI's (I have not tested or confirmed this version)
  • 1068(B1) | -02C | B1 - these can be cross flashed (RAID/IT) and work well in SGIs

Once you get your board you should flash it to latest firmware / BIOS, using the HBA (IT) firmware rather than RAID firmware as SGI cannot be configured to use RAID.

Flashing the board requires an MS-DOS machine with a PCI-X slot. Neither the Linux nor the Windows version of the flashing tool allows you to erase the flash, which is required to flip some of the cards from RAID to HBA (IT) mode.

The last firmware / BIOS version is:

  • 01.33.00.00 - Firmware
  • 06.36.00.00 - BIOS

This is available via the Broadcom support site and is in the "P21 Package for Window & Dos". The package includes:

  • 3442XRB0.fw - RAID (-R) for 1068(B0) series adaptors
  • 3442XRB1.fw - RAID (-R) for 1068(B1) series adaptors
  • 3442XTB0.fw - HBA (-IT) for 1068(B0) series adaptors
  • 3442XTB1.fw - HBA (-IT) for 1068(B1) series adaptors
  • mptsas.rom - BIOS (same for all versions)

NOTE: The only 1068(A0) series firmware I was able to find is in an HP Service Pack (SP45154), and after applying this the adaptor will report as SAS3080 (ie 8 Internal Ports).

To flash an "-R" board to the latest "-IT" firmware the steps are:

  1. Find the card - "sasflash -listall". You only need this if there are multiple boards in the same machine; in my example I do not use the controller number option as I only had one board in the flashing machine
  2. Erase the flash - "sasflash -o -e 7". Note that this also erases the SAS device address, so be sure to record this beforehand so you can reapply it later
  3. Flash the new firmware & BIOS - "sasflash -o -f FMVERt.bX -b mptsas.rom", where FMVERt.bX stands for the firmware file matching your card from the list above (for example 3442XTB1.fw for IT mode on a 1068(B1) board)
  4. Put the SAS address back - "sasflash -o -sasadd XXXXXXXXXX"

Card should now report new FW/BIOS version:

LSI Logic SAS3442X-R - Flashed to latest BIOS

If you have flashed it to IT (HBA) mode then you should see that it reports as XX.XX.XX.XX-IT in the FW Revision, by going into bios configuration at boot:

Now take board out of MS-DOS flashing machine and put it in your Fuel/O350 and you will see it reports the LSI 1068 at boot:

Fuel - Boot to Single User with SAS3442X-R (IT)

Here is why you would bother with this...

---
--- 1. Here is diskperf of in built Ultra 160 SCSI boot disk
---      with a Seagate Cheetah U160 spinning disk
---

# diskperf -W -D -c4g -n "fuel/pink UW160 ST336706LW" testfile
#---------------------------------------------------------
# Disk Performance Test Results Generated By Diskperf V1.2
#
# Test name     : fuel/pink UW160 ST336706LW
# Test date     : Sat Oct  3 22:52:00 2020
# Test machine  : IRIX64 pink 6.5 07202013 IP35
# Test type     : XFS data subvolume
# Test path     : testfile
# Request sizes : min=16384 max=4194304
# Parameters    : direct=1 time=10 scale=1.000 delay=0.000
# XFS file size : 4294967296 bytes
#---------------------------------------------------------
# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
      16384    2.60   52.58    2.88    2.89    2.32    2.45
      32768    4.98   52.84    6.08    6.10    4.38    4.62
      65536    9.14   52.85   13.70   13.72    8.03    8.32
     131072   15.68   52.51   31.43   28.23   14.27   14.76
     262144   24.39   51.92   36.51   36.63   22.76   22.36
     524288   33.61   51.29   36.42   36.74   31.47   31.79
    1048576   41.44   51.62   46.25   41.76   39.28   37.59
    2097152   46.84   51.23   46.36   46.41   45.73   43.57
    4194304   50.30   50.96   49.32   49.69   49.13   47.92
    
---
--- 2. And here is an Octane2 with an ACARD SAS/SATA adaptor
---      with a Samsung 850 EVO SATA SSD
---

# diskperf -W -D -c4g -n "octane2/porcipine scsi/sata/acard 850 EVO" testfile
#---------------------------------------------------------
# Disk Performance Test Results Generated By Diskperf V1.2
#
# Test name     : octane2/porcipine scsi/sata/acard 850 EVO
# Test date     : Sat Oct  3 22:32:06 2020
# Test machine  : IRIX64 porcipine 6.5 07202013 IP30
# Test type     : XFS data subvolume
# Test path     : testfile
# Request sizes : min=16384 max=4194304
# Parameters    : direct=1 time=10 scale=1.000 delay=0.000
# XFS file size : 4294967296 bytes
#---------------------------------------------------------
# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
      16384   23.72   20.57   23.65   19.16   23.67   19.08
      32768   29.38   24.93   29.36   23.86   29.34   23.73
      65536   33.03   29.02   33.02   28.26   33.01   28.15
     131072   35.43   30.43   35.40   29.61   35.43   29.49
     262144   36.94   31.26   36.93   30.41   36.92   30.32
     524288   37.67   31.53   37.66   30.66   37.67   30.56
    1048576   37.99   31.43   37.98   30.81   37.98   30.79
    2097152   38.18   31.00   38.14   30.55   38.18   30.67
    4194304   38.27   30.04   38.28   29.83   38.28   29.89
    
---
--- 3. Now here is the LSI Logic SAS3442X-R with
---       Samsung 840 EVO SATA SSD
---

# diskperf -W -D -c4g -n "fuel/pink sas3442X-I 840 EVO" test/testfile
#---------------------------------------------------------
# Disk Performance Test Results Generated By Diskperf V1.2
#
# Test name     : fuel/pink sas3442X-I 840 EVO
# Test date     : Sat Oct  3 22:23:11 2020
# Test machine  : IRIX64 pink 6.5 07202013 IP35
# Test type     : XFS data subvolume
# Test path     : test/testfile
# Request sizes : min=16384 max=4194304
# Parameters    : direct=1 time=10 scale=1.000 delay=0.000
# XFS file size : 4294967296 bytes
#---------------------------------------------------------
# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
      16384   96.89  105.67   94.87   63.93   94.41   63.93
      32768  138.93  150.77  129.34  103.63  135.62  102.92
      65536  180.30  194.71  136.40  149.68   59.84  149.72
     131072  205.64  225.30  148.95  191.94   61.98  191.70
     262144  224.77  245.83  132.71  224.31   57.87  224.69
     524288  228.81  258.08  131.87  246.02   59.23  246.12
    1048576  224.70  264.18  111.43  257.66   59.37  257.81
    2097152  217.02  267.26  109.98  264.21   57.54  264.55
    4194304  179.81  268.91  135.53  267.42   57.62  267.30
    
---
--- 4. For completeness here is the internal UW160 SCSI with
---      ACARD Samsung EVO SSD
---

% diskperf -W -D -c4g -n "fuel/pink UW160 ACARD SSD" testfile 
#---------------------------------------------------------
# Disk Performance Test Results Generated By Diskperf V1.2
#
# Test name     : fuel/pink UW160 ACARD SSD
# Test date     : Wed May 19 20:31:02 2021
# Test machine  : IRIX64 pink 6.5 07202013 IP35
# Test type     : XFS data subvolume
# Test path     : testfile
# Request sizes : min=16384 max=4194304
# Parameters    : direct=1 time=10 scale=1.000 delay=0.000
# XFS file size : 4294967296 bytes
#---------------------------------------------------------
# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
      16384   95.33  105.98   94.49   77.86   95.72   77.90
      32768  137.80  151.31  137.15  119.21  137.45  117.97
      65536  179.19  194.74  179.31  165.12  179.27  159.75
     131072  210.06  225.61  210.19  204.37  210.19  198.17
     262144  230.41  246.06  230.70  233.18  230.73  230.25
     524288  243.69  258.75  243.31  251.31  242.97  250.27
    1048576  250.07  265.13  249.99  261.13  250.03  260.76
    2097152  253.42  268.48  253.62  266.65  253.55  266.26
    4194304  254.82  270.33  255.29  269.36  255.19  269.29

So the Fuel with a spinning disk generally reports poorer results than the Octane2 with an SSD, even though the Fuel has a much faster disk sub-system (160 MB/sec for Fuel vs. 40 MB/sec for Octane2). In fact the dual 600 Octane2 is pretty much saturating its 40 MB/sec SCSI bus with the SSD.

But the Fuel with SAS3442X had the best single disk performance I have ever seen on MIPS SGI machine and this is very cheap to setup compared to ACARD SCSI to SATA Adaptors which are now selling for prices that are more than an entire Fuel !  Get to ebay now to improve your Fuel performance ;-) .

UPDATE: I have since tested the Fuel internal UW160 with an ACARD ARS-2160 and a Samsung EVO SSD, and this provides even better results than the SAS3442X. As a final point of comparison I need to see how an IRIX SW RAID'ed (striped) XVM volume performs with stripes across SAS drives.
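
To compare runs like these without eyeballing the tables, the peak figure for each column can be pulled out of saved diskperf output. This is a rough sketch under my own assumptions: the function name diskperf_peaks is hypothetical, and it relies on the V1.2 table layout shown above (seven columns per data row, with comment lines starting with "#").

```shell
#!/bin/sh
# Print the peak throughput (MB/s) per column from diskperf output on stdin.
diskperf_peaks() {
    awk '
        # Data rows have 7 fields and start with the request size in bytes;
        # header, comment and separator lines do not match and are skipped.
        NF == 7 && $1 ~ /^[0-9]+$/ {
            for (i = 2; i <= 7; i++)
                if ($i + 0 > max[i]) max[i] = $i + 0
        }
        END {
            printf "fwd_wt=%.2f fwd_rd=%.2f bwd_wt=%.2f bwd_rd=%.2f rnd_wt=%.2f rnd_rd=%.2f\n",
                max[2], max[3], max[4], max[5], max[6], max[7]
        }'
}
```

For example, "diskperf -W -D -c4g -n mytest testfile | diskperf_peaks" (with "mytest" as a placeholder test name) gives one summary line per run, which makes a forward read peak of around 53 MB/s for the spinning disk versus around 269 MB/s for the SAS3442X easy to see.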

NOTE #1: If you find that your sasflash version does not allow you to erase the ROM, then you should go back to an older version.

NOTE #2: Updated Fuel hinv

NOTE #3: LSI internal SAS connector is: SFF-8484 & external connector is: SFF-8470 (both SAS industry standard interfaces)


Some Testing with XVM and LSI SAS344X SAS

Given the good performance results achieved with the LSI SAS344X, I was curious to see what sort of performance could be achieved using IRIX's XVM (XFS Volume Manager), which allows you to set up striped volumes (RAID 0) for performance and mirrored volumes (RAID 1) for data security.

My initial test was to set up a mirror across 2 x SSD:

$ diskperf -W -D -c4g -n "fuel/pink LSISAS3 EVO 2xSSD Mirror" testfile
#---------------------------------------------------------
# Disk Performance Test Results Generated By Diskperf V1.2
#
# Test name     : fuel/pink LSISAS3 EVO 2xSSD Mirror
# Test date     : Tue Jun  8 10:59:49 2021
# Test machine  : IRIX64 pink 6.5 07202013 IP35
# Test type     : XFS data subvolume
# Test path     : testfile
# Request sizes : min=16384 max=4194304
# Parameters    : direct=1 time=10 scale=1.000 delay=0.000
# XFS file size : 4294967296 bytes
#---------------------------------------------------------
# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
#  (bytes)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)  (MB/s)
#---------------------------------------------------------
      16384    0.03   72.26    0.03   58.82    0.03   59.81
      32768    0.06  111.45    0.06   91.18    0.07   87.57
      65536    0.13  142.93    0.13  126.77    0.11  123.36
     131072    0.25  166.89    0.24  157.57    0.26  151.88
     262144    0.52  184.49    0.50  172.30    0.49  173.67
     524288    1.00  189.69    1.09  186.79    1.07  191.92
    1048576    1.36  195.16    1.75  196.48    1.64  199.43
    2097152    3.24  203.30    3.09  219.54    3.20  195.22
    4194304    6.08  200.12    5.67  197.19    5.86  200.26

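Whooahh ... the impact of mirroring on write performance is huge, to the point of being unacceptable: writes drop to between 0.03 and about 6 MB/s, while reads stay at roughly 60 to 220 MB/s.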
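To put a number on the mirror penalty, the diskperf table can be parsed and the read/write ratio computed per request size. This is only a minimal Python sketch: it assumes the seven-column layout shown above, the function name parse_diskperf is my own, and only three rows from the mirror run are embedded as sample data.

```python
def parse_diskperf(text):
    """Parse diskperf result rows into dicts; '#' lines and non-data lines are skipped."""
    cols = ["req_size", "fwd_wt", "fwd_rd", "bwd_wt", "bwd_rd", "rnd_wt", "rnd_rd"]
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != len(cols):
            continue  # not a data row
        try:
            row = {"req_size": int(fields[0])}
            for name, value in zip(cols[1:], fields[1:]):
                row[name] = float(value)
        except ValueError:
            continue  # seven fields, but not numeric
        rows.append(row)
    return rows


# Three rows copied from the 2xSSD mirror run above.
MIRROR_SAMPLE = """\
# req_size  fwd_wt  fwd_rd  bwd_wt  bwd_rd  rnd_wt  rnd_rd
      16384    0.03   72.26    0.03   58.82    0.03   59.81
    1048576    1.36  195.16    1.75  196.48    1.64  199.43
    4194304    6.08  200.12    5.67  197.19    5.86  200.26
"""

for row in parse_diskperf(MIRROR_SAMPLE):
    ratio = row["fwd_rd"] / row["fwd_wt"]
    print(f"{row['req_size']:>8} bytes: forward read is {ratio:.0f}x forward write")
```

The same function should work on the single-disk tables earlier in this post, which makes it easy to compare runs side by side.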


Putting SATA SSD into O350 with LSI SAS3442X

As shown above, the LSI SAS3442X provides a way to put much faster modern SATA SSDs into a Fuel.  For O350 servers you need to play with the cabling a bit, and you can get SATA SSDs installed and accessible via the same front-facing disk bay used for 3.5 inch disks.

While the disks are not hot swappable, this option does allow you to easily add 1 to 4 disks, depending on whether you still want to include one of the hard drives in the bay. This option also allows you to keep your DVD-ROM attached.

So if you have only a single compute module you can have boot disk, 1 or 2 SATA SSDs and DVD-ROM.

To provide this I used:

  • LSI SAS3442X-R SAS/SATA cable that comes with the card
  • A single Molex -> 2 x power out extension cable
  • A single Molex -> 2 x SATA power cable

Using the Molex extension cable I was able to get power to the SCSI backplane / DVD-ROM drive and have the extension feed into the drive bay and the SATA power cables.  The SATA lines are then just fed through the same way.

While 2 SATA lines are easy, for 4 lines you will have to sacrifice the SCSI disk, as there is not sufficient room for the drives or cables.

Here are pictures of the way to do this:

O350 - Plug Existing Cable into Molex Extension and adding SAS/SATA Power to one extension

Once you have extensions plugged together:

  • run the SAS lines (flat blue cabling in picture) and SAS power cables from one part of the extension through the fan gap and into the drive bay
  • then push the Molex plug into the same fan gap and bring out the other Molex extension to plug into the existing SCSI backplane power
  • being careful to keep the existing DVD-ROM mini power cable out from the fan gap, so you can plug it back into the DVD-ROM:
O350 - Cable Contortions to keep existing DVD-ROM Power available
O350 - SAS3442X using 2 SAS Lanes and existing ATA DVD-ROM cable slotted back into gap

The result is two SAS/SATA lanes being available in the existing drive bay:

O350 - SAS/SATA cabling feeding into drive bay
O350 - Drive bay wired for 2 x SAS/SATA and keeping boot drive and DVD-ROM

O350 (Onyx 350 / Origin 350) Fans

The O350 machines have a number of fans which are reported on as part of the environmental monitoring.  Which fans are installed can differ based on the chassis configuration.

The main fan group is powered from the IP53 Interface board, which has 4 fan connectors (on this board H2J5 has a broken retention clip):

O350 (Chimera) Interface Board - Fan Plugs (Top Back, Bottom Front)

The fan plugin points and what they report as via L1 env are:

H2H6 (EXHST 1) - Four pin for back facing right hand (looking from back) exhaust fan (always present)

H2J5 (EXHST 2) - Four pin for back facing left hand (looking from back) exhaust fan (optional , not present if there is V12 or DM3 installed)

H9J4 (PS) - Three pin for front facing power module fan; the power and monitoring cable runs along the inner side of the chassis, through the narrow gap between the edge and the power module enclosure (always present)

H9H6 (ODY) - Three pin for optional V12 / DM3 carrier fan (see picture below). This fan is used instead of the H2J5 exhaust fan when a V12 / DM3 is installed.

O350 (Chimera Blade) DM3 Carrier Fan - Plugged into plug H9H6

In addition to the main chassis fans, the O350 also has a fan within the PCI / Disk chassis section. This is reported as "PCI 1" & "PCI 2": while it is a single unit, it contains two separate fans, each with its own cable, which are connected to the riser board:

O350 (Chimera) - PCI / Disk Cage fan, powered via riser board

And finally the IP59_4CPU 4 x 1GHz Processor Module includes an additional three fans (NODE ZERO - N0 LEFT / N0 CNTR / N0 RIGHT)  that are part of the board and need to be removed to get access to installation screws:

O350 (Chimera Server) - IP59_4CPU 1GHZ Module with extra 3 fans
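The connector notes above can be captured as a small lookup table. This is a Python sketch under my own assumptions: the names INTERFACE_BOARD_FANS and expected_fans are made up for illustration, and the ordering rule is inferred only from the configurations described in these notes (V12 / DM3 fitted, or the IP59_4CPU module without graphics), so other combinations are untested guesses.

```python
# Connector -> (name reported by L1 "env", pin count), transcribed from the notes above.
INTERFACE_BOARD_FANS = {
    "H2H6": ("EXHST 1", 4),  # always present
    "H2J5": ("EXHST 2", 4),  # not present when a V12 or DM3 is installed
    "H9J4": ("PS", 3),       # always present
    "H9H6": ("ODY", 3),      # V12 / DM3 carrier fan
}
RISER_FANS = ["PCI 1", "PCI 2"]                     # one unit, two fans, on the riser board
NODE_FANS_4CPU = ["N0 LEFT", "N0 CNTR", "N0 RIGHT"]  # only on the IP59_4CPU 4 x 1GHz module


def expected_fans(has_v12_or_dm3, has_ip59_4cpu=False):
    """Fan names one would expect L1 'env' to list for a given chassis configuration."""
    names = ["EXHST 1"]
    if not has_v12_or_dm3:
        names.append("EXHST 2")
    names += ["PS"] + RISER_FANS
    if has_v12_or_dm3:
        names.append("ODY")
    if has_ip59_4cpu:
        names += NODE_FANS_4CPU
    return names


print(expected_fans(has_v12_or_dm3=True))
print(expected_fans(has_v12_or_dm3=False, has_ip59_4cpu=True))
```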

Here are the different L1 "env" reports based on alternate chassis configurations:

M200XXXX-001-L2>env
001c01:     <<===== O350 with V12
Environmental monitoring is enabled and running.
...
...
Description     State       Warning RPM  Current RPM
--------------- ----------  -----------  -----------
FAN  0  EXHST 1    Enabled         1980         2393
FAN  1       PS    Enabled         3200         4821
FAN  2    PCI 1    Enabled         1980         2428
FAN  3    PCI 2    Enabled         1980         2616
FAN  4      ODY    Enabled         1679         1985
...
...
001c02:     <<===== O350 with DM3
Environmental monitoring is enabled and running.
...
...
Description     State       Warning RPM  Current RPM
--------------- ----------  -----------  -----------
FAN  0  EXHST 1    Enabled         1980         2280
FAN  1       PS    Enabled         3200         4963
FAN  2    PCI 1    Enabled         1980         2343
FAN  3    PCI 2    Enabled         1980         2556
FAN  4      ODY    Enabled         1679         1814
...
...
001c03:     <<===== O350 with IP59_4CPU 4x1GHZ Module
Environmental monitoring is enabled and running.
...
...
Description     State       Warning RPM  Current RPM
--------------- ----------  -----------  -----------
FAN  0  EXHST 1    Enabled         1980         2616
FAN  1  EXHST 2    Enabled         1980         2616
FAN  2       PS    Enabled         3200         4066
FAN  3    PCI 1    Enabled         1980         2556
FAN  4    PCI 2    Enabled         1980         2812
FAN  5  N0 LEFT    Enabled         1980         4054
FAN  6  N0 CNTR    Enabled         1980         3846
FAN  7 N0 RIGHT    Enabled         1980         4166
...
...
001r06:     <<===== Not an O350 this is Numalink Router
Environmental monitoring is enabled and running.
...
...
Description     State       Warning RPM  Current RPM
--------------- ----------  -----------  -----------
FAN  0     LEFT    Enabled         2160         5443
FAN  1    RIGHT    Enabled         2160         5532
...
...

NOTE: The "FAN #" reported for a given fan varies depending on the configuration, but the fan name (EXHST 1, PS, PCI 1, ...) is consistent.
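Because the fan number shifts between configurations, any script that watches these reports is better off keying on the fan name. Here is a minimal Python sketch that parses the fan rows of an L1 "env" report. The function names are my own, and it assumes that "Warning RPM" is a lower limit, which matches the reports above but which I have not confirmed in SGI documentation.

```python
import re

# FAN <number> <name>  <state>  <warning rpm>  <current rpm>
# Names ("EXHST 1") and states ("Wait Pwr") may contain a single space,
# so the name is taken to end at the first run of two or more spaces.
FAN_ROW = re.compile(r"^FAN\s+(\d+)\s+(.+?)\s{2,}(.+?)\s+(\d+)\s+(\d+)\s*$")


def parse_fans(env_text):
    """Return {fan name: (fan number, state, warning_rpm, current_rpm)}."""
    fans = {}
    for line in env_text.splitlines():
        m = FAN_ROW.match(line.strip())
        if m:
            number, name, state, warn, current = m.groups()
            fans[name] = (int(number), state, int(warn), int(current))
    return fans


def slow_fans(fans):
    """Names of enabled fans spinning below their warning RPM."""
    return [name for name, (_, state, warn, current) in fans.items()
            if state == "Enabled" and current < warn]


# Fan rows copied from the 001c01 (O350 with V12) report above.
SAMPLE = """\
FAN  0  EXHST 1    Enabled         1980         2393
FAN  1       PS    Enabled         3200         4821
FAN  2    PCI 1    Enabled         1980         2428
FAN  3    PCI 2    Enabled         1980         2616
FAN  4      ODY    Enabled         1679         1985
"""

fans = parse_fans(SAMPLE)
print(sorted(fans))
print("below warning RPM:", slow_fans(fans))
```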


Upgrading Fuel CPU PIMM requires IRIX PROM flash

To my knowledge all SGI machines support some level of CPU upgrade/swap. This is typically just an exercise in physically swapping the CPU module and then rebooting.  The system will then do hardware based detection and pick up the new CPU speed.

The Fuel is an exception to this: if you put in a new physical CPU, it will not automatically boot up at the faster speed. Rather, it relies on SW based settings that are managed via the PROM "flash" program to set the revised CPU parameters.

The revised parameters are set using the "flash -f" option:

# man flash

flash(1M)                                                            flash(1M)

NAME
     flash - reprogram the flash PROM hardware on Origin and OCTANE machines

SYNOPSIS
     flash [ -a ] [ -c ] [ -d ] [ -D ] [ -f ] [ -F ] [ -i ] [ -m module_id ]
          [ -n ] [ -o ] [ -p dir_name ] [ -P img_name ] [ -s slot_name ] [ -S ]
          [ -v ] [ -V ]

     The SGI Origin 3000 server series.
     flash [ -a ] [ -d ] [ -D ] [ -f ] [ -F ] [ -b brick_id ]
           [ -o ] [ -p dir_name ] [ -P img_name ] [ -S ]
           [ -v ] [ -V ]

     flash -L

DESCRIPTION
     flash allows a user to manage the flash PROMs on the IO and CPU boards of
     Origin systems, the base system board on OCTANE systems and CPU boards on
     the SGI Origin 3000 server series.  Without options, the command flashes
     all appropriate boards on the machine with the PROM images found in
     /usr/cpu/firmware.  Normally, flash is executed automatically during the
     installation of a new release of IRIX.  A customer should rarely need to
     use it directly.  You must have superuser privilege to use this command.
...
...
...
     -f           Specify different (than currently in PROM) configurations
                  values to be used when the new images are flashed.  These
                  values include the speed of the CPU, hub, and size of the
                  cache.  This option should be used with great care as cause
                  the machine to freeze and be rendered unusable if incorrect
                  values are given.

     -F           Similar to -f except more detailed information is required
                  and no checking is done in the input values.  This is more
                  dangerous the -f option and the same cautions apply.

...
...
...

     -o           Override the version checking and flash the PROM even if it
                  is not newer than what is currently on the PROM.

Here is an example session:

---
--- Test 1
---
# flash -f
No proms need flashing
# flash -f -v
setting default path_name to /usr/cpu/firmware
No proms need flashing
# flash -f -V
Prom version 6.211
No proms need flashing
# flash -F -V
Prom version 6.211
No proms need flashing
---
--- As I have not actually installed an upgraded physical CPU,
---   I need to do a manual override with the -o flag

---
--- Test 2
---
# flash -o -f -V
Enter CPU frequency (MHZ): [400] 600
Enter Hub frequency (MHZ): [200] 200
Enter cache size (in MBs): [4] 4
Enter machine type (0)SN1 (1)SN10 (2)SN11 (3)SN12: 
If flash is killed in the middle of execution, the machine
will freeze after it is reset. continuing...
Invalid input... Try again.
Enter machine type (0)SN1 (1)SN10 (2)SN11 (3)SN12: SN11
Invalid input... Try again.
Enter machine type (0)SN1 (1)SN10 (2)SN11 (3)SN12: 2
Info for prom at /hw/module/001c01/node/prom
Prom version 6.211
# 

---
--- NOTE: you can get the required system information via hinv
---   for CPU speed, BUS speed and Cache size
---   For SNx machine type, this is reported at boot.
---   I do not know if the SNx varies across Fuel machine versions
---   so you should check specifically for your machine
---

# cd /var/adm
# grep SN SYSLOG
Oct 4 02:33:10 6A:pink unix: Selecting SN11     <<== my Fuel
...
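Note from the session above that flash wants the menu number rather than the name (typing "SN11" was rejected, "2" was accepted). A small Python sketch of that lookup, assuming the SYSLOG line has the "Selecting SNxx" form shown here; the function name flash_machine_type is hypothetical:

```python
import re

# Menu as printed by flash: (0)SN1 (1)SN10 (2)SN11 (3)SN12
MACHINE_TYPE_MENU = {"SN1": 0, "SN10": 1, "SN11": 2, "SN12": 3}


def flash_machine_type(syslog_line):
    """Return the number to type at the 'Enter machine type' prompt, or None if not found."""
    m = re.search(r"Selecting (SN\d+)", syslog_line)
    if m is None:
        return None
    return MACHINE_TYPE_MENU.get(m.group(1))


print(flash_machine_type("Oct 4 02:33:10 6A:pink unix: Selecting SN11"))
```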

So how do you know what the appropriate CPU speed, bus speed and cache size are?

These can be found by looking at existing machine hinv -mv results. Here is the set for all of the 500 to 900 MHz Fuels (collected from historical Nekochan hinv threads and my own Fuel):

--- 500 (SGI PN 030-1708-002 / 030-1708-003)
CPU 0 at Module 001c01/Slot 0/Slice A: 500 Mhz MIPS R14000 Processor Chip (enabled) 
Processor revision: 2.3. Scache: Size 2 MB Speed 250 Mhz  Tap 0xa
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)

--- 600 (SGI PN 030-1836-001 / 030-1730-002 / 030-1730-001)
CPU 0 at Module 001c01/Slot 0/Slice A: 600 Mhz MIPS R14000 Processor Chip (enabled) 
Processor revision: 2.4. Scache: Size 4 MB Speed 300 Mhz  Tap 0xa
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)

--- 700 (SGI PN 030-1891-001)
CPU 0 at Module 001c01/Slot 0/Slice A: 700 Mhz MIPS R16000 Processor Chip (enabled) 
Processor revision: 2.2. Scache: Size 4 MB Speed 350 Mhz  Tap 0xc
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)

--- 800  (SGI PN 030-2024-001 / 030-1932-001)
CPU 0 at Module 001c01/Slot 0/Slice A: 800 Mhz MIPS R16000 Processor Chip (enabled) 
Processor revision: 2.2. Scache: Size 4 MB Speed 400 Mhz  Tap 0xa
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)

--- 900 (SGI PN 030-2023-001)
CPU 0 at Module 001c01/Slot 0/Slice A: 900 Mhz MIPS R16000 Processor Chip (enabled) 
Processor revision: 3.0. Scache: Size 8 MB Speed 450 Mhz  Tap 0xb
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)

As per above, to get the SNx info you should grep SYSLOG at "/var/adm/SYSLOG" for "SN", and it will print out what this is for your Fuel.

Here is the log of a real update of my machine. First swap out the slower CPU module and put in the faster one, then run the flash command. I swapped a 600 MHz PIMM for an 800 MHz PIMM, which I am happy to report worked as it should. As a precaution I first did hinv -mv on the machine and checked the PIMM module against the table above to verify that the PIMM was indeed an 800 MHz one:
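The three values that "flash -f" prompts for can be pulled straight out of an hinv -mv listing like the ones above. This is a minimal Python sketch, assuming the line formats shown in the table; flash_answers_from_hinv is a name I made up, and the sample text is the 800 MHz entry.

```python
import re


def flash_answers_from_hinv(hinv_text):
    """Extract CPU MHz, hub MHz and secondary cache MB from 'hinv -mv' style output."""
    cpu = re.search(r"(\d+) Mhz MIPS R\d+ Processor Chip", hinv_text)
    cache = re.search(r"Scache: Size (\d+) MB", hinv_text)
    hub = re.search(r"HUB in .*? Speed (\d+)(?:\.\d+)? Mhz", hinv_text)
    if not (cpu and cache and hub):
        raise ValueError("could not find CPU, Scache and HUB lines")
    return {
        "cpu_mhz": int(cpu.group(1)),
        "hub_mhz": int(hub.group(1)),
        "cache_mb": int(cache.group(1)),
    }


# The 800 MHz entry from the table above.
HINV_800 = """\
CPU 0 at Module 001c01/Slot 0/Slice A: 800 Mhz MIPS R16000 Processor Chip (enabled)
Processor revision: 2.2. Scache: Size 4 MB Speed 400 Mhz  Tap 0xa
HUB in Module 001c01/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
"""

print(flash_answers_from_hinv(HINV_800))
```

Keep in mind that hinv reports what the PROM currently believes, so for a PIMM you have not yet flashed for, use a listing from a machine already running that PIMM (as in the table above).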

# flash -o -f -v
setting default path_name to /usr/cpu/firmware
Enter CPU frequency (MHZ): [400] 800
Enter Hub frequency (MHZ): [200] 200
Enter cache size (in MBs): [4] 4
Enter machine type (0)SN1 (1)SN10 (2)SN11 (3)SN12: 2
m001c01: freq cpu 800000000 freq hub 200000000 mode 549baf85
m001c01: Flashed this prom 14 times
m001c01: Flashing prom data in file /usr/cpu/firmware/ip35prom.img
m001c01:                  to device /hw/module/001c01/node/prom
m001c01: > Manufacturer code: 0x00
m001c01: > Device code      : 0x00
m001c01: > Erasing code sectors (30 to 40 seconds)
m001c01: > Erasure complete and verified
m001c01: PROM Header contains:
m001c01:   Magic:    0x4a464b535743534d
m001c01:   Version:  6.211
m001c01:   Length:   0x169648
m001c01:   Segments: 1
m001c01: Segment 0:
m001c01:   Name:        ip35prom
m001c01:   Flags:       0x10
m001c01:   Offset:      0x1000
m001c01:   Entry:       0xc00000001fc00000
m001c01:   Ld Addr:     0xc00000001fc00000
m001c01:   True Length: 0x168648
m001c01:   True sum:    0x84ae700
m001c01: > Programming Bedrock PROM
m001c01: > Writing 1476168 bytes of data ...
m001c01: >     0/168648 ........
m001c01: >  8000/168648 ........
m001c01: > 10000/168648 ........
m001c01: > 18000/168648 ........
m001c01: > 20000/168648 ........
m001c01: > 28000/168648 ........
m001c01: > 30000/168648 ........
m001c01: > 38000/168648 ........
m001c01: > 40000/168648 ........
m001c01: > 48000/168648 ........
m001c01: > 50000/168648 ........
m001c01: > 58000/168648 ........
m001c01: > 60000/168648 ........
m001c01: > 68000/168648 ........
m001c01: > 70000/168648 ........
m001c01: > 78000/168648 ........
m001c01: > 80000/168648 ........
m001c01: > 88000/168648 ........
m001c01: > 90000/168648 ........
m001c01: > 98000/168648 ........
m001c01: > a0000/168648 ........
m001c01: > a8000/168648 ........
m001c01: > b0000/168648 ........
m001c01: > b8000/168648 ........
m001c01: > c0000/168648 ........
m001c01: > c8000/168648 ........
m001c01: > d0000/168648 ........
m001c01: > d8000/168648 ........
m001c01: > e0000/168648 ........
m001c01: > e8000/168648 ........
m001c01: > f0000/168648 ........
m001c01: > f8000/168648 ........
m001c01: > 100000/168648 ........
m001c01: > 108000/168648 ........
m001c01: > 110000/168648 ........
m001c01: > 118000/168648 ........
m001c01: > 120000/168648 ........
m001c01: > 128000/168648 ........
m001c01: > 130000/168648 ........
m001c01: > 138000/168648 ........
m001c01: > 140000/168648 ........
m001c01: > 148000/168648 ........
m001c01: > 150000/168648 ........
m001c01: > 158000/168648 ........
m001c01: > 160000/168648 ........
m001c01: > 168000/168648 .
m001c01: > 168648/168648 
m001c01: > Programmed and verified
 Prom version 6.211
m001c01: Verifying:
m001c01:           cpu speed 800000000
m001c01:           hub speed 200000000
m001c01:           other configuration information also verified
m001c01: Compare of file data to in core prom data succeeded
# reboot
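The numbers in the flash log are self-consistent, which is a handy sanity check before rebooting. The Python sketch below parses a few lines copied from the log above and checks them. The interpretation that the header "Length" equals "Offset" plus "True Length" is my own reading of the numbers, not something taken from SGI documentation.

```python
import re

# A few lines copied from the flash log above.
LOG = """\
m001c01: freq cpu 800000000 freq hub 200000000 mode 549baf85
m001c01:   Length:   0x169648
m001c01:   Offset:      0x1000
m001c01:   True Length: 0x168648
m001c01: > Writing 1476168 bytes of data ...
"""


def field(pattern, text, base=10):
    """Return the first captured group of pattern in text as an integer."""
    return int(re.search(pattern, text).group(1), base)


length = field(r":\s+Length:\s+0x([0-9a-f]+)", LOG, 16)
offset = field(r"Offset:\s+0x([0-9a-f]+)", LOG, 16)
true_length = field(r"True Length:\s+0x([0-9a-f]+)", LOG, 16)
written = field(r"Writing (\d+) bytes", LOG)
cpu_hz = field(r"freq cpu (\d+)", LOG)
hub_hz = field(r"freq hub (\d+)", LOG)

print("bytes written match True Length:", written == true_length)
print("Length == Offset + True Length: ", length == offset + true_length)
print("CPU / hub MHz as flashed:", cpu_hz // 1_000_000, "/", hub_hz // 1_000_000)
```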

Finally, what happens if you need to downgrade the CPU speed?

In this case you will need to run "flash -f -o" with the initial (faster) CPU still installed and set the PROM parameters to the speed of the target slower CPU. Once this is done you can shut down your machine and swap the CPU PIMM modules.

If you swap from a faster to a slower CPU PIMM without first doing the PROM update, then your Fuel will not boot. The only known remedy at this point is to put in the right speed (or faster) CPU and do the "flash -f -o" update.
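The ordering matters, so here it is as a tiny Python sketch that prints the steps for a given swap. It only encodes the two cases described in these notes (upgrade: swap then flash; downgrade: flash then swap). The function name pimm_swap_plan is my own, and the same-speed case is deliberately left out because I have not tested it.

```python
def pimm_swap_plan(current_mhz, target_mhz):
    """Return the ordered steps for swapping a Fuel PIMM, per the notes above."""
    if target_mhz > current_mhz:
        # Upgrade: a faster PIMM still boots with the old (slower) PROM settings.
        return [
            "shut down and swap in the faster PIMM",
            f"boot, then run 'flash -o -f' and enter {target_mhz} MHz",
            "reboot",
        ]
    if target_mhz < current_mhz:
        # Downgrade: a slower PIMM will not boot with the faster PROM settings.
        return [
            f"with the faster PIMM still installed, run 'flash -o -f' and enter {target_mhz} MHz",
            "shut down and swap in the slower PIMM",
            "boot",
        ]
    raise ValueError("same-speed swap is not covered by these notes")


for step in pimm_swap_plan(800, 600):
    print("-", step)
```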

NOTE #1: Fuel CPU replacement discussion thread on "Irix Network". This includes discussion of possible way to change CPU speed parameters via POD/CAC mode, though no one appears to know the exact set of commands. What we do know is that the PROM is held in a flash chip on the Fuel & O350 board and this is not the same as the DALLAS NVRAM component. See picture etc below for "search for PROM chip..."

NOTE #2: Fuel SGI Part Numbers taken from "SGI Depot" Fuel Parts Page


Recovering Damaged Fuel PIMM

The above tip shows how you flash the Fuel PROM to raise / lower the CPU speed based on the installed CPU. The Fuel is the only Chimera based machine that needs this, and it appears that this is because on the O350 (Origin, Onyx & Tezro) series machines the "Bedrock" ASIC is on the same board as the CPUs, whereas on the Fuel the "Bedrock" ASIC is on the system board and the CPU board is just that: a CPU only module.

In my case the PIMM connector array got damaged in shipping and many of the pins were squashed:

Fuel PIMM with damaged connector array

To fix this I used a pair of very fine tweezers to pull out and straighten the pins. In doing this you need to lift up each squashed pin and then make sure it is straightened so that it stands vertical.

Then sight along the rows of the pin array to make sure that all the pins are nicely aligned. If they are, the chance of a successful repair is good.

Good luck to others who need to do this repair and please make sure you protect your PIMM with the right packaging before shipping.


References & Links:

  • Irix Network - working to fill the hole left by the demise of Nekochan, populated by many passionate and knowledgeable SGI users
  • irix7.com - keeps an archive of lots and lots of original SGI technical documents
  • SGI Depot - keeps an archive of various SGI related materials and provides parts. Run by Ian Mapleson, one of the original SGI/IRIX community members and all round helpful person
  • techpub.jurassic.nl - another SGI TechPubs archive; thanks for keeping these high quality documents available (as HTML and PDF), while irix7 above has PDFs
  • "Hardware Quick-reference Booklet (Origin and Onyx2 Series) - HMQ-380-C" - this document is for the older Origin / Onyx2, but the POD command documentation (see page 174 for "POD Mode Commands") is still useful for Origin 350 Chimera based systems. These physical / virtual dip switch settings seem to align with what is documented via "man prom".
  • HP Server for Flashing - more details on setup to help with flashing disks & adapters.
  • Dallas DS1742W Hacking - my testing on replacement and initialisation of Dallas DS1742W chips
  • SGI Fuel L1 Serial Comms - HPE has this old SGI bulletin which clearly states that L1 comms via the internal serial port is 38400 Baud and external Serial Port #1 is 9600 Baud.
  • Onyx 350 - Racking and Stacking, my rather long blog post on moving all my O350 kit into a new SGI "Hour Glass" rack.
  • What is inside that box? - SGI hinv - my hinv reports of mostly Chimera machines
  • Chimera "Dip Switch Calculator" - use at your own peril ... as this is only as good as the sketchy documentation (see link above), and also via "man prom". See below for examples of the difference in boot behavior on a Fuel based on the "debug" flag setting.

Sample POD/DEX/CAC Session via Console Serial Port with O350

What is this POD/DEX/CAC stuff ?

  • POD -  Power-On Diagnostic
  • DEX  - Dirty EXclusive
  • CAC - CAChed

To find out more, see the SGI document "Hardware Quick-reference Booklet (Origin and Onyx2 Series) - HMQ-380-C" (page 174 for "POD Mode Commands") and the POD/CAC online help. Here is a sample session on a Chimera board via the L1 Console Serial Port:

001c01-L1>help
Commands are:
check              fru                promver|promversionnode               
reset|rst          prom               try                pic                    
make               pwm                syscom             error                  
pci                *                  autopower|apwr     syscom|junkbus|jb|bedr 
partdb             cpu                nia|ni|ctc         nib                    
iia|ii|cti         iib                iic                iid                    
config|cfg         debug              display|dsp        button|btn             
env                fan                help|hlp           history|hist           
l1dbg              link               log                ioport|ioprt           
istat              l1                 leds               margin|mgn             
network            pimm               port|prt           power|pwr              
reset|rst          nmi                softreset|softrst  select|sel             
serial             sysstate           eeprom             uart                   
usb                router|rtr         service            date                   
nvram              security           flash              reboot_l1              
version|ver        pbay               test|tst           scan                   
fru|pci|prom|node                                                               
enter 'hlp <cmd>' for more help on a single command.                            
001c01-L1>cpu                                                                   
CPU Present Enabled                                                             
--- ------- -------                                                             
0A    1       1                                                                 
0B    1       1                                                                 
0C    1       1                                                                 
0D    1       1                                                                 
001c01-L1>env                                                                   
Environmental monitoring is enabled and running.                                
                                                                                
Description    State       Warning Limits     Fault Limits       Current        
-------------- ----------  -----------------  -----------------  -------        
          1.8V   Wait Pwr  10%   1.62/  1.98  20%   1.44/  2.16    0.000        
           12V   Wait Pwr  10%  10.80/ 13.20  20%   9.60/ 14.40    0.125        
        12V #2   Wait Pwr  10%  10.80/ 13.20  20%   9.60/ 14.40    0.125        
          3.3V   Wait Pwr  10%   2.97/  3.63  20%   2.64/  3.96    0.069        
        12V IO   Wait Pwr  10%  10.80/ 13.20  20%   9.60/ 14.40    0.125        
        5V AUX   Wait Pwr  10%   4.50/  5.50  20%   4.00/  6.00    5.096        
      3.3V AUX   Wait Pwr  10%   2.97/  3.63  20%   2.64/  3.96    3.302        
    PCI 5V AUX   Wait Pwr  10%   4.50/  5.50  20%   4.00/  6.00    5.070        
      PCI 3.3V   Wait Pwr  10%   2.97/  3.63  20%   2.64/  3.96    0.069        
      PCI 2.5V   Wait Pwr  10%   2.25/  2.75  20%   2.00/  3.00    0.000        
        PCI 5V   Wait Pwr  10%   4.50/  5.50  20%   4.00/  6.00    0.000        
  XIO 12V BIAS   Wait Pwr  10%  10.80/ 13.20  20%   9.60/ 14.40    0.125        
        XIO 5V   Wait Pwr  10%   4.50/  5.50  20%   4.00/  6.00    0.000        
      XIO 2.5V   Wait Pwr  10%   2.25/  2.75  20%   2.00/  3.00    0.000        
  XIO 3.3V AUX   Wait Pwr  10%   2.97/  3.63  20%   2.64/  3.96    3.302        
 IP53 3.3V AUX   Wait Pwr  10%   2.97/  3.63  20%   2.64/  3.96    3.302        
   IP53 5V AUX   Wait Pwr  10%   4.50/  5.50  20%   4.00/  6.00    5.070        
      IP53 12V   Wait Pwr  10%  10.80/ 13.20  20%   9.60/ 14.40    0.125        
     IP53 VCPU   Wait Pwr  10%   1.13/  1.38  20%   1.00/  1.50    0.000        
     IP53 SRAM   Wait Pwr  10%   2.25/  2.75  20%   2.00/  3.00    0.000        
     IP53 1.5V   Wait Pwr  10%   1.35/  1.65  20%   1.20/  1.80    0.000        
                                                                                
Description     State       Warning RPM  Current RPM                            
--------------- ----------  -----------  -----------                            
FAN  0  EXHST 1   Wait Pwr         1980            0                            
FAN  1       PS   Wait Pwr         3200            0                            
FAN  2    PCI 1   Wait Pwr         1980            0                            
FAN  3    PCI 2   Wait Pwr         1980            0                            
FAN  4      ODY   Wait Pwr         1679            0                            
                                                                                
                              Advisory   Critical   Fault      Current          
Description       State       Temp       Temp       Temp       Temp             
----------------- ----------  ---------  ---------  ---------  ---------        
 0 INTERFACE 0      Wait Pwr    [Autofan Control]    75C/167F   18C/ 64F        
 1 INTERFACE 1      Wait Pwr    [Autofan Control]    75C/167F   19C/ 66F        
 2 INTERFACE 2      Wait Pwr    [Autofan Control]    75C/167F   17C/ 62F        
 3 PCI RISER        Wait Pwr    [Autofan Control]    75C/167F   17C/ 62F        
 4 ODYSSEY          Wait Pwr    [Autofan Control]    75C/167F   17C/ 62F        
 5 NODE             Wait Pwr    [Autofan Control]    75C/167F   17C/ 62F        
 6 BEDROCK          Wait Pwr  Not currently available                           
                                                                                
                     Zone Temp     Target    Current   Zone Fan   Curr/Min      
Zone Name  State     Sensors       Average   Average   Index      Fan %         
---------  --------  ------------  --------  --------  ---------  ---------     
NODE       Wait Pwr     0,1,2,5,6  47C/116F  17C/ 62F          0   18%/ 18%     
PS         Wait Pwr     0,1,2,5,6  47C/116F  17C/ 62F          1   55%/ 55%     
PCI        Wait Pwr             3  45C/113F  17C/ 62F        2,3   55%/ 55%     
ODY        Wait Pwr             4  48C/118F  17C/ 62F          4   55%/ 55%

...
... Set the debug flags and boot up to POD mode...
...

001c01-L1>debug 0x10d                                                           
debug switches set to 0x010d                                                    
001c01-L1>power up                                                              
001c01-L1>                                                                      
entering console mode  001c01 CPU0, <CTRL_T> to escape to L1                    
Starting PROM Boot process                                                      
hubii_link_good: 8-brick attached to module 001c01.                             
HUB at 0x0 attached as widget 0xb                                               
001c01/0xb/xbow_arb: nasid= 0x0 xbow_base= 0x9200000000000000                   
001c01/0xb/xbow_arb: 622 master is 0xb                                          
Check_master: link 11 is master                                                 
hubii_link_good: 8-brick attached to module 001c01.                             
Check_master: link 11 is master                                                 
                                                                                
                                                                                
IP35 PROM SGI Version 6.210  built 02:33:51 PM Aug 26, 2004                     
  built for bedrock rev. 1.1 or greater                                         
SN12 Graphics Blade.                                                            
Local master CPU A revision: f42                                                
Local slave CPU B revision: f42                                                 
Local slave CPU D revision: f42                                                 
Local slave CPU C revision: f42                                                 
PROM length: 0x1686a8, BSS length: 0xa7a0, flash count: 2                       
Configured bedrock clock: 200.0 MHz                                             
Status of local IO: 0x1 0x3fc03ff6403                                           
Bedrock Rev: 2, Module: 1 (001c01) from Sys Ctlr                                
On PROM entry: ERR_EPC=0xc00000001fc02cb0 (0xc00000001fc02cb0)                  
Configuring memory                                                              
Local memory configured: 8192 MB (premium)                                      
*** Warning: System controller debug switches are non-zero (0x10d)              
*** Diag level set to None (2)                                                  
*** Info level set to verbose                                                   
*** Boot stop requested at Global (2)                                           
before reading NICHub NIC: 0x62c0e690                                           
SR1 set to 0x6000081690349000                                                   
SR0 set to 0x0000000062c0e690                                                   
Testing/Initializing memory ...............             DONE                    
Copying PROM code to memory ...............             Copy PROM (0x90000000188
Done                                                                            
DONE                                                                            
Skipping secondary cache diags                                                  
Skipping secondary cache diags                                                  
Skipping secondary cache diags                                                  
Skipping secondary cache diags                                                  
CPU B switching stack into UALIAS and invalidating D-cache                      
CPU A switching stack into UALIAS and invalidating D-cache                      
CPU C switching stack into UALIAS and invalidating D-cache                      
CPU D switching stack into UALIAS and invalidating D-cache                      
CPU B switching into node 0 cached RAM                                          
CPU C switching into node 0 cached RAM                                          
CPU B running cached                                                            
CPU C running cached                                                            
CPU A switching into node 0 cached RAM                                          
CPU D switching into node 0 cached RAM                                          
CPU A running cached                                                            
CPU D running cached                                                            
Initializing kldir.                                                             
Done initializing kldir.                                                        
Initializing klconfig.                                                          
init_klcfg: nasid 0 start 9600000000030000 size 10000                           
Done initializing klconfig.                                                     
Discovering local IO ......................             Check_master: link 11 ir
Check_master: link 11 is master                                                 
DONE                                                                            
CPU A initialized subnode                                                       
CPU C initialized subnode                                                       
Discovering NUMAlink connectivity .........                                     
Local hub NUMAlink is down.                                                     
*** Local network link down                                                     
DONE                                                                            
Found 1 objects (1 hubs, 0 routers) in 5893 usec                                
Waiting for peers to complete discovery....             Discovery results:      
ENTRY 0: HUB(62c0e690)                                                          
    NASID=-1 Mod=1 Flg=0x9500000 PROM=6.210 Route=N/A                           
    MODULE=001c01 PARTITION=0 SPACE=RESET                                       
    Port 1 connection: Not connected                                            
    Port status: NF                                                             
DONE                                                                            
No other nodes present; becoming global master                                  
Global master is entry 0, NIC 0x62c0e690, /hw/rack/001/bay/01                   
Global master is /hw/rack/001/bay/01                                            
Global barrier (line 4315)Global barrier passed.                                
Global barrier (line 4348)Global barrier passed.                                
Master System Topology Graph (pre-nasid_assign):                                
Local Slave : Waiting for my NASID ...                                          
ENTRY 0: HUB(62c0e690)                                                          
    NASID=-1 Mod=1 Flg=0x9500000 PROM=6.210 Route=N/A                           
    MODULE=001c01 PARTITION=0 SPACE=RESET                                       
Local Slave : Waiting for my NASID ...                                          
Local Slave : Waiting for my NASID ...                                          
    Port 1 connection: Not connected                                            
    Port status: NF                                                             
Calculating NASIDs                                                              
num_routers is 0                                                                
Master System Topology Graph:                                                   
ENTRY 0: HUB(62c0e690)                                                          
    NASID=0 Mod=1 Flg=0x9500000 PROM=6.210 Route=N/A                            
    MODULE=001c01 PARTITION=0 SPACE=RESET                                       
    Port 1 connection: Not connected                                            
    Port status: NF                                                             
Distributing routing tables                                                     
Distributing NASIDs                                                             
*** NASID assigned to 0                                                         
CPU B switching to UALIAS                                                       
CPU D switching to UALIAS                                                       
CPU C switching to UALIAS                                                       
CPU A switching to UALIAS                                                       
CPU D running in UALIAS                                                         
CPU A running in UALIAS                                                         
CPU B running in UALIAS                                                         
CPU C running in UALIAS                                                         
CPU D Flushing and invalidating caches                                          
CPU C Flushing and invalidating caches                                          
CPU B Flushing and invalidating caches                                          
Changing node ID to 0                                                           
Global barrier (line 4823)Global barrier passed.                                
CPU A Flushing and invalidating caches                                          
Global barrier (line 4928)Global barrier passed.                                
CPU B switching to node 0 cached RAM                                            
CPU D switching to node 0 cached RAM                                            
CPU B running cached                                                            
CPU D running cached                                                            
CPU A switching to node 0 cached RAM                                            
CPU C switching to node 0 cached RAM                                            
CPU A running cached                                                            
CPU C running cached                                                            
Nasids in partition:  +0                                                        
Regions in partition:  +0                                                       
Intializing any CPUless nodes..............             Global barrier (line Gl.
Global barrier (line 7715)Global barrier passed.                                
DONE                                                                            
Global barrier (line 5089)Global barrier passed.                                
hubii_link_good: 8-brick attached to module 001c01.                             
Checking partitioning information .........             DONE                    
No other nodes present; becoming partition master                               
*** After partitioning ***                                                      
ENTRY 0: HUB(62c0e690)                                                          
    NASID=0 Mod=1 Flg=0x9500000 PROM=6.210 Route=N/A                            
    MODULE=001c01 PARTITION=0 SPACE=RESET                                       
    Port 1 connection: Not connected                                            
    Port status: FE                                                             
Erecting partition fences ................                        DONE          
Update config for routers connected to hubs                                     
Update config for hubs and hubless routers                                      
CPU B flushing cache                                                            
CPU D flushing cache                                                            
CPU A flushing cache                                                            
CPU C flushing cache                                                            
check_router_cfg: nasid 0 is_voyager 0 check_cfg = 0                            
Global barrier (line 5300)Global barrier passed.                                
Nasids in partition:  +0                                                        
Regions in partition: Local slave entering slave loop                           
Local slave entering slave loop                                                 
Local slave entering slave loop                                                 
 +0                                                                             
A 000 001c01:                                                                   
A 000 001c01: *** Entering POD mode on node 0
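
The topology entries in the boot log above (the "ENTRY 0: HUB(...)" blocks) are easy to lose among the interleaved CPU messages. Below is a minimal Python sketch that pulls the KEY=VALUE fields out of a saved capture. The function name `parse_entries` and the regular expressions are my own illustration, not anything from SGI, and the sample text is just lines copied from the log above.

```python
import re

# Sample lines copied from the boot log above (after NASID assignment).
SAMPLE = """\
ENTRY 0: HUB(62c0e690)
    NASID=0 Mod=1 Flg=0x9500000 PROM=6.210 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: FE
"""

# Hypothetical patterns, written only from the layout seen in this capture.
ENTRY_RE = re.compile(r"^ENTRY (\d+): (\w+)\((\w+)\)")
PAIR_RE = re.compile(r"(\w+)=(\S+)")


def parse_entries(text):
    """Collect KEY=VALUE fields under each 'ENTRY n: KIND(nic)' header."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        header = ENTRY_RE.match(line)
        if header:
            # Start a new record for this hub or router entry.
            entries.append({
                "entry": int(header.group(1)),
                "kind": header.group(2),
                "nic": header.group(3),
            })
        elif entries:
            # Lines without KEY=VALUE pairs (e.g. interleaved CPU
            # messages) simply contribute nothing.
            for key, value in PAIR_RE.findall(line):
                entries[-1][key] = value
    return entries


if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        print(entry)
```

All values are kept as strings, since fields such as Route=N/A and MODULE=001c01 are not numbers. On a full capture, any later line containing KEY=VALUE text would be attached to the last entry seen, so it is safest to feed it only the topology part of the log.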

...
... Get POD Command Help
...

A 000 001c01: POD SysCt Cac> ?                                                  
A 000 001c01: Commands may be separated by semicolons, grouped with             
A 000 001c01: curly braces, and used in nested loop constructs.                 
A 000 001c01:                                                                   
A 000 001c01: Calculator                                                        
A 000 001c01:    Print hex:           px EXPR                                   
A 000 001c01:    Print decimal:       pd EXPR                                   
A 000 001c01:    Print octal:         po EXPR                                   
A 000 001c01:    Print binary:        pb EXPR                                   
A 000 001c01:    Look up PROM addr:   nm ADDR                                   
A 000 001c01: Hardware Registers                                                
A 000 001c01:    Print register(s):   pr [GPRNAME [VAL]                         
A 000 001c01:    Print fpreg(s):      pf [REGNO]                                
A 000 001c01:    Store register:      sr REG VAL                                
A 000 001c01:    Store fpreg:         sf REGNO VAL                              
A 000 001c01: Memory Access                                                     
A 000 001c01:    Print address:       pa ADDR [BITNO]                           
A 000 001c01:    Load byte:           lb ADDR [COUNT]                           
A 000 001c01:    Load half-word:      lh ADDR [COUNT]                           
A 000 001c01:    Load word:           lw ADDR [COUNT]                           
A 000 001c01:    Load double-word:    ld ADDR [COUNT]                           
A 000 001c01:    Load ASCII:          la ADDR [COUNT]                           
A 000 001c01:    Store byte:          sb ADDR [VAL [COUNT]]                     
A 000 001c01:    Store half-word:     sh ADDR [VAL [COUNT]]                     
A 000 001c01:    Store word:          sw ADDR [VAL [COUNT]]                     
A 000 001c01:    Store double-word:   sd ADDR [VAL [COUNT]]                     
A 000 001c01:    Store and verify:    sdv ADDR VAL                              
A 000 001c01: Memory Operations                                                 
A 000 001c01:    Fill mem w/ byte:    memset DST BYTE LEN                       
A 000 001c01:    Copy memory bytes:   memcpy DST SRC LEN                        
A 000 001c01:    Cmp memory bytes:    memcmp DST SRC LEN                        
A 000 001c01:    Add memory bytes:    memsum SRC LEN                            
A 000 001c01: Memory Testing                                                    
A 000 001c01:    Mem. sanity test:    santest ADDR                              
A 000 001c01:    Dir/prot init:       dirinit START LEN                         
A 000 001c01:    Memory clear:        meminit START LEN                         
A 000 001c01:    Dir. test/init:      dirtest START LEN                         
A 000 001c01:    Memory test/init:    memtest START LEN                         
A 000 001c01:    Clear errors:        clear                                     
A 000 001c01:    Display errors:      error                                     
A 000 001c01:    Quality mode:        qual [1|0]                                
A 000 001c01:    ECC mode:            ecc [1|0]                                 
A 000 001c01:    Set R10k int mask:   im [BYTE]                                 
A 000 001c01:    Test error limit:    maxerr COUNT                              
A 000 001c01:    Scan dir states:     scandir ADDR [LEN]                        
A 000 001c01:    Directory state:     dirstate [BASE [LEN [STATE]]]             
A 000 001c01: Network and Vectors                                               
A 000 001c01:    Vector read:         vr VEC VADDR                              
A 000 001c01:    Vector write:        vw VEC VADDR VAL                          
A 000 001c01:    Vector exchange:     vx VEC VADDR VAL                          
A 000 001c01:    Discover network:    disc                                      
A 000 001c01:    Dump pcfg struct:    pcfg [n:NODE] [v]                         
A 000 001c01:    Get/set node ID:     node [[VEC] ID]                           
A 000 001c01:    Set up route:        route [VEC NODE]                          
A 000 001c01:    Read router NIC:     rnic [VEC]                                
A 000 001c01:    Dump config info.:   cfg [n:NODE]                              
A 000 001c01:    Dump route table:    rtab [VEC]                                
A 000 001c01:    Dmp/clr rtr stat:    rstat VEC                                 
A 000 001c01: Control Structures                                                
A 000 001c01:    Reset the system:    reset [all]                               
A 000 001c01:    Softreset a node:    softreset n:NODE                          
A 000 001c01:    Call subroutine:     call ADDR [A0 [A1 [...]]]                 
A 000 001c01:    Inv. cache & jump:   jump ADDR [A0 [A1]]                       
A 000 001c01:    Goto slave mode:     slave                                     
A 000 001c01:    Repeat count:        repeat COUNT CMD                          
A 000 001c01:    Repeat forever:      loop CMD                                  
A 000 001c01:    While loop:          while (EXPR) CMD                          
A 000 001c01:    For loop:            for (CMD;EXPR;CMD) CMD                    
A 000 001c01:    If statement:        if (EXPR) CMD                             
A 000 001c01:    Delay:               delay MICROSEC                            
A 000 001c01:    Sleep:               sleep SEC                                 
A 000 001c01:    Benchmark timing:    time CMD                                  
A 000 001c01:    Echo string:         echo "STRING"                             
A 000 001c01: Miscellaneous                                                     
A 000 001c01:    Show PROM version:   version                                   
A 000 001c01:    Display help:        help [CMDNAME]                            
A 000 001c01:    Read hub NIC:        nic [n:NODE]                              
A 000 001c01:    Prgm remote PROM:    flash NODE [...]                          
A 000 001c01:    Prgm remote PROM with values:    fflash NODE [...]             
A 000 001c01:    Prgm modebits with values: setmodebits NODE [...]              
A 000 001c01: TLB and Cache                                                     
A 000 001c01:    Clear TLB:           tlbc [INDEX]                              
A 000 001c01:    Read TLB:            tlbr [INDEX]                              
A 000 001c01:    Inv. cache(s):       inval [i][d][s]                           
A 000 001c01:    Flush+inv caches:    flush                                     
A 000 001c01:    Dump dcache tag:     dtag line                                 
A 000 001c01:    Dump icache tag:     dtag line                                 
A 000 001c01:    Dump scache tag:     stag line                                 
A 000 001c01:    Dump dcache line:    dline line                                
A 000 001c01:    Dump icache line:    iline line                                
A 000 001c01:    Dump scache line:    sline line                                
A 000 001c01:    Dump dcache tag:     adtag line                                
A 000 001c01:    Dump icache tag:     aitag line                                
A 000 001c01:    Dump scache tag:     astag line                                
A 000 001c01:    Dump dcache line:    adline line                               
A 000 001c01:    Dump icache line:    ailine line                               
A 000 001c01:    Dump scache line:    asline line                               
A 000 001c01:    Store a scache dword:  sscache line taglo taghi                
A 000 001c01:    Store a scache tag:    sstag line taglo taghi [way]            
A 000 001c01:    Set memory mode:     go dex|unc|cac                            
A 000 001c01:    Hub_send_data_err:   hubsde                                    
A 000 001c01:    Rtr_send_data_err:   rtrsde                                    
A 000 001c01:    Check local link:    chklink                                   
A 000 001c01:    Self-test hub:       bist le|ae|lr|ar [n:NODE]                 
A 000 001c01:    Self-test router:    rbist le|ae|lr|ar VEC                     
A 000 001c01:    Self-test memory:    mbist ADDR                                
A 000 001c01:    Disable CPU/MEM:     disable n:NODE [SLICE/BANKS]              
A 000 001c01:    Enable CPU/MEM:      enable n:NODE [SLICE/BANKS]               
A 000 001c01:    Temp. disable:       tdisable n:NODE [SLICE]                   
A 000 001c01: Cache tests                                                       
A 000 001c01:    Instruction Cache test: icachetest                             
A 000 001c01:    Primary Cache test:     dcachetest                             
A 000 001c01:    Secondary Cache test:   scachetest                             
A 000 001c01:    compute CPU frequency                                          
A 000 001c01:    Generate SAMSUNG WAR                                           
A 000 001c01: I/O PROM                                                          
A 000 001c01:    List segments:       segs [FLAG]                               
A 000 001c01:    Load/exec segment:   exec [SEGNAME [FLAG]]                     
A 000 001c01:    Reconfig. memory:    reconf                                    
A 000 001c01: Console Selection                                                 
A 000 001c01:    Use IOC3/IOC4 UART:  ioc                                       
A 000 001c01:    Use JunkBus UART:    junk                                      
A 000 001c01:    Use SysCtlr UART:    elsc                                      
A 000 001c01:    Use Net UART:        talk [n:NODE SLICE]                       
A 000 001c01: Error Registers                                                   
A 000 001c01:    Dump II CRBs:        crb [n:NODE]                              
A 000 001c01:    137-col wide crb:    crbx [n:NODE]                             
A 000 001c01:    Dump PI err spool:   dumpspool [n:NODE SLICE]                  
A 000 001c01:    Dump error info:     error_dump                                
A 000 001c01:    Dump reset error:    reset_dump                                
A 000 001c01:    Dump bridge errs:    edump_bri [n:NODE]                        
A 000 001c01: System Controller                                                 
A 000 001c01:    System ctlr cmd:     sc ["STRING"]                             
A 000 001c01:    Wr sysctlr nvram:    scw ADDR [VAL [COUNT]]                    
A 000 001c01:    Rd sysctlr nvram:    scr ADDR [COUNT]                          
A 000 001c01:    Rd sysctlr dbgsw:    dips                                      
A 000 001c01:    Set/get debug sw:    dbg [VIRT_VAL PHYS_VAL]                   
A 000 001c01:    Set/get password:    pas ["PASW"]                              
A 000 001c01:    Set/get module #:    module [NUM]                              
A 000 001c01:    Set/get partition #: partition [NUM]                           
A 000 001c01:    Get module NIC:      modnic                                    
A 000 001c01: Debugging                                                         
A 000 001c01:    Verbose mode:        verbose [1|0]                             
A 000 001c01:    Use alt. regs:       altregs [NUM]                             
A 000 001c01:    kernel debugging:    kdebug [STACKADDR]                        
A 000 001c01:    Use kernel symtab:   kern_sym                                  
A 000 001c01:    Send NMI to node:    nmi n:NODE [SLICE]                        
A 000 001c01:    Why are we here?:    why                                       
A 000 001c01:    Stack backtrace:     btrace [epc sp]                           
A 000 001c01:    Switch to cpu:       cpu [[n:NODE] SLICE]                      
A 000 001c01:    Disassemble:         dis ADDR [COUNT]                          
A 000 001c01:    Dump mem cfg:        dmc [n:NODE]                              
A 000 001c01:    Run FRU analyzer:    fru [1(local) | 2(all node)]              
A 000 001c01: Environment Variables and Error Log                               
A 000 001c01:    Init. PROM log:      initlog [n:NODE]                          
A 000 001c01:    Clear PROM log:      clearlog [n:NODE]                         
A 000 001c01:    Init. all PROM logs in system: initalllogs                     
A 000 001c01:    Clear all PROM logs in system: clearalllogs                    
A 000 001c01:    Set variable:        setenv [n:NODE] KEY ["STRING"]            
A 000 001c01:    Remove variable:     unsetenv [n:NODE] KEY                     
A 000 001c01:    Print variables:     printenv [n:NODE] [KEY]                   
A 000 001c01:    Tail log entries:    log [n:NODE] [TAIL_CNT [HEAD_CNT]]        
A 000 001c01:    power cycle CBrick                                             
A 000 001c01:    initialize PLL WAR variables                                   
A 000 001c01:    obtain PLL WAR statistics                                      
A 000 001c01: I/O Diagnostics                                                   
A 000 001c01:    XBow Diagnostic:     dgxbow [m<n|h|m>] [n<NODE>]               
A 000 001c01:    Bridge Diagnostic:   dgbrdg [m<n|h|m>] [n<NODE>] [s<slot>]     
A 000 001c01:    IO7 Conf Spc Diag:   dgconf [m<n|h|m>] [n<NODE>] [s<slot>]     
A 000 001c01:    PCI Bus Diag.:       dgpci [m<n|h|m>] [n<NODE>] [s<slot>] [p<P]
A 000 001c01:    Serial PIO Diag:     dgspio [m<n|h|m|x>] [n<NODE>] [s<slot>] [ 
A 000 001c01:    Serial DMA Diag:     dgsdma [m<n|h|m|x>] [n<NODE>] [s<slot>] [ 
A 000 001c01:    Keyb/Mouse Diag:     dgpckm [m<n|m>]                           

...
... Now enter CAC mode ... but we have to go to DEX mode first
...

A 000 001c01: POD SysCt Cac> go cac                                             
A 000 001c01: Must be in Dex mode before switching to Cac or Unc.                        
A 000 001c01: POD SysCt Cac> go dex                                             
A 000 001c01:                                                                   
A 000 001c01: *** Requested DEX mode on node 0
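
As a note to self, the ordering rule the PROM printed above ("Must be in Dex mode before switching to Cac or Unc.") can be written down as a tiny Python sketch. The helper name `commands_for` is hypothetical, and it only builds the list of POD commands to type; it does not talk to the machine. It always goes through Dex first, which is what I did above, and it assumes `go cac` or `go unc` is accepted once in Dex mode, as the message implies.

```python
# Memory modes listed by the POD help text: "go dex|unc|cac".
MODES = ("dex", "unc", "cac")


def commands_for(target):
    """Return the POD commands to type to reach the target memory mode.

    Assumption: the only rule modelled is the one the PROM printed,
    i.e. Cac and Unc are reached from Dex.
    """
    if target not in MODES:
        raise ValueError("unknown mode: %r" % (target,))
    if target == "dex":
        return ["go dex"]
    # Always step through Dex first, then request the real target.
    return ["go dex", "go " + target]


if __name__ == "__main__":
    for mode in MODES:
        print(mode, "->", "; ".join(commands_for(mode)))
```

The help text says commands may be separated by semicolons, so the joined form is printed only as a reminder of the order; I have only tried typing the commands one at a time.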

...
... Get DEX Command Help
...

A 000 001c01: POD SysCt Dex> ?                                                  
A 000 001c01: Commands may be separated by semicolons, grouped with             
A 000 001c01: curly braces, and used in nested loop constructs.                 
A 000 001c01:                                                                   
A 000 001c01: Calculator                                                        
A 000 001c01:    Print hex:           px EXPR                                   
A 000 001c01:    Print decimal:       pd EXPR                                   
A 000 001c01:    Print octal:         po EXPR                                   
A 000 001c01:    Print binary:        pb EXPR                                   
A 000 001c01:    Look up PROM addr:   nm ADDR                                   
A 000 001c01: Hardware Registers                                                
A 000 001c01:    Print register(s):   pr [GPRNAME [VAL]                         
A 000 001c01:    Print fpreg(s):      pf [REGNO]                                
A 000 001c01:    Store register:      sr REG VAL                                
A 000 001c01:    Store fpreg:         sf REGNO VAL                              
A 000 001c01: Memory Access                                                     
A 000 001c01:    Print address:       pa ADDR [BITNO]                           
A 000 001c01:    Load byte:           lb ADDR [COUNT]                           
A 000 001c01:    Load half-word:      lh ADDR [COUNT]                           
A 000 001c01:    Load word:           lw ADDR [COUNT]                           
A 000 001c01:    Load double-word:    ld ADDR [COUNT]                           
A 000 001c01:    Load ASCII:          la ADDR [COUNT]                           
A 000 001c01:    Store byte:          sb ADDR [VAL [COUNT]]                     
A 000 001c01:    Store half-word:     sh ADDR [VAL [COUNT]]                     
A 000 001c01:    Store word:          sw ADDR [VAL [COUNT]]                     
A 000 001c01:    Store double-word:   sd ADDR [VAL [COUNT]]                     
A 000 001c01:    Store and verify:    sdv ADDR VAL                              
A 000 001c01: Memory Operations                                                 
A 000 001c01:    Fill mem w/ byte:    memset DST BYTE LEN                       
A 000 001c01:    Copy memory bytes:   memcpy DST SRC LEN                        
A 000 001c01:    Cmp memory bytes:    memcmp DST SRC LEN                        
A 000 001c01:    Add memory bytes:    memsum SRC LEN                            
A 000 001c01: Memory Testing                                                    
A 000 001c01:    Mem. sanity test:    santest ADDR                              
A 000 001c01:    Dir/prot init:       dirinit START LEN                         
A 000 001c01:    Memory clear:        meminit START LEN                         
A 000 001c01:    Dir. test/init:      dirtest START LEN                         
A 000 001c01:    Memory test/init:    memtest START LEN                         
A 000 001c01:    Clear errors:        clear                                     
A 000 001c01:    Display errors:      error                                     
A 000 001c01:    Quality mode:        qual [1|0]                                
A 000 001c01:    ECC mode:            ecc [1|0]                                 
A 000 001c01:    Set R10k int mask:   im [BYTE]                                 
A 000 001c01:    Test error limit:    maxerr COUNT                              
A 000 001c01:    Scan dir states:     scandir ADDR [LEN]                        
A 000 001c01:    Directory state:     dirstate [BASE [LEN [STATE]]]             
A 000 001c01: Network and Vectors                                               
A 000 001c01:    Vector read:         vr VEC VADDR                              
A 000 001c01:    Vector write:        vw VEC VADDR VAL                          
A 000 001c01:    Vector exchange:     vx VEC VADDR VAL                          
A 000 001c01:    Discover network:    disc                                      
A 000 001c01:    Dump pcfg struct:    pcfg [n:NODE] [v]                         
A 000 001c01:    Get/set node ID:     node [[VEC] ID]                           
A 000 001c01:    Set up route:        route [VEC NODE]                          
A 000 001c01:    Read router NIC:     rnic [VEC]                                
A 000 001c01:    Dump config info.:   cfg [n:NODE]                              
A 000 001c01:    Dump route table:    rtab [VEC]                                
A 000 001c01:    Dmp/clr rtr stat:    rstat VEC                                 
A 000 001c01: Control Structures                                                
A 000 001c01:    Reset the system:    reset [all]                               
A 000 001c01:    Softreset a node:    softreset n:NODE                          
A 000 001c01:    Call subroutine:     call ADDR [A0 [A1 [...]]]                 
A 000 001c01:    Inv. cache & jump:   jump ADDR [A0 [A1]]                       
A 000 001c01:    Goto slave mode:     slave                                     
A 000 001c01:    Repeat count:        repeat COUNT CMD                          
A 000 001c01:    Repeat forever:      loop CMD                                  
A 000 001c01:    While loop:          while (EXPR) CMD                          
A 000 001c01:    For loop:            for (CMD;EXPR;CMD) CMD                    
A 000 001c01:    If statement:        if (EXPR) CMD                             
A 000 001c01:    Delay:               delay MICROSEC                            
A 000 001c01:    Sleep:               sleep SEC                                 
A 000 001c01:    Benchmark timing:    time CMD                                  
A 000 001c01:    Echo string:         echo "STRING"                             
A 000 001c01: Miscellaneous                                                     
A 000 001c01:    Show PROM version:   version                                   
A 000 001c01:    Display help:        help [CMDNAME]                            
A 000 001c01:    Read hub NIC:        nic [n:NODE]                              
A 000 001c01:    Prgm remote PROM:    flash NODE [...]                          
A 000 001c01:    Prgm remote PROM with values:    fflash NODE [...]             
A 000 001c01:    Prgm modebits with values: setmodebits NODE [...]              
A 000 001c01: TLB and Cache                                                     
A 000 001c01:    Clear TLB:           tlbc [INDEX]                              
A 000 001c01:    Read TLB:            tlbr [INDEX]                              
A 000 001c01:    Inv. cache(s):       inval [i][d][s]                           
A 000 001c01:    Flush+inv caches:    flush                                     
A 000 001c01:    Dump dcache tag:     dtag line                                 
A 000 001c01:    Dump icache tag:     dtag line                                 
A 000 001c01:    Dump scache tag:     stag line                                 
A 000 001c01:    Dump dcache line:    dline line                                
A 000 001c01:    Dump icache line:    iline line                                
A 000 001c01:    Dump scache line:    sline line                                
A 000 001c01:    Dump dcache tag:     adtag line                                
A 000 001c01:    Dump icache tag:     aitag line                                
A 000 001c01:    Dump scache tag:     astag line                                
A 000 001c01:    Dump dcache line:    adline line                               
A 000 001c01:    Dump icache line:    ailine line                               
A 000 001c01:    Dump scache line:    asline line                               
A 000 001c01:    Store a scache dword:  sscache line taglo taghi                
A 000 001c01:    Store a scache tag:    sstag line taglo taghi [way]            
A 000 001c01:    Set memory mode:     go dex|unc|cac                            
A 000 001c01:    Hub_send_data_err:   hubsde                                    
A 000 001c01:    Rtr_send_data_err:   rtrsde                                    
A 000 001c01:    Check local link:    chklink                                   
A 000 001c01:    Self-test hub:       bist le|ae|lr|ar [n:NODE]                 
A 000 001c01:    Self-test router:    rbist le|ae|lr|ar VEC                     
A 000 001c01:    Self-test memory:    mbist ADDR                                
A 000 001c01:    Disable CPU/MEM:     disable n:NODE [SLICE/BANKS]              
A 000 001c01:    Enable CPU/MEM:      enable n:NODE [SLICE/BANKS]               
A 000 001c01:    Temp. disable:       tdisable n:NODE [SLICE]                   
A 000 001c01: Cache tests                                                       
A 000 001c01:    Instruction Cache test: icachetest                             
A 000 001c01:    Primary Cache test:     dcachetest                             
A 000 001c01:    Secondary Cache test:   scachetest                             
A 000 001c01:    compute CPU frequency                                          
A 000 001c01:    Generate SAMSUNG WAR                                           
A 000 001c01: I/O PROM                                                          
A 000 001c01:    List segments:       segs [FLAG]                               
A 000 001c01:    Load/exec segment:   exec [SEGNAME [FLAG]]                     
A 000 001c01:    Reconfig. memory:    reconf                                    
A 000 001c01: Console Selection                                                 
A 000 001c01:    Use IOC3/IOC4 UART:  ioc                                       
A 000 001c01:    Use JunkBus UART:    junk                                      
A 000 001c01:    Use SysCtlr UART:    elsc                                      
A 000 001c01:    Use Net UART:        talk [n:NODE SLICE]                       
A 000 001c01: Error Registers                                                   
A 000 001c01:    Dump II CRBs:        crb [n:NODE]                              
A 000 001c01:    137-col wide crb:    crbx [n:NODE]                             
A 000 001c01:    Dump PI err spool:   dumpspool [n:NODE SLICE]                  
A 000 001c01:    Dump error info:     error_dump                                
A 000 001c01:    Dump reset error:    reset_dump                                
A 000 001c01:    Dump bridge errs:    edump_bri [n:NODE]                        
A 000 001c01: System Controller                                                 
A 000 001c01:    System ctlr cmd:     sc ["STRING"]                             
A 000 001c01:    Wr sysctlr nvram:    scw ADDR [VAL [COUNT]]                    
A 000 001c01:    Rd sysctlr nvram:    scr ADDR [COUNT]                          
A 000 001c01:    Rd sysctlr dbgsw:    dips                                      
A 000 001c01:    Set/get debug sw:    dbg [VIRT_VAL PHYS_VAL]                   
A 000 001c01:    Set/get password:    pas ["PASW"]                              
A 000 001c01:    Set/get module #:    module [NUM]                              
A 000 001c01:    Set/get partition #: partition [NUM]                           
A 000 001c01:    Get module NIC:      modnic                                    
A 000 001c01: Debugging                                                         
A 000 001c01:    Verbose mode:        verbose [1|0]                             
A 000 001c01:    Use alt. regs:       altregs [NUM]                             
A 000 001c01:    kernel debugging:    kdebug [STACKADDR]                        
A 000 001c01:    Use kernel symtab:   kern_sym                                  
A 000 001c01:    Send NMI to node:    nmi n:NODE [SLICE]                        
A 000 001c01:    Why are we here?:    why                                       
A 000 001c01:    Stack backtrace:     btrace [epc sp]                           
A 000 001c01:    Switch to cpu:       cpu [[n:NODE] SLICE]                      
A 000 001c01:    Disassemble:         dis ADDR [COUNT]                          
A 000 001c01:    Dump mem cfg:        dmc [n:NODE]                              
A 000 001c01:    Run FRU analyzer:    fru [1(local) | 2(all node)]              
A 000 001c01: Environment Variables and Error Log                               
A 000 001c01:    Init. PROM log:      initlog [n:NODE]                          
A 000 001c01:    Clear PROM log:      clearlog [n:NODE]                         
A 000 001c01:    Init. all PROM logs in system: initalllogs                     
A 000 001c01:    Clear all PROM logs in system: clearalllogs                    
A 000 001c01:    Set variable:        setenv [n:NODE] KEY ["STRING"]            
A 000 001c01:    Remove variable:     unsetenv [n:NODE] KEY                     
A 000 001c01:    Print variables:     printenv [n:NODE] [KEY]                   
A 000 001c01:    Tail log entries:    log [n:NODE] [TAIL_CNT [HEAD_CNT]]        
A 000 001c01:    power cycle CBrick                                             
A 000 001c01:    initialize PLL WAR variables                                   
A 000 001c01:    obtain PLL WAR statistics                                      
A 000 001c01: I/O Diagnostics                                                   
A 000 001c01:    XBow Diagnostic:     dgxbow [m<n|h|m>] [n<NODE>]               
A 000 001c01:    Bridge Diagnostic:   dgbrdg [m<n|h|m>] [n<NODE>] [s<slot>]     
A 000 001c01:    IO7 Conf Spc Diag:   dgconf [m<n|h|m>] [n<NODE>] [s<slot>]     
A 000 001c01:    PCI Bus Diag.:       dgpci [m<n|h|m>] [n<NODE>] [s<slot>] [p<P]
A 000 001c01:    Serial PIO Diag:     dgspio [m<n|h|m|x>] [n<NODE>] [s<slot>] [ 
A 000 001c01:    Serial DMA Diag:     dgsdma [m<n|h|m|x>] [n<NODE>] [s<slot>] [ 
A 000 001c01:    Keyb/Mouse Diag:     dgpckm [m<n|m>]                           

...
... Enter CAC Mode
...

A 000 001c01: POD SysCt Dex>  go cac                                            
A 000 001c01: Testing/Initializing memory                                       
A 000 001c01: Init PROM text/data (0x9600000001a00000), len 0x16c000            
A 000 001c01:   Initializing dir/prot                                           
A 000 001c01:   Initializing ECC                                                
A 000 001c01:   Clearing memory                                                 
A 000 001c01: Copy PROM (0x9000000018000000) to RAM (0x9600000001a00000), len 08
A 000 001c01: Done                                                              
A 000 001c01: Init PROM bss (0x9600000001b6c000), len 0x8000                    
A 000 001c01:   Initializing dir/prot                                           
A 000 001c01:   Initializing ECC                                                
A 000 001c01:   Clearing memory                                                 
A 000 001c01: Init PROM stack/structures (0x96000000020d0000), len 0x12000      
A 000 001c01:   Initializing dir/prot                                           
A 000 001c01:   Initializing ECC                                                
A 000 001c01:   Clearing memory                                                 
A 000 001c01: Done                                                              
A 000 001c01:                                                                   
A 000 001c01: *** Requested CAC mode on node 0

...
... Get CAC Command Help
...

A 000 001c01: POD SysCt Cac> ?                                                  
A 000 001c01: Commands may be separated by semicolons, grouped with             
A 000 001c01: curly braces, and used in nested loop constructs.                 
A 000 001c01:                                                                   
A 000 001c01: Calculator                                                        
A 000 001c01:    Print hex:           px EXPR                                   
A 000 001c01:    Print decimal:       pd EXPR                                   
A 000 001c01:    Print octal:         po EXPR                                   
A 000 001c01:    Print binary:        pb EXPR                                   
A 000 001c01:    Look up PROM addr:   nm ADDR                                   
A 000 001c01: Hardware Registers                                                
A 000 001c01:    Print register(s):   pr [GPRNAME [VAL]                         
A 000 001c01:    Print fpreg(s):      pf [REGNO]                                
A 000 001c01:    Store register:      sr REG VAL                                
A 000 001c01:    Store fpreg:         sf REGNO VAL                              
A 000 001c01: Memory Access                                                     
A 000 001c01:    Print address:       pa ADDR [BITNO]                           
A 000 001c01:    Load byte:           lb ADDR [COUNT]                           
A 000 001c01:    Load half-word:      lh ADDR [COUNT]                           
A 000 001c01:    Load word:           lw ADDR [COUNT]                           
A 000 001c01:    Load double-word:    ld ADDR [COUNT]                           
A 000 001c01:    Load ASCII:          la ADDR [COUNT]                           
A 000 001c01:    Store byte:          sb ADDR [VAL [COUNT]]                     
A 000 001c01:    Store half-word:     sh ADDR [VAL [COUNT]]                     
A 000 001c01:    Store word:          sw ADDR [VAL [COUNT]]                     
A 000 001c01:    Store double-word:   sd ADDR [VAL [COUNT]]                     
A 000 001c01:    Store and verify:    sdv ADDR VAL                              
A 000 001c01: Memory Operations                                                 
A 000 001c01:    Fill mem w/ byte:    memset DST BYTE LEN                       
A 000 001c01:    Copy memory bytes:   memcpy DST SRC LEN                        
A 000 001c01:    Cmp memory bytes:    memcmp DST SRC LEN                        
A 000 001c01:    Add memory bytes:    memsum SRC LEN                            
A 000 001c01: Memory Testing                                                    
A 000 001c01:    Mem. sanity test:    santest ADDR                              
A 000 001c01:    Dir/prot init:       dirinit START LEN                         
A 000 001c01:    Memory clear:        meminit START LEN                         
A 000 001c01:    Dir. test/init:      dirtest START LEN                         
A 000 001c01:    Memory test/init:    memtest START LEN                         
A 000 001c01:    Clear errors:        clear                                     
A 000 001c01:    Display errors:      error                                     
A 000 001c01:    Quality mode:        qual [1|0]                                
A 000 001c01:    ECC mode:            ecc [1|0]                                 
A 000 001c01:    Set R10k int mask:   im [BYTE]                                 
A 000 001c01:    Test error limit:    maxerr COUNT                              
A 000 001c01:    Scan dir states:     scandir ADDR [LEN]                        
A 000 001c01:    Directory state:     dirstate [BASE [LEN [STATE]]]             
A 000 001c01: Network and Vectors                                               
A 000 001c01:    Vector read:         vr VEC VADDR                              
A 000 001c01:    Vector write:        vw VEC VADDR VAL                          
A 000 001c01:    Vector exchange:     vx VEC VADDR VAL                          
A 000 001c01:    Discover network:    disc                                      
A 000 001c01:    Dump pcfg struct:    pcfg [n:NODE] [v]                         
A 000 001c01:    Get/set node ID:     node [[VEC] ID]                           
A 000 001c01:    Set up route:        route [VEC NODE]                          
A 000 001c01:    Read router NIC:     rnic [VEC]                                
A 000 001c01:    Dump config info.:   cfg [n:NODE]                              
A 000 001c01:    Dump route table:    rtab [VEC]                                
A 000 001c01:    Dmp/clr rtr stat:    rstat VEC                                 
A 000 001c01: Control Structures                                                
A 000 001c01:    Reset the system:    reset [all]                               
A 000 001c01:    Softreset a node:    softreset n:NODE                          
A 000 001c01:    Call subroutine:     call ADDR [A0 [A1 [...]]]                 
A 000 001c01:    Inv. cache & jump:   jump ADDR [A0 [A1]]                       
A 000 001c01:    Goto slave mode:     slave                                     
A 000 001c01:    Repeat count:        repeat COUNT CMD                          
A 000 001c01:    Repeat forever:      loop CMD                                  
A 000 001c01:    While loop:          while (EXPR) CMD                          
A 000 001c01:    For loop:            for (CMD;EXPR;CMD) CMD                    
A 000 001c01:    If statement:        if (EXPR) CMD                             
A 000 001c01:    Delay:               delay MICROSEC                            
A 000 001c01:    Sleep:               sleep SEC                                 
A 000 001c01:    Benchmark timing:    time CMD                                  
A 000 001c01:    Echo string:         echo "STRING"                             
A 000 001c01: Miscellaneous                                                     
A 000 001c01:    Show PROM version:   version                                   
A 000 001c01:    Display help:        help [CMDNAME]                            
A 000 001c01:    Read hub NIC:        nic [n:NODE]                              
A 000 001c01:    Prgm remote PROM:    flash NODE [...]                          
A 000 001c01:    Prgm remote PROM with values:    fflash NODE [...]             
A 000 001c01:    Prgm modebits with values: setmodebits NODE [...]              
A 000 001c01: TLB and Cache                                                     
A 000 001c01:    Clear TLB:           tlbc [INDEX]                              
A 000 001c01:    Read TLB:            tlbr [INDEX]                              
A 000 001c01:    Inv. cache(s):       inval [i][d][s]                           
A 000 001c01:    Flush+inv caches:    flush                                     
A 000 001c01:    Dump dcache tag:     dtag line                                 
A 000 001c01:    Dump icache tag:     dtag line                                 
A 000 001c01:    Dump scache tag:     stag line                                 
A 000 001c01:    Dump dcache line:    dline line                                
A 000 001c01:    Dump icache line:    iline line                                
A 000 001c01:    Dump scache line:    sline line                                
A 000 001c01:    Dump dcache tag:     adtag line                                
A 000 001c01:    Dump icache tag:     aitag line                                
A 000 001c01:    Dump scache tag:     astag line                                
A 000 001c01:    Dump dcache line:    adline line                               
A 000 001c01:    Dump icache line:    ailine line                               
A 000 001c01:    Dump scache line:    asline line                               
A 000 001c01:    Store a scache dword:  sscache line taglo taghi                
A 000 001c01:    Store a scache tag:    sstag line taglo taghi [way]            
A 000 001c01:    Set memory mode:     go dex|unc|cac                            
A 000 001c01:    Hub_send_data_err:   hubsde                                    
A 000 001c01:    Rtr_send_data_err:   rtrsde                                    
A 000 001c01:    Check local link:    chklink                                   
A 000 001c01:    Self-test hub:       bist le|ae|lr|ar [n:NODE]                 
A 000 001c01:    Self-test router:    rbist le|ae|lr|ar VEC                     
A 000 001c01:    Self-test memory:    mbist ADDR                                
A 000 001c01:    Disable CPU/MEM:     disable n:NODE [SLICE/BANKS]              
A 000 001c01:    Enable CPU/MEM:      enable n:NODE [SLICE/BANKS]               
A 000 001c01:    Temp. disable:       tdisable n:NODE [SLICE]                   
A 000 001c01: Cache tests                                                       
A 000 001c01:    Instruction Cache test: icachetest                             
A 000 001c01:    Primary Cache test:     dcachetest                             
A 000 001c01:    Secondary Cache test:   scachetest                             
A 000 001c01:    compute CPU frequency                                          
A 000 001c01:    Generate SAMSUNG WAR                                           
A 000 001c01: I/O PROM                                                          
A 000 001c01:    List segments:       segs [FLAG]                               
A 000 001c01:    Load/exec segment:   exec [SEGNAME [FLAG]]                     
A 000 001c01:    Reconfig. memory:    reconf                                    
A 000 001c01: Console Selection                                                 
A 000 001c01:    Use IOC3/IOC4 UART:  ioc                                       
A 000 001c01:    Use JunkBus UART:    junk                                      
A 000 001c01:    Use SysCtlr UART:    elsc                                      
A 000 001c01:    Use Net UART:        talk [n:NODE SLICE]                       
A 000 001c01: Error Registers                                                   
A 000 001c01:    Dump II CRBs:        crb [n:NODE]                              
A 000 001c01:    137-col wide crb:    crbx [n:NODE]                             
A 000 001c01:    Dump PI err spool:   dumpspool [n:NODE SLICE]                  
A 000 001c01:    Dump error info:     error_dump                                
A 000 001c01:    Dump reset error:    reset_dump                                
A 000 001c01:    Dump bridge errs:    edump_bri [n:NODE]                        
A 000 001c01: System Controller                                                 
A 000 001c01:    System ctlr cmd:     sc ["STRING"]                             
A 000 001c01:    Wr sysctlr nvram:    scw ADDR [VAL [COUNT]]                    
A 000 001c01:    Rd sysctlr nvram:    scr ADDR [COUNT]                          
A 000 001c01:    Rd sysctlr dbgsw:    dips                                      
A 000 001c01:    Set/get debug sw:    dbg [VIRT_VAL PHYS_VAL]                   
A 000 001c01:    Set/get password:    pas ["PASW"]                              
A 000 001c01:    Set/get module #:    module [NUM]                              
A 000 001c01:    Set/get partition #: partition [NUM]                           
A 000 001c01:    Get module NIC:      modnic                                    
A 000 001c01: Debugging                                                         
A 000 001c01:    Verbose mode:        verbose [1|0]                             
A 000 001c01:    Use alt. regs:       altregs [NUM]                             
A 000 001c01:    kernel debugging:    kdebug [STACKADDR]                        
A 000 001c01:    Use kernel symtab:   kern_sym                                  
A 000 001c01:    Send NMI to node:    nmi n:NODE [SLICE]                        
A 000 001c01:    Why are we here?:    why                                       
A 000 001c01:    Stack backtrace:     btrace [epc sp]                           
A 000 001c01:    Switch to cpu:       cpu [[n:NODE] SLICE]                      
A 000 001c01:    Disassemble:         dis ADDR [COUNT]                          
A 000 001c01:    Dump mem cfg:        dmc [n:NODE]                              
A 000 001c01:    Run FRU analyzer:    fru [1(local) | 2(all node)]              
A 000 001c01: Environment Variables and Error Log                               
A 000 001c01:    Init. PROM log:      initlog [n:NODE]                          
A 000 001c01:    Clear PROM log:      clearlog [n:NODE]                         
A 000 001c01:    Init. all PROM logs in system: initalllogs                     
A 000 001c01:    Clear all PROM logs in system: clearalllogs                    
A 000 001c01:    Set variable:        setenv [n:NODE] KEY ["STRING"]            
A 000 001c01:    Remove variable:     unsetenv [n:NODE] KEY                     
A 000 001c01:    Print variables:     printenv [n:NODE] [KEY]                   
A 000 001c01:    Tail log entries:    log [n:NODE] [TAIL_CNT [HEAD_CNT]]        
A 000 001c01:    power cycle CBrick                                             
A 000 001c01:    initialize PLL WAR variables                                   
A 000 001c01:    obtain PLL WAR statistics                                      
A 000 001c01: I/O Diagnostics                                                   
A 000 001c01:    XBow Diagnostic:     dgxbow [m<n|h|m>] [n<NODE>]               
A 000 001c01:    Bridge Diagnostic:   dgbrdg [m<n|h|m>] [n<NODE>] [s<slot>]     
A 000 001c01:    IO7 Conf Spc Diag:   dgconf [m<n|h|m>] [n<NODE>] [s<slot>]     
A 000 001c01:    PCI Bus Diag.:       dgpci [m<n|h|m>] [n<NODE>] [s<slot>] [p<P]
A 000 001c01:    Serial PIO Diag:     dgspio [m<n|h|m|x>] [n<NODE>] [s<slot>] [ 
A 000 001c01:    Serial DMA Diag:     dgsdma [m<n|h|m|x>] [n<NODE>] [s<slot>] [ 
A 000 001c01:    Keyb/Mouse Diag:     dgpckm [m<n|m>]                    
A 000 001c01: POD SysCt Cac>

...
... Escape back to L1 (Ctrl-T)
...

escaping to L1 system controller                                                
001c01-L1>debug 0x0                                                             
debug switches set to 0x0000                                                    
                                                                                
returning to console mode  001c01 CPU0, <CTRL_T> to escape to L1

...
... Finally, do a reset to go back to the standard boot process
...
                                                                                
A 000 001c01: POD SysCt Cac> reset                                              
A 000 001c01: Resetting the system...                                           
Starting PROM Boot process                                                      
                                                                                
                                                                                
IP35 PROM SGI Version 6.210  built 02:33:51 PM Aug 26, 2004                     
Testing/Initializing memory ...............             DONE                    
Copying PROM code to memory ...............             DONE                    
Discovering local IO ......................             DONE                    
Discovering NUMAlink connectivity .........                                     
Local hub NUMAlink is down.                                                     
*** Local network link down                                                     
DONE                                                                            
Found 1 objects (1 hubs, 0 routers) in 5897 usec                                
Waiting for peers to complete discovery....             DONE                    
No other nodes present; becoming global master                                  
Global master is /hw/rack/001/bay/01                                            
Intializing any CPUless nodes..............             DONE                    
Checking partitioning information .........             DONE                    
No other nodes present; becoming partition master                               
Local slave entering slave loop                                                 
Local slave entering slave loop                                                 
Local slave entering slave loop                                                 
Loading BASEIO prom .......................             DONE                    
                                                                                
BASEIO PROM Monitor SGI Version 6.210  built 02:30:38 PM Aug 26, 2004 (BE64)    
4 CPUs on 1 nodes found.                                                        
                                                                                
NVRAM checksum is incorrect: reinitializing.                                    
Automatic update of PROM environment disabled                                   
Graphics diagnostics                                                            
                                                                                
Odyssey board #0 found on nasid 0                                               
Running Odyssey xtalk sanity diag...                                            
        Board version 1 - Buzz revision 3B                                      
        On board sdram size: 128 Mb                                             
        Cas latency: CAS 3                                                      
        4 banks by sdram module                                                 
Running Odyssey Buzz registers diag...                                          
Device passed diagnostics                                                       
                                                                                
Installing PROM Device drivers ............                                     
On-board (IO9) tigon3 1000BaseT interface                                       
Base I/O Ethernet set to /dev/ethernet/tg0                                      
Installing Graphics Console...                                                  
graphics install: searching for pipe 0                                          
Probing IOC4 ATA adapter 2                                                      
IOC4 RevId = 83                                                                 
Detected Vendor id/Product MATSHITA DVD-ROM SR-8178                             
                                                                                
Walking SCSI Adapter 0, (pci id 3)                                              
1+ Device Vendor Product: ATA SCSIDE BRIDGE320                                  
2- 3- 4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 1 device(s)                   
                                                                                
                                                                                
Walking SCSI Adapter 1, (pci id 3)                                              
1- 2- 3- 4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 0 device(s)                
                                                                                
Initializing PROM Device drivers ..........                                     
  Initializing Base I/O Ethernet Interface...Done.                              
  ---------------Interface Configuration Summary----------------                
  ASIC|Revision|MAC Address       : 5701|B5|08:00:69:11:e9:d0                   
  Link Negotiation|Advertisement  : On|<H10 F10 H100 F100 H1000 F1000>          
  Link|Speed|Duplex|Rx/Tx FlowCtrl: Up|1000|Full|Off/Off                        
  --------------------------------------------------------------                
DONE                                                                            
                                                                                
escaping to L1 system controller                                                
001c01-L1>
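
The POD help listings above are long, so when I save a capture to a file it can be handy to turn it into a lookup table. The following is only a rough sketch of my own, not an SGI tool: it assumes every console line carries a prefix shaped like "A 000 001c01:", that section headings follow the prefix after a single space, and that command entries are indented further and written as "description: syntax". The function name parse_pod_help is made up for this example.

```python
import re

# An indented command entry, e.g.
# "A 000 001c01:    Print hex:           px EXPR"
ENTRY = re.compile(r"^\S+ \d+ \w+:\s{2,}(?P<desc>.+?):\s+(?P<syntax>\S.*?)\s*$")
# A section heading, e.g. "A 000 001c01: Calculator"
HEADING = re.compile(r"^\S+ \d+ \w+: (?P<name>\S.*?)\s*$")


def parse_pod_help(lines):
    """Group captured POD help lines into {section: [(description, syntax), ...]}."""
    sections = {}
    current = None
    for line in lines:
        entry = ENTRY.match(line)
        if entry and current is not None:
            sections[current].append((entry["desc"], entry["syntax"]))
            continue
        heading = HEADING.match(line)
        if heading:
            current = heading["name"]
            sections.setdefault(current, [])
    # Prompts and status messages look like headings but collect no entries,
    # so drop anything that ended up empty.
    return {name: cmds for name, cmds in sections.items() if cmds}
```

Entries that have no "description: syntax" form (for example "compute CPU frequency") are skipped by this sketch rather than guessed at.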

Sample Session with Fuel (Internal) Serial Comms Port

The Fuel has multiple ways to support diagnostic communications:

  • L1 USB Port - this is the external USB port just below the Ethernet port and can be used to connect Fuel to an L2 controller
  • Internal Serial Port - there is an internal serial port which can also be used to communicate with the L1 controller using a NULL modem serial cable (38400,8,N,1)
  • External Serial Port #1 - this is the lower of the two serial ports above the Ethernet port.
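
As a minimal sketch (my own illustration, not part of the original setup), the 38400 8-N-1 line settings can be applied from a Unix-like host with Python's standard termios module. The helper names and the device path /dev/ttyS0 are assumptions; substitute whatever port your null modem cable is plugged into.

```python
import os
import termios


def make_38400_8n1(attrs):
    """Return a copy of a termios attribute list set to raw 38400 baud, 8-N-1."""
    iflag, oflag, cflag, lflag, _ispeed, _ospeed, cc = attrs
    # 8 data bits, no parity, 1 stop bit; enable receiver, ignore modem lines.
    cflag &= ~(termios.CSIZE | termios.PARENB | termios.CSTOPB)
    cflag |= termios.CS8 | termios.CREAD | termios.CLOCAL
    # Raw mode: no echo, no line editing, no signal characters.
    lflag &= ~(termios.ECHO | termios.ICANON | termios.ISIG | termios.IEXTEN)
    # No software flow control and no CR/NL translation on input or output.
    iflag &= ~(termios.IXON | termios.IXOFF | termios.ICRNL | termios.INLCR)
    oflag &= ~termios.OPOST
    return [iflag, oflag, cflag, lflag, termios.B38400, termios.B38400, list(cc)]


def open_l1_port(path):
    """Open a serial device and apply the 38400 8-N-1 settings to it."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    termios.tcsetattr(fd, termios.TCSANOW, make_38400_8n1(termios.tcgetattr(fd)))
    return fd


# Example use (hypothetical device path):
#   fd = open_l1_port("/dev/ttyS0")
#   os.write(fd, b"\r")          # nudge the L1 into printing its prompt
#   print(os.read(fd, 256))
```

Any terminal program set to the same speed and framing does the same job; the sketch just makes the individual settings explicit.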

When I first got my Fuel, I could communicate with the L1 via USB through an L2 controller, but not via either the internal or the external serial port. To resolve this I had to:

  • edit /etc/inittab - to ensure that the ports were defined and not allocated to other uses
  • edit /etc/uucp/Devices - to allow use of Serial Port #2 communications at multiple speeds
  • edit /etc/ioconfig.conf - to reset the tty config back to tty1 & tty2, as it was reporting the serial ports on tty2 & tty4
  • POD/CAC Reset - go into POD/CAC mode and do a reset of logs to clear errors

Here is an example session via the internal serial port:

001a01-L1>
001a01-L1>
001a01-L1>ver
L1 1.48.1 (Image A), Built 01/22/2007 11:34:20    [Fuel/PE/O300 1MB image]
001a01-L1>?
ERROR: command not found.
001a01-L1>help
Commands are:
*                  autopower|apwr     syscom|junkbus|jb|bedrockbrick              
partdb             cpu                nia|ni|ctc         nib                
iia|ii|cti         iib                iic                iid                
config|cfg         debug              display|dsp        button|btn         
env                fan                help|hlp           history|hist       
l1dbg              link               log                ioport|ioprt       
istat              l1                 leds               margin|mgn         
network            pimm               port|prt           power|pwr          
reset|rst          nmi                softreset|softrst  select|sel         
serial             sysstate           eeprom             uart               
usb                router|rtr         service            date               
nvram              security           flash              reboot_l1          
version|ver        pbay               test|tst           scan               
fru|pci|prom|node  
enter 'hlp <cmd>' for more help on a single command.
001a01-L1>uart
      Baud    Read    Read    Read     Read   Read   Write   Write   Write
UART  Rate    State   Status  Timeouts Breaks Errors State   Status  Timeouts
----  ----    -----   ------  -------- ------ ------ -----   ------  --------
JNK 0 57692   Connect Suspend 0        0      42     Connect Ready   0    
SMP   37500   Connect Ready   0        0      0      Connect Ready   3    

001a01-L1>serial
BSN: MSM019    SSN: XX:XX:XX:XX:XX:XX    Time: 02/07/2106 06:28:15    Security: OFF
Public Key data in EEPROM is invalid
001a01-L1>usb

Device: 0  Disconnects: 1  Bus Resets:  20

Endpoint State    Status    Stalls Errors Timeouts
-------- -----    ------    ------ ------ --------
Control  Stalled  Suspended 30085  0      0    
Read     Unconfig Ready     0      0      0    
Write    Unconfig Ready     1      0      0    

001a01-L1>power up
001a01-L1>
entering console mode  001a01 CPU0, <CTRL_T> to escape to L1
Starting PROM Boot process  


IP35 PROM SGI Version 6.211  built 04:16:18 PM Jan 25, 2008
Running in DDR mode
*** Mixed standard and premium memory:
*** Treating all as standard.
Testing/Initializing memory ...............             DONE
Copying PROM code to memory ...............             DONE
Discovering local IO ......................             DONE
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 5884 usec
Waiting for peers to complete discovery....             DONE
No other nodes present; becoming global master
Global master is /hw/rack/001/bay/01
Intializing any CPUless nodes..............             DONE
Checking partitioning information .........             DONE
No other nodes present; becoming partition master
Loading BASEIO prom .......................             DONE

BASEIO PROM Monitor SGI Version 6.211  built 04:15:20 PM Jan 25, 2008 (BE64)
1 CPUs on 1 nodes found.
Automatic update of PROM environment disabled

PS/2 Keyboard & Mouse diagnostics
    Found mouse on port 0
    Found keyboard on port 1
PS/2 Keyboard & Mouse diagnostics passed 

Graphics diagnostics

Odyssey board #0 found on nasid 0
Running Odyssey xtalk sanity diag...
        Board version 1 - Buzz revision 2B
        On board sdram size: 128 Mb
        Cas latency: CAS 3
        4 banks by sdram module
Running Odyssey Buzz registers diag...
Device passed diagnostics

Installing PROM Device drivers ............             
Base I/O Ethernet set to /dev/ethernet/ef0
Installing Graphics Console...
graphics install: searching for pipe 0

Walking SCSI Adapter 0, (pci id 1)
1+ Device Vendor Product: ATA Samsung SSD 840
2- 3- 4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 1 device(s)


Walking SCSI Adapter 1, (pci id 1)
1+ Device Vendor Product: SONY SDT-9000
2+ Device Vendor Product: TOSHIBA DVD-ROM SD-M1711
3- 4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 2 device(s)

Initializing PROM Device drivers ..........             DONE

escaping to L1 system controller
001a01-L1>env
Environmental monitoring is enabled and running.

Description    State       Warning Limits     Fault Limits       Current
-------------- ----------  -----------------  -----------------  -------
           12V    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   12.063
        12V IO    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   12.063
            5V    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    5.044
          3.3V    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.320
          2.5V    Enabled  10%   2.25/  2.75  20%   2.00/  3.00    2.470
          1.5V    Enabled  10%   1.35/  1.65  20%   1.20/  1.80    1.466
        5V AUX    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    5.096
      3.3V AUX    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.285
 PIMM 12V BIAS    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   12.063
          SRAM    Enabled  10%   2.25/  2.75  20%   2.00/  3.00    2.509
          VCPU    Enabled  10%   1.44/  1.76  20%   1.28/  1.92    1.593
     PIMM 1.5V    Enabled  10%   1.35/  1.65  20%   1.20/  1.80    1.495
 PIMM 3.3V AUX    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.268
   PIMM 5V AUX    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    5.096
  XIO 12V BIAS    Enabled  10%  10.80/ 13.20  20%   9.60/ 14.40   12.000
        XIO 5V    Enabled  10%   4.50/  5.50  20%   4.00/  6.00    5.044
      XIO 2.5V    Enabled  10%   2.25/  2.75  20%   2.00/  3.00    2.457
  XIO 3.3V AUX    Enabled  10%   2.97/  3.63  20%   2.64/  3.96    3.285

Description     State       Warning RPM  Current RPM
--------------- ----------  -----------  -----------
FAN  0  EXHAUST    Enabled          920         1188
FAN  1       HD    Enabled         1560         2393
FAN  2      PCI    Enabled         1120         1520
FAN  3    XIO 1    Enabled         1600         2343
FAN  4    XIO 2    Enabled         1600         2220
FAN  5       PS    Enabled         1349        30681

                              Advisory   Critical   Fault      Current      
Description       State       Temp       Temp       Temp       Temp       
----------------- ----------  ---------  ---------  ---------  ---------  
 0 NODE 0            Enabled    [Autofan Control]    80C/176F   31C/ 87F
 1 NODE 1            Enabled    [Autofan Control]    80C/176F   27C/ 80F
 2 NODE 2            Enabled    [Autofan Control]    80C/176F   25C/ 77F
 3 PIMM              Enabled    [Autofan Control]    80C/176F   38C/100F
 4 ODYSSEY           Enabled    [Autofan Control]    80C/176F   24C/ 75F
 5 BEDROCK           Enabled    [Autofan Control]    85C/185F   29C/ 84F


returning to console mode  001a01 console, <CTRL_T> to escape to L1

Sample POD/DEX/CAC Startup on Fuel

Fuel shares the same underlying architecture as the O3000 & Chimera machines, so the same L1 debug setting ("debug 0x10d") boots it into POD/CAC mode:

Starting PROM Boot process
hubii_link_good: A-brick attached to module 001c01.
HUB at 0x0 attached as widget 0xa
001c01/0xa/xbow_arb: nasid= 0x0 xbow_base= 0x9200000000000000
001c01/0xa/xbow_arb: 622 master is 0xa
Check_master: link 10 is master
hubii_link_good: A-brick attached to module 001c01.
Check_master: link 10 is master
 
 
IP35 PROM SGI Version 6.211  built 04:16:18 PM Jan 25, 2008
  built for bedrock rev. 1.1 or greater
running in IP34 mode
Running in DDR mode
Local master CPU A revision: f42
PROM length: 0x168648, BSS length: 0xa7a0, flash count: 16
Configured bedrock clock: 200.0 MHz
Status of local IO: 0x1 0x3fc0fff6403
Bedrock Rev: 2, Module: 1 (001c01) from Sys Ctlr
On PROM entry: ERR_EPC=0xc00000001fc29684 (0xc00000001fc29684)
Configuring memory
Local memory configured: 4096 MB (premium)
*** Warning: System controller debug switches are non-zero (0x10d)
*** Diag level set to None (2)
*** Info level set to verbose
*** Boot stop requested at Global (2)
before reading NICHub NIC: 0x52275dad
SR1 set to 0x0000081698349000
SR0 set to 0x0000000052275dad
Testing/Initializing memory ...............             DONE
Copying PROM code to memory ...............             Copy PROM (0x9000000018000000) to RAM (0x9600000001a00000), len 0x168648
Done
DONE
Skipping secondary cache diags
CPU A switching stack into UALIAS and invalidating D-cache
CPU A switching into node 0 cached RAM
CPU A running cached
Initializing kldir.
Done initializing kldir.
Initializing klconfig.
init_klcfg: nasid 0 start 9600000000030000 size 10000
Done initializing klconfig.
Discovering local IO ......................             Check_master: link 10 is master
Check_master: link 10 is master
DONE
CPU A initialized subnode
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 5886 usec
Waiting for peers to complete discovery....             Discovery results:
ENTRY 0: HUB(52275dad)
    NASID=-1 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: NF
DONE
No other nodes present; becoming global master
Global master is entry 0, NIC 0x52275dad, /hw/rack/001/bay/01
Global master is /hw/rack/001/bay/01
Global barrier (line 4315)Global barrier passed.
Global barrier (line 4348)Global barrier passed.
Master System Topology Graph (pre-nasid_assign):
ENTRY 0: HUB(52275dad)
    NASID=-1 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: NF
Calculating NASIDs
num_routers is 0
Master System Topology Graph:
ENTRY 0: HUB(52275dad)
    NASID=0 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: NF
Distributing routing tables
Distributing NASIDs
*** NASID assigned to 0
CPU A switching to UALIAS
CPU A running in UALIAS
Changing node ID to 0
Global barrier (line 4823)Global barrier passed.
CPU A Flushing and invalidating caches
Global barrier (line 4928)Global barrier passed.
CPU A switching to node 0 cached RAM
CPU A running cached
Nasids in partition:  +0
Regions in partition:  +0
Intializing any CPUless nodes..............             Global barrier (line 7714)Global barrier passed.
Global barrier (line 7715)Global barrier passed.
DONE
Global barrier (line 5089)Global barrier passed.
hubii_link_good: A-brick attached to module 001c01.
Checking partitioning information .........             DONE
No other nodes present; becoming partition master
*** After partitioning ***
ENTRY 0: HUB(52275dad)
    NASID=0 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: FE
Erecting partition fences ................                        DONE
Update config for routers connected to hubs
Update config for hubs and hubless routers
CPU A flushing cache
check_router_cfg: nasid 0 is_voyager 0 check_cfg = 0
Global barrier (line 5300)Global barrier passed.
Nasids in partition:  +0
Regions in partition:  +0
A 000 001c01:
A 000 001c01: *** Entering POD mode on node 0
A 000 001c01: POD SysCt Cac>
escaping to L2 system controller
?-192.168.XXX.XXX-L2>debug 0
001a01:
debug switches set to 0x0000
 
re-entering system console mode (001a01 CPU0), <CTRL_T> to escape to L2
 
A 000 001c01: POD SysCt Cac> reset
A 000 001c01: Resetting the system...

Search for Fuel / O350 PROM Chip

The Octane, Fuel, O3000, O300 & O350 machines all have a PROM chip that can be flashed and dumped using the IRIX "flash" command. There has been quite a bit of speculation about where the PROM chips are and what type they are. This matters because the PROM can be invalidated by a crash during flashing. Also, if you have a Fuel flashed for a higher CPU speed, this is recorded in the PROM chip, and there does not appear to be any way to reset it to work with a lower speed CPU other than putting in a faster CPU, down-flashing the speed, and only then replacing the faster CPU with the slower model.

So is there a way to program or replace the PROM chip to support recovery of machines?

Here is a picture of the Fuel system board. In the middle right you can see the DALLAS DS1742W-120 (NVRAM and RTC) and next to it a small ATMEL EEPROM chip:

Fuel System Board - 030-1707-005 Rev. B

And a detail view of the DALLAS and ATMEL chips:

Fuel DALLAS & ATMEL Chips

The ATMEL chip is marked: ATMEL 116 AT24C04 PC27, which is a 4K bit (512 x 8), 2.7 - 5.5 Volt serial EEPROM.

According to the flash log, the PROM holds 1,476,168 bytes of data.

So can you fit this into 4K bits?

The answer is no... 4K bits is only 512 bytes, which is much, much smaller than the reported PROM flash data size.
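
The size comparison can be checked with a few lines of Python. This is only a sketch of the arithmetic: the PROM length is the "PROM length: 0x168648" value printed in the Fuel boot logs on this page (which works out to the same 1,476,168 bytes quoted from the flash log), and the EEPROM size is the 4K bit (512 x 8) figure given above. The variable names are my own.

```python
# Sizes taken from the notes above; nothing here talks to real hardware.
PROM_LENGTH_BYTES = 0x168648      # "PROM length" printed by the IP35 PROM
EEPROM_BITS = 4 * 1024            # 4K bit part, organised as 512 x 8

eeprom_bytes = EEPROM_BITS // 8
ratio = PROM_LENGTH_BYTES / eeprom_bytes

print(f"PROM image : {PROM_LENGTH_BYTES:,} bytes")
print(f"EEPROM     : {eeprom_bytes:,} bytes")
print(f"PROM image is roughly {ratio:,.0f} times larger than the EEPROM")
```

So whatever the small EEPROM holds, it cannot be the PROM image itself.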


More Fuel POD/CAC to try to understand CPU Speed configuration

In trying to find out how the Fuel CPU speed is controlled, I have been looking at various POD/CAC commands to see what they reveal.

?-192.168.XXX.XXX-L2>power up 
 
re-entering system console mode (001a01 CPU0), <CTRL_T> to escape to L2 
Starting PROM Boot process
hubii_link_good: A-brick attached to module 001c01.
HUB at 0x0 attached as widget 0xa
001c01/0xa/xbow_arb: nasid= 0x0 xbow_base= 0x9200000000000000
001c01/0xa/xbow_arb: 622 master is 0xa
Check_master: link 10 is master
hubii_link_good: A-brick attached to module 001c01.
Check_master: link 10 is master
 
 
IP35 PROM SGI Version 6.211  built 04:16:18 PM Jan 25, 2008
  built for bedrock rev. 1.1 or greater
running in IP34 mode
Running in DDR mode
Local master CPU A revision: f42
PROM length: 0x168648, BSS length: 0xa7a0, flash count: 16 
Configured bedrock clock: 200.0 MHz
Status of local IO: 0x1 0x3fc0fff6403
Bedrock Rev: 2, Module: 1 (001c01) from Sys Ctlr
On PROM entry: ERR_EPC=0xc00000001fc02ac0 (0xc00000001fc02ac0)
Configuring memory
Local memory configured: 4096 MB (premium)
*** Warning: System controller debug switches are non-zero (0x10d)
*** Diag level set to None (2)
*** Info level set to verbose
*** Boot stop requested at Global (2)
before reading NICHub NIC: 0x52275dad
SR1 set to 0x0000081698349000
SR0 set to 0x0000000052275dad
Testing/Initializing memory ...............             DONE
---
---  This section of the diagnostics gives the memory location the PROM is copied to.
---   This is needed for some memory snooping to look at the configuration data in RAM.
---
Copying PROM code to memory ...............             Copy PROM (0x9000000018000000) to RAM (0x9600000001a00000), len 0x168648
Done
DONE
Skipping secondary cache diags
CPU A switching stack into UALIAS and invalidating D-cache
CPU A switching into node 0 cached RAM
CPU A running cached
Initializing kldir.
Done initializing kldir.
Initializing klconfig.
init_klcfg: nasid 0 start 9600000000030000 size 10000
Done initializing klconfig.
Discovering local IO ......................             Check_master: link 10 is master
Check_master: link 10 is master
DONE
CPU A initialized subnode
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 5894 usec
Waiting for peers to complete discovery....             Discovery results:
ENTRY 0: HUB(52275dad)
    NASID=-1 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: NF 
DONE
No other nodes present; becoming global master
Global master is entry 0, NIC 0x52275dad, /hw/rack/001/bay/01
Global master is /hw/rack/001/bay/01
Global barrier (line 4315)Global barrier passed.
Global barrier (line 4348)Global barrier passed.
Master System Topology Graph (pre-nasid_assign):
ENTRY 0: HUB(52275dad)
    NASID=-1 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: NF 
Calculating NASIDs
num_routers is 0
Master System Topology Graph:
ENTRY 0: HUB(52275dad)
    NASID=0 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: NF 
Distributing routing tables
Distributing NASIDs
*** NASID assigned to 0
CPU A switching to UALIAS
CPU A running in UALIAS
Changing node ID to 0
Global barrier (line 4823)Global barrier passed.
CPU A Flushing and invalidating caches
Global barrier (line 4928)Global barrier passed.
CPU A switching to node 0 cached RAM
CPU A running cached
Nasids in partition:  +0 
Regions in partition:  +0 
Intializing any CPUless nodes..............             Global barrier (line 7714)Global barrier passed.
Global barrier (line 7715)Global barrier passed.
DONE
Global barrier (line 5089)Global barrier passed.
hubii_link_good: A-brick attached to module 001c01.
Checking partitioning information .........             DONE
No other nodes present; becoming partition master
*** After partitioning ***
ENTRY 0: HUB(52275dad)
    NASID=0 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: FE 
Erecting partition fences ................                        DONE
Update config for routers connected to hubs
Update config for hubs and hubless routers
CPU A flushing cache
check_router_cfg: nasid 0 is_voyager 0 check_cfg = 0
Global barrier (line 5300)Global barrier passed.
Nasids in partition:  +0 
Regions in partition:  +0 
A 000 001c01:
A 000 001c01: *** Entering POD mode on node 0
---
--- Now let's look at the "ip27conf" area, which holds:
---   CPU Speed
---   HUB Speed
---
--- First check the "ASCII marker" with: la
---   Then dump the bytes with: lb
---
A 000 001c01: POD SysCt Cac> la 0x9600000001a00068 8
A 000 001c01: 9600000001a00068:    i    p    2    7    c    o    n    f  
A 000 001c01: POD SysCt Cac> lb 0x9600000001a00068 32 
A 000 001c01: 9600000001a00068: 69 70 32 37 63 6f 6e 66 00 00 00 00 2f af 08 00 
A 000 001c01: 9600000001a00078: 00 00 00 00 0b eb c2 00 00 00 00 00 00 13 12 d0 
A 000 001c01: POD SysCt Cac>
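
To make the "lb" dump above easier to read, here is a small Python sketch that decodes the 32 bytes. It assumes the area is an 8-byte ASCII marker followed by big-endian 64-bit words (the machine is big-endian, but the word layout is my guess, not taken from SGI documentation). Reading the first two words as CPU and HUB speed in Hz is also an assumption, based on the annotation in the log and on the "Configured bedrock clock: 200.0 MHz" line. The meaning of the third word is not established here.

```python
import struct

# Bytes copied from the two "lb" output lines above.
DUMP = """
69 70 32 37 63 6f 6e 66 00 00 00 00 2f af 08 00
00 00 00 00 0b eb c2 00 00 00 00 00 00 13 12 d0
"""

raw = bytes.fromhex("".join(DUMP.split()))

# Assumed layout: 8-byte ASCII marker, then three big-endian 64-bit words.
marker = raw[:8].decode("ascii")
words = struct.unpack(">3Q", raw[8:32])

print(marker)
for offset, value in zip((0x08, 0x10, 0x18), words):
    print(f"+0x{offset:02x}: 0x{value:016x} = {value:,}")
```

Decoded this way the three words are 800,000,000, 200,000,000 and 1,250,000. The first two would fit an 800 MHz CPU and the 200 MHz bedrock (HUB) clock reported earlier in the boot log.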

And an example of L1 debug flag impact on Fuel boot

Here is a log of booting the Fuel with debug = 0x7890 and then with debug = 0x10d. The log was captured from the L2 emulator...

---
--- OK, start up with the shortest boot possible .. debug == 0x7890
---
001a01:
debug switches set to 0x7890
?-XXX.XXX.XXX.143-L2>l1 power up
?-XXX.XXX.XXX.143-L2>
entering system console mode (001a01 CPU0), <CTRL_T> to escape to L2
*** DIP switch 15 set. Will skip IO and NUMAlink discovery
 
 
IP35 PROM SGI Version 6.211  built 04:16:18 PM Jan 25, 2008
Running in DDR mode
*** Warning: System controller debug switches are non-zero (0x7890)
*** Boot stop requested at Local (1)
*** Giving up global master status
Testing/Initializing memory ...............             DONE
Copying PROM code to memory ...............             DONE
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 5886 usec
Waiting for peers to complete discovery....             DONE
No other nodes present; becoming global master
Global master is /hw/rack/001/bay/01
Intializing any CPUless nodes..............             DONE
Checking partitioning information .........             DONE
No other nodes present; becoming partition master
Suppressing error state display (system just powered on).
A 000 001c01:
A 000 001c01: *** Entering POD mode on node 0
A 000 001c01: POD SysCt Cac>
---
--- OK, now escape back to L2, set the debug to 0x10d and
---   reboot the Fuel (via POD/CAC "reset").
--- This results in a much more complete boot process and hence lots
---   more diagnostic output
---
escaping to L2 system controller
?-XXX.XXX.XXX.143-L2>debug 0x10d
001a01:
debug switches set to 0x010d
 
re-entering system console mode (001a01 CPU0), <CTRL_T> to escape to L2
 
A 000 001c01: POD SysCt Cac> reset
A 000 001c01: Resetting the system...
Starting PROM Boot process
hubii_link_good: A-brick attached to module 001c01.
HUB at 0x0 attached as widget 0xa
001c01/0xa/xbow_arb: nasid= 0x0 xbow_base= 0x9200000000000000
001c01/0xa/xbow_arb: 622 master is 0xa
Check_master: link 10 is master
hubii_link_good: A-brick attached to module 001c01.
Check_master: link 10 is master
 
 
IP35 PROM SGI Version 6.211  built 04:16:18 PM Jan 25, 2008
  built for bedrock rev. 1.1 or greater
running in IP34 mode
Running in DDR mode
Local master CPU A revision: f42
PROM length: 0x168648, BSS length: 0xa7a0, flash count: 16
Configured bedrock clock: 200.0 MHz
Status of local IO: 0x1 0x3fc0fff6403
Bedrock Rev: 2, Module: 1 (001c01) from Sys Ctlr
On PROM entry: ERR_EPC=0xffffffffbfc00300 (0xc00000001fc00300)
Configuring memory
Local memory configured: 4096 MB (premium)
*** Warning: System controller debug switches are non-zero (0x10d)
*** Diag level set to None (2)
*** Info level set to verbose
*** Boot stop requested at Global (2)
before reading NICHub NIC: 0x52275dad
SR1 set to 0x0000081698349000
SR0 set to 0x0000000052275dad
Testing/Initializing memory ...............             DONE
Copying PROM code to memory ...............             Copy PROM (0x9000000018000000) to RAM (0x9600000001a00000), len 0x168648
Done
DONE
Skipping secondary cache diags
CPU A switching stack into UALIAS and invalidating D-cache
CPU A switching into node 0 cached RAM
CPU A running cached
Initializing kldir.
Done initializing kldir.
Initializing klconfig.
init_klcfg: nasid 0 start 9600000000030000 size 10000
Done initializing klconfig.
Discovering local IO ......................             Check_master: link 10 is master
Check_master: link 10 is master
DONE
CPU A initialized subnode
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 5889 usec
Waiting for peers to complete discovery....             Discovery results:
ENTRY 0: HUB(52275dad)
    NASID=-1 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: NF
DONE
No other nodes present; becoming global master
Global master is entry 0, NIC 0x52275dad, /hw/rack/001/bay/01
Global master is /hw/rack/001/bay/01
Global barrier (line 4315)Global barrier passed.
Global barrier (line 4348)Global barrier passed.
Master System Topology Graph (pre-nasid_assign):
ENTRY 0: HUB(52275dad)
    NASID=-1 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: NF
Calculating NASIDs
num_routers is 0
Master System Topology Graph:
ENTRY 0: HUB(52275dad)
    NASID=0 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: NF
Distributing routing tables
Distributing NASIDs
*** NASID assigned to 0
CPU A switching to UALIAS
CPU A running in UALIAS
Changing node ID to 0
Global barrier (line 4823)Global barrier passed.
CPU A Flushing and invalidating caches
Global barrier (line 4928)Global barrier passed.
CPU A switching to node 0 cached RAM
CPU A running cached
Nasids in partition:  +0
Regions in partition:  +0
Intializing any CPUless nodes..............             Global barrier (line 7714)Global barrier passed.
Global barrier (line 7715)Global barrier passed.
DONE
Global barrier (line 5089)Global barrier passed.
hubii_link_good: A-brick attached to module 001c01.
Checking partitioning information .........             DONE
No other nodes present; becoming partition master
*** After partitioning ***
ENTRY 0: HUB(52275dad)
    NASID=0 Mod=1 Flg=0x9500000 PROM=6.211 Route=N/A
    MODULE=001c01 PARTITION=0 SPACE=RESET
    Port 1 connection: Not connected
    Port status: FE
Erecting partition fences ................                        DONE
Update config for routers connected to hubs
Update config for hubs and hubless routers
CPU A flushing cache
check_router_cfg: nasid 0 is_voyager 0 check_cfg = 0
Global barrier (line 5300)Global barrier passed.
Nasids in partition:  +0
Regions in partition:  +0
A 000 001c01:
A 000 001c01: *** Entering POD mode on node 0
A 000 001c01: POD SysCt Cac>
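
To compare the two debug values used in this log, the following Python sketch lists which bits are set in each. The one-based "switch" numbering is an assumption on my part: the PROM prints "DIP switch 15 set" when debug is 0x7890, and in that value bit 15 (0x8000) is clear while bit 14 (0x4000) is set, so the switches appear to be numbered from 1. What each individual switch does is not covered here.

```python
def set_bits(value: int) -> list[int]:
    """Return the zero-based positions of the bits set in value."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


for debug in (0x7890, 0x010D):
    bits = set_bits(debug)
    switches = [bit + 1 for bit in bits]  # assumed one-based DIP numbering
    print(f"0x{debug:04x}: bits {bits}, switches {switches}")
```

With that numbering, 0x7890 corresponds to switches 5, 8, 12, 13, 14 and 15, and 0x10d to switches 1, 3, 4 and 9.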

That's all folks .....


NOTE: An O350 Chimera machine with Graphics reports as a "ChiBlade", hence the swords graphics, which is from: "The Complete History Of The Japanese Samurai Sword".