SGI NVRAM & Time Keeper Chips

DIY information on replacing/reviving NVRAM and RTC chips on SGI machines

Status: August 2020 - I am having date issues with one O350 and NVRAM error on another, so need to look at doing something.

March 2022 - Replacement Dallas DS1742 now being built by multiple people, so testing these and providing link


With old Apple Macintoshes it is slowly leaking capacitors that are mostly responsible for killing them off. The required remediation is to get out the soldering iron and replace all the leaky capacitors on the system board. On Macs this treatment also extends to the capacitors in the power supply and, for original PowerBook portables, to the capacitors embedded in the display.

For old SGI (Silicon Graphics Incorporated) machines it is the Dallas NVRAM and Snaphat battery/oscillator chips which fail. These have enclosed batteries with an approximate 10 - 20 year life expectancy before needing replacement.

In the case of the Dallas DS1742W-120 chip the problem is compounded by the fact that replacement Dallas chips are no longer manufactured, and even if they were there would still be potential problems. The issue with replacing the Dallas chip is that a machine can be rendered unbootable if the chip is replaced with one that has not been cleared or that holds invalid data.

So here is a "tips" section to collect information on SGI timer and NVRAM chip replacement.


Problem Cases

The Tezro, O350 and Fuel all have two independent RTC and battery backed RAM chips. These are for:

  • L1 - Clock and Configuration Data (as managed via L1)
  • Systems - Clock and NVRAM Data (as managed via PROM and OS)

Both can fail, and the failures have different and distinct behavior. Here is a list of likely problems and their causes:

  • "Preposterous Date" - the system complains about a preposterous date on boot and you are unable to log in via the GUI. To log in, get to the PROM and boot into single user mode, then reset the time and reboot. If you disconnect the machine from power it will lose the date again. The fix is to replace the Snaphat Oscillator / Battery (see below)
  • "L1 Date Wrong" - if the L1 date is wrong and is not being maintained across power cycles, then the DALLAS DS1742W-120 battery is flat. The solution is to replace the DALLAS or put a new battery on top (see below)
  • "L1 Serial Number Not Sticking" - the machine keeps losing its serial number, which can result in failure to boot or a reset via L2 on boot. This means your Dallas DS1742W-120 is flat. The solution is to replace the DALLAS or put a new battery on top (see below)
  • "Machine will not boot" - this could be for many reasons, but a flat DALLAS with corrupted NVRAM can be one cause. In this case the solution is to replace the DALLAS or put a new battery on top. You may need to "blank" the chip to force it to be auto-initialised (see below)

DALLAS DS1742 Replacement Options & Testing

If you are like most "users", rather than drilling chips, building new electronic circuits or dealing with people selling fake replacement chips that don't work, you probably just want to get (buy) something that works.

Having spent some considerable time opening up and working on how you might reverse engineer a DS1742 replacement, now two years down the track there are at least two replacement options available, from the SGI community:

  • IRIX Network & SGI User Group - cs7asm has created a drop in replacement, which I have tested in an O350 & Numalink Router and others have tested in Fuel, Tezro & O300
  • SGI User Group / Glitchworks - is also reporting having been working on a replacement unit (still in prototype in March 2022), but likely to be available via their Tindie store soon.

There is also a potential replacement unit on the EEVBlog by cuebus, but it is not clear to me if this supports both RTC and full NVRAM logic.

So far I have had good experience with the cz7asm replacement and will likely order a Glitchworks one, as I had good experience with the Glitchworks DS1687-5 unit for my Octane2 (see below).

While it would be nice to have an open source electronics design (like SCSI2SD), having these options is a huge improvement for the SGI community.

Read on if you want to design your own or drill out the battery of an existing dead DS1742W.


Tezro, O350 & Fuel - DALLAS DS1742W-120

The Chimera based machines (Tezro, O350 & Fuel) have an embedded L1 (Level 1) controller. This is an independent computer (based on a Motorola 68K variant), responsible for providing machine monitoring and control. The L1 is analogous to the BMC (Baseboard Management Controller) / IPMI (Intelligent Platform Management Interface) of an x86 (Intel) server. The L1 has its own independent NVRAM and clock provided by the Dallas DS1742W-120.

Failure of the Dallas DS1742W-120 can render a Tezro, O350 & Fuel unbootable. To avoid this, when replacing a DS1742W you need to either: ensure the chip is cleared (this means it will get auto-initialised correctly) or copy valid data into it before putting it into the target machine.

The DS1742W-120 reached End of Life and was replaced by the DS1742W-120+, which is also no longer manufactured.

There are a few things that need to be addressed when replacing / refreshing the DS1742W-120 chips:

  1. Getting alternate replacement or
  2. Providing an alternate way to power them to avoid using throw away replacements when the battery runs out
  3. Initialising the replacement chip before installing it

Some people have taken to drilling the chip and putting replaceable battery on top. To do this you have to know the chip pinouts.

For power these are:

  • 12 - Ground
  • 24 - Vcc

I got some replacement DS1742W-120+ from eBay and none of them appear to work, so I took the top off one to see if there was anything inside and to find where the battery was located.

Dallas DS1742W-120+ - this one did not seem to have battery

It turns out that there are fake Dallas chips out in the wild, here is photo of original  (left) & likely fake (right):

Smaller real Dallas (120) vs likely fake (120+)

NOTE: See section "Inside Fake (ebay) Dallas DS1742W-120+" below for internal view of the fake -120+.

Having opened up fake -120+ I turned my attention to real -120 taken from an SGI O350/Chimera machine.

Initially I sliced off the top right two-thirds of chip expecting to find battery there...

SGI Chimera sourced - DS1742W-120
DS1742W-120 with Top Sliced Off

Instead I found that this chip has a much harder inner epoxy casing, stamped with a reverse arrow (<=) and the nomenclature "M12C, 131, B". To see more I had to slice off the rest of the top (see below).

Hard Inner Epoxy with Arrow (<=) pointing to Pin 1/24 (left) and stamped M12C, 131, B
DS1742W-120 with Top Slice Off

And the next layer shows the battery and oscillator:

Dallas DS1742W-120 - down to oscillator & battery
Dallas DS1742W-120 - showing in closer detail

Battery:

  • -Negative - (Top) to internal contact near pin 12
  • +Positive - (Bottom) to internal contact near pin 13

And some more stripping...

Dallas DS1742W-120 - Battery Now Removable
Dallas DS1742W-120 - Stripped showing underlying surface mount chip & oscillator
Dallas DS1742W-120 - Stripped Pin 1 Bottom Left, Pin 12 Bottom Right

Identification stamp on chip is possibly (it is very hard to read):

  • DALLAS ?, ?, 04398 or 0439B, 276A0 - this is guess and any B could be an 8

A review of the Maxim Real Time Clock Selector Guide from 2009 does not turn up any separately available RTC IC that matches the specification of the DS1742W. So the replacement / revival options are:

  1. Restore existing DS1742W by removing battery and putting new replacement on top
  2. Restore existing DS1742W by creating tophat PCB with oscillator & battery to replace the existing ones and wiring this to left and right hand side contacts
  3. Taking out the embedded IC and putting this onto newly designed carrier board with matching pin-outs
  4. Completely new design / build for backward compatible replacement (a first year uni digital electronics exercise)
  5. Create backward compatible design based on new DS174XP (Powercap) variation, which will require wiring lower address lines high to ensure RTC clock is addressed correctly

Option (1) is the simplest and can be easily done to get the SGI machines going while full replacement option (4) is tackled. Option (4) would make a good Open Source electronics candidate a'la SCSI2SD from "codesrc".
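For option (5), the address arithmetic can be sketched as below. This is a hedged illustration only: it assumes the larger parts in the family keep their clock registers in the top eight bytes of their address space (for example 1FF8-1FFF on an 8K x 8 part), and that the pins tied high are the extra address pins that the 2K socket does not drive. The function names are hypothetical, and the real pin assignments should be checked against the datasheet of the part you pick.

```python
SOCKET_ADDRESS_BITS = 11  # the DS1742W socket addresses 2K x 8 (A0-A10)


def tied_high_mask(chip_address_bits, socket_address_bits=SOCKET_ADDRESS_BITS):
    """Mask of the chip address lines the socket does not drive (assumed tied to 1)."""
    chip_mask = (1 << chip_address_bits) - 1
    socket_mask = (1 << socket_address_bits) - 1
    return chip_mask & ~socket_mask


def chip_address(socket_address, chip_address_bits):
    """Address seen by the larger chip for an address issued by the 2K socket."""
    return socket_address | tied_high_mask(chip_address_bits)


if __name__ == "__main__":
    # With the extra lines high, the socket's clock range 07F8-07FF lands
    # on the top eight bytes of the larger part.
    for bits in (13, 15):
        print(bits, hex(chip_address(0x7F8, bits)), hex(chip_address(0x7FF, bits)))
```

With the extra lines tied low instead, the socket's 07F8-07FF range would land in ordinary RAM on the larger part and the L1 would never see a clock.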

To extract & replace battery:

  • cut across top at around pin 6 and slice around outer shell from pin 6 all the way around to pin 19 and then take off top layer
  • to take out battery you could use dremel carefully, heat gun & Stanley knife or heated knife
  • then to get access to battery pins use dremel to make indent into pin 12/13 end of chip. The flat metal contacts for battery trace down to quite large contact point for soldering that are not so visible on photo, but are visible in below x-ray view

Looking at x-ray view (from james_s on eevblog) and physical chip, it looks like the 4 outer pins of embedded IC are not wired to anything. But you can clearly see quite large battery connect points on left side in the x-ray view (this has pin 1 at top right).

Dallas DS1742W-120 - X-Ray from james_s @ eevblog (Pin 1 top right)

I hope this helps people with Dallas DS1742W chip revivals.


Having got a Dallas ready for rebuild, the next step is to test the chip to make sure it is working. For this you need an EEPROM programmer. I have used the XGecu TL866A, which is no longer made. It was replaced by the XGecu TL866II Plus. Both of these come with a library of pre-defined chip profiles and a Windows application that you can use to read and write to the chip.

The profile to use for the Dallas DS1742W is the DS1220 chip. This has the same 2K x 8 bit NVRAM as the DS1742W, but does not include the Real Time Clock (RTC). Reading and writing the RTC is still possible with this profile, as the clock is just addressed like NVRAM in the specific address range 07F8-07FF.
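Since the clock is just the last eight bytes of the 2K image, a saved dump can be inspected offline. The sketch below decodes those bytes; the register layout (control/century at 07F8, then seconds, minutes, hours, day, date, month, year, all BCD) is my reading of the DS1742 datasheet and should be treated as an assumption to verify. The function names are hypothetical.

```python
CLOCK_START = 0x7F8  # first clock register, from the 07F8-07FF range above
IMAGE_SIZE = 0x800   # 2K x 8


def bcd(value):
    """Convert one packed BCD byte (two decimal digits) to an int."""
    return (value >> 4) * 10 + (value & 0x0F)


def decode_clock(image):
    """Decode the clock registers from a full 2K dump (assumed register layout)."""
    if len(image) != IMAGE_SIZE:
        raise ValueError("expected a %d byte image, got %d" % (IMAGE_SIZE, len(image)))
    ctrl, sec, minute, hour, _day, date, month, year = image[CLOCK_START:CLOCK_START + 8]
    return {
        "write_bit": bool(ctrl & 0x80),  # assumed: bit 7 of the control register
        "read_bit": bool(ctrl & 0x40),   # assumed: bit 6 of the control register
        "osc_bit": bool(sec & 0x80),     # assumed: bit 7 of the seconds register
        "year": bcd(ctrl & 0x3F) * 100 + bcd(year),
        "month": bcd(month & 0x1F),
        "date": bcd(date & 0x3F),
        "hour": bcd(hour & 0x3F),
        "minute": bcd(minute & 0x7F),
        "second": bcd(sec & 0x7F),
    }
```

Usage would be along the lines of `decode_clock(open("dump.bin", "rb").read())`, where `dump.bin` stands in for whatever file name you saved from the programmer.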

To verify your chip, see if you can successfully read a known good chip and save the image file, then try to write that image back to the chip of unknown quality.

TL866 EEPROM Programmer - Windows App

Be aware that writing to the clock area can change the behavior of clock (so make sure you are aware of safe values). If you want to restore a chip with data from another one then I suggest you leave clock setting unchanged, by not writing into the clock address range. If you do write to clock range as part of a dump and restore then it is likely you will get a verification error as the clock data will change between write and read back operations.
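A minimal sketch of handling images with the clock range in mind follows. It assumes 2K dumps saved from the programmer and uses hypothetical function names. Note that writing a merged image still writes the old clock bytes back, which sets the clock to whatever it read when the target was dumped, so restricting the write range in the programmer software (where it supports that) is closer to the advice above.

```python
CLOCK_START = 0x7F8  # clock registers occupy 07F8-07FF
IMAGE_SIZE = 0x800   # 2K x 8


def _check_size(image):
    if len(image) != IMAGE_SIZE:
        raise ValueError("expected a %d byte image, got %d" % (IMAGE_SIZE, len(image)))


def merge_preserving_clock(donor, target):
    """Donor NVRAM data (0000-07F7) combined with the target's own clock bytes."""
    _check_size(donor)
    _check_size(target)
    return donor[:CLOCK_START] + target[CLOCK_START:]


def mismatches_outside_clock(written, read_back):
    """Addresses below the clock range where the read-back differs from what was written."""
    _check_size(written)
    _check_size(read_back)
    return [addr for addr in range(CLOCK_START) if written[addr] != read_back[addr]]
```

Comparing only the addresses below 07F8 avoids the spurious verification error caused by the clock ticking between the write and the read back.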

The results of the various tests are in the following testing sections.


Testing Dallas DS1742W with Fuel

I have run the following tests with the DS1742W in a Fuel to help understand how the chip can be initialised.

Test 1: Copy NVRAM from one Dallas to another

For this test I copied the contents of a known good chip into another chip using the programmer, and then rebooted the Fuel with the new copy (the backup chip):

  • a. Pull and dump DS1742W-120 from working Fuel and then write this back onto another DS1742W-120 that was out of Numalink or ATI Graphics Chassis (Onyx4 UltimateVision).
  • b. Install backup DS1742W-120 back into Fuel and reboot
  • c. Result was all good, as per Khral (Irix Network) testing. Apart from having to first boot into single user mode to reset the time (due to the failing Snaphat), the Fuel booted up fine.

NOTE: This was using a Dallas chip that I had previously tried in the Fuel without success. So this proves that invalid data content in NVRAM will cause Fuel to fail to boot.

Test 2: Try "eeprom Fuel write default" via L2

In this test I tried the Fuel L1 "eeprom Fuel write default" command to see what a newly initialised Dallas looks like. This was potentially risky, so I made sure I had a backup chip via Test 1 before doing this:

  • a. Using the backup Dallas and L2 controller via Fuel L1 USB port try to do a "l1 eeprom Fuel write default"
  • b. Result: Error saying EEPROM already contains data:
?-192.168.XXX.XXX-L2>l1 help eeprom
001a01:
eeprom
        show brick eeprom data.
eeprom <exp> <exp> <exp>
        show brick eeprom data at <eeprom> <offset> <length>.
eeprom Fuel write default
        write standard Fuel EEPROM data to MAC EEPROM
?-192.168.XXX.XXX-L2>l1 eeprom Fuel write default
001a01:
MAC EEPROM already contains valid data
?-192.168.XXX.XXX-L2>l1 eeprom
001a01:
NODE (CH)
00 20 01 06 00 00 00 d9
NODE (CIA)
00 02 17 c2 4e 41 c2 4e 41 c1 00 00 00 00 00 84
NODE (BIA)
00 09 00 17 8d 38 c9 53 4f 4c 45 43 54 52 4f 4e
c4 49 50 33 34 c6 4d 53 4d 30 31 39 cc 30 33 30
5f 31 37 30 37 5f 30 30 33 00 c2 5f 48 01 02 c2
30 30 04 ff ff ff ff 04 ff ff ff ff 04 ff ff ff
ff c1 00 00 00 00 00 ba
NODE (PIA), no data available (1)
NODE (IUA)
00 03 30 01 00 03 00 00 00 00 00 08 00 01 00 0c
0b 45 01 03 00 20 1f 00 01 02 03 4a 00 03 4a 06
50 26 06 3f fc 06 3f b7 06 08 88 06 02 46 06 00
ec 18 00 6b 00 00 00 72
MAC (CH)
00 00 01 03 00 00 00 fc
MAC (CIA)
00 02 17 c2 4e 41 c2 4e 41 c1 00 00 00 00 00 84
MAC (BIA)
00 0b 00 00 00 00 c2 4e 41 cb 4d 41 43 20 41 44
44 52 45 53 53 c6 4e 41 20 20 20 20 cc 4e 41 20
20 20 20 20 20 20 20 20 20 00 c2 4e 41 01 02 c2
4e 41 04 ff ff ff ff 04 ff ff ff ff 04 ff ff ff
ff cc 30 38 30 30 36 39 31 30 35 32 41 31 00 c1
00 00 00 00 00 00 00 cf
MAC (PIA), no data available (1)
MAC (IUA), no data available (1)
PIMM (CH)
00 20 00 01 00 00 00 df
PIMM (CIA), no data available (1)
PIMM (BIA)
00 07 00 5e 65 38 c9 53 4f 4c 45 43 54 52 4f 4e
c8 49 50 33 34 50 49 4d 4d c6 4d 53 4b 30 30 37
cc 30 33 30 5f 31 38 33 36 5f 30 30 31 00 c2 5f
43 01 02 c2 30 30 c1 6f
PIMM (PIA), no data available (1)
PIMM (IUA)
00 01 60 01 00 01 00 04 00 14 00 2e 00 2e 00 3e
00 4a 00 4a 00 4a 00 4a 00 4a 00 4a 00 4a 00 4a
00 3e 00 4a 00 4a 00 4a 00 4a 00 4a 00 4a 00 4a
00 4a 04 44 01 00 00 00 0f 00 01 01 01 70 00 01
70 00 04 03 01 01 20 07 01 00 46 00 46 00 04 03
01 01 20 07 01 00 33 00 33 00 00 00 00 00 00 00
00 00 00 00 00 00 00 7d
XIO (CH)
00 20 00 01 00 00 00 df
XIO (CIA), no data available (1)
XIO (BIA)
00 09 00 bf 4e 35 c9 53 4f 4c 45 43 54 52 4f 4e
c6 41 53 54 4f 44 59 c6 4d 4e 54 39 31 34 cc 30
33 30 5f 31 37 32 36 5f 30 30 33 00 c2 5f 45 01
02 c2 30 30 04 ff ff ff ff 04 ff ff ff ff 04 ff
ff ff ff c1 00 00 00 7a
XIO (PIA), no data available (1)
XIO (IUA)
00 01 33 01 01 01 01 04 01 06 03 00 09 02 0f 01
13 01 16 01 19 01 24 00 00 00 00 00 00 04 03 01
01 20 07 01 05 ff 00 ff 04 42 01 01 00 20 03 00
01 02 01 4f 00 01 4f f4
DIMM 0 (JEDEC-SPD)
80 08 07 0d 0a 02 48 00 04 a0 80 02 82 08 08 01
0e 04 0c 01 02 26 00 00 00 00 00 50 3c 50 30 40
b0 b0 60 60 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5c
ce 00 00 00 00 00 00 00 01 4d 33 20 34 37 4c 36
34 32 33 44 54 32 2d 43 41 30 20 32 44 02 53 0c
b8 47 00 00 57 4d 42 31 30 30 38 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DIMM 2 (JEDEC-SPD)
80 08 07 0d 0a 02 48 00 04 a0 80 02 82 08 08 01
0e 04 0c 01 02 26 00 00 00 00 00 50 3c 50 30 40
b0 b0 60 60 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5c
ce 00 00 00 00 00 00 00 01 4d 33 20 34 36 4c 36
34 32 33 43 54 32 2d 43 41 30 20 32 43 02 29 0c
0f 6b 00 00 57 38 42 30 32 30 32 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DIMM 1 (JEDEC-SPD)
80 08 07 0d 0a 02 48 00 04 a0 80 02 82 08 08 01
0e 04 0c 01 02 26 00 00 00 00 00 50 3c 50 30 40
b0 b0 60 60 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5c
ce 00 00 00 00 00 00 00 01 4d 33 20 34 37 4c 36
34 32 33 44 54 32 2d 43 41 30 20 32 44 02 53 0c
a8 46 00 00 57 4d 42 31 30 30 38 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DIMM 3 (JEDEC-SPD)
80 08 07 0d 0a 02 48 00 04 a0 80 02 82 08 08 01
0e 04 0c 01 02 26 00 00 00 00 00 50 3c 50 30 40
b0 b0 60 60 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5c
ce 00 00 00 00 00 00 00 01 4d 33 20 34 36 4c 36
34 32 33 43 54 32 2d 43 41 30 20 32 43 02 29 0c
0e 6b 00 00 57 38 42 30 32 30 32 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
?-192.168.XXX.XXX-L2>

So the Fuel L1 will not allow you to re-initialise an already initialised Dallas. All that nervous worrying about potentially bricking the Fuel was not needed ;-) .
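When comparing captures like the one above it can help to split the "l1 eeprom" output into sections. The sketch below does that from a saved text capture. It also checks an observation of mine rather than anything documented here: the CH and CIA blocks shown above sum to zero modulo 256, much like the checksums used in IPMI FRU data. The function names are hypothetical, and the zero-sum rule is not expected to hold for the DIMM SPD blocks.

```python
import re

# A section title such as "NODE (CH)" or "DIMM 0 (JEDEC-SPD)".
HEADER = re.compile(r"^[A-Z][A-Z0-9 ]* \([A-Z-]+\)$")
# A line made up only of two-digit hex bytes separated by single spaces.
HEX_LINE = re.compile(r"^(?:[0-9a-f]{2} )*[0-9a-f]{2}$")


def parse_eeprom_dump(text):
    """Return {section title: bytes} for every section that carries data."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if HEADER.match(line):
            current = line
            sections[current] = b""
        elif HEX_LINE.match(line) and current is not None:
            sections[current] += bytes.fromhex(line)
        else:
            current = None  # prompts, "no data available" lines, blank lines
    return sections


def zero_sum(data):
    """True if the block is non-empty and its bytes sum to 0 modulo 256."""
    return len(data) > 0 and sum(data) % 256 == 0
```

Two captures can then be compared section by section, for example to confirm that a copied chip reports the same NODE and MAC data as the original.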

Test 3: Put a cleared Dallas into Fuel and try the "eeprom Fuel write default"

For this test I put a cleared Dallas into the Fuel to see if it would boot and I could initialise the chip.

  • a. Using programmer clear Dallas by programming with "0F" for all ram with exception of RTC clock data area
  • b. Put cleared Dallas into Fuel and monitor via L2
  • c. Via L2 try: "l1 eeprom Fuel write default"
  • d. Result: get same error as Test 2 ... which is unexpected, having just cleared the chip.
  • e. Do a check of the EEPROM via L2. Surprisingly, it looks like it has valid data, including a valid MAC based serial number:
INFO: opened USB device at b1;p0;d3 (/dev/sgil1_0)
001a01 INFO: 001a01 auto power up aborted, L2 detected.
 
?-192.168.XXX.XXX-L2>l1 serial
001a01:
BSN: MSM019    SSN: 08:00:69:10:52:A1    Time: 02/07/2106 06:28:15    Security: OFF
Public Key data in EEPROM is invalid
?-192.168.XXX.XXX-L2>l1 eeprom Fuel write default
001a01:
MAC EEPROM already contains valid data
?-192.168.XXX.XXX-L2>l1 eeprom
001a01:
NODE (CH)
00 20 01 06 00 00 00 d9
NODE (CIA)
00 02 17 c2 4e 41 c2 4e 41 c1 00 00 00 00 00 84
NODE (BIA)
00 09 00 17 8d 38 c9 53 4f 4c 45 43 54 52 4f 4e
c4 49 50 33 34 c6 4d 53 4d 30 31 39 cc 30 33 30
5f 31 37 30 37 5f 30 30 33 00 c2 5f 48 01 02 c2
30 30 04 ff ff ff ff 04 ff ff ff ff 04 ff ff ff
ff c1 00 00 00 00 00 ba
NODE (PIA), no data available (1)
NODE (IUA)
00 03 30 01 00 03 00 00 00 00 00 08 00 01 00 0c
0b 45 01 03 00 20 1f 00 01 02 03 4a 00 03 4a 06
50 26 06 3f fc 06 3f b7 06 08 88 06 02 46 06 00
ec 18 00 6b 00 00 00 72
MAC (CH)
00 00 01 03 00 00 00 fc
MAC (CIA)
00 02 17 c2 4e 41 c2 4e 41 c1 00 00 00 00 00 84
MAC (BIA)
00 0b 00 00 00 00 c2 4e 41 cb 4d 41 43 20 41 44
44 52 45 53 53 c6 4e 41 20 20 20 20 cc 4e 41 20
20 20 20 20 20 20 20 20 20 00 c2 4e 41 01 02 c2
4e 41 04 ff ff ff ff 04 ff ff ff ff 04 ff ff ff
ff cc 30 38 30 30 36 39 31 30 35 32 41 31 00 c1
00 00 00 00 00 00 00 cf
MAC (PIA), no data available (1)
MAC (IUA), no data available (1)
PIMM (CH)
00 20 00 01 00 00 00 df
PIMM (CIA), no data available (1)
PIMM (BIA)
00 07 00 5e 65 38 c9 53 4f 4c 45 43 54 52 4f 4e
c8 49 50 33 34 50 49 4d 4d c6 4d 53 4b 30 30 37
cc 30 33 30 5f 31 38 33 36 5f 30 30 31 00 c2 5f
43 01 02 c2 30 30 c1 6f
PIMM (PIA), no data available (1)
PIMM (IUA)
00 01 60 01 00 01 00 04 00 14 00 2e 00 2e 00 3e
00 4a 00 4a 00 4a 00 4a 00 4a 00 4a 00 4a 00 4a
00 3e 00 4a 00 4a 00 4a 00 4a 00 4a 00 4a 00 4a
00 4a 04 44 01 00 00 00 0f 00 01 01 01 70 00 01
70 00 04 03 01 01 20 07 01 00 46 00 46 00 04 03
01 01 20 07 01 00 33 00 33 00 00 00 00 00 00 00
00 00 00 00 00 00 00 7d
XIO (CH)
00 20 00 01 00 00 00 df
XIO (CIA), no data available (1)
XIO (BIA)
00 09 00 bf 4e 35 c9 53 4f 4c 45 43 54 52 4f 4e
c6 41 53 54 4f 44 59 c6 4d 4e 54 39 31 34 cc 30
33 30 5f 31 37 32 36 5f 30 30 33 00 c2 5f 45 01
02 c2 30 30 04 ff ff ff ff 04 ff ff ff ff 04 ff
ff ff ff c1 00 00 00 7a
XIO (PIA), no data available (1)
XIO (IUA)
00 01 33 01 01 01 01 04 01 06 03 00 09 02 0f 01
13 01 16 01 19 01 24 00 00 00 00 00 00 04 03 01
01 20 07 01 05 ff 00 ff 04 42 01 01 00 20 03 00
01 02 01 4f 00 01 4f f4
DIMM 0 (JEDEC-SPD)
80 08 07 0d 0a 02 48 00 04 a0 80 02 82 08 08 01
0e 04 0c 01 02 26 00 00 00 00 00 50 3c 50 30 40
b0 b0 60 60 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5c
ce 00 00 00 00 00 00 00 01 4d 33 20 34 37 4c 36
34 32 33 44 54 32 2d 43 41 30 20 32 44 02 53 0c
b8 47 00 00 57 4d 42 31 30 30 38 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DIMM 2 (JEDEC-SPD)
80 08 07 0d 0a 02 48 00 04 a0 80 02 82 08 08 01
0e 04 0c 01 02 26 00 00 00 00 00 50 3c 50 30 40
b0 b0 60 60 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5c
ce 00 00 00 00 00 00 00 01 4d 33 20 34 36 4c 36
34 32 33 43 54 32 2d 43 41 30 20 32 43 02 29 0c
0f 6b 00 00 57 38 42 30 32 30 32 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DIMM 1 (JEDEC-SPD)
80 08 07 0d 0a 02 48 00 04 a0 80 02 82 08 08 01
0e 04 0c 01 02 26 00 00 00 00 00 50 3c 50 30 40
b0 b0 60 60 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5c
ce 00 00 00 00 00 00 00 01 4d 33 20 34 37 4c 36
34 32 33 44 54 32 2d 43 41 30 20 32 44 02 53 0c
a8 46 00 00 57 4d 42 31 30 30 38 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
DIMM 3 (JEDEC-SPD)
80 08 07 0d 0a 02 48 00 04 a0 80 02 82 08 08 01
0e 04 0c 01 02 26 00 00 00 00 00 50 3c 50 30 40
b0 b0 60 60 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5c
ce 00 00 00 00 00 00 00 01 4d 33 20 34 36 4c 36
34 32 33 43 54 32 2d 43 41 30 20 32 43 02 29 0c
0e 6b 00 00 57 38 42 30 32 30 32 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
?-192.168.XXX.XXX-L2>l1 serial all
001a01:
 
Data                            Location      Value
------------------------------  ------------  --------
Local System Serial Number      EEPROM        08:00:69:10:52:A1
Local Brick Serial Number       EEPROM        MSM019
Reference Brick Serial Number   NVRAM         MSM019
 
 
EEPROM      Product Name    Serial         Part Number           Rev  T/W
----------  --------------  -------------  --------------------  ---  ------
NODE        IP34            MSM019         030_1707_003          H    00
MAC         MAC ADDRESS     NA             NA                    NA   NA
PIMM        IP34PIMM        MSK007         030_1836_001          C    00
XIO         ASTODY          MNT914         030_1726_003          E    00
 
EEPROM     JEDEC-SPD Info           Part Number        Rev  Speed  SGI
---------- ------------------------ ------------------ ---- ------ --------
DIMM 0     CE000000000000000CB84700 M3 47L6423DT2-CA0   2D   10.0  N/A
DIMM 2     CE000000000000000C0F6B00 M3 46L6423CT2-CA0   2C   10.0  N/A
DIMM 1     CE000000000000000CA84600 M3 47L6423DT2-CA0   2D   10.0  N/A
DIMM 3     CE000000000000000C0E6B00 M3 46L6423CT2-CA0   2C   10.0  N/A
 
?-192.168.XXX.XXX-L2>l1 date
001a01:
02/07/2106 06:28:15
?-192.168.XXX.XXX-L2>INFO: closed connection to 001a01
INFO: opened USB device at b1;p0;d4 (/dev/sgil1_0)
INFO: closed connection to 001a01
INFO: opened USB device at b1;p0;d5 (/dev/sgil1_0)
 
?-192.168.XXX.XXX-L2>l1 power
001a01:
Supply          State Voltage    Margin  Value
--------------  ----- ---------  ------- -----
           12V     on   12.063V      N/A
        12V IO     NC   12.125V      N/A
            5V     NC    5.044V      N/A
          3.3V     NC    3.337V  default     0
          2.5V     on    2.470V  default     0
          1.5V     NC    1.466V  default     0
        5V AUX     NC    5.122V      N/A
      3.3V AUX     NC    3.285V      N/A
 PIMM 12V BIAS     NC   12.063V      N/A
          SRAM     NC    2.509V  default     0
          VCPU     on    1.593V  default   128
     PIMM 1.5V     NC    1.495V  default     0
 PIMM 3.3V AUX     NC    3.268V      N/A
   PIMM 5V AUX     NC    5.096V      N/A
  XIO 12V BIAS     NC   12.000V      N/A
        XIO 5V     NC    5.044V      N/A
      XIO 2.5V     on    2.457V  default     0
  XIO 3.3V AUX     NC    3.285V      N/A
?-192.168.XXX.XXX-L2>l1 log
001a01:
02/07/06 06:28:15 checksum Error - common header initialized
02/07/06 06:28:15 nvram checksum error - initializing core data.
02/07/06 06:28:15 nvram checksum error - initializing extended data.
02/07/06 06:28:15 nvram checksum error - log pointers invalid, log pointers reset
02/07/06 06:28:15 L1 booting 1.48.1
02/07/06 06:28:15 NVRAM doesn't match EEPROM
02/07/06 06:28:15 USB0: waiting on open
02/07/06 06:28:15 auto power up countdown initiated
02/07/06 06:28:15 USB0: opened
02/07/06 06:28:15 USB0: registered as remote
02/07/06 06:28:15 USB0: registered for events
02/07/06 06:28:15 auto power up aborted, L2 detected (0)
02/07/06 06:28:15 G:ffffffff   P:ffffffff  Y:ffffffff
02/07/06 06:28:15 USB-R: USB:bus was reset
02/07/06 06:28:15 UNREG: 30004af0 0 7
02/07/06 06:28:15 USB0: unregistered
02/07/06 06:28:15 USB0-R: IRouter:read failed - read error
02/07/06 06:28:15 USB0: waiting on open
02/07/06 06:28:15 USB0: opened
02/07/06 06:28:15 USB0: registered as remote
02/07/06 06:28:15 USB0: registered for events
02/07/06 06:28:15 USB-R: USB:bus was reset
02/07/06 06:28:15 UNREG: 30004af0 0 7
02/07/06 06:28:15 USB0: unregistered
02/07/06 06:28:15 USB0-R: IRouter:read failed - read error
02/07/06 06:28:15 USB0: waiting on open
02/07/06 06:28:15 USB0: opened
02/07/06 06:28:15 USB0: registered as remote
02/07/06 06:28:15 USB0: registered for events
02/07/06 06:28:15 power up (PANEL)
02/07/06 06:28:15 reset again MIPS
02/07/06 06:28:15 power down (PANEL)
02/07/06 06:28:15 power up (PANEL)
02/07/06 06:28:15 reset again MIPS
?-192.168.XXX.XXX-L2>
  • f. A check of the L1 log (see above "l1 log") found that the Fuel had automatically initialised the "new" chip.

So I rebooted the Fuel and, aside from the same issue of having to first go into single user mode to reset the Snaphat clock, the machine rebooted perfectly once this was done.
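The "cleared" image used in step (a) can be produced offline and then written with the programmer. This is a sketch under stated assumptions: the fill value is the "0F" quoted in the steps, the clock bytes are carried over from an existing dump of the same chip so the fill stops short of 07F8, and the function name and file names are hypothetical.

```python
CLOCK_START = 0x7F8  # clock registers occupy 07F8-07FF
IMAGE_SIZE = 0x800   # 2K x 8
FILL = 0x0F          # fill value quoted in the test steps


def make_cleared_image(current_dump):
    """Fill 0000-07F7 with FILL and keep the clock bytes from the current dump."""
    if len(current_dump) != IMAGE_SIZE:
        raise ValueError("expected a %d byte image, got %d" % (IMAGE_SIZE, len(current_dump)))
    return bytes([FILL]) * CLOCK_START + current_dump[CLOCK_START:]


if __name__ == "__main__":
    # Hypothetical file names: a dump read from the chip, and the image to write back.
    with open("dallas_dump.bin", "rb") as src:
        cleared = make_cleared_image(src.read())
    with open("dallas_cleared.bin", "wb") as dst:
        dst.write(cleared)
```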

NOTE: I also had to reset the Dallas Clock setting via L2, as time was obviously wrong.
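As an aside on why the time was "obviously wrong": the date the L1 reported, 02/07/2106 06:28:15, is exactly 0xFFFFFFFF seconds after 1 January 1970 (UTC). That suggests, as an inference on my part and not something from SGI documentation, that the L1 was working from an all-ones 32-bit seconds counter at that point. The arithmetic can be checked with a short sketch:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # Unix epoch, treated as UTC


def from_unix_seconds(seconds):
    """Calendar date and time for a count of seconds since the Unix epoch."""
    return EPOCH + timedelta(seconds=seconds)


if __name__ == "__main__":
    print(from_unix_seconds(0xFFFFFFFF))  # 2106-02-07 06:28:15
```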

Summary of Testing with Fuel

So my conclusion is that there is no need to ever use the Fuel "eeprom Fuel write default" L1 command, as the Fuel appears to do the Dallas initialisation perfectly well anyway if you put a cleared / clean Dallas into it. It may be that the Fuel will let you run this command if you are putting in a Dallas that has come from another machine (such as a Tezro).

So Fuel appears to be a valid way to initialise chips, but alternative of just copying a valid Fuel configuration via EEPROM programmer is an option as well.


Testing Dallas DS1742W with Numalink Router

Numalink Routers have been one of the most frustrating devices to get working as they typically have Serial Security enabled.  The result is that the serial number gets locked on initial configuration, and changing the serial has involved a process of swapping Dallas chips from a Tezro or other machine into the Numalink router and then doing startup in a particular sequence to allow the serial to be reset.

So I wanted to test the result with a cleared / uninitialised Dallas, as this would be simpler and more readily repeated than rather complex way that I got my first Numalink Router going.

In this test I first cleared the Dallas with the programmer (filled with "0F") and then powered on the L2 connected Numalink Router. At each step along the way I rechecked the serial status to see what the impact of the configuration change was.

  • a. Clear DS1742W with programmer by filling with "0F" and put cleared chip into Numalink router which is connected to L2
  • b. Power up Numalink Router and do config check and set the Rack / Slot details and reboot L1
  • c. Check serial on reboot and then try L1 "let the carnage begin" and reboot and recheck serial
  • d. Check that Numalink Router can be powered up / down via L1 and that serial clear works
SGI L2 Controller
Current L2 version: 1.62.0
 
INFO: opened USB control /dev/sgil1_cs
INFO: SMP listening on port: 8001
INFO: SMP listening on port: 8003
?-192.168.XXX.XXX-L2>INFO: opened USB device at b1;p2/0;d3 (/dev/sgil1_0)
 
?-192.168.XXX.XXX-L2>config
L2 192.168.XXX.XXX: - ---- (no rack ID set) (LOCAL)
L1 192.168.XXX.XXX:0:0   - ---r-- (no rack and slot ID set)
?-192.168.XXX.XXX-L2>192.168.XXX.XXX:0:0 brick rackslot 1 7
000r00:
brick rack set to 001 (takes effect on next L1 reboot/power cycle)
brick slot set to  07 (takes effect on next L1 reboot/power cycle)
?-192.168.XXX.XXX-L2>192.168.XXX.XXX:0:0 reboot_l1
?-192.168.XXX.XXX-L2>INFO: closed connection to 000r00
INFO: opened USB device at b1;p2/0;d4 (/dev/sgil1_0)
 
?-192.168.XXX.XXX-L2>config
L2 192.168.XXX.XXX: - ---- (no rack ID set) (LOCAL)
L1 192.168.XXX.XXX:0:0   - 001r07
?-192.168.XXX.XXX-L2>l1 ver
001r07:
L1 1.42.9 (Image B), Built 05/19/2006 09:08:12    [Base 1MB image]
?-192.168.XXX.XXX-L2>l1 serial
001r07:
BSN: NYY856    SSN:     Time: 10/31/2020 18:20:33
?-192.168.XXX.XXX-L2>serial all
001r07:
 
Data                            Location      Value
------------------------------  ------------  --------
Local System Serial Number      NVRAM          not set
Reference System Serial Number  NVRAM
Local Brick Serial Number       EEPROM        NYY856
Reference Brick Serial Number   NVRAM         NYY856
 
 
EEPROM      Product Name    Serial         Part Number           Rev  T/W
----------  --------------  -------------  --------------------  ---  ------
POWER       RPWR            NYY856         030_1631_004          C    00
LOGIC       ROUTER          NXA065         030_1634_004          C    00
 
?-192.168.XXX.XXX-L2>l1 let the carnage begin
?-192.168.XXX.XXX-L2>l1 serial
001r07:
BSN: NYY856    SSN:     Time: 10/31/2020 18:21:09    Security: OFF
?-192.168.XXX.XXX-L2>l1 serial clear
?-192.168.XXX.XXX-L2>l1 serial
001r07:
BSN: NYY856    SSN: L0000000    Time: 10/31/2020 18:21:30    Security: OFF
?-192.168.XXX.XXX-L2>power up
?-192.168.XXX.XXX-L2>power down
?-192.168.XXX.XXX-L2>l1 log
001r07:
10/31/20 18:16:44 checksum Error - common header initialized
10/31/20 18:16:44 nvram checksum error - initializing core data.
10/31/20 18:16:45 nvram checksum error - initializing extended data.
10/31/20 18:16:45 nvram checksum error - log pointers invalid, log pointers reset
10/31/20 18:16:44 L1 booting 1.42.9
10/31/20 18:16:46 USB0: waiting on open
10/31/20 18:16:47 USB0: opened
10/31/20 18:16:46 USB0: registered for events
10/31/20 18:18:25 L1 booting 1.42.9
10/31/20 18:18:29 USB0: waiting on open
10/31/20 18:18:28 USB0: opened
10/31/20 18:18:28 USB0: registered for events
10/31/20 18:22:10 power up hold (COMMAND)
10/31/20 18:22:13 reset (COMMAND)
10/31/20 18:22:26 power down (COMMAND)
10/31/20 18:22:31 FAN 0 warning limit reached @ 0 RPM.
10/31/20 18:22:31 Environmental redundancy lost.
10/31/20 18:24:17 USB-R: USB:bus was reset
10/31/20 18:24:17 UNREG: 30004a00 0 7
10/31/20 18:24:16 USB0: unregistered
10/31/20 18:24:19 USB0-R: IRouter:read failed - read error
10/31/20 18:24:19 USB0: waiting on open
10/31/20 18:44:57 L1 booting 1.42.9
10/31/20 18:44:59 USB0: waiting on open
10/31/20 18:45:07 USB0: opened
10/31/20 18:45:07 USB0: registered for events
?-192.168.XXX.XXX-L2>

Like the Fuel, the Numalink Router will start up L1 when a cleared / uninitialised chip is installed and will automatically initialise the chip.  As the L1 version I have installed includes "serial security", this also results in serial security being enabled by default.
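If you save the "l1 log" output to a text file, the auto-initialisation can be confirmed by looking for the checksum messages that appear in both the Fuel and Numalink Router logs above. A minimal sketch follows; the marker strings are copied from those logs, other L1 firmware versions may word them differently, and the function names are hypothetical.

```python
# Messages the L1 logged when it found a cleared chip (copied from the logs above).
INIT_MARKERS = (
    "checksum Error - common header initialized",
    "nvram checksum error - initializing core data.",
    "nvram checksum error - initializing extended data.",
)


def init_lines(log_text):
    """Return the log lines that report NVRAM initialisation."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in INIT_MARKERS)
    ]


def auto_init_seen(log_text):
    """True only if every initialisation marker appears somewhere in the log."""
    return all(marker in log_text for marker in INIT_MARKERS)
```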

Using the undocumented L1 "let the carnage begin" command, you can disable the "serial security" and clear the serial.

Once the serial is cleared you can power the router up and down via L1. If you now plug router into computer / L2 with serial number configured the Numalink Router should adopt the serial number of computer / L2.

Now you can start building up your ultimate Numalink'ed SGI machine, but be aware that its performance is likely below that of a high core count Intel 2U dual CPU server ;-) .

NOTE: Old Nekochan thread where undocumented command to disable Numalink security was found


Testing Dallas DS1742W with O350

As per the Fuel & Numalink testing, put a cleared / uninitialised Dallas chip into an O350 node, power up and observe the behavior. In this test I ran against an L2 with a valid serial number set, as I wanted the machine to return to its correct configuration so that it can be put back into an existing Numalink'ed O350 machine.

Test 1: Put a cleared Dallas into O350 Chassis

In this test I first cleared the Dallas with the programmer (filled with "0F") and then powered on the L2 connected O350 chassis. Unlike the Numalink Router test, I ensured that the L2 had a correctly configured serial number, so that if all went well it would reconfigure the O350 serial.
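If your programmer software can load a raw binary image (an assumption, check your own tool), the fill pattern can be prepared as a file rather than typed in. This is only a sketch in Python: the size is derived from the 11 address lines (A0 to A10) in the draft pin list further down this page, the fill value is the "0F" quoted above, and the function names and any file name you pass in are hypothetical.

```python
from pathlib import Path

# 11 address lines (A0-A10) give 2**11 byte-wide locations.
DS1742_SIZE = 2 ** 11


def build_fill_image(fill_byte, size=DS1742_SIZE):
    """Return `size` bytes, every one of them set to `fill_byte`."""
    if not 0 <= fill_byte <= 0xFF:
        raise ValueError("fill_byte must fit in one byte")
    return bytes([fill_byte]) * size


def write_fill_image(path, fill_byte):
    """Write the fill image to `path` and return the number of bytes written."""
    image = build_fill_image(fill_byte)
    Path(path).write_bytes(image)
    return len(image)


if __name__ == "__main__":
    image = build_fill_image(0x0F)  # fill value as quoted in the text
    print(f"{len(image)} bytes, all 0x{image[0]:02X}")
```

Calling write_fill_image() with a file name of your choice produces the image on disk, which you can then load into the programmer and verify after writing.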

  • a. Clear DS1742W with programmer by filling with "0F" and put cleared chip into O350 Chassis which is connected to L2
  • b. Power up O350 and do config check and set the Rack / Slot details and reboot L1
  • c. Check serial on reboot and test to see if machine can be booted
SGI L2 Controller
Current L2 version: 1.62.0
 
INFO: opened USB control /dev/sgil1_cs
INFO: SMP listening on port: 8001
INFO: SMP listening on port: 8003
M200XXXX-192.168.XXX.XXX-L2>INFO: opened USB device at b1;p0;d2 (/dev/sgil1_0)
000?00 INFO: System serial number reassigned (8000) to M200XXXX from attached L2.
 
M200XXXX-192.168.XXX.XXX-L2>config
L2 192.168.XXX.XXX: - ---- (no rack ID set) (LOCAL)
L1 192.168.XXX.XXX:0:0   - ---c-- (no rack and slot ID set)
M200XXXX-192.168.XXX.XXX-L2>192.168.XXX.XXX:0:0 brick rackslot 1 7
000c00:
brick rack set to 001 (takes effect on next L1 reboot/power cycle)
brick slot set to  07 (takes effect on next L1 reboot/power cycle)
M200XXXX-192.168.XXX.XXX-L2>192.168.XXX.XXX:0:0 reboot_l1
M200XXXX-192.168.XXX.XXX-L2>INFO: closed connection to 000c00
INFO: opened USB device at b1;p0;d3 (/dev/sgil1_0)
config
L2 192.168.XXX.XXX: - ---- (no rack ID set) (LOCAL)
L1 192.168.XXX.XXX:0:0   - 001c07
M200XXXX-192.168.XXX.XXX-L2>l1 serial
001c07:
BSN: NAL684    SSN: M200XXXX    Time: 11/01/2020 23:43:55
M200XXXX-192.168.XXX.XXX-L2>l1 log
001c07:
11/01/20 23:41:34 checksum Error - common header initialized
11/01/20 23:41:34 nvram checksum error - initializing core data.
11/01/20 23:41:34 nvram checksum error - initializing extended data.
11/01/20 23:41:34 nvram checksum error - log pointers invalid, log pointers reset
11/01/20 23:41:34 L1 booting 1.44.0
11/01/20 23:41:34 ChiServ IP59
11/01/20 23:41:34 Checking for Type
11/01/20 23:41:34  -- ChiServ Type set
11/01/20 23:41:36 ** fixing invalid SSN value
11/01/20 23:41:36 ** fixing BSN mismatch
11/01/20 23:41:36 USB0: waiting on open
11/01/20 23:41:41 USB0: opened
11/01/20 23:41:41 USB0: registered for events
11/01/20 23:41:42 INFO: System serial number reassigned (8000) to M200XXXX from attached L2.
11/01/20 23:43:37 L1 booting 1.44.0
11/01/20 23:43:37 ChiServ IP59
11/01/20 23:43:37 Checking for Type
11/01/20 23:43:37  -- ChiServ Type set
11/01/20 23:43:40 USB0: waiting on open
11/01/20 23:43:41 USB0: opened
11/01/20 23:43:41 USB0: registered for events
M200XXXX-192.168.XXX.XXX-L2>serial all
001c07:
 
Data                            Location      Value
------------------------------  ------------  --------
Local System Serial Number      NVRAM         M200XXXX
Reference System Serial Number  Attached L2   M200XXXX
Local Brick Serial Number       EEPROM        NAL684
Reference Brick Serial Number   NVRAM         NAL684
 
 
EEPROM      Product Name    Serial         Part Number           Rev  T/W
----------  --------------  -------------  --------------------  ---  ------
INTERFACE   2U_INT_53       NAL684         030_1809_003          B    00
IO9         IO9             NAJ955         030_1771_005          A    00
ODYSSEY     no hardware detected
RISER       2U_RISER        NBD099         030_1808_005          A    00
NODE        IP59_4CPU       RBS625         030_1989_003          C    00
SNOWBALL    no hardware detected
PS 1        no hardware detected
PS 2        DPS-500EBE      XPD0513004325  060-0178-003          S4
 
EEPROM     JEDEC-SPD Info           Part Number        Rev  Speed  SGI
---------- ------------------------ ------------------ ---- ------ --------
DIMM 0     CE000000000000000CDBD500 M3 46L2820ET3-CA0   3E   10.0  N/A
DIMM 2     CE000000000000000CD3D500 M3 46L2820ET3-CA0   3E   10.0  N/A
DIMM 4     CE000000000000000C209800 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 6     CE000000000000000C2B9800 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 1     CE000000000000000C549800 M3 46L2820DT2-CA0   2D   10.0  N/A
DIMM 3     CE000000000000000CDFD500 M3 46L2820ET3-CA0   3E   10.0  N/A
DIMM 5     CE000000000000000CD0D500 M3 46L2820ET3-CA0   3E   10.0  N/A
DIMM 7     CE000000000000000C379800 M3 46L2820DT2-CA0   2D   10.0  N/A
 
M200XXXX-192.168.XXX.XXX-L2>power up
001c07 INFO: PIMM type changed, setting the default voltage margins
M200XXXX-192.168.XXX.XXX-L2>
M200XXXX-192.168.XXX.XXX-L2>
entering system console mode (001c07 CPU0), <CTRL_T> to escape to L2
Starting PROM Boot process
 
 
IP35 PROM SGI Version 6.210  built 02:33:51 PM Aug 26, 2004
Testing/Initializing memory ...............             DONE
Copying PROM code to memory ...............             DONE
Discovering local IO ......................             DONE
Discovering NUMAlink connectivity .........
Local hub NUMAlink is down.
*** Local network link down
DONE
Found 1 objects (1 hubs, 0 routers) in 5886 usec
Waiting for peers to complete discovery....             DONE
No other nodes present; becoming global master
Global master is /hw/rack/001/bay/07
Intializing any CPUless nodes..............             DONE
*** Nasid 0: Memory bank 2 was previously Absent but is now Present & Enabled
*** Nasid 0: Memory bank 3 was previously Absent but is now Present & Enabled
*** Nasid 0: Memory bank 4 was previously Absent but is now Present & Enabled
*** Nasid 0: Memory bank 5 was previously Absent but is now Present & Enabled
*** Nasid 0: Memory bank 6 was previously Absent but is now Present & Enabled
*** Nasid 0: Memory bank 7 was previously Absent but is now Present & Enabled
*** Nasid 0: Memory bank 0 previously had 512 MB but now has 1024 MB
*** Nasid 0: Memory bank 1 previously had 512 MB but now has 1024 MB
*** Nasid 0: Memory bank 2 previously had 0 MB but now has 1024 MB
*** Nasid 0: Memory bank 3 previously had 0 MB but now has 1024 MB
*** Nasid 0: Memory bank 4 previously had 0 MB but now has 1024 MB
*** Nasid 0: Memory bank 5 previously had 0 MB but now has 1024 MB
*** Nasid 0: Memory bank 6 previously had 0 MB but now has 1024 MB
*** Nasid 0: Memory bank 7 previously had 0 MB but now has 1024 MB
Checking partitioning information .........             DONE
No other nodes present; becoming partition master
Local slave entering slave loop
Local slave entering slave loop
Local slave entering slave loop
Loading BASEIO prom .......................             DONE
 
BASEIO PROM Monitor SGI Version 6.210  built 02:30:38 PM Aug 26, 2004 (BE64)
4 CPUs on 1 nodes found.
Automatic update of PROM environment disabled
Graphics diagnostics
 
***WARNING: Master I-brick was /hw/module/001c01/iobrick but is now /hw/module/001c07/iobrick
Check for LINK errors or setenv ConsolePath to new value.
Installing PROM Device drivers ............
On-board (IO9) tigon3 1000BaseT interface
Base I/O Ethernet set to /dev/ethernet/tg0
Installing Graphics Console...
graphics install: searching for pipe 0
klgraphics: System has no GFX
Probing IOC4 ATA adapter 2
IOC4 RevId = 79
 
Walking SCSI Adapter 0, (pci id 3)
1+ Device Vendor Product: COMPAQ BF14689BC5
2+ Device Vendor Product: SGI ST373307LC
3- 4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 2 device(s)
 
 
Walking SCSI Adapter 1, (pci id 3)
1- 2- 3- 4- 5- 6- 7- 8- 9- 10- 11- 12- 13- 14- 15- = 0 device(s)
 
Initializing PROM Device drivers ..........
  Initializing Base I/O Ethernet Interface...Failed.  MII Status Register = 0x7949
Done.
  ---------------Interface Configuration Summary----------------
  ASIC|Revision|MAC Address       : 5701|B5|08:00:69:13:eb:30
  Link Negotiation|Advertisement  : On|<H10 F10 H100 F100 F1000>
  Link|Speed|Duplex|Rx/Tx FlowCtrl: Down|10|Half|Off/Off
  --------------------------------------------------------------
DONE
Cannot connect to keyboard -- check the cable.
Cannot open /dev/input/ioc4pckm0 for input
Cannot connect to keyboard -- check the cable.
Cannot open /dev/input/ioc4pckm0 for input
Checking hardware inventory ...............
***Warning: Board in module 001c07 is missing or disabled
It previously contained a New Type board, barcode  laser 0
 Found new or re-enabled component PCI DEV 2
 Found new or re-enabled component PCI DEV 2
 Found new or re-enabled component PCI DEV 2
DONE
 
**** System Configuration and Diagnostics Summary ****
CONFIG:
         No. of NODEs enabled    = 1
         No. of NODEs disabled   = 0
         No. of CPUs enabled     = 4
         No. of CPUs disabled    = 0
         Mem enabled             = 8192 MB
         Mem disabled            = 0 MB
         No. of RTRs enabled     = 0
         No. of RTRs disabled    = 0
 
DIAG RESULTS:
         ALL DIAGS PASSED.
**** End System Configuration and Diagnostics Summary ****
 
 
System Maintenance Menu
 
1) Start System
2) Install System Software
3) Run Diagnostics
4) Recover System
5) Enter Command Monitor
 
Option? 5
Command Monitor.  Type "exit" to return to the menu.
>> hinv -v -m
IP35 Node Board, Module 001c07
    ASIC BEDROCK Rev 2, 200 MHz, (nasid 0)
    Processor A: 1.0 GHz R16000 Rev 3.0
          Secondary Cache 16MB 333MHz Tap 0x15 , (cpu 0)
      R16010FPC Rev 3.0
    Processor B: 1.0 GHz R16000 Rev 3.0
          Secondary Cache 16MB 333MHz Tap 0x15 , (cpu 1)
      R16010FPC Rev 3.0
    Processor C: 1.0 GHz R16000 Rev 3.0
          Secondary Cache 16MB 333MHz Tap 0x15 , (cpu 2)
      R16010FPC Rev 3.0
    Processor D: 1.0 GHz R16000 Rev 3.0
          Secondary Cache 16MB 333MHz Tap 0x15 , (cpu 3)
      R16010FPC Rev 3.0
    Memory on board, 8192 MBytes (Standard)
      Bank 0, 1024 MBytes (Premium)  <-- (Software Bank 0)
      Bank 1, 1024 MBytes (Premium)
      Bank 2, 1024 MBytes (Premium)
      Bank 3, 1024 MBytes (Premium)
      Bank 4, 1024 MBytes (Premium)
      Bank 5, 1024 MBytes (Premium)
      Bank 6, 1024 MBytes (Premium)
      Bank 7, 1024 MBytes (Premium)
IXBRICK Bridge, Module 001c07
    ASIC BRIDGE Rev 3, (widget 15)
    adapter IOC4 Rev 4f
          (pci id 1)
    adapter IOC4-ATA Rev 4f
          (pci id 1)
    adapter ID (Vendor 1000 Device 54 class 1 subclass 1)
          (pci id 2)
    adapter PCI (SCSI interface) Rev 6
          (pci id 3)
        peripheral DISK, BUS 0, ID 1, COMPAQ BF14689BC5
        peripheral DISK, BUS 0, ID 2, SGI ST373307LC
    adapter GigE Rev 15
          (pci id 4)
IXBRICK Bridge, Module 001c07
    ASIC BRIDGE Rev 3, (widget 15)
    adapter USB (OHCI interface)
          (pci id 2 function 0)
    adapter USB (OHCI interface)
          (pci id 2 function 1)
ASIC XBOW Rev 3, on CBrick, Module 001c07
>> date
Unable to execute date: no such file or directory
>>
escaping to L2 system controller
M200XXXX-192.168.XXX.XXX-L2>power down
 
WARNING: power on 001c07 CPU0 appears off
 
re-entering system console mode (001c07 CPU0), <CTRL_T> to escape to L2

Summary of Testing with O350

Like the Fuel & Numalink Router, the O350 will start up L1 when a cleared / uninitialised chip is installed and will automatically initialise the chip. As the L2 had a valid serial number, the O350 was automatically configured to take on the L2 serial.
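The "serial all" output above is what confirms this: the local and reference serial numbers agree. If you capture that output to a file, the check can be scripted. The following is a minimal Python sketch; the sample rows are copied from the output above, and the function names and parsing approach are my own assumptions rather than any SGI tooling.

```python
# Rows copied from the `serial all` output shown above.
SAMPLE = """\
Data                            Location      Value
------------------------------  ------------  --------
Local System Serial Number      NVRAM         M200XXXX
Reference System Serial Number  Attached L2   M200XXXX
Local Brick Serial Number       EEPROM        NAL684
Reference Brick Serial Number   NVRAM         NAL684
"""


def parse_serial_table(text):
    """Map each 'Data' label to its value (the last field on the row)."""
    serials = {}
    for line in text.splitlines():
        if "Serial Number" not in line:
            continue
        label, _, rest = line.partition("Serial Number")
        fields = rest.split()
        if fields:
            serials[label.strip() + " Serial Number"] = fields[-1]
    return serials


def serials_consistent(serials):
    """True when local and reference values agree for system and brick."""
    return (
        serials["Local System Serial Number"]
        == serials["Reference System Serial Number"]
        and serials["Local Brick Serial Number"]
        == serials["Reference Brick Serial Number"]
    )


if __name__ == "__main__":
    table = parse_serial_table(SAMPLE)
    print("serials consistent:", serials_consistent(table))
```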

I did not bother testing the undocumented L1 "let the carnage begin" command to disable "serial security" and clear the serial, as configuring O350 serials is a well documented and easy process.

Overall I am now sure that the Fuel, O350, Tezro and Numalink Router can all have a new cleared Dallas installed, after which you can configure and boot the machine to get it up and running.

No special Fuel chip swapping required...


Tezro, O350 & Fuel - Snaphat

The main CPU clock uses an STI clock chip, which has a replaceable battery package known as "Snaphat" that sits on top of the RTC clock chip. The Snaphat is a combination of a battery and an oscillator in proprietary packaging. So there is no point going to the trouble of putting a replaceable battery on top of this, as it already is a replaceable battery.

The Snaphat battery/oscillator is completely distinct from the Dallas DS1742W NVRAM + RTC chip. The Dallas chip supports the needs of L1 (its clock time is managed via L1) and holds the L1 configuration, including the machine's serial number. The Snaphat provides the battery and oscillator for the STI RTC and NVRAM chip, which holds the NVRAM data (as managed via PROM) and the system clock. Its date/time is set via the PROM command monitor.

Let's hope that the Snaphat does not end up with supply issues like the Dallas DS1742W series. In any case its contents are so trivial that other manufacturers can easily make a replacement, and there are likely imitation Snaphats already being made. For retro equipment like SGI machines there is no practical difference between real and fake, so long as the imitation has a working oscillator and a good battery. Before you put your Snaphat into service, measure the battery to make sure it is not flat. I measured the voltage on my eBay acquired (China sourced) "Snaphat"s and they read around 2.9 Volts. Also do a physical inspection to be sure the package does not have any cracks, which could result in leakage at a later stage.

Looking at this externally it is likely an imitation:

Original (left) and eBay acquired Snaphat

The Snaphat has a plastic clip to hold it in place, so removing it requires a bit more force than you might expect. Having a chip removal tool will make this easier, especially on the Fuel where it is not easily gripped by "big fingers". Here is the Snaphat location on the Fuel:

Snaphat Battery / Oscillator in Fuel (left is back of machine)

For the O350, replacing the Snaphat is more work as it is on the underside of the IO9 board, so you will have to remove the IO9 to get access. See the following pictures:

SGI O350 Module with IO9 (bottom board at bottom of picture)

The Snaphat is on underside, so requires IO9 removal for access:

SGI O350 Module IO9 with Snaphat

The IO9 is hard to remove as it has the PCIX and extension connectors. Once the IO9 board is removed replacement is easy:

SGI O350 / Tezro IO9 Board

NOTE: Tezro shares the same IO9 Board


Glitchworks Dallas DS1687-5 for Octane/Octane2 and O2

The SGI Octane and O2 machines have a Dallas DS1687-5 RTC/NVRAM chip. While this chip is still available, Glitchworks have made a nice replacement version that includes a replaceable battery.

A Glitchworks GW-1687 Dallas DS1687-5 Replacement

The Dallas chip is installed on the Octane/Octane2 removable CPU Module, so when doing the replacement it is very important not to touch the XTALK compression connectors.

The chip is located next to the memory slots on the same side as the external ports.

Octane2 System board Dallas DS1687-5 Chip - Pin 1 bottom left

As the chip is hard to access, it is best to have an IC removal tool to pull the chip out. There is no need to save or copy your NVRAM settings from your old Dallas to the new Glitchworks replacement during replacement.

I simply pulled out the old Dallas and put in the new Glitchworks replacement, and the new chip was auto-initialised as part of system boot. The only change in the NVRAM settings was that "netaddr" reverted to its default value of "192.0.2.1", whereas I had changed it to an address suited to my network environment. So I went into the PROM command prompt to set it again for my network.

You should also boot into single user mode after you first change the chip so that you can reset the date to the current time, as you will get a "preposterous date" warning when first booting with the new Dallas replacement.

Octane2 with Glitchworks GW-1687 replacement installed - Pin 1 bottom left

While the Glitchworks replacement is not essential, as the Dallas DS1687-5 is still available (unlike the DS1742W), its quality and its replaceable battery make it a worthwhile update for long term Octane/Octane2 owners.

NOTE: You can dump your NVRAM setting from IRIX using "nvram" command.
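If you do save a dump before and after the swap, comparing the two shows which variables fell back to defaults. The following is only a sketch in Python, assuming the dump is one name=value pair per line. The function names are mine, and the sample values are hypothetical apart from the netaddr default of 192.0.2.1 mentioned above.

```python
def parse_nvram(text):
    """Parse 'name=value' lines into a dict, ignoring anything else."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name:
            settings[name] = value
    return settings


def changed_settings(before, after):
    """Return {name: (old, new)} for variables whose value differs."""
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


# Hypothetical dumps: only the netaddr default (192.0.2.1) comes from the text.
OLD_DUMP = "netaddr=192.168.1.50\nconsole=g\n"
NEW_DUMP = "netaddr=192.0.2.1\nconsole=g\n"

if __name__ == "__main__":
    diff = changed_settings(parse_nvram(OLD_DUMP), parse_nvram(NEW_DUMP))
    for name, (old, new) in diff.items():
        print(f"{name}: {old} -> {new}")
```

A variable that exists in only one dump shows up with None on the other side, so missing entries are reported as well as changed ones.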


Inside Fake (ebay) Dallas DS1742W-120+

Further destructive investigation of the DS1742W-120+ (fake or faulty..?) finds that the battery is on the bottom, not the top, of the package:

Dallas DS1742W-120+ (pin 1 on left, pin 12/13 on right)

On Irix Network it was found that the outer casing of Dallas chips becomes plastic on heating, which makes it much easier to remove the casing and reveal what is inside. Testing with (fake) DS1742W-120+ chips, I opened one up using this significantly less destructive heating technique in combination with some cutting with a Stanley knife. The small 6 pin IC on the left (pin 1 end) is stamped 10A45.

Dallas DS1742W-120+ Broken Open Using Heat Gun (pin 1 on left)
Dallas DS1742W-120+ Broken Open and Capacitor and Battery Removed
Dallas DS1742W-120+ Top View with Casing Removed
Dallas DS1742W-120+ Heat Gun vs Dremel
Dallas DS1742W-120+ Side On

With a heat gun the outer casing becomes crumbly at around 270 degrees Celsius and will start to bubble at around 300 degrees Celsius. At 320 degrees Celsius the heat gun will affect the solder used to hold the pins on.

Here is PCB and pin details for Dallas DS1742W-120+ :

Dallas DS1742W-120+ PCB Pin 1 Top Left
Dallas DS1742W-120+ Pin 1 Bottom Left

Draft Pin Details:

  • Pin 1 - Notched - Straight Through (A7)
  • Pin 2 - Notched - Straight Through (A6)
  • Pin 3 - Notched - Straight Through (A5)
  • Pin 4 - Notched - Straight Through (A4)
  • Pin 5 - Notched - Straight Through (A3)
  • Pin 6 - Notched - Straight Through (A2)
  • Pin 7 - Notched - Straight Through (A1)
  • Pin 8 - Notched - Straight Through (A0)
  • Pin 9 - Top Side R1, Under Side T1 (DQ)
  • Pin 10 - Top Side R5, Under Side T2 (DQ)
  • Pin 11 - Top Side R6, Under Side T3 (DQ)
  • Pin 12 - Notched - Ground Layer (Battery -Neg)
  • Pin 13 - Top Side R7, Under Side T6 (DQ)
  • Pin 14 - Top Side R8, Under Side T5 (DQ)
  • Pin 15 - Top Side R9, Under Side T5 (DQ)
  • Pin 16 - Top Side R10, Under Side T9 (DQ)
  • Pin 17 - Top Side R11, Under Side T8 (DQ)
  • Pin 18 - Notched - Straight Through (^CE)
  • Pin 19 - Notched - Straight Through (A10)
  • Pin 20 - Notched - Straight Through (^OE)
  • Pin 21 - Notched - Straight Through (^WE)
  • Pin 22 - Notched - Straight Through (A9)
  • Pin 23 - Notched - Straight Through (A8)
  • Pin 24 - Top Side C2 (VCC), Under Side C3 (10A45)
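As a sanity check on the draft pin list, the storage size can be worked out from the number of address and data lines. The Python sketch below transcribes the signal names from the list above (pin 12 entered as GND, since it goes to the ground layer); the dictionary and function names are my own. If the draft is right, 11 address lines and 8 data lines give 2048 locations of 8 bits each.

```python
# Signal names transcribed from the draft pin list above (pin 12 is ground).
PIN_SIGNALS = {
    1: "A7", 2: "A6", 3: "A5", 4: "A4", 5: "A3", 6: "A2", 7: "A1", 8: "A0",
    9: "DQ", 10: "DQ", 11: "DQ", 12: "GND", 13: "DQ", 14: "DQ", 15: "DQ",
    16: "DQ", 17: "DQ", 18: "CE", 19: "A10", 20: "OE", 21: "WE",
    22: "A9", 23: "A8", 24: "VCC",
}


def address_lines(pins):
    """Return the address signal names (A0, A1, ...) in numeric order."""
    names = [s for s in pins.values() if s.startswith("A") and s[1:].isdigit()]
    return sorted(names, key=lambda s: int(s[1:]))


def data_lines(pins):
    """Return the pin numbers that carry a data (DQ) signal."""
    return [pin for pin, signal in pins.items() if signal == "DQ"]


def capacity_bits(pins):
    """Locations (2 ** address lines) multiplied by the data width."""
    return (2 ** len(address_lines(pins))) * len(data_lines(pins))


if __name__ == "__main__":
    locations = 2 ** len(address_lines(PIN_SIGNALS))
    width = len(data_lines(PIN_SIGNALS))
    print(f"{locations} locations x {width} bits = {capacity_bits(PIN_SIGNALS)} bits")
```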

Draft Components

  • Capacitors x 3 (10A45)
  • Resistor x 8 (102) - 1 per Data Line
  • Voltage Regulator IC 6 PIN 10A45
  • Oscillator Crystal (Pin 1/24 Notch)
  • Battery Cell  - 3V CR1225 (+Pos Pin 12/13 Notch, -Neg Pin 12)

So there is a bit of complexity to the PCB that sits underneath the main IC. Pending a full reverse engineering of the chip, a simpler solution is to put a new replaceable battery on top of the chip and wire it to the Pin 12/13 Notch (+Pos) and Pin 12 (-Neg).

Here are pictures of a carefully cut DS1742W-120+, ready for a new battery to be put on top.

Carefully Cut DS1742W-120+ (Pin 1 Left Bottom)
Carefully Removed Battery from DS1742W-120+

Based on feedback from the eevblog thread (james_s) and looking at a known original SGI sourced chip (see pictures above), I think we can conclusively say this is a "fake" Dallas DS1742W-120+


References & Links:

Irix Network - Discussion thread on drilling Dallas and replacement thread by cz7asm

eevblog - Dallas DS1742W Reverse Engineering thread, includes x-ray view of chip from james_s

Dallas - is also used in Tektronix oscilloscopes, so others have looked at this in another eevblog electronics forum thread

Dallas DS1742W - Specification Sheet

10A45 - Voltage Regulator Specification Sheet

Maxim - Real Time Clocks Selector Guide from June 2009

STI Snaphat - Oscillator & Battery Replacement package

XGecu - TL866 USB Programmer

Disable l1 serial security - using the "let the carnage begin" undocumented command (again thanks to old Nekochan thread)

Codesrc - Michael McMaster developed and released a SCSI to SD flash adaptor that provides a SCSI disk replacement solution for retro-computing. The design was released as Open Source under the GPL, which means it can be manufactured by anyone, and this resulted in variations such as SCSI2CF (compact flash). The current V6 version no longer has an Open Source design :-( .

Glitchworks - make a very nice DS1687-5 replacement for the Octane & O2 machines