Chapter 8: SROM dumps

23-MAR-2024

Following best practice of vintage computing, I might as well dump the Serial ROM (SROM) while I’m at it.

On power-up, the 21064 CPU bootstraps via a 3-wire serial interface.

Three signals are used to interface to the serial ROM. The sRomOE_l output signal supplies the output enable to the ROM, serving both as an output enable and as a reset (refer to the serial ROM specifications for details). The sRomClk_h output signal supplies the clock to the ROM that causes it to advance to the next bit. The ROM data is read via the sRomD_h input signal.
DECChip™ 21064–AA RISC Microprocessor Preliminary Data Sheet, 1992

But, as I noted before, the ROM sitting on that interface is not serial in the slightest. It is a 27C512, a 65536×8 parallel UV-erasable EPROM. Some quick poking about with a multimeter revealed that, with the aid of external components (two 8-bit counters and jumpers J1…J8), it presents itself to the CPU as eight serial ROMs.

[Figure: SROM schematic diagram]

[Figure: SROM components on the System Module]

Dumping the contents is thus pleasantly simple. All signals are available on standard 0.1” pin headers. All you need is a TTL-compatible SPI adapter or a logic analyser.

Bitstream structure

Thankfully, the initial program bitstream is very well documented.

Following power-on, the CPU loads its I-stream internal cache from an external 64-kbit SROM, clocking in the data as a serial bit stream. All I-cache bits are loaded from the 65536-bit serial stream, including both data and control bits; the size of the loaded program is therefore 7136 bytes. Data is loaded, least-significant bit first, starting from LW0 (LongWord0, 32 bits), then LW2, LW4, LW6, TAG (21 bits), ASN (6 bits), the ASM bit, the V bit, LW1, LW3, LW5, LW7, and, lastly, BHT (8 bits).
Digital AlphaStation 200/400 Series Technical Information, 1995

[Figure: The order in which bits within each block are serially loaded]
DECChip™ 21064–AA RISC Microprocessor Preliminary Data Sheet, 1992

Every 256 data bits are accompanied by 37 control bits in the bit stream. Hence 8 kBytes of SROM data populate 223 Icache blocks, or 7136 bytes, as stated above. In practice, the CPU actually tries to read in all 256 blocks, or 75008 bits (9376 bytes). But the 16-bit address counter wraps around on DEC 3000 AXP, and the last 33 Icache blocks fill up with nonsense (bit shifted contents of the beginning of the SROM bit stream, with data and control bits ending up in wrong places).
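The block arithmetic is easy to double-check. The following is only a sanity check of the numbers quoted above; the constant names are mine, and the 8 KB Icache size is what 256 blocks of 32 bytes amount to.

```python
# Sanity check of the Icache block arithmetic (constant names are illustrative).
DATA_BITS = 8 * 32                      # eight longwords per Icache block
CONTROL_BITS = 21 + 6 + 1 + 1 + 8       # TAG + ASN + ASM + V + BHT
BLOCK_BITS = DATA_BITS + CONTROL_BITS   # bits shifted in per block
SROM_BITS = 65536                       # one 64-kbit serial stream
ICACHE_BLOCKS = 256                     # 256 blocks of 32 bytes

full_blocks = SROM_BITS // BLOCK_BITS   # blocks that fit before wrap-around

print(BLOCK_BITS)                       # bits per block
print(full_blocks, full_blocks * 32)    # complete blocks, and their size in bytes
print(ICACHE_BLOCKS * BLOCK_BITS)       # bits the CPU actually clocks in
print(ICACHE_BLOCKS - full_blocks)      # blocks filled after the counter wraps
```

It prints 293, then 223 and 7136, then 75008, then 33, matching the figures in the text.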

Having downloaded all eight bit streams, I extracted the actual executable contents with a simple Python script. In populated cache blocks, bits V (valid) and ASM (address space match) are set to one. The remaining control fields – TAG, ASN, and BHT – are zeros. Thus, the machine code in the Icache looks as if it were filled from memory at physical address 0.
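For illustration, here is a minimal sketch of what such an extraction could look like. It is not the script I actually used. The field order comes from the AlphaStation quote above; the function names, the assumption that the dump file stores the stream least significant bit first within each byte, and the decision to stop at the first invalid block are all assumptions made for this example.

```python
# Hypothetical sketch: split one SROM bit stream into Icache blocks.
# Field order follows the documentation quoted above; bit packing of the
# dump file and all names below are assumptions.
FIELDS = [("LW0", 32), ("LW2", 32), ("LW4", 32), ("LW6", 32),
          ("TAG", 21), ("ASN", 6), ("ASM", 1), ("V", 1),
          ("LW1", 32), ("LW3", 32), ("LW5", 32), ("LW7", 32),
          ("BHT", 8)]
BLOCK_BITS = sum(width for _, width in FIELDS)   # 293 bits per block


def bits_lsb_first(data: bytes):
    """Yield the bits of `data`, least significant bit of each byte first."""
    for byte in data:
        for i in range(8):
            yield (byte >> i) & 1


def parse_blocks(stream: bytes):
    """Return a list of {field: value} dicts, one per complete block."""
    bits = list(bits_lsb_first(stream))
    blocks = []
    for start in range(0, len(bits) - BLOCK_BITS + 1, BLOCK_BITS):
        pos, block = start, {}
        for name, width in FIELDS:
            # Assemble each field LSB first, in arrival order.
            block[name] = sum(bits[pos + i] << i for i in range(width))
            pos += width
        blocks.append(block)
    return blocks


def extract_image(stream: bytes) -> bytes:
    """Concatenate LW0..LW7 of the leading valid blocks into one image."""
    image = bytearray()
    for block in parse_blocks(stream):
        if not block["V"]:
            break                      # assumed: code ends at the first invalid block
        for n in range(8):
            image += block[f"LW{n}"].to_bytes(4, "little")
    return bytes(image)
```

Calling `extract_image(open("srom0-raw.bit", "rb").read())` would then yield the executable image for stream 0, provided the file really is packed the way this sketch assumes.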

Hexadecimal dumps of the raw bit streams and executable images contained within are available here:

Image  Description                      Length in bytes  Raw bitstream hex dump   Executable image hex dump
0      Powerup Sequence                 7008             srom0-bitstream-hex.txt  srom0-image-hex.txt
1      Mini-Console at 19200 baud       7008             srom1-bitstream-hex.txt  srom1-image-hex.txt
2      Mini-Console at 9600 baud        7008             srom2-bitstream-hex.txt  srom2-image-hex.txt
3      Cache Test (longword)            1024             srom3-bitstream-hex.txt  srom3-image-hex.txt
4      Mfg Test – bctest                5632             srom4-bitstream-hex.txt  srom4-image-hex.txt
5      Empty (no output)                1024             srom5-bitstream-hex.txt  srom5-image-hex.txt
6      LongWord Memory test (no cache)  5120             srom6-bitstream-hex.txt  srom6-image-hex.txt
7      LongWord Memory test (cache on)  5120             srom7-bitstream-hex.txt  srom7-image-hex.txt

For those who’d like to follow at home, I’ll pack everything up into a zip at the end of the chapter for your downloading convenience.

Now I can interleave the raw streams bitwise to get the full PROM content (as programmed into the 27C512). The least significant bit of each byte comes from bit stream 0, the next one from stream 1 and so on. In Python this can be done like this:

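# Streams are read in the order 7..0, so stream 7 ends up in the most
# significant bit of each PROM byte and stream 0 in the least significant.
bitstreams = [open(f"srom{i}-raw.bit", "rb").read() for i in range(7, -1, -1)]
combined = b''.join([
    # Transposing eight 8-bit strings (one per stream) gives eight PROM bytes;
    # "little" puts the byte built from each stream's bit 0 first.
    int("".join(map("".join, zip(*map("{:08b}".format, bits)))), 2)
        .to_bytes(8, "little")
    for bits in zip(*bitstreams)])
with open("srom-image.bin", "wb") as f:
    f.write(combined)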

Here’s what the result looks like in hexadecimal: full-srom-hex.txt.
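The same mapping can be written out with explicit loops, which also makes the inverse (pulling one bit stream back out of a PROM image) easy to express. This is an illustrative sketch: the function names are mine, `streams[n]` here is indexed by stream number rather than in the reversed order used above, and the PROM image length is assumed to be a multiple of 8.

```python
# Illustrative loop version of the interleaving and its inverse.
def interleave(streams):
    """streams[n] is bit stream n (0..7); returns the PROM image.

    Each PROM byte carries one bit of every stream: stream n sits in
    data bit n, and each stream byte is consumed LSB first.
    """
    prom = bytearray()
    for column in zip(*streams):            # one byte from each stream
        for bit in range(8):                # LSB of the stream byte first
            prom.append(sum(((byte >> bit) & 1) << n
                            for n, byte in enumerate(column)))
    return bytes(prom)


def deinterleave(prom, n):
    """Recover bit stream n (0..7) from the PROM image."""
    out = bytearray()
    for i in range(0, len(prom), 8):        # eight PROM bytes per stream byte
        out.append(sum(((prom[i + bit] >> n) & 1) << bit
                       for bit in range(8)))
    return bytes(out)


# Round trip on made-up data
streams = [bytes([n, 0xFF - n, 0x5A]) for n in range(8)]
prom = interleave(streams)
assert all(deinterleave(prom, n) == streams[n] for n in range(8))
```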

Mystery bytes

I noticed something odd. At the very end of each bitstream, after all the zeros that pad it to 8 kB, there is an unknown byte.

Bitstream 0:  1ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 d8
Bitstream 1:  1ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 f0
Bitstream 2:  1ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 1a
Bitstream 3:  1ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 d5
Bitstream 4:  1ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 c1
Bitstream 5:  1ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 f0
Bitstream 6:  1ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 94
Bitstream 7:  1ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 c7

Although this byte does make its way into the Icache, it’s not in a valid block (V=0) and therefore the processor can never see it. So it’s not meant for consumption by the code executing on the Alpha CPU. What is it then, some sort of checksum or identification for external tools?

It does not look like a checksum: I’ve tried various types of CRC-8 as well as XOR and sum modulo 256, and none of them match the final byte. It’s not an ID either, as f0 is found in two different bit streams.
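For anyone who wants to repeat the experiment, this is roughly the kind of test involved. It is a sketch, not the exact code I ran: the helper names are made up, the CRC routine is the plain MSB-first variant without reflection or final XOR, and the polynomial list is just a handful of common ones.

```python
# Hypothetical helper: try a few simple 8-bit checks over a stream whose
# last byte is the suspected check value. Names and CRC parameters are
# illustrative, not taken from any DEC tool.
from functools import reduce


def xor8(data: bytes) -> int:
    """XOR of all bytes."""
    return reduce(lambda a, b: a ^ b, data, 0)


def sum8(data: bytes) -> int:
    """Sum of all bytes modulo 256."""
    return sum(data) & 0xFF


def crc8(data: bytes, poly: int = 0x07, init: int = 0x00) -> int:
    """Plain MSB-first CRC-8 (no reflection, no final XOR)."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc


def candidates(stream: bytes):
    """Map each check name to whether it reproduces the final byte."""
    body, last = stream[:-1], stream[-1]
    checks = {"xor": xor8(body), "sum": sum8(body),
              "neg-sum": (-sum8(body)) & 0xFF}
    for poly in (0x07, 0x31, 0x9B, 0xD5):   # a few common CRC-8 polynomials
        checks[f"crc8/{poly:#04x}"] = crc8(body, poly)
    return {name: value == last for name, value in checks.items()}
```

Running `candidates()` over each raw bit stream and looking for a `True` is the whole test; a more thorough search would also vary the initial value, reflection and final XOR.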

Perhaps these bytes are not related to the bitstreams per se but are rather an artefact of some signature added to the full ROM image? Let’s have a look.

% hexdump -C -s 0xfff0 full-srom.bin
0000fff0  00 00 00 00 00 00 00 00  98 84 c8 05 6f 22 bb fb  |............o"..|
00010000

Well, this doesn’t ring any bells either. I’ve tried 64- and 32-bit XORs and modulo sums, and got it checked for all known CRC-64 and CRC-32 algorithms with CRC RevEng. No joy.

Could that be a timestamp? DEC’s flagship operating system was VMS (a.k.a. OpenVMS™). Here's what its timestamp would have looked like:

The operating system maintains the current date and time in 64-bit format. The time value is a binary number in 100-nanosecond (ns) units offset from the system base date and time, which is 00:00 o'clock, November 17, 1858 (the Smithsonian base date and time for the astronomic calendar).

Programming Concepts Manual, Volume II

Windows NT has gone back as far as 1 January 1601 for its epoch but kept the 100-nanosecond resolution, which it inherited from VMS together with Dave Cutler. And Unix time in seconds since 1 January 1970 UTC might also be an option, who knows.

The label on the ROM says “(C) DEC 89”. It may have been updated since then, but probably no later than 1997. Let’s see what VMS, Windows, and Unix timestamps look like in that time period.

OS       1 January 1989    1 January 1998
VMS      91e2e116bc4000    9bf9d0aa8c8000
Windows  1b2ff58a00f8000   1bd164833dfc000
Unix     23bd6a00          34aadc80

The bytes at the end of the SROM would have been in little-endian order, so we are looking at either 0x05c88498 (97027224₁₀) and 0xfbbb226f (−71622033₁₀), or the 64-bit number 0xfbbb226f05c88498. None of these look in any way related to the timestamp ranges above.
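The comparison can be reproduced with a short script. This is a sketch under my own naming; it recomputes the epoch values for the two dates and checks whether any reading of the trailing bytes (taken here as unsigned little-endian integers) falls inside the range.

```python
# Sketch: recompute the epoch values for both dates and compare them with
# the trailing PROM bytes. Helper names are illustrative.
from datetime import datetime, timezone

TAIL = bytes.fromhex("9884c8056f22bbfb")   # last 8 bytes of the PROM image


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


def ticks(when, epoch, per_second):
    """Whole ticks elapsed between `epoch` and `when`."""
    return int((when - epoch).total_seconds()) * per_second


EPOCHS = {
    "VMS": (utc(1858, 11, 17), 10_000_000),    # 100 ns units
    "Windows": (utc(1601, 1, 1), 10_000_000),  # 100 ns units
    "Unix": (utc(1970, 1, 1), 1),              # seconds
}
lo, hi = utc(1989, 1, 1), utc(1998, 1, 1)

candidates = {
    "low 32 bits": int.from_bytes(TAIL[:4], "little"),
    "high 32 bits": int.from_bytes(TAIL[4:], "little"),
    "64 bits": int.from_bytes(TAIL, "little"),
}

for name, (epoch, rate) in EPOCHS.items():
    first, last = ticks(lo, epoch, rate), ticks(hi, epoch, rate)
    hits = [k for k, v in candidates.items() if first <= v <= last]
    print(f"{name:8}{first:x} .. {last:x}  matches: {hits or 'none'}")
```

Every row reports no matches, whichever of the three readings of the bytes is used.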

And so the mystery remains unsolved. What are those bytes? If you happen to know, please write to cetus at cetus.sh.

Disassembly

24-MAR-2024

Now that we’ve come this far, let’s have a quick look at how SROM code works. This is interesting for several reasons.

I used good old  objdump -b binary -m alpha:ev4 -D  to disassemble SROM images.

Below are links to the disassembled code next to the respective SROM output on my machine. (Corresponding binaries can be found in a zip at the bottom.)

Image 0:  srom0-asm.html
DEC 3000 - M800 SROM 6.1
Powerup Sequence
ff.fd.fb.fa.f9.f8.f7.f6.f5.f4.f3.f2.f1.f0.
sysROM  00000069.0000087a
ioROM   00000069.00000212
MCRstat 11111111.11801180
bnkSize 00000200.00000500
memSize 00000040.00000040



Image 1:  srom1-asm.html
DEC 3000 - M800 SROM 6.1
Mini-Console
ff.fd.fb.fa.f9.f8.f7.f6.f5.f4.f3.f2.f1.f0.
sysROM	00000069.0000087a
ioROM	00000069.00000212
MCRstat	11111111.11801180
bnkSize	00000200.00000500
memSize	00000040.00000040


SROM> 
Image 2:  srom2-asm.html
DEC 3000 - M800 SROM 6.1
Mini-Console
ff.fd.fb.fa.f9.f8.f7.f6.f5.f4.f3.f2.f1.f0.
sysROM	00000069.0000087a
ioROM	00000069.00000212
MCRstat	11111111.11801180
bnkSize	00000200.00000500
memSize	00000040.00000040


SROM> 
Image 3:  srom3-asm.html
Cache Test (longword)

r13 = 00000000, r14 = 00080000...done.
r13 = 00080000, r14 = 00100000...done.
r13 = 00100000, r14 = 00180000...done.
r13 = 00180000, r14 = 00200000...done.
r13 = 00200000, r14 = 00280000...done.
r13 = 00280000, r14 = 00300000...done.
r13 = 00300000, r14 = 00380000...done.
r13 = 00380000, r14 = 00400000...done.
r13 = 00400000, r14 = 00480000...done.
Image 4:  srom4-asm.html
DEC 3000 - M800 SROM 6.1
Mfg Test
ff.fd.fb.f0.
MCRstat 11111111.11801180
bnkSize 00000200.00000500
memSize 00000040.00000040

        bctest

d12345678 00000000
D12345678 00000001
D12345678 00000002
d12345678 00000003
Image 5:  srom5-asm.html

— no output —

Image 6:  srom6-asm.html
DEC 3000 - M800 SROM 6.1
Mfg Test
ff.fd.fb.f0.
MCRstat	11111111.11801180
bnkSize	00000200.00000500
memSize	00000040.00000040

	memTest (no-cache)
	LongWord Memory Test

....done.
....done.
....done.
Image 7:  srom7-asm.html
DEC 3000 - M800 SROM 6.1
Mfg Test
ff.fd.fb.f0.
MCRstat 11111111.11801180
bnkSize 00000200.00000500
memSize 00000040.00000040

	memTestCacheOn
	LongWord Memory Test

....done.
....done.
....done.

A quick peek

The default SROM program (image 0) starts by initialising various internal processor state. Then something interesting happens at address 100₁₆:

     100:  227f0a0d    lda        r19, 0xa0d          ; '\r\n'
     104:  26730a0d    ldah       r19, 0xa0d(r19)     ; '\r\n'
     108:  d3a002e8    bsr        r29, 0xcac
     10c:  227f3033    lda        r19, 0x3033         ; '30'
     110:  26733030    ldah       r19, 0x3030(r19)    ; '00'
     114:  4a641733    sll        r19, 0x20, r19
     118:  22734544    lda        r19, 0x4544(r19)    ; 'DE'
     11c:  26732043    ldah       r19, 0x2043(r19)    ; 'C '
     120:  d3a002e2    bsr        r29, 0xcac
     124:  227f2d20    lda        r19, 0x2d20         ; ' -'
     128:  26730020    ldah       r19, 0x20(r19)      ; ' '
     12c:  d3a002df    bsr        r29, 0xcac
     130:  227f0020    lda        r19, 0x20           ; ' '
     134:  4a641733    sll        r19, 0x20, r19
     138:  2273384d    lda        r19, 0x384d(r19)    ; 'M8'
     13c:  26733030    ldah       r19, 0x3030(r19)    ; '00'
     140:  d3a002da    bsr        r29, 0xcac

It builds up ASCII strings of up to 8 characters in register r19 and calls 0xcac, presumably to send them to the SROM debug port. The return address for each call is stored in r29. Let’s see what happens there.
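To see how the constants turn into text, here is a toy emulation of the three instructions involved. Only lda, ldah and sll are modelled, the helper names are mine, and the register is read back as little-endian ASCII, which is the order in which the output routine consumes it.

```python
# Toy emulation of the instructions used to build r19
# (helper names are mine; only lda, ldah and sll are modelled).
MASK64 = (1 << 64) - 1


def sext16(value):
    """Sign-extend a 16-bit displacement."""
    return value - 0x10000 if value & 0x8000 else value


def lda(base, disp):
    """base + sign-extended 16-bit displacement."""
    return (base + sext16(disp)) & MASK64


def ldah(base, disp):
    """base + (sign-extended displacement << 16)."""
    return (base + (sext16(disp) << 16)) & MASK64


def sll(value, count):
    return (value << count) & MASK64


def as_text(reg):
    """Read the register as little-endian ASCII, dropping trailing NULs."""
    return reg.to_bytes(8, "little").rstrip(b"\0").decode("ascii")


# Instructions at 0x10c..0x11c; r31 always reads as zero
r19 = lda(0, 0x3033)
r19 = ldah(r19, 0x3030)
r19 = sll(r19, 0x20)
r19 = lda(r19, 0x4544)
r19 = ldah(r19, 0x2043)
print(repr(as_text(r19)))   # 'DEC 3000'
```

The other sequences decode the same way into "\r\n\r\n", " - " and "M800 ", which together form the first line of the banner.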

     cac:  4a603630    zapnot     r19, 0x1, r16
     cb0:  d3c0003d    bsr        r30, 0xda8
     cb4:  4a611693    srl        r19, 0x8, r19
     cb8:  f67ffffc    bne        r19, 0xcac
     cbc:  6bfd8000    ret        r31, (r29), 0

The string in r19 is handed out one byte at a time to a subroutine at 0xda8. This time the parameter is passed in r16, and the return address in r30.

     da8:  e5000009    beq        r8, 0xdd0
     dac:  22df0014    lda        r22, 0x14
     db0:  2210ff00    lda        r16, -0x100(r16)
     db4:  4a00b730    sll        r16, 0x5, r16
     db8:  77ff0033    hw_mtpr/i  r31, 0x13           ; r31 -> SL_CLR
     dbc:  76100036    hw_mtpr/i  r16, 0x16           ; r16 -> SL_XMIT
     dc0:  4a003690    srl        r16, 0x1, r16
     dc4:  42c03536    subq       r22, 0x1, r22
     dc8:  d3600009    bsr        r27, 0xdf0
     dcc:  f6dffffa    bne        r22, 0xdb8
     dd0:  6bfe8000    ret        r31, (r30), 0

The character code in r16 is surrounded by start and stop bits and then sequentially shifted out onto the sRomClk_h pin via bit 4 of the SL_XMIT register. The total shift count is 0x14 = 20₁₀, which is somewhat excessive: it generates 11 stop bits! A subroutine at 0xdf0 is called between bit shifts, this time via r27.
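The framing trick is compact enough to be worth modelling. The sketch below follows the loop at 0xda8 instruction by instruction; the function name is mine, and the inter-bit delay is left out.

```python
# Model of the shift loop at 0xda8 (a sketch; the function name is mine).
MASK64 = (1 << 64) - 1


def frame_bits(char_code, count=0x14):
    """Return the line levels produced for one character, in send order."""
    r16 = (char_code - 0x100) & MASK64   # lda r16, -0x100(r16): all ones above the byte
    r16 = (r16 << 5) & MASK64            # sll r16, 5: a zero start bit lands in bit 4
    levels = []
    for _ in range(count):               # r22 counts down from 0x14
        levels.append((r16 >> 4) & 1)    # bit 4 of SL_XMIT drives the pin
        r16 >>= 1                        # srl r16, 1
    return levels


bits = frame_bits(ord("D"))
print(bits[0], bits[1:9], bits[9:])
```

For 'D' (0x44) this gives a 0 start bit, the data bits 0, 0, 1, 0, 0, 0, 1, 0 (least significant first), and then eleven 1s.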

     df0:  47ff0415    clr        r21
     df4:  201f00c8    lda        r0, 0xc8
     df8:  4c0d1400    mulq       r0, 0x68, r0
     dfc:  48150680    srl        r0, r21, r0
     e00:  22bf0001    lda        r21, 1
     e04:  4aa41735    sll        r21, 0x20, r21
     e08:  76b50051    hw_mtpr/a  r21, 0x11           ; r21 -> CC_CTL
     e0c:  613fc000    rpcc       r9
     e10:  4921f629    zapnot     r9, 0xf, r9
     e14:  400909b5    cmplt      r0, r9, r21
     e18:  e6bffffc    beq        r21, 0xe0c
     e1c:  6bfb8000    ret        r31, (r27), 0

This is simply a busy wait loop. The RPCC instruction reads the processor cycle counter, which is continuously compared with 0xc8⋅0x68 = 20800₁₀. As the CPU is clocked at 200 MHz, this gives us 2⋅10⁸ / 20800 ≈ 9615 baud.

As you may recall from Chapter 7, images 1 and 2 communicate at different baud rates but otherwise appear to implement the same interactive miniconsole. Now, this just compels me to compare their code:

Image 1:  Mini-Console at 19 200 baud
           ...
     de8:  47ff0415    clr        r21
     dec:  201f00c8    lda        r0, 0xc8
     df0:  4c069400    mulq       r0, 0x34, r0
     df4:  48150680    srl        r0, r21, r0
           ...
Image 2:  Mini-Console at 9600 baud
           ...
     de8:  47ff0415    clr        r21
     dec:  201f00c8    lda        r0, 0xc8
     df0:  4c0d1400    mulq       r0, 0x68, r0
     df4:  48150680    srl        r0, r21, r0
           ...

Indeed, the only difference is the multiplication constant in the UART timing loop! (These programs use the same bit-bash UART subroutines as image 0, just at a slightly different address.)
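The resulting bit rates follow directly from the two constants. A quick check, assuming the 200 MHz clock mentioned earlier and one delay call per bit (the function name is mine):

```python
# Baud rate implied by the delay constant (assumes a 200 MHz CPU clock
# and exactly one delay call per transmitted bit).
CPU_HZ = 200_000_000


def baud(multiplier, base=0xC8):
    """Bits per second for a delay of base * multiplier CPU cycles."""
    cycles_per_bit = base * multiplier
    return CPU_HZ / cycles_per_bit


for image, mult, nominal in ((1, 0x34, 19200), (2, 0x68, 9600)):
    rate = baud(mult)
    error = (rate - nominal) / nominal * 100
    print(f"image {image}: {rate:.0f} baud ({error:+.2f}% vs {nominal})")
```

Both rates come out about 0.16% fast, which is comfortably within what an ordinary UART tolerates.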

The full zip

For those playing at home, here is the zip with everything in it:
SROM_6.1_DEC_3000_M800.zip