23-MAR-2024
Following the best practices of vintage computing, I might as well dump the Serial ROM (SROM) while I’m at it.
On power-up, the 21064 CPU bootstraps via a 3-wire serial interface.
But, as I noted before, the ROM sitting on that interface is not serial in the slightest. It is a 27C512, a 65536x8 parallel UV-erasable EPROM. Some quick poking about with a multimeter revealed that it presents itself to the CPU as eight serial ROMs with the aid of external components: two 8-bit counters and jumpers J1…J8.
Dumping the contents is thus pleasantly simple. All signals are available on standard 0.1” pin headers. All you need is a TTL-compatible SPI adapter or a logic analyser.
Thankfully, the initial program bitstream is very well documented.
Every 256 data bits are accompanied by 37 control bits in the bit stream, so each Icache block takes up 293 bits. Hence 8 kBytes of SROM data (65536 bits) populate 223 complete Icache blocks, or 7136 bytes, as stated above. In practice, the CPU actually tries to read in all 256 blocks, or 75008 bits (9376 bytes). But the 16-bit address counter wraps around on the DEC 3000 AXP, and the last 33 Icache blocks fill up with nonsense (bit-shifted contents of the beginning of the SROM bit stream, with data and control bits ending up in the wrong places).
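The block arithmetic can be re-derived in a few lines. This is only a sketch of the numbers quoted in the text; the constant names are mine and are not taken from any DEC documentation.

```python
# Icache fill arithmetic for one SROM bit stream (names are illustrative).
DATA_BITS = 256                          # instruction bits per Icache block
CONTROL_BITS = 37                        # control bits per block
BLOCK_BITS = DATA_BITS + CONTROL_BITS    # bits per block in the stream
STREAM_BITS = 8 * 1024 * 8               # one 8 kB stream
ICACHE_BLOCKS = 256                      # blocks the CPU tries to load

complete_blocks = STREAM_BITS // BLOCK_BITS        # blocks that fit before the wrap
payload_bytes = complete_blocks * DATA_BITS // 8   # executable bytes in those blocks
bits_requested = ICACHE_BLOCKS * BLOCK_BITS        # bits the CPU clocks in
garbage_blocks = ICACHE_BLOCKS - complete_blocks   # blocks affected by the wrap

print(complete_blocks, payload_bytes, bits_requested,
      bits_requested // 8, garbage_blocks)
# 223 7136 75008 9376 33
```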
Having downloaded all eight bit streams, I extracted the actual executable contents with a simple Python script. In populated cache blocks, bits V (valid) and ASM (address space match) are set to one. The remaining control fields – TAG, ASN, and BHT – are zeros. Thus, the machine code in the Icache looks as if it were filled from memory at physical address 0.
Hexadecimal dumps of the raw bit streams and executable images contained within are available here:
Image | Description | Length in bytes | Raw bitstream hex dump | Executable image hex dump |
0 | Powerup Sequence | 7008 | srom0-bitstream-hex.txt | srom0-image-hex.txt |
1 | Mini-Console at 19200 baud | 7008 | srom1-bitstream-hex.txt | srom1-image-hex.txt |
2 | Mini-Console at 9600 baud | 7008 | srom2-bitstream-hex.txt | srom2-image-hex.txt |
3 | Cache Test (longword) | 1024 | srom3-bitstream-hex.txt | srom3-image-hex.txt |
4 | Mfg Test – bctest | 5632 | srom4-bitstream-hex.txt | srom4-image-hex.txt |
5 | Empty (no output) | 1024 | srom5-bitstream-hex.txt | srom5-image-hex.txt |
6 | LongWord Memory test (no cache) | 5120 | srom6-bitstream-hex.txt | srom6-image-hex.txt |
7 | LongWord Memory test (cache on) | 5120 | srom7-bitstream-hex.txt | srom7-image-hex.txt |
For those who’d like to follow at home, I’ll pack everything up into a zip at the end of the chapter for your downloading convenience.
Now I can interleave the raw streams bitwise to get the full PROM content (as programmed into the 27C512). The least significant bit of each byte comes from bit stream 0, the next one from stream 1 and so on. In Python this can be done like this:
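# Read the eight raw streams, stream 7 first, so that stream 0 lands in the LSB.
bitstreams = [open(f"srom{i}-raw.bit", "rb").read() for i in range(7, -1, -1)]
# Each group of 8 stream bytes (one per stream) is transposed into 8 PROM bytes.
combined = b''.join([
    int("".join(map("".join, zip(*map("{:08b}".format, bits)))), 2)
    .to_bytes(8, "little")
    for bits in zip(*bitstreams)])
with open("srom-image.bin", "wb") as f:
    f.write(combined)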
Here’s what the result looks like in hexadecimal: full-srom-hex.txt.
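For readers who find the one-liner hard to follow, here is a more explicit sketch of the same interleaving rule. The function name `interleave` is mine, and it assumes, like the snippet above, that each raw stream file packs its bits LSB-first (bit j of a stream byte is the j-th bit in time).

```python
def interleave(streams):
    """Combine eight equal-length bit streams into PROM bytes.

    streams[i] holds the raw bytes of bit stream i; stream i drives
    data bit i of every PROM byte.
    """
    assert len(streams) == 8 and len({len(s) for s in streams}) == 1
    prom = bytearray()
    for group in zip(*streams):            # one byte from each stream
        for j in range(8):                 # j-th bit in time order
            byte = 0
            for i, value in enumerate(group):
                byte |= ((value >> j) & 1) << i
            prom.append(byte)
    return bytes(prom)

# Toy input: stream 0 is all ones, the others all zeros,
# so only data bit 0 is set in every PROM byte.
toy = [b"\xff"] + [b"\x00"] * 7
print(interleave(toy).hex())   # 0101010101010101
```

Eight stream bytes always expand into eight PROM bytes, so the output is eight times the length of a single stream.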
I noticed something odd. At the very end of each bitstream, after all the zeros that pad it to 8 kB, there is an unknown byte.
Bitstream 0: 1ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d8
Bitstream 1: 1ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0
Bitstream 2: 1ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1a
Bitstream 3: 1ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d5
Bitstream 4: 1ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c1
Bitstream 5: 1ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0
Bitstream 6: 1ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 94
Bitstream 7: 1ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c7
Although this byte does make its way into the Icache, it’s not in a valid block (V=0) and therefore the processor can never see it. So it’s not meant for consumption by the code executing on the Alpha CPU. What is it then, some sort of a checksum or identification for external tools?
It does not look like a checksum: I’ve tried various types of CRC-8 as well as XOR and sum modulo 256, and none of them match the final byte. It’s not an ID either, as f0 is found in two different bit streams.
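For anyone who wants to repeat this kind of check, a minimal sketch follows. It is not the script I used; the helper names are made up, only one CRC-8 variant (polynomial 0x07, zero initial value, no reflection) is shown, and it assumes the candidate checksum covers everything before the final byte.

```python
from functools import reduce

def xor8(data):
    """XOR of all bytes."""
    return reduce(lambda a, b: a ^ b, data, 0)

def sum8(data):
    """Sum of all bytes modulo 256."""
    return sum(data) & 0xFF

def crc8(data, poly=0x07, init=0x00):
    """Bitwise MSB-first CRC-8 without reflection or final XOR."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

def matching_checks(stream):
    """Names of the checks whose value over stream[:-1] equals stream[-1]."""
    body, trailer = stream[:-1], stream[-1]
    checks = {"xor8": xor8, "sum8": sum8, "crc8/0x07": crc8}
    return [name for name, fn in checks.items() if fn(body) == trailer]

# Hypothetical usage with one of the dumped streams:
#   matching_checks(open("srom0-raw.bit", "rb").read())
```

Other CRC-8 flavours can be tried by passing a different `poly` or `init`; reflected variants would need a separate routine.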
Perhaps these bytes are not related to the bitstreams per se but are rather an artefact of some signature added to the full ROM image? Let’s have a look.
% hexdump -C -s 0xfff0 full-srom.bin
0000fff0 00 00 00 00 00 00 00 00 98 84 c8 05 6f 22 bb fb |............o"..|
00010000
Well, this doesn’t ring any bells either. I’ve tried 64- and 32-bit XORs and modulo sums, and checked it against all known CRC-64 and CRC-32 algorithms with CRC RevEng. No joy.
Could that be a timestamp? DEC’s flagship operating system was VMS (a.k.a. OpenVMS™). Here's what its timestamp would have looked like:
The operating system maintains the current date and time in 64-bit format. The time value is a binary number in 100-nanosecond (ns) units offset from the system base date and time, which is 00:00 o'clock, November 17, 1858 (the Smithsonian base date and time for the astronomic calendar).
Windows NT has gone back as far as 1 January 1601 for its epoch but kept the 100-nanosecond resolution, which it inherited from VMS together with Dave Cutler. And Unix time in seconds since 1 January 1970 UTC might also be an option, who knows.
The label on the ROM says “(C) DEC 89”. It may have been updated since then, but probably no later than 1997. Let’s see what VMS, Windows, and Unix timestamps look like in that time period.
OS | 1 January 1989 | 1 January 1998 |
VMS | 91e2e116bc4000 | 9bf9d0aa8c8000 |
Windows | 1b2ff58a00f8000 | 1bd164833dfc000 |
Unix | 23bd6a00 | 34aadc80 |
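The table values can be recomputed with the standard library. This is a sketch under simplifying assumptions: time zones and leap seconds are ignored, and the helper names are mine.

```python
from datetime import datetime, timedelta

TICKS_PER_SECOND = 10_000_000            # 100 ns units used by VMS and Windows NT

VMS_EPOCH = datetime(1858, 11, 17)       # Smithsonian base date
NT_EPOCH = datetime(1601, 1, 1)
UNIX_EPOCH = datetime(1970, 1, 1)

def seconds_since(epoch, when):
    """Whole seconds between two naive datetimes."""
    return (when - epoch) // timedelta(seconds=1)

def timestamps(when):
    """Candidate timestamp encodings of `when` as plain integers."""
    return {
        "VMS": seconds_since(VMS_EPOCH, when) * TICKS_PER_SECOND,
        "Windows": seconds_since(NT_EPOCH, when) * TICKS_PER_SECOND,
        "Unix": seconds_since(UNIX_EPOCH, when),
    }

for year in (1989, 1998):
    row = timestamps(datetime(year, 1, 1))
    print(year, {name: format(value, "x") for name, value in row.items()})
```

Running it prints the same hexadecimal values as the table for both dates.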
The bytes at the end of the SROM would have been in little endian, so we are looking at either two 32-bit numbers, 0x05c88498 (97027224₁₀) and 0xfbbb226f (−71622033₁₀ if taken as signed), or the 64-bit number 0xfbbb226f05c88498. None of these looks in any way related to the timestamp ranges above.
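The little-endian readings of those eight bytes can be reproduced with `struct`. The byte string is copied from the hexdump above; the range limits come from the timestamp table, and the variable names are mine.

```python
import struct

trailer = bytes.fromhex("9884c8056f22bbfb")    # last 8 bytes of the PROM image

lo32, hi32 = struct.unpack("<II", trailer)     # two unsigned 32-bit words
lo32s, hi32s = struct.unpack("<ii", trailer)   # the same words, signed
(q64,) = struct.unpack("<Q", trailer)          # one unsigned 64-bit word

print(hex(lo32), lo32s)    # 0x5c88498 97027224
print(hex(hi32), hi32s)    # 0xfbbb226f -71622033
print(hex(q64))            # 0xfbbb226f05c88498

# Compare against the 1989..1998 windows from the table.
unix_hit = any(0x23bd6a00 <= v <= 0x34aadc80 for v in (lo32, hi32))
vms_hit = 0x91e2e116bc4000 <= q64 <= 0x9bf9d0aa8c8000
nt_hit = 0x1b2ff58a00f8000 <= q64 <= 0x1bd164833dfc000
print(unix_hit, vms_hit, nt_hit)               # False False False
```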
And so the mystery remains unsolved. What are those bytes? If you happen to know, please write to cetus at cetus.sh.
24-MAR-2024
Now that we’ve come this far, let’s have a quick look at how SROM code works. This is interesting for several reasons.
Below are links to the disassembled code next to the respective SROM output on my machine. (Corresponding binaries can be found in a zip at the bottom.)
Image | SROM output |
0 | DEC 3000 - M800 SROM 6.1 Powerup Sequence ff.fd.fb.fa.f9.f8.f7.f6.f5.f4.f3.f2.f1.f0. sysROM 00000069.0000087a ioROM 00000069.00000212 MCRstat 11111111.11801180 bnkSize 00000200.00000500 memSize 00000040.00000040 |
1 | DEC 3000 - M800 SROM 6.1 Mini-Console ff.fd.fb.fa.f9.f8.f7.f6.f5.f4.f3.f2.f1.f0. sysROM 00000069.0000087a ioROM 00000069.00000212 MCRstat 11111111.11801180 bnkSize 00000200.00000500 memSize 00000040.00000040 SROM> |
2 | DEC 3000 - M800 SROM 6.1 Mini-Console ff.fd.fb.fa.f9.f8.f7.f6.f5.f4.f3.f2.f1.f0. sysROM 00000069.0000087a ioROM 00000069.00000212 MCRstat 11111111.11801180 bnkSize 00000200.00000500 memSize 00000040.00000040 SROM> |
3 | Cache Test (longword) r13 = 00000000, r14 = 00080000...done. r13 = 00080000, r14 = 00100000...done. r13 = 00100000, r14 = 00180000...done. r13 = 00180000, r14 = 00200000...done. r13 = 00200000, r14 = 00280000...done. r13 = 00280000, r14 = 00300000...done. r13 = 00300000, r14 = 00380000...done. r13 = 00380000, r14 = 00400000...done. r13 = 00400000, r14 = 00480000...done. |
4 | DEC 3000 - M800 SROM 6.1 Mfg Test ff.fd.fb.f0. MCRstat 11111111.11801180 bnkSize 00000200.00000500 memSize 00000040.00000040 bctest d12345678 00000000 D12345678 00000001 D12345678 00000002 d12345678 00000003 |
5 | (no output) |
6 | DEC 3000 - M800 SROM 6.1 Mfg Test ff.fd.fb.f0. MCRstat 11111111.11801180 bnkSize 00000200.00000500 memSize 00000040.00000040 memTest (no-cache) LongWord Memory Test ....done. ....done. ....done. |
7 | DEC 3000 - M800 SROM 6.1 Mfg Test ff.fd.fb.f0. MCRstat 11111111.11801180 bnkSize 00000200.00000500 memSize 00000040.00000040 memTestCacheOn LongWord Memory Test ....done. ....done. ....done. |
The default SROM program (image 0) starts by initialising the internal processor state. Then something interesting happens at address 100₁₆:
100: 227f0a0d lda r19, 0xa0d ; '\r\n'
104: 26730a0d ldah r19, 0xa0d(r19) ; '\r\n'
108: d3a002e8 bsr r29, 0xcac
10c: 227f3033 lda r19, 0x3033 ; '30'
110: 26733030 ldah r19, 0x3030(r19) ; '00'
114: 4a641733 sll r19, 0x20, r19
118: 22734544 lda r19, 0x4544(r19) ; 'DE'
11c: 26732043 ldah r19, 0x2043(r19) ; 'C '
120: d3a002e2 bsr r29, 0xcac
124: 227f2d20 lda r19, 0x2d20 ; ' -'
128: 26730020 ldah r19, 0x20(r19) ; ' '
12c: d3a002df bsr r29, 0xcac
130: 227f0020 lda r19, 0x20 ; ' '
134: 4a641733 sll r19, 0x20, r19
138: 2273384d lda r19, 0x384d(r19) ; 'M8'
13c: 26733030 ldah r19, 0x3030(r19) ; '00'
140: d3a002da bsr r29, 0xcac
It builds up ASCII strings of up to 8 characters in register r19 and calls 0xcac, presumably to send them to the SROM debug port. The return address for each call is stored in r29. Let’s see what happens there.
cac: 4a603630 zapnot r19, 0x1, r16
cb0: d3c0003d bsr r30, 0xda8
cb4: 4a611693 srl r19, 0x8, r19
cb8: f67ffffc bne r19, 0xcac
cbc: 6bfd8000 ret r31, (r29), 0
The string in r19 is handed out one byte at a time to a subroutine at 0xda8. This time the parameter is passed in r16, and the return address in r30.
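To make the string packing concrete, here is a small Python model of the instructions involved. It is a sketch rather than an emulator: `lda`, `ldah` and `emit` are my own helper names, only the arithmetic visible in the listings is modelled, and r31 is taken to read as zero.

```python
MASK64 = (1 << 64) - 1

def sext16(value):
    """Sign-extend a 16-bit displacement, as LDA and LDAH do."""
    return value - 0x10000 if value & 0x8000 else value

def lda(base, disp):
    return (base + sext16(disp)) & MASK64

def ldah(base, disp):
    return (base + (sext16(disp) << 16)) & MASK64

def emit(r19):
    """Model of the loop at 0xcac: the bytes handed to 0xda8, in order."""
    out = bytearray()
    while True:
        out.append(r19 & 0xFF)     # zapnot r19, 0x1, r16
        r19 >>= 8                  # srl r19, 0x8, r19
        if r19 == 0:               # bne r19, 0xcac not taken
            break
    return bytes(out)

# Instructions 0x10c..0x11c from the first listing.
r19 = lda(0, 0x3033)
r19 = ldah(r19, 0x3030)
r19 = (r19 << 32) & MASK64         # sll r19, 0x20, r19
r19 = lda(r19, 0x4544)
r19 = ldah(r19, 0x2043)
print(emit(r19))                   # b'DEC 3000'
```

Because the loop stops as soon as the remaining value is zero, strings shorter than 8 characters need no explicit length or terminator.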
da8: e5000009 beq r8, 0xdd0
dac: 22df0014 lda r22, 0x14
db0: 2210ff00 lda r16, -0x100(r16)
db4: 4a00b730 sll r16, 0x5, r16
db8: 77ff0033 hw_mtpr/i r31, 0x13 ; r31 -> SL_CLR
dbc: 76100036 hw_mtpr/i r16, 0x16 ; r16 -> SL_XMIT
dc0: 4a003690 srl r16, 0x1, r16
dc4: 42c03536 subq r22, 0x1, r22
dc8: d3600009 bsr r27, 0xdf0
dcc: f6dffffa bne r22, 0xdb8
dd0: 6bfe8000 ret r31, (r30), 0
The character code in r16 is surrounded by start and stop bits and then sequentially shifted out onto the sRomClk_h pin via bit 4 of the SL_XMIT register. The total shift count is 0x14 = 20₁₀, which is somewhat excessive: after the start bit and 8 data bits, it generates 11 stop bits!
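The framing trick is easier to see when the register arithmetic is replayed in Python. The sketch below only mirrors the listing: `frame_bits` is a made-up name, the value of bit 4 is recorded on every pass through the loop, and nothing is assumed about how the pin level maps to mark and space.

```python
MASK64 = (1 << 64) - 1

def frame_bits(char_code, shift_count=0x14):
    """Sequence of bit-4 values written to SL_XMIT for one character."""
    r16 = (char_code - 0x100) & MASK64   # lda r16, -0x100(r16): ones above the byte
    r16 = (r16 << 5) & MASK64            # sll r16, 0x5, r16: zeros below it
    bits = []
    for _ in range(shift_count):         # r22 counts down from 0x14
        bits.append((r16 >> 4) & 1)      # bit 4 is what reaches the pin
        r16 >>= 1                        # srl r16, 0x1, r16
    return bits

bits = frame_bits(ord("D"))
print("".join(map(str, bits)))           # 00010001011111111111
print(bits[0], bits[1:9], sum(bits[9:])) # 0 [0, 0, 1, 0, 0, 0, 1, 0] 11
```

Subtracting 0x100 fills every bit above the character with ones, so the stop bits come for free; the shift left by 5 leaves a zero in bit 4, which becomes the start bit.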
A subroutine at 0xdf0 is called between bit shifts, this time via r27.
df0: 47ff0415 clr r21
df4: 201f00c8 lda r0, 0xc8
df8: 4c0d1400 mulq r0, 0x68, r0
dfc: 48150680 srl r0, r21, r0
e00: 22bf0001 lda r21, 1
e04: 4aa41735 sll r21, 0x20, r21
e08: 76b50051 hw_mtpr/a r21, 0x11 ; r21 -> CC_CTL
e0c: 613fc000 rpcc r9
e10: 4921f629 zapnot r9, 0xf, r9
e14: 400909b5 cmplt r0, r9, r21
e18: e6bffffc beq r21, 0xe0c
e1c: 6bfb8000 ret r31, (r27), 0
This is simply a busy-wait loop. The RPCC instruction reads the processor cycle counter, which is continuously compared with 0xc8⋅0x68 = 20800₁₀. As the CPU is clocked at 200 MHz, this gives us 2⋅10⁸ / 20800 ≈ 9615 baud.
As you may recall from Chapter 7, images 1 and 2 communicate at different baud rates but otherwise appear to implement the same interactive miniconsole. Now, this just compels me to compare their code:
Image 1 (19200 baud):
...
de8: 47ff0415 clr r21
dec: 201f00c8 lda r0, 0xc8
df0: 4c069400 mulq r0, 0x34, r0
df4: 48150680 srl r0, r21, r0
...
Image 2 (9600 baud):
...
de8: 47ff0415 clr r21
dec: 201f00c8 lda r0, 0xc8
df0: 4c0d1400 mulq r0, 0x68, r0
df4: 48150680 srl r0, r21, r0
...
Indeed, the only difference is the multiplication constant in the UART timing loop! (These programs use the same bit-bash UART subroutines as image 0, just at a slightly different address.)
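The two constants translate into bit rates as follows. This is a back-of-the-envelope sketch: it uses the 200 MHz clock mentioned earlier, treats the product of the two constants as the number of cycles per bit, and ignores the few instructions executed around the wait loop. The names are mine.

```python
CPU_HZ = 200_000_000    # CPU clock quoted in the text
BASE = 0xC8             # first constant loaded into r0

def baud(multiplier):
    """Approximate bit rate for a given mulq constant."""
    cycles_per_bit = BASE * multiplier
    return CPU_HZ / cycles_per_bit

for name, multiplier in (("image 1", 0x34), ("image 2", 0x68)):
    cycles = BASE * multiplier
    print(f"{name}: {cycles} cycles per bit, about {baud(multiplier):.0f} baud")
# image 1: 10400 cycles per bit, about 19231 baud
# image 2: 20800 cycles per bit, about 9615 baud
```

Both results are within about 0.2% of the nominal 19200 and 9600 baud, which is well inside what a typical UART tolerates.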
For those playing at home, here is the zip with everything in it:
SROM_6.1_DEC_3000_M800.zip