Sunday 29 March 2020

Commodore PET checkerboard and one odd character on the screen

This is an old post, preserved for reference.
The products and services mentioned within are no longer available.

When you have a problem with an early Commodore PET, you often see a screen full of random characters.
Less frequently, as with this one, you get a screen with a check pattern.
This is made up from character 255, which is four squares, two black and two white, something like a slice of goth battenberg.
That is caused when the video RAM is not reading correctly, and the data lines all read as 1.
That can be caused by the RAM chips, the latches that read out of them, or the enable lines that control all of the above.
This board had one of my PET ROM/RAM boards fitted, but that hadn't fixed the issue, so it had been sent in for repair.
In this case, the problem with the screen was not the main issue. Everything was running hot, and the current draw was about 3.2A (this photo was taken a bit later, when the initial problem had been resolved).
The 6540 ROM and 6550 RAM chips often run hot (as they have lots of issues with failing enable lines and bus contention), but the CPU and IO chips were also warm, so something was a bit wrong.
Checking around the board, the 74154 is used as the main part of the address decoding, and generates 16 enable lines, one for each 4K block across the address range. All of the outputs were low, so everything was enabled at the same time. I powered it off immediately and did a bit more checking around, and I thought it best to remove all the ROM and RAM chips at this stage, in case they were still OK.
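As a rough illustration of what the decoder should be doing, here is a minimal Python sketch of a 4-to-16 decoder with active-low outputs. It assumes the top four address bits feed the select inputs, which is how you would end up with one line per 4K block. The function name and the `enabled` flag are made up for this example and are not taken from the schematic.

```python
def decode_74154(address, enabled=True):
    """Model a 4-to-16 decoder with active-low outputs.

    Assumption: the top four bits of the 16-bit address pick the output,
    giving one enable line per 4K block.
    """
    block = (address >> 12) & 0xF
    # Active low: only the selected output is 0, all the others stay at 1.
    return [0 if (enabled and line == block) else 1 for line in range(16)]


outputs = decode_74154(0x8000)
print(outputs.index(0), outputs.count(1))  # which block is selected, how many lines are idle
```

On a healthy board only one of the sixteen outputs should be low at any moment, so sixteen lows at once means every block is being selected together.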
I had noticed and fixed a couple of chips with bent legs, and one with a missing leg. Not ideal, but not enough to cause these problems.
With the ROM and RAM chips and the IO chips removed, I replaced the CPU with a NOP generator and started probing around to see if the 74154 was at fault. All the clock inputs were there and wiggling up and down nicely in sequence, but all sixteen outputs were low. I would normally have taken that to mean a faulty 74154. However, one other thing I noticed was that the 5V pin was also low, measuring 0.6V in fact.
The PET 2001 board has four of these 7805 regulators, but they are not commoned, and run different sections of the board. The regulator which runs the 74154 and the ROM chips was the one that had failed. Often this can be caused by a faulty chip pulling the rail low, so I tried initially lifting the output leg of the regulator and checking its output, and it was still low.
Just to be sure, I lifted all the legs and connected the output of one of the other regulators to the missing rail (I used the one that powers the RAM chips, since they were currently removed).
Powering that up, the rail was back to 5V, and the outputs of the 74154 were doing the right thing, so I replaced the 7805 with a new one and removed the wire link.
With that fitted, and the link removed, all the rails were back, so I started putting things back.
I fired it up with a ROM/RAM board fitted and it all seemed to be working.
I started refitting the RAM chips; the minimum required is 2K.
So far, so good, so I went for the full 8K. I had to fix the bent pins, and the one broken leg.
The full 8K was working, and remarkably, all the ROM chips were also working. 8K RAM and BASIC version 1.
I don't think I've ever actually seen a full working set of 6540s and 6550s, as they are so good at self destructing. The current draw with all those installed was just under 3A (the picture from above).
I ran a memory test (having to load it from tape since I couldn't use an SD2PET) and it passed. I swapped out two of those chips with the video RAM and retested, but I was still seeing the odd issue on the screen (more on that in a moment).
However, given their fragility, and as BASIC 1 has some bugs and doesn't support disk drives, and 8K RAM is a bit limiting, I would recommend putting them away for Sunday best. If you actually want to use the PET, use a ROM/RAM board instead and get 32K and BASIC 4.
Without the 6540 ROM and 6550 RAM chips, the current went down to under 1A, so a definite improvement: less power, less heat, less stress, greater reliability.
With that, the PET seemed to be working; there was just this odd single character that was occasionally showing the wrong thing. Here you can see an asterisk about halfway down the screen.
Normally that is down to a bad RAM chip, but here it didn't seem to be related to the RAM chips, as they were passing the RAM tests and the same fault was persisting when both chips were swapped out.
The same fault would come back at other times, but generally in one of two positions, the 16th character of the 7th line, or the 32nd character of the 13th line.
If you add those up, that works out to be the 256th and the 512th characters on the screen. Hmm, powers of two. I had a theory, and to test it, I waited until it was showing the asterisk and then moved the cursor to the top left of the screen, the 0th character, and typed in a letter A.
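The sums can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the 40 column screen of the PET 2001 and counting characters within a line from 0 (to match the "0th character" above); `screen_position` is just a name picked for the example.

```python
COLUMNS = 40  # PET 2001 screen width in characters


def screen_position(offset):
    """Convert an offset into screen memory to (line, column), both counted from zero."""
    return divmod(offset, COLUMNS)


for offset in (256, 512):
    line, column = screen_position(offset)
    # Lines are printed counting from 1, characters within the line counting from 0.
    print(f"offset {offset}: line {line + 1}, character {column}")
```

Offset 256 lands on the 7th line at character 16, and offset 512 on the 13th line at character 32, which are the two positions where the fault kept appearing.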
Well, would you look at that, the 256th character also changed to an A. So what I think is happening is that the counter which generates the video address is a bit slow when it comes to reading the 256th character, and instead is reading the 0th. As the address changes from 00FF to 0100, I think it behaves like one of those cascaded counters where each bit turns over slightly after the previous one (I had a similar problem with the Minstrel 3, seen here glitching between 3 and 4).
So, rather than going from 00FF to 0100, it occasionally goes to 0000 briefly between 00FF and 0100, so at the time it is drawing the screen it reads the character at address 0000 rather than 0100.
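To show why 0000 can turn up in the middle of that step, here is a small Python model of an idealised ripple counter, where each bit only toggles after the one below it has fallen from 1 to 0. It is a sketch under that assumption, with a 10 bit counter to match the address width described here. It is not a model of the actual 74107 and 74177 wiring, and `ripple_increment` is a made-up name.

```python
def ripple_increment(value, bits=10):
    """List every value an idealised ripple counter passes through on one count.

    Each stage toggles in turn, starting from bit 0. A stage that falls
    from 1 to 0 clocks the next stage; a stage that rises from 0 to 1 ends the ripple.
    """
    states = []
    current = value
    for bit in range(bits):
        current ^= 1 << bit          # this stage toggles
        states.append(current)
        if current & (1 << bit):     # it went 0 -> 1, so nothing is passed on
            break
    return states


for start in (0x0FF, 0x1FF):
    steps = ripple_increment(start)
    print(f"{start:04X} ->", " ".join(f"{s:04X}" for s in steps))
```

In this model both 00FF to 0100 and 01FF to 0200 pass briefly through 0000 just before settling, so anything that samples the address a little too early at those two points would read the character at the top left of the screen.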
I spent a while trying to trace this through the schematic, and with a tangle of wires and a logic analyser, but I couldn't see anything conclusive.
There are three counter chips: the first two bits use the two halves of a 74107, and the last 8 bits use two 74177s.
The video RAM is accessed by both the video circuitry and the 6502 CPU, and the two address busses are multiplexed using three 74157s.
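The switching can be sketched in Python as three 4 bit two-way selectors working side by side. This is only an illustration: which input carries the video address, which carries the CPU address, and the polarity of the select line are assumptions made for the example, and the function names are invented.

```python
def mux_74157(a, b, select, strobe=0):
    """Quad 2-to-1 selector: pass 4-bit input a when select is 0, b when select is 1.

    A high strobe forces all four outputs low.
    """
    if strobe:
        return 0
    return (b if select else a) & 0xF


def select_address(video_addr, cpu_addr, select):
    """Build one address from three 4-bit selectors sharing a select line.

    Assumption: select=0 passes the video address, select=1 the CPU address.
    """
    result = 0
    for chip in range(3):
        shift = 4 * chip
        nibble = mux_74157((video_addr >> shift) & 0xF,
                           (cpu_addr >> shift) & 0xF,
                           select)
        result |= nibble << shift
    return result


print(f"{select_address(0x0FF, 0x100, 0):04X}")  # video side selected
print(f"{select_address(0x0FF, 0x100, 1):04X}")  # CPU side selected
```

The model switches all three chips at exactly the same instant, which is the part real chips of mixed types and ages may not manage.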
74157s do work hard, switching continually at a very high speed, so they are a bit prone to failing or starting to switch more slowly, which could explain some of the bits of the address being delayed. Unlike the counters, these are still in production and available, so I thought it would be a good bet. I also noticed that two were 74157s and one was a 74LS157, so not particularly well matched.
I replaced the one which switched the higher address lines, and initially it appeared to resolve the problem, but all it did was reduce the frequency of the problem, and it did come back. I thought it was best to replace the other 74157 and the older 74LS157 with new 74LS157s, to make sure they all switch together. That again seemed to help things, but I could still get it to happen occasionally during IO operations.
In the end, it was replacing the 74LS107 counter that fixed it. I've run the board for a long time with the PET ROM/RAM board, 32K BASIC 4 and an SD2PET.
That's all done now. Quite an odd problem, but I've not been able to recreate it after several hours of testing, and also a cold start the following morning, so that seems to have fixed it.