The ZX Spectrum Reverse Engineering and Clone Design Blog

Harlequin

A site dedicated to the reverse engineering of the ZX Spectrum and related projects.


Testing The New Contention Model

Jul 20, 2007

I used Ramsoft's floating bus and interrupt test program to test the new contention model: measurements were good, but not perfect.

The Harlequin boots okay, but the first video byte during uncontended floating bus read is returned at T-state 14343, which is four T-states earlier than a real ZX Spectrum.

The contended floating bus read (say port 16639) does not return the usual 67, 255 x 7, 69, 255 x 7, 71 ... sequence but instead returns 67 x 8, 69 x 8, 71 x 8, etc. (where V x N means V repeats N times).
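To make the difference between the two sequences concrete, here is a minimal Python sketch that expands the "V x N" notation and counts where the observed read departs from the expected one. The helper names (`expand`, `mismatches`) are made up for illustration; only the byte values come from the text above.

```python
def expand(runs):
    """Expand [(value, count), ...] run-length pairs into a flat byte list."""
    return [value for value, count in runs for _ in range(count)]


def mismatches(expected, observed):
    """Return the indices where two equal-length sequences differ."""
    return [i for i, (e, o) in enumerate(zip(expected, observed)) if e != o]


# Usual sequence on a real machine: 67, 255 x 7, 69, 255 x 7, 71
expected = expand([(67, 1), (255, 7), (69, 1), (255, 7), (71, 1)])

# Sequence returned by the Harlequin: 67 x 8, 69 x 8, 71 (first 17 reads)
observed = expand([(67, 8), (69, 8), (71, 1)])

bad = mismatches(expected, observed)
print(len(expected), "reads compared,", len(bad), "differ")
```

Every read that should have seen the idle bus (255) instead repeats the previous attribute byte, which is what the mismatch list shows.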

Looking closely at the signals generated, I noticed that the CPU clock showed a variable phase during the video fetch, and most probably at all other times too. In all previous tests the clock had made a low-to-high transition at this point. See the ULA contention diagram.

I suspected that holding the CPU clock high instead of activating the WAIT CPU signal was the cause of this instability. I know from previous tests that I want the CPU clock to make an upwards transition during the video fetch latches AL₁/AL₂ to get an effective floating bus read (ie the downward T₃ clock transition at the end of the fetch).

During an IO instruction, IOREQ goes low shortly after the rising edge of the CPU clock at the start of the second T-state. It is at this point that we decide whether or not to pause the CPU. Until now I've been labelling the start of the T-states so that they line up with the video latch pulses. This is not correct, as a T-state starts with the rising edge of the CPU clock. Shifting the labelling one half CPU cycle left leaves a diagram that looks odd, even if it is technically correct:

[Timing diagram: AL₂, CLK_cpu, IOREQ and IO WAIT plotted against numbered markers, with the ULA IO cycle labelled T₁, T₂, (wait), T₂, T_W, T₃.]

This may well explain the unstable CPU clock. If our 'wait' period begins halfway through a T-state (as it does at '7' above), then as IOREQ goes low at the start of a T-state, we will have lost half a period's worth of 'wait' and thus won't hold the clock for a whole number of clock cycles. Therefore the clock transitions will become inconsistent and the clock phase will be seen to flip. Notice that in the diagram above, T_W and T₃ do not line up with the markers at 7 and 8.

The CLK_cpu shows the effect of responding to the IOREQ during T₂. CLK_cpu is created by dividing CLK₇ by 2 via a D-Type flip-flop, which we hold high during our 'wait'. Notice that CLK_cpu is held high for an extra half cycle after IO WAIT is removed (position 5 above). This is because the output of the CLK_3.5 D-Type changes state at the next positive edge of CLK₇, which will be half a CLK_3.5 period later.
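The phase flip can be reproduced with a very rough software model of the divider. This is only a sketch under stated assumptions: the flip-flop is treated as fully synchronous (it is forced high on any CLK₇ rising edge where 'wait' is asserted, and toggles otherwise), and the function name and edge numbering are invented for the example. It ignores gate delays and the asynchronous behaviour of the real circuit.

```python
def divide_by_two(n_edges, wait_edges=frozenset()):
    """Model a divide-by-2 flip-flop clocked by CLK7 rising edges.

    Returns the output level after each edge. On edges listed in
    wait_edges the output is forced high (the 'wait' hold); on all
    other edges it toggles. Simplified, synchronous assumption.
    """
    q = 0
    out = []
    for edge in range(n_edges):
        if edge in wait_edges:
            q = 1      # 'wait' asserted: hold the CPU clock high
        else:
            q ^= 1     # normal operation: toggle
        out.append(q)
    return out


free_running = divide_by_two(12)

# Clock stretched by one whole CPU cycle (high for 3 CLK7 periods).
whole = divide_by_two(12, wait_edges={4, 5, 6})

# Clock stretched by only half a CPU cycle (high for 2 CLK7 periods).
half = divide_by_two(12, wait_edges={4, 5})

print("free :", free_running)
print("whole:", whole, "phase kept:", whole[7:] == free_running[7:])
print("half :", half, "phase kept:", half[6:] == free_running[6:])
```

In this model, a hold that lengthens the high period by a whole CPU cycle rejoins the free-running clock in phase, while a hold that adds only half a cycle leaves every later edge inverted.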

The following scope picture shows IO WAIT aligned to the second AL₂ latch signal:

In order to guarantee a whole number of clock cycles are held high during a 'wait', the WAIT signal should be synchronised to the start of a T-state. We can do this by delaying the wait by half a cycle. The following diagram shows the effect of this on the generated CLK_cpu:

[Timing diagram: AL₂, CLK_cpu, IOREQ and IO WAIT plotted against markers 3 to 8, with the ULA IO cycle labelled T₁, T₂, (wait), T₂, T_W, T₃.]

Notice how the phase of CLK_cpu is maintained, and the clock is held high for an exact number of cycles.

This picture shows the WAIT signal delayed by half a CPU clock cycle:

There are two methods of aligning the WAIT signals (including MEM WAIT) with the start of a T-state:

1. CLK_3.5 starting low, WAIT delayed half a cycle to bring it in line with the rising edge of CLK_3.5 (the solution above).
2. CLK_3.5 inverted (starting high), WAIT not delayed as it will already be in line with the rising edge of the clock.

I tried both these options, with interesting results:

Solution 1:
The first video byte on floating bus read occurs at T-state 14346.
The contended floating bus read (say port 16639) returns 65, 255 x 1, 67, 255 x 5, 69, 255 x 7, 71, 255 x 7, 73, 255 x 7 ... starting at 14345.

Solution 2:
The first video byte on floating bus read occurs at T-state 14343 (ie early), and the read returns intermittent values.
The contended floating bus read returns repeating attribute values with no 255 bytes between them.

With solution 1, the floating bus auto test failed on a few reads, which was a surprise. However, the downward edge of CLK_cpu lags behind AL₂ by approximately 17ns.

This bothered me a bit, as any fluctuation in timing could cause a misread of the floating bus, which is probably what is happening. The Z80 performs an IO read slightly before the downward clock transition of T₃. It is essential that the value on the data bus is stable by this time, so if our clock transitions are too early, we will read intermittent values. I've not yet been able to pinpoint exactly when the IO read takes place in the Harlequin, but it must be happening soon after the data bus becomes stable during the video memory read, as doing the read 17ns earlier is enough to occasionally miss the byte.

To fix this, instead of dividing CLK₇ by 2 to give CLK_3.5, I derived CLK_3.5 from HC₀ (which oscillates at 3.5MHz) and passed it through the old dividing D-Type, which I clocked on the downward edge of CLK₁₄. This delays CLK_3.5 slightly and brings it in line with the video byte fetches.
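For a sense of scale, the short Python calculation below puts the 17ns figure next to the clock periods involved. It assumes that CLK₇ and CLK₁₄ run at 7MHz and 14MHz, as their names suggest; only the 3.5MHz figure for HC₀ is stated above. It does not claim to give the actual delay introduced by the re-clocked flip-flop.

```python
def period_ns(freq_mhz):
    """Return the period in nanoseconds of a clock given in MHz."""
    return 1000.0 / freq_mhz


t_state = period_ns(3.5)          # one CPU T-state
clk7 = period_ns(7.0)             # assumed CLK7 frequency
half_clk14 = period_ns(14.0) / 2  # assumed CLK14; half a period between edges

lag_ns = 17.0                     # measured lag of CLK_cpu behind AL2
fraction = lag_ns / t_state

print(f"T-state:          {t_state:6.1f} ns")
print(f"CLK7 period:      {clk7:6.1f} ns")
print(f"Half CLK14 cycle: {half_clk14:6.1f} ns")
print(f"17 ns lag is {fraction:.1%} of a T-state")
```

The lag is a small fraction of a T-state, which fits the observation that only an occasional read was missed rather than every one.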

Note: Because CLK_3.5 is now being generated from a stable 3.5MHz source (HC₀) it will always maintain phase, so the half cycle delay of IOWait discussed on this page no longer performs this function. See "Improving Stability" for a more accurate analysis of IOWait alignment.

It is important that the memory access, clock and clock-affecting signals are synchronised if spurious timing issues are to be avoided. For instance, if the WAIT signal and CLK are out of sync, our CLK_cpu will be full of glitches!

This new delayed CLK_3.5 cleared up the occasional misreads completely, and the auto test succeeds every time! And I'm much happier about the relationship between CLK_3.5 and AL₁ and AL₂.

Note: The misreads are in fact due to sampling the first idle floating bus value after a video fetch whilst the bus is stabilising. It is more appropriate and stable to pull the data bus up to VCC harder instead of delaying the clock. See Clock Alignment Stability for details.

Examining the combined memory and IO wait implemented in waitGen version 1.11 revealed a very visible glitch.

This was caused by the memory wait occurring slightly before the IO wait. This was easily resolved by buffering the memory wait through a spare gate before combining with the IO wait:

All of these features have been incorporated into the Clock Generator and Wait Generator schematics version 1.12.
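The memory/IO wait glitch and its fix can be illustrated with a toy discrete-time model. Everything in this sketch is an assumption made for illustration: the signals are treated as active-high, the combining gate is modelled as an OR, the waveforms are invented, and one list element stands for roughly one gate delay. The real waitGen logic may differ in polarity and gate type.

```python
def combine(mem_wait, io_wait):
    """OR two sampled signals together (assumed combining gate)."""
    return [m | i for m, i in zip(mem_wait, io_wait)]


def delay(signal, steps):
    """Delay a sampled signal by a number of steps, like a buffer gate."""
    return [signal[0]] * steps + signal[:len(signal) - steps]


# Hypothetical hand-over: the memory wait ends one step before
# the IO wait begins, leaving a gap between the two.
mem_wait = [1, 1, 1, 1, 0, 0, 0, 0]
io_wait = [0, 0, 0, 0, 0, 1, 1, 1]

glitchy = combine(mem_wait, io_wait)
clean = combine(delay(mem_wait, 1), io_wait)

print("combined, no buffer :", glitchy)
print("combined, mem delayed:", clean)
```

With the memory wait passed through one extra gate delay, the gap in the combined signal closes, which is the effect the spare buffer gate is described as having.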