Several readers of this blog contacted me to suggest programs for testing how closely the Harlequin matches a 48K Spectrum. Most came from the demo scene and feature full-screen rasters, which are notoriously difficult to get working if emulator or clone timings are not exact. Some, however, were test-and-measure utilities used in emulator test suites.
Fusetest runs through a couple of undocumented Z80 flag features, an LDIR test at the contended and uncontended memory boundary, and a contended IO test.
ULA Test 3 displays a matrix of T-states and marks on it where the floating bus reads occur; IO contention is shown as inverted text.
ULA Test 3 results were interesting on the Harlequin, and showed that timings were still a little off: the video byte fetches occurred two clock cycles early, even though Floatspy showed the interval between the interrupt and the first byte fetch to be only 1 T-state shorter than on a real Spectrum.
Delaying the byte fetch
It was rather tricky to adjust the byte fetch timing to be one or two cycles later because the OutLatch signal moves from the end of one 8-cycle period into the start of the next. See diagrams Wait To Late and Adjusted Fetch for a comparison. This causes problems latching bytes from memory into the output registers because a latch will occur at the start of a display line (cycle 2) before the first video byte is read for that line (cycle 4). Another problem is that the last output latch for a line now occurs in the next 8-cycle period, when we've switched to generating the border.
In order to avoid this false first latch and allow the last latch to complete, we can delay Den (Display Enable) for 2 cycles by clocking it with HC2 through a D-Type flip-flop. This produces a Den signal that starts 2 cycles into the first 8-cycle period, and ends 2 cycles after the last 8-cycle period.
Generating OutLatch directly from HCn and Den with simple logic, instead of from U8, gives an OutLatch sequence that begins and ends whilst we have data to latch. There will be 32 OutLatch pulses per line.
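The effect of the delayed Den on the latch pulses can be checked with a small Python model. This is only a sketch under stated assumptions, not a description of the actual gate arrangement: the cycle numbering, the strobe position (cycle 2 of each 8-cycle period), the 448-cycle line length, and the rule that a strobe coinciding with a Den transition sees the level Den held just before it are all my own modelling choices, picked so that the numbers agree with the text above. The function names are hypothetical.

```python
PERIOD = 8            # pixel-clock cycles per display byte
BYTES_PER_LINE = 32   # display bytes per line
STROBE_PHASE = 2      # assumed position of the latch strobe in each period
DEN_DELAY = 2         # delay applied to Den, in pixel-clock cycles
LINE_LENGTH = 448     # 224 T-states per line, two pixel clocks per T-state


def den(cycle):
    """Undelayed Display Enable: active for the 32 byte periods of a line."""
    return 0 <= cycle < PERIOD * BYTES_PER_LINE


def den_delayed(cycle):
    """Den after the 2-cycle delay (modelled as a plain shift in time)."""
    return den(cycle - DEN_DELAY)


def out_latch_cycles(enable):
    """Cycles at which a latch strobe gets through the enable gate.

    Assumption: a strobe landing on the same edge as an enable transition
    sees the level the enable held just before that edge.
    """
    return [c for c in range(LINE_LENGTH)
            if c % PERIOD == STROBE_PHASE and enable(c - 1)]


undelayed = out_latch_cycles(den)
delayed = out_latch_cycles(den_delayed)

print("undelayed Den:", undelayed[0], "to", undelayed[-1], len(undelayed), "pulses")
print("delayed Den:  ", delayed[0], "to", delayed[-1], len(delayed), "pulses")
```

In this model the undelayed Den lets the false latch at cycle 2 through and cuts off the one needed at cycle 258, while the delayed Den gives 32 pulses running from cycle 10 to cycle 258.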
The first real output latch occurs 10 pixel-clock cycles into a line, and Vout/Border needs to be delayed until this point. Currently the Harlequin cascades D-type flip-flops to achieve the delay, but cascading 10 flip-flops is excessive, and more than an octal D-type latch provides. There is however a shortcut.
We require that Vout/Border be delayed until the first display byte of a row is latched into the output shift register, and that it is held low until the last display byte has been shifted to the screen. In other words, from the first OutLatch pulse to the 33rd.
Using OutLatch to clock Den through a flip flop synchronises the change of state of Den with OutLatch. This gives us a signal which is active for exactly the period we require, and so becomes the new Vout/Border signal, generated with a single D-type flip-flop instead of a cascade!
There is a small problem with the design above. OutLatch is produced using Den and so cannot be used to clock Den through a flip-flop, as Den will always be low when OutLatch goes high. We can however use the signal that is ORed with Den in creating OutLatch, and use this to clock the D-type and generate Vout/Border.
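A rough Python model of that single flip-flop shows why the resulting window is the right length. It is a sketch only: the ungated strobe is assumed to rise at cycle 2 of every 8-cycle period, the delayed Den is assumed active from cycle 2 up to (not including) cycle 258, and the flip-flop is assumed to capture the level Den held just before each strobe edge. Signal polarity is ignored, so `True` simply means "display data is being shown" rather than border.

```python
PERIOD = 8                     # pixel-clock cycles per display byte
STROBE_PHASE = 2               # assumed position of the ungated strobe
DEN_START, DEN_END = 2, 258    # assumed delayed Den window (end exclusive)
LINE_LENGTH = 448              # 224 T-states per line, two pixel clocks each


def den_delayed(cycle):
    """Delayed Display Enable level during the given cycle."""
    return DEN_START <= cycle < DEN_END


def display_window():
    """Per-cycle output of a D-type clocked by the ungated strobe."""
    q = False
    window = []
    for cycle in range(LINE_LENGTH):
        if cycle % PERIOD == STROBE_PHASE:
            # A D-type captures the level D held just before the clock edge.
            q = den_delayed(cycle - 1)
        window.append(q)
    return window


window = display_window()
first_cycle = window.index(True)
last_cycle = len(window) - 1 - window[::-1].index(True)

print("display active from cycle", first_cycle, "to", last_cycle)
print("active cycles:", sum(window))
```

Under these assumptions the output switches at cycle 10, where the first OutLatch pulse falls, and switches back at cycle 266, where a 33rd pulse would fall, giving 256 active cycles: 32 bytes of 8 pixels each.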
The horizontal control and wait generator schematics version 1.13 show the cleaned up OutLatch generation and adjusted byte fetch.
The following timing diagram shows the adjusted fetch and wait timings as produced by the U6 3-to-8 line decoder (the alignment shift for WAIT discussed in New Contention Model is not shown, as these are the signals as generated before the shift):
Testing the adjusted byte fetch
After implementing the new OutLatch generation and adjusting the byte fetch to be one cycle later, I ran ULA Test 3:
Running Floatspy on these partially adjusted timings still showed the first display byte read occurring at 14346, not 1 T-state later. But, interestingly, the contended floating bus read now returns values at T-states
indicating that an extra T-state has appeared between the 2nd and 3rd contended float reads.
Moving the video byte fetches one cycle later again (as shown in the timing diagram above) displays results comparable to a real Spectrum:
The contended floating bus timings returned from Floatspy however are now:
This is exactly the offset sequence shown by a ZX Spectrum, but one T-state early, as the Spectrum begins at 14344.
The fact that the first offset between floating bus reads is 2 must have something to do with the way Floatspy implements its timing, and must be subject to the effects of memory contention. Moving the memory contention alone earlier or later introduces some instability in operation (as shown previously) but changes the start T-state of both the contended and uncontended floating bus reads. Earlier contention increases the start T-state, presumably because the uncontended cycles are hit earlier, so more T-states of the timing loop are consumed.
Even though the start T-states are 1 cycle early, a test of Shock MegaDemo 3 seems in order. Shock has a full screen raster demo that will not work unless the machine timings are exact. And this means really exact!!
Here we go.....
This slightly untidy display was disappointing, as I was sure I'd resolved all the timing issues with the Harlequin. So I have not yet achieved 100% compatibility.
Shock MegaDemo performs some auto sync detection for 48/128/+2 Spectrums so that it operates correctly. My initial suspicion is that it is misdetecting the Harlequin. It's a shot in the dark, and some analysis is required.