More gold questions...

Pr0f
QL Wafer Drive
Posts: 1298
Joined: Thu Oct 12, 2017 9:54 am

More gold questions...

Post by Pr0f »

Well one main one actually...

The Gold Card has a 68000 on board, and internally on the card accesses its memory on a 16 bit wide bus.

I can see that the ROM chip on board is actually on the 'QL' bus side, and that the Gold Card offers out D0-D7 and A0-A17 (the lack of A18/19 preventing Aurora high res memory access).

I can see that the CPLD on the board has decode signals for high byte access and seems to allow either D0-D7 or D8-D15 to pass through to the QL main board.

That would be necessary for access to the QL's peripherals. I know that the 68000 can run a byte only access for either odd or even accesses, and see how that might translate through logic to selecting either D0-D7, or D8-D15 with the appropriate Data strobes, mimicking 8 bit access with DS and A0 from the 68008.

So the killer question:

How does the Gold Card work when a 16 bit access is made to the QL mainboard, as would be the case for instance when accessing any memory areas (ROM or RAM for instruction or word / longword data), and for peripheral access for instance to read the RTC as a long word which it does in the ROM disassembly, not to mention setting several 8 bit regs in the ZX8302 at init time with a single move.l ?

Clearly something rather clever is going on in the CPLD.

I read the 68000 notes and can see that you can force a peripheral chip access using VPA, but the wiring isn't there for that, and as far as I can see it doesn't work at all on a 68EC000. The other option, also not wired for, is to repeat the bus cycle using BERR and HALT together: read / write one byte the first time around and the other on the 2nd attempt, using 2 16 bit accesses and throwing alternate halves of the data away. That's probably how I would approach it, but it's costly in time.

Can anyone help in deducing the magic going on in the CPLD?


Nasta
Gold Card
Posts: 443
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: More gold questions...

Post by Nasta »

The biggest problem here is you are imagining a whole lot of 'magic' where there is none.
Have a look at what the 68008 does when accessing a word or long word - it simply does a series of byte accesses.
A 68000 does the same with word accesses (since its bus is word-wide) when doing a long word access.
The access 'template' is exactly the same - the only difference is A0/DS on 68008, vs. UDS/LDS on 68000.
Since the exact sequence (i.e. timing) of all signals is known (given by the datasheet), it is not difficult to emulate it using a state machine in a CPLD. This has to be done anyway since the 68000 on board a GC runs at a faster clock than the original 68008.
Basically, when the CPLD detects the 68000 signals LDS and UDS simultaneously, it emulates a 68008 access to the bus and with the help of some bus buffer chips does the following:
in case of writing:
- first writes the high byte, temporarily connecting D8..D15 of the 68000 to the QL 8-bit bus, also generating A0=L (even address) for the bus, and waiting for DTACKL as a normal 68008 would;
- then writes the low byte, generating A0=H (odd address) for the bus, at which point DTACKL is also given as DTACK to the 68000, since now both bytes have been written and the word write is complete.
in the case of reading:
- first reads the high byte and stores it in an intermediate buffer (note the 74HCT646 chip on the schematic diagram), also generating A0=L for the bus, and waiting for DTACKL as a normal 68008 would;
- then reads the low byte and presents it on D0..D7 of the 68000 while simultaneously presenting the contents of the intermediate buffer to D8..D15 of the 68000 to provide it a complete word of data, generating A0=H and again waiting for DTACKL, which is now also given as DTACK to the 68000, as completing this operation also completes a full word access.

It should be noted that this process automatically takes care of a long word access since the outside of the CPU sees any series of word accesses exactly the same as any other - it does not care if two consecutive or random words are read or written, assembling a long word out of two words is the CPU's business.
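
The read and write sequences above, and the way a long word falls out of them, can be sketched in a few lines of Python. This is only an illustration of the byte-serialising idea, not the Gold Card's actual CPLD logic: `QLBus`, `word_write`, `word_read` and `long_read` are made-up names, the 8-bit bus is modelled as a plain dictionary, and the DTACKL handshake is reduced to a function returning. It assumes the 68000's big-endian layout, where the high byte (D8..D15) sits at the even address.

```python
class QLBus:
    """Toy model of the 8-bit QL bus: one byte per access, with a log."""

    def __init__(self):
        self.mem = {}
        self.log = []  # (operation, address) for every byte access

    def write_byte(self, addr, value):
        self.log.append(("W", addr))
        self.mem[addr] = value & 0xFF

    def read_byte(self, addr):
        self.log.append(("R", addr))
        return self.mem.get(addr, 0)


def word_write(bus, addr, word):
    """UDS and LDS both asserted on a write: two byte writes."""
    assert addr % 2 == 0, "68000 word accesses must be at even addresses"
    bus.write_byte(addr, word >> 8)        # D8..D15 routed to the 8-bit bus
    bus.write_byte(addr + 1, word & 0xFF)  # D0..D7; DTACK reaches the CPU after this


def word_read(bus, addr):
    """UDS and LDS both asserted on a read: two byte reads, the first latched."""
    assert addr % 2 == 0, "68000 word accesses must be at even addresses"
    latch = bus.read_byte(addr)    # high byte held in the intermediate buffer
    low = bus.read_byte(addr + 1)  # low byte goes straight to D0..D7
    return (latch << 8) | low      # CPU sees the complete word, then DTACK


def long_read(bus, addr):
    """The CPU itself splits a long word into two word accesses."""
    return (word_read(bus, addr) << 16) | word_read(bus, addr + 2)


bus = QLBus()
word_write(bus, 0x20000, 0x1234)
word_write(bus, 0x20002, 0x5678)
print(hex(long_read(bus, 0x20000)))  # 0x12345678
print(len(bus.log))                  # 8 byte accesses in total
```

Note that nothing in `word_read` or `word_write` knows about long words; the bridge only ever sees individual word accesses, which is the point made above.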

Also, to lessen the confusion, VPA is NOT used for data access on GC or SGC, and in fact even on the unexpanded QL I am aware of only one peripheral ever using this function, which is the QEPIII Eprom programmer, and perhaps something homebrew.
Do not confuse the word 'peripheral' in the 68k manuals with any peripheral you might think of; the use of VPA is to cater for 6800 (not a typo!) peripheral chips, i.e. older generation 8-bit stuff. This was designed into the 68k series in the beginning, because dedicated peripheral chips were not yet available and would come out later, so people could still implement peripheral functions using chips they knew from the older 6800 and 6809 series CPUs (especially since code was expected to be ported from the 6800/9 to the 68k).

Accessing 'peripherals' on the QL is, to the CPU, like accessing memory. External circuits (if needed) must handle possible speed discrepancies using DTACKL to slow down CPU access when needed.
In the same manner, when a wider than 8-bit data transfer is performed, there is no 'magic' - it's broken down into standard 4 clock cycle (or more, if DTACK is used to delay each one) byte accesses, just like on the 68008, and yes, they take twice the time or more (in reality much more than twice as the QL bus is not only half width but much slower than the internal GC bus).
This is why the GC does all that 'copy ROM to RAM, patch and execute from RAM' trickery and completely ignores slow QL motherboard RAM whenever possible, only writing to the screen area (and using shadowing in its own RAM to be able to read screen data only from it at a far higher speed), and doing read/write only to the 8301/8302 control registers via the IO area.

Just a small note on VPA again - this signal has a dual use on the 68000/08 (and does not exist as such on 68EC/SEC000 and 68EC001), which is, along with 6800/9 peripheral access compatibility, to force automatic interrupt vector generation. Without it, the CPU expects the interrupting peripheral to respond to a special interrupt acknowledge cycle and provide an 8-bit vector to the interrupt service routine. Alternatively, external hardware can detect the interrupt acknowledge cycle (on the QL this is done by detecting FC1=FC2=1) and terminate the cycle with VPA rather than the normal DTACK, at which point the CPU automatically uses predefined vectors depending on the interrupt level that is being served and ignores any data provided during the interrupt acknowledge cycle. This mode of operation is used on the QL and carried over to GC and SGC. 68EC variants have a dedicated pin for this called AVEC which behaves similarly to VPA on older CPUs but only for the autovectoring function.
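
As a small worked example of the autovectoring described above: on the 68000 family the autovectors occupy vector numbers 25 to 31 (interrupt levels 1 to 7), and each vector table entry is four bytes long. The helper name `autovector_address` is made up for this sketch.

```python
def autovector_address(level):
    """Vector table address the CPU fetches for an autovectored interrupt."""
    if not 1 <= level <= 7:
        raise ValueError("68000 interrupt levels are 1..7")
    vector_number = 24 + level  # vector 24 itself is the spurious interrupt
    return vector_number * 4    # each vector is a 32-bit address


for level in (1, 2, 7):
    print(level, hex(autovector_address(level)))
```

So the interrupt level alone decides which table entry is used, and whatever sits on the data bus during the acknowledge cycle is ignored.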


tofro
Font of All Knowledge
Posts: 2685
Joined: Sun Feb 13, 2011 10:53 pm
Location: SW Germany

Re: More gold questions...

Post by tofro »

And the SGC doesn't even need to go through all of this hassle because it supports a dynamically sized data bus.

Just to add to what Nasta has written: The only wider-than-8-bit transfer that needs to be accounted for is a 16-bit transfer. The 68000 already breaks down a longword access into two word accesses just like the 68008 does on 8-/16-bit accesses.

Tobias


ʎɐqǝ ɯoɹɟ ǝq oʇ ƃuᴉoƃ ʇou sᴉ pɹɐoqʎǝʞ ʇxǝu ʎɯ 'ɹɐǝp ɥO
Pr0f
QL Wafer Drive
Posts: 1298
Joined: Thu Oct 12, 2017 9:54 am

Re: More gold questions...

Post by Pr0f »

I get that long word transfers are done as 2 x 16 bit word transfers - that I understand from the 68000 datasheet.

What wasn't clear was how the GC accessed 8 bit areas in the QL when a 16 bit or 32 bit (as 2 x 16 bit) transfer was made. I can see in the source for the JS and Minerva ROMS that not all accesses even to peripheral chips are done using byte oriented operation codes, and I was looking to understand how the gold card made that translation between the 16 bit gold card internal bus and the 8 bit external QL bus.

I think Nasta's explanation of the state machine makes sense, I was just exploring any possible option using the logic on board the processor itself to do this functionality with some support for simpler logic externally, such as a bus replay cycle. Either way it would be interesting to see the clock cycle cost for either approach, as wait states would be, I guess, inferred given the faster clock on the Gold card.

As for Magic - I use this term for anything I don't understand, until I understand it, as it might as well be magic ;)

Thank you for the explanation.


Nasta
Gold Card
Posts: 443
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: More gold questions...

Post by Nasta »

The logic of the thing is easy - since there is (originally) a 68008 on the bus, this is what the hardware has to emulate.
That being said, the 68k CPU does not offer much help with this; even using a cycle re-run strategy requires a state machine that operates on an access cycle basis rather than a clock basis. The speed penalty is actually the same because most of it is due to the QL bus being much slower and not clock-synchronized to the CPU clock.
Ideally a 68000 will transfer a long word in 8 cycles in a zero-wait 16-bit bus environment (GC RAM comes close to that), whereas it will need about 36 cycles accessing e.g. QL ROM and over 100 if it accesses the screen RAM at an inopportune moment. A difference of ±1 in those numbers is really negligible.
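
The 8 and 36 cycle figures above can be roughly sanity-checked with some arithmetic. The sketch below assumes a 16 MHz clock for the Gold Card's 68000 (a figure not stated in this thread) and a 7.5 MHz clock for QL bus timing, with 4 clocks per minimum bus cycle. It ignores synchronisation delays and video contention, so treat the result as a ballpark only.

```python
import math

GC_CLOCK_MHZ = 16.0  # assumption: Gold Card CPU clock, not given in the thread
QL_CLOCK_MHZ = 7.5   # QL-side 68008 bus timing
MIN_BUS_CYCLE = 4    # clocks per bus cycle with no wait states

# Long word on a zero-wait 16-bit bus: two word accesses.
fast_long = 2 * MIN_BUS_CYCLE

# One byte access on the QL bus lasts 4 QL clocks; express that in GC clocks,
# rounded up because wait states come in whole clocks.
ql_byte_ns = MIN_BUS_CYCLE / QL_CLOCK_MHZ * 1000
gc_clocks_per_byte = math.ceil(ql_byte_ns * GC_CLOCK_MHZ / 1000)

# Long word over the 8-bit QL bus: four byte accesses.
slow_long = 4 * gc_clocks_per_byte

print(fast_long, gc_clocks_per_byte, slow_long)
```

Under these assumptions a byte access costs about 9 GC clocks, which lands on the 'about 36 cycles' quoted for QL ROM.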
The actual description of what the state machine has to do is literally spelled out in the explanation of the 68000/008 bus operation, which is essentially the same except for width. The 68k uses both clock edges to advance through the states based on the state of the read/write signal and DTACKL. It takes a minimum of 8 edges (4 full clock cycles) to do anything except the read-modify-write cycle, which is only used by the TAS instruction (and there is a serious question of why this was done; CPU designers took some time to catch up, and the issue was handled in a much more consistent way on the 68040 and 060).
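
A minimal sketch of that edge counting, assuming the usual 68000 bus description in which states S0 to S7 each last half a clock, DTACK is checked at the end of S4, and every wait state adds one full clock (two edges). The function name is made up, and setup times are ignored.

```python
def bus_cycle_edges(dtack_edge):
    """Clock edges used by one read or write bus cycle.

    dtack_edge: how many edges after the start of the cycle the addressed
    device asserts DTACK (0 means it is asserted straight away).
    """
    edges = 5                  # S0..S4 have elapsed when DTACK is first checked
    while dtack_edge > edges:  # not asserted yet: insert a wait state
        edges += 2             # one wait state = one full clock = two edges
    return edges + 3           # S5..S7 finish the cycle


for d in (0, 5, 6, 9):
    print(d, bus_cycle_edges(d), bus_cycle_edges(d) / 2)
```

A device that answers in time gives the minimum of 8 edges; anything slower stretches the cycle in whole-clock steps, which is how the slow QL bus inflates the cycle counts.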

It should be said that even the 68(EC)020 requires a small state machine to emulate the 68008 correctly enough for the 8301 ULA to work reliably, even though, unlike on the GC, it does not actually use extra hardware to 'serialize' word and long accesses into a series of byte accesses (the 68(EC)020 knows how to do it on its own). The issue here is that the 8301 expects 68008 timing at 7.5 MHz synchronized with CLKCPU, so some measures have to be taken (I still owe a small explanation of that in the 8301 thread).

