Faster CPU plug in extensions, How?

Ruptor
Trump Card
Posts: 177
Joined: Fri Dec 20, 2019 2:23 pm

Faster CPU plug in extensions, How?

Postby Ruptor » Sun Sep 13, 2020 11:29 am

How do the extension boards with a faster cpu work? Do they park the 68008 and take over everything or is the 68008 still involved? If it is the former then any processor could be used couldn't it?


Chr$
Super Gold Card
Posts: 547
Joined: Mon May 27, 2019 10:03 am
Location: Sachsen, Germany

Re: Faster CPU plug in extensions, How?

Postby Chr$ » Sun Sep 13, 2020 11:52 am

The Gold Card and Super Gold Card CPUs completely take over the 68008's duties. You can also remove the 68008 from the QL (but you usually don't have to).

My basic understanding is that any replacement for the 68008 would have to be some kind of Motorola 68k compatible CPU! I'm sure there are other considerations.


Collector of original and some newer QL related computers, accessories and software even though I don't understand how any of them work.
Original BBQL's: 3x German, 1x UK model (plus quite a bit of other old stuff)
Custom AT cased QL system with Gold Card, Qimi mouse, Hermes and MFM HDD
PC with QXLII card installed
Q68 also present

Ask me about near NOS re-felted good quality Microdrive Carts - many available.
Pr0f
Super Gold Card
Posts: 659
Joined: Thu Oct 12, 2017 9:54 am

Re: Faster CPU plug in extensions, How?

Postby Pr0f » Sun Sep 13, 2020 12:15 pm

Chr$ wrote: The Gold Card and Super Gold Card CPUs completely take over the 68008's duties. You can also remove the 68008 from the QL (but you usually don't have to).

My basic understanding is that any replacement for the 68008 would have to be some kind of Motorola 68k compatible CPU! I'm sure there are other considerations.


Technically speaking, as long as the basic bus timings are observed, any processor type could take over the bus and just use the peripherals within the QL, so it doesn't have to be 68K compatible. Having said that, it would not make much sense to do so.


Dave
SandySuperQDave
Posts: 2508
Joined: Sat Jan 22, 2011 6:52 am
Location: Austin, TX

Re: Faster CPU plug in extensions, How?

Postby Dave » Mon Sep 14, 2020 7:04 pm

This is a lot more complex than it sounds. I've played a lot with replacement CPUs in the QL, with mixed success. Here's what I can tell you.

Using a CPU of a different architecture is 100% doable at 7.5 MHz, if you're into the idea of writing your own OS that uses all the QL's pre-defined IO addresses. Some CPUs, like the Z80, have separate memory and IO cycles and would need accommodations made for them - but this does add some flexibility, in that IO need not consume memory address space. I guess this is important if your address space is only 64K.

However, I think OP is thinking about 68K processors.

So the skinny is that a QDOS, Minerva or SMSQ/E installation only requires access to a little bit of IO address space (between 96K and 128K) and screen memory of 32K at 128K or 160K. EVERYTHING outside of that doesn't need to touch the QL at all. These accesses will always be 8-bit, so you could use for example a 68EC000 in 8-bit mode, or a 68EC020 or '030 where those addresses are in 8 bit mode but all other accesses to its own private memory are 32-bits wide. This does require something a little more advanced than a very basic address decoder, but not much. Any upclocked CPU requires an alternate generation of DTACKL.
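As a rough illustration of the address split described above, here is a minimal sketch of the decode decision. It is not how the (S)GC or any other real card does it; the function names and the use of Python are my own choices, the region boundaries are the ones quoted in the post, and the 32-bit width for private memory assumes a 68EC020 or '030.

```python
KB = 1024

# Region boundaries as quoted in the post; the names are illustrative only.
QL_IO_START = 96 * KB                   # 0x18000
QL_IO_END = 128 * KB                    # 0x20000, exclusive
SCREEN_SIZE = 32 * KB
SCREEN_BASES = (128 * KB, 160 * KB)     # the two possible screen positions


def classify(addr):
    """Say whether an address must go to the QL or can stay on the card."""
    if QL_IO_START <= addr < QL_IO_END:
        return "ql_io"
    for base in SCREEN_BASES:
        if base <= addr < base + SCREEN_SIZE:
            return "ql_screen"
    return "private"


def bus_width(addr):
    """8 bits for anything the QL must see, 32 bits for private memory."""
    return 8 if classify(addr) != "private" else 32
```

With these assumptions, `bus_width(0x18063)` selects the 8-bit QL side, while an address at 192K or above stays on the card's own 32-bit memory.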

The (S)GC handled these issues by placing a small but very finely engineered state machine between the CPU and the IO/video system. While the state machine isn't ideal, it's perfectly functional. The limitations placed on it are by the device it was programmed into. It can handle up to about a 25MHz CPU on one side, and the 7.5MHz QL bus on the other.

The 8301 seems to be OK with bus transactions up to 11 MHz or so. The 8302 is quite variable but seems solid to around 16 MHz, and some 8302s seem happy as high as 18 or 19 MHz. However, those two limits mean that for a truly fast system, an 8301 and 8302/8049 replacement would be preferable. In my approach, I widened the memory bus to 16 bits for video RAM, and used a 64Kx16 dual-port SRAM that could be accessed as 16 bits on both sides. This 100% eliminated bus contention from video generation and made video writes much faster - if only the OS were rewritten to always handle word-sized video accesses. Using Nasta's idea of shadowing the video RAM for reads is redundant on a system like the one I just described, but if you're determined to use the original ICs it provides a worthwhile gain.

In terms of what can easily be interfaced to the QL, the 68EC000 can be interfaced natively in 8 bit mode, or through a simple state machine in 16 bit mode, to double memory bandwidth for all non-QL accesses. The 68EC020 and 68EC030 both have dynamic bus sizing, and can fit natively with the QL at QL bus speeds for quad-speed memory accesses, but such fast CPUs being run at 7.5 MHz is a huge waste. Again, adding a simple state machine has benefits. Using the chips' dynamic bus sizing and a clock switch would even allow regular QL/RAM cycles to be at 30 or 40 MHz, and the QL IO/video accesses to be at 7.5 MHz. However, switching the clocks needs to be done with precision to not cause runt cycles. This is more complex than you'd expect. I have it to where I can do this accurately at some point between the assertion of /AS and /DS, which gives the QL's CLAs a chance, but hoo boy it's borderline and not a thing I'd trust enough to make for public consumption.
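To put rough numbers on why the clock switch is worth the trouble, here is a back-of-envelope sketch of the average access time when only the QL-side accesses run at 7.5 MHz. It assumes four clocks per bus access on both sides, which is a simplification (that is the 68000-style minimum; the '020 and '030 can do better on their own memory), and the fraction of accesses that hit the QL is a made-up input, not a measurement.

```python
def mean_access_ns(ql_fraction, fast_mhz=40.0, ql_mhz=7.5, clocks_per_access=4):
    """Average bus access time in ns for a mix of fast and QL-side accesses.

    ql_fraction is the share of accesses (0.0 to 1.0) that go to QL IO/video.
    clocks_per_access is an assumed, equal cycle length for both clocks.
    """
    fast_ns = clocks_per_access * 1000.0 / fast_mhz
    ql_ns = clocks_per_access * 1000.0 / ql_mhz
    return ql_fraction * ql_ns + (1.0 - ql_fraction) * fast_ns
```

Under those assumptions a pure 40 MHz access takes 100 ns and a QL access about 533 ns, so even with one access in ten going to the QL the average only rises to roughly 143 ns.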

So what does that leave, in real terms?

A few years ago, I worked with Nasta to co-develop a 4MB QL RAM expansion. It has a 68008FN in PLCC package with two extra address lines, and disables the internal CPU. Running at native speed, it would provide RAM in all unused areas of the extended memory map. It would need a 4M capable version of Minerva, and the option to use SMSQ/E was open. It also had a thru-connector, with correct manipulation of the pass thru address lines so everything was re-mapped correctly to allow maximum continuous RAM. It would shadow video RAM for reads, too, plus replace all other internal RAM. We never got it 100% running. It was a close thing though and early design validation tests were REALLY promising. I did redesign it to use a 68SEC000 and a large 3.3V PSRAM. This would have supported much larger memory configurations, up to 16MB. The 68SEC000 is also much more tolerant of clock speed changes, even mid-cycle and seems almost immune to runt cycles as long as they're long enough - something I ensured with a little extra logic.

The option also exists for Tetroid to (or to ask someone else to) port INGOT to a modern 5V tolerant FPGA. In doing so, DRAM handling could be simplified to SRAM, and refresh eliminated. Another option would be for Peter to break out elements of his Q68 hardware descriptions, so people could implement a video core replacement for a future "Aurora II" type card with fast CPU support, or inbuilt. However, Q68 is an active commercial product at this time, so maybe equivalent blocks from the Q40/60 would be more likely. The catch here is that Peter is extremely busy and would have little desire to support this. He just doesn't have the time.

What else is possible if people are open to new ideas?

Well, multi-CPU is very possible with QDOS-like systems and their pre-emptive multitasking system. Multiple copies of SMSQ/E could run on multiple cores within a system, with one acting as a job allocator and message parser and passer! Having one "system" issue tasks to other systems at full bus speed would allow, for example, having a CPU dedicated to QL-side IO and video, while another CPU handles regular tasks like clocks, timers, and non-time-sensitive routines. This is based mostly off my experiences playing with and developing hardware for PDP-11 and S-100 bus systems. I think SMSQ/E could be extended in this way with only modest effort by a competent programmer. I would chip in to that fundraiser!

The point of the above idea is that a single 8-bit 68EC000 at QL bus speed can talk quite happily with multiple much faster 68Ks and the QL on their behalf. A card with a 68EC000 and four 68EC040s at 40MHz would be quite thrilling.
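The division of labour in that idea can be sketched as follows. This is purely hypothetical and is not an SMSQ/E or QDOS interface: the class and method names are invented for illustration, and real hardware would pass messages through shared memory or mailboxes rather than Python lists.

```python
from collections import deque


class Allocator:
    """Hypothetical front-end CPU: owns QL IO and hands jobs to worker CPUs."""

    def __init__(self, worker_count):
        self.workers = [[] for _ in range(worker_count)]  # jobs held per worker
        self.io_queue = deque()                           # pending QL-side IO

    def submit(self, job):
        """Give the job to the least-loaded worker and return its index."""
        index = min(range(len(self.workers)), key=lambda i: len(self.workers[i]))
        self.workers[index].append(job)
        return index

    def request_io(self, worker, message):
        """Workers never touch the QL bus; they post a message instead."""
        self.io_queue.append((worker, message))

    def service_io(self):
        """Drain requests in arrival order, as the slow QL side would."""
        handled = []
        while self.io_queue:
            handled.append(self.io_queue.popleft())
        return handled
```

The design choice being illustrated is that only the allocator ever runs at QL bus speed; the workers see nothing but a queue.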

So that's my take on the question.


RalfR
Gold Card
Posts: 446
Joined: Fri Jun 15, 2018 8:58 pm

Re: Faster CPU plug in extensions, How?

Postby RalfR » Mon Sep 14, 2020 8:55 pm

Dave wrote:A few years ago, I worked with Nasta to co-develop a 4MB QL RAM expansion. It has a 68008FN in PLCC package with two extra address lines, and disables the internal CPU. Running at native speed, it would provide RAM in all unused areas of the extended memory map.
This reminds me of the ABC MegaRAM from Uli Rosowski, which had the same processor.


Ruptor
Trump Card
Posts: 177
Joined: Fri Dec 20, 2019 2:23 pm

Re: Faster CPU plug in extensions, How?

Postby Ruptor » Mon Sep 14, 2020 9:48 pm

Dave wrote:This is a lot more complex than it sounds. I've played a lot with replacement CPUs in the QL, with mixed success. Here's what I can tell you.
Using a CPU of a different architecture is 100% doable at 7.5MHz
However, I think OP is thinking about 68K processors.
Nope :lol: It was an interesting read about 68K CPU alternatives, but I was thinking of a modern one, let's say an RPi, with a QL driver as an interface. It is easy to make an 800 MHz processor go slow; the trick is to arrange the driver so it doesn't get affected by the operating system (OS) that controls it. With such a difference in speed the OS could time-slice in QL instructions. I am speaking from the view of an engineer used to designing interrupt-driven real-time systems where timing was critical and the processor couldn't be hogged. I don't know squat about OS-based systems, but with the QL being so slow I can't believe the driver would be impossible to make.


Derek_Stewart
QL Wafer Drive
Posts: 1917
Joined: Mon Dec 20, 2010 11:40 am
Location: Runcorn, Cheshire, UK

Re: Faster CPU plug in extensions, How?

Postby Derek_Stewart » Mon Sep 14, 2020 9:49 pm

RalfR wrote:This reminds me of the ABC MegaRAM from Uli Rosowski, which had the same processor.

I tried to buy a MegaRAM, but ABC Elektronik decided not to supply it and tried to keep the money I paid them.

But I used the German law courts to force Andreas Budde to return the money. This was successful, but alas no MegaRAM... Quite a good idea. Pity there were crooked people to spoil the concept.


Regards,

Derek
Peter
QL Wafer Drive
Posts: 1075
Joined: Sat Jan 22, 2011 8:47 am

Re: Faster CPU plug in extensions, How?

Postby Peter » Mon Sep 14, 2020 10:12 pm

Dave wrote:The option also exists for Tetroid to (or to ask someone else to) port INGOT to a modern 5V tolerant FPGA.

If you find a modern 5V tolerant FPGA please let us know...
Dave wrote:In doing so, DRAM handling could be simplified to SRAM, and refresh eliminated.

Don't forget that video in highcolor needs a lot of throughput and that might be easier to achieve by SDRAM bursts.
Dave wrote:Another option would be for Peter to break out elements of his Q68 hardware descriptions, so people could implement a video core replacement for a future "Aurora II" type card with fast CPU support, or inbuilt.

Publishing QL hardware as open source was not an experience I would repeat. But I seriously discussed designing a specialized Q68 chip derivative for Nasta. The problem with such collaborations is finding common periods of time when both are motivated, in good health and not held back too much by daytime work.


Dave
SandySuperQDave
Posts: 2508
Joined: Sat Jan 22, 2011 6:52 am
Location: Austin, TX

Re: Faster CPU plug in extensions, How?

Postby Dave » Mon Sep 14, 2020 10:20 pm

Ruptor wrote:
Dave wrote:This is a lot more complex than it sounds. I've played a lot with replacement CPUs in the QL, with mixed success. Here's what I can tell you.
Using a CPU of a different architecture is 100% doable at 7.5MHz
However, I think OP is thinking about 68K processors.
Nope :lol: It was an interesting read about 68K CPU alternatives, but I was thinking of a modern one, let's say an RPi, with a QL driver as an interface. It is easy to make an 800 MHz processor go slow; the trick is to arrange the driver so it doesn't get affected by the operating system (OS) that controls it. With such a difference in speed the OS could time-slice in QL instructions. I am speaking from the view of an engineer used to designing interrupt-driven real-time systems where timing was critical and the processor couldn't be hogged. I don't know squat about OS-based systems, but with the QL being so slow I can't believe the driver would be impossible to make.


Ahhh. Ok, well, I went down that road too. Here's what I learned:

A Pi is not fast enough at memory-mapped IO to keep up with the QL's bus speed. Nor does it have enough contiguous MMIO to make a sufficiently sized word to make transfers efficient. The CM3 does, and I experimented with that. With bare-metal code to do the transfers, the CM3 was barely able to keep up with 3.5 MHz. All this brought you was QL video (the Pi's framebuffer is infinitely superior) and IO - which could all be implemented on a Pi and external hardware at lower "cost"....

Soooo, overall, that's just not a thing.

Architectures I have connected to a QL bus over the last 35 years include:

ARM
Crusoe (that dog don't hunt!)
Intel 80C186
Intel 80486 (which has 020-like dynamic bus sizing)*
Motorola 000/020/030/040 (not 060)
Transmeta T400-family
Zilog Z80, Z180, eZ80

Of all of those, frankly, the 486DX2 and higher 486s were most interesting, because they have internal design that runs the core at 2x the bus speed. A 7.5MHz bus gives a 15 MHz core, which is quite fun for such a cheap and plentiful CPU. The DX4/100 at 7.5MHz gives a 30 MHz CPU clock, but, big but, even though accesses are 32 bits wide to its own memory, those accesses are all at 1/4 core speed. It can do tight loops that stay within cache VERY fast, but any time it has to go to the QL it crawls, relatively speaking.

The *easiest* accelerated system would be a 68SEC000 card, with address-selected CLKCPU. The fastest system would be an '060 with a state machine - more complex and fiddly, but worthwhile in the end. The main difficulty there is that the '060 has a synchronous bus, which is a little unfamiliar to those of us who appreciate the asynchronous bus. If you want to stay async, the '030 is the best option.


Derek_Stewart
QL Wafer Drive
Posts: 1917
Joined: Mon Dec 20, 2010 11:40 am
Location: Runcorn, Cheshire, UK

Re: Faster CPU plug in extensions, How?

Postby Derek_Stewart » Mon Sep 14, 2020 11:52 pm

Hi,

Would the legendary Issue 8 board do all this?


Regards,

Derek
