ULA ZX8301 - TV Picture Capabilities

Nasta
Gold Card
Posts: 443
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: ULA ZX8301 - TV Picture Capabilities

Post by Nasta »

In detail:

The 8301 operation is very much a slave of the process of picture generation.
This repetitive process reads out the 32k of screen RAM as RGB pixels about 50 times a second, as a frame of lines of pixels.
The process is based on CRT technology, but even the most modern monitors use a version of the same. The reason why it does it over and over again is because the actual screens do not retain information for long, much like Dynamic RAM, so the contents must be refreshed. Indeed, they also must change dynamically, so a 'new version' comes out the RGB connector every 20ms.
Because the picture was originally drawn (almost literally) by a cathode ray on a luminescent surface, it is composed of a number of pixels in a line, followed by a sync pulse (which is a kind of 'carriage return' + 'new line' for monitors), followed by a period of black pixels which corresponds to the time needed for the beam to return to the starting position.
In a similar fashion, lines are displayed one under the other, each drawn from left to right, until the bottom end of the screen is reached. Then comes a sync pulse (this time it's the 'vertical sync'), followed by a number of lines filled with black pixels, during which the beam returns to its top left starting position.
In reality the sync pulses do not happen immediately after the visible pixels but slightly after, so there is a bit of an unused 'border' on all sides. Although, as we know, when monitor mode is selected on the QL, i.e. the full width of the screen is used, some of the contents will end up just off the edges of the screen. It will soon be apparent why.

The 15MHz crystal
At first glance it's not easy to figure out why 15MHz was used except that you get the 7.5MHz CPU clock out of it by dividing by 2, which is a trivial operation in digital electronics. However, a look into the traces explains this, as well as calculating the requirements of the video standard.
The total length of one line should be 64us as defined in the standard. The visible portion should be some 48us, into which all the visible pixels in a line should fit. For the QL in mode 4, this is 512 pixels. These should be shifted out at some clock that is available, and at 15MHz, this would be 720 pixels, and we know it's 512. Using 512 as a reference, we get 93.75ns. The closest we can get from the available 15MHz is if we divide it by 1.5, getting us 10MHz, or 100ns. But, this gives us 51.2us as the visible area, more than the 48us available, so 3.2us end up displayed in the 'invisible part' - and now we know why at full 512x256 resolution, a small portion (left, right or both sides) of the screen is not visible.
However, 640 total pixels each 100ns 'wide' gives us exactly 64us for the total line length, i.e. the horizontal (or composite) sync period, just exactly what we need. Further, this is also exactly divisible by 66.6666ns (or, the 15MHz clock) and produces 960 periods of the 15MHz clock - important because our CPU runs at half this clock, so in essence the CPU runs in sync with the screen refresh process. Therefore a repetitive algorithm can be used to satisfy both sides - the CPU and the screen generation. Finally, if one studies the CPU datasheet carefully, one sees that all the CPU access cycles actually use both edges of the CPU clock - so having a double speed clock with respect to the CPU is a very big plus for any logic based on the CPU clock versus signal generation, as logic is normally triggered on a single edge (each clock cycle). This gives us a 15MHz clock pulse for each CPU clock edge, and thus an ability to track but also PREDICT the CPU timing.
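The arithmetic in this section can be checked with a few lines of Python. This is only a sketch of the numbers quoted above; the variable names are mine and nothing here is taken from the 8301 itself.

```python
from fractions import Fraction

MASTER_HZ = 15_000_000                    # the crystal
PIXEL_HZ = MASTER_HZ * Fraction(2, 3)     # 15MHz divided by 1.5 = 10MHz
CPU_HZ = MASTER_HZ // 2                   # 7.5MHz

VISIBLE_PIXELS = 512                      # mode 4
TOTAL_PIXELS = 640                        # visible + retrace
VISIBLE_WINDOW_US = 48                    # nominal visible part of a 64us line

ideal_pixel_ns = Fraction(VISIBLE_WINDOW_US * 1000, VISIBLE_PIXELS)  # 93.75ns
pixel_ns = Fraction(10**9) / PIXEL_HZ                                # 100ns
visible_us = VISIBLE_PIXELS * pixel_ns / 1000                        # 51.2us
overscan_us = visible_us - VISIBLE_WINDOW_US                         # 3.2us off-screen
line_us = TOTAL_PIXELS * pixel_ns / 1000                             # 64us
master_clocks_per_line = line_us * MASTER_HZ / 10**6                 # 960
cpu_clocks_per_line = line_us * CPU_HZ / 10**6                       # 480

print(f"ideal pixel {float(ideal_pixel_ns)}ns, actual {float(pixel_ns)}ns")
print(f"visible {float(visible_us)}us, overscan {float(overscan_us)}us")
print(f"line {float(line_us)}us = {master_clocks_per_line} x 15MHz "
      f"= {cpu_clocks_per_line} CPU clocks")
```

Using Fraction keeps every result exact, so the 'exactly divisible' claims can be compared without rounding noise.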
As it turns out, this is exactly what the 8301 does.

Video timing
The 8301 generates 312 lines of 640 mode 4 pixels, each line also corresponding to 480 CPU clock periods.
Each line is divided into 40 chunks, 32 of which contain the 512 visible pixels in each line, and 8 of which are the retrace periods. So, chunks 0 to 31 are visible, and 32 to 39 are forced black, i.e. invisible. The horizontal portion of the CSYNCHL signal is a pulse that is active during chunks 34, 35 and 36.
Out of the 312 lines, 256 are used for the picture, and 56 are forced black, with VSYNCH occurring at approximately line 288. If someone needs a precise number I'll re-measure this.
The importance of the 40 chunks within each line might not be apparent until one considers that they are nicely expressed by a whole number of both the 10MHz pixel clock periods (16 pixels) and CPU clock periods (12 clocks, or 24 15MHz clocks).
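As a quick sanity check of the chunk arithmetic, here is a small Python sketch. The constants are the ones given in the text; the names are just labels I picked.

```python
CHUNKS_PER_LINE = 40
VISIBLE_CHUNKS = 32
PIXELS_PER_CHUNK = 16          # 10MHz pixel clocks per chunk
CPU_CLOCKS_PER_CHUNK = 12      # 7.5MHz clocks per chunk
MASTER_CLOCKS_PER_CHUNK = 24   # 15MHz clocks per chunk
LINES_PER_FRAME = 312
VISIBLE_LINES = 256
LINE_US = 64

pixels_per_line = CHUNKS_PER_LINE * PIXELS_PER_CHUNK          # 640
visible_pixels = VISIBLE_CHUNKS * PIXELS_PER_CHUNK            # 512
cpu_clocks_per_line = CHUNKS_PER_LINE * CPU_CLOCKS_PER_CHUNK  # 480
chunk_us = LINE_US / CHUNKS_PER_LINE                          # 1.6us per chunk
blank_lines = LINES_PER_FRAME - VISIBLE_LINES                 # 56
frame_ms = LINES_PER_FRAME * LINE_US / 1000                   # 19.968ms
frame_hz = 1000 / frame_ms                                    # a little over 50Hz

print(pixels_per_line, visible_pixels, cpu_clocks_per_line, blank_lines)
print(f"chunk {chunk_us}us, frame {frame_ms}ms, {frame_hz:.2f}Hz")
```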
The 10MHz clock is generated from the 15MHz clock by using double-edge triggering on the 15MHz clock, and counting 3 edges of the 15MHz clock for each 10MHz clock period. This normally generates a 10MHz clock with a 2:1 duty cycle, however this is not directly visible anywhere outside the 8301 so the exact edge to edge correspondence is not easy to find out. This is of some importance (see below) but only if one wants to 'read' the RGB outputs in order to do something clever with them, such as produce a 16 color mode out of 2 subsequent mode 4 pixels with external hardware.
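To illustrate the divide-by-1.5 idea, here is a toy Python model that counts edges of the 15MHz clock. The phase (high for two edge intervals, low for one) is purely an assumption on my part, since the real edge relationship is not visible outside the chip.

```python
from fractions import Fraction

MASTER_HZ = 15_000_000
# Time between two consecutive edges (rising or falling) of the 15MHz clock.
EDGE_NS = Fraction(10**9, 2 * MASTER_HZ)      # 33.33ns

def pixel_clock_level(edge_index: int) -> int:
    """Level of the derived clock after a given 15MHz edge.

    Assumed pattern: high, high, low, repeating every 3 edges.
    """
    return 1 if edge_index % 3 < 2 else 0

levels = [pixel_clock_level(i) for i in range(12)]
# Find rising edges of the derived clock to measure its period.
rising = [i for i in range(1, 12) if levels[i - 1] == 0 and levels[i] == 1]
period_ns = (rising[1] - rising[0]) * EDGE_NS  # 3 edges = 100ns
high_ns = 2 * EDGE_NS
low_ns = 1 * EDGE_NS

print(levels)
print(f"period {float(period_ns)}ns, high {float(high_ns):.2f}ns, low {float(low_ns):.2f}ns")
```

Whichever phase the 8301 really uses, the period comes out at 100ns and the duty cycle at 2:1 (or 1:2), which is the point made above.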

Accessing video data
The 8301 uses a fixed scheme of access that it repeats within each of the 40 chunks of 12 CPU clocks which make up each display line.
There are a maximum of 3 combinations:
1) When no screen data is accessed (during chunks 32 to 39) a scheme using 4 CPU clock cycles and double edge triggered logic is used to generate RAM timings. The CPU can start an access on any rising edge (which is how the 68008 normally works), and it will take 4 cycles to complete. If the CPU attempts to start an access 3 or fewer cycles before a chunk where video data needs to be accessed, it will be ignored and operation will continue as follows below. The important thing to say here is that ONLY during chunks 32 to 39, so ONLY 20% of the time, the CPU has more or less full speed access to the motherboard RAM.
2a) When screen data is accessed (during chunks 0 to 31 for each of the 256 visible display lines), the first 8 CPU clock cycles of each chunk are dedicated to screen RAM data access, during which the VDA and TXOEL signals are high, preventing any contact between CPU and RAM. If the CPU starts a RAM access cycle during this time or less than 3 cycles before chunk 0, it will be ignored, and then given access during the last 4 CPU cycles of the total 12 in a chunk. At this point the standard 4 cycle DRAM timing will be performed for the CPU. This means that 80% of the time, the CPU only has access to the RAM 4 out of 12 cycles, i.e. at only 1/3 of the maximum theoretical speed.
2b) - and this one is not nice - during chunks 0 to 31 for each of the 56 invisible screen lines, no data needs to be accessed for the screen, BUT the 8301 behaves exactly the same, just does not access data, but rather refreshes the DRAM. In actuality, it still uses 8 CPU clock cycles out of 12 for itself, but does not activate any of the CASL lines, thus making the usual screen RAM access into a refresh cycle. This means that even for the 56 lines when no screen data is needed (56 of 312 lines, nearly 18% of total time), the CPU is still slowed down the same as for visible lines.
The video data itself is accessed using DRAM page mode, which is a short 'burst' access mode that reads consecutive data within the same RAM row, in this case 4 bytes.
DRAM in general is organized as a roughly square array of memory cells, which is why RAS (row address) and CAS (column address) signals are given to the chips, and why the address is multiplexed, row first, then column. Internally, the RAM actually reads a whole row of bits - in the case of a 64k x 1 bit RAM as used in the QL, 256 bits are read at once and held in a 'column register'. The column address then selects one bit out of that row. However, once the row has been read, data within it can be accessed very quickly by changing the column address only.
This is what the 8301 uses to access video data. It sets up the row address, drives RASL low to latch it into the RAM chips, then sets up a column address and drives CAS0L low to read the data, then sets up the next column address, drives CAS0L high then low again to access the next consecutive bit, and does this 4 times total. So, instead of accessing one byte in 4 clocks as would be the case for random access, it manages to get 4 bytes in 8 clocks, a double improvement over regular access speed. But, as was explained above, even that penalizes the CPU severely. Out of every 480 clocks in each display line, only 224 are available for CPU access, and even then some might be lost due to sync (as when the CPU does not start a cycle on a modulo 4 clock boundary because of internal operations). This means the CPU can access motherboard RAM at most at 46.67% of the theoretical maximum speed.
Also, the 8301 only uses RAM bank 0 (CAS0L) for video data access.
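The cycle budget described above can be tallied in a few lines of Python. This is only a sketch of the bookkeeping, with names of my own choosing; it ignores the cycles lost when the CPU is out of step with the 4 clock boundary.

```python
CLOCKS_PER_CHUNK = 12      # CPU clocks in one chunk
ACTIVE_CHUNKS = 32         # chunks 0..31, 8301 takes 8 clocks in each
RETRACE_CHUNKS = 8         # chunks 32..39, all clocks open to the CPU
VIDEO_CLOCKS = 8           # clocks per active chunk used by the 8301
CPU_ACCESS_CLOCKS = 4      # one 68008 bus cycle
BURST_BYTES = 4            # bytes fetched per page mode burst

line_clocks = (ACTIVE_CHUNKS + RETRACE_CHUNKS) * CLOCKS_PER_CHUNK        # 480
cpu_clocks = (ACTIVE_CHUNKS * (CLOCKS_PER_CHUNK - VIDEO_CLOCKS)
              + RETRACE_CHUNKS * CLOCKS_PER_CHUNK)                       # 224
cpu_share = cpu_clocks / line_clocks                                     # 0.4667
max_accesses = cpu_clocks // CPU_ACCESS_CLOCKS                           # 56 per line
ideal_accesses = line_clocks // CPU_ACCESS_CLOCKS                        # 120 per line

# Page mode: 4 bytes in 8 clocks versus 4 clocks per byte for random access.
page_mode_clocks_per_byte = VIDEO_CLOCKS / BURST_BYTES                   # 2.0

print(f"{cpu_clocks} of {line_clocks} clocks for the CPU = {cpu_share:.2%}")
print(f"{max_accesses} of {ideal_accesses} possible bus cycles per line")
```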

Issue 5 boards and 8302
On issue 5 boards, the 8302 is connected to the RAM bus and for all intents and purposes, accessing it has exactly the same characteristics as accessing RAM - and is subject to the same slowdown.
One not so apparent problem here is that the 8301 needs rather substantial drivers on RAM address and data pins - as there are 16 chips there, with all addresses in parallel, so each address line drives 16 input pins on the RAM chips, the pin on the 74LS257 multiplexer, and a whole lot of copper trace. However, for each bit, the RAM chips used have separate inputs and outputs which are tied together, as well as tied together to the same pair connected to the corresponding bit in the other bank - so, each data line on the 8302, as connected on issue 5 boards, drives 6 pins, 2 on each DRAM in bank 0, 2 on each DRAM in bank 1, the 8301 (this only ever reads data), and the 74LS245 bus transceiver.
So, not only is there a timing inaccuracy (due to 8301 screen read - this probably results in problems with net access) when accessing the 8301, it also needs to drive more chips than expected. On issue 6, both are solved - the 8302 is decoded directly by the HAL and it drives 4 pins - the CPU, 74LS245, two ROM chips.
When the 8301 detects A17=0 and A16=1, it assumes an IO access. As a result, no DRAM bank is selected via CASL lines, even though a fake RAM access cycle is generated. Instead, PCENL is generated, with timing very similar to CASL.
When the 8301 acts as a DRAM controller for the CPU, it uses a simple sequence - ROWL is initially low to present the row address to the DRAM, after which RASL goes low to latch this address into the DRAM, then ROWL goes high with a slight delay (because DRAM expects the row address to persist a short time after RASL goes low, before switching over to the column address), and then on the next half cycle, i.e. a bit more delay, CASL goes low.
In order to know if its internal MC register is to be written, or the 8302 registers are to be accessed, the 8301 needs address line A6, which it does not have available as a signal from a pin. Instead, it can read its state from the RAM address lines, as it is contained within the column address. Because of this, it cannot generate PCENL, which is the 8302 chip select signal, until ROWL goes high and A6 becomes available on DRAM address bit DA3. Consequently, the 8302 access time is actually quite a bit faster than a 68008 can manage, and it should work just fine with a faster CPU provided it's connected as on issue 6 and later motherboards.


martyn_hill
Aurora
Posts: 909
Joined: Sat Oct 25, 2014 9:53 am

Re: ULA ZX8301 - TV Picture Capabilities

Post by martyn_hill »

Can't wait for the next gripping episode!

(Do we get a prize if we answer the A6 question correctly?) :-)


martyn_hill
Aurora
Posts: 909
Joined: Sat Oct 25, 2014 9:53 am

Re: ULA ZX8301 - TV Picture Capabilities

Post by martyn_hill »

Darn - you beat me to it!

Gotta love Nasta!


Nasta
Gold Card
Posts: 443
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: ULA ZX8301 - TV Picture Capabilities

Post by Nasta »

DRAM refresh

The 8301 relies solely on the pattern of screen reading to refresh the RAM, which is one reason it keeps 'fake reading' the screen even if the contents are invisible.
DRAM is peculiar because its data will self-destruct in a short time if it is not refreshed. This is because it is actually kept as a charge in a microscopic capacitor, which is actually a gate electrode of a MOSFET. However, most people who have routinely used DRAM do not know what 'refresh' actually means - and the actual process is even more peculiar than one might expect.
The uncommon knowledge about DRAM is that internally, reading data from its cells is actually destructive - reading a cell will discharge it to a point where it's uncertain that the data is retained. This is because we want the data holding capacitor to be the smallest possible, in order to fit as many of them on a chip as possible, i.e. to get the largest memory capacity per unit area. The smallest you can make it is that its capacitance is just slightly over that of all the lines and inputs to the readout circuitry, as reading will then transfer just below half of the charge from the data cell to the readout circuits - half charge being the limit between a 0 and a 1 being stored.
Thus, every time data is read from the DRAM it's also re-generated with the read circuitry and written back into the cells. When data is written, it's actually read, then simply replaced by the data to be written, and again, written back into the cells.
Refreshing is nothing less than simply reading (which automatically means regenerating and re-writing) but ignoring the read-out data.
For standard DRAM, it is stated that every row (and I mentioned before that when even a single bit is accessed, the whole row is read - and (re) written) should be refreshed at least once within a 4ms interval. This means that all row addresses have to be gone through in at most 4ms if we want to guarantee data integrity.
Since we can never know what rows the CPU will access, we cannot guarantee this without using special refresh cycles. However, the 8301 gets around this by exploiting the fact that data used to generate the picture on the screen is read in sequence, all 32k bytes every 20ms.
It is only down to what address bits are mapped to which row and column address bits, to guarantee all rows cycle within 4ms or less.
So let's look at that - how does the 8301 do it? We can infer this from the way CPU address bits are connected to the 74LS257 multiplexers, since we know the data appears sequential to both the CPU and 8301. One more thing we know is, that the 8301 reads 4 consecutive bytes in a sequence from the RAM when reading screen data, so we know bits A0 and A1 are going to appear as the lowest two bits in the column address.

This is how it's connected:
RAM   DA0  DA1  DA2  DA3  DA4  DA5  DA6  DA7
ROW   A2   A3   A4   A7   A8   A9   A10  A11
COL   A0   A1   A5   A6   A12  A13  A14  A15

Each display row contains 128 bytes, and reading it is divided into 32 x 4 byte bursts (during each burst the row address remains the same). Within each display row 8 consecutive DRAM rows are refreshed, 4 times each. All 256 rows of the DRAM are refreshed every 32 display lines, every 2.048ms, meaning the entire RAM is refreshed 9.75 times for every frame, since the frame period is 19.968ms. In other words, it's quite over-refreshed - but remember Sinclair's cheap streak, the propensity to use cheap out of spec DRAM. Still, the need to have A6 available limits the multiplexing scheme, which in turn imposes some complexity on circuits that could have made more DRAM bandwidth available to the CPU.
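The refresh claim can be verified by brute force from the multiplexer table. The Python sketch below uses the bit mapping exactly as listed above; the helper names (gather, dram_row, dram_col) are mine, and screen addresses are taken as offsets from the start of screen RAM.

```python
# CPU address bit presented on DA0..DA7 in each phase (from the table above).
ROW_BITS = [2, 3, 4, 7, 8, 9, 10, 11]
COL_BITS = [0, 1, 5, 6, 12, 13, 14, 15]

def gather(addr: int, bits: list) -> int:
    """Pick the listed address bits and pack them into DA0..DA7."""
    return sum(((addr >> b) & 1) << i for i, b in enumerate(bits))

def dram_row(addr: int) -> int:
    return gather(addr, ROW_BITS)

def dram_col(addr: int) -> int:
    return gather(addr, COL_BITS)

BYTES_PER_LINE = 128
SCREEN_BYTES = 32 * 1024
LINE_US = 64

# Every 4 byte burst must stay inside one DRAM row for page mode to work.
bursts_same_row = all(
    len({dram_row(a + k) for k in range(4)}) == 1
    for a in range(0, SCREEN_BYTES, 4)
)

# Distinct DRAM rows touched while reading one display line.
rows_per_line = len({dram_row(a) for a in range(BYTES_PER_LINE)})

# Count display lines until all 256 rows have been visited.
seen, lines = set(), 0
while len(seen) < 256:
    base = lines * BYTES_PER_LINE
    seen |= {dram_row(a) for a in range(base, base + BYTES_PER_LINE)}
    lines += 1
refresh_ms = lines * LINE_US / 1000

print(bursts_same_row, rows_per_line, lines, refresh_ms)
print("A6 appears on DA%d during the column phase" % COL_BITS.index(6))
```

The last line also shows why A6 is only visible to the 8301 once the multiplexer has switched to the column address.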

DRAM timing - why not use 16MHz?

The timing is based on the 15MHz master clock, and is quite lavishly slow even for the slowest 200ns 64k x 1 DRAM as used in the QL. Again, remember Sinclair's proclivity to save on everything. The sad thing is, due to all of the delays most QLs were built using regular 200ns or faster DRAM - the least improvement this would have made is the ability to use a 16MHz base clock. Keeping the basic operation the same, the number of 12 CPU clock chunks would have to be increased from 40 to 43, resulting in a slightly longer line period with a still in-spec sync frequency, but more importantly, the full 512 pixels of mode 4 would have fitted within the screen. Since counting to 40 needs a 6-bit counter just like counting to 43, the difference in logic needed to reset a counter from 40 to 0 versus 43 is probably 2 ULA gates, completely trivial.
Also, running the CPU at the full 8MHz would provide a nice speed-up, slightly more than the difference in operating frequency - remember, each line now has 43 chunks of 12 clocks, so a total of 516 clock cycles instead of the 480 previously available, while the number of clocks used for display generation remains the same (256). The CPU now has full speed access during 260 out of 516 clock cycles, and the memory now runs at 50.4% maximum speed, roughly an 8% improvement, more than the 6.66% on account of clock speed! A 15% total improvement would have come in handy at the time.
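For anyone who wants to play with the numbers, here is a rough Python comparison of the 15MHz/40 chunk and 16MHz/43 chunk cases. It assumes, as the text does, that the chunk structure (12 clocks, 8 of them for video in 32 chunks) stays unchanged; line_stats is just a helper name I made up.

```python
def line_stats(master_hz: float, chunks: int) -> dict:
    """Per-line figures for a given master clock and chunk count."""
    cpu_hz = master_hz / 2
    pixel_hz = master_hz / 1.5
    clocks = chunks * 12                 # CPU clocks per line
    video = 32 * 8                       # clocks taken by the 8301 per line
    return {
        "clocks": clocks,
        "line_us": clocks / cpu_hz * 1e6,
        "visible_us": 512 / pixel_hz * 1e6,
        "cpu_clocks": clocks - video,
        "share": (clocks - video) / clocks,
    }

old = line_stats(15e6, 40)
new = line_stats(16e6, 43)

share_gain = new["share"] / old["share"]     # from the better clock budget
clock_gain = 16e6 / 15e6                     # from the faster clock alone
total_gain = share_gain * clock_gain

for name, s in (("15MHz", old), ("16MHz", new)):
    print(f"{name}: line {s['line_us']:.1f}us, visible {s['visible_us']:.1f}us, "
          f"CPU {s['cpu_clocks']}/{s['clocks']} = {s['share']:.1%}")
print(f"share +{share_gain - 1:.1%}, clock +{clock_gain - 1:.1%}, "
      f"total +{total_gain - 1:.1%}")
```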


Nasta
Gold Card
Posts: 443
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: ULA ZX8301 - TV Picture Capabilities

Post by Nasta »

The curse of issue 5

As stated before, the main difference between issue 5 and issue 6 boards is the HAL chip. However, on inspection, it does quite a bit more than just replacing a single 74LS03 chip. In fact, it also connects the 8302 directly to the CPU bus and decodes it instead of the 8301, leaving the PCENL pin hanging free - and having one pin free on a ULA chip now poses all sorts of questions on how things could have been different, if the HAL (or its equivalent) was there from the start, and the 8301 and 8302 were indeed treated as separate chips from the very start.

Let's start with the 8302 - which has an extra pin to begin with!
The 8302 has a DSMCL pin (which is basically DSL from the CPU, that can be disabled in order to enforce different decoding from the outside, using the DSMCL pin on the J1 connector), and also the PCENL pin. In order for the 8302 to be accessed, both have to be low. The thing is, both are ALWAYS low because the 8301 uses the DSMCL pin of its own to decode PCENL. Connecting DSMCL on the 8302 to ground or interchanging DSMCL and PCENL makes no difference for the 8302. Definitely an opportunity lost, though it takes a bit more imagination to figure out what this extra pin could have been used for.

The 8301 is more difficult in this manner as decoding it separately generates all sorts of what-ifs and if-onlys.

To begin with, there is TXOEL. This signal gates off the RAM data bus from the CPU data bus, and is active only if either CPU is accessing the RAM or the 8301 when connected to the RAM bus as on issue 5, as well as when the internal MC register of the 8301 is accessed. This is a push-pull signal, and is generated basically by VDA being low and RASL being low (for which DSMCL needs to be 0, and A17 needs to be 1, or A17 = 0 and A16 = 1).
Basically: TXOEL low = VDA low and RASL low. So you could simply generate it from VDA and RASL with an OR gate.

Stunningly, if one looks at DTACKL from the 8301, it is either the same as TXOEL, or inverted ROMOEH. These can never occur simultaneously. So, you can get DTACKL from VDA, RASL and ROMOEH.
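In gate terms, the two observations above look like this. The Python sketch models signals as 0/1 levels, with the L suffix meaning active low and the H suffix active high. The DTACKL function is my reading of 'same as TXOEL, or inverted ROMOEH' (low when either source asserts it), not something taken from a schematic.

```python
from itertools import product

def txoel(vda: int, rasl: int) -> int:
    """TXOEL is low only when VDA and RASL are both low: a plain OR gate."""
    return vda | rasl

def dtackl(vda: int, rasl: int, romoeh: int) -> int:
    """Assumed: DTACKL goes low when TXOEL is low or when ROMOEH is high."""
    return txoel(vda, rasl) & (romoeh ^ 1)

print("VDA RASL ROMOEH | TXOEL DTACKL")
for vda, rasl, romoeh in product((0, 1), repeat=3):
    print(f" {vda}    {rasl}     {romoeh}    |   {txoel(vda, rasl)}     "
          f"{dtackl(vda, rasl, romoeh)}")
```

The table includes the combination TXOEL low with ROMOEH high, which the text says never occurs in practice, so that row is a don't care.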

Finally, there is ROWL. This is used to multiplex row and column addresses for DRAM access. It is essentially a slightly delayed version of RAS. Sufficiently slightly to function as a RC delay between RASL and the inputs of the multiplexer, even better if there is a free simple gate to implement the delay. Now, one could say, RASL also goes low when the 8301 accesses the RAM for its own needs - so that would also operate the multiplexer input pins. The thing is, the VDA signal disables the multiplexer while 8301 is doing that, so the signal is really a 'don't care' under those circumstances, anyway.

So, there are no less than 3 pins that could have been freed with the use of some external TTL logic, two for sure with a single chip such as a 74LS32 (TXOEL and ROWL). Enter the HAL, which decodes 8301, 8302 and DTACKL using its own logic, from A17, A16, A6, FC1, FC2 and DSL, with plenty to spare for the above mods, also leaving PCENL on the 8301 not being used, so that's a total of 4 extra pins. Actually, the HAL could have been made a bit more clever improving the performance of DSMCL.

The funny thing is... if ROWL was say, made into A6, as its function can be emulated from RASL using only two passive components, the 8301 could have decoded PCENL just like ROMOEH, directly, with no wait states during which it has to wait for A6 to become available on the RAM bus, which is only possible when 8301 is not accessing RAM to generate the screen - and the 8301 could have been connected as it is on issue 6 and later, without the HAL, just to make a full circle...

And now we are seriously going into the realm of what-if and if-only - things that could have been improved in the chip as it is.
Last edited by Nasta on Wed Nov 29, 2017 9:24 pm, edited 1 time in total.


Nasta
Gold Card
Posts: 443
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: ULA ZX8301 - TV Picture Capabilities

Post by Nasta »

What could have been improved?

Well, aside from the issue of using a 16MHz master clock (8MHz CPU clock) rather than 15MHz (7.5MHz CPU clock), which also has repercussions on the 8302 as the CPU clock is used for baud rate generation, the biggest improvement to be gained would have been handling the invisible vertical retrace period in a more clever way, leaving the CPU more available RAM bandwidth.
Assuming there was no refresh needed at all during this period, so all available cycles were open to CPU access, given the exact same 15MHz main clock, each of the 256 visible screen lines would have had 224 out of 480 CPU cycles available to the CPU, and the remaining 56 lines would have all of the 480 CPU cycles available. It should be noted that 56 lines is just over 3.58ms, and with the current scheme it's not really possible to guarantee a way to refresh all DRAM rows in the remaining 420us (to keep the refresh requirement at 4ms maximum). This is a sort of thought experiment to see how much we can gain at most, using this idea.
Under the actual scheme, 224 out of 480 cycles are available to the CPU for all 312 lines, so 69888 cycles out of 149760 total, giving the already mentioned 46.67% maximum speed.
If no refresh was needed during the 56 retrace lines, 84224 out of 149760 total cycles would be available, which is 56.2% maximum speed. Does not look that much until the two are compared relatively one to another, the latter approach would yield a 20.5% improvement.

So, let's compare with some more realistic approaches:
1) shorten the refresh timing to a regular 4 cycle rather than 8 cycle burst timing. This means that for the active 32 chunks during the vertical retrace 352 out of 480 cycles are available to the CPU. Thus 77056 out of 149760 total cycles are available for the CPU, 51.5% total bandwidth, a 10% improvement.
2) Since each display line refreshes the same 8 DRAM rows 4 times, limit this to one time. There are two ways to do this, one being more elegant - during vertical retrace, generate refresh only chunks inside the horizontal retrace. There are exactly 8 of them, providing the same number of DRAM rows to refresh. Also, during vertical retrace, free all active chunks for CPU access. This means that now 416 out of 480 cycles on each line are available to the CPU during vertical retrace. Thus 80640 out of 149760 total cycles are available for the CPU, 53.8% available bandwidth, 15.4% improvement.
3) combination of both 1 and 2, implemented during vertical retrace, gives 82432 out of 149760 cycles available to the CPU, 55% bandwidth, 17.95% improvement over standard, and pretty much the most one can expect.
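All of the bandwidth figures above come from the same bookkeeping, so they are easy to reproduce. The Python sketch below recomputes them per scenario; the scenario labels and the frame_cycles helper are my own, and the per-line cycle counts are derived from the chunk layout described earlier.

```python
VISIBLE_LINES = 256
BLANK_LINES = 56
CLOCKS_PER_LINE = 480
VISIBLE_LINE_CPU = 32 * 4 + 8 * 12          # 224 CPU clocks on a visible line
TOTAL = (VISIBLE_LINES + BLANK_LINES) * CLOCKS_PER_LINE   # 149760

def frame_cycles(blank_line_cpu: int) -> int:
    """CPU clocks per frame, given the CPU clocks available on each blank line."""
    return VISIBLE_LINES * VISIBLE_LINE_CPU + BLANK_LINES * blank_line_cpu

# CPU clocks available on each of the 56 vertical retrace lines.
scenarios = {
    "actual 8301":                        32 * 4 + 8 * 12,   # 224
    "no refresh at all":                  40 * 12,           # 480
    "1) 4 cycle refresh":                 32 * 8 + 8 * 12,   # 352
    "2) refresh in horizontal retrace":   32 * 12 + 8 * 4,   # 416
    "3) both 1 and 2":                    32 * 12 + 8 * 8,   # 448
}

baseline = frame_cycles(scenarios["actual 8301"])
results = {}
for name, per_line in scenarios.items():
    cycles = frame_cycles(per_line)
    results[name] = cycles
    print(f"{name:34s} {cycles:6d}/{TOTAL} = {cycles / TOTAL:6.2%}  "
          f"(+{cycles / baseline - 1:.2%})")
```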

All of these figures improve proportionally when a 16MHz base clock is used.

Other (obvious?) improvements:

1) Adding screen 2 and 3 would have been trivial, one extra bit in the MC register to select screens, which selects whether CAS1L is used instead of CAS0L for video generation.

2) Slightly different logic for VSYNCH and RAM address multiplexing could have enabled a 512x512 interlaced mode, though at the expense of using 64k of RAM. Alternatively color effects could be used for more colors (this was done on the Spectrum). Note: some of this can be done externally.

3) 16 color mode. It could have been done without extra pins (though I have shown that some could be made available!) by modulating the pixel width. Since MODE 4 already had pixels half the width of MODE 8, the 4th bit could have been used to display half or full wide pixels. In half-wide, the remaining half is filled with white (black is not a good candidate as half wide and full wide black = black). If you want to be real clever, swap the halves every even/odd line to get a chequerboard stipple. This can be done externally with some logic that has to extract a 10MHz clock from the available 15MHz, and use it to sample and process RGB. An extra control bit could also be implemented using external logic.
Admittedly, the bit layout is a bit awkward but so is MODE 8.

4) Extended vertical resolution. The standard supports up to 288 vertical pixels. As the ULA is now, it would basically just extend the visible number of lines by 32 and move VSYNCH 16 lines further down. There would be no penalty in RAM speed. If it was enhanced as discussed above, additional visible lines reduce the available bandwidth. The extra lines would just overflow into the screen 1 memory area, so no dual screen mode if extended vertical resolution is used.

5) Fast mode, anyone? When the screen is blanked (control bit is already available), all lines in the frame are treated using the enhanced refresh method discussed above. 93.33% RAM bandwidth available to the CPU, 100% improvement over current situation. But - no display. Reserved for stuff that needs to be real fast :P
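Two of the points in this list reduce to simple sums, sketched here in Python. For the fast mode I assume every one of the 312 lines gets the combined scheme (4 cycle refresh in the 8 horizontal retrace chunks only); for the 288 line mode I assume the same 128 bytes per line and a 32k screen. Names are mine.

```python
CLOCKS_PER_LINE = 480

# Fast mode: 32 chunks fully free, 8 retrace chunks lose 4 clocks each to refresh.
fast_cpu_clocks = 32 * 12 + 8 * (12 - 4)           # 448
fast_share = fast_cpu_clocks / CLOCKS_PER_LINE     # 0.9333
current_share = 224 / CLOCKS_PER_LINE              # 0.4667
fast_gain = fast_share / current_share             # 2.0, i.e. 100% improvement

# Extended vertical resolution: how far past the 32k screen do 288 lines reach?
BYTES_PER_LINE = 128
SCREEN_BYTES = 32 * 1024
extended_bytes = 288 * BYTES_PER_LINE              # 36864
overflow_bytes = extended_bytes - SCREEN_BYTES     # 4096
overflow_lines = overflow_bytes // BYTES_PER_LINE  # 32

print(f"fast mode: {fast_share:.2%} of RAM bandwidth, x{fast_gain:.1f}")
print(f"288 lines need {extended_bytes} bytes, {overflow_bytes} past the first screen")
```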

Finally, just to get back to the 8302 and using the 7.5MHz CPU clock. Some dividers would have to be done differently if an 8MHz clock was used, due to baud rate generation. However, it is a mystery to me why an 11MHz crystal, rather than the standard 11.059MHz one, was used on the IPC, given that the latter value is suitable for simple baud rate generation (by dividing by 9)! It could have been simply passed on to the 8302 (and remember, there could have been an extra pin there from the very start even if one wanted to retain the 7.5MHz clock). Even so, the 11.000MHz clock would have offered increased baud rate accuracy. The highest baud rate attainable with the 8302 is 19200, which is approximately 7.5MHz divided by 390 (exact would be 390.625). Since BAUDX4 is available, it stands to reason the internal circuitry also works at 4x the baud rate (though, strictly speaking since it's a transmitter only, it could work at 1x the baud rate), so getting 19200x4 from 7.5MHz is a less accurate division by 98 (-0.35%). In theory you can go one more baud rate higher (38400x4) but after that the error increases too much, while up to that limit it's no problem. With an 11.059MHz reference, any standard baud rate up to 3686400 can be generated by dividing by 3 and then subsequent divisions by 2, or up to 1228800 by dividing by 9 and then subsequent divisions by 2, all with perfect accuracy. With 11MHz the error is the reference clock error, about -0.54%. Based on a BAUDX4 clock for the transmitter, this would be up to 921600 or 307200 respectively. That being said, one more counter bit is required compared to the 7.5MHz version, but the counter is simpler. Also, available IPC replacements such as Hermes and superHermes do not use the BAUDX4 line on the 8302, but can use their internal timers to generate baud rate references, at which point using 11.059MHz for an 8049 or similar IPC replacement is a bonus as it's easy to generate the baud rate locally on the chip with perfect accuracy.
Let's also explore an 8MHz version - in this case BAUDX4 goes all the way up to 614400 (actual baud rate would be 153600) by dividing 8MHz by 13 and then subsequent divisions by 2 to get lower baud rates, with excellent accuracy (+0.16%). It also requires the fewest stages in the divider and the simplest circuit. However, baud rates such as 57600, 115200, 230400 cannot be generated with precision better than 2%.
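The divider errors quoted above can be recomputed with a short Python sketch. The divisors are the ones named in the text (98, 9 followed by divisions by 2, 13); error_pct is a helper I made up, and 11.059MHz is taken as 11.0592MHz.

```python
def error_pct(ref_hz: float, divisor: int, target_hz: float) -> float:
    """Relative error (in percent) of ref_hz / divisor against target_hz."""
    return (ref_hz / divisor / target_hz - 1) * 100

BAUDX4_19200 = 4 * 19200          # 76800Hz clock for 19200 baud

# 7.5MHz CPU clock, divide by 98 (exact ratio would be 97.656...).
err_7m5 = error_pct(7_500_000, 98, BAUDX4_19200)

# 11.0592MHz: divide by 9, then by 16 (four divisions by 2) to reach 76800Hz.
err_11059 = error_pct(11_059_200, 9 * 16, BAUDX4_19200)

# Same divider chain with the 11.000MHz crystal.
err_11000 = error_pct(11_000_000, 9 * 16, BAUDX4_19200)

# 8MHz divided by 13 against a 614400Hz BAUDX4 clock (153600 baud).
err_8m = error_pct(8_000_000, 13, 614_400)

top_div3 = 11_059_200 // 3        # 3686400
top_div9 = 11_059_200 // 9        # 1228800

print(f"7.5MHz/98:      {err_7m5:+.2f}%")
print(f"11.0592MHz/144: {err_11059:+.2f}%")
print(f"11MHz/144:      {err_11000:+.2f}%")
print(f"8MHz/13:        {err_8m:+.2f}%")
print(top_div3, top_div9, top_div3 // 4, top_div9 // 4)
```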
Last edited by Nasta on Thu Nov 30, 2017 1:21 am, edited 1 time in total.


M68008
Trump Card
Posts: 223
Joined: Sat Jan 29, 2011 1:55 am

Re: ULA ZX8301 - TV Picture Capabilities

Post by M68008 »

Nasta wrote: with VSYNCH occurring at approximately line 288, if someone needs a precise number I'll re-measure this.
Do you know at what point IPL1 is triggered? At the same time as VSYNCH?


Nasta
Gold Card
Posts: 443
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: ULA ZX8301 - TV Picture Capabilities

Post by Nasta »

M68008 wrote:
Nasta wrote: with VSYNCH occurring at approximately line 288, if someone needs a precise number I'll re-measure this.
Do you know at what point IPL1 is triggered? At the same time as VSYNCH?
Not sure as this is done by the 8302 (it has a VSYNCH input for this), but I would assume at the rising edge of VSYNCH plus reaction time of the CPU (the processing time for the logic inside the 8302 should be negligible in this case).
I can try and measure it when I get the next chance.


tcat
Super Gold Card
Posts: 633
Joined: Fri Jan 18, 2013 5:27 pm
Location: Prague, Czech Republic

Re: ULA ZX8301 - TV Picture Capabilities

Post by tcat »

Hi Nasta,

This is a very thorough explanation, and an excellent read, too.
I have also wondered how the 68008 reads and writes 16- and 32-bit entities over an 8-bit data bus.
Also, you described how some pins can be freed by supplying simple external logic, e.g. transistor+diode.
In the standard QL design there is already such a fast transistor used to drive DSMCL, at least I believe it is this signal.

Many thanks.
Tomas


Nasta
Gold Card
Posts: 443
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: ULA ZX8301 - TV Picture Capabilities

Post by Nasta »

tcat wrote: I have also wondered how the 68008 reads and writes 16- and 32-bit entities over an 8-bit data bus.
No big mystery there - it's all explained in the 68000/8 user manual, it basically breaks larger transfers into a series of byte transfers, at least 4 clock cycles per transfer.

That being said, because during screen access the 8301 only gives the CPU 4 out of 12 clock cycles, things like moving 16-bit words or 32-bit long words while the 8301 is accessing the RAM as well, will result in consecutive accesses, but they will be extended to 12 clock cycles each as far as what the CPU sees.
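As a rough illustration of what that means for data transfers, here is a simplified Python model. It only counts the data bus cycles themselves (one per byte) and assumes each one takes either the minimum 4 clocks or is stretched to a full 12 clock chunk; instruction fetches and alignment with the chunk boundary are ignored. transfer_clocks is a made-up helper name.

```python
def transfer_clocks(size_bits: int, contended: bool) -> int:
    """CPU clocks spent on the data transfer alone, one bus cycle per byte."""
    bus_cycles = size_bits // 8              # 68008 moves one byte per cycle
    clocks_per_cycle = 12 if contended else 4
    return bus_cycles * clocks_per_cycle

for bits in (8, 16, 32):
    free = transfer_clocks(bits, contended=False)
    slow = transfer_clocks(bits, contended=True)
    print(f"{bits:2d}-bit: {free:2d} clocks uncontended, {slow:2d} clocks during screen access")
```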

