Re: 8301
Posted: Fri Aug 03, 2018 2:34 am
Here is a bit more about how the 8301 works.
Beware, there will be some numbers to understand, as well as a lot about timing based on various clock cycles.
I will start with the more complicated bit, which is how the 8301 actually manages to read the screen RAM and use the data to create the image on the screen.
A bit about CRT screen basics is needed here - and the plus side is, much of the way these used to work is still the underlying logic for more modern flat panel displays.
QL users already know that the MODE 4 resolution is 512 x 256 pixels - so, 512 pixels on each of the 256 lines. However, a CRT monitor does not actually have 'pixels' in the usual sense, but rather each line has a part you can display an image within, and a part that is not seen and falls 'outside of the screen'. This already gives us a clue why some of the QL's picture gets clipped on the sides - the part the QL uses to display pixels is actually slightly wider than standard and extends outside the screen, into the 'invisible area'.
The CRT displays a picture using a 'raster' of lines. Basically, it draws the display using a focused 'dot' of electrons hitting the phosphors on the screen. Depending on the current (which is basically the amount of electrons hitting the screen) the point will be more or less bright. The dot is made to move from the top lefthand corner in a horizontal line till it reaches the right end of the screen, then it returns quickly to the lefthand side but a bit lower, so the next line it draws ends up under the previous line. It repeats that until it reaches the bottom of the screen, then returns to the top lefthand corner.
While it is doing that, the actual video signal modulates the electron dot and thus produces a picture.
In order for the monitor to know when to return from the right side of the screen to the left, there is a horizontal synch signal pulse that starts the 'retrace' back to the left side. Similarly, there is a vertical synch signal pulse that tells the monitor when the last line has been drawn and the dot should return to the top of the screen. In actuality, horizontal and vertical movement are independent, so it's up to the device driving the monitor to properly generate the synch signals to get a stable picture.
One thing to know is that actual timing back then was based on a standard TV specification, and it is quite rigid. To top it off, it takes time for the dot to react to the synch signals and also to return to the left side. During that time the dot travels backwards at a higher speed, so the video signal should be turned off (black level), or it would also write an image backwards, with less precision and resolution, since the dot is now traveling faster and not necessarily as precisely as in the 'usable' left to right direction. As the definition of one display line is basically the time from one synch pulse to the next, this is why there is a portion of the line that can be used to display video, while the other, non-usable part is called the retrace period.
The same exact logic applies to the vertical direction, but now the period between the synch pulses is expressed in lines. Just as each line is composed of a visible and an invisible part, so the entire frame is composed of visible and invisible lines, where the invisible lines now form the vertical retrace period.
For PAL TV, on which the QL video is based, each line takes 64us and there are 312 lines per frame, so the entire frame takes about 20ms to draw, resulting in a frame frequency of 50Hz.
Quick aside: Real PAL TV uses two consecutive frames with 312.5 lines each (.5 meaning the vertical synch pulse happens at about half the 313th line of the even frame and at the end of the 312th line of the odd frame) to get a total of 625 lines of vertical resolution, drawing first the even and then the odd lines at an effective 25Hz rate. The contents of the picture are rarely wildly different between the even and odd frame, so the interlacing reduces flicker. However, the QL simply uses the non-interlaced version, which reduces the vertical resolution to 312 lines but at twice the refresh rate. This is more suitable for a computer display, where the contents of lines can be completely independent, so flicker would be quite annoying at 25Hz even at twice the vertical resolution.
Now, remember that I said that not all of a line can be used, and neither can all of the lines be used to display pixels. Also, the signal that modulates the electron dot has a 'bandwidth', or, simply put, a maximum frequency so the number of pixels one can put into the allotted 64us of time is also limited. For a color system it's about 400 or so pixels in ideal circumstances, realistically around 320 if you used relatively high spec commercial video circuits and ICs. However, if you can drive the 'dot' directly, then it comes down only to the circuits in the actual monitor, and the sharpness of the dot - but the basic timing remained (back then) based on standard TV, if you wanted to avoid going bankrupt.
What this means is, one line is 64us, out of which 48us should be used for the actual pixels, and there are 312 lines out of which up to 288 in theory could be used for pixels, in other words, this defines the 'visible screen area'.
Since the QL was intended to have an 80 character text display, and (I will jump ahead a bit here) we know the pixels come out at a 10 MHz rate, each pixel takes up 0.1us, so if 48us is visible, that would give us 480 pixels per line. Using a 6 pixel wide character, we would get exactly 80 characters per line. And, if we wanted to use all of the 288 lines available for display, that would give us a 480x288 resolution and 138240 total pixels, which at 2 bits per pixel (4 pixels per byte) comes out to 34560 bytes. That is not a power of 2, and we do know that computers rather like things to be numbered in powers of 2, because it simplifies addressing of those bytes.
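To make the arithmetic above easy to check, here is a minimal Python sketch. All the names are made up for illustration, and it assumes the 2 bits per pixel packing used elsewhere in this post.

```python
# Display budget for the "ideal" 48us x 288 line window (illustrative names).
PIXEL_CLOCK_HZ = 10_000_000
VISIBLE_LINE_NS = 48_000      # 48us of each 64us line is visible
VISIBLE_LINES = 288
CHAR_WIDTH_PIXELS = 6
BITS_PER_PIXEL = 2            # assumption: 4 pixels per byte


def is_power_of_two(n):
    """True if n is an exact power of two."""
    return n > 0 and (n & (n - 1)) == 0


pixel_time_ns = 1_000_000_000 // PIXEL_CLOCK_HZ          # 100ns = 0.1us
pixels_per_line = VISIBLE_LINE_NS // pixel_time_ns
chars_per_line = pixels_per_line // CHAR_WIDTH_PIXELS
total_pixels = pixels_per_line * VISIBLE_LINES
screen_bytes = total_pixels * BITS_PER_PIXEL // 8

print(pixels_per_line, chars_per_line, total_pixels, screen_bytes)
print("power of two?", is_power_of_two(screen_bytes))
```

The byte count lands between 32768 and 65536, which is the awkward size the rest of the post is about.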
Let's explain this in a bit more detail.
The way various timing signals are generated by digital logic is to use a master clock and then count cycles of it to get the various periods and frequencies. Again, knowing that 10MHz is used to drive the video system, it means all timings are derived from units of 0.1us. In this case it means that a whole display line (visible plus invisible part) takes 640 clocks at 10MHz, which is why the horizontal synch is 10MHz divided by 640, which gives us 15625Hz for the horizontal synch frequency. This signal is then used to count lines, again visible and invisible ones, 312 in total.
If one uses standard binary counting, starting at 0, this means that one line has 640 cycles numbered 0 to 639. Lines are numbered 0 to 311. In binary, 639 requires 10 bits to encode, and the pixels in a line get numbered from 0000000000b to 1001111111b, the last number being 639=512+127. Once count 640 comes up, the counter is reset to 0, so one could detect the combination 1010000000b to reset the line pixel counter - and in fact, this is done by only detecting the two 1's in the entire 10-bit counter (640 is the first count at which both of those bits are 1, so the other eight bits need not be checked), which makes the entire 'reset' circuit very simple.
For the line counter, 311 requires 9 bits to encode, and the numbers go from 000000000b to 100110111b. Once state 312 comes up the counter is reset, and 312 being 100111000b, the hardware can detect the four 1s in there using a 4-input AND gate to reset the counter, so still not too bad for reset logic.
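The clock division described above can be written out as a few lines of Python. This is only a sketch of the arithmetic, with constant names I made up; it says nothing about how the dividers are actually wired inside the chip.

```python
# Deriving line and frame rates from the 10MHz video clock (illustrative names).
PIXEL_CLOCK_HZ = 10_000_000
CLOCKS_PER_LINE = 640     # visible plus invisible part of one line
LINES_PER_FRAME = 312     # visible plus invisible lines

line_rate_hz = PIXEL_CLOCK_HZ // CLOCKS_PER_LINE          # horizontal synch frequency
line_time_us = CLOCKS_PER_LINE * 1_000_000 / PIXEL_CLOCK_HZ
frame_rate_hz = line_rate_hz / LINES_PER_FRAME            # vertical synch frequency
frame_time_ms = 1000 / frame_rate_hz

print(f"line rate  : {line_rate_hz} Hz ({line_time_us} us per line)")
print(f"frame rate : {frame_rate_hz:.2f} Hz ({frame_time_ms:.3f} ms per frame)")
```

Note the frame rate comes out a hair above 50Hz, because 312 lines of 64us is slightly less than 20ms.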
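Why is it enough to look only at the 1 bits of the reset value? Because the counter only ever counts up from 0, the first time all of those bits are 1 together is exactly the reset value itself. A small Python sketch to convince yourself (the function and mask names are mine, not anything from the chip):

```python
def first_match(mask):
    """Count up from 0 and return the first value where every bit in mask is 1."""
    count = 0
    while (count & mask) != mask:
        count += 1
    return count


H_RESET_MASK = 0b1010000000   # two bits checked: 512 + 128
V_RESET_MASK = 0b100111000    # four bits checked: 256 + 32 + 16 + 8

print(first_match(H_RESET_MASK))   # pixel counter reset point
print(first_match(V_RESET_MASK))   # line counter reset point
```

So a 2-input gate for the pixel counter and a 4-input gate for the line counter do the job, and the 0 bits can be ignored completely.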
However, getting the address of the data to read from screen RAM at the right time (and this goes from address 0 to 34559) from the state of the pixel and line counters is so complex that it's easier to have a separate 16 bit counter for the address (16 bits being what is needed to encode numbers 0 to 34559). That counter would be reset when the vertical counter is reset, and allowed to count only for certain combinations of states of the horizontal and vertical counters - namely, only while the horizontal counter is between 0 and 479 and the vertical counter is between 0 and 287. A 16-bit counter can be a problem in itself: in a simple counter implementation it takes time for the carry from the lower bits to ripple into the higher bits, so it could get too slow, and a synchronous counter would have to be used instead, which uses a LOT more logic for long counters. On top of that, one would also need additional logic to figure out when the counter should advance and when not.
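To picture that hypothetical 480x288 design, here is a rough Python simulation of one frame. It is purely a thought experiment with invented names: the address counter advances once per 4 visible pixels, and the 'enable' condition is the extra comparison logic that would be needed.

```python
def count_fetched_bytes(h_total=640, v_total=312,
                        h_visible=480, v_visible=288, pixels_per_byte=4):
    """Simulate one frame and return how far a gated address counter advances."""
    address = 0
    for line in range(v_total):
        for pixel in range(h_total):
            # Extra logic: two magnitude comparisons against non-power-of-2 limits.
            enable = pixel < h_visible and line < v_visible
            # Advance after the last pixel of each byte has been shifted out.
            if enable and pixel % pixels_per_byte == pixels_per_byte - 1:
                address += 1
    return address


final_count = count_fetched_bytes()
highest_address = final_count - 1
print(final_count, "bytes fetched,", highest_address.bit_length(), "address bits needed")
```

The point is not the loop itself but the `enable` line: neither 480 nor 288 can be tested by looking at a single counter bit.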
So what happens if the horizontal and vertical resolutions were some convenient power of 2? This is now getting awfully familiar, as using the closest values would be 512 horizontal and 256 vertical.
This makes the pixel count go from 0 to 511, which fits exactly 9 bits (the reason we used a power of 2 in the first place!) and line count from 0 to 255, which needs exactly 8 bits. So, the numbers go from 000000000b to 111111111b horizontally and 00000000b to 11111111b vertically. Since we are counting starting with visible pixels and lines, the counter states from 0 to 511 horizontally and 0 to 255 vertically correspond exactly to the address of the pixel, so concatenating the lower 8 bits of the vertical counter with the lower 9 bits of the horizontal counter would directly give us a 17-bit pixel address. Since there are 4 pixels per byte (given 2 bits per pixel), the bottom 2 bits of such an address would give you the position of the 2 bit pixel within a byte and the remaining 15 bits would give you a byte address, from 0 to 32767, which is exactly 32k. All of this is basically re-using horizontal and vertical counter bits with no additional logic - a considerable simplification, which is what you want when designing custom logic that is supposed to be as cheap as possible.
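The bit concatenation described above looks like this in Python. The function names are only for illustration, and the sketch follows the simplified 4 pixels per byte picture used in this post; it does not say which 2 bits inside the byte belong to which pixel.

```python
def pixel_address(line, pixel):
    """Concatenate 8 line-counter bits with 9 pixel-counter bits (17 bits total)."""
    assert 0 <= line < 256 and 0 <= pixel < 512
    return (line << 9) | pixel


def split_pixel_address(address):
    """Split a 17-bit pixel address into (15-bit byte address, pixel index in byte)."""
    return address >> 2, address & 0b11


first = split_pixel_address(pixel_address(0, 0))
second_line = split_pixel_address(pixel_address(1, 0))
last = split_pixel_address(pixel_address(255, 511))
print(first, second_line, last)
```

In hardware the shift and the OR cost nothing at all, they are just which counter output goes to which address line.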
The consequence is that now the horizontal resolution is increased to 512 pixels, which uses 51.2us of the complete display line, which breaks the 48us standard. So, there are 32 extra pixels - the timing is adjusted so that around 16 are added to each side of the 480 pixel visible area, and this is how TV mode was born - have 512 pixels horizontally, and limit the usable pixels in software. Simplifying the logic to 9 and 8 bit addressing for the visible pixels and lines also simplifies the logic that has to do with generating the vertical synch pulse and 'blanking' the display, i.e. to determine when pixels are to be fed to the monitor, or 'black' should be generated during the various invisible or unused areas of the screen - simply look at the top bit of the counter and if it is 1, generate blank (black) pixels.
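The 'look at the top bit' blanking rule can be sketched in Python as follows. The names are mine, and it assumes, as described above, that both counters start at 0 on the first visible pixel and line, with a 10-bit pixel counter and a 9-bit line counter.

```python
def is_blank(h_count, v_count):
    """Blank whenever the top bit of either counter is 1."""
    h_top = (h_count >> 9) & 1   # set for pixel counts 512..639
    v_top = (v_count >> 8) & 1   # set for line counts 256..311
    return bool(h_top | v_top)


# Walk one whole frame and count how many counter states show real pixels.
visible = sum(not is_blank(h, v) for v in range(312) for h in range(640))
print(visible, "visible pixels per frame")
```

Two wires and one OR gate, compared to the comparators the 480x288 layout would have needed.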
Even better - it also makes it much simpler to write software for. Figuring out the byte address for a 480 pixel wide display requires 'take the x coordinate divided by 4 and add the y coordinate times 120', so a 'real multiplication', while calculating with a 512 pixel width means simply shifting and splicing bytes.
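Here is the software side of that comparison as a small Python sketch. Both functions are hypothetical helpers returning an offset from the start of screen RAM, and both follow the simplified 4 pixels per byte model of this post rather than any particular real screen layout.

```python
def byte_offset_480(x, y):
    """480 pixels wide: 120 bytes per line, so a real multiplication is needed."""
    return y * 120 + x // 4


def byte_offset_512(x, y):
    """512 pixels wide: 128 bytes per line, so shifting and OR-ing is enough."""
    return (y << 7) | (x >> 2)


# The shift version gives the same result as multiplying by 128 everywhere.
same = all(byte_offset_512(x, y) == y * 128 + x // 4
           for y in range(256) for x in range(512))
print(same, byte_offset_480(479, 287), byte_offset_512(511, 255))
```

On a CPU where multiplication is slow, doing this for every plotted point adds up quickly, which is why the power of 2 line length is so welcome.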
There are also other ways this simplifies the actual logic that reads the data from RAM as well as generates the required timing for the RAM when the CPU reads or writes it, but more on this in the next post.