So, we cracked it!
The following facts are not fully documented anywhere outside the
SMSQ/E source code (AFAIK):
S*BASIC tokenises literal numbers, ie any number written as a number
in the BASIC source code (except line numbers and numbers in text), as
a 44-bit floating point number in a 48-bit package. In nibble
representation it looks like this: 0eee:mmmmmmmm, with eee being the
12-bit exponent (under a zero top nibble) and mmmmmmmm being the
32-bit mantissa.
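To make the layout concrete, here is a minimal Python sketch that splits such a 6-byte package into its two fields. It assumes the bytes are stored big-endian, exponent word first and mantissa long word second, which is the order the NEXT_TOKEN code further down writes them in. The function name split_package and the sample bytes are made up for illustration, and the sketch does not try to interpret the fields as a value.

```python
import struct


def split_package(package: bytes):
    """Split a 6-byte 0eee:mmmmmmmm package into (exponent, mantissa)."""
    if len(package) != 6:
        raise ValueError("expected exactly 6 bytes")
    # Big-endian: one 16-bit word followed by one 32-bit long word.
    head, mantissa = struct.unpack(">HI", package)
    # Keep only the low 12 bits of the first word as the exponent.
    exponent = head & 0x0FFF
    return exponent, mantissa


# Made-up field values, chosen only so each nibble is easy to spot.
exponent, mantissa = split_package(bytes.fromhex("0123456789ab"))
print(f"exponent=${exponent:03X} mantissa=${mantissa:08X}")
# prints: exponent=$123 mantissa=$456789AB
```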
The token value for literal decimals (until SBASIC, the only legal
representation) is the word $Feee, ie $F in the top nibble followed by
the 12-bit exponent.
SBASIC additionally allows literal long word integers to be written as
binary or hexadecimal numbers if preceded by a % or a $,
respectively.
However, they too are coded as floating point numbers, as for
decimals, above! But, to enable their re-creation as binary, decimal
and hex in the listing, it is necessary to differentiate between them.
So they get slightly different tokens:
$Feee - any decimal number within the whole floating point range
$Deee - any number entered as a binary long integer
$Eeee - any number entered as a hexadecimal long integer
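The three token values above can be told apart by the top nibble alone. The following Python sketch shows one way to do that and to recover the exponent by masking the nibble off again. The names LITERAL_KINDS and classify are hypothetical, and the sample words are arbitrary examples, not values taken from a real tokenised program.

```python
# Top nibble of the first word -> how the literal was entered.
LITERAL_KINDS = {0xF: "decimal", 0xD: "binary", 0xE: "hexadecimal"}


def classify(head: int):
    """Return (kind, exponent) for a 16-bit token word, else None."""
    kind = LITERAL_KINDS.get((head >> 12) & 0xF)
    if kind is None:
        return None  # some other token, not a literal number
    # Drop the token nibble to leave the 12-bit exponent.
    return kind, head & 0x0FFF


for word in (0xF801, 0xD81F, 0xE81F, 0x8100):
    print(f"${word:04X} -> {classify(word)}")
```

A lister would use the kind to decide whether to print the number back as decimal, with a leading %, or with a leading $.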
As EmmBee surmised, the core of the problem was in the keyword
NEXT_TOKEN. It doesn't understand binary or hexadecimal literals. So I
had a go at the NEXT_TOKEN code and changed what I hoped was the
relevant bit to this (d0 contains the token/exponent and d2 a copy):
Code:
...
Lab101B0
        andi.b  #$F0,d2             isolate token
        cmpi.b  #$F0,d2             dec float?
; Change:
        beq.s   float
        cmpi.b  #$D0,d2             bin float?
        beq.s   float
        cmpi.b  #$E0,d2             hex float?
        bne.s   Lab101D8            no, some other token
float
        andi.b  #$0f,d0             convert token to exponent
        movea.l $14(a6,a3.l),a2     do something..
        adda.l  $28(a6),a2          do something..
        move.b  d0,0(a6,a2.l)       put ms.b of exponent
        move.b  d1,1(a6,a2.l)       put ls.b of exponent
        move.l  (a1)+,2(a6,a2.l)    put mantissa
        moveq   #$0b,d0             signal float
        bra     lab10258            do return stuff
* ...
The snippet above is based on the disassembly done by Alain. However,
since this keyword is part of a toolkit, QLib4_bin, I disassembled
that and used that disassembly instead.
Complex systems are rarely as simple as they may look to the
uninitiated who think they understand them! So I'm taking the chance
that there is no deeper intention behind how these tokens were
formulated. Perhaps a future update was thought of to implement real
long and/or word integers? Or some other part of the system sees $Ceee
as something special? I don't know.
If anyone knows of a reason why the following assumption wouldn't work,
now would be a good time to come forward! 'Cause this is the solution
we plumped for:
$Deee, $Eeee, and $Feee (and $Ceee too) all have one unique property
in common, different from all other current token values: their
two top bits are set. Thus by making just two tiny alterations to the
original NEXT_TOKEN code, from this:
Code:
Lab101B0
        andi.b  #$F0,d2             isolate token
        cmpi.b  #$F0,d2             Literal number?
        bne.s   Lab101D8            no, something else..
        andi.b  #$0f,d0             convert token to exponent
...
to this:
Code:
Lab101B0
        andi.b  #$C0,d2             <- isolate token
        cmpi.b  #$C0,d2             <- dec, hex, or bin float?
        bne.s   Lab101D8            no, something else..
        andi.b  #$0f,d0             convert token to exponent
...
and EmmBee making a few minor alterations to Q-Liberator's BASIC code,
the whole problem appears to have evaporated! Ie QLib V3.42, when
ready, should be able to process any SBASIC literal numbers without a
hitch.