SuperCharge Decompiler

Re: SuperCharge Decompiler

Post by Martin_Head »

martyn_hill wrote:On a side note, I've already found at least one duplication of routine (ID: 204 - "RETRY_HERE") that appears twice in my disassembly (so far), only with very slightly different offsets for the BCC and LEA instructions (due to position-dependence)
The bcc's are the same, both $6404. The LEA's are pointing to different addresses.
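For anyone following along, $6404 can be checked by hand. Here is a minimal Python sketch (not part of the decompiler) that decodes a 68000 branch with an 8-bit displacement. The function name and the address $1000 are made up for the example; only the opcode word comes from the post.

```python
# Condition mnemonics indexed by the 4-bit condition field of a 68000 Bcc opcode.
# Field 0 is BRA and field 1 is BSR, hence the "RA" and "SR" entries.
CONDITIONS = ["RA", "SR", "HI", "LS", "CC", "CS", "NE", "EQ",
              "VC", "VS", "PL", "MI", "GE", "LT", "GT", "LE"]

def decode_short_branch(word, address):
    """Decode a 68000 branch that carries its displacement in the low byte.

    `address` is where the opcode word sits. Returns (mnemonic, target).
    """
    if (word >> 12) != 0x6:
        raise ValueError("not a Bcc/BRA/BSR opcode")
    disp = word & 0xFF
    if disp == 0x00:
        raise ValueError("16-bit displacement follows in an extension word")
    if disp >= 0x80:          # sign-extend the 8-bit displacement
        disp -= 0x100
    mnemonic = "B" + CONDITIONS[(word >> 8) & 0xF] + ".S"
    # The displacement is relative to the address of the opcode plus 2.
    return mnemonic, address + 2 + disp

mnemonic, target = decode_short_branch(0x6404, 0x1000)
print(mnemonic, hex(target))  # BCC.S 0x1006
```

So $6404 is a BCC.S that skips the next four bytes when carry is clear. It is relative to its own position, which is why it is identical in both copies of the routine.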

I think I came across the same routine appearing twice in a program once before, both copies doing the same job. I can't remember offhand what I did about it.

I'm getting a little confused. Earlier you were talking about routines ending in the form JMP $00(a6,d0.L), but the sample you just supplied ends in JMP -$7C90(a6)?

JMP -$7C90(a6) is the format I have seen in all the compiled programs I have looked at except FTCII. And I thought yours was doing the same as FTCII.

If you don't have a RETRY_HERE command in your source program, I would be surprised if the routine appeared in the executable code. I have not seen it in any of the test programs I have written without it.
RETRY_HERE is something I have only just come across, when I was trying to find out which Turbo Toolkit commands were being handled by the compiled program internally, or were being called externally, such as PAUSE and OPEN. Although the RETRY_HERE routine is in the TCLibrary2_lib, I'm pretty sure that the decompiler program I uploaded here does not know about it.
In fact, when I was trying to use WHEN_ERROR (or whatever it is in Turbo) in a test program, I could not get it to compile without error. It would fail on the END_WHEN (I can't remember the error).

You may have figured this out already.
The A5 register is a pointer to the current position in the 'coded' SuperBASIC program. A1 is a pointer to a stack (the maths stack in QDOS). A4, I think, is a pointer of some sort to the variables (I have not quite worked that one out). A6 is a fixed position in memory that just about everything revolves around. I think A6 is chosen so that you can swing plus and minus 32K from it, and so can have jobs up to 64K in size.
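To make the plus and minus 32K point concrete, here is a minimal Python sketch (not taken from Turbo or the decompiler). It assumes the routine exit is the standard 68000 JMP d16(A6) form: opcode word $4EEE followed by a signed 16-bit displacement. The A6 value is purely hypothetical.

```python
import struct

JMP_D16_A6 = 0x4EEE  # 68000 opcode word for JMP d16(A6)

def encode_jmp_a6(disp):
    """Return the four bytes of JMP disp(A6); disp must fit in a signed word."""
    if not -0x8000 <= disp <= 0x7FFF:
        raise ValueError("displacement does not fit in 16 bits")
    # Big-endian: unsigned opcode word, then signed displacement word.
    return struct.pack(">Hh", JMP_D16_A6, disp)

A6 = 0x30000  # hypothetical value, chosen only for illustration

print(encode_jmp_a6(-0x7C90).hex())        # 4eee8370
print(hex(A6 - 0x8000), hex(A6 + 0x7FFF))  # 0x28000 0x37fff
```

The second line of output is the lowest and highest address reachable from A6 with a 16-bit displacement, a span of 64K in total.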
martyn_hill wrote:I'm about 5% the way through the 200-odd code IDs
I don't know what technique you are using to identify the routines, but have a look at the section on hand decompiling at the back of the DisCharge manual. Since you know what the original code was, you can quickly identify routines, or make an educated guess as to what they may be. It also helps you understand the structure of the 'coded' SuperBASIC.


Post by martyn_hill »

Hi again Martin!
Martin_Head wrote:I'm getting a little confused. Earlier you were talking about routines ending in the form JMP $00(a6,d0.L), but the sample you just supplied ends in JMP -$7C90(a6)?

JMP -$7C90(a6) is the format I have seen in all the compiled programs I have looked at except FTCII. And I thought yours was doing the same as FTCII.
Actually, in my first pass against this job disassembly, I found a couple of instances of the JMP $00(a6,d0.l) form, but many of the JMP -<offset>(a6) form, and it is the latter that has proved successful in decompiling thus far.

All I meant to say was that in the provided version of TurboProcessDump, the JMP -<offset>(a6) actually coded is JMP -$7DD8(a6), whereas I needed to change this offset to -$7C90: same format, just a different negative offset.

I'm matching off the code segments by hand, pretty much as indicated in your excellent instructions.

I am finding more and more non-matching routines and am having to make a judgement as to which code ID they likely belong to by more careful review of the instruction sequences. I am also finding more variants that either don't yet appear in TCLibrary_lib at all, or are slightly re-arranged sequences of functionally similar instructions.

Anyways, once I've got enough IDs mapped against this particular source disassembly, I'll test the decompiler itself and start to compare against the original SBASIC source code.

M.


Post by NormanDunbar »

Evening Gents,

I've been following this topic with interest, but it suddenly struck me, we have the source code for Turbo - is it not possible to work out the information required from that?

Just a thought - I could of course, be barking!


Cheers,
Norm.


Why do they put lightning conductors on churches?
Author of Arduino Software Internals
Author of Arduino Interrupts

No longer on Twitter, find me on https://mastodon.scot/@NormanDunbar.

Post by Martin_Head »

martyn_hill wrote: Actually, in my first pass against this job disassembly, I found a couple of instances of the JMP $00(a6,d0.l) form, but many of the JMP -<offset>(a6) form, and it is the latter that has proved successful in decompiling thus far.

All I meant to say was that in the provided version of TurboProcessDump, the JMP -<offset>(a6) actually coded is JMP -$7DD8(a6), whereas I needed to change this offset to -$7C90: same format, just a different negative offset.
I have added this and -$7CA6 (for V5.35) to the version checking. I also noticed that when the routines are identified in the copy of the disassembly, the first 'missed' routine's checksum is not shown. I'm fixing that.
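As an illustration only, here is one way the end-of-routine offset could be found in an unknown job. It is a Python sketch, not how the decompiler's version checking actually works. It assumes the exit is always JMP d16(A6) (opcode word $4EEE) and simply counts the displacements it sees. The function name and the sample bytes are invented. The three offsets in the table are the ones mentioned in this thread, and only -$7CA6 has a Turbo version stated for it.

```python
import struct
from collections import Counter

# Offsets mentioned in this thread. Only -$7CA6 has a version stated for it.
KNOWN_OFFSETS = {
    -0x7DD8: "offset coded in the uploaded TurboProcessDump",
    -0x7C90: "offset needed for martyn_hill's job",
    -0x7CA6: "V5.35",
}

def most_common_jmp_a6(code):
    """Return the most frequent displacement among JMP d16(A6) words, or None.

    Scans every even offset, so data or extension words that happen to equal
    $4EEE are counted too; taking the most common value helps to ignore those.
    """
    counts = Counter()
    for pos in range(0, len(code) - 3, 2):
        opcode, disp = struct.unpack_from(">Hh", code, pos)
        if opcode == 0x4EEE:
            counts[disp] += 1
    return counts.most_common(1)[0][0] if counts else None

# Invented sample: two routines ending JMP -$7C90(a6), one stray JMP -$7DD8(a6).
sample = bytes.fromhex("70014eee8370" "72024eee8370" "4eee8228")
offset = most_common_jmp_a6(sample)
print(hex(offset), "->", KNOWN_OFFSETS.get(offset, "unknown"))
```

An offset that is not in the table would suggest a Turbo version that has not been catalogued yet.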

At the moment I am trying to identify the routines for the 'xwords' program and add them to the Library files.

If I remember correctly, in SuperCharged programs this end-of-routine jump was the same for all versions except V2.00, whereas in Turbo it seems to change with every version.


Post by Martin_Head »

NormanDunbar wrote:Evening Gents,

I've been following this topic with interest, but it suddenly struck me, we have the source code for Turbo - is it not possible to work out the information required from that?

Just a thought - I could of course, be barking!


Cheers,
Norm.
I did have a very, very quick look at the sources, but I could not be bothered to try to work out how the code generator worked. And it would only give me the code routines for the latest version, and as Martyn has found out, you can get small differences in the routines between different versions of Turbo.


Post by NormanDunbar »

Ah, that makes sense. Thanks Martin.

Cheers,
Norm.



Post by Martin_Head »

martyn_hill wrote:Interestingly, I don't actually have any explicit RETRY_HERE commands in my source program :-)
Yesterday I made a first attempt at decompiling the 'xwords' program, and I have found a few differences between the Turbo V5.35 code and the V5.10 code that I have been using for development. These cause problems for the decompiler I uploaded here.

This might help if you try to use it on a version later than V5.10.

In the CheckValues procedure there are a couple of FOR loops that search from the start of the program for particular data: one for the value of the A6 register, and one for the start of the keywords. These loops need to be a bit longer.

A RETRY_HERE command is inserted right in front of the BASIC program, even though there is no RETRY_HERE command in the original BASIC program. This makes the decompiler think that the program was compiled without line numbers, as it does not find a new program line start code.

There's something odd going on with the first line of the BASIC program that I have not figured out yet. There are two words inserted between the code marking the start of the line and the line number. EDIT: Ignore this, it turns out to be my ham-fisted editing.

There seem to be two identical code routines: one to place a supplied integer onto the stack, and one to place an integer variable onto the stack. The code makes sense for the first, but not the second.

These are the problems I have found so far. I'm sure there are more.


Post by Martin_Head »

I'm working on decompiling the 'xwords' program, and I was wondering if anyone could help me with what's going on in a particular part of the original program.

Code:

170 IF COMPILED THEN
180   REMark read TConfig data block, note how strings have to be read twice
190   RESTORE 5000 : READ ink1%
200   RESTORE 5002 : READ ink2%
210   RESTORE 5004 : READ paper1%
220   RESTORE 5006 : READ border1%
230   RESTORE 5008 : READ border2%
240   RESTORE 5010 : READ ox%
250   RESTORE 5012 : READ oy%
260   RESTORE 5014 : READ directory$ : READ directory$
270 ELSE
280   REMark interpreted
290   ink1% = 7    : REMark main ink colour
300   ink2% = 4    : REMark second ink colour
310   paper1% = 0  : REMark paper colour
320   border1% = 2 : REMark border colour
330   ox% = 0      : REMark origin
340   oy% = 0      : REMark origin
350   directory$ = 'win1_xwords_' : REMark holds Control_Dat file
360 END IF



4998 DATA "$`#*","xwords    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
5000 DATA -4444:REMark Ink colour 1
5002 DATA -4444:REMark Ink colour 2
5004 DATA -4444:REMark Paper colour
5006 DATA -4444:REMark Border colour
5008 DATA -4444:REMark Border colour when moving outline
5010 DATA -4444:REMark Window x-origin
5012 DATA -4444:REMark Window y-origin
5014 DATA "XX","0000000000000000000000000000000000000000000"
In the disassembled executable, the DATA statements come out as:

Code:

Start of DATA???

0024E046  7FFF 0007      Ink colour 1        main ink colour
          7FFF 0004      Ink colour 2        second ink colour
          7FFF 0000      Paper colour        paper colour
          7FFF 0002      Border colour       border colour
          7FFF 0007      Border colour when moving outline
          7FFF 0000      Window x-origin
          7FFF 0000      Window y-origin

0024E062  0002 002B      +    ?
          000C 77696E315F78776F7264735F   win1_xwords_
0024E074  00000000       ori.b     #$00,d0                     ....
0024E078  00000000       ori.b     #$00,d0                     ....
0024E07C  00000000       ori.b     #$00,d0                     ....
0024E080  00000000       ori.b     #$00,d0                     ....
0024E084  00000000       ori.b     #$00,d0                     ....
0024E088  00000000       ori.b     #$00,d0                     ....
0024E08C  00000000       ori.b     #$00,d0                     ....
0024E090  00000000       ori.b     #$00,d0                     ....
0024E094  8001           End of DATA
I'm guessing this is some sort of config block that has been filled in after the original BASIC program was compiled, as the -4444 colours don't match what's in the executable file.

Also, the two-word integers don't fit with what I have seen in the test programs I have made. They don't usually look like that. In fact, the decompiler thinks they are very long strings, and SBASIC complains.

Line 4998 does not seem to make it into the compiled program at all. And the second string in line 5014 is really going to confuse the decompiler.

So, is this a MenuConfig block, or a special config block for Turbo? I have had a quick scan through some of the Turbo and SMSQ docs, but I can't find much to help me.
I take it that the "$`#*" in line 4998 is some sort of marker? And does Turbo recognise it?


Post by martyn_hill »

Hi Martin

Perhaps you've already found this resource, but the UConfig documentation/program from George Gwilt (available from Dilwyn's site or directly from George's main Turbo repository) appears to align with what you've found in the DATA statements of the xwords program.

I haven't dived in to any depth, but thought to mention the above in case you hadn't got there yet.

M.


Post by Martin_Head »

I think I've got a bit of a handle on this now.

First problem was that the decompiler was not finding the true start of the DATA area because there is no specific RESTORE (without a line number) in the program.

After studying the code that reads the DATA, I found there is a special case of an integer that starts with $7FFF.
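As a rough illustration of that special case, here is a minimal Python sketch that reads the seven $7FFF entries from the dump earlier in the thread. It assumes each entry is the marker word $7FFF followed by one big-endian signed 16-bit value, which is what the dump appears to show. It is not the decompiler's own code, the function name is made up, and it deliberately stops at the first word that is not $7FFF rather than guessing at the other DATA item types.

```python
import struct

# The seven integer entries copied from the dump starting at 0024E046.
DUMP = bytes.fromhex(
    "7FFF0007 7FFF0004 7FFF0000 7FFF0002 7FFF0007 7FFF0000 7FFF0000"
)

def read_special_integers(block):
    """Read consecutive [$7FFF, value] pairs from the start of `block`.

    Returns (values, bytes_consumed). Stops at the first word that is not
    $7FFF, leaving everything after that for another routine to decode.
    """
    values = []
    pos = 0
    while pos + 4 <= len(block):
        marker, value = struct.unpack_from(">Hh", block, pos)
        if marker != 0x7FFF:
            break
        values.append(value)
        pos += 4
    return values, pos

values, used = read_special_integers(DUMP)
print(values)                  # [7, 4, 0, 2, 7, 0, 0]
print(hex(0x0024E046 + used))  # 0x24e062
```

The values match the interpreted defaults for the ink, paper and border colours, and the end address lines up with the next entry in the dump at 0024E062.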

The decompiler now produces this -
[Attachment: Dataoutput.png]
as the output for the DATA statements in the xwords program.

The first DATA line looks corrupted, but it's non printable characters in a string.

I have put the special case integers in square brackets to distinguish them from normal numbers.

All the 58184s are zero padding, I expect to let the "win1_xwords_" string expand.

And here's the decompiled part of the code for reading it:

Code:

170 IF COMPILED THEN 
180  
190  RESTORE ** Line number to be determined ** data0027802C : READ var94E2%
200  RESTORE ** Line number to be determined ** data00278030 : READ var94E6%
210  RESTORE ** Line number to be determined ** data00278034 : READ var94EA%
220  RESTORE ** Line number to be determined ** data00278038 : READ var94EE%
230  RESTORE ** Line number to be determined ** data0027803C : READ var95C6%
240  RESTORE ** Line number to be determined ** data00278040 : READ var95B6%
250  RESTORE ** Line number to be determined ** data00278044 : READ var95BA%
260  RESTORE ** Line number to be determined ** data00278048 : READ var94F2$ : READ var94F2$
270  ELSE 
280  
290  var94E2% = 7
300  var94E6% = 4
310  var94EA% = 0
320  var94EE% = 2
330  var95B6% = 0
340  var95BA% = 0
350  var94F2$ = "win1_xwords_"
360   : END IF

