VOGONS


dynrec bugs

First post, by Matt Hey

Rank Newbie

I have been working with NovaCoder to enable 68k dynrec support. I have the dynrec cache buffers executing code, but I believe I have found several bugs. Making function parameters fully stack based causes problems. This can be reproduced on x86 by turning off the GCC fastcall function attribute (#define DRC_CALL_CONV _fastcall), since x86 uses fully stack based calls by default. I have fixed some of the problems in decoder_basic.h, which I will attach to this message.

The first problems are in the gen_call_function_xxx() functions, where gen_call_function_setup() is called with the wrong number of parameters. With register function parameters this isn't a problem, but the count can be used to deallocate the parameters on the stack after the function call. The gen_call_function_xxx() functions work well for register or stack based parameters, but they aren't always used. For example, dyn_read_byte() calls gen_move_regs() to load its register parameter and then gen_call_function_raw() to call the function. I changed this to gen_call_function_R() so the parameter can be passed in a register or on the stack.

I have marked all my potential bug fixes in decoder_basic.h with *** Bug fix ***. This file was pretty straightforward to change and I used the general style. There are more problems like this in decoder_opcodes.h that aren't as simple to change. Someone experienced with this code could probably edit them in much less time, with fewer problems, and the way they want them. Note that gen_call_function_raw() is correct where there are no parameters.

I also noticed while debugging that some dynrec generated code doesn't seem to have the caches flushed by cache_block_closing() before executing. This includes the dynrec code generated by gen_run_code(), a return (2) equivalent and a return (3) equivalent. None of these need to be dynrec generated anyway. gen_run_code() would be faster, and wouldn't need the caches flushed, if it were a normal function that was passed a pointer to the code to start, much like cache_block_closing(), which doesn't generate dynrec code because it is already compiled/assembled. I haven't noticed any crashes from this, but the 68k has a fraction of the cache of a modern processor.
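As a rough illustration of the flushing concern, here is a minimal C++ sketch, assuming a GCC or Clang toolchain, in which freshly written code bytes are flushed with the compiler builtin __builtin___clear_cache before anything jumps to them. CodeBuffer, emit_bytes() and finish_block() are hypothetical names, not DOSBox functions, and the thread does not show how cache_block_closing() is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a dynrec cache buffer (not a DOSBox type).
struct CodeBuffer {
    std::uint8_t bytes[64];
    std::size_t used = 0;
};

// Copy generated machine code into the buffer.
// Returns false if the code does not fit.
inline bool emit_bytes(CodeBuffer& buf, const std::uint8_t* code, std::size_t len) {
    assert(code != nullptr || len == 0);
    if (len > sizeof(buf.bytes) - buf.used) return false;
    if (len != 0) std::memcpy(buf.bytes + buf.used, code, len);
    buf.used += len;
    return true;
}

// Flush the instruction cache for everything written so far, so the CPU
// does not execute stale bytes. Returns the number of bytes flushed.
inline std::size_t finish_block(CodeBuffer& buf) {
    char* begin = reinterpret_cast<char*>(buf.bytes);
    __builtin___clear_cache(begin, begin + buf.used);  // GCC/Clang builtin
    return buf.used;
}
```

Any generated stub that is written once at startup would need the same finish_block() style call before its first execution; a stub that is compiled ahead of time as a normal function would not.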

Thanks for any help.

Attachments

  • Filename
    decoder_basic.h
    File size
    43.27 KiB
    Downloads
    128 downloads
    File license
    Fair use/fair dealing exception

Reply 2 of 15, by Dominus

Rank DOSBox Moderator

Wow. After two days you begin throwing around names? Are your problems the most important ones in the world? Did you consider that maybe people have a real life as well? Goodbye 68k as...

Windows 3.1x guide for DOSBox
60 seconds guide to DOSBox
DOSBox SVN snapshot for macOS (10.4-11.x ppc/intel 32/64bit) notarized for gatekeeper

Reply 3 of 15, by NovaCoder

Rank Newbie

Hiya,

Do any DosBox devs have any helpful comments or suggestions on this issue?

Is this even the best place to post to get help from the DosBox devs BTW?

Thanks.

Reply 4 of 15, by Dominus

Rank DOSBox Moderator

Do you really think anyone wants to answer now, after being called names?

Reply 5 of 15, by Matt Hey

Rank Newbie
Dominus wrote:

Do you really think anyone wants to answer now, after being called names?

Nobody called anybody names. I asked if you were going to be a#?*#s and you responded. It's not like anybody would have replied by now anyway.

@NovaCoder
You were right. Waste of time. So much for doing the responsible thing by reporting bugs.

Reply 6 of 15, by Dominus

Rank DOSBox Moderator

First you insult the devs then after half a day you delete it but don't apologize and then you lie about it. Charming. Bye bye

Reply 7 of 15, by Qbix

Rank DOSBox Author

Have you taken a look at the other platforms ?
Do arm and mipsel handle it differently ?

Water flows down the stream
How to ask questions the smart way!

Reply 8 of 15, by Matt Hey

Rank Newbie
Qbix wrote:

Have you taken a look at the other platforms ?
Do arm and mipsel handle it differently ?

RISC ABIs (and risc_x64.h) pretty much all specify register based function parameters, and this default is built into GCC for those platforms. The GCC _fastcall function attribute is ignored (doesn't exist) on those platforms. They end up calling a function like this:

move par4 to r3 // gen_load_param_xxx() generated
move par3 to r2 // gen_load_param_xxx() generated
move par2 to r1 // gen_load_param_xxx() generated
move par1 to r0 // gen_load_param_xxx() generated
call function // gen_call_function_setup() generated

The paramcount passed to gen_call_function_setup() does not matter here: it is unused because there are no stack parameters to deallocate. If the gen_call_function_xxx() functions are bypassed in decoder_basic.h, the parameters are already in the correct registers. The dynrec support has no problems here.

The x86 and 68k use stack based function parameters (_cdecl) by default. The x86 can use the _fastcall function attribute which passes the first 2 parameters in registers and the rest on the stack. Most other platforms do not support _fastcall though. This would look like:

push par4 // gen_load_param_xxx() generated
push par3 // gen_load_param_xxx() generated
move par2 to r1 // gen_load_param_xxx() generated
move par1 to r0 // gen_load_param_xxx() generated
call function // gen_call_function_setup() generated

Stack parameters are deallocated by the function. It looks to me like _fastcall in risc_x86.h will work but turning it off (_cdecl) will not. This line:

cache_addb((!fastcall)?paramcount*4:0);

Won't work with fully stack based parameters when gen_call_function_R3(), gen_call_function_m() and gen_call_function_mm() are used. The paramcount must match the number of parameters pushed on the stack, and it doesn't.
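To make the mismatch concrete, here is a small C++ model of the stack pointer around one call. It is only a sketch: StackModel and drift_after_call() are hypothetical names, and nothing here is DOSBox code. It assumes 4-byte parameters and caller cleanup of paramcount*4 bytes, as in the line quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the host stack pointer; nothing here is DOSBox code.
struct StackModel {
    std::int32_t sp = 0;                   // stack pointer relative to its start, in bytes
    void push_param() { sp -= 4; }         // one 32-bit parameter pushed
    void cdecl_cleanup(int paramcount) {   // caller's "add esp, paramcount*4"
        sp += paramcount * 4;
    }
};

// Net stack drift in bytes after one call; 0 means the stack is balanced.
inline std::int32_t drift_after_call(int pushed, int paramcount) {
    assert(pushed >= 0 && paramcount >= 0);
    StackModel s;
    for (int i = 0; i < pushed; ++i) s.push_param();
    // The call/return pair pushes and pops the return address, so it nets to zero.
    s.cdecl_cleanup(paramcount);
    return s.sp;
}
```

In this model drift_after_call(3, 3) is 0, while one pushed parameter cleaned up as three leaves the stack pointer 8 bytes too high after every such call.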

I think _stdcall parameter passing would be easier to implement than _fastcall and is possibly as fast in risc_x86.h. The x86 doesn't have many general purpose registers, so the function may have to push some of the parameters back on the stack anyway, defeating the work _fastcall did in putting them in particular registers. Working gen_load_param_xxx() functions would need slow, difficult-to-predict branches to do _fastcall correctly. It's possible that _stdcall could be faster in this case for these reasons. This looks like:

push par4 // gen_load_param_xxx() generated
push par3 // gen_load_param_xxx() generated
push par2 // gen_load_param_xxx() generated
push par1 // gen_load_param_xxx() generated
call function // gen_call_function_setup() generated

With _stdcall (like _fastcall), the function deallocates the stack with the x86 RET instruction which accepts an immediate value for how many bytes of the stack to deallocate. It gets this from the function prototypes. The paramcount parameter of gen_call_function_setup() is then unused. It's easy, less prone to problems in this case and fairly fast. The risc_x86.h dynrec gen_call_function_setup() would look something like this:

static Bit32u INLINE gen_call_function_setup(void * func,Bitu paramcount,bool fastcall=false) {
	// Do the actual call to the procedure
	Bit32u proc_addr=(Bit32u)cache.pos;
	cache_addb(0xe8);	// call rel32
	cache_addd((Bit32u)func - (Bit32u)cache.pos-4);

	// DRC_STDCALL is a hypothetical flag macro: #if cannot reliably compare
	// DRC_CALL_CONV against a keyword such as _stdcall.
#ifndef DRC_STDCALL
	// Restore the params of the stack (caller cleanup)
	if (paramcount) {
		cache_addw(0xc483);	// add ESP,imm byte
		cache_addb(paramcount*4);
	}
#endif
	return proc_addr;
}
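The byte sequence this function would produce can be checked in isolation. The following C++ sketch writes into a std::vector instead of the dynrec cache; Code, emit32(), emit_call() and emit_cleanup() are hypothetical helpers, not DOSBox functions. The opcodes used are the standard x86 encodings E8 (call rel32) and 83 C4 ib (add esp, imm8).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical byte sink standing in for the dynrec cache (not DOSBox code).
using Code = std::vector<std::uint8_t>;

// Append a 32-bit value in little-endian order, as x86 expects.
inline void emit32(Code& c, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        c.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Emit "call rel32". 'pos' is the address the opcode byte will occupy and
// 'target' is the address of the function being called.
inline void emit_call(Code& c, std::uint32_t pos, std::uint32_t target) {
    c.push_back(0xE8);
    emit32(c, target - (pos + 5));  // displacement is relative to the next instruction
}

// Emit "add esp, imm8" for caller cleanup. Emits nothing when the callee
// cleans up (_stdcall style) or when there are no parameters.
inline void emit_cleanup(Code& c, unsigned paramcount, bool callee_cleans) {
    if (callee_cleans || paramcount == 0) return;
    assert(paramcount * 4 <= 127);  // imm8 is sign-extended by the CPU
    c.push_back(0x83);
    c.push_back(0xC4);
    c.push_back(static_cast<std::uint8_t>(paramcount * 4));
}
```

For a call placed at 0x1000 to a function at 0x2000 with four stack parameters, this yields E8 FB 0F 00 00 followed by 83 C4 10; with callee cleanup only the five call bytes are emitted.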

The 68k has an instruction called RTD (Return and Deallocate) that does the same as the x86 RET with immediate. However, GCC has not provided any function attributes for parameters, which is why we use the old _cdecl style for now. Here is what it looks like:

push par4 // gen_load_param_xxx() generated
push par3 // gen_load_param_xxx() generated
push par2 // gen_load_param_xxx() generated
push par1 // gen_load_param_xxx() generated
call function // gen_call_function_setup() generated
add (4*4) to stack // gen_call_function_setup() generated

The last line deallocates 4 parameters of 4 bytes each. Here the paramcount passed to gen_call_function_setup() must be correct to deallocate the parameters from the stack. The gen_call_function_xxx() functions must be used for any function called from the dynrec cache buffer that takes at least 1 parameter, and they must call gen_call_function_setup() with the same number of parameters as were loaded with gen_load_param_xxx() calls. If anything is wrong, either the stack gets corrupted (crash) or the parameters are not where they are expected (memory corruption). Passing all parameters on the stack (_cdecl) is the best test for problems after dynrec code changes, because all aspects of the interface have to be correct for it to work. The original interface was designed properly to support all kinds of parameter passing, but it has to be used.
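For reference, the "add (4*4) to stack" step could be encoded on the 68k roughly as follows. This C++ sketch is not taken from the attached risc_68k.h, and emit_68k_stack_release() is a hypothetical helper; it assumes addq.l is used for 1 to 8 bytes and lea d16(sp),sp for larger amounts, which is one common choice rather than the only one.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not taken from risc_68k.h: append the 68k instruction
// that releases 'bytes' of stack after a _cdecl call. Opcode words are
// stored big-endian, as the 68k fetches them.
inline void emit_68k_stack_release(std::vector<std::uint8_t>& code, unsigned bytes) {
    assert(bytes <= 0x7FFF);  // must fit the signed 16-bit displacement of lea
    auto word = [&code](std::uint16_t w) {
        code.push_back(static_cast<std::uint8_t>(w >> 8));
        code.push_back(static_cast<std::uint8_t>(w & 0xFF));
    };
    if (bytes == 0) return;  // no parameters, nothing to release
    if (bytes <= 8) {
        // addq.l #bytes,sp : 0101 ddd 0 10 001 111, ddd = bytes (8 is encoded as 0)
        word(static_cast<std::uint16_t>(0x508F | ((bytes & 7u) << 9)));
    } else {
        // lea d16(sp),sp : opcode word 0x4FEF followed by the displacement
        word(0x4FEF);
        word(static_cast<std::uint16_t>(bytes));
    }
}
```

For the four-parameter example above (16 bytes) this emits 4F EF 00 10, and for a single parameter it emits 58 8F (addq.l #4,sp).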

I'll attach the risc_68k.h file I have created. It's still under construction but it might be useful as 68k assembler is very easy to read. It's possible someone may spot errors as well.

Attachments

  • Filename
    risc_68k.h
    File size
    20.95 KiB
    Downloads
    141 downloads
    File license
    Fair use/fair dealing exception

Reply 9 of 15, by Matt Hey

Rank Newbie

I think I've figured out how to make _fastcall, _stdcall, and _cdecl compatible using the same interface. I've defined an optional new variable called DRC_STACKPARM for risc_anyCPU.h that pushes the parameters missing for _cdecl and _stdcall to the stack in decoder_basic.h gen_call_function_R3(), gen_call_function_m() and gen_call_function_mm(). The code remains the same for _fastcall, which I think works (though I think it crashes if you turn _fastcall off in risc_x86.h). I improved the comments some too 😉. There are still a lot of functions in decoder_opcodes.h that are not using this interface. They put parameters in FC_OP1 and FC_OP2 registers instead of on the stack before calling functions from the dynrec buffer.

Attachments

  • Filename
    decoder_basic.h
    File size
    44.39 KiB
    Downloads
    108 downloads
    File license
    Fair use/fair dealing exception
  • Filename
    risc_68k.h
    File size
    21.15 KiB
    Downloads
    113 downloads
    File license
    Fair use/fair dealing exception

Reply 10 of 15, by M-HT

Rank Newbie

Just an idea based on how some x86 compilers optimize function calls:
Instead of pushing parameters on the stack and then reclaiming the stack after the function call, the idea is to allocate the space for 4 parameters on the stack (when entering dynarec), and when you call a function you move the parameters into this space. (The stack space is reclaimed when exiting dynarec.)

The function call would look like this:

move par4 to [sp+12]
move par3 to [sp+8]
move par2 to [sp+4]
move par1 to [sp]
call function

where sp is the stack pointer.

I haven't thought about this much - whether it's better or not (and I didn't read your code), but it can be implemented in the 68k backend without changing other dosbox code.

Reply 11 of 15, by Matt Hey

Rank Newbie

@M-HT
I believe the stack on both the 68k and x86 grows downward (toward lower addresses).

The x86:

push par1

is actually:

move par1,-(sp)

And pop is:

move (sp)+,par1

Where sp is the stack pointer and we use () instead of the x86 [] to represent memory indirect (we do use () and [] in double memory indirect addressing modes). This is the notation we use on the 68k as we don't have pop and push instructions (although macros are easy enough). It looks like your example would overwrite data already on the stack. You would have to write your new data to negative offsets, maybe like this:

move par4 to [sp-20]
move par3 to [sp-16]
move par2 to [sp-12]
move par1 to [sp-8]
;[sp-4] reserved for return PC from function call
call function

The return instruction would pop off the value at [sp-4], restoring the stack correctly, but the parameters would be at a negative offset instead of a positive offset. We can't easily change the way the parameters are received in the function calls. Even if this were workable somehow, it would not solve our problem of some of the parameters not being pushed on the stack.

For example, gen_call_function_R3() in decoder_basic.h is used for calling functions with 3 parameters, but only 1 is pushed on the stack, and then paramcount=3 parameters are deallocated from the stack when all parameters are stack based. Originally, I made paramcount=1, which fixed the stack problem, but then the function read 3 parameters from the stack when only 1 had been passed, which causes memory corruption. This was wrong! DosBox was passing the first 2 parameters in registers sometimes, which only works with _fastcall and all-register calling conventions, where paramcount isn't used. The paramcount of gen_call_function_setup() is only needed by _cdecl (all stack) style parameters, and the fact that it exists means the interface was originally designed to handle stack based passing, as it serves no other purpose than to deallocate the stack after function calls. If all stack based parameter passing used to work, then partial register parameter passing also violates the original interface.

I don't think the devs even care. I have done all the fixes required myself with conditional preprocessor instructions, which is the least obtrusive way to handle this, and NovaCoder and I are testing now. I bet the devs won't even accept the work when it's done and working, but maybe it will get us up and running faster than creating assembler stubs or functions. Thanks for at least trying to help us.
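The gen_call_function_R3() mismatch described in this post (1 parameter pushed, paramcount=3) is the kind of error a simple counting check could catch while debugging. The C++ sketch below only counts; CallChecker is a hypothetical class, and the real gen_load_param_xxx() and gen_call_function_setup() emit machine code rather than updating a counter.

```cpp
#include <cassert>

// Hypothetical debug aid, not DOSBox code: counts parameter loads and checks
// them against the paramcount handed to the call setup.
class CallChecker {
public:
    // Stands in for one gen_load_param_xxx() call that pushes a parameter.
    void load_param() {
        assert(pushed_ < 16u);  // hypothetical sanity limit
        ++pushed_;
    }

    // Stands in for gen_call_function_setup(). Returns true when the number
    // of pushed parameters equals paramcount, then resets for the next call.
    bool call_setup(unsigned paramcount) {
        const bool consistent = (pushed_ == paramcount);
        pushed_ = 0;
        return consistent;
    }

private:
    unsigned pushed_ = 0;
};
```

With one load_param() followed by call_setup(3) the check fails; with three loads it passes.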

Reply 12 of 15, by NovaCoder

Rank Newbie

Using Matt's updates (above), DosBox AGA now produces this output:

DOSBOX_Init() - Adding Init Functions
DOSBox version 0.74
Copyright 2002-2010 DOSBox Team, published under GNU GPL.
---
Init SDL() - START
Init SDL() - END
CONFIG:Loading primary settings from config file dosbox.conf
Init all the sections
MIXER:No Sound Mode Selected.
MIDI:Opened device:none
CPU Message Set Video Mode 3
CPU_Core_Dynrec_Run - START
CPU_Core_Dynrec_Run - no block cache found
CPU_Core_Dynrec_Run - let the dynamic core handle this instruction
CPU_Core_Dynrec_Run - before runcode
CPU_Core_Dynrec_Run - after runcode
CPU_Core_Dynrec_Run - no block cache found
CPU_Core_Dynrec_Run - let the dynamic core handle this instruction
CPU_Core_Dynrec_Run - before runcode
CPU_Core_Dynrec_Run - after runcode
CPU_Core_Dynrec_Run - START
CPU_Core_Dynrec_Run - no block cache found
CPU_Core_Dynrec_Run - let the dynamic core handle this instruction
CPU_Core_Dynrec_Run - before runcode
CPU_Core_Dynrec_Run - after runcode
CPU_Core_Dynrec_Run - START
CPU_Core_Dynrec_Run - before runcode
CPU_Core_Dynrec_Run - after runcode
Illegal read from ffffff91, CS:IP 188:ffffe711
DYNREC:Can't run code in this page
Illegal read from ffffff91, CS:IP 188:ffffe711
Illegal read from ffffff92, CS:IP 188:ffffe711
Illegal read from ffffff93, CS:IP 188:ffffe713
Illegal read from ffffff94, CS:IP 188:ffffe713
Illegal read from ffffff95, CS:IP 188:ffffe715
Illegal read from ffffff96, CS:IP 188:ffffe715
Illegal read from ffffff97, CS:IP 188:ffffe717
Illegal read from ffffff98, CS:IP 188:ffffe717
Illegal read from ffffff99, CS:IP 188:ffffe719
Illegal read from ffffff9a, CS:IP 188:ffffe719
Illegal read from ffffff9b, CS:IP 188:ffffe71b
Illegal read from ffffff9c, CS:IP 188:ffffe71b
Illegal read from ffffff9d, CS:IP 188:ffffe71d
Illegal read from ffffff9e, CS:IP 188:ffffe71d
Illegal read from ffffff9f, CS:IP 188:ffffe71f
Illegal read from ffffffa0, CS:IP 188:ffffe71f
Illegal read from ffffffa1, CS:IP 188:ffffe721
Illegal read from ffffffa2, CS:IP 188:ffffe721
Illegal read from ffffffa3, CS:IP 188:ffffe723
Illegal read from ffffffa4, CS:IP 188:ffffe723
Illegal read from ffffffa5, CS:IP 188:ffffe725
Illegal read from ffffffa6, CS:IP 188:ffffe725
Illegal read from ffffffa7, CS:IP 188:ffffe727
Illegal read from ffffffa8, CS:IP 188:ffffe727
Illegal read from ffffffa9, CS:IP 188:ffffe729
Illegal read from ffffffaa, CS:IP 188:ffffe729
Illegal read from ffffffab, CS:IP 188:ffffe72b
Illegal read from ffffffac, CS:IP 188:ffffe72b
Illegal read from ffffffad, CS:IP 188:ffffe72d
Illegal read from ffffffae, CS:IP 188:ffffe72d
Illegal read from ffffffaf, CS:IP 188:ffffe72f
Illegal read from ffffffb0, CS:IP 188:ffffe72f
Illegal read from ffffffb1, CS:IP 188:ffffe731
Illegal read from ffffffb2, CS:IP 188:ffffe731
Illegal read from ffffffb3, CS:IP 188:ffffe733
Illegal read from ffffffb4, CS:IP 188:ffffe733
Illegal read from ffffffb5, CS:IP 188:ffffe735
Illegal read from ffffffb6, CS:IP 188:ffffe735
Illegal read from ffffffb7, CS:IP 188:ffffe737
Illegal read from ffffffb8, CS:IP 188:ffffe737
Illegal read from ffffffb9, CS:IP 188:ffffe739
Illegal read from ffffffba, CS:IP 188:ffffe739
Illegal read from ffffffbb, CS:IP 188:ffffe73b
Illegal read from ffffffbc, CS:IP 188:ffffe73b
Illegal read from ffffffbd, CS:IP 188:ffffe73d
Illegal read from ffffffbe, CS:IP 188:ffffe73d
Illegal read from ffffffbf, CS:IP 188:ffffe73f
Illegal read from ffffffc0, CS:IP 188:ffffe73f
Illegal read from ffffffc1, CS:IP 188:ffffe741
Illegal read from ffffffc2, CS:IP 188:ffffe741
Illegal read from ffffffc3, CS:IP 188:ffffe743
Illegal read from ffffffc4, CS:IP 188:ffffe743
Illegal read from ffffffc5, CS:IP 188:ffffe745
Illegal read from ffffffc6, CS:IP 188:ffffe745
Illegal read from ffffffc7, CS:IP 188:ffffe747
Illegal read from ffffffc8, CS:IP 188:ffffe747
Illegal read from ffffffc9, CS:IP 188:ffffe749
Illegal read from ffffffca, CS:IP 188:ffffe749
Illegal read from ffffffcb, CS:IP 188:ffffe74b
Illegal read from ffffffcc, CS:IP 188:ffffe74b
Illegal read from ffffffcd, CS:IP 188:ffffe74d
Illegal read from ffffffce, CS:IP 188:ffffe74d
Illegal read from ffffffcf, CS:IP 188:ffffe74f
Illegal read from ffffffd0, CS:IP 188:ffffe74f
Illegal read from ffffffd1, CS:IP 188:ffffe751
Illegal read from ffffffd2, CS:IP 188:ffffe751
Illegal read from ffffffd3, CS:IP 188:ffffe753
Illegal read from ffffffd4, CS:IP 188:ffffe753
Illegal read from ffffffd5, CS:IP 188:ffffe755
Illegal read from ffffffd6, CS:IP 188:ffffe755
Illegal read from ffffffd7, CS:IP 188:ffffe757
Illegal read from ffffffd8, CS:IP 188:ffffe757
Illegal read from ffffffd9, CS:IP 188:ffffe759
Illegal read from ffffffda, CS:IP 188:ffffe759
Illegal read from ffffffdb, CS:IP 188:ffffe75b
Illegal read from ffffffdc, CS:IP 188:ffffe75b
Illegal read from ffffffdd, CS:IP 188:ffffe75d
Illegal read from ffffffde, CS:IP 188:ffffe75d
Illegal read from ffffffdf, CS:IP 188:ffffe75f
Illegal read from ffffffe0, CS:IP 188:ffffe75f
Illegal read from ffffffe1, CS:IP 188:ffffe761
Illegal read from ffffffe2, CS:IP 188:ffffe761
Illegal read from ffffffe3, CS:IP 188:ffffe763
Illegal read from ffffffe4, CS:IP 188:ffffe763
Illegal read from ffffffe5, CS:IP 188:ffffe765
Illegal read from ffffffe6, CS:IP 188:ffffe765
Illegal read from ffffffe7, CS:IP 188:ffffe767
Illegal read from ffffffe8, CS:IP 188:ffffe767
Illegal read from ffffffe9, CS:IP 188:ffffe769
Illegal read from ffffffea, CS:IP 188:ffffe769
Illegal read from ffffffeb, CS:IP 188:ffffe76b
Illegal read from ffffffec, CS:IP 188:ffffe76b
Illegal read from ffffffed, CS:IP 188:ffffe76d
Illegal read from ffffffee, CS:IP 188:ffffe76d
Illegal read from ffffffef, CS:IP 188:ffffe76f
Illegal read from fffffff0, CS:IP 188:ffffe76f
Illegal read from fffffff1, CS:IP 188:ffffe771
Illegal read from fffffff2, CS:IP 188:ffffe771
Illegal read from fffffff3, CS:IP 188:ffffe773
Illegal read from fffffff4, CS:IP 188:ffffe773
Illegal read from fffffff5, CS:IP 188:ffffe775
Illegal read from fffffff6, CS:IP 188:ffffe775
Illegal read from fffffff7, CS:IP 188:ffffe777
Illegal read from fffffff8, CS:IP 188:ffffe777
Illegal read from fffffff9, CS:IP 188:ffffe779
Illegal read from fffffffa, CS:IP 188:ffffe779
Illegal read from fffffffb, CS:IP 188:ffffe77b
Illegal read from fffffffc, CS:IP 188:ffffe77b
Illegal read from fffffffd, CS:IP 188:ffffe77d
Illegal read from fffffffe, CS:IP 188:ffffe77d
Illegal read from ffffffff, CS:IP 188:ffffe77f
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message CPU:LOCK
CPU Message Illegal Unhandled Interrupt Called 0
CPU_Core_Dynrec_Run - START
CPU_Core_Dynrec_Run - no block cache found
CPU_Core_Dynrec_Run - let the dynamic core handle this instruction
CPU_Core_Dynrec_Run - before runcode
CPU_Core_Dynrec_Run - after runcode
CPU_Core_Dynrec_Run - no block cache found
CPU_Core_Dynrec_Run - let the dynamic core handle this instruction
CPU_Core_Dynrec_Run - before runcode

Reply 14 of 15, by M-HT

Rank Newbie

Apparently I didn't explain the idea clearly enough.
I'll try it with examples (with x86 instructions, because I'm not familiar with 68k).

Currently you are calling functions like this:

push par4
push par3
push par2
push par1
call function
add esp, 4*4

You can achieve the same with this code:

sub esp, 4*4
mov [esp+3*4], par4
mov [esp+2*4], par3
mov [esp+4], par2
mov [esp], par1
call function
add esp, 4*4

But instead of using the instruction "sub esp, 4*4" (to reserve stack) before every function call, you execute it once, when entering dynrec code (in function gen_run_code).
And instead of using the instruction "add esp, 4*4" (to release stack) after every function call, you execute it once, when exiting dynrec code.

I hope I made it clear this time.
Reserving and releasing the stack (in function gen_run_code) might be a bit complicated, but it's doable.
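The reserve-once scheme can be modelled without any machine code. The C++ sketch below is a hypothetical simulation, not DOSBox code: ModelStack is a downward-growing stack of 32-bit slots, reserve() and release() play the role of the one-time "sub esp" and "add esp", and call_sum() stands in for a called function that reads its arguments just above the return address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a downward-growing stack of 32-bit slots.
struct ModelStack {
    std::vector<std::uint32_t> mem = std::vector<std::uint32_t>(64, 0);
    std::size_t sp = 64;  // slot index; 64 means the stack is empty

    void reserve(std::size_t slots) { sp -= slots; }  // "sub esp, slots*4" on entry
    void release(std::size_t slots) { sp += slots; }  // "add esp, slots*4" on exit

    // "mov [esp+n*4], v": store argument n in the reserved area.
    void store_arg(std::size_t n, std::uint32_t v) {
        assert(sp + n < mem.size());
        mem[sp + n] = v;
    }

    // Model a call: push a return address, let the callee read its arguments
    // just above it, then pop the return address again.
    std::uint32_t call_sum(std::size_t argc) {
        mem[--sp] = 0xDEADBEEF;  // placeholder return address
        std::uint32_t sum = 0;
        for (std::size_t i = 0; i < argc; ++i) sum += mem[sp + 1 + i];
        ++sp;  // "ret" pops the return address
        return sum;
    }
};
```

In this model the return address always lands one slot below the reserved area, so the stack pointer is the same before and after each call and no per-call cleanup is needed; whether that carries over to every exit path of the real gen_run_code() is not something the sketch can show.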

Reply 15 of 15, by Matt Hey

Rank Newbie
M-HT wrote:

Apparently I didn't explain the idea clearly enough.
I'll try it with examples (with x86 instructions, because I'm not familiar with 68k).

Currently you are calling functions like this:

push par4
push par3
push par2
push par1
call function
add esp, 4*4

Right.

M-HT wrote:

You can achieve the same with this code:

sub esp, 4*4
mov [esp+3*4], par4
mov [esp+2*4], par3
mov [esp+4], par2
mov [esp], par1
call function
add esp, 4*4

But instead of using the instruction "sub esp, 4*4" (to reserve stack) before every function call, you execute it once, when entering dynrec code (in function gen_run_code).
And instead of using the instruction "add esp, 4*4" (to release stack) after every function call, you execute it once, when exiting dynrec code.

Alright. I think I understand now. That would work if it is set up from 1 level of code and multiple functions are called without having to fix the stack after each one. However, calling the function here puts the return address on the stack, and it would not work at the level below without overwriting the return address. It might be possible to set it up at the start of the dynrec buffer code but there are multiple return points. It's a good assembler trick but it also has its limitations 😀

I fixed another bug and we now get to the flashing command prompt but there are still some problems. Keyboard entry still isn't working. We are getting close though 😎.