Nodes of Yesod : ZX Spectrum Next : Update #4
A Big Change Of Plans!
Astro Charlie has a close encounter
If you have read my previous updates, you'll know that Nodes of Yesod for the Speccy Next has so far been based on a C codebase. In those posts I discussed my progress in that direction, tool selection, and various other considerations.
I have to share with you that I've had a rethink and there's a big change of plans!
As I intend to make this version of Nodes freely available, there's no direct financial benefit to me, so building the game has to be a fun and enjoyable endeavor. While I've enjoyed certain aspects of getting a C codebase to compile and run on the Next, I've had a few nagging doubts. C was always going to be a 'heavy' way to do things on an 8-bit Z80 (even with a faster clock), and paging code in and out of 8K or 16K banks does not make things any simpler. More than that, though, I need to personally get something out of this: it needs to be a fun process, and I want to write some Z80 code!
What's The Plan?
Let me take you back in time. In 2003 I purchased a full version of IDA Pro from Hex-Rays, version 4.5.0.762. Perhaps I'll cover the specific use case for that purchase in a future post, but suffice to say there were a couple of 8-bit and 16-bit games I wanted to disassemble. And IDA Pro was the best possible disassembler tool available at that time.
IDA stands for 'interactive disassembler', which is what the tool does. A disassembler can be compared by analogy to a tool into which you could feed a cake, whereupon it will output the ingredients for that cake. Eggs, sugar, flour, etc. An 'interactive' disassembler can go much further and give you not only the ingredients but also the recipe for that cake, along with baking instructions. IDA happens to be extraordinarily good at this.
With IDA, you take a binary program (for example a memory dump of a Sinclair Spectrum game), load it into the disassembler, and then interactively recreate the program symbols, functions, branches, data areas, and data types that were present in the original source code (which may have been lost forever, as is the case for Nodes of Yesod). When a memory location is identified as a variable, 'health' for example, you can designate that memory address as a named 'label', and then IDA automatically changes any references to that memory location into a reference to that label. The same goes for functions: IDA will change any subroutine call into a call to a named label.
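The label substitution described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how IDA works internally; the addresses, the label names, and the `apply_labels` helper are all hypothetical.

```python
import re

def apply_labels(lines, labels):
    """Replace $hhhh address operands with names from the labels dict."""
    def swap(match):
        address = int(match.group(1), 16)
        # Leave the operand untouched when no label is known for it.
        return labels.get(address, match.group(0))
    return [re.sub(r"\$([0-9a-fA-F]{4})\b", swap, line) for line in lines]

# Hypothetical addresses and names, for illustration only.
labels = {0x8000: "health", 0x9000: "draw_player"}
raw = ["ld a,($8000)", "call $9000", "ld hl,$4000"]
named = apply_labels(raw, labels)
```

Here `named` holds `ld a,(health)`, `call draw_player`, and the unchanged `ld hl,$4000`, since no label was assigned to that last address.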
I've used this fantastic tool for various projects, but I have never upgraded it because, in addition to being extraordinarily good at what it does, it is extraordinarily expensive (it's about $2,000)!
One thing I have done, in the intervening years, sporadically, is to create a disassembly of Nodes of Yesod, which I'd guess is 90% plus complete. Not 100%, but close. You may be able to see what is coming at this point - I'm going to upgrade IDA Pro to the latest version (the original version is a DOS program) and complete the disassembly, right?
Wrong!
I'm not about to spend $$$ on the current version of IDA Pro (though I did flirt with the idea!), but there is an alternative.
Ghidra
Ghidra, interactively disassembling Nodes of Yesod
If one thing is certain, it is that there is no certainty in life. While it was not certain that I could do what I wanted to do, I spent a couple of weeks evaluating Ghidra to continue the disassembly of Nodes ('continue' meaning 'start over' using the disassembly generated in IDA Pro as a guide) and things seemed promising; there appeared to be light at the end of the tunnel. Or else, there was a train headed my way and I was about to be flattened!
Ghidra is an interesting beast, in some ways similar to IDA, but open source. It is a creation of the NSA (yes, that NSA), and it is used to find vulnerabilities in binary code that is out in the wild, and to dissect malware. It's probably used to create malware too if we're honest. It is sophisticated, and I needed a mere fraction of its capabilities. For example, as you interactively disassemble binary code, Ghidra creates a C-like implementation of each function which could, theoretically, be used to reimplement the binary code in C. I did not need anything like that; I merely needed the Z80 disassembly, with the goal of generating a buildable set of source code. Would it be possible?
The Benefits
With this approach, I'd end up, if all went to plan, with a complete source code representation of Nodes. Not the original source code, but something that I could assemble with modern-day tools. From there, I could start to support Next-specific changes needed to make the updates I want. Additionally, it'd allow me to start from a point of a working game, and then add to it. For example, replacing the sprites with hardware sprites while having a playable game. And, I'd be able to debug the Z80 code directly without the additional C abstractions getting in the way.
But there would be other benefits. I'd like to publish the source code for the original, unmodified game. And an assembly source code version of the game would make it easier to target other 8-bit or 16-bit machines that never had a version of Nodes. NES perhaps? I've never developed for the non-Z80 8-bitters so that might be an interesting challenge. And if this works out, I could potentially do the same with other Speccy games. For example, I have a substantially complete disassembly of Crosswize, again in IDA Pro.
Another benefit would be that by starting with a complete disassembly of the original game, it would be possible to release a new version of Nodes for the Next sooner and incrementally release upgrades. That appeals to me because I'd be able to get software into people's hands sooner.
What's Stopping Me?
Creating a complete disassembly of anything is a ton of work, but that is not something I find offputting. As I said, it was not 100% certain I could generate the buildable source code I need from Ghidra. After diving into the disassembly, and doing a few experimental exports, I had yet to prove that the output was viable, meaning I could feed it into a tool like SNasm and get out of SNasm something identical, or at least very similar, to the original Nodes binary.
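The viability check described here boils down to comparing the assembler's output with the original binary, byte for byte. A minimal sketch of such a check is below; the `first_difference` and `compare_files` helpers and any file names passed to them are my own illustrative assumptions, not part of SNasm or Ghidra.

```python
from pathlib import Path

def first_difference(original: bytes, rebuilt: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(original) != len(rebuilt):
        return min(len(original), len(rebuilt))
    return None

def compare_files(original_path, rebuilt_path):
    """Compare two binary files (paths are supplied by the caller)."""
    return first_difference(Path(original_path).read_bytes(),
                            Path(rebuilt_path).read_bytes())
```

A result of `None` means the rebuilt binary is identical to the original; any other value is the offset at which to start looking in the listing.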
So Where Are We At?
TLDR, it's done! I have 100% completed the disassembly of Nodes of Yesod, written an output formatter for Ghidra, in Python, that generates SNasm-compatible source code with no hand modifications, assembled that code with SNasm, and created a binary identical to the one I started disassembling! And it works! Here is the resultant binary running in CSpect (with commentary by yours truly).
The output formatter currently:
- Generates an equates table at the top of the source file
- Outputs all equate definitions and references in ALL_CAPS to distinguish equates
- Fixes all the disassembly errors encountered so far, including bit/res/set (hl), jp (hl)/(ix)/(iy), add ix,iy, and more. These fixes should be in the core disassembler but they're fixed at the point of generating the listing, for now.
- Honors function and end-of-line comments and outputs them into the source
- Co-opts two of the Ghidra comment modes (pre and post comments) to output pseudo assembler instructions into the code (this could be expanded but is functional for now):
    opt z80
    device zxspectrum48
    org $5b00
- Adds a trailing colon to labels
- Formats hexadecimal numbers in SNasm format ($0000)
main_entry:
    di
    ld sp,irq_entry
    ld a,$39
    ld i,a
    im 2 ; im2
    ld a,JR_OPCODE
    ld (irq_branch),a
    ld a,JP_OPCODE
    ld hl,irq_service
    ld (irq_entry),a
    ld (irq_address),hl
    call initialize_game
    ei
main_loop:
    ld a,(time_hours)
    cp $10
    jr c,lab_ram_7dcf
    ld a,$01
    jp game_over
lab_ram_7dcf:
    ld ix,player_scb
    call check_room_switch
    call update_items
    ld ix,gravstick_scb
gravstick_counter:
    ld hl,$00
    ld a,h
    or l
    jr nz,lab_ram_7e30
    ld a,(player_flags)
    and $03
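To make a few of the formatting rules listed above concrete, here is a rough Python sketch of the equates table, ALL_CAPS equate names, `$`-prefixed hexadecimal, and trailing colons on labels. It is not the actual Ghidra script and uses no Ghidra API; every function name is hypothetical, and the two-digit versus four-digit hex width is an assumption based on the listing above.

```python
def format_hex(value):
    """Format a number as a $-prefixed hex literal, e.g. $39 or $5b00."""
    width = 2 if value < 0x100 else 4  # assumed: byte values print as two digits
    return f"${value:0{width}x}"

def format_label(name):
    """Labels get a trailing colon."""
    return f"{name}:"

def format_equates(equates):
    """Emit the equates table with names in ALL_CAPS."""
    return [f"{name.upper()} equ {format_hex(value)}"
            for name, value in equates.items()]

def build_listing(equates, origin, body):
    """Equates first, then the org directive, then labelled code."""
    lines = format_equates(equates)
    lines.append(f"    org {format_hex(origin)}")
    for label, instructions in body:
        lines.append(format_label(label))
        lines.extend(f"    {text}" for text in instructions)
    return lines

# $18 and $c3 are the Z80 opcodes for JR e and JP nn.
listing = build_listing(
    {"jr_opcode": 0x18, "jp_opcode": 0xC3},
    0x5B00,
    [("main_entry", ["di", "ld a,JR_OPCODE"])],
)
```

Joining `listing` with newlines gives a small source fragment in the same shape as the excerpt above.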
What's Next?
For example:
SPRITE_Y equ 2
SPRITE_X equ 1
SPRITE_DRAWH equ 15
ENEMY_THING equ 3
SPRITE_WIDTH equ 3
SPRITE_DRAWL equ 14

ENEMY_THING equ 3
SPRITE_X equ 1
SPRITE_Y equ 2
SPRITE_WIDTH equ 3
SPRITE_DRAWL equ 14
SPRITE_DRAWH equ 15
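The second listing looks like the first one grouped by name prefix and then ordered by value within each group. The sketch below reproduces that ordering under this assumption; the rule is inferred from the two listings only, and `order_equates` is a hypothetical name.

```python
def order_equates(equates):
    """Group equates by the text before the first underscore, then sort by value."""
    def key(item):
        name, value = item
        prefix = name.split("_", 1)[0]
        return (prefix, value)
    return sorted(equates, key=key)

unordered = [
    ("SPRITE_Y", 2),
    ("SPRITE_X", 1),
    ("SPRITE_DRAWH", 15),
    ("ENEMY_THING", 3),
    ("SPRITE_WIDTH", 3),
    ("SPRITE_DRAWL", 14),
]
ordered = order_equates(unordered)
source_lines = [f"{name} equ {value}" for name, value in ordered]
```

With the six equates above, `source_lines` comes out in the same order as the second listing.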
Addendum
i. To those wondering why I don't use the original source code, the answer is simple: it, along with all other source code pre-1992, was 'lost'. Perhaps I will cover that in a future post.
ii. Yes, it has taken me longer to get to this point than it took to write the original game! My hope is that this effort will pay off and the approach can be applied to other games in the future.
Comments
https://colourclash.co.uk/spectrum-analyser/
Great insights. It's really interesting to me, being able to see the process and how you are going about it.
Have you been able to spend any more time on it?
Matt.
Looking forward to yours. Thanks for taking the time to reply. :)
Matt
I'm really interested to know if you have made any further progress?
All the best
Jace
Is it still going to be available as a free offer for KS1 backers, as was suggested in the Kickstarter stretch goals?