Immediate string values in SLEIGH #5070

perryprog · 2023-03-06T20:19:42Z

Hi,

I'm working on building a language for the Z-machine, a virtual machine designed for interactive fiction. (Specifically, v5.) One thing I've noticed that I'm unsure how to approach is that the bytecode allows immediate strings for printing. These strings are delimited only by an end marker and don't carry any length field. I haven't a clue how to properly model this in SLEIGH, if it's even possible.

The strings themselves are sequences of two-byte words, each packing three 5-bit characters plus one additional bit that marks whether that word is the last one in the string. (Spec)
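To make the layout concrete, here is a minimal Python sketch of how one such two-byte word could be split into its fields. It assumes the word is read big-endian and that the end marker is the top bit (bit 15), with the three characters in the remaining 15 bits from high to low; the function name `unpack_zword` is made up for illustration and is not part of any tool.

```python
def unpack_zword(word: int) -> tuple[bool, tuple[int, int, int]]:
    """Split one 16-bit string word into (is_last, three 5-bit characters).

    Assumed layout (not confirmed in this thread): bit 15 is the end
    marker, bits 14-10, 9-5 and 4-0 hold the three characters in order.
    """
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit value")
    is_last = bool((word >> 15) & 1)
    chars = ((word >> 10) & 0x1F, (word >> 5) & 0x1F, word & 0x1F)
    return is_last, chars


# Build a word by hand: end bit set, characters 1, 2 and 3.
sample = 0x8000 | (1 << 10) | (2 << 5) | 3
is_last, chars = unpack_zword(sample)
```

The 5-bit values are alphabet indices rather than ASCII codes, so turning them into readable text still needs the alphabet and shift handling from the spec.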

The opcodes are print (178) and print_ret (179), and the specification for how instructions are encoded is here. There's also an example disassembly of a print instruction at the bottom of that page.

Is this something that's actually possible to model in SLEIGH? And if not, is there a good way to work around it in a way that still gives a good clue as to what's going on once I'm viewing a file in Ghidra?

GhidorahRex (Collaborator) · 2023-03-06T22:12:39Z

I don't have a solid solution yet, but several of our processors have input that can be variable-length, typically register lists. ARM does this in several places. You should be able to model the print piece effectively by using context to keep track of the current size of the string, plus some bit manipulation to determine whether you're at the end or not.
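Outside of SLEIGH, the "keep a running size and test a bit" idea can be sketched in a few lines of Python. This is only an illustration of the scan, not SLEIGH and not how any existing Ghidra processor module does it. It assumes the opcode byte (178 or 179) is followed directly by the string, that words are big-endian, and that the end marker is the top bit of each word; `print_instruction_length` is a hypothetical name.

```python
PRINT, PRINT_RET = 178, 179  # opcode numbers quoted in the question


def print_instruction_length(code: bytes, offset: int) -> int:
    """Return the total size in bytes of a print/print_ret at `offset`.

    Assumptions: one opcode byte, then big-endian two-byte words, the
    last of which has its top bit set.
    """
    if code[offset] not in (PRINT, PRINT_RET):
        raise ValueError("not a print or print_ret opcode")
    pos = offset + 1  # running position plays the role of the context value
    while True:
        if pos + 2 > len(code):
            raise ValueError("string runs past the end of the data")
        word = int.from_bytes(code[pos:pos + 2], "big")
        pos += 2
        if word & 0x8000:  # end marker found, stop consuming words
            break
    return pos - offset


# Opcode byte, one word without the end bit, one word with it.
length = print_instruction_length(bytes([178, 0x11, 0xAA, 0xC6, 0x34]), 0)
```

A script like this could also serve as a workaround on the Ghidra side, for example to precompute string extents before disassembly, though that is a guess at a workflow rather than something tested here.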

Implementing the pcode is more difficult. I would use a pcodeop to handle it, but as far as what to pass in, I'm not sure. You could use a fake memory space to write the string values to. I don't know how that would look in the decompiler. It might require some experimentation.

1 reply

perryprog Mar 15, 2023
Author

Oh, awesome—I was hoping there'd be an existing processor that would have something like this, but I didn't immediately see any.

Thanks for the idea on using context! That'll probably be enough to get me started (when I have time to get back to this side project, anyway 🙃.)

I'm less worried about getting the decompiler output right, so I wouldn't be too bothered if the pcode doesn't work out.
