Go Roughly Into That Good Light
What’s the difference between “software hacking” and “computer science”? I’d like to think that the more ‘science’ oriented part of writing code has to do with formal methods, algorithmic purity, and systems theories and whatnot. A lot more specification, a lot more academic, a lot less code. Then there’s how we generate 99% of the code that actually gets used on a daily basis. If we’re the creators of core systems, OS kernels, UI libraries, and the like, there might be more ‘science’ involved. As most of us are just using those libraries and systems, our processes are about library discovery, reading incomplete documentation, throwing something together, seeing how it works, and trying different parameters until we get a feel for what works, and what doesn’t. I’d call this “hacking”.
I’m embarking on a prime hack, and I know that going in. I mentioned the japajoe/rekt project, and I want to explain that a little bit more.
What I’m really after is a way to run JIT code in my environment, so that I can ‘compile’ at runtime, and get code that is as fast as ‘native’ code, without much compromise, and without having to include a heavy duty compiler runtime with the app.
OK, so I’ve already introduced a bunch of hand wave words, what do they mean?
JIT – Just In Time. The notion that you can generate native code, at runtime, Just In Time. I mean, the machine you’re running on executes a bunch of numbers. Assembling that set of numbers to actually implement functions and data structures, is what the ‘compiler’ is all about. But, what if you don’t have a compiler at hand? Is there a way to just generate the numbers the machine will understand as a program, and execute it? Yes, and that’s what JIT is all about. One way or another, creating the binary set of numbers that represent a runnable program, during runtime.
That set of numbers is CPU-specific, of course. Intel and AMD chips understand the x86/x64 instruction codes, ARM has its own instructions, and RISC-V has yet another set. Generating the correct CPU-specific codes from a single higher-level source is what compilers and higher-level languages are all about. When I write in C, I don’t really think about the underlying CPU implementation; I just assume my compiler performs the proper incantations, and magic happens.
But I need to drop down a level from there. I don’t want to run a higher-level language, with the attendant compiler, just yet. I want to start much simpler. First of all, by using japajoe/rekt, I am signing up to a certain representation of a ‘CPU’. In the case of rekt, the CPU in question has a fixed set of registers, a stack, and memory. That’s it. It does not contain all the instructions of an x86 machine. In fact, it only has the bare minimum: the ability to set a value in a register, push and pop from a stack, and call functions.
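To make that concrete, here is a minimal sketch of what such a bare-bones CPU emulator could look like. Everything in it (the `Op` enum, `Instr`, `ToyCpu`, the choice of four registers) is invented for illustration and is not rekt's actual design or encoding; it only shows the fetch/execute loop over "set a register, push, pop, halt".

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical opcodes for a toy CPU; NOT rekt's real instruction set.
enum class Op { Mov, Push, Pop, Hlt };

struct Instr {
    Op op;
    int reg;           // register index (used by Mov, Push, Pop)
    std::int64_t imm;  // immediate value (used by Mov only)
};

struct ToyCpu {
    std::array<std::int64_t, 4> regs{};  // a fixed set of registers, zeroed
    std::vector<std::int64_t> stack;     // the stack

    // Fetch and execute instructions until Hlt or the end of the program.
    void Run(const std::vector<Instr>& program) {
        for (std::size_t pc = 0; pc < program.size(); ++pc) {
            const Instr& in = program[pc];
            switch (in.op) {
            case Op::Mov:   // set a value in a register
                regs.at(in.reg) = in.imm;
                break;
            case Op::Push:  // copy a register onto the stack
                stack.push_back(regs.at(in.reg));
                break;
            case Op::Pop:   // move the top of the stack into a register
                if (stack.empty()) throw std::runtime_error("stack underflow");
                regs.at(in.reg) = stack.back();
                stack.pop_back();
                break;
            case Op::Hlt:   // stop executing
                return;
            }
        }
    }
};
```

Running the program `Mov r0, 42; Push r0; Pop r1; Hlt` through `ToyCpu::Run` moves the value from register 0 to register 1 by way of the stack. Anything placed after the `Hlt` is never executed.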
This is a good start, because I get to learn how to write an emulator and re-learn how to write assembly, without the headache of the hundreds of instructions that exist in modern CPUs. So, rekt is a good way to go. When I want to execute a bit of code at runtime, I write it in assembly, and execute it. Here’s an example.
section data
helloText db : "Hello, world!\n"
section text
push helloText
call printf
hlt
These are the well-known things the rekt CPU supports. There is a ‘data’ section of the code, which contains things like fixed strings and other kinds of constants and variables. The ‘text’ section is where the actual program begins. Here we see the ‘helloText’ location being pushed onto the stack, followed by ‘call printf’, which invokes the printf() function that has already been registered with the CPU runtime. Then the program halts (‘hlt’).
So, one way of executing functions, is to follow this same “registration of functions” methodology.
void SystemModule::Register()
{
    RegisterFunction("printf", SystemModule::PrintF);
    RegisterFunction("fgets", SystemModule::FGets);
    RegisterFunction("timestamp", SystemModule::TimeStamp);
    RegisterFunction("malloc", SystemModule::Malloc);
    RegisterFunction("memset", SystemModule::MemSet);
    RegisterFunction("memcpy", SystemModule::MemCpy);
    RegisterFunction("free", SystemModule::Free);
    RegisterFunction("meminc", SystemModule::MemInc);
    RegisterFunction("memdec", SystemModule::MemDec);
    RegisterFunction("memadd", SystemModule::MemAdd);
    RegisterFunction("memsub", SystemModule::MemSub);
    RegisterFunction("to_int64_ptr", SystemModule::ToInt64Pointer);
    RegisterFunction("to_uint64_ptr", SystemModule::ToUInt64Pointer);
    RegisterFunction("to_double_ptr", SystemModule::ToDoublePointer);
    RegisterFunction("to_void_ptr", SystemModule::ToVoidPointer);
}
Just match a name, to an internally implemented function. That printf looks something like:
int SystemModule::PrintF(Stack<Object> *stack)
{
    size_t stackCount = stack->GetSize();

    if (stackCount == 0)
    {
        std::cout << std::endl;
        return 0;
    }
    else if (stackCount == 1)
    {
        Object obj;
        if (!stack->Pop(obj))
        {
            return -1;
        }
        std::cout << obj.ToString();
        return 0;
    }
    else
        // ...
Notice the function signature. It takes a single parameter, a ‘Stack<Object> *stack’, and returns an integer. This establishes the Application Binary Interface (ABI), the lowest level of interface for machine code, in this case for our CPU emulator. This is different from what x86/x64 might really expect, but we’re not concerned with that right now. All we need to know is that if we want to make something available to code that’s inside a .asm file, we need to register functions in this way. Then the .asm code will call it by placing items on the stack and making the function call.
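As a rough sketch of how that name-to-function registration might hang together, here is a self-contained version. The types are stand-ins of my own: `ValueStack` replaces rekt's `Stack<Object>`, and `Registry`, `HostFn`, `PrintTo`, and `RunHelloDemo` are hypothetical names, not rekt's API. The one thing carried over from the real code is the shape of the signature: take a pointer to the stack, return an int.

```cpp
#include <functional>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Stand-ins for rekt's Stack<Object>; the real types are richer.
using Value = std::string;
using ValueStack = std::vector<Value>;

// Every host function shares one signature: stack in, status code out.
using HostFn = std::function<int(ValueStack*)>;

class Registry {
public:
    void RegisterFunction(const std::string& name, HostFn fn) {
        table_[name] = std::move(fn);
    }

    // Returns the callee's status code, or -1 when the name is unknown.
    int Call(const std::string& name, ValueStack* stack) const {
        auto it = table_.find(name);
        if (it == table_.end()) return -1;
        return it->second(stack);
    }

private:
    std::map<std::string, HostFn> table_;
};

// A printf-like host function: pops one value and writes it to `out`.
int PrintTo(std::ostream& out, ValueStack* stack) {
    if (stack->empty()) return -1;
    out << stack->back();
    stack->pop_back();
    return 0;
}

// What "push helloText / call printf" boils down to under these assumptions.
std::string RunHelloDemo() {
    std::ostringstream out;
    Registry reg;
    reg.RegisterFunction("printf",
                         [&out](ValueStack* s) { return PrintTo(out, s); });

    ValueStack stack;
    stack.push_back("Hello, world!\n");  // push helloText
    reg.Call("printf", &stack);          // call printf
    return out.str();
}
```

Writing to a `std::ostream` rather than straight to `std::cout` is just a convenience so the result can be inspected; `RunHelloDemo()` hands back whatever the registered function printed.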
There’s a little bit too much intimate knowledge required here though. In which order do parameters go on the stack? Is the stack cleared within the callee function before it returns? Is the return value returned on the stack, or in one of the registers (EAX on x86, for example)? All this kind of information goes into the ABI specification, and allows systems of different origins to interoperate with each other.
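To see why those questions matter, here is a small sketch that picks one possible set of answers. The convention below is purely an assumption for illustration (I have not checked it against what rekt actually does), and `IntStack` and `Sub` are made-up names. Subtraction is a handy test case because getting the argument order wrong gives a visibly different answer.

```cpp
#include <cstdint>
#include <vector>

using IntStack = std::vector<std::int64_t>;

// Assumed convention (illustrative only):
//   1. The caller pushes arguments left to right, so the LAST argument is on top.
//   2. The callee pops its own arguments off the stack.
//   3. The callee pushes its result and returns 0, or returns -1 on error
//      and leaves the stack untouched.
int Sub(IntStack* stack) {
    if (stack->size() < 2) return -1;

    std::int64_t rhs = stack->back();  // second argument was pushed last
    stack->pop_back();
    std::int64_t lhs = stack->back();  // first argument sits underneath
    stack->pop_back();

    stack->push_back(lhs - rhs);       // result goes back on the stack
    return 0;
}
```

With this convention, pushing 10 and then 3 before calling `Sub` leaves 7 on the stack. If the callee assumed the opposite order, the same two pushes would produce -7, which is exactly the kind of mismatch an ABI specification exists to prevent.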
The next step I want to reach towards is support for actual native code, and supporting an actual well-known ABI. In order to do that, I’m going to leverage AsmJit (https://github.com/asmjit/asmjit). AsmJit is great. You can easily generate low-level code at runtime, and it is instantly runnable, because you’re writing in ‘assembly’, which is immediately turned into native CPU codes. AsmJit takes care of things like ABI compliance, and even knows about a couple of CPU architectures. That’s the ultimate goal, but it also requires a lot more organization on my part, which I’m not ready for.
So, I want to start with rekt, because I want to re-familiarize myself with writing assembly, by stitching it into my own runtime system, stitch by stitch.