Malbolge
0.0.1
Malbolge virtual machine
A virtual machine to execute Malbolge programs, written in C++20 and dependent on Boost v1.73.
Only tested using g++ v10.0.1 on Linux.
You can load from a file like this:
The file extension is ignored. Alternatively, you can pipe the output of another command into it:
Like any terminal application, you access help with `--help`. You can also get log output (on stderr) by increasing the number of `-v` switches, e.g.:
You can increase logging up to 3 levels. As it is output into the error stream, you can split the logging output into a separate file to preserve the program output:
API documentation for Malbolge is available here.
Malbolge is named after Malebolge, the eighth circle of Hell in Dante's Inferno. This sets the scene.
Invented in 1998 by Ben Olmstead, Malbolge isn't so much a programming language for humans as a machine language for a fictional 10-trit ternary-based virtual CPU with 3 registers and a fixed-size memory block.
A trit is a single unit of state in the Malbolge VM; it holds 1 of 3 possible states: `0`, `1`, or `2`. In other words, it is the 3-state equivalent of a bit. A ternary is simply an unsigned integer made from trits.
Malbolge has a single native type, a 10-trit ternary (hereafter referred to as just 'ternary'). 3^10 gives you 59049 possible values, in the range [0, 59048]. The values wrap upon under- and overflow.
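As a rough illustration of the wrapping behaviour, a ternary can be modelled as an unsigned integer reduced modulo 3^10. This is only a sketch: the names `ternary_count`, `wrap_add`, `wrap_sub` and `to_trits` are hypothetical and are not taken from this project's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Number of distinct 10-trit values: 3^10 = 59049.
constexpr std::uint32_t ternary_count = 59049;

// Add two ternaries, wrapping on overflow.
constexpr std::uint32_t wrap_add(std::uint32_t lhs, std::uint32_t rhs)
{
    return (lhs % ternary_count + rhs % ternary_count) % ternary_count;
}

// Subtract one ternary from another, wrapping on underflow.
constexpr std::uint32_t wrap_sub(std::uint32_t lhs, std::uint32_t rhs)
{
    return (lhs % ternary_count + ternary_count - rhs % ternary_count) % ternary_count;
}

// Render a ternary as its 10 trits, most significant first.
std::string to_trits(std::uint32_t value)
{
    std::string trits(10, '0');
    for (std::size_t i = 10; i-- > 0;) {
        trits[i] = static_cast<char>('0' + value % 3);
        value /= 3;
    }
    return trits;
}
```

With these definitions, adding 1 to 59048 wraps to 0, and subtracting 1 from 0 wraps to 59048.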
The Malbolge virtual CPU is accompanied by a fixed size block of memory that holds 59049 addressable cells with each big enough to hold a single 10-trit ternary.
Because the size of the virtual memory is fixed by the language itself, Malbolge cannot be considered Turing-complete.
The virtual CPU has 3 ternary registers:

- `a`: 'Accumulator', a general-purpose register
- `c`: 'Code pointer', holds the memory location of the current or next instruction
- `d`: 'Data pointer', holds the memory location of the current or next data value

When a CPU cycle begins, the value at the code pointer needs to undergo a simple ciphering in order to extract the actual value:
The subtraction of 33 is why `*c` must first be checked to see if it lies in the graphical ASCII range ([33, 126]); if it doesn't, it is an error and the program must abort. `i` is an index into the pre-cipher table:
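The pre-ciphering step described above could be sketched as follows. It assumes the index is computed as `(*c - 33 + c) % 94` and uses the table as it appears in the original reference interpreter; both are assumptions drawn from that implementation rather than from this project's source, and should be checked against it. The names `pre_cipher` and `decode` are hypothetical.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Pre-cipher table (assumption: copied from the original reference
// interpreter; verify before relying on it). It has 94 entries, one per
// graphical ASCII character.
constexpr std::string_view pre_cipher =
    "+b(29e*j1VMEKLyC})8&m#~W>qxdRp0wkrUo[D7,XTcA\"lI"
    ".v%{gJh4G\\-=O@5`_3i<?Z';FNQuY]szf$!BS/|t:Pn6^Ha";

// Decode the cell at code pointer position c.
// Returns std::nullopt if the cell is outside the graphical ASCII range,
// in which case the program should abort.
constexpr std::optional<char> decode(std::uint32_t cell, std::uint32_t c)
{
    if (cell < 33 || cell > 126) {
        return std::nullopt;
    }
    // Assumed formula: shift into [0, 93], offset by the code pointer.
    const auto i = (cell - 33 + c) % 94;
    return pre_cipher[i];
}
```

Note that the same character decodes to a different instruction depending on where it sits in memory, because the code pointer is part of the index.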
Once processing of a CPU instruction has finished, the value needs to be ciphered again before being written back into `*c`. That's right, Malbolge modifies its code as it executes. What fun.
Much like the pre-ciphering, the subtraction of 33 is why `*c` must first be checked to see if it lies in the graphical ASCII range ([33, 126]); if it doesn't, it is an error and the program must abort. `i` is an index into the post-cipher table:
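A minimal sketch of the post-ciphering step, assuming the index is simply `*c - 33` and using the table as it appears in the original reference interpreter. Both are assumptions to verify against that implementation, and the names `post_cipher` and `encode` are hypothetical.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Post-cipher table (assumption: copied from the original reference
// interpreter; verify before relying on it). It has 94 entries.
constexpr std::string_view post_cipher =
    "5z]&gqtyfr$(we4{WP)H-Zn,[%\\3dL+Q;>U!pJS72FhOA1C"
    "B6v^=I_0/8|jsb9m<.TVac`uY*MK'X~xDl}REokN:#?G\"i@";

// Compute the value to write back into *c after an instruction has run.
// Returns std::nullopt if the cell is outside the graphical ASCII range,
// in which case the program should abort.
constexpr std::optional<std::uint32_t> encode(std::uint32_t cell)
{
    if (cell < 33 || cell > 126) {
        return std::nullopt;
    }
    return static_cast<unsigned char>(post_cipher[cell - 33]);
}
```

Unlike the pre-cipher, this lookup does not involve the code pointer, so a given cell value always maps to the same replacement.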
Another unique 'feature' of Malbolge is the operation performed on two ternary values to produce another. It is used for obfuscating the initial program data size in a memory block and can be triggered by a particular CPU instruction.
Each trit in the two input ternaries is used as an index into a table to produce a new trit value:
b \ a | 0 | 1 | 2 |
---|---|---|---|
0 | 1 | 0 | 0 |
1 | 1 | 0 | 2 |
2 | 2 | 2 | 1 |
For example:
Note that this is referred to as the 'crazy operation' on Wikipedia. I don't know where that name came from, as it is not in the original specification or reference implementation, so I don't use it. Also, it's not crazy, it's ridiculous.
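The table lookup can be applied trit by trit along the lines of the sketch below. The names `op_table` and `op` are hypothetical, and the table is indexed as `op_table[b_trit][a_trit]` on the assumption that rows correspond to `b` and columns to `a`, as in the table above.

```cpp
#include <array>
#include <cstdint>

// Trit lookup table, indexed as op_table[b_trit][a_trit].
constexpr std::array<std::array<std::uint32_t, 3>, 3> op_table{{
    {1, 0, 0},
    {1, 0, 2},
    {2, 2, 1},
}};

// Combine two 10-trit ternaries, one trit position at a time.
constexpr std::uint32_t op(std::uint32_t a, std::uint32_t b)
{
    std::uint32_t result = 0;
    std::uint32_t place = 1;  // 3^k for the trit currently being processed
    for (int k = 0; k < 10; ++k) {
        result += op_table[b % 3][a % 3] * place;
        a /= 3;
        b /= 3;
        place *= 3;
    }
    return result;
}
```

For instance, `op(0, 0)` looks up the pair (0, 0) in every position, giving the ternary 1111111111, which is 29524 in decimal.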
Instruction | As decimal | Execution |
---|---|---|
`j` | 106 | Sets `d` to `*d` |
`i` | 105 | Sets `c` to `*d` |
`*` | 42 | Rotates the trits in `*d` to the right, i.e. towards the least significant, and copies the result into `a` |
`p` | 112 | Performs the op using `a` and `*d`, then assigns the result to `*d` and `a` |
`/` | 47 | Takes a char from stdin and assigns it to `a`. EOF is 59048 |
`<` | 60 | Writes `a` to stdout. Skipped if `a` is 59048 |
`v` | 118 | Stops execution and exits the program |
`o` | 111 | No-op. This is only useful during program load, as all graphical ASCII non-instruction characters are ignored during execution |
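The rotation performed by the `*` instruction can be expressed with plain integer arithmetic. The following is a sketch only; `rotate_right` is a hypothetical name, and the input is assumed to already be a valid ternary in the range [0, 59048].

```cpp
#include <cstdint>

// Rotate a 10-trit ternary one trit to the right: the least significant
// trit becomes the most significant one.
constexpr std::uint32_t rotate_right(std::uint32_t value)
{
    constexpr std::uint32_t top_place = 19683;  // 3^9, weight of the top trit
    return value / 3 + (value % 3) * top_place;
}
```

Rotating 1 (ternary 0000000001) gives 19683 (ternary 1000000000), while 59048 (all trits 2) is unchanged.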
On start the program data is processed:
m[i] = op(m[i-1], m[i-2])
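The recurrence above could be applied to the cells that follow the loaded program roughly like this. The function name `fill_remaining` is hypothetical, the operation is passed in as a callable to keep the sketch self-contained, and starting no earlier than cell 2 is an assumption made here because the recurrence needs two predecessors.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fill every cell after the loaded program using
// m[i] = op(m[i-1], m[i-2]).
template <typename Op>
void fill_remaining(std::vector<std::uint32_t>& memory,
                    std::size_t program_size,
                    Op op)
{
    // Assumption: skip cells 0 and 1, which lack two predecessors.
    const auto first = std::max<std::size_t>(program_size, 2);
    for (std::size_t i = first; i < memory.size(); ++i) {
        memory[i] = op(memory[i - 1], memory[i - 2]);
    }
}
```

In the VM the callable would be the ternary operation and `memory` would hold 59049 cells.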
At program start, all the registers are set to zero. Then the program follows this loop:

1. Read the value at `*c` and pre-cipher it. If the input is not in the graphical ASCII range then the program is malformed and must abort
2. Execute the resulting instruction
3. Post-cipher the value at `*c`
4. Increment `c` and `d`; they will wrap upon overflow

It should be noted that the specification and implementation differ (of course!), most notably in that the input and output instructions are reversed. As most of the existing implementations are based on the original C code, and therefore most of the existing Malbolge programs are written for that, I too have followed the reference implementation.
http://www.lscheffer.com/malbolge_spec.html