A few months ago, I wrote a Julia set fractal renderer for the demo challenge hosted by io.smashthestack.org
.
This program runs in 16-bit x86 real mode, without any operating system. It's formatted as a PC master boot record, which is 512 bytes long. Subtracting out space reserved for a partition table, we have only 446 bytes for code and data.
Programming in such a restricted environment is quite a challenge. It's further complicated by real mode's segmented addressing. Indexing an array bigger than 64 kB requires significant extra code — and that goes double for the video frame buffer. With two off-screen buffers and a 640 × 480 × 1 byte video mode, much of my code is devoted to segment juggling.
I spent a long time playing with code compression. In the end, I couldn't find a scheme which justifies the fixed size cost of its own decoder. It seems that 16-bit x86 machine code is actually pretty information-dense. For a bigger demo or 32-bit mode (with bigger immediate operands) I'd definitely want compression.
It's totally feasible to enter 32-bit protected mode within 446 bytes, but there's little gained by doing so. You lose easy access to the PC BIOS, which is the only thing you have that resembles an operating system or standard library.
You can browse the assembly source code or grab the MBR itself. It runs well in QEMU, with or without KVM, and I also tested it on a few real machines via USB boot. With QEMU on Linux it's as simple as
$ qemu -hda phosphene.mbr
Thanks to Michael Rule for the original idea and for tons of help with tweaking the rendering algorithm. His writeup has more information about this project.