X-87 - DOS 256 bytes

256 bytes x86 + x87 (386+) / DOS real mode procedural graphics for Outline 2026. VESA 2 (1994), 640x400, 32 bpp, showing a stylized landscape drawn with oscillators. No escape support.

This started in early 2024 from p5js sketches I initially made in mid 2021. The first sketch was a different idea showing some flow-like patterns, which I later steered towards a ground / sky division. I kind of stopped tinkering with it for a year, then picked it back up in early 2025 and steered it towards this barren landscape, which was mostly found by accident through iterative tweaking of oscillator parameters and transforms (particularly scale). I liked the moody feel and the accidental “moon”.

One year later I decided to try to fit this into 256 bytes. The code was pretty small, but I was unsure about some parts (32 bpp, colors), especially after a semi-failed x87 learning experiment. (It was still a success, as I learned bits of x87 and also a lot about high color / high resolution DOS graphics, all documented here.)

It took ~13 minutes (!) to render with the default DOSBox cycles (3000; 286 era), ~2m30 at 20000 cycles (late 486 era) and 2m10 at max cycles (running on an i7 7700 CPU).

It took ~6m10 on Bochs + FreeDOS at 10 MIPS, and ~8m at 800x600.

The 256 bytes effort in two weeks

The first step was porting the p5.js code to something akin to x87 (stack based), and as I had been fiddling with Forth... it felt like the perfect opportunity to tinker a bit more with Gforth. The resulting code was pretty small, which further convinced me that shrinking it down to 256 bytes was possible.

The second step was a crude (greyscale) DOS port (~320 bytes). I learned a bit of x87 in the process, and it was kind of fun to see how well Forth maps to x87, but also how it differs on some things (x87 is way more flexible sometimes). Forth made the x87 port straightforward and fun.

The third step was an early optimization phase (FPU code, constants) to free up room for the colors; the landscape was kind of dull at that point.

The final step was to integrate the dither-like colorization (the color increment depends on the averaged pixel brightness), then try to make it less dull by tweaking constants. This last part added some more weight, which required further optimizations: overlapped / truncated floating point constants (got the idea from balrog), packed color increments, fewer FPU instructions, etc.
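The colorization step can be sketched in isolation as below. The palette and the 79 / 192 constants are copied from the p5js prototype further down; the function name and the explicit `Math.min` saturation are only illustrative, and the packed increments of the 256 byte version are not reproduced here.

```javascript
// Illustrative sketch: palette and constants come from the p5js prototype,
// the function name `accumulate` is made up for this example.
const PAL = [[3, 3, 3], [1, 2, 3], [3, 2, 2], [1, 1, 1]];

// Add a brightness-dependent increment to one RGBA pixel of a byte buffer.
function accumulate(pixels, index) {
  // Channel sum (0..765), offset by 79, scaled down to a small bucket (0..4).
  const sum = pixels[index] + pixels[index + 1] + pixels[index + 2];
  const bucket = Math.floor((79 + sum) / 192) & 3;
  const c = PAL[bucket]; // the brightness bucket selects the increment
  for (let ch = 0; ch < 3; ch += 1) {
    // Saturate at 255 instead of wrapping around.
    pixels[index + ch] = Math.min(255, pixels[index + ch] + c[ch]);
  }
  return bucket;
}
```

Because the increment changes as a pixel gets brighter, repeated hits on the same pixel drift through the palette entries instead of adding a flat grey.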

The last bytes + compat

The program was considered finished at this point, but there were many compat issues left (no fninit, uninitialized memory, etc.) and tiny black square artifacts (almost imperceptible on DOSBox but very large on Bochs!), which were related to the tricky segment initialization and VESA paging / banking. The last goal was to fix them.

It was rather simple to gain the few bytes needed to fix the artifacts, by reusing palette data for a float constant (overlapping) and by further optimizing the init code and the px_sat loop code, reusing the bx state for the loop.
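To illustrate the overlapped / truncated constant idea in general terms: a little-endian float32 keeps its sign, exponent and top mantissa bits in its last two bytes, so the first two bytes can belong to other data at the cost of a small error. The byte values below are made up for the example and are not the bytes used in the intro.

```javascript
// Hypothetical 4-byte region: two "palette-like" bytes followed by the two
// high bytes of a float32 (little-endian, so the high bytes come last).
const bytes = new Uint8Array([0x03, 0x02, 0x7f, 0xbf]);

// Read a float32 starting at `offset`, as a 32-bit FPU load would see it.
function readFloat32(buf, offset) {
  return new DataView(buf.buffer, buf.byteOffset).getFloat32(offset, true);
}

// Keep only the top `keep` bytes of a float32 and zero the rest.
function truncateFloat32(value, keep) {
  const view = new DataView(new ArrayBuffer(4));
  view.setFloat32(0, value, true);
  for (let i = 0; i < 4 - keep; i += 1) view.setUint8(i, 0); // low bytes first
  return view.getFloat32(0, true);
}

// The shared low bytes only disturb the low mantissa bits: about -0.9961.
const overlapped = readFloat32(bytes, 0);
// Pi stored in two bytes instead of four: 3.140625.
const truncated = truncateFloat32(Math.PI, 2);
```

Constants in the p5js prototype such as 0.99609375 and 0.85546875 are exactly representable in those top two bytes, so truncating them loses nothing.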

Making a compatible version (one that runs on Bochs + FreeDOS at least; I didn't test on real HW) at 256 bytes was slightly harder and took me a few days. The biggest win was packing the two main loops into a single one, using the upper / lower halves of ECX as a frame counter / osc. index. The FPU code was again optimized (trying hard to avoid the fcomi instruction to stay in the 386 era).
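The single-loop idea can be sketched with one 32-bit counter standing in for ECX. The exact bit layout is an assumption here (upper 16 bits as the frame counter, low byte as the oscillator index, matching the `& 255` of the prototypes), and the function names are made up.

```javascript
// Split a 32-bit counter into its two roles (assumed layout, see above).
function unpack(counter) {
  return {
    frame: counter >>> 16, // upper half: frame counter
    osc: counter & 0xff,   // low byte of the lower half: oscillator index
  };
}

// Walk `frames` frames of 65536 iterations each with a single loop.
// Frames start at 1, like the prototypes, since the frame is used as a divisor.
function run(frames, visit) {
  const start = 1 << 16;
  const end = (frames + 1) << 16; // fine for small frame counts
  for (let counter = start; counter < end; counter += 1) {
    const { frame, osc } = unpack(counter);
    visit(frame, osc);
  }
}
```

Incrementing the counter past 0xFFFF in its lower half carries into the upper half, so the frame advances without a second loop or a second register.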

Conclusion

Overall I am satisfied with the final result, as it is quite close to the prototype. I'm not entirely convinced that 32 bpp is justified... it could perhaps be done in 256 colors (in a smaller way?).

The image can be scaled to any resolution (this requires changing a constant + the bank check constant), but convergence happens faster at moderate resolutions and artifacts are less visible. (See below; it has artifacts when the vertical axis is extended.)
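For context on why a bank-related constant depends on the resolution, here is a rough sketch of banked addressing. It assumes a 64 KiB window and a plain linear layout; the real granularity is reported by the VESA BIOS, and what the intro's bank check constant encodes exactly is not stated here.

```javascript
// Assumed window size: 64 KiB (common, but card dependent).
const BANK_SIZE = 0x10000;

// Split a pixel position into a bank number and an offset inside the window.
function bankAddress(x, y, width, bytesPerPixel) {
  const linear = (y * width + x) * bytesPerPixel;
  return { bank: Math.floor(linear / BANK_SIZE), offset: linear % BANK_SIZE };
}

// Number of banks needed to cover a whole frame.
function banksNeeded(width, height, bytesPerPixel) {
  return Math.ceil((width * height * bytesPerPixel) / BANK_SIZE);
}
```

Under these assumptions a 640x400 frame at 4 bytes per pixel spans 16 banks and an 800x600 frame spans 30, so any bound expressed in banks has to follow the resolution.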

Resolution also affects brightness / density, although this can be fixed with color increment / oscillator adjustments.
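A back-of-the-envelope way to see the brightness effect: the number of plotted samples is fixed, so more pixels means fewer hits per pixel. The frame and iteration counts below are those of the Gforth version; counting every sample as an on-screen hit is a simplifying assumption, so this is only an upper bound.

```javascript
// Average number of samples landing on each pixel, assuming every sample
// is on screen (an upper bound, many samples are rejected in practice).
function averageHitsPerPixel(width, height, frames = 365, iterations = 65535) {
  return (frames * iterations) / (width * height);
}
```

With these numbers 640x400 gets roughly 93 hits per pixel on average and 800x600 roughly 50, which is why the increments or the iteration count need adjusting to keep the same overall brightness.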


with more time (note: earlier version with artifacts!)

p5js implementation

let x = []
let y = []
let n = 256

let pal = [
  [3, 3, 3], [1, 2, 3], [3, 2, 2], [1, 1, 1]
]

function setup() {
  createCanvas(800, 480)
  background(0)
  for (let i = 0; i < 256; i += 1) {
    x[i] = i
    y[i] = 0
  }
}

function draw() {
  loadPixels()
  for (let k = 0; k < n * 256; k += 1) {
    let ki = k & 255
    let fr = frameCount
    let t0 = x[ki]
    let t1 = y[ki]

    // oscillator update (t0, t1 hold the previous state)
    x[ki] = t1 * -0.99609375 + (0.009277344 / cos(t0 - cos(fr / 128 + t1 / (1 + t0 / 32))) - 5.141592653)
    y[ki] = t0 + 1 + tan(x[ki] * t1 * 32) / fr

    // map the oscillator state to normalized screen coordinates
    let sx = (x[ki] + 14) / 32
    let sy = 0.36 + 0.85546875 / (y[ki] + 3)

    if (sy > 0.36) {
      sx *= 4
      sy *= 0.981592
    } else {
      sx -= sy / 2
    }

    let px = floor(sx * width)
    let py = floor(sy * width)
    let index = (px + py * width) * 4

    let p = (79 + pixels[index + 0] + pixels[index + 1] + pixels[index + 2]) / 192
    let c = pal[p & 3]

    if (index >= 0 && index < width * height * 4) {
      pixels[index + 0] += c[0]
      pixels[index + 1] += c[1]
      pixels[index + 2] += c[2]
    }
  }
  updatePixels()
}

Gforth implementation (writes to a file; 7s render)

640 constant width
400 constant height
  3 constant bpp

create x 256 floats allot
create y 256 floats allot
create b width height * bpp * allot
create pal 3 c, 3 c, 3 c,
           3 c, 1 c, 2 c,
           2 c, 3 c, 2 c,
           1 c, 1 c, 1 c,

: setup
    256 0 do \ initialize all 256 oscillators
        i s>f i floats x + f!
        0e i floats y + f!
    loop ;

: draw
    366 1 do \ frames
        65535 0 do \ iterations
            i 255 and { ib }

            ib floats x + f@
            j s>f
            ib floats y + f@ fdup
            -0.995683e f* 0.009429633e 4 fpick 4 fpick 128e f/ 4 fpick 2 fpick 32e f/ 1e f+ f/ f+ fcos f- fcos f/ 5.1594524e f- f+ fdup ib floats x + f!
            32e f* f* ftan fswap f/ 1e f+ fswap f+ fdup ib floats y + f!

            ib floats x + f@ 14.0e f+ 32e f/ \ sx
            fswap 3.141592653e f+ 0.8561283e fswap f/ 0.36063576e f+ fdup \ sy
            0.36063576e f> if
                0.98328006e f* \ sy
                fswap 4e f* \ sx
            else
                fswap fover 2e f/ f- \ sx
            then

            width s>f f* f>s \ px
            width s>f f* f>s \ py
            width * swap + bpp * { index }

            index 0 width height * bpp * within if
                index b + c@ index 1 + b + c@ index 2 + b + c@ + + 79 + 192 / 3 and \ p
                3 * pal + dup dup \ c
                1 + c@ index b + c@ + 255 min index b + c!
                2 + c@ index 1 + b + c@ + 255 min index 1 + b + c!
                c@ index 2 + b + c@ + 255 min index 2 + b + c!
            then
        loop
    loop ;

: write \ write 640x400 RGB8 buffer (raw) to a file
    s" landscape.data" w/o create-file throw >r
    b width height * bpp * r@ write-file throw
    r> close-file throw ;

setup draw write
bye

Iterations

first result of the port, without bounds checking (speckles) or saturation

less dull sky

bounds check

first colors



kind of dull "final" result

toward final result, this was slightly stretched horizontally later due to constants optimization

early 2025 p5js sketch (with optimizations)

p5js initial sketch (800x480)

early p5js sketch

Licence Creative Commons