DOOM on a watch
Does it run DOOM?
I was trying to find a good use case for my LILYGO TTGO T-Watch. It’s a programmable smart watch featuring the amazing ESP32 chip and a 240x240 color LCD screen.
I keep hearing about Doom running on this and that, sometimes directly and sometimes using the device as an exotic external screen. My project falls into the latter category, but it was a lot of fun to implement!
As this was mostly a learning experience, I took several wrong turns and mention them in this writeup.
As with most tiny projects, I got the basic version up and running in two afternoons and then fiddled with the code for a few more days to get a more presentable result.
It works!
Building steps
Getting the video signal across
There are a few ways to transmit data from a computer to the ESP32-based watch: Wi-Fi, Bluetooth and a serial port.
A serial port seemed promising - I initially targeted a 120x120 picture, which can be represented by 14400 bits or 1800 bytes. At 115200 baud I could achieve around 6 fps (115200 / 18000 bits on the wire per frame).
The default serial configuration is 8 data bits, no parity and one stop bit; together with the start bit that makes 10 bits on the wire per byte.
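The same back-of-the-envelope estimate can be written as a small helper. This is only a sketch: `serial_fps` is a hypothetical name, it assumes 1 bit per pixel and no protocol overhead, and the number of bits a byte occupies on the wire is passed in as a parameter (10 for 8N1 once the start bit is counted).

```c
#include <assert.h>
#include <stdint.h>

/* Estimate how many 1-bit frames per second fit through a serial link.
 * wire_bits_per_byte is the on-the-wire size of one byte
 * (10 for 8N1: start bit + 8 data bits + stop bit). */
static double serial_fps(uint32_t baud, uint32_t width, uint32_t height,
                         uint32_t wire_bits_per_byte)
{
    assert(width > 0 && height > 0 && wire_bits_per_byte > 0);
    uint32_t bytes_per_frame = (width * height + 7) / 8; /* 1 bit per pixel */
    uint32_t wire_bits_per_frame = bytes_per_frame * wire_bits_per_byte;
    return (double)baud / wire_bits_per_frame;
}
```

For a 120x120 frame at 115200 baud with 10 wire bits per byte this gives 6.4 frames per second.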
With the goal of displaying DOOM on the smart watch in mind, I decided to divide the work into three phases:
- Scale the image down to the resolution of the watch
- Implement a proof of concept serial display on the T-Watch
- Optimize until it works better
Scaling Doom
I started with finding a suitable Doom source port. I discovered Chocolate Doom, which accurately reproduces the game as it was played in the 1990s.
Building it on Ubuntu is straightforward, so I dived straight into the code and started poking around.
Wrong turn #1 - modifying Doom internal resolution
The first thing I did was try to find how a resolution gets set.
At first I tried to change the internal resolution of the renderer - there was a `#define` for `SCREEN_WIDTH 320` and `SCREEN_HEIGHT 200` in `i_video.h`.
Changing this to 120x75 made the game crash. I attached a debugger to see where exactly: the game was attempting to render some things at locations beyond 120x75. Scaling all coordinates by 0.5x helped get the menu to display, but it crashed again as soon as I started the game.
Scaling and dithering
After studying the Chocolate Doom source port some more I realized it has a series of buffers and textures that represent stages of the rendering pipeline.
The engine itself draws assets with a palette to make better use of the 256 available colors. A frame then passes through these stages:
- the game is drawn into an 8-bit paletted 320x200 screen buffer
- the screen buffer is blitted into a 32-bit ARGB 320x200 buffer
- that buffer is rendered into an upscaled texture (e.g. 640x400) using nearest-neighbor scaling
- the texture is rendered to the screen (e.g. 800x600) using linear scaling
It was clear the dithering and output to the serial port should happen somewhere within this pipeline.
Wrong turn #2
Looking at screenshots from the various buffers, I realized that I had been dithering the wrong buffer: I dithered the 320x200 display buffer and then let SDL downscale it to 240x150, which caused artifacts and didn’t look as good.
The correct way was to scale the display buffer to 240x150, then dither, then send the result over the wire.
Dithering
After a little research on 1-bit graphics I realized there are two commonly used dithering algorithms - I implemented both:
Ordered dithering
Ordered dithering is a simple algorithm that produces a characteristic crosshatch pattern.
It works by comparing each pixel against a threshold map tiled across the image: a pixel becomes white if its intensity exceeds the threshold at its position and black otherwise, so the density of white pixels in an area follows the original brightness.
Ordered dithering patterns
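As a minimal sketch of the idea, here is ordered dithering of an 8-bit grayscale buffer with the classic 4x4 Bayer matrix. The matrix size, the function name `ordered_dither` and the in-place buffer layout are assumptions made for illustration, not necessarily what the fork uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Classic 4x4 Bayer threshold matrix, entries 0..15. */
static const uint8_t BAYER_4X4[4][4] = {
    { 0,  8,  2, 10},
    {12,  4, 14,  6},
    { 3, 11,  1,  9},
    {15,  7, 13,  5},
};

/* Dither an 8-bit grayscale image in place to pure black (0) or white (255). */
static void ordered_dither(uint8_t *gray, size_t width, size_t height)
{
    assert(gray != NULL);
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            /* Scale the 0..15 matrix entry to a threshold in 8..248. */
            unsigned threshold = BAYER_4X4[y & 3][x & 3] * 16u + 8u;
            uint8_t *p = &gray[y * width + x];
            *p = (*p > threshold) ? 255 : 0;
        }
    }
}
```

Because the threshold depends only on the pixel position, a small change in one part of the scene never affects the pattern elsewhere.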
Floyd-Steinberg dithering
Floyd-Steinberg dithering operates using error diffusion and is characterized by its grainy or speckled appearance.
Because Floyd-Steinberg works by pushing the quantization error from a pixel to its neighboring pixels, a slight change in the scene can propagate over the entire screen. I found that aesthetically less pleasing than ordered dithering, which is more predictable and simply less jumpy.
Floyd-Steinberg dithering above, ordered dithering below
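For comparison, a minimal sketch of Floyd-Steinberg error diffusion on an 8-bit grayscale buffer follows. The names `diffuse` and `floyd_steinberg_dither` are hypothetical, and the sketch works in place with clamping to keep it short; a more careful version would carry the error in a wider buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Add a share (weight/16) of the quantization error to one neighbor,
 * skipping neighbors that fall outside the image. */
static void diffuse(uint8_t *gray, int width, int height,
                    int x, int y, int err, int weight)
{
    if (x < 0 || x >= width || y >= height)
        return;
    int v = gray[y * width + x] + err * weight / 16;
    gray[y * width + x] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Dither an 8-bit grayscale image in place to pure black (0) or white (255). */
static void floyd_steinberg_dither(uint8_t *gray, int width, int height)
{
    assert(gray != 0 && width > 0 && height > 0);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int old = gray[y * width + x];
            int quantized = (old >= 128) ? 255 : 0;
            gray[y * width + x] = (uint8_t)quantized;
            int err = old - quantized;
            /* Standard weights: 7/16 right, 3/16 below-left,
             * 5/16 below, 1/16 below-right. */
            diffuse(gray, width, height, x + 1, y,     err, 7);
            diffuse(gray, width, height, x - 1, y + 1, err, 3);
            diffuse(gray, width, height, x,     y + 1, err, 5);
            diffuse(gray, width, height, x + 1, y + 1, err, 1);
        }
    }
}
```

Each pixel's result depends on the error carried over from the pixels before it, which is why a local change can ripple across the rest of the frame.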
Grayscale and gamma
In both dithering algorithms we convert the color to grayscale with the following algorithm:
//get r,g,b color values
uint8_t r, g, b;
uint32_t pix = getpixel(s, x, y);
SDL_GetRGB(pix, s->format, &r, &g, &b);
// Convert the pixel value to grayscale / intensity
grayscale = .299f * r + .587f * g + .114f * b;
Doom is quite dark, so it’s hard to see anything at the default gamma setting. Fortunately the engine also features gamma correction, which can be toggled with the F11 key in game.
Gamma settings 1, 3, 5 going from unusable in monochrome to pretty bright
Watch as an external display
To make data transfer a bit less intensive I decided to transmit in 120x120 resolution and then double the pixels to 240x240 on the device.
Initially I programmed my watch to be a simple single-core serial display. It reads one row of pixels from the serial port as 120 bits that represent black or white pixels, then expands each bit onto a 16-bit array value.
Transfer rate
When transmitting 120x120 pixels at around 16 FPS we produce around 15 (bytes per row) * 10 (start + data + stop bits) * 120 (rows) * 16 (fps) = 288 kbits of data per second.
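The algorithm for receiving and drawing a single row:
char rxBuffer[RECEIVE_LINE_BYTES];
uint32_t lineBuffer[DISPLAY_LINE_LENGTH];
...
// read one row of packed 1-bit pixels from the serial port
Serial.readBytes(rxBuffer, RECEIVE_LINE_BYTES);
// target two display rows, as every pixel is also doubled vertically
tft->setAddrWindow(0, y, DISPLAY_LINE_LENGTH, 2);
// expand rxBuffer into lineBuffer
convertPixelsBetweenBuffers();
tft->startWrite();
// push the same row twice to fill both display rows
tft->pushPixels(lineBuffer, DISPLAY_LINE_LENGTH);
tft->pushPixels(lineBuffer, DISPLAY_LINE_LENGTH);
tft->endWrite();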
The pixel conversion does the horizontal pixel doubling: it converts the input (a 120-bit field) to the display line (240 16-bit pixels, stored as an array of 120 32-bit values), iterating bit by bit.
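A minimal sketch of such a conversion is below. The name `expand_row`, the constants and the most-significant-bit-first order are assumptions for illustration; the receiver's actual code may differ. Black and white are used because 0x0000 and 0xFFFF look the same in either byte order, whereas other colors would need attention to the byte order the display expects.

```c
#include <assert.h>
#include <stdint.h>

#define ROW_PIXELS_IN 120      /* pixels received per row, 1 bit each */
#define COLOR_BLACK   0x0000u  /* 16-bit RGB565 */
#define COLOR_WHITE   0xFFFFu

/* Expand 120 packed bits into 120 32-bit words. Each word holds the same
 * 16-bit color twice, which doubles every pixel horizontally (240 pixels). */
static void expand_row(const uint8_t *rx, uint32_t *line)
{
    assert(rx != 0 && line != 0);
    for (int i = 0; i < ROW_PIXELS_IN; ++i) {
        /* Bit 7 of each byte is the leftmost pixel. */
        int bit = (rx[i >> 3] >> (7 - (i & 7))) & 1;
        uint32_t color = bit ? COLOR_WHITE : COLOR_BLACK;
        line[i] = (color << 16) | color;
    }
}
```

Writing both copies of a pixel as one 32-bit store halves the number of writes compared with filling a 16-bit array pixel by pixel.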
We can display data on the TTGO T-Watch using the TFT_eSPI library. It includes a `pushPixels` function that expects a buffer of 16-bit pixel colors.
I wrote a supporting tool in Python to feed the watch some pixels. The bits are sent using the `pyserial` library in a loop.
Connecting the Watch and Doom
Python is not very helpful in the Chocolate Doom port, so I had to write the serial frame transmitter in C. I’ve adapted the first code snippet I found on Stack Overflow, credit goes to sawdust.
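For readers who have not configured a serial port from C before, the general shape on Linux looks roughly like the sketch below, using the POSIX termios API. This is neither the Stack Overflow snippet nor the code in the fork: `open_serial_raw` is a hypothetical name and 115200 baud is just a placeholder rate.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Open a serial device in raw 8N1 mode at 115200 baud.
 * Returns a file descriptor, or -1 on failure. */
static int open_serial_raw(const char *path)
{
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    struct termios tty;
    if (tcgetattr(fd, &tty) != 0) {   /* fails if path is not a terminal */
        close(fd);
        return -1;
    }

    cfsetispeed(&tty, B115200);
    cfsetospeed(&tty, B115200);

    /* 8 data bits, no parity, one stop bit; ignore modem control lines. */
    tty.c_cflag &= ~(PARENB | CSTOPB | CSIZE);
    tty.c_cflag |= CS8 | CLOCAL | CREAD;

    /* Raw mode: no software flow control, no byte translation, no echo. */
    tty.c_iflag &= ~(IXON | IXOFF | ICRNL | INLCR | IGNCR |
                     ISTRIP | BRKINT | PARMRK | IGNBRK);
    tty.c_oflag &= ~OPOST;
    tty.c_lflag &= ~(ICANON | ECHO | ECHOE | ECHONL | ISIG | IEXTEN);

    tty.c_cc[VMIN] = 1;   /* block until at least one byte arrives */
    tty.c_cc[VTIME] = 0;

    if (tcsetattr(fd, TCSANOW, &tty) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Once the port is open, each packed row can be sent with an ordinary `write(fd, buf, len)` call.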
Still working in the 120x120 pixel resolution, this is how the intermediate result looked:
It’s incredibly low-res. Also, because I was lazy, the last row of pixels of the status bar is leaking all the way down the screen as I just repeated rows 75 up to 120 in the output stream :-)
I eventually bumped the resolution to 240x150.
There’s nothing special about the serial output module: a function takes an `SDL_Surface`, assumes its dimensions are 240x150, loops over the RGB values and for every row of 240 pixels (bits) spits out 30 bytes.
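uint8_t r, g, b;
for(y = 0; y < SERIAL_BUFFER_HEIGHT; ++y) {
    uint8_t bit = 7; // pack the most significant bit first
    memset(buf, 0, SERIAL_BUFFER_BYTES);
    for(x = 0; x < SERIAL_BUFFER_WIDTH; ++x) {
        pix = getpixel(s, x, y);
        SDL_GetRGB(pix, s->format, &r, &g, &b);
        // the surface is already dithered, so a pixel is either black or white
        if(g == 255)
            buf[x >> 3] |= (1 << bit);
        if(bit-- == 0)
            bit = 7;
    }
    // ... write the packed row in buf to the serial port here
}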
Getting it faster
Simply increasing the baud rate to 500,000 caused some screen tearing; it seemed that the serial receiver code on the watch was having a hard time keeping up. After I optimized the bit conversion loop it could handle a stable 500,000 baud, leading to a framerate high enough to consider increasing the resolution to 240x240.
Wrong turn #3 - going multi-core on the ESP32
I thought I could leverage the second core on the ESP32 and have the watch run two tasks - one that processes the serial data and another that decodes it and sends to the display.
I implemented a proof of concept serial display that uses the FreeRTOS task notification API to communicate between the tasks.
It used two buffers - one received the serial data, which then got copied over for the display task - and the notifications were supposed to let the tasks know when they could touch the shared buffer.
The source lives in the `multithreaded` branch - but it didn’t really work faster, which led me to the next attempt:
Using DMA for the speed
What helped in the end was a Direct Memory Access (DMA) transfer using `tft->pushPixelsDMA(lineBuffer, DISPLAY_LINE_LENGTH);`. It’s basically a “fire and forget” data transfer - the DMA controller moves the data from RAM to the display, leaving the microcontroller free to execute other code instead of operating the SPI bus.
Now I could increase the baudrate to 921600 and practically double the framerate.
Vertical Synchronization?
If we just dump the data to the screen without some kind of synchronization or alignment, the device won’t know where the boundaries between frames lie.
It also means we would need to be lucky to start the transmission in sync with the watch displaying the first row of the frame data.
To fix this, I added a simple VSYNC message that the watch sends to the PC over the serial port when it starts drawing the first row. Upon receiving VSYNC, the PC should abandon the current frame and start sending another frame from the beginning. I’ve added a handler for this to the Python support tools, but decided not to add one to Doom, as it was easier to just reset the line currently being drawn on the watch if no data has come across the serial port for a while.
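The timeout-based reset can be sketched as a tiny piece of state on the receiving side. Everything here is an assumption made for illustration: the names, the 150-row frame height and the 50 ms threshold are hypothetical, and the approach only works if the sender pauses between frames for longer than the threshold.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_ROWS        150  /* assumed frame height */
#define RESYNC_TIMEOUT_MS 50   /* assumed gap that marks a new frame */

typedef struct {
    int row;              /* row the next received line belongs to */
    uint32_t last_rx_ms;  /* time the previous line arrived */
} RowTracker;

/* Call when a complete row has arrived; returns the row to draw it at. */
static int row_tracker_next(RowTracker *t, uint32_t now_ms)
{
    assert(t != 0);
    /* A long silence means the sender finished or abandoned a frame,
     * so the incoming row is treated as the top of a new frame. */
    if (now_ms - t->last_rx_ms > RESYNC_TIMEOUT_MS)
        t->row = 0;
    int draw_at = t->row;
    t->row = (t->row + 1) % FRAME_ROWS;
    t->last_rx_ms = now_ms;
    return draw_at;
}
```

On the watch, `now_ms` would come from something like the Arduino `millis()` function.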
Finishing touches
A series of color schemes livens up the 1-bit color depth - just black and white is kind of boring. This has a straightforward implementation on the T-Watch side, reacting to the touch of the touchscreen with digitalRead(...)
Various color schemes, photos of the watch display
Some gifs of Doom in action
Ordered dithering, black & white, Doom 1
Floyd-Steinberg dithering, Doom 2
We also get Heretic and Hexen!
Hexen
Appendix: Potential improvements
Frame compression
As an experiment, I’ve added `zlib` compression to the Doom engine to compress the frames. An uncompressed 240x150 frame fits in 4500 bytes; with the fastest zlib compression it usually shrank to around 2800 bytes, a saving of around 38%. That means that by trading some of the CPU time on the ESP32 (currently spent on receiving data from the serial port) for decompression, I could potentially increase the frame rate or send some other data along with the video.
Sound
There’s also a possibility of transmitting sound data, in theory. Chocolate Doom supports PC speaker output, which works by playing back tones (see source or doom wiki page). To implement this over serial, one would need to mix the tones in with the frame data and implement a player thread that plays back the tones over the I2S interface to the watch speaker.
The simplest way to make this work would probably be something inspired by the Windows port behavior that either produces a beep for a specified duration or stays silent.
Running Doom on the watch directly
There’s a port of DOOM to the watch by unlimitedbacon that actually runs on the device itself: https://github.com/unlimitedbacon/TTGO-DOOM
Source and build instructions
https://github.com/jborza/chocolate-doom -> My Chocolate Doom fork with the dithering and serial output. For serial port configuration see `src/i_serial.h`, for video tweaks see the definitions at the top of `src/i_video.c`.
Build instructions for Chocolate Doom on Debian/Ubuntu.
https://github.com/jborza/watch-doom-receiver -> The serial display tool for the watch. Required libraries: ESP32 support, TTGO T-Watch library
After Doom is built, the watch software is up and running, and the PC and the watch are connected with a USB cable, run `chocolate-doom -iwad doom2.wad -width 960 -height 600`, keep looking at the watch and play!