Exploring Old MS Paint Formats

by MrAureliusR

I've recently been hanging out with the crew from WinWorldPC - you know, that website that hosts a huge archive of old software and operating system images.  It's a great place to find obsolete software, especially for Windows 9x, MS-DOS, OS/2, and even old Macintosh stuff.  On the Discord and IRC channels they use to hang out, I met some interesting characters.  We started spending a few hours each day streaming ourselves installing and playing around with all the software on the website.  I decided to play around with Windows 1.04 and see what the very first Windows was really like.

I installed MS-DOS 3.3 and then Windows 1.04 on a VirtualBox VM.  I was immediately struck by how similar Windows 1 is to the "MS-DOS Shell" that came with MS-DOS 5.0.  However, it includes quite a few applications, including the original Calendar, Clock, Notepad, Write, and our focus: Paint.


The Windows 1.04 Interface After Install

This version of Paint was actually just a licensed version of ZSoft's PC Paintbrush.  It has some unique features that were not incorporated into the version of Paint created by Microsoft for Windows 3, such as the 3D Cube tool (pictured).


Microsoft Paint Version 1.04 with a Beautiful Piece of Art

The interface is very simple, but for the time it was actually quite a decent tool, especially when included as part of the OS.  Making simple diagrams which could be printed or even inserted into text documents was quite easy to do.

This first version of Paint uses a very simple black-and-white format.  Pixels are either on or off.  There's no grayscale, and there's definitely no color support!  I started off just messing around with the tools, but then I started trying to make interesting images that could be used as avatars for modern chat programs.  I discovered that the fill tool could create some basic fill patterns which gave the images a lot more texture and depth.  The 3D object tool made creating cubes or rectangular prisms very easy, which I used to great effect.  The selection of fonts is quite limited (that's a whole other story - this is before TrueType fonts and so Windows used a proprietary .FON format which is quite complex), but despite the limitations it's fun to create retro-looking one-bit art!

I then ran into a problem.  I wanted to be able to export these files so I could potentially edit them with GIMP or another modern tool to add a few finishing touches.  I could have taken a screenshot of the VM, but that was sure to lose some detail.  This version of Paint does not use bitmaps; it uses an earlier proprietary format which is the subject of this article, simply called MSP for Microsoft Paint.  Unfortunately, GIMP cannot read MSP files directly.  I took a quick look online to see if there were any conversion tools available, but most of them were designed for Windows and were quite obsolete.  I run Manjaro and really wanted a simple solution that would make it easy to convert to a modern format.

Whenever I've had to work with simple image data in the past, I've tended to use the X BitMap (XBM) format.  This is a very primitive file format that was designed to hold icons and small images for the original X Window System.  The format is actually just a C array with a couple of #define lines to describe the width and height, and so it's very easy to generate programmatically and edit by hand.  It also happens to be a bit-per-pixel format, just like MSP.  And best of all, GIMP can directly import XBM files and then export them as PNG or any other modern format.  So my plan of attack was to go from MSP to XBM to PNG.
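
To give a sense of how little there is to XBM, here is a minimal sketch that builds the text of an XBM file from raw one-bit data.  The helper name make_xbm and the tiny 8x2 "stripe" image are made up for illustration and are not part of the converter.

```python
def make_xbm(name, width, height, data):
    """Return XBM source text for one-bit image data (illustrative helper)."""
    lines = [
        "#define {0}_width {1}".format(name, width),
        "#define {0}_height {1}".format(name, height),
        "static unsigned char {0}_bits[] = {{".format(name),
    ]
    # Emit the bytes as C hex literals, 12 per line
    for i in range(0, len(data), 12):
        chunk = data[i:i + 12]
        lines.append("   " + ", ".join("0x{:02x}".format(b) for b in chunk) + ",")
    lines.append("};")
    return "\n".join(lines) + "\n"

# An 8x2 image: one row with every bit set, one row with none
print(make_xbm("stripe", 8, 2, bytes([0xFF, 0x00])))
```

The result is ordinary C source, which is why an XBM file can be pasted straight into a program or tweaked in a text editor.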


XBM Format Image Data is Just a C Array

First I had to figure out the MSP file format.

Thankfully, the ArchiveTeam website has a Wiki dedicated to documenting file formats, and their entry for MSP had a link to a page with the full format in good detail: www.fileformat.info/format/mspaint/egff.htm.  This is when I discovered that there are actually two versions of MSP: Version 1 was used with Windows 1, and Version 2 was used with Windows 2.  The difference is that MSP Version 2 uses some simple run-length encoding to compress files, whereas Version 1 is just the raw data preceded by a header.
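
For conversion purposes, the interesting part of the header is its first eight bytes: a four-byte magic value, then the width and height as 16-bit little-endian words.  Below is a minimal sketch of reading them with Python's struct module.  The magic strings "DanM" and "LinS" are the ones the script in this article checks for; the function name read_msp_header and the 320x200 sample are made up for illustration.

```python
import struct

def read_msp_header(header):
    """Return (version, width, height) from the first 8 bytes of an MSP file."""
    # "<4sHH" = little-endian: 4 raw bytes, then two unsigned 16-bit words
    magic, width, height = struct.unpack("<4sHH", header[:8])
    if magic == b"DanM":
        version = 1        # raw, uncompressed image data
    elif magic == b"LinS":
        version = 2        # run-length encoded image data
    else:
        raise ValueError("not an MSP file")
    return version, width, height

# Hand-built sample header for a hypothetical 320x200 Version 1 image
sample = b"DanM" + struct.pack("<HH", 320, 200)
print(read_msp_header(sample))
```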

With the format specification in hand, I spent a couple of minutes deciding which language to use to write the converter.  I have a lot of experience with C/C++, but when it comes to anything involving text and file manipulation I tend to use Python.  The algorithm is so simple that it would be trivial to re-implement in pretty much any language.  I've written it as a Python 3 script.

The script checks which version the MSP file is - thankfully, they included a magic four-byte value in the file header to make this easy - and then gets the height and width from the next few bytes.  If it's Version 1, it simply copies the data directly into an XBM-format file.  My first attempts resulted in some output that looked vaguely like the original file, but something was clearly wrong.  It took me a while to realize the MSP format stores each byte with the most significant bit first, but XBM is the other way around!  So while the bytes were all in the correct order, each byte was reversed.  A quick hack to flip each byte around fixed this!
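
The byte flip can be done in several ways.  The sketch below is one alternative to the string-reversal trick used in the script: it reverses each byte with shifts and then builds a 256-entry lookup table so that a whole row can be converted in one call.  The names reverse_bits and REVERSED are made up for illustration.

```python
def reverse_bits(value):
    """Reverse the bit order of one byte (0-255)."""
    result = 0
    for _ in range(8):
        # Shift the lowest bit of value onto the low end of result
        result = (result << 1) | (value & 1)
        value >>= 1
    return result

# Precompute all 256 answers once so conversion is a simple table lookup
REVERSED = bytes(reverse_bits(b) for b in range(256))

msp_row = bytes([0x80, 0x01, 0xF0])
xbm_row = msp_row.translate(REVERSED)   # bytes.translate maps each byte via the table
print(xbm_row.hex())
```

Here 0x80 (leftmost pixel set, MSB first) becomes 0x01 (leftmost pixel set, LSB first), which is exactly the swap described above.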

I decided that I may as well add support for Version 2.  The run-length encoding is very simple: read the first byte.  If it's zero, the next byte is the length of the run, and the byte after that is the value to be repeated, so you just copy that value that many times.  If the first byte is non-zero, it is itself a count, telling you how many of the following bytes are to be copied over exactly.  Here's a simple example:

00 09 FE 04 AA 52 80 14

The first byte is zero, so the next byte is the run length.  In this case it's nine.  The next byte (FE) is the byte to copy.  So the start of this data is:

FE FE FE FE FE FE FE FE FE

It's taken nine bytes and compressed them into three!  The next byte is 04, which is telling us the next four bytes (AA 52 80 14) are to be copied directly.  Those get appended to the run, and we end up with:

FE FE FE FE FE FE FE FE FE AA 52 80 14

Pretty simple, right?
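
The same walk-through can be expressed in a few lines of Python.  This is a minimal sketch rather than the converter itself: the function name decode_msp_rle is made up for illustration, and it assumes the input is well formed (no run cut off at the end of the data).

```python
def decode_msp_rle(data):
    """Decode MSP Version 2 run-length data into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        run_type = data[i]
        if run_type == 0:
            # Zero marker: next byte is the count, the one after is the value
            count, value = data[i + 1], data[i + 2]
            out += bytes([value]) * count
            i += 3
        else:
            # Non-zero: copy this many of the following bytes verbatim
            out += data[i + 1:i + 1 + run_type]
            i += 1 + run_type
    return bytes(out)

encoded = bytes.fromhex("00 09 FE 04 AA 52 80 14")
print(decode_msp_rle(encoded).hex(" ").upper())
```

Fed the eight example bytes, it produces the thirteen decoded bytes: nine copies of FE followed by AA 52 80 14.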

The Version 2 file also includes a "map" which tells you where each row of image data starts, in case you want to just read part of the file.  However, since we're converting the entire file at once, we just start at the top of the image data and continue until we're done.  If you get to the end of a Version 2 file and there seem to be bytes missing, it means the remaining bytes are all FF.

For example, a 16x16 pixel image has 256 pixels, which at one bit per pixel should be 32 bytes.  If you read all the image data and find you've only decoded 25 bytes, that means you need to add 7 bytes of FF at the end.

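Oh, and one last wrench thrown in the mix: all the bits are inverted in XBM as compared to MSP!  A set bit is a white pixel in MSP but a black pixel in XBM, so a straight copy comes out as a negative of the original.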

So in GIMP, if you want the original image back, you need to do an invert on the image.  However, I actually like how they look inverted in some cases, which is why I didn't end up adding the inversion into the script itself.
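
If you would rather have the inversion done before the file ever reaches GIMP, it amounts to flipping every bit of every byte.  The sketch below shows what such a step could look like; invert_bits is a made-up name, and the script in this article deliberately does not do this.

```python
def invert_bits(data):
    """Flip every bit so black and white swap places."""
    # XOR with 0xFF turns each 1 into 0 and each 0 into 1
    return bytes(b ^ 0xFF for b in data)

row = bytes([0xFF, 0x0F, 0x00])
print(invert_bits(row).hex())
```

Applying it twice returns the original data, so it would be easy to make optional.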


An Avatar Made for User Toxidation in MS Paint v2

The script is below and in a GitLab repo at gitlab.com/mraureliusr/mspconvert, which also has instructions on how to use it.

Note that this script uses a new operator (:=) introduced in Python 3.8 and so you need at least that version to run it.  If you want to add any improvements or modify the source, go ahead!  It's licensed under the Mozilla Public License v2.0.  Issues and pull requests are appreciated.

Have fun drawing some retro images in Windows 1 and be sure to Tweet or Toot them at me on Twitter or Mastodon (@MrAureliusR on Twitter, and @amrowsell@mastodon.sdf.org on Mastodon!)

#!/usr/bin/env python3

# This Source Code Form is subject to the terms of the
# Mozilla Public License, v. 2.0. If a copy of the MPL
# was not distributed with this file, You can obtain
# one at https://mozilla.org/MPL/2.0/.

import sys
import re

print("MS Paint file converter utility v0.1")
print("Written by A.M. Rowsell, MPL Version 2.0 license\n")
if len(sys.argv) < 2:
    print("Please provide filename!")
    print("Example usage:\n./mspconvert.py DONUT.MSP")
    sys.exit(255)

filename = sys.argv[1]
filename_ne = re.split(r"\.", filename)[0]
width = 0
height = 0
# The output file follows the very simple XBM format
# which is just a basic C syntax array variable
outputString = '''
#define {0}_width {1}
#define {0}_height {2}
static unsigned char {0}_bits [] =
'''

# Output data starts as an empty bytearray
outputData = b''

try:
    with open(filename, 'rb') as f:
        versionString = f.read(4)                # check for the magic bytes
        if versionString == b'\x4c\x69\x6e\x53': # this represents the string "LinS"
            version = 2
        elif versionString == b'\x44\x61\x6e\x4d': # this represents the string "DanM"
            version = 1
        else:
            print("The given file {0} is not a valid Microsoft Paint file!".format(filename))
            sys.exit(255)       # exit status 255 (-1 as an 8-bit value)

        if version == 2:
            print("Version 2 Paint file detected...")
            width = int.from_bytes(f.read(2), "little")
            height = int.from_bytes(f.read(2), "little")
            # expected decoded size in bytes; each row is padded to a whole byte
            size = ((width + 7) // 8) * height
            f.seek((height * 2) + 32) # seek to the start of image data
            while(byte := f.read(1)):
                if(int.from_bytes(byte, "little") == 0): # run-length encoded
                    rllLen = int.from_bytes(f.read(1), "little")
                    rllValue = f.read(1)
                    for i in range(0,rllLen):
                        outputData += rllValue
                        size -= 1
                else:           # read the following number of bytes verbatim
                    rllLen = int.from_bytes(byte, "little")
                    for i in range(0,rllLen):
                        outputData += f.read(1)
                        size -= 1
                print("Remaining size: {0}".format(size))
            for i in range(0, size):
                outputData += b'\xff'

            with open(filename_ne + "_converted.xbm", 'w') as f:
                print("Writing output file...")
                # use the name without its extension so the C identifiers are valid
                f.write(outputString.format(filename_ne, width, height))
                f.write(" {\n")
                q = 0
                for byte in outputData:
                    result = int('{:08b}'.format(byte)[::-1], 2)
                    f.write("0x" + '{:x}'.format(result) + ", ")
                    q += 1
                    if q >= 16:
                        f.write("\n")
                        q = 0
                f.write(" };")
            print("Done!")
            sys.exit(0)
        elif version == 1:
            print("Version 1 Paint detected...")
            width = int.from_bytes(f.read(2), "little")
            height = int.from_bytes(f.read(2), "little")
            f.seek(32)  # skip the 32-byte header; raw image data follows
            q = 0
            # use the name without its extension so the C identifiers are valid
            outputString = outputString.format(filename_ne, width, height)
            outputString += " {\n"
            while(byte := f.read(1)):
                result = int('{:08b}'.format(int.from_bytes(byte, "big"))[::-1], 2)
                outputString += "0x" + '{:x}'.format(result) + ", "
                q += 1
                if q >= 16:
                    outputString += "\n"
                    q = 0
            outputString += " };"

            with open(filename_ne + "_converted.xbm", 'w') as f:
                print("Writing output file...")
                f.write(outputString)
            print("Done!")
            sys.exit(0)

except FileNotFoundError:
    print("{0} does not exist! Quitting...".format(filename))
    sys.exit(255)
except PermissionError:
    print("Unable to open {0} -- insufficient permissions! Quitting...".format(filename))
    sys.exit(255)
except Exception:
    print("Something went wrong! Quitting...")
    sys.exit(255)

Code: mspconvert.py
