

This page last updated on 21 March 1998

[Snapshots] [SLT] [SNA] [SNA 128] [SP] [ZX] [Z80] [ZX82]
[Tapes] [TAP/BLK] [TAP (Warajevo)] [TZX]
[Others] [MDR] [SCR]

This section will be expanded as I get the time to do so. If you are desperate for information about a particular format right now, a good source is the source code of SPConv v1.10, available from (among other locations) here. The source to WSpecEm is also worth a look, as WSpecEm supports lots of different formats. Also check the documentation for other emulators; many include detailed file format information.

Snapshot Files

Those handled by SPConv v1.10 are in italics:

.ACH or .archimedes Snapshots used by !Speccy. .ACH is the extension SPConv uses for these files, so that filename-challenged operating systems like DOS can handle such files for conversion purposes.
.PRG Program file used by Specem.
.RAW Raw memory dump from a real Spectrum; just the 48Kb of RAM and a CODE header on the front.
.SEM Snapshots used by ZX Spectrum-Emulator, the German PC-based emulator. SPConv v1.11 will handle this format (actually it already does, but I'm not going to release it yet because not all converted snapshots are working correctly).
.SIT Situation file used by Sinclair v2.00.
.SLT Super level loader snapshot. Used by x128 and WSpecEm at time of writing. Basically a v2/3 Z80 snapshot with level data appended.
.SNA or .snap or .snapshot Mirage Microdrive snapshot format, used by most emulators.
.SNA 128Kb version of SNA. Distinguished by file size of 131103 bytes instead of 49179 bytes for 48Kb version.
.SNX Extended version of SNA, used by Speccy on the Atari ST.
.SP Snapshots used by SPECTRUM.
.SP Snapshots used by VGASPEC.
.SP Snapshots used by ZX Spectrum (Mac emulator).
.Z80 Snapshots used by Z80 and several other emulators. Three versions in existence, the latest of which (v3/SLT) is not supported by all these emulators. Very flexible; support for SamRam, 128K snapshots, etc.
.ZX Snapshots used by KGB.
.ZX82 Snapshots used by Speculator '97.

For the purposes of these descriptions, the following definitions apply:

byte = byte-sized variable; word = 2 bytes, long = 4 bytes. All stored in little-endian (Intel) format unless otherwise stated.

a) .SLT (Super Level loader Trap used by x128 v0.4+, WSpecEm, Z80 v3.04+ etc.)

The level loader trap has one annoying disadvantage; lots of extra files lying around for each game. The super level loader was thought up (by Damien Burke) to replace this multi-file format with a single snapshot file containing all the level data files. It has been designed in co-operation with James McKay (x128), Gerton Lunter (Z80), Rui Ribeiro (WSpecEm) and Darren Salt (helping with Z80Em), so is well-supported already. The format was designed with future expansion in mind, as you will see.

            Size   Description
            varies bytes  Z80 snapshot (version 2+)
            3      bytes  Three null bytes (compatibility; see below)
            3      bytes  "SLT" (signature)
   ---- the following blocks make up a table to access the data files -----
            2      word   data type (0 = end of table, 1 = level data)
            2      word   data identifier (for type 1 this is level number)
            4      long   data length
            2      word   data type (0 = end of table, 1 = level data)
            2      word   data identifier (for type 1 this is level number)
            4      long   data length
            ... and so on
   ---- the following blocks are the data files themselves ----------------
            varies bytes data
            varies bytes data
            ... and so on

The three null bytes after the end of the snapshot are for compatibility reasons; older versions of Z80 would crash if the extra data was just appended to the snapshot. With these three null bytes, they just complain about an error in the snapshot file instead. This, of course, presumes you have renamed the .SLT file to .Z80 and attempted to load it into an older emulator!

After the "SLT" signature, there is a table of data types and sizes. Only data types 0 (end of table) and 1 (level data) are supported at the moment, so if other values are encountered an emulator should ignore that data block.

To read a level data file using .SLT, the emulator should find the correct entry in the table (type = 1, identifier matching the A register when the ED/FB instruction was encountered), get its size from the table and calculate its position from the total of sizes of data blocks previous to the required one, added to the position of the end of the table. E.g., to load level 2 from a .SLT snapshot with this table:

   Position  Size  Value  Description
   40000     2     1      data type = level data
   40002     2     1      data identifier = level 1
   40004     4     256    data length = 256 bytes
   40008     2     1      data type = level data
   40010     2     2      data identifier = level 2
   40012     4     128    data length = 128 bytes
   40016     2     0      data type = end of table
   40018     2     *      data identifier = unused (may as well be zero)
   40020     4     *      data length =  unused (may as well be zero)
   40024     256   *      data block for level 1
   40280     128   *      data block for level 2
   (* = could be anything)

So, the size of level 2 is 128 bytes, and it is located at the end of the table (40024) + the length of all previous blocks (just 256 here) = 40280.
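
A minimal Python sketch of that lookup, for illustration only. It assumes the caller has already found the offset of the first table entry (the byte after the "SLT" signature); the names find_slt_level and table_start are mine and not part of the format.

```python
import struct

def find_slt_level(data, table_start, level):
    """Return (offset, length) of the data block for a level, or None."""
    entries = []
    pos = table_start
    while True:
        # Each table entry: word type, word identifier, long length.
        dtype, ident, length = struct.unpack_from("<HHI", data, pos)
        pos += 8
        if dtype == 0:              # end-of-table entry
            break
        entries.append((dtype, ident, length))
    offset = pos                    # data blocks start right after the table
    for dtype, ident, length in entries:
        if dtype == 1 and ident == level:
            return offset, length
        offset += length            # skip every earlier block, whatever its type
    return None
```

Fed the example table above, find_slt_level(data, 40000, 2) returns (40280, 128). Blocks of unknown type are skipped over but still counted when working out positions.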

Level data is packed in the same way as Z80 snapshot memory banks are.

The trainspotter award seekers of you may wonder why a whole word is used for the data identifier; after all, this is the level number and is held in the A register, so could be just a byte. For level data, correct. But future expansion is better served by a word. For the same reasons, the data length is held as a long word instead of just a word; level data will never exceed 64Kb (indeed, could not even be as much as 48Kb), but future data types may do so. One example; embedding a scan of a game's inlay card in the file is possible, and that file could exceed 64Kb easily.

See this page for Damien Burke's proposals for future data types for inclusion in .SLT snapshots.

b) .SNA, .snap or .snapshot (Mirage Microdrive format used by many emulators)

This format is the most well-supported of all snapshot formats (though Z80 is close on its heels) but has a drawback:

As the program counter is pushed onto the stack so that a RETN instruction can restart the program, 2 bytes of memory are overwritten. This will usually not matter; the game (or whatever) will have stack space that can be used for this. However, if this space is all in use when the snap is made, memory below the stack space will be corrupted. According to Rui Ribeiro, the effects of this can sometimes be avoided by replacing the corrupted bytes with zeros; e.g. take the PC from the stack, replace that word with 0000 and then increment SP. This worked with snapshots of Batman, Bounder and others which had been saved at critical points. Theoretically, this problem could cause a complete crash on a real Spectrum if the stack pointer happened to be at address 16384; the push would try to write to the ROM. How different emulators handle this is not something I know...

When the registers have been loaded, a RETN command is required to start the program. IFF2 is short for interrupt flip-flop 2, and for all practical purposes is the interrupt-enabled flag. Set means enabled.

   Offset   Size   Description
   0        1      byte   I
   1        8      word   HL',DE',BC',AF'
   9        10     word   HL,DE,BC,IY,IX
   19       1      byte   Interrupt (bit 2 contains IFF2, 1=EI/0=DI)
   20       1      byte   R
   21       4      words  AF,SP
   25       1      byte   IntMode (0=IM0/1=IM1/2=IM2)
   26       1      byte   BorderColor (0..7, not used by Spectrum 1.7)
   27       49152  bytes  RAM dump 16384..65535
   Total: 49179 bytes
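
As a rough illustration, the header above can be read with Python's struct module as sketched below. The function and field names are my own, and popping PC off the stack simply mimics what the RETN would do; the optional zeroing trick described above is not applied.

```python
import struct

# I, HL', DE', BC', AF', HL, DE, BC, IY, IX, Interrupt, R, AF, SP, IntMode, Border
SNA_HEADER = struct.Struct("<B4H5HBB2HBB")          # 27 bytes, little-endian
SNA_NAMES = ("I", "HL'", "DE'", "BC'", "AF'", "HL", "DE", "BC", "IY", "IX",
             "interrupt", "R", "AF", "SP", "IM", "border")

def load_sna48(data):
    """Return (registers, ram) for a 48K .SNA image, with PC taken off the stack."""
    if len(data) != 49179:
        raise ValueError("a 48K .SNA file is exactly 49179 bytes")
    regs = dict(zip(SNA_NAMES, SNA_HEADER.unpack_from(data, 0)))
    regs["IFF2"] = bool(regs.pop("interrupt") & 0x04)   # bit 2 of byte 19
    ram = bytearray(data[27:])                          # addresses 16384..65535
    sp = regs["SP"]
    if not 16384 <= sp <= 65534:
        raise ValueError("stack pointer does not point into the RAM dump")
    # What RETN would do: read the return address, then add 2 to SP.
    regs["PC"] = ram[sp - 16384] | (ram[sp - 16383] << 8)
    regs["SP"] = (sp + 2) & 0xFFFF
    return regs, ram
```

The RAM image is returned as a bytearray so that a caller can patch it if desired.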

c) .SNA (128Kb version) (SP_EMU)

This is simply the SNA format extended to include the extra memory banks of the 128K/+2 machines, and fixes the problem with the PC being pushed onto the stack - now it is located in an extra variable in the file (and is not pushed onto the stack at all). The first 49179 bytes of the snapshot are otherwise exactly as described above, so the full description is:

   Offset   Size   Description
   0        27     bytes  SNA header (see above)
   27       16Kb   bytes  RAM bank 5 \
   16411    16Kb   bytes  RAM bank 2  } - as standard 48Kb SNA file
   32795    16Kb   bytes  RAM bank n / (currently paged bank)
   49179    2      word   PC
   49181    1      byte   port 7FFD setting
   49182    1      byte   (unknown - padding for above byte?)
   49183    16Kb   bytes  remaining RAM banks in ascending order
   Total: 131103 or 147487 bytes

The third RAM bank saved is always the one currently paged, even if this is page 5 or 2 - in this case, the bank is actually included twice. The remaining RAM banks are saved in ascending order - e.g. if RAM bank 4 is paged in, the snapshot is made up of banks 5, 2 and 4 to start with, and banks 0, 1, 3, 6 and 7 afterwards. If RAM bank 5 is paged in, the snapshot is made up of banks 5, 2 and 5 again, followed by banks 0, 1, 3, 4, 6 and 7.
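
The bank order and the two possible file sizes follow directly from that rule. A small Python sketch, assuming (this is not stated above) that the low three bits of the port 7FFD byte select the currently paged RAM bank; the function names are mine:

```python
def sna128_bank_order(port_7ffd):
    """List the RAM banks in the order they appear in a 128K .SNA file."""
    paged = port_7ffd & 0x07             # assumed: bits 0-2 select the paged bank
    first = [5, 2, paged]                # always stored first, duplicates included
    rest = [bank for bank in range(8) if bank not in first]
    return first + rest

def sna128_size(port_7ffd):
    """Expected file size: 27-byte header, 4 extra bytes, 16Kb per stored bank."""
    return 27 + 4 + 16384 * len(sna128_bank_order(port_7ffd))
```

With bank 4 paged this gives banks 5, 2, 4, 0, 1, 3, 6, 7 and a size of 131103 bytes; with bank 5 or 2 paged one bank is stored twice and the size is 147487 bytes.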

d) .SP (used by "ZX Spectrum", the ZX Spectrum emulator for Macintosh by Lorenzo Jose Ayuda Serrano)

   Offset   Size   Description
   0        2      byte   "SP" (signature)
   2        2      word   Program length in bytes (49152 bytes)
   4        2      word   Program location (16384)
   6        8      word   BC,DE,HL,AF
   14       4      word   IX,IY
   18       8      word   BC',DE',HL',AF'
   26       2      byte   R,I
   28       4      word   SP,PC
   32       2      word   0 (reserved for future use)
   34       1      byte   Border color
   35       1      byte   0 (reserved for future use)
   36       2      word   Status word

   Status word:
   Bit     Description
   15-8    Reserved for future use
    7-6    Reserved for internal use (0)
      5    Flash: 0=INK/1=PAPER
      4    Interrupt pending for execution
      3    Reserved for future use
      2    IFF2 (internal use)
      1    Interrupt Mode: 0=IM1/1=IM2
      0    IFF1: 0=DI/1=EI
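
For illustration, the status word could be picked apart in Python like this. It is only a sketch: the dictionary keys are my own names, and the word is assumed to have been read from offset 36 in little-endian order already.

```python
def decode_sp_status(status):
    """Split the .SP status word into the flags listed in the table above."""
    return {
        "iff1": bool(status & 0x01),                  # bit 0: 0=DI, 1=EI
        "interrupt_mode": 2 if status & 0x02 else 1,  # bit 1: 0=IM1, 1=IM2
        "iff2": bool(status & 0x04),                  # bit 2
        "interrupt_pending": bool(status & 0x10),     # bit 4
        "flash_paper": bool(status & 0x20),           # bit 5: 0=INK, 1=PAPER
    }
```

Reserved bits are ignored rather than checked.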

e) .ZX (KGB v.1.2-1.3) [Contributed by Troels Norgaard]

All values stored in big-endian format; on 680x0 the most significant byte goes first.

   Offset   Size   Description
   0        49284  bytes  RAM dump 16252..65535
   49284    132    bytes  unused, make 0
   49416    10     word   10,10,4,1,1 (different settings)
   49426    1      byte   InterruptStatus (0=DI/1=EI)
   49427    2      byte   0,3
   49429    1      byte   ColorMode (0=BW/1=Color)
   49430    4      long   0
   49434    16     word   BC,BC',DE,DE',HL,HL',IX,IY
   49450    2      byte   I,R
   49452    2      word   0
   49454    8      byte   0,A',0,A,0,F',0,F
   49462    8      word   0,PC,0,SP
   49470    2      word   SoundMode (0=Simple/1=Pitch/2=RomOnly)
   49472    2      word   HaltMode  (0=NoHalt/1=Halt)
   49474    2      word   IntMode   (-1=IM0/0=IM1/1=IM2)
   49476    10     bytes  unused, make 0
   Total: 49486 bytes

f) .Z80 (Z80) [from Z80 documentation]

The old .Z80 snapshot format (for version 1.45 and below) looks like this:

        Offset  Length  Description
        0       1       A register
        1       1       F register
        2       2       BC register pair (LSB, i.e.  C, first)
        4       2       HL register pair
        6       2       Program counter
        8       2       Stack pointer
        10      1       Interrupt register
        11      1       Refresh register (Bit 7 is not significant!)
        12      1       Bit 0  : Bit 7 of the R-register
                        Bit 1-3: Border colour
                        Bit 4  : 1=Basic SamRom switched in
                        Bit 5  : 1=Block of data is compressed
                        Bit 6-7: No meaning
        13      2       DE register pair
        15      2       BC' register pair
        17      2       DE' register pair
        19      2       HL' register pair
        21      1       A' register
        22      1       F' register
        23      2       IY register (Again LSB first)
        25      2       IX register
        27      1       Interrupt flipflop, 0=DI, otherwise EI
        28      1       IFF2 (not particularly important...)
        29      1       Bit 0-1: Interrupt mode (0, 1 or 2)
                        Bit 2  : 1=Issue 2 emulation
                        Bit 3  : 1=Double interrupt frequency
                        Bit 4-5: 1=High video synchronisation
                                 3=Low video synchronisation
                        Bit 6-7: 0=Cursor/Protek/AGF joystick
                                 1=Kempston joystick
                                 2=Sinclair 2 Left joystick (or user
                                   defined, for version 3 .Z80 files)
                                 3=Sinclair 2 Right joystick

Because of compatibility, if byte 12 is 255, it has to be regarded as being 1. After this header block of 30 bytes the 48K bytes of Spectrum memory follows in a compressed format (if bit 5 of byte 12 is one). The compression method is very simple: it replaces repetitions of at least five equal bytes by a four-byte code ED ED xx yy, which stands for "byte yy repeated xx times". Only sequences of length at least 5 are coded. The exception is sequences consisting of ED's; if they are encountered, even two ED's are encoded into ED ED 02 ED. Finally, every byte directly following a single ED is not taken into a block, for example ED 6*00 is not encoded into ED ED ED 06 00 but into ED 00 ED ED 05 00. The block is terminated by an end marker, 00 ED ED 00.
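
A minimal decompressor for this scheme might look like the Python sketch below. It is not taken from any emulator; the name z80_unpack is mine, and rather than searching for the end marker it simply stops once the expected number of bytes (49152 for an old-format snapshot) has been produced.

```python
def z80_unpack(data, out_len):
    """Expand ED ED xx yy runs until out_len bytes have been produced."""
    out = bytearray()
    i = 0
    while len(out) < out_len and i < len(data):
        if i + 3 < len(data) and data[i] == 0xED and data[i + 1] == 0xED:
            count, value = data[i + 2], data[i + 3]
            out.extend([value] * count)   # "byte yy repeated xx times"
            i += 4
        else:
            out.append(data[i])           # any other byte is copied as it is
            i += 1
    return bytes(out[:out_len])
```

For example, the six bytes ED 00 ED ED 05 00 from the text expand to ED followed by six zero bytes, and ED ED 02 ED expands to two ED bytes.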

That's the format of .Z80 files as used by versions up to 1.45. Starting from version 2.0, a different format is used, since from then on also 128K snapshots had to be supported. This new format is used for all snapshots, either 48K or 128K.

Version 2.01 and 3.0 .Z80 files start with the same 30 byte header as old .Z80 files used. Bits 4 and 5 of the flag byte no longer have any meaning, and the program counter (bytes 6 and 7) is zero to signal a version 2.01 or version 3.0 snapshot file.

After the first 30 bytes, the additional header follows:

        Offset  Length  Description
      * 30      2       Length of additional header block (see below)
      * 32      2       Program counter
      * 34      1       Hardware mode (see below)
      * 35      1       If in SamRam mode, bitwise state of 74ls259.
                        For example, bit 6=1 after an OUT 31,13 (=2*6+1)
                        If in 128 mode, contains last OUT to 7ffd
      * 36      1       Contains 0FF if Interface I rom paged
      * 37      1       Bit 0: 1 if R register emulation on
                        Bit 1: 1 if LDIR emulation on
      * 38      1       Last OUT to fffd (soundchip register number)
      * 39      16      Contents of the sound chip registers
        55      2       Low T state counter
        57      1       Hi T state counter
        58      1       Flag byte used by Spectator (QL spec. emulator)
                        Ignored by Z80 when loading, zero when saving
        59      1       0FF if MGT Rom paged
        60      1       0FF if Multiface Rom paged. Should always be 0.
        61      1       0FF if 0-8191 is RAM
        62      1       0FF if 8192-16383 is RAM
        63      10      5x keyboard mappings for user defined joystick
        73      10      5x ascii word: keys corresponding to mappings above
        83      1       MGT type: 0=Disciple+Epson,1=Disciple+HP,16=Plus D
        84      1       Disciple inhibit button status: 0=out, 0ff=in
        85      1       Disciple inhibit flag: 0=rom pageable, 0ff=not

The value of the word at position 30 is 23 for version 2.01 files, and 54 for version 3.0 files. The starred fields are the ones that constitute the version 2.01 header, and their interpretation has remained unchanged except for byte 34:

        Value:          Meaning in v2.01        Meaning in v3.0
        0               48k                     48k
        1               48k + If.1              48k + If.1
        2               SamRam                  48k + M.G.T.
        3               128k                    SamRam
        4               128k + If.1             128k
        5               -                       128k + If.1
        6               -                       128k + M.G.T.
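
Putting the version test and the hardware table together, a Python sketch (the function name and the returned strings are my own choices, not anything defined by the format):

```python
import struct

HARDWARE = {
    "2.01": {0: "48k", 1: "48k + If.1", 2: "SamRam", 3: "128k", 4: "128k + If.1"},
    "3.0": {0: "48k", 1: "48k + If.1", 2: "48k + M.G.T.", 3: "SamRam",
            4: "128k", 5: "128k + If.1", 6: "128k + M.G.T."},
}

def z80_version_and_hardware(data):
    """Return (version, hardware mode) for the start of a .Z80 file."""
    pc, = struct.unpack_from("<H", data, 6)
    if pc != 0:
        return "old", None                # version 1.45 or below: no extra header
    extra_len, = struct.unpack_from("<H", data, 30)
    version = {23: "2.01", 54: "3.0"}.get(extra_len)
    if version is None:
        raise ValueError("unexpected additional header length %d" % extra_len)
    return version, HARDWARE[version].get(data[34], "unknown")
```

Note how the same value in byte 34 (3, say) means 128k in a version 2.01 file but SamRam in a version 3.0 file.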

The hi T state counter counts up modulo 4. Just after the ULA generates its once-in-every-20-ms interrupt, it is 3, and is increased by one every 5 emulated milliseconds. In these 1/200s intervals, the low T state counter counts down from 17472 to 0, which makes a total of 69888 T states per frame.

The 5 ascii words (high byte always 0) at 73-82 are the keys corresponding to the joystick directions left, right, down (!), up (!), fire respectively. Shift, Symbol Shift, Enter and Space are denoted by [,],/,\ respectively. The ascii values are used only to display the joystick keys; the information in the 5 keyboard mapping words determine which key is actually pressed (and should correspond to the ascii values). The low byte is in the range 0-7 and determines the keyboard row. The high byte is a mask byte and determines the column. Enter for example is stored as 0x0106 (row 6 and column 1) and 'g' as 0x1001 (row 1 and column 4).

Byte 60 must be zero, because the contents of the Multiface RAM are not saved in the snapshot file. If the Multiface was paged when the snapshot was saved, the emulated program will most probably crash when loaded back.

Bytes 61 and 62 are a function of the other flags, such as byte 34, 59, 60 and 83.

Hereafter a number of memory blocks follow, each containing the compressed data of a 16K block. The compression is according to the old scheme, except for the end-marker, which is now absent. The structure of a memory block is:

        Byte    Length  Description
        0       2       Length of data (without this 3-byte header)
        2       1       Page number of block
        3       [0]     Compressed data

The pages are numbered, depending on the hardware mode, in the following way:

        Page    In '48 mode     In '128 mode    In SamRam mode
         0      48K rom         rom (basic)     48K rom
         1      Interface I, Disciple or Plus D rom, according to setting
         2      -               rom (reset)     samram rom (basic)
         3      -               page 0          samram rom (monitor,..)
         4      8000-bfff       page 1          Normal 8000-bfff
         5      c000-ffff       page 2          Normal c000-ffff
         6      -               page 3          Shadow 8000-bfff
         7      -               page 4          Shadow c000-ffff
         8      4000-7fff       page 5          4000-7fff
         9      -               page 6          -
        10      -               page 7          -
        11      Multiface rom   Multiface rom   -

In 48K mode, pages 4,5 and 8 are saved. In SamRam mode, pages 4 to 8 are saved. In '128 mode, all pages from 3 to 10 are saved. This version saves the pages in numerical order. There is no end marker.
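
Walking the memory blocks of a version 2.01 or 3.0 file could be sketched in Python as follows. The name z80_blocks is mine; the sketch assumes the first block starts straight after the additional header (offset 32 plus the length word at offset 30) and hands back the still-compressed data of each block.

```python
import struct

def z80_blocks(data):
    """Yield (page number, compressed data) for each memory block in the file."""
    extra_len, = struct.unpack_from("<H", data, 30)
    pos = 32 + extra_len                  # 55 for v2.01 files, 86 for v3.0 files
    while pos + 3 <= len(data):
        length, page = struct.unpack_from("<HB", data, pos)
        pos += 3                          # the 3-byte block header is not counted
        yield page, data[pos:pos + length]
        pos += length
```

As there is no end marker, the loop simply runs until the file is exhausted.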

g) .ZX82 (Speculator '97) [Taken from the Speculator documentation]

Amiga Speculator has its own file format which I have called ZX82 format because it contains a file identifier in the first four bytes consisting of the ASCII characters "ZX82". The format has a 12 byte header which contains the normal Spectrum type file information like length, type, start etc. as well as a compression flag which is set if the file is byte run compressed. Snapshot files have a further 32 bytes of register values and border colour information. Listed below are the offset definitions taken from the Speculator source code in case you need to write a conversion utility. All registers and other values are in Motorola format (High, Low). I have defined everything in bytes to avoid any possible confusion.

* The Standard ZX82 Header
ZX_ID           rs.l    1       Identifier for a Speculator file "ZX82"
ZX_Type         rs.b    1       0:BASIC 1:Numeric 2:String 3:Code 4:Snapshot
ZX_Comp         rs.b    1       Is data block byte run compressed ? $00=No $FF=Yes
ZX_Length_H     rs.b    1       File length up to 64k (ELINE-PROG for BASIC)
ZX_Length_L     rs.b    1
ZX_Start_H      rs.b    1       Start address for code (AUTOSTART for BASIC)
ZX_Start_L      rs.b    1
ZX_ProgLen_H    rs.b    1       Array name (VARS-PROG for BASIC)
ZX_ProgLen_L    rs.b    1
ZX_ZXHdrLen     rs.b    0       Length of ZX file header
ZX_ZXData       rs.b    0       Start of Data block for standard ZX file

* The extended Snapshot ZX82 Header
ZX_Border       rs.b    1       Border colour
ZX_IntMode      rs.b    1       IntMode over-ride (0=use i_reg, 1=im1 and 2=im2)
ZX_Registers    rs.b    0       Z80 register values for Snapshot Files
ZX_iy_H_reg     rs.b    1       (High then Low i.e. Motorola format)
ZX_iy_L_reg     rs.b    1
ZX_ix_H_reg     rs.b    1
ZX_ix_L_reg     rs.b    1
ZX_de_H_reg     rs.b    1
ZX_de_L_reg     rs.b    1
ZX_bc_H_reg     rs.b    1
ZX_bc_L_reg     rs.b    1
ZX_hl_H_reg     rs.b    1
ZX_hl_L_reg     rs.b    1
ZX_af_H_reg     rs.b    1
ZX_af_L_reg     rs.b    1
ZX_de_H_alt     rs.b    1
ZX_de_L_alt     rs.b    1
ZX_bc_H_alt     rs.b    1
ZX_bc_L_alt     rs.b    1
ZX_hl_H_alt     rs.b    1
ZX_hl_L_alt     rs.b    1
ZX_af_H_alt     rs.b    1
ZX_af_L_alt     rs.b    1
ZX_sp_H_reg     rs.b    1
ZX_sp_L_reg     rs.b    1
ZX_if_H_reg     rs.b    1
ZX_if_L_reg     rs.b    1
ZX_rf_H_reg     rs.b    1
ZX_rf_L_reg     rs.b    1
ZX_pc_H_reg     rs.b    1
ZX_pc_L_reg     rs.b    1
ZX_SnpHdrLen    rs.b    0       Length of Snapshot file header
ZX_SnpData      rs.b    65496   Start of data block for Snapshot type

The ZX_Type field is derived from the MGT Disciple directory MGT_Type-1, so further file types may be supported in this way in the future.

The compression used is the standard byte run compression as used by ILBM IFF files. The whole 48k data block is compressed as if it were one long row. See Amiga ROM Kernel Reference Manual: Devices Third Edition, Appendix A - IFF Specification (P347), Appendix C - Example Packer C code (P538).
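
For readers without the Amiga manuals to hand, here is a Python sketch of byte run decoding as I understand the ILBM scheme (ByteRun1): a control byte n in the range 0..127 means "copy the next n+1 bytes literally", a control byte in the range 129..255 means "repeat the next byte 257-n times", and 128 is a no-op. Check it against the IFF specification before relying on it; the function name is mine.

```python
def byterun1_unpack(data):
    """Decode ILBM-style byte run compressed data."""
    out = bytearray()
    i = 0
    while i < len(data):
        n = data[i]
        i += 1
        if n < 128:                       # literal run of n+1 bytes
            out += data[i:i + n + 1]
            i += n + 1
        elif n > 128:                     # next byte repeated 257-n times
            out += bytes([data[i]]) * (257 - n)
            i += 1
        # n == 128 does nothing
    return bytes(out)
```

For instance, the bytes FE AA 02 01 02 03 decode to AA AA AA 01 02 03.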

Tape Files

.BLK Tape format used by Sinclair v2.00; seems to be identical to Z80's .TAP files.
.SPC Tape format used by SP, the Polish emulator.
.TAP Tape format used by Z80; supports headerless files and not much else.
.TAP Tape format used by Warajevo - supports lots of features; turbo-load, headerless files, etc.
.TZX New tape format used to store turbo-loaders, etc.
.VOC Straight sound sample of a tape; used by several emulators.
.ZXS Very flexible tape format, not actually used by any emulators - used to store real Spectrum tapes in a digital format. All come from the ZX Spectrum Software Museum.

a) .TAP and .BLK (Z80, Sinclair, several others) [from Z80 documentation]

The .TAP files contain blocks of tape-saved data. All blocks start with two bytes specifying how many bytes will follow (not counting the two length bytes). Then raw tape data follows, including the flag and checksum bytes. The checksum is the bitwise XOR of all bytes including the flag byte. For example, when you execute the line SAVE "ROM" CODE 0,2 this will result in:

             |------ Spectrum-generated data -------|       |---------|

       13 00 00 03 52 4f 4d 7x20 02 00 00 00 00 80 f1 04 00 ff f3 af a3

       ^^^^^...... first block is 19 bytes (17 bytes+flag+checksum)
             ^^... flag byte (A reg, 00 for headers, ff for data blocks)
                ^^ first byte of header, indicating a code block

       file name ..^^^^^^^^^^^^^
       header info ..............^^^^^^^^^^^^^^^^^
       checksum of header .........................^^
       length of second block ........................^^^^^
       flag byte ............................................^^
       first two bytes of rom .................................^^^^^
       checksum (checkbittoggle would be a better name!).............^^

Note that it is possible to join .TAP files by simply stringing them together, for example COPY /B FILE1.TAP + FILE2.TAP ALL.TAP
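
Reading such a file back is just as simple. A Python sketch (tap_blocks is my own name) that splits a .TAP image into blocks and checks each checksum:

```python
from functools import reduce

def tap_blocks(data):
    """Yield (flag byte, payload, checksum_ok) for every block in a .TAP image."""
    pos = 0
    while pos + 2 <= len(data):
        length = data[pos] | (data[pos + 1] << 8)     # length word, LSB first
        block = data[pos + 2:pos + 2 + length]
        pos += 2 + length
        if len(block) < 2:
            raise ValueError("block too short to hold a flag and a checksum")
        # The checksum is the XOR of everything before it, flag byte included.
        expected = reduce(lambda a, b: a ^ b, block[:-1], 0)
        yield block[0], block[1:-1], expected == block[-1]
```

Run on the SAVE "ROM" CODE 0,2 example above, this yields a header block (flag 00, 17 bytes) and a data block (flag ff, bytes f3 af), both with good checksums. Because blocks are simply read one after another, joined .TAP files need no special handling.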

For completeness, I'll include the structure of a tape header. A header always consists of 17 bytes:

        Byte    Length  Description
        0       1       Type (0,1,2 or 3)
        1       10      Filename (padded with blanks)
        11      2       Length of data block
        13      2       Parameter 1
        15      2       Parameter 2

The type is 0,1,2 or 3 for a Program, Number array, Character array or Code file. A SCREEN$ file is regarded as a Code file with start address 16384 and length 6912 decimal. If the file is a Program file, parameter 1 holds the autostart line number (or a number >=32768 if no LINE parameter was given) and parameter 2 holds the start of the variable area relative to the start of the program. If it's a Code file, parameter 1 holds the start of the code block when saved, and parameter 2 holds 32768. For data files finally, the byte at position 14 decimal holds the variable name.
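
The 17-byte header can be unpacked in Python along these lines. This is a sketch only: the function name and dictionary keys are mine, and the two parameters are returned raw, since their meaning depends on the file type as described above.

```python
import struct

TAPE_TYPES = ("Program", "Number array", "Character array", "Code")

def parse_tape_header(header):
    """Decode a 17-byte tape header (without the flag and checksum bytes)."""
    if len(header) != 17:
        raise ValueError("a tape header is always 17 bytes")
    ftype, name, length, param1, param2 = struct.unpack("<B10sHHH", header)
    return {
        "type": TAPE_TYPES[ftype] if ftype < 4 else ftype,
        "name": name.decode("ascii", "replace").rstrip(),   # padded with blanks
        "length": length,
        "param1": param1,
        "param2": param2,
    }
```

For the SAVE "ROM" CODE 0,2 header this gives type Code, name ROM, length 2, parameter 1 = 0 and parameter 2 = 32768.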

b) .TAP (Warajevo) [from Ribic Samir]

Warajevo's tape files (TAP) have the following format:

At the beginning of the file there are four bytes holding a pointer to the first block, followed by four bytes holding a pointer to the last block. The next four bytes contain #FFFFFFFF. So an empty tape has the format:

#04 #00 #00 #00 #00 #00 #00 #00 #FF #FF #FF #FF

The sequence #00 #00 #00 #00 #FF #FF #FF #FF is, in fact, an EOF (end of file) marker. Every block contains the following:

- 4 bytes, a pointer to the previous block, which is 0 for first block;
- 4 bytes, a pointer to the next block or to the EOF marker for last block;
- 2 bytes, block size;
- 1 byte, a flag byte;
- the data bytes.

If the block size is 65535, it is a compressed block. It looks like:

- 4 bytes, a pointer to the previous block;
- 4 bytes, pointer to next block;
- 2 bytes, 65535;
- 1 byte, a flag byte;
- 2 bytes, decompressed size;
- 2 bytes, compressed size;
- 2 bytes, signature length (internal);
- the data bytes.

Signatures are important for the imploding algorithm used in the Warajevo emulator. When decompressing, this algorithm either copies bytes from the source file, or goes back a few bytes in the destination file and copies some bytes from there.

The explanation of the compressed data bytes is rather complex. We used a format similar to that of PKLITE, but unlike PKLITE, where signature bytes are mixed with data bytes, the authors divided them into two parts for easier debugging.

Recall the elements of the imploding (LZ77) algorithm. It depends on copying byte sequences that have already occurred. For example:

3D 18 2E 42 3D 18 2E 15 42 3D 19

will be encoded as:

3D 18 2E 42  15  19

Archivers differ in how they encode this special 'Return for...' code.

In the Warajevo compressed format there are two parts: signatures and data. In our example the signatures will be coded as (binary):

00001001 010100xx

while data bytes will be

3D 18 2E 42 04 15 05 19

The signatures form a finite automaton that describes what to do with the data bytes. If a bit is 0, the next data byte is a plain data byte; if the bit is 1, it is a code for returning.

In our example, the four zeros in the signatures mean that four bytes (3D, 18, 2E, 42) can simply be copied to the output buffer. The next bit is 1. This means: return for xxxx bytes and copy yyyy.

The value of yyyy (the size of the string to be copied) is held in the signatures if it is less than 10, or in the signatures and data bytes if it is greater than or equal to 10.

The size depends on the next 2-4 signature bits:

 010: size=2
  00: size=3
 100: size=4
 101: size=5
 011: size>=10
1100: size=6
1101: size=7
1110: size=8
1111: size=9

If the size is greater than or equal to 10, the next data byte contains the actual size minus 10. That means the maximal string size is 265.

The next data byte determines the lower byte of the distance of the string to be copied (the lower byte of xxxx). If size=2, the higher byte is always zero (so for this size the distance can be at most 255). If the size differs from 2, the next 1-6 signature bits determine the higher byte:

     1: higher byte=0
  0000: higher byte=1
  0001: higher byte=2
 00100: higher byte=3
 00101: higher byte=4
 00110: higher byte=5
 00111: higher byte=6
01nnnn: higher byte=7+nnnn

Experiment with some compressed ASCII text. Here is an algorithm in Pascal for decompressing, to help in understanding the format:

procedure decompress_b;
{ Lines marked "assumed" are not in the original listing; they follow the
  description above. PutToOutputBuffer and OutputByteBack are assumed helper
  names, and the label declaration is assumed as well. }
label lb,b0,b1,b00,b01,b10,b11,b010,b011,b100,b101,b110,b111,
      b1100,b1101,b1110,b1111,v,v0,v1,v00,v01,v000,v001,
      v0000,v0001,v0010,v0011,v00100,v00101,v00110,v00111,izlaz;
begin
  if duzina_ul_dek=0 then finished:=true else finished:=false;
  while not finished do begin
    if nextbit=0 then begin
       {assumed: a 0 bit means one data byte is copied unchanged}
       TakeFromInputBuffer(b,finished);
       PutToOutputBuffer(b)
     end
     else begin
       {I know, it is  goto, but more readable than
        nested if then else sequences}
        lb: if nextbit=0 then goto b0 else goto b1;
        b0: if nextbit=0 then goto b00 else goto b01;
        b1: if nextbit=0 then goto b10 else goto b11;
        b11: if nextbit=0 then goto b110 else goto b111;
        b01: if nextbit=0 then goto b010 else goto b011;
        b10: if nextbit=0 then goto b100 else goto b101;
        b110: if nextbit=0 then goto b1100 else goto b1101;
        b111: if nextbit=0 then goto b1110 else goto b1111;
        b010: bytes:=2;
          TakeFromInputBuffer(b,finished);
          return_for:=b; {assumed: size 2 has a low distance byte only}
          goto izlaz;
        b00: bytes:=3;goto v;
        b100: bytes:=4;goto v;
        b101: bytes:=5;goto v;
        b011: TakeFromInputBuffer(b,finished);
          bytes:=b+10;goto v;
        b1100: bytes:=6;goto v;
        b1101: bytes:=7;goto v;
        b1110: bytes:=8;goto v;
        b1111: bytes:=9;goto v;
        v: TakeFromInputBuffer(b,finished);
        return_for:=b; {assumed: low byte of the distance}
        if nextbit=0 then goto v0 else goto v1;
        v0: if nextbit=0 then goto v00 else goto v01;
        v1:goto izlaz;
        v00: if nextbit=0 then goto v000 else goto v001;
        v01: Auxsilary:=7;
          if nextbit=1 then Auxsilary:=Auxsilary+8;
          if nextbit=1 then Auxsilary:=Auxsilary+4;
          if nextbit=1 then Auxsilary:=Auxsilary+2;
          if nextbit=1 then Auxsilary:=Auxsilary+1;
          return_for:=return_for+Auxsilary*256; {assumed}
          goto izlaz;
        v000: if nextbit=0 then goto v0000 else goto v0001;
        v001: if nextbit=0 then goto v0010 else goto v0011;
        v0010: if nextbit=0 then goto v00100 else goto v00101;
        v0011: if nextbit=0 then goto v00110 else goto v00111;
        v0000: return_for:=return_for+1*256;goto izlaz;
        v0001: return_for:=return_for+2*256;goto izlaz;
        v00100: return_for:=return_for+3*256;goto izlaz;
        v00101: return_for:=return_for+4*256;goto izlaz;
        v00110: return_for:=return_for+5*256;goto izlaz;
        v00111: return_for:=return_for+6*256;goto izlaz;
        izlaz: {"exit"; assumed: copy the string from earlier output}
        for i:=1 to bytes do
          PutToOutputBuffer(OutputByteBack(return_for))
     end {else}
  end {while}
end; {decompress_b}
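
For comparison, here is my own Python sketch of the same decoding, written from the bit tables above rather than taken from Warajevo itself. Several things are assumptions: signature bits are read most significant bit first (which is what the worked example suggests), the signature bytes and data bytes are passed in as two separate byte strings, and decoding stops once out_len bytes (the decompressed size from the block header) have been produced.

```python
def warajevo_unpack(signatures, data, out_len):
    """Decode one compressed block given its signature bytes and data bytes."""
    bits = ((byte >> shift) & 1 for byte in signatures for shift in range(7, -1, -1))
    stream = iter(data)
    bit = lambda: next(bits)
    out = bytearray()
    while len(out) < out_len:
        if bit() == 0:                    # plain data byte
            out.append(next(stream))
            continue
        if bit() == 0:                    # size codes starting with 0
            if bit() == 0:
                size = 3                  # 00
            elif bit() == 0:
                size = 2                  # 010
            else:
                size = next(stream) + 10  # 011: data byte holds size-10
        elif bit() == 0:
            size = 4 + bit()              # 100, 101
        else:
            size = 6 + 2 * bit() + bit()  # 1100 .. 1111
        distance = next(stream)           # lower byte of the distance
        if size != 2:
            if bit() == 1:
                high = 0                  # 1
            elif bit() == 1:              # 01nnnn
                high = 7 + (bit() << 3 | bit() << 2 | bit() << 1 | bit())
            elif bit() == 0:
                high = 1 + bit()          # 0000, 0001
            else:
                high = 3 + 2 * bit() + bit()   # 00100 .. 00111
            distance |= high << 8
        if distance == 0 or distance > len(out):
            raise ValueError("distance points outside the output so far")
        for _ in range(size):             # byte by byte, so overlaps work
            out.append(out[-distance])
    return bytes(out)
```

Given the signatures 00001001 010100xx and the data bytes 3D 18 2E 42 04 15 05 19 from the example, this reproduces the original eleven bytes.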

Complex? Yes, it is. I spent more than 30 days developing the algorithm, analysing some archivers and optimizing compression speed (it is still slow, but acceptable), and I worked mostly on paper, because it was in the hardest days of summer 1993, without electric power, water and food (at that time I was losing 1kg a week), when only a miracle saved Sarajevo from falling. At that time I could not leave the army building, and while I waited for new battle tasks I developed the compression algorithm.

c) .TZX (x128, xzx, Warajevo)

.TZX files are a format developed by Tomaz Kac to allow the storage of games with non-standard loaders in a format much smaller than .VOC files. The full specification can be found at World of Spectrum.

Other Files

.DAT Data files used by level-loader versions of a game (Z80Em does not use a .DAT extension at all; instead files are just numbered, e.g. "1" instead of "GAME1.DAT").
.MDR Microdrive cartridge file as used by Spectator, Carlo Delhez' Speccy emulator for the QL, and other emulators - xzx and Z80.
.OUT OUT logs from Z80.
.SCR Screendumps from Z80 and WSpecEm.

a) .MDR (Spectator, xzx, Z80) [from Z80 documentation]

The following information is adapted from Carlo's documentation. It can also be found in the 'Spectrum Microdrive Book', by Ian Logan (co-writer of the excellent 'Complete Spectrum ROM Disassembly').

A cartridge file contains 254 'sectors' of 543 bytes each, and a final flag byte which is non-zero if the cartridge is write protected, so the total length is 137923 bytes. On the cartridge tape, after a GAP of some time the Interface I writes 10 zeros and 2 FF bytes (the preamble), and then a 15-byte header-block-with-checksum. After another GAP, it writes a preamble again, with a 15-byte record-descriptor-with-checksum (which has a structure very much like the header block), immediately followed by the data block of 512 bytes, and a final checksum of those 512 bytes. The preamble is used by the Interface I hardware to synchronise, and is not explicitly used by the software. The preamble is not saved to the microdrive file:

    Offset Length Name    Contents
      0      1   HDFLAG   Value 1, to indicate header block
      1      1   HDNUMB   sector number (values 254 down to 1)
      2      2            not used
      4     10   HDNAME   microdrive cartridge name (blank padded)
     14      1   HDCHK    header checksum (of first 14 bytes)

     15      1   RECFLG   - bit 0: always 0 to indicate record block
                          - bit 1: set for the EOF block
                          - bit 2: reset for a PRINT file
                          - bits 3-7: not used (value 0)
     16      1   RECNUM   data block sequence number (value starts at 0)
     17      2   RECLEN   data block length (<=512, LSB first)
     19     10   RECNAM   filename (blank padded)
     29      1   DESCHK   record descriptor checksum (of previous 14 bytes)
     30    512            data block
    542      1   DCHK     data block checksum (of all 512 bytes of data
                           block, even when not all bytes are used)
    (the 543-byte sector layout above is repeated 254 times)

(Actually, this information is 'transparent' to the emulator. All it does is store 2 times 254 blocks in the .MDR file as they are OUTed, alternately 15 and 528 bytes long. The emulator does check checksums, see below; the other fields are dealt with by the emulated Interface I software.)

A used record block is either an EOF block (bit 1 of RECFLG is 1) or contains 512 bytes of data (RECLEN=512, i.e. bit 1 of MSB is 1). An empty record block has a zero in bit 1 of RECFLG and also RECLEN=0. An unusable block (as determined by the FORMAT command) is an EOF block with RECLEN=0.
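As a rough illustration of the layout and the block classification described above, here is a minimal Python sketch. The names parse_mdr, classify and Sector are made up for this example and do not come from any emulator; the "unknown" result is also an assumption, used for combinations the text above does not cover.

```python
import struct
from collections import namedtuple

SECTOR_SIZE = 543
SECTOR_COUNT = 254
MDR_SIZE = SECTOR_COUNT * SECTOR_SIZE + 1   # 137923, including the flag byte

# Field names follow the table above; the two unused header bytes are skipped.
Sector = namedtuple(
    "Sector",
    "hdflag hdnumb hdname hdchk recflg recnum reclen recnam deschk data dchk",
)

# Little-endian, no padding: 15-byte header, 15-byte descriptor,
# 512 data bytes, 1 data checksum byte.
_LAYOUT = struct.Struct("<BB2x10sB" "BBH10sB" "512sB")
assert _LAYOUT.size == SECTOR_SIZE


def parse_mdr(image):
    """Return (list of 254 Sector tuples, write_protected flag)."""
    if len(image) != MDR_SIZE:
        raise ValueError("expected %d bytes, got %d" % (MDR_SIZE, len(image)))
    sectors = [
        Sector(*_LAYOUT.unpack_from(image, i * SECTOR_SIZE))
        for i in range(SECTOR_COUNT)
    ]
    return sectors, image[-1] != 0


def classify(sector):
    """Classify a record block using the rules in the paragraph above."""
    eof = bool(sector.recflg & 0x02)        # bit 1 of RECFLG
    if eof:
        return "unusable" if sector.reclen == 0 else "used"
    if sector.reclen == 512:
        return "used"
    if sector.reclen == 0:
        return "empty"
    return "unknown"                        # not covered by the description
```

Reading a file would then look something like sectors, protected = parse_mdr(open("cart.mdr", "rb").read()), where cart.mdr is just a placeholder filename.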

The three checksums are calculated by adding all the bytes together modulo 255; this will never produce a checksum of 255. Possibly, this is the value that is read by the Interface I if there's no or bad data on the tape.
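Taking the description above literally (sum of the bytes, modulo 255), the checksum can be sketched in Python as follows. The function names are invented for this example, and the byte ranges are those from the sector table: HDCHK covers offsets 0-13, DESCHK covers 15-28 and DCHK covers 30-541.

```python
def mdr_checksum(block):
    """Sum of all bytes modulo 255; the result is always in 0..254."""
    return sum(block) % 255


def sector_checksums_ok(sector):
    """Check the three checksums of one 543-byte sector."""
    if len(sector) != 543:
        raise ValueError("a sector is 543 bytes long")
    return (
        sector[14] == mdr_checksum(sector[0:14])          # HDCHK
        and sector[29] == mdr_checksum(sector[15:29])     # DESCHK
        and sector[542] == mdr_checksum(sector[30:542])   # DCHK
    )
```

Because the sum is reduced modulo 255 rather than 256, a byte total of exactly 255 gives a checksum of 0, which is why the value 255 can never appear.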

In normal operation, the 15-byte header block and the 15-byte record descriptor of every sector will have the right checksum. If the checksum is not right, the block will be treated as a GAP. For instance, if you type OUT 239,0 on a normal Spectrum with Interface I, the microdrive motor starts running and the cartridge will be erased completely in 7 seconds. CAT 1 will respond with 'microdrive not ready'. Try it on the emulator...

b) .SCR (Z80, WSpecEm)

These files are just Spectrum screen dumps, and are simply the 6912 bytes of pixel and attribute data found at address 16384, stored on disk in exactly the same way as they are stored in memory.

To elaborate; the Spectrum screen is split into four areas; top third, mid third, bottom third and attributes (colours). The thirds each consist of 2048 bytes and the attribute area is 768 bytes (32 characters wide x 24 lines). So the first 6144 bytes are the actual pixel data and the remainder decides what two colours are used in each 8x8 square.
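A small Python sketch of that split, for anyone who wants to pick a .SCR file apart; split_scr is a name made up for this example.

```python
THIRD_BYTES = 2048              # pixel data for one third of the screen
PIXEL_BYTES = 3 * THIRD_BYTES   # 6144
ATTR_BYTES = 32 * 24            # 768, one byte per 8x8 square
SCR_SIZE = PIXEL_BYTES + ATTR_BYTES   # 6912


def split_scr(data):
    """Return ([top, middle, bottom] pixel thirds, attribute bytes)."""
    if len(data) != SCR_SIZE:
        raise ValueError("a .SCR file is %d bytes long" % SCR_SIZE)
    thirds = [
        data[i * THIRD_BYTES:(i + 1) * THIRD_BYTES] for i in range(3)
    ]
    return thirds, data[PIXEL_BYTES:]
```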

Each third of the screen is laid out unusually; the first 32 bytes are the pixels for the top row of the first character line, then the next 32 bytes are the pixels for the top row of the second character line and so on until you reach the ninth load of 32 bytes, which is the second row of the first character line. The next 32 bytes are the second row of the second character line, and so on. It's hard to explain, so the best thing to do is see for yourself; write a program to POKE data to addresses from 16384 upwards and see how the bytes fill in on the screen.
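The ordering described above can also be written as a formula. The following Python sketch (scr_offset and scr_address are hypothetical names) gives the position of the byte holding pixel row y (0 to 191, counted from the top) in byte column x (0 to 31):

```python
SCREEN_BASE = 16384   # address of the first screen byte in Spectrum memory


def scr_offset(x, y):
    """Offset into the 6144 bytes of pixel data for column x, pixel row y."""
    if not (0 <= x < 32 and 0 <= y < 192):
        raise ValueError("x must be 0..31 and y must be 0..191")
    third = y // 64              # which third of the screen
    char_line = (y % 64) // 8    # character line within that third (0..7)
    pixel_row = y % 8            # pixel row within the character line (0..7)
    # Within a third, all eight character lines' copies of pixel row 0 come
    # first (8 x 32 = 256 bytes), then all copies of pixel row 1, and so on.
    return third * 2048 + pixel_row * 256 + char_line * 32 + x


def scr_address(x, y):
    """Memory address of the same byte on a real Spectrum."""
    return SCREEN_BASE + scr_offset(x, y)
```

Checking against the text: the top row of the second character line is y = 8, which gives offset 32, and the second row of the first character line is y = 1, which gives offset 256, the ninth load of 32 bytes.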