Topic: A New Approach to Chiptune Data Encoding
Dictionary Encoding of Song Data
Since the advent of music trackers, chiptune modules have traditionally been encoded using a pattern/sequence approach. Considering the constraints of 1980s consumer computer hardware, this was undoubtedly a near-optimal approach.
However, in the present day, perhaps the capabilities of modern computers and cross development offer new possibilities for the encoding of chiptune data?
Back in the day, the two core requirements aside from providing a reasonably efficient storage format were:
Fast encoding from the editor side with limited memory, as musicians like to listen to their work in progress instantly at the push of a button.
Fast decoding, since enough CPU time must be left to perform other tasks. In the realm of 1-bit music, this is especially critical, since on most target machines we cannot perform any tasks while synthesizing the output. That means that reading in music data must be as fast as possible, in order to minimize row transition noise. For this reason, we tend to avoid using multi-track sequences, which are very efficient in terms of storage size, and are fast to encode, but are much slower to decode than a combined sequence for all tracks.
Obviously nothing has changed regarding the second requirement. However, the first requirement no longer stands, when using editors running on a PC. So, how can we use this to our advantage?
A Dictionary Based Encoding Algorithm
I propose the following alternative to the traditional pattern/sequence encoding:
1. Starting from an uncompressed module with sequence/pattern data, produce a dictionary of all unique pattern rows in the module. The dictionary may consist of at most 0x7fff bytes; otherwise the module is not a viable candidate.
2. Construct a sequence as follows:
2A. Replace each step in the original sequence with the rows of the corresponding pattern.
2B. Replace the rows with pointers to the corresponding dictionary entries.
3. Beginning with a window size of 3 and a nesting level of 0, apply the following steps:
3A. Find all consecutive repetitions of the given window in the new sequence and replace them with a control value that holds the window size and the number of repetitions. The control value is distinguished by having bit 15 set; a dictionary pointer will never have bit 15 set. (On the ZX Spectrum, this would be reversed, as the dictionary will reside in upper memory.) Substitution shall not take place inside compressed blocks. So, assuming the following sequence:
<ctrl window:4 length:4>1 2 3 4 5 6 7 3 4 5 6 7...
the packer will not replace the second occurrence of 3 4 5 6 7, because the first occurrence is part of the compressed block <ctrl...>1 2 3 4.
3B. If executing step 3A replaced any chunk, increment the nesting level by 1.
3C. Increment the window size by 1 and repeat from step 3A until reaching a predefined maximum window size, or a predefined maximum nesting level.
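The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: it assumes 16-bit sequence entries, and a control-value layout with the window size in bits 8-14 and the repeat count in bits 0-7, which is not fixed by the description above. Note that a control value and its chunk form an atomic unit, so matches are only attempted at unit boundaries, which enforces the "no substitution inside compressed blocks" rule.

```python
CTRL_FLAG = 0x8000   # bit 15 marks a control value

def ctrl(window, reps):
    # Assumed layout: bits 8-14 = window size, bits 0-7 = repeat count.
    return CTRL_FLAG | (window << 8) | reps

def unit_starts(seq):
    """Positions where an atomic unit begins. A control value plus the
    `window` sequence entries of its chunk form one unit; matches must
    never start or end inside such a chunk."""
    starts, i = set(), 0
    while i < len(seq):
        starts.add(i)
        if seq[i] & CTRL_FLAG:
            i += 1 + ((seq[i] >> 8) & 0x7f)   # skip control value + chunk
        else:
            i += 1
    starts.add(len(seq))
    return starts

def encode(patterns, order, max_window=16, max_nesting=4):
    # Step 1: build a dictionary of unique pattern rows.
    dictionary, index = [], {}
    # Step 2: expand the pattern order into a flat sequence of pointers.
    seq = []
    for p in order:
        for row in patterns[p]:
            row = tuple(row)
            if row not in index:
                index[row] = len(dictionary)
                dictionary.append(row)
            seq.append(index[row])
    # Step 3: replace consecutive repetitions of a window with a control
    # value. Window sizes count flat sequence entries, so already-packed
    # blocks can themselves be part of a larger repeated window (nesting).
    window, nesting = 3, 0
    while window <= max_window and nesting <= max_nesting:
        starts = unit_starts(seq)
        out, i, replaced = [], 0, False
        while i < len(seq):
            chunk = seq[i:i + window]
            reps = 0
            if i in starts and (i + window) in starts:
                while (reps < 255 and
                       seq[i + (reps + 1) * window:
                           i + (reps + 2) * window] == chunk):
                    reps += 1
            if reps:
                out.append(ctrl(window, reps))   # one copy + repeat count
                out.extend(chunk)
                i += (reps + 1) * window
                replaced = True
            else:
                out.append(seq[i])
                i += 1
        seq = out
        if replaced:
            nesting += 1
        window += 1
    return dictionary, seq
```

For example, a module whose order repeats one two-row pattern twice collapses into a single control value followed by one copy of the window.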
Testing the Algorithm
It all may sound good on paper, but some reliable data is needed to prove it actually works.
For this purpose, I ran some tests on a large set of nearly 1800 XM files obtained from ftp.modland.com.
From the XMs, input modules were prepared as follows:
Only note triggers are considered, any other data is discarded.
From the extracted note data, all empty rows are discarded.
Each note row is prepended with a control value, which has bits set depending on which channels have a note trigger. Empty channels are then discarded.
The maximum window size is determined as the length of the longest note pattern, divided by 2.
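The note-extraction steps might look roughly like this in Python. The row layout used here, a leading channel bitmask followed by the notes of the triggering channels, is an illustrative assumption (the actual control value in the tests may be laid out differently):

```python
def prepare_rows(rows):
    """Turn raw pattern rows (one note-or-None per channel) into packed
    rows: a channel-trigger bitmask followed by the notes of the
    channels that trigger. Empty rows are discarded entirely."""
    packed = []
    for row in rows:
        mask, notes = 0, []
        for ch, note in enumerate(row):
            if note is not None:        # only note triggers are kept
                mask |= 1 << ch
                notes.append(note)
        if mask:                        # drop rows with no triggers
            packed.append((mask, *notes))
    return packed
```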
Some larger modules were included in the test set, just to get a picture of how the algorithm would perform on larger data sets. In reality, any module that produces a dictionary larger than 0x7fff bytes would of course not be usable.
I then ran the dictionary encoding algorithm on the prepared modules, using 3 different settings:
No recursion applied (step 3 skipped)
Recursion with a maximum depth (nesting level) of 1
Recursion with a maximum depth of 4
An optimization is considered a success if the dictionary based algorithm produced a smaller file than the pattern/sequence approach.
modules tested: 1793
no recursion: 1641
recursion depth = 1: 1676
recursion depth = 4: 1701
no recursion: 29.1%
recursion, depth = 1: 31.6%
recursion, depth = 4: 34.1%
Average savings ratios are obtained by comparing output size against the pattern/sequence encoded input data.
no recursion: 72.3%
recursion, depth = 1: 91.5%
recursion, depth = 4: 91.7%
no recursion: -110.4%
recursion, depth = 1: -78.2%
recursion, depth = 4: -59.7%
The results are quite surprising, in my opinion.
Even with the non-recursing algorithm, only 152 modules (8.5% of all files) could not be compressed better than with the pattern/sequence approach. Recursing up to a depth of 4 nesting levels further reduces the failures to 92 files (5.1%). I had expected the dictionary algorithm to lose to pattern/sequence encoding in at least 1/3 of the cases.
I had assumed that dictionary encoding performance would correlate with module size, but the test data does not back up this hypothesis. However, when dictionary encoding performs worse than pattern/sequence encoding, it generally happens at smaller module sizes.
Recursive dictionary encoding on average offers only marginal benefits over non-recursive dictionary encoding. However, there were a number of cases where recursive encoding performed significantly better. This includes almost all cases where dictionary encoding performed worse than pattern/sequence encoding.
Among the recursive dictionary encoding variants, the algorithm performs significantly better at higher nesting levels.
I think it is safe to conclude that plain dictionary based encoding of chiptune song data (without recursion) offers significant benefits over the traditional pattern/sequence approach. Considering that non-recursively dictionary encoded data can also be decoded easily and quickly, it can definitely be considered a viable alternative.
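To illustrate why decoding stays cheap, here is a rough Python sketch of a decoder for the format described above. The control-value layout (window size in bits 8-14, repeat count in bits 0-7) is an assumption carried over for illustration; a real 1-bit player would of course do this incrementally in assembly, with the recursion here corresponding to the small per-nesting-level state the player must keep.

```python
CTRL_FLAG = 0x8000   # bit 15 marks a control value

def decode(dictionary, seq):
    """Expand an encoded sequence back into pattern rows. A control
    value repeats the following `window` entries `reps` extra times;
    any other entry is a pointer into the row dictionary."""
    rows = []

    def expand(entries):
        i = 0
        while i < len(entries):
            v = entries[i]
            if v & CTRL_FLAG:
                window = (v >> 8) & 0x7f
                reps = v & 0xff
                chunk = entries[i + 1:i + 1 + window]
                for _ in range(reps + 1):   # stored copy + repetitions
                    expand(chunk)           # chunks may nest further blocks
                i += 1 + window
            else:
                rows.append(dictionary[v])
                i += 1

    expand(seq)
    return rows
```

A sequence packed as one control value (window 4, one repetition) plus a trailing pointer expands back to nine rows, matching the original flattened module.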
Whether recursive dictionary encoding is worth it remains debatable. I think that the cost in terms of CPU time will be acceptable, as nested recursions will occur infrequently, so it would be comparable to performing a sequence update on pattern/sequence encoded data. However, it will incur a significant cost in terms of CPU registers and/or temporary memory required. I would argue that it depends on the use case. You might consider it if you are intending to optimize small modules, and have registers to spare.