Released a new XM parser today. It's provided as an extension for the Chicken 4 implementation of the Scheme language. If you have Chicken 4 installed, you can just run "chicken-install xmkit" to use it. The nice thing about Chicken is that it allows you to run code interpreted (for quick prototyping), as well as compile to fairly efficient C code.

The parser handles pretty much everything, including parsing pattern data, performing file integrity checks, and extracting and converting sample data. If you only need to parse patterns, it might be overkill; consider Shiru's Python converter instead.

source
documentation

Dictionary Encoding of Song Data

Since the advent of music trackers, chiptune modules have traditionally been encoded using a pattern/sequence approach. Considering the constraints of 1980s consumer computer hardware, this approach is close to optimal.

However, in the present day, perhaps the capabilities of modern computers and cross development offer new possibilities for the encoding of chiptune data?

Back in the day, the two core requirements aside from providing a reasonably efficient storage format were:

  • Fast encoding from the editor side with limited memory, as musicians like to listen to their work in progress instantly at the push of a button.

  • Fast decoding, since enough CPU time must be left to perform other tasks. In the realm of 1-bit music, this is especially critical, since on most target machines we cannot perform any tasks while synthesizing the output. That means that reading in music data must be as fast as possible, in order to minimize row transition noise. For this reason, we tend to avoid using multi-track sequences, which are very efficient in terms of storage size, and are fast to encode, but are much slower to decode than a combined sequence for all tracks.

Obviously nothing has changed regarding the second requirement. However, the first requirement no longer stands, when using editors running on a PC. So, how can we use this to our advantage?


A Dictionary Based Encoding Algorithm

I propose the following alternative to the traditional pattern/sequence encoding:

  1. Starting from an uncompressed module with sequence/pattern data, produce a dictionary of all unique pattern rows in the module. The dictionary may consist of at most 0x7fff bytes; otherwise, the module is not a viable candidate.

  2. Construct a sequence as follows:

    1. Replace each step in the original sequence with the rows of the corresponding pattern.

    2. Replace the rows with pointers to the corresponding dictionary entries.

  3. Compress the new sequence as follows:

    A. Beginning with a window size of 3 and a nesting level of 0, find all consecutive repetitions of the given window in the new sequence and replace them with a control value that holds the window size and the number of repetitions. The control value is distinguished by having bit 15 set; a dictionary pointer will never have bit 15 set. (On ZX Spectrum, this would be reversed, as the dictionary will reside in upper memory.)
      Substitution shall not take place inside compressed blocks. So, assuming the following sequence:
      <ctrl window:4 length:4>1 2 3 4 5 6 7 3 4 5 6 7...
      the packer will not replace the second occurrence of 3 4 5 6 7, because the first occurrence is part of the compressed block <ctrl...>1 2 3 4.

    B. If executing step 3A replaced any chunk, increment the nesting level by 1.

    C. Increment the window size by 1 and repeat from step 3A until reaching a predefined maximum window size, or a predefined maximum nesting level.
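For illustration, here is a rough Python sketch of the packing loop (steps 3A-3C). This is my own reconstruction, not the actual test code: compressed blocks are modelled as atomic ("rep", count, window) tuples rather than bit-15 control words, which enforces the "no substitution inside compressed blocks" rule by construction, and I read "no recursion" as a maximum nesting level of 0.

```python
def pack_pass(seq, w):
    """One pass of step 3A: collapse consecutive repetitions of any
    w-element window into a single ("rep", count, window) block."""
    out, i, changed = [], 0, False
    while i < len(seq):
        if i + 2 * w <= len(seq):
            reps = 1
            # count how many times the window at i repeats back-to-back
            while (i + (reps + 1) * w <= len(seq)
                   and seq[i + reps * w : i + (reps + 1) * w] == seq[i : i + w]):
                reps += 1
            if reps > 1:
                out.append(("rep", reps, seq[i : i + w]))
                i += reps * w
                changed = True
                continue
        out.append(seq[i])
        i += 1
    return out, changed

def pack(seq, max_window, max_depth):
    """Steps 3A-3C: grow the window size, tracking the nesting level."""
    seq, level = list(seq), 0
    for w in range(3, max_window + 1):
        if level >= max_depth:   # maximum nesting level reached
            break
        seq, changed = pack_pass(seq, w)
        if changed:              # step 3B
            level += 1
    return seq

# three consecutive repetitions of the window 1 2 3 collapse into one block
packed = pack([1, 2, 3, 1, 2, 3, 1, 2, 3, 9], max_window=4, max_depth=1)
```

A real encoder would of course emit the control word (bit 15 set, window size and repeat count packed into the remaining bits) followed by one copy of the window, as described in step 3A.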


Testing the Algorithm

It all may sound good on paper, but some reliable data is needed to prove it actually works.
For this purpose, I ran some tests on a large set of nearly 1800 XM files obtained from ftp.modland.com.

From the XMs, input modules were prepared as follows:

  • Only note triggers are considered; any other data is discarded.

  • From the extracted note data, all empty rows are discarded.

  • Each note row is prepended with a control value, which has bits set depending on which channels have a note trigger. Empty channels are then discarded.

  • The maximum window size is determined as the length of the longest note pattern, divided by 2.
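The row preparation can be sketched in a few lines of Python. This is my own illustration; the bit order of the channel mask (bit n set if channel n carries a trigger) is an assumption:

```python
def encode_row(row):
    """Pack one note row: a channel-mask control value followed by the
    note triggers of the non-empty channels. 0 marks an empty channel."""
    # bit n of the mask is set if channel n has a note trigger (assumed order)
    mask = sum(1 << ch for ch, note in enumerate(row) if note)
    return [mask] + [note for note in row if note]

# channels 0 and 2 trigger notes 49 and 37; channels 1 and 3 are empty
row = encode_row([49, 0, 37, 0])
# → [0b0101, 49, 37]
```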

Some larger modules were included in the test set, just to get a picture of how the algorithm would perform on larger data sets. In reality, any module that produces a dictionary larger than 0x7fff bytes would of course not be usable.

I then ran the dictionary encoding algorithm on the prepared modules, using 3 different settings:

  1. No recursion applied (step 3 skipped)

  2. Recursion with a maximum depth (nesting level) of 1

  3. Recursion with a maximum depth of 4


Results

modules tested: 1793

Successes
no recursion: 1641
recursion depth = 1: 1676
recursion depth = 4: 1701
An optimization is considered a success if the dictionary based algorithm produced a smaller file than the pattern/sequence approach.

Average savings
no recursion: 29.1%
recursion, depth = 1: 31.6%
recursion, depth = 4: 34.1%
Average savings ratios are obtained by comparing output size against the pattern/sequence encoded input data.

Best result:
no recursion: 72.3%
recursion, depth = 1: 91.5%
recursion, depth = 4: 91.7%

Worst result:
no recursion: -110.4%
recursion, depth = 1: -78.2%
recursion, depth = 4: -59.7%

results graphic
.svg version

The results are quite surprising, in my opinion.

  • Even with the non-recursing algorithm, only 152 modules (8.5% of all files) could not be compressed better than with the pattern/sequence approach. Recursing up to a depth of 4 nesting levels further reduces the failures to 92 files (5.1%). I had expected that the dictionary algorithm would lose to pattern/sequence encoding in at least 1/3 of the cases.

  • I had assumed that dictionary encoding performance would correlate with module size, but the test data does not back up this hypothesis. However, when dictionary encoding performs worse than pattern/sequence encoding, it generally happens at smaller module sizes.

  • Recursive dictionary encoding on average offers only marginal benefits over non-recursive dictionary encoding. However, there were a number of cases where recursive encoding performed significantly better. This includes almost all cases where dictionary encoding performed worse than pattern/sequence encoding.

  • Comparing the recursive dictionary encoding runs among themselves, the algorithm performs significantly better at higher nesting levels.


Conclusions

I think it is safe to conclude that plain dictionary based encoding of chiptune song data (without recursion) offers significant benefits over the traditional pattern/sequence approach. Considering that non-recursively dictionary encoded data can also be decoded easily and quickly, it can definitely be considered a viable alternative.

Whether recursive dictionary encoding is worth it remains debatable. I think that the cost in terms of CPU time will be acceptable, as nested recursions will occur infrequently, so it would be comparable to performing a sequence update on pattern/sequence encoded data. However, it will incur a significant cost in terms of CPU registers or temporary memory required. I would argue that it depends on the use case. You might consider it if you are intending to optimize small modules, and have registers to spare.


Test Code
https://gist.github.com/utz82/7c52da950 … 1c2bad5b6b

328

(11 replies, posted in General Discussion)

Excellent write-up. Glad to see it is generating quite some attention, too.

329

(164 replies, posted in Sinclair)

I finally found a way of consistently reproducing the "Engine not provide any data" bug that I mentioned above.

To reproduce:

1) load the attached .1tm
2) F5 to play, F5 to stop again
3) Move cursor to last column, press 1

I don't know if it is reproducible under Windows; if not, then probably some memory is not being initialized to 0.

Some more info: the bug happens regardless of the engine used. As mentioned, it almost always happens when trying to enter something at the beginning of a block. Once it occurs, it is persistent, i.e. no data that would trigger a row play can be entered at this position anymore, regardless of any further actions taken.

330

(11 replies, posted in General Discussion)

Waaaaaah! It may be just a day old, but to me it's already legendary. I would never have believed that it's possible to push monophonic stuff this far.
Also, there goes my hard earned paypal money. Damn you.

Glad to hear that. Could you tell me what exactly you had to do to get it working in the end? Generally TiLP+Win10 seems to be rather wonky, so it might be a good idea to document users' findings in the manual.

Just hard-set everything in the "Change Device" dialogue. So cable = SilverLink, port is #1 (usually), calc is TI-82, of course. I think TiLP doesn't auto-detect the 82. Or do everything via the command line:
tilp ti82 SilverLink CRASH.82B
which usually gives more useful error messages anyway.

You need to force USB driver installation with Zadig. See the TiLP Windows readme for details. If that doesn't work, then yeah maybe it's easier to use that Linux netbook. Anyway, let me know if you have any success with the Zadig thing, so I can perhaps add a note to the HT manual about this.

Hi, I need some more info. Which transfer software are you using, and what OS are you on?
Also, plug a sound cable into your calc and confirm that you are getting a low humming noise on both stereo channels.

335

(135 replies, posted in Sinclair)

Some Thoughts on Note Data Encoding

On ZX beeper, note data is commonly encoded based on one of the following principles:

  1. 8-bit frequency dividers or countdown values, stored directly.

  2. 8-bit indices into a lookup table holding 16-bit frequency dividers.

  3. 16-bit frequency dividers, stored directly.

  4. 12-bit frequency dividers, stored directly (as 16-bits).

Method A is efficient in terms of data size, but has well-known limitations regarding note range and detune. Method B is also size-efficient, but table lookup is inevitably slow, which is a problem for pulse-interleaving engines because of row transition noise; it also requires an additional register for parsing. Method C is size-inefficient, but allows for fast and efficient parsing. Method D has similar constraints to method C, but is slightly more size-efficient, as additional information can be stored in the upper 4 bits.

I have been using method D in several of my later engines, and generally regard it as a good compromise, except of course in cases where higher precision is needed. However, the question is whether there is a better solution.

First of all, it is safe to say that the most significant bit of a 12/16-bit divider is hardly significant at all. The notes it affects are in a range that not only isn't very useful musically, but also cannot be reproduced accurately by most existing beeper engines.

More importantly though, it should be noted that for higher notes, precision actually becomes less important, for a number of reasons. One of them is psychoacoustics: humans are naturally bad at distinguishing high frequencies. The key reason is of a more technical nature, though.

Imagine we have determined the frequency divider of an arbitrarily chosen reference note A-9 to be 0x7a23. That means the divider for A-8 is 0x3d11, A-7 gives 0x1e88, and so forth, until we arrive at a divider value of 0x7a for A-1. And here comes the funny part. We know that for each octave, frequencies will double. So if A-1 is 0x7a, then A-2 is 0xf4, A-3 is 0x1e8... A-9 is 0x7a00. Wait, what? Didn't we just determine that A-9 is 0x7a23? Well, that depends on how you look at it. Musically speaking, 0x7a23 may be the correct value in a system where A-4 = 440 Hz. However, in our magic little beeper world, 0x7a00 is just as correct, as it perfectly satisfies the requirement that frequencies should double with each octave.

Hence we can conclude that for A-9, we don't actually need the precision of the lower 8 bits, so we could just store the higher 8 bits and ignore the lower byte altogether. However, for A-1, we very much do need those lower bits. So the point is that either way, 8 bits of precision (or even 7 bits, as demonstrated in the above example) is sufficient for encoding a large range of note frequencies. We just need to be flexible in what these 8 bits represent. Sounds like a use case for a floating point format to you? Well, it sure does to me.
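The arithmetic above is easy to verify (Python, using the divider values from the example):

```python
# divider for the arbitrarily chosen reference note A-9
a9 = 0x7a23

# halving eight times (eight octaves down to A-1) keeps only the high byte
a1 = a9 >> 8
assert a1 == 0x7a

# doubling eight times back up never restores the low byte
assert a1 << 8 == 0x7a00
```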

However, the problem is that decoding a real floating point format would be even slower than doing a table lookup. So that's no good. Instead I propose we cheat a little (like so often in 1-bit). We need something that uses 8 bits, is fast to decode, but still gives us some floating point like behaviour. Considering an engine using 12-bit dividers, I propose the following 8-bit note data format:

Bit 7 is the "exponent". If it is reset, then the remaining bits are assumed to represent bits 1..7 of the actual 12-bit divider, with all other bits being 0. If the "exponent" is set, then the remaining bits are assumed to represent bits 3..9 of the actual divider. As discussed, we don't actually care about the highest bit of the divider, and for practical reasons we will ignore the second most significant bit as well. The choice of what bit 7 represents is of course arbitrary; if it suits your implementation better, there are no drawbacks whatsoever to inverting the meaning.
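In Python terms, the format amounts to the following. This is my own sketch based on the description above; the 0x100 threshold for choosing the "exponent" is an assumption:

```python
def note_encode(divider):
    """Encode a 12-bit frequency divider as an 8-bit note value."""
    if divider < 0x100:
        return (divider >> 1) & 0x7f        # exponent 0: store bits 1..7
    return 0x80 | ((divider >> 3) & 0x7f)   # exponent 1: store bits 3..9

def note_decode(byte):
    """Recover the (approximate) 12-bit divider."""
    if byte & 0x80:
        return (byte & 0x7f) << 3
    return (byte & 0x7f) << 1

assert note_decode(note_encode(0x7a)) == 0x7a    # low divider: exact
assert note_decode(note_encode(0x3d3)) == 0x3d0  # high divider: bits 0..2 drop out
```

Note that the second branch silently drops bits 10 and 11 of the divider, in line with the observation that the topmost divider bits are hardly significant.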

Now, the nice thing is that we do not need to fully decode our "el cheapo" floating point format during parsing, as it costs just 8 cycles to decode it just-in-time. What's even better, doing so saves a precious register!

        ld a,b              ;fetch frequency accumulator
        add a,c             ;add prepared note value
        ld b,a              ;store accumulator
        sbc a,a             ;A = #ff if the add carried, else 0
fp_switch
        add a,d             ;self-mod: add a,d | xor a,d
        ld d,a              ;update extended accumulator
        out (#fe),a         ;output to the beeper (ULA port #fe)

Here, B is our accumulator, C is our note value (previously left-shifted if it signifies bits 1..7, or with bit 7 masked off otherwise), and D is the "extended accumulator", which will either accumulate the overflow from the 8-bit add (if the note value signifies bits 1..7 of the divider), or serve as the extended bit (if the note value signifies bits 3..9).

Attached to this post, you will find the full source code, an example note table, and a demo. At this point, there are 3 major issues that need to be addressed.

  1. Middle E is detuned. This can probably be rectified by shifting the note table a little, though this method is bound to produce some slight detune around the point where the "upper" table section starts.

  2. Parsing is quite ugly atm. I'm sure it can be made more efficient, but haven't found a good method yet.

  3. The current way of calculating the output state gets in the way of applying other effects. I believe basic duty control should be possible (via the "phase offset" method), but other things like duty sweeps might be more difficult. My hope here is that instead this opens up the possibility for other tricks that I might not have thought about yet.

Well, that's basically all I wanted to share for now. It's probably possible to do something similar for 16-bit dividers, but most likely it won't work just-in-time. Other than that, I'm very curious to hear your thoughts on this. Is it useful at all? Any cool tricks that we can do with this? Any improvements for the implementation? Please let me know.

Yep, good stuff! The "graphical sound" technique was actually pretty widespread back in the day.
Here's another example from SU: https://www.youtube.com/watch?v=Z7Zb4rso82M
Another famous name is Norman McLaren.
https://www.youtube.com/watch?v=Q0vgZv_JWfM
https://www.youtube.com/watch?v=q3YeWgUgPHM
Also, Daphne Oram: https://en.wikipedia.org/wiki/Oramics
Also the Solaris soundtrack was made using a similar technique.

337

(0 replies, posted in Sinclair)

As of today, individual downloads for my Speccy engines are no longer available from the github repo. You can now grab all the converters etc. as a single download on the Releases page.

The reason for this change is that I intend to use the github repo as a submodule for MDAL/Bintracker at some point, and having a large number of binary files in the repo is somewhat counter-productive in that respect. My apologies for any inconvenience caused.

338

(130 replies, posted in Sinclair)

Yeah, 2 downvotes already! I'm famous woohooo big_smile

339

(130 replies, posted in Sinclair)

Thanks, Shiru. I'm not terribly proud of this one, actually. Except for the ghost notes in the section starting at 0:45, that is; I really like how those turned out. I think your track is great, too, classic Shiru style. I wanted to send another track, but same problem: not enough time to finish it.

Of course I've already told you a few times that this is a fantastic album but... Awesome album is awesome. It not only shows how far we've come in terms of beeper sound, but even more importantly it shows how far you've come as a composer since (the already outstanding) 1-bit Mechanistic. Well done, mate.

341

(3 replies, posted in Sinclair)

Thank you guys! After playing with this engine some more I think it needs more polish, but I'm still glad it works at all. And yes, it's surprisingly cheap even with realtime low pass, but of course banked buffers open up even more possibilities, like emulating attack transient variations (StringKS can do it, but there are no built-in buffers for it, so they currently need to be prepared by the user). I'm also wondering if this could be a candidate for Jan Deak style "ahead-of-time" buffer generation. In general I have a strong feeling that his method might be useful for something other than just generating loads of pulse trains, but still searching...

342

(135 replies, posted in Sinclair)

Seems that the biggest mistake in many of my newer engines is not masking bit 3. Worked wonders for the KS thing in any case.

I tried some simple 1-bit mixing on Gameboy a while back, so I can confirm it generally works. On the other hand I currently don't have a machine with AY, so I have less motivation to try it. Maybe when I get my Next hur hur... Tried to make a combined AY+Beeper engine, but it turns out that the volume difference is huge and also varies a lot between models, so I haven't investigated further in this direction either.

Currently I'm looking a lot into data encoding. I have a sort of el cheapo floating point format now which allows me to encode 12-bit frequency dividers in 8 bits. Costs 8 additional cycles in the sound loop, but saves a register. Something similar should be doable for 16-bit dividers. Also, I'm experimenting with a new song data format that isn't based on a pattern/sequence structure but rather on a dictionary based approach. Parsing such data has a small overhead compared to seq/ptn data (it needs an additional register pair for decoding and is slightly slower than just popping values from the stack), but first tests are promising: on average 10-30% smaller data than the traditional approach. However, I need to test it with more data to be sure. Thinking about grabbing a large set of files from modarchive and building test material from that. Anyway, will of course post more on that once it's progressed a bit further.

343

(130 replies, posted in Sinclair)

Yes! Thanks for the reminder, Vinnny. I'll try to make something but can't promise.

344

(3 replies, posted in Sinclair)

Wow, apparently it's been almost a year since I published a new beeper engine. So it was about time for some good ol' t-state squeezing.

StringKS is an experimental engine that implements Karplus-Strong inspired string synthesis. It's more a proof-of-concept than an actually useful engine (hence no converter is provided), however it does prove that physical modelling is possible in 1-bit, and I think it's worth exploring this concept further.

Synthesis is done by creating an initial ring buffer from various sources (at the moment, ROM noise, rectangle wave with variable duty, and saw wave are supported), and then continually running a simple low-pass filter over the buffer. The size of the buffer determines the pitch. It is also possible to source from user-created data (so theoretically one can start from a pre-filtered buffer to create softer attack transients). Additionally, I threw in PWM sample playback on one of the channels, and regular rectangle wave playback (also on one channel only). There's also a (rather brutal) overdrive mode. All synth methods except the saw wave one support a somewhat crude 3-bit volume control.

source code
An extremely uninspired demo tune is attached.

Limitations:
- 8-bit frequency counters only, so the available note range is rather limited.
- At higher notes, tones will fade out very quickly.

I believe it's possible to rectify these issues, but more research is needed. One possible approach I experimented with was to generate data on the fly with the usual add-and-compare method while keeping track of the low-pass cutoff. It works but so far sound quality is worse than with the buffered approach. Another way might be to pre-scale the speed of buffer iteration to reach lower frequencies (e.g. update buffer pointer only every other sound loop iteration), and to slow down decay by only running the filter on every other buffer iteration. Still need to find some free t-states for that, though. Perhaps splitting updates so only one channel gets updated per sound loop iteration might be doable. Well, I'm open to ideas, of course wink

Thumbs up wink
A small note about Huby, using Beepola/1tracker is strictly optional for transcriptions. You can also generate music data directly from the imported MIDI data in OpenMPT by using the Huby XM converter. However, the best option is always to ask a chiptune musician to do a cover for you, rather than converting from MIDI data wink

Thx wink Yes, sound and video examples were not recorded for copyright reasons, but you can find all the examples over at the 1-bit timeline post (which I've been updating quite a bit lately).

347

(1 replies, posted in Sinclair)

Ha, neat!

I think anteater has some flaws (namely incorrect tuning due to insufficient frequency divider size). On the other hand, looking back it was probably not that bad for a first (well, probably second or third) attempt big_smile

348

(9 replies, posted in Sinclair)

As much as I'd advocate trying some of the newer beeper engines, the most convenient for you would be to use Tritone, as it's natively supported in z88dk.

Also, don't worry too much about size. Once you use compression (and only unpack into a buffer when you need it), data size doesn't really matter that much. I think z88dk has the zx7 compressor built in, which works fairly well on music data.

349

(4 replies, posted in Sinclair)

I don't know what would be the equivalent in SDCC syntax, might be as simple as

oldSP = $+1

Generally, what this pseudo-op construct means is: "assign a label (oldSP) to the current address ($) + 1". So in this example, oldSP will point to the address immediately after the "ld sp,nnnn" instruction, which happens to be the "nnnn" part. There is some other part in the code (usually in the init part) that writes the value of SP at that point in time to the location of oldSP. The snippet "ld sp,old_value" is then usually called on exit, to restore SP to its proper value, as many beeper engines mess with the stack a lot.

Anyway, the point is that if you don't have any form of "label equ $+x" available, you can also simply locate all points in the code that write to 'label' (in case of the SP backup there'll usually just be one such location). That write op will look something like

  ld (oldSP),sp

which you can change to

  ld (oldSP + 1),sp

and then change the "oldSP equ $+1" to simply "oldSP".

Did a talk about music on mainframes 1949-1965 at the Vintage Computing Festival Berlin last weekend. Nothing new if you've been keeping an eye on the 1-bit timeline post, but anyway, here you go:

https://media.ccc.de/v/vcfb18_-_90_-_en … ames_-_utz