Hi Bushy, welcome aboard!

Glad to see someone being so enthusiastic about 1-bit sound ;) And great to have you doing all these engine ports. My first successful attempt at 1-bit code was a Huby port as well ;) I'm sure after a while you'll figure out how to write your own engines as well. If you have any questions, feel free to ask, of course.

I'll see if I can dig up some .1tm files from my archive and send them your way. Unfortunately Gmail has a nasty habit of blocking attachments of unknown file types, so if you don't receive a mail from me please let me know here.


(2 replies, posted in General Discussion)

More progress. Got rid of most of the hard-coded compiler parts, so the compiler generator is getting there slowly. (Yes, you read correctly: the new libmdal generates a custom compiler for each target engine, rather than using one big standard compiler with loads of conditionals like in libmdal v1).

Also, I was able to implement a new, much more robust parser using comparse, thanks to its friendly author who gave me an in-depth tutorial last weekend.

Last but not least I did some much needed refactoring, finally breaking up that horrible 3000 lines-of-code core module into various submodules.

Unfortunately in the coming weeks I'll have to work on another, unrelated project, so development will be stalled again until at least the end of the month.


(2 replies, posted in General Discussion)

Great news! The new libmdal compiler produced its first correct output today. Still cheating a bit by hard-coding a few parameters, but nevertheless it's a major step forward. Testing with Huby at the moment, which may not sound all that exciting. However, Huby has some not-so-common quirks so it's good for testing a number of different features (fixed pattern size so MDAL source patterns must be resized, size of order list must be stored in data, which requires multiple compiler passes, and so on).

Still a lot of work to do, but for now I'm very happy :)


(13 replies, posted in Sinclair)

I don't think there is a way to fix it. Even changing the pitch will not make the samples play as intended.
Interrupts and beeper sound don't mix well. And iirc Nirvana's interrupts take up almost the entire frame.


(13 replies, posted in Sinclair)


Since Shiru seems to be busy, I'll try to answer in his place.

I don't think there is any code for the method Shiru describes, because nobody has implemented it. The idea for continuous sound would be not to call the sound generator through an ISR (because 50 Hz is too slow to make anything but the lowest audible tones), but to interleave it with the multicolour code at regular intervals. The usual restrictions with contention apply there, so it would still be quite tricky. Generally, PFM/pin pulse and Zilogat0r's Squeeker method would work best because they both can work with a low sample rate.

For sfx and simple melodies consisting of short blips, doing this on interrupts works fine. Avoid the Basic Beep, as it has a lot of overhead. You can achieve the same result with just three commands:

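  add hl,de        ;advance the 16-bit accumulator by the divider in DE
  ld a,h           ;take the accumulator's high byte
  out (#fe),a      ;write it to port #fe (bit 4 drives the beeper)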

Init DE with the desired frequency divider (for tests, something like #0040 will be fine), and run in a loop for something like 1000 times.
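To see what the three instructions do without firing up an emulator, here is a rough Python model of the loop. It is only a sketch under a few assumptions: HL starts at 0, only the speaker bit (bit 4 of the value written to port #fe) is observed, and timing is ignored entirely. The function name `beeper_toggles` is made up for this illustration.

```python
def beeper_toggles(divider, iterations, bit=4):
    """Model `add hl,de / ld a,h / out (#fe),a` and count speaker toggles."""
    hl = 0                      # assumption: accumulator starts at 0
    toggles = 0
    previous = None
    for _ in range(iterations):
        hl = (hl + divider) & 0xFFFF   # add hl,de (16-bit wrap-around)
        a = hl >> 8                    # ld a,h
        level = (a >> bit) & 1         # out (#fe),a: bit 4 is the speaker
        if previous is not None and level != previous:
            toggles += 1
        previous = level
    return toggles


if __name__ == "__main__":
    # A larger divider makes the speaker bit flip more often, i.e. a higher pitch.
    for de in (0x0040, 0x0080):
        print(hex(de), beeper_toggles(de, 1000))
```

Doubling DE doubles the toggle rate, which is why DE acts as the pitch control here.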

In z88dk, there's also a modified version of the Tritone engine which can be used in combination with game logic. It basically runs all the time and then periodically runs your game code on note changes. If your game logic is simple and doesn't need a lot of CPU time, it might be a good option.

Hahaha this made my day:


I always fail to find an elegant way of commenting a conditional jump. I think this is the perfect way.

Other than that, I can only reiterate what Shiru said: This code is meant to be called through an IM2 interrupt, or some other form of loop.
This appears to be the relevant z88dk documentation: https://github.com/z88dk/z88dk/wiki/interrupts

Released a new XM parser today. It's provided as an extension for the Chicken 4 implementation of the Scheme language. If you have Chicken 4 installed, you can just run "chicken-install xmkit" to use it. The nice thing about Chicken is that it allows you to run code interpreted (for quick prototyping), as well as compile to fairly efficient C code.

The parser handles pretty much everything, including parsing pattern data, performing file integrity checks, and extracting and converting sample data. If you only need to parse patterns then it might be overkill; consider Shiru's Python converter instead.


Dictionary Encoding of Song Data

Since the advent of music trackers, chiptune modules have traditionally been encoded using a pattern/sequence approach. Considering the constraints of 1980s consumer computer hardware, this is undoubtedly a quite optimal approach.

However, in the present day, perhaps the capabilities of modern computers and cross development offer new possibilities for the encoding of chiptune data?

Back in the day, the two core requirements aside from providing a reasonably efficient storage format were:

  • Fast encoding from the editor side with limited memory, as musicians like to listen to their work in progress instantly at the push of a button.

  • Fast decoding, since enough CPU time must be left to perform other tasks. In the realm of 1-bit music, this is especially critical, since on most target machines we cannot perform any tasks while synthesizing the output. That means that reading in music data must be as fast as possible, in order to minimize row transition noise. For this reason, we tend to avoid using multi-track sequences, which are very efficient in terms of storage size, and are fast to encode, but are much slower to decode than a combined sequence for all tracks.

Obviously nothing has changed regarding the second requirement. However, the first requirement no longer stands, when using editors running on a PC. So, how can we use this to our advantage?

A Dictionary Based Encoding Algorithm

I propose the following alternative to the traditional pattern/sequence encoding:

  1. Starting from an uncompressed module with sequence/pattern data, produce a dictionary of all unique pattern rows in the module. The dictionary may consist of at most 0x7fff bytes, otherwise the module is not a viable candidate.

  2. Construct a sequence as follows:

    A. Replace each step in the original sequence with the rows of the corresponding pattern.

    B. Replace the rows with pointers to the corresponding dictionary entries.

  3. Apply recursive compression to the sequence:

    A. Beginning with a window size of 3 and a nesting level of 0, find all consecutive repetitions of the given window in the new sequence and replace them with a control value that holds the window size and number of repetitions. The control value is distinguished by having bit 15 set. A dictionary pointer will never have bit 15 set. (On ZX Spectrum, this would be reversed, as the dictionary will reside in upper memory.)
      Substitution shall not take place inside compressed blocks. So, assuming the following sequence:
      <ctrl window:4 length:4>1 2 3 4 5 6 7 3 4 5 6 7...
      the packer will not replace the second occurrence of 3 4 5 6 7, because the first occurrence is part of the compressed block <ctrl...>1 2 3 4.

    B. If executing step 3A replaced any chunk, increment nesting level by 1.

    C. Increment window size by 1 and repeat from step 3A until reaching a predefined maximum window size, or reaching a predefined maximum nesting level.
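The steps above can be sketched in Python roughly as follows. This is a simplified illustration and not the test code linked further down: the dictionary holds list indices instead of 16-bit pointers, the control value is a tuple instead of a word with bit 15 set, and only a single pass with one fixed window size is performed (no nesting levels). All function names are hypothetical.

```python
def build_dictionary(patterns):
    """Step 1: collect the unique rows of all patterns, in first-seen order."""
    rows, index = [], {}
    for pattern in patterns:
        for row in pattern:
            if row not in index:
                index[row] = len(rows)
                rows.append(row)
    return rows, index


def flatten(order, patterns, index):
    """Step 2: expand the order list to rows, then to dictionary indices."""
    return [index[row] for step in order for row in patterns[step]]


def pack_repeats(seq, window):
    """One simplified pass of the repetition step for a single window size."""
    out, i = [], 0
    while i < len(seq):
        chunk = seq[i:i + window]
        reps = 1
        while (len(chunk) == window
               and seq[i + reps * window:i + (reps + 1) * window] == chunk):
            reps += 1
        if reps > 1:
            out.append(("ctrl", window, reps))   # stand-in for the control value
            out.extend(chunk)
            i += reps * window
        else:
            out.append(seq[i])
            i += 1
    return out


def unpack(packed):
    """Inverse of pack_repeats, to check that no information was lost."""
    out, i = [], 0
    while i < len(packed):
        item = packed[i]
        if isinstance(item, tuple):
            _, window, reps = item
            out.extend(packed[i + 1:i + 1 + window] * reps)
            i += 1 + window
        else:
            out.append(item)
            i += 1
    return out


if __name__ == "__main__":
    patterns = [("a", "b", "c"), ("a", "b", "d")]
    order = [0, 0, 0, 1]
    rows, index = build_dictionary(patterns)
    seq = flatten(order, patterns, index)
    print(rows, seq, pack_repeats(seq, 3))
```

A full implementation would additionally loop over growing window sizes, track the nesting level, and refuse to substitute inside already compressed blocks, as described in step 3.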

Testing the Algorithm

It all may sound good on paper, but some reliable data is needed to prove it actually works.
For this purpose, I ran some tests on a large set of nearly 1800 XM files obtained from ftp.modland.com.

From the XMs, input modules were prepared as follows:

  • Only note triggers are considered, any other data is discarded.

  • From the extracted note data, all empty rows are discarded.

  • Each note row is prepended with a control value, which has bits set depending on which channels have a note trigger. Empty channels are then discarded.

  • The maximum window size is determined as the length of the longest note pattern, divided by 2.
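As a rough illustration of the row preparation described above, here is a small Python sketch. The bit layout (bit n set means channel n has a trigger) and the use of `None` for an empty channel are assumptions made for this example; the actual test code may lay things out differently.

```python
def encode_row(notes):
    """notes: one entry per channel, None where the channel has no trigger."""
    mask = 0
    triggers = []
    for channel, note in enumerate(notes):
        if note is not None:
            mask |= 1 << channel      # assumed layout: bit n = channel n
            triggers.append(note)
    return [mask] + triggers          # control value first, then the triggers


def prepare(rows):
    """Encode all rows and drop those without any note trigger."""
    encoded = [encode_row(row) for row in rows]
    return [row for row in encoded if row[0] != 0]


if __name__ == "__main__":
    rows = [[48, None, 52, None], [None] * 4, [None, 60, None, None]]
    print(prepare(rows))
```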

Some larger modules were included in the test set, just to get a picture of how the algorithm would perform on larger data sets. In reality, any module that produces a dictionary larger than 0x7fff bytes would of course not be usable.

I then ran the dictionary encoding algorithm on the prepared modules, using 3 different settings:

  1. No recursion applied (step 3 skipped)

  2. Recursion with a maximum depth (nesting level) of 1

  3. Recursion with a maximum depth of 4


modules tested: 1793

Successful optimizations
no recursion: 1641
recursion, depth = 1: 1676
recursion, depth = 4: 1701
An optimization is considered a success if the dictionary based algorithm produced a smaller file than the pattern/sequence approach.

Average savings
no recursion: 29.1%
recursion, depth = 1: 31.6%
recursion, depth = 4: 34.1%
Average savings ratios are obtained by comparing output size against the pattern/sequence encoded input data.

Best result:
no recursion: 72.3%
recursion, depth = 1: 91.5%
recursion, depth = 4: 91.7%

Worst result:
no recursion: -110.4%
recursion, depth = 1: -78.2%
recursion, depth = 4: -59.7%

[Figure: results graphic, also available as .svg version]

The results are quite surprising, in my opinion.

  • Even the non-recursing algorithm performs worse than the pattern/sequence approach for only 152 modules (8.5% of all files). Recursing up to a depth of 4 nesting levels further reduces the failures to 92 files (5.1%). I had expected that the dictionary algorithm would lose to pattern/sequence encoding in at least 1/3 of the cases.

  • I had assumed that dictionary encoding performance would correlate with module size, but the test data does not back up this hypothesis. However, when dictionary encoding performs worse than pattern/sequence encoding, it generally happens at smaller module sizes.

  • Recursive dictionary encoding on average offers only marginal benefits over non-recursive dictionary encoding. However, there were a number of cases where recursive encoding performed significantly better. This includes almost all cases where dictionary encoding performed worse than pattern/sequence encoding.

  • Among the recursive dictionary encoding variants, the algorithm performs significantly better at higher nesting levels.


I think it is safe to conclude that plain dictionary based encoding of chiptune song data (without recursion) offers significant benefits over the traditional pattern/sequence approach. Considering that non-recursively dictionary encoded data can also be decoded easily and quickly, it can definitely be considered a viable alternative.

Whether recursive dictionary encoding is worth it remains debatable. I think that the cost in terms of CPU time will be acceptable, as nested recursions will occur infrequently, so it would be comparable to performing a sequence update on pattern/sequence encoded data. However, it will incur a significant cost in terms of CPU registers or temporary memory required. I would argue that it depends on the use case. You might consider it if you are intending to optimize small modules, and have registers to spare.

Test Code
https://gist.github.com/utz82/7c52da950 … 1c2bad5b6b

Excellent write-up. Glad to see it is generating quite some attention, too.


(84 replies, posted in Sinclair)

I finally found a way of consistently reproducing the "Engine not provide any data" bug that I mentioned above.

To reproduce:

1) load the attached .1tm
2) F5 to play, F5 to stop again
3) Move cursor to last column, press 1

I don't know if it is reproducible under Windows, if not then probably some memory is not being initialized to 0.

Some more info: Bug happens regardless of the engine used. As mentioned, it almost always happens when trying to enter something at the beginning of a block. Once it occurs, it is persistent, i.e. no data that would trigger a row play can be entered at this position anymore regardless of any further actions taken.


(11 replies, posted in General Discussion)

Waaaaaah! It may be just a day old, but to me it's already legendary. I would never have believed that it's possible to push monophonic stuff this far.
Also, there goes my hard earned paypal money. Damn you.

Glad to hear that. Could you tell me what exactly you had to do to get it working in the end? Generally TiLP+Win10 seems to be rather wonky, so it might be a good idea to document users' findings in the manual.

Just hard-set everything in the "Change Device" dialogue. So cable = SilverLink, port is #1 (usually), calc is TI-82, of course. I think TiLP doesn't auto-detect the 82. Or do everything via command line
tilp ti82 SilverLink CRASH.82B
which usually gives more useful error messages anyway.

You need to force USB driver installation with Zadig. See the TiLP Windows readme for details. If that doesn't work, then yeah maybe it's easier to use that Linux netbook. Anyway, let me know if you have any success with the Zadig thing, so I can perhaps add a note to the HT manual about this.

Hi, I need some more info. Which transferrer software are you using and what OS are you on?
Also, plug a sound cable into your calc and confirm that you are getting a low humming noise on both stereo channels.


(87 replies, posted in Sinclair)

Some Thoughts on Note Data Encoding

On ZX beeper, note data is commonly encoded based on one of the following principles:

  A. 8-bit frequency dividers or countdown values, stored directly.

  B. 8-bit indices into a lookup table holding 16-bit frequency dividers.

  C. 16-bit frequency dividers, stored directly.

  D. 12-bit frequency dividers, stored directly (as 16 bits).

Method A is efficient in terms of data size, but has well-known limitations of note range and detune. Method B is also size-efficient, but table lookup is inevitably slow, which is a problem for pulse-interleaving engines because of row transition noise. Also it requires an additional register for parsing. Method C is size-inefficient, but allows for fast and efficient parsing. Method D has similar constraints as method C, but is slightly more size-efficient as additional information can be stored in the upper 4 bits.

I have been using method D in several of my later engines, and generally regard it as a good compromise except of course in cases where higher precision is needed. However, the question is if there is a better solution.

First of all, it is safe to say that the most significant bit of a 12/16-bit divider is hardly significant at all. The relevant notes are in a range that not only isn't very useful musically, but also cannot be reproduced accurately by most existing beeper engines.

More importantly though, it should be noted that for higher notes, precision actually becomes less important for a number of reasons. One of which is psychoacoustics: humans are naturally bad at distinguishing high frequencies. The key reason is of a more technical nature, though. Imagine we have determined the frequency divider of an arbitrarily chosen reference note A-9 to be 0x7a23. That means the divider for A-8 is 0x3d11, A-7 gives 0x1e88, and so forth, until we arrive at a divider value of 0x7a for A-1. And here comes the funny part. We know that of course for each octave, frequencies will double. So if A-1 is 0x7a, then A-2 is 0xf4, A-3 is 0x1e8... A-9 is 0x7a00. Wait, what? Didn't we just determine that A-9 is 0x7a23? Well, depends on how you look at it. Musically speaking, 0x7a23 may be the correct value when thinking about a system where A-4 = 440 Hz. However, in our magic little beeper world, 0x7a00 is just as correct, as it perfectly satisfies the requirement that frequencies should double with each octave. Hence we can conclude that for A-9, we actually don't need the precision of the lower 8 bits, so we could just store the higher 8 bits and ignore the lower byte altogether. However, for A-1, we very much do need those lower bits. So the point is that either way, 8 bits of precision (or even 7 bits, as demonstrated in the above example) is sufficient for encoding a large range of note frequencies. We just need to be flexible in what these 8 bits represent. Sounds like a use case for a floating point format to you? Well, it sure does to me.
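The arithmetic in the A-9/A-1 example can be checked with a few lines of Python. This is just a worked version of the numbers above; the conversion of the difference into cents is an addition for illustration and assumes that pitch is proportional to the divider value, as in the example.

```python
import math

A9 = 0x7A23                                    # reference divider from the text

# Halve once per octave on the way down from A-9 to A-1.
chain = [A9 >> octave for octave in range(9)]  # A-9, A-8, ..., A-1

a1 = chain[-1]                                 # divider for A-1
a9_rebuilt = a1 << 8                           # double 8 times on the way back up

# Pitch difference between the "musical" A-9 and the rebuilt one, in cents.
cents = 1200 * math.log2(A9 / a9_rebuilt)

if __name__ == "__main__":
    print([hex(d) for d in chain])
    print(hex(a1), hex(a9_rebuilt), round(cents, 2))
```

The lower byte of the A-9 divider changes the pitch by only about two cents, which is why dropping it is harmless at that end of the range.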

However, the problem is that decoding a real floating point format would be even slower than doing a table lookup. So that's no good. Instead I propose we cheat a little (like so often in 1-bit). We need something that uses 8 bits, is fast to decode, but still gives us some floating point like behaviour. Considering an engine using 12-bit dividers, I propose the following 8-bit note data format:

Bit 7 is the "exponent". If it is reset, then remaining bits are assumed to represent bits 1..7 of the actual 12-bit divider, all other bits being 0. If the "exponent" is set, then the remaining bits are assumed to represent bit 3..9 of the actual divider. As discussed, we don't actually care about the highest bit of the divider, and for practical reasons we will ignore the second most significant bit as well. The choice of what bit 7 represents is of course arbitrary, if it suits your implementation better then there are no drawbacks to inverting the meaning whatsoever.
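To make the bit layout concrete, here is a small Python model of the format as described above. It is a sketch only: `encode` simply truncates the divider, which is an assumption and not necessarily how the attached note table was generated, and both function names are made up for this example.

```python
def decode(note):
    """Expand an 8-bit note value to the divider bits it represents."""
    mantissa = note & 0x7F
    if note & 0x80:               # "exponent" set: mantissa is bits 3..9
        return mantissa << 3
    return mantissa << 1          # "exponent" reset: mantissa is bits 1..7


def encode(divider):
    """Pack a divider below 0x400 into 8 bits, truncating the dropped bits."""
    if not 0 <= divider < 0x400:
        raise ValueError("the two top bits of the 12-bit divider are not stored")
    if divider < 0x100:
        return (divider >> 1) & 0x7F
    return 0x80 | ((divider >> 3) & 0x7F)


if __name__ == "__main__":
    for divider in (0x7A, 0x1E8, 0x3D1):
        note = encode(divider)
        print(hex(divider), hex(note), hex(decode(note)))
```

Small dividers keep their low bits, while large dividers lose bits 0..2, which matches the observation that high notes need less absolute precision.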

Now, the nice thing is that we do not need to fully decode our "el cheapo" floating point format during parsing, as it costs just 8 cycles to decode it just-in-time. What's even better though, doing so saves a precious register!

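        ld a,b              ;fetch the 8-bit accumulator
        add a,c             ;add the note value
        ld b,a              ;store the updated accumulator
        sbc a,a             ;A = #ff if the add overflowed, else 0
        add a,d             ;self-mod: add a,d | xor a,d
        ld d,a              ;update the "extended accumulator"
        out (#fe),a         ;output the result to the beeper port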

Where B is our accumulator, C is our note value that has previously been left-shifted if it signifies bit 1..7, or has bit 7 masked otherwise, and D is the "extended accumulator" that will be used either to accumulate the overflow from the 8-bit add (if note value signifies bit 1..7 of the divider), or will serve as extended bit (if the note value signifies bit 3..9).

Attached to this post, you will find the full source code, an example note table, and a demo. At this point, there are 3 major issues that need to be addressed.

  1. Middle E is detuned. This can probably be rectified by shifting the note table a little, though this method is bound to produce some slight detune around where the "upper" table section starts.

  2. Parsing is quite ugly atm. I'm sure it can be made more efficient, but haven't found a good method yet.

  3. The current way of calculating the output state gets in the way of applying other effects. I believe basic duty control should be possible (via the "phase offset" method), but other things like duty sweeps might be more difficult. My hope here is that instead this opens up the possibility for other tricks that I might not have thought about yet.

Well, that's basically all I wanted to share for now. It's probably possible to do something similar for 16-bit dividers, but most likely it won't work just-in-time. Other than that, I'm very curious to hear your thoughts on this. Is it useful at all? Any cool tricks that we can do with this? Any improvements for the implementation? Please let me know.

Yep, good stuff! The "graphical sound" technique was actually pretty widespread back in the day.
Here's another example from SU: https://www.youtube.com/watch?v=Z7Zb4rso82M
Another famous name is Norman McLaren.
Also, Daphne Oram: https://en.wikipedia.org/wiki/Oramics
Also the Solaris soundtrack was made using a similar technique.


(0 replies, posted in Sinclair)

As of today, individual downloads for my Speccy engines are no longer available from the github repo. You can now grab all the converters etc. as a single download on the Releases page.

The reason for this change is that I intend to use the github repo as a submodule for MDAL/Bintracker at some point, and having a large number of binary files in the repo is somewhat counter-productive in that respect. My apologies for any inconvenience caused.


(76 replies, posted in Sinclair)

Yeah, 2 downvotes already! I'm famous woohooo :D


(76 replies, posted in Sinclair)

Thanks, Shiru. I'm not terribly proud of this one, actually. Except for the ghost notes in the section starting at 0:45, really like how those turned out. Think your track is great, too, classic Shiru style. Wanted to send another track, but same problem: not enough time to finish it.

Of course I've already told you a few times that this is a fantastic album but... Awesome album is awesome. It not only shows how far we've come in terms of beeper sound, but even more importantly it shows how far you've come as a composer since (the already outstanding) 1-bit Mechanistic. Well done, mate.


(3 replies, posted in Sinclair)

Thanks you guys! After playing with this engine some more I think it needs more polish, but I'm still glad it works at all. And yes, it's surprisingly cheap even with realtime low pass, but of course banked buffers open up even more possibilities, like emulating attack transient variations (StringKS can do it but there are no built-in buffers for it so they currently need to be prepared by the user). I'm also wondering if this could be a candidate for Jan Deak style "ahead-of-time" buffer generation. In general I have a strong feeling that his method might be useful for something other than just generating loads of pulse trains, but still searching...


(87 replies, posted in Sinclair)

Seems that biggest mistake in many of my newer engines is not to mask bit 3. Worked wonders for the KS thing in any case.

I tried some simple 1-bit mixing on Gameboy a while back, so I can confirm it generally works. On the other hand I currently don't have a machine with AY so I have less of a motivation to try it. Maybe when I get my Next hur hur... Tried to make a combined AY+Beeper engine but it turns out that volume difference is huge and also varies a lot between models, so I haven't investigated further into this direction either.

Currently I'm looking a lot into data encoding. I have a sort of el cheapo floating point format now which allows me to encode 12-bit frequency dividers in 8 bits. Costs 8 additional cycles in sound loop, but saves a register. Something similar should be doable for 16-bit dividers. Also I'm experimenting with a new song data format that isn't based on a pattern/sequence structure but rather on a dictionary based approach. Parsing such data has a small overhead compared to seq/ptn data (needs an additional register pair for decoding and is slightly slower than just popping values from stack), but first tests are promising: on average 10-30% smaller data than the traditional approach. However, I need to test it with more data to be sure. Thinking about grabbing a large set of files from modarchive and building test material from that. Anyway, will of course post more on that once it's progressed a bit further.


(76 replies, posted in Sinclair)

Yes! Thanks for the reminder, Vinnny. I'll try to make something but can't promise.


(3 replies, posted in Sinclair)

Wow, apparently it's been almost a year since I published a new beeper engine. So it was about time for some good ol' t-state squeezing.

StringKS is an experimental engine that implements Karplus-Strong inspired string synthesis. It's more a proof-of-concept than an actually useful engine (hence no converter is provided), however it does prove that physical modelling is possible in 1-bit, and I think it's worth exploring this concept further.

Synthesis is done by creating an initial ring buffer from various sources (at the moment, ROM noise, rectangle wave with variable duty, and saw wave are supported), and then continually running a simple low-pass filter over the buffer. The size of the buffer determines the pitch. It is also possible to source from user-created data (so theoretically one can start from a pre-filtered buffer to create softer attack transients). Additionally, I threw in PWM sample playback on one of the channels, and regular rectangle wave playback (also on one channel only). There's also a (rather brutal) overdrive mode. All synth methods except the saw wave one support a somewhat crude 3-bit volume control.
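For readers unfamiliar with the technique, the buffer-plus-filter idea can be sketched in Python as below. Note that this is the textbook floating-point Karplus-Strong loop with a two-point averaging filter, not the actual 1-bit Z80 implementation in StringKS; the function names, the `decay` parameter and the noise source are assumptions for illustration only.

```python
import random


def noise_buffer(length, seed=0):
    """Initial excitation: white noise (a stand-in for the ROM noise source)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


def karplus_strong(buffer, n_samples, decay=1.0):
    """Cycle through a ring buffer, low-pass filtering it in place."""
    buf = list(buffer)
    out = []
    pos = 0
    for _ in range(n_samples):
        nxt = (pos + 1) % len(buf)
        out.append(buf[pos])
        # Simple low-pass: replace the sample with the average of itself
        # and its successor, optionally scaled to shorten the decay.
        buf[pos] = decay * 0.5 * (buf[pos] + buf[nxt])
        pos = nxt
    return out


if __name__ == "__main__":
    samples = karplus_strong(noise_buffer(64), 64 * 100)
    first = max(abs(s) for s in samples[:64])
    last = max(abs(s) for s in samples[-64:])
    print(first, last)   # the peak level drops as the string "rings out"
```

The buffer length sets the period of the output, so the pitch is roughly the sample rate divided by the buffer length, and a pre-filtered initial buffer gives a softer attack.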

source code
An extremely uninspired demo tune is attached.

Known limitations:

- 8-bit frequency counters only, so the available note range is rather limited.
- At higher notes, tones will fade out very quickly.

I believe it's possible to rectify these issues, but more research is needed. One possible approach I experimented with was to generate data on the fly with the usual add-and-compare method while keeping track of the low-pass cutoff. It works but so far sound quality is worse than with the buffered approach. Another way might be to pre-scale the speed of buffer iteration to reach lower frequencies (e.g. update buffer pointer only every other sound loop iteration), and to slow down decay by only running the filter on every other buffer iteration. Still need to find some free t-states for that, though. Perhaps splitting updates so only one channel gets updated per sound loop iteration might be doable. Well, I'm open to ideas, of course ;)