Some Thoughts on Note Data Encoding
On ZX beeper, note data is commonly encoded based on one of the following principles:
A) 8-bit frequency dividers or countdown values, stored directly.
B) 8-bit indices into a lookup table holding 16-bit frequency dividers.
C) 16-bit frequency dividers, stored directly.
D) 12-bit frequency dividers, stored directly (as 16 bits).
Method A is efficient in terms of data size, but has well-known limitations of note range and detune. Method B is also size-efficient, but table lookup is inevitably slow, which is a problem for pulse-interleaving engines because of row transition noise. It also requires an additional register for parsing. Method C is size-inefficient, but allows for fast and efficient parsing. Method D has similar constraints to method C, but is slightly more size-efficient, as additional information can be stored in the upper 4 bits.
I have been using method D in several of my later engines, and generally regard it as a good compromise except of course in cases where higher precision is needed. However, the question is if there is a better solution.
First of all, it is safe to say that the most significant bit of a 12/16-bit divider is hardly significant at all. The relevant notes are in a range that not only isn't very useful musically, but also cannot be reproduced accurately by most existing beeper engines.
More importantly though, it should be noted that for higher notes, precision actually becomes less important, for a number of reasons. One of them is psychoacoustics: humans are naturally bad at distinguishing high frequencies. The key reason is of a more technical nature, though.
Imagine we have determined the frequency divider of an arbitrarily chosen reference note A-9 to be 0x7a23. That means the divider for A-8 is 0x3d11, A-7 gives 0x1e88, and so forth, until we arrive at a divider value of 0x7a for A-1. And here comes the funny part. We know that of course for each octave, frequencies will double, and so will our divider values. So if A-1 is 0x7a, then A-2 is 0xf4, A-3 is 0x1e8... A-9 is 0x7a00. Wait, what? Didn't we just determine that A-9 is 0x7a23? Well, depends on how you look at it. Musically speaking, 0x7a23 may be the correct value when thinking about a system where A-4 = 440 Hz. However, in our magic little beeper world, 0x7a00 is just as correct, as it perfectly satisfies the requirement that frequencies should double with each octave.
Hence we can conclude that for A-9, we actually don't need the precision of the lower 8 bits, so we could just store the higher 8 bits and ignore the lower byte altogether. However, for A-1, we very much do need those lower bits. So the point is that either way, 8 bits of precision (or even 7 bits, as demonstrated in the above example) is sufficient for encoding a large range of note frequencies. We just need to be flexible in what these 8 bits represent. Sounds like a use case for a floating point format to you? Well, it sure does to me.
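
To make the octave arithmetic above easy to check, here is a small Python sketch (a model for illustration only, not part of the engine). It assumes dividers are halved by a plain right shift, which matches the values quoted above; the names `REFERENCE`, `halve_down` and `double_up` are made up for this example.

```python
# Model of the octave arithmetic: halve the reference divider down to A-1,
# then double it back up to A-9 and compare with the original value.
REFERENCE = 0x7A23  # divider for the reference note A-9 (value from the text)


def halve_down(divider, octaves):
    """Dividers for the reference note and each octave below it (truncating)."""
    return [divider >> n for n in range(octaves + 1)]


def double_up(divider, octaves):
    """Dividers for the lowest note and each octave above it."""
    return [divider << n for n in range(octaves + 1)]


down = halve_down(REFERENCE, 8)   # A-9, A-8, ... A-1
up = double_up(down[-1], 8)       # A-1, A-2, ... A-9

# Relative difference between the "musical" A-9 and the doubled-up A-9.
relative_error = (REFERENCE - up[-1]) / REFERENCE

if __name__ == "__main__":
    print("down:", [hex(d) for d in down])
    print("up:  ", [hex(d) for d in up])
    print("relative error at A-9: {:.2%}".format(relative_error))
```

In this model the doubled-up A-9 differs from 0x7a23 by about 0.11 %, while every octave step stays an exact factor of two.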
However, the problem is that decoding a real floating point format would be even slower than doing a table lookup. So that's no good. Instead I propose we cheat a little (like so often in 1-bit). We need something that uses 8 bits, is fast to decode, but still gives us some floating point like behaviour. Considering an engine using 12-bit dividers, I propose the following 8-bit note data format:
Bit 7 is the "exponent". If it is reset, then the remaining bits are assumed to represent bits 1..7 of the actual 12-bit divider, all other bits being 0. If the "exponent" is set, then the remaining bits are assumed to represent bits 3..9 of the actual divider. As discussed, we don't actually care about the highest bit of the divider, and for practical reasons we will ignore the second most significant bit as well. The choice of what bit 7 represents is of course arbitrary. If it suits your implementation better, there are no drawbacks to inverting the meaning whatsoever.
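
As a rough illustration of the format just described, the following Python sketch expands a note byte into the 12-bit divider it stands for, and packs a divider back into a byte. This is only a model of the bit layout: the engine itself never decodes the value this way, and the helper names `decode` and `encode` (as well as the choice to truncate rather than round in `encode`) are assumptions made for this example.

```python
def decode(note):
    """Expand an 8-bit note value into the 12-bit divider it represents."""
    mantissa = note & 0x7F
    if note & 0x80:
        return mantissa << 3   # "exponent" set: bits 3..9 of the divider
    return mantissa << 1       # "exponent" reset: bits 1..7 of the divider


def encode(divider):
    """Pack a divider into the 8-bit format, dropping bits that do not fit.

    Hypothetical helper: small dividers keep bits 1..7, larger ones keep
    bits 3..9. Bits 10 and 11 are not representable in this format.
    """
    if not 0 <= divider < 0x400:
        raise ValueError("divider needs bits 10/11, which the format ignores")
    if divider < 0x100:
        return divider >> 1
    return 0x80 | (divider >> 3)


if __name__ == "__main__":
    for divider in (0x7A, 0xF4, 0x1E8, 0x3D0):
        note = encode(divider)
        print(hex(divider), "->", hex(note), "->", hex(decode(note)))
```

The four example dividers are the A-1 to A-4 values from the earlier octave example, and all of them survive the round trip unchanged.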
Now, the nice thing is that we do not need to fully decode our "el cheapo" floating point format during parsing, as it costs just 8 cycles to decode it just-in-time. What's even better though, doing so saves a precious register!
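	ld a,b			;A = accumulator
	add a,c			;add note value, carry is set on overflow
	ld b,a			;store accumulator (flags are unchanged)
	sbc a,a			;A = #ff if carry was set, else 0
fp_switch
	add a,d			;self-mod: add a,d | xor a,d
	ld d,a			;update extended accumulator
	out (#fe),a		;output the new state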
Here, B is our accumulator. C is our note value, which has previously been left-shifted if it signifies bits 1..7 of the divider, or has had bit 7 masked otherwise. D is the "extended accumulator": it is used either to accumulate the overflow from the 8-bit add (if the note value signifies bits 1..7 of the divider), or serves as the extended bit (if the note value signifies bits 3..9).
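
For those who prefer to poke at this outside of an emulator, here is a minimal Python model of one pass through the snippet above. It only mirrors the documented Z80 semantics of the instructions involved (carry from `add`, `sbc a,a` turning the carry into 0 or #ff); the function name `step` and the `mode` argument standing in for the self-modified opcode are made up for this sketch, and it says nothing about timing.

```python
def step(b, c, d, mode):
    """Model one pass of the core snippet.

    b: accumulator, c: prepared note value, d: extended accumulator.
    mode: "add" or "xor", standing in for the self-modified instruction.
    Returns the new (b, d, out), where out is the byte written to port #fe.
    """
    total = b + c                            # ld a,b / add a,c
    b = total & 0xFF                         # ld b,a
    mask = 0xFF if total > 0xFF else 0x00    # sbc a,a: #ff on carry, else 0
    if mode == "add":
        d = (mask + d) & 0xFF                # add a,d: d drops by 1 on carry
    elif mode == "xor":
        d = mask ^ d                         # xor a,d: d is inverted on carry
    else:
        raise ValueError("mode must be 'add' or 'xor'")
    return b, d, d                           # ld d,a / out (#fe),a


if __name__ == "__main__":
    print(step(0xF0, 0x20, 0x10, "add"))     # overflow in "add" mode
    print(step(0xF0, 0x20, 0x10, "xor"))     # overflow in "xor" mode
    print(step(0x10, 0x20, 0x10, "add"))     # no overflow, d is unchanged
```

In this model, an overflow in "add" mode makes D count down by one, while in "xor" mode it flips every bit of D at once, so the output state changes on each overflow.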
Attached to this post, you will find the full source code, an example note table, and a demo. At this point, there are 3 major issues that need to be addressed.
1. Middle E is detuned. This can probably be rectified by shifting the note table a little, though this method is bound to produce some slight detune around where the "upper" table section starts.
2. Parsing is quite ugly atm. I'm sure it can be made more efficient, but I haven't found a good method yet.
3. The current way of calculating the output state gets in the way of applying other effects. I believe basic duty control should be possible (via the "phase offset" method), but other things like duty sweeps might be more difficult. My hope here is that instead this opens up the possibility for other tricks that I might not have thought about yet.
Well, that's basically all I wanted to share for now. It's probably possible to do something similar for 16-bit dividers, but most likely it won't work just-in-time. Other than that, I'm very curious to hear your thoughts on this. Is it useful at all? Any cool tricks that we can do with this? Any improvements for the implementation? Please let me know.