Topic: The optimization thread

Thought I'd open a thread for brainstorming about optimizing various parts of beeper engines. I'm thinking especially about various issues related to data loading.

For a start, let's talk about frequency table lookup. Lately I tend to avoid it altogether and store frequencies directly in the pattern data instead, but sometimes it comes in handy, I suppose. My method is quite primitive. Suppose we have a table of 16-bit frequencies aligned to a 256-byte boundary, and note values in the pattern data are already multiplied by 2.

       ld h,HIGH(frequency_lut)      ;7
       ld a,(de)                     ;7, load note val from pattern data
       ld l,a                        ;4
       ld c,(hl)                     ;7
       inc l                         ;4
       ld b,(hl)                     ;7, frequency now in BC

That brings us to a total of 36t. Of course, if you don't care about sp, you could replace the last three instructions with "ld sp,hl" and "pop bc" (16t instead of 18t), which would be 2t faster. Is there any faster way to do it, preferably without touching sp?
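
For reference, a sketch of the sp variant. It assumes interrupts are disabled, since an interrupt arriving while sp points into the table would push a return address over the table entries, and it assumes sp is either unused or restored afterwards.

       ld h,HIGH(frequency_lut)      ;7
       ld a,(de)                     ;7, load note val from pattern data
       ld l,a                        ;4
       ld sp,hl                      ;6, sp now points at the table entry
       pop bc                        ;10, frequency now in BC

That comes to 34t, and sp is left pointing 2 bytes past the entry.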

Re: The optimization thread

1. You can avoid wasting the lowest bit of the pattern data by storing the low and high bytes in two separate LUTs, with the high-byte table placed 256 bytes after the low-byte one, and going from one to the other via INC H (instead of your INC L).

2. Otherwise, there is not much you can do with this.
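
A sketch of point 1, assuming both tables are aligned to 256 bytes and the high-byte table directly follows the low-byte one. The name frequency_lut_lo is made up for illustration.

       ld h,HIGH(frequency_lut_lo)   ;7
       ld a,(de)                     ;7, load note val, not preshifted
       ld l,a                        ;4
       ld c,(hl)                     ;7, low byte
       inc h                         ;4, step into the high-byte table
       ld b,(hl)                     ;7, frequency now in BC

Still 36t, but the note value is used as is, so all 8 bits are available as a table index.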

Re: The optimization thread

This is more of a design question but let's say you had to combine frequency tables and detune. How would you go about it?

I can think of two options:

1. Persistent detune values. Once a detune value is read, it is stored and kept until you run into another one. All subsequent notes in the corresponding channel have the currently stored detune value added to their frequencies.
2. Immediate frequencies. A note normally takes one byte, but if a certain bit is set (there are typically a few spare ones), the byte is instead interpreted as the MSB of a 2-byte frequency value.

I suppose the first approach results in smaller data wherever detuned notes are used frequently, so it is perhaps more suitable for SpecialFX-like engines, whereas for the second one it is the other way around.

Re: The optimization thread

I'd go for option 2. Unless you preshift note values like in the example above, you never need bit 7, so you can use that to signal detune. Option 1 would not be very practical, because with 16-bit frequencies you typically need to recalculate the detune value for every note anyway.

Option 2 then again boils down to two choices. Either you interpret this and the following byte as a direct frequency value (i.e. skipping the table lookup), which would limit the frequency range to 0-$7fff, or you interpret it as a note value followed by a one-byte detune value, which you'd then multiply by e.g. (hi-byte of the frequency/16) or so to get a decent detune range.
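
A rough sketch of the first choice. It assumes a 16-bit table at frequency_lut as in the first post, unshifted note values 0-127, DE pointing at the pattern data, and a flagged byte holding the frequency MSB followed by the LSB. The labels are made up for illustration.

       ld a,(de)                     ;7, note val or flagged frequency MSB
       inc de                        ;6
       add a,a                       ;4, bit 7 -> carry, A = note val * 2
       jr nc,lookup                  ;12/7
       srl a                         ;8, A = frequency MSB with flag bit cleared
       ld b,a                        ;4
       ld a,(de)                     ;7, frequency LSB
       inc de                        ;6
       ld c,a                        ;4
       jr done                       ;12
lookup:
       ld h,HIGH(frequency_lut)      ;7
       ld l,a                        ;4
       ld c,(hl)                     ;7
       inc l                         ;4
       ld b,(hl)                     ;7
done:                                ;frequency now in BC

The two paths take different amounts of time, so this is only suitable where data loading doesn't have to be cycle-exact.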