26

Re: Tutorial: How to Write a 1-Bit Music Routine

Arrgh, removing the border masking of course. I should've seen that.
How do you feel about border masking, from an aesthetic viewpoint? Over the years I've come to the conclusion that it is indeed quite unnecessary. In fact, almost every time I leave it out I get "nice border effects!" as a comment.

Re: Tutorial: How to Write a 1-Bit Music Routine

No, I did not remove border masking! It is still there. But since you used 224 t-states per sound loop iteration, I felt that it would be an excellent way to show that there is no contention anymore. The actual trick is kinda crazy actually:

playRegular
    exx            ;4
    add hl,de        ;11
    ld a,h            ;4
duty1 equ $+1
    cp #80            ;7
    sbc a,a            ;4
    and #12            ;7
    out (#fe),a        ;11    ; 12 + 11+4+7+4+7+11 = 56t
    
    add ix,bc        ;15
    ld a,ixh        ;8
duty2 equ $+1
    cp #80            ;7
    sbc a,a            ;4
    and #12            ;7
        exx            ;4
    out (#fe),a        ;11    ; 15+8+7+4+7+ 4 +11 = 56t

        nop            ;4
        nop            ;4
        add hl,sp        ;11
    ld a,iyh        ;8
duty3 equ $+1
    cp #80            ;7
    sbc a,a            ;4
    and #12            ;7
    out (#fe),a        ;11    ; 4+4+11+8+7+4+7+11 = 56t
    
        add iy,de        ;15
    ld a,h            ;4
duty4 equ $+1
    cp #80            ;7
    sbc a,a            ;4
stopch equ $+1
    and #12            ;7
        dec c            ;4
        exx            ;4
    out (#fe),a        ;11    ; 15+4+7+4+7+ 4+4 +11 = 56t

    jr nz,playRegular+1    ;12
                ;224
    exx
    ld a,b
ch1Length equ $+1    
    sub #ff                ;= timer - actual length
    jr z,_skip
    djnz playRegular
    jp rdptn

_skip
    ld (stopch),a
    
    djnz playRegular
    jp rdptn

I personally always work hard to keep border masking in all my engines. I know you can argue it is an aesthetic decision, but I kinda like my aesthetics less random :)

28

Re: Tutorial: How to Write a 1-Bit Music Routine

Hehe, sneaky! Thanks, I might very well implement that, with due credit of course.

Generally speaking, timer updates are quite regularly getting in my way somehow. I'd very much love to find an alternative to the usual 16-bit counter update. Though on the other hand, splitting it into two parts like we usually do (dec lo, jp nz, dec hi, jp nz) provides a handy way for doing small data updates on the fly (think effects).

29 (edited by utz 2023-08-08 10:40:20)

Re: Tutorial: How to Write a 1-Bit Music Routine

Part 10: Achieving Simple PCM with Wavetable Synthesis

The idea behind pulse-code modulated sound (PCM) is remarkably simple. A PCM waveform consists of a set of samples which describe the relative volume at a given time interval of constant length. (Note the terminology of "sample" in this context, which has nothing to do with a sample as we know it from MOD/XM, for example.) For playback, each sample is translated into a discrete voltage level, which is then amplified and ultimately sent to an output device, typically a loudspeaker. The samples are read and output sequentially at a constant rate until the end of the waveform has been reached.

When attempting this on a 1-bit device, we face the problem that we obviously can't output variable voltages. Instead we only have the choice between two levels, silence or "full blast". So how can we do it, then?

In order to understand how we can output PCM on a 1-bit device, let's first recap how Pulse Interleaving works. The underlying principle of P.I. is that we can keep the speaker cone in a floating state between full extension and contraction by changing the output state of our 1-bit port at a very fast rate, thanks to the inherent latency of the cone. So we're actually creating multiple volume levels. I'm sure you've realized by now that the same principle can be applied for PCM playback.

So, say we want to output a single PCM waveform at a constant pitch. All we need to do is interpret the volume levels described by the samples as the amount of time we need to keep our 1-bit port switched on. So we just create a loop of constant length, in which we

- read a sample
- switch the 1-bit port on for the amount of time which corresponds to the sample volume
- switch the 1-bit port off for the remaining loop time
- check if we've reached the end of the waveform, and loop if we haven't.

That's all - on we go with the next sample, rinse and repeat until the entire waveform has been played.
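As a rough illustration of the steps above, here is a small Python model (not Spectrum code). The names pwm_frame and play, and the idea of dividing a frame into equal "slots", are assumptions made for this sketch; a real routine would count CPU cycles instead.

```python
def pwm_frame(level, max_level, slots):
    """Return the 1-bit output states for one fixed-length frame.

    level / max_level is the sample volume. The port stays on for the
    matching share of the frame and off for the remaining time.
    """
    on_slots = (level * slots) // max_level
    return [1] * on_slots + [0] * (slots - on_slots)


def play(samples, max_level, slots):
    """Emit one frame per sample until the end of the waveform."""
    stream = []
    for level in samples:
        stream.extend(pwm_frame(level, max_level, slots))
    return stream


# A five-level ramp, four slots per frame.
print(play([0, 1, 2, 3, 4], max_level=4, slots=4))
```

Every frame has the same length no matter what the sample value is, which is what keeps the playback rate constant.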

Loop duration is a critical parameter here, of course. We can't make our loop too long, or else the "floating speaker state" trick won't work. It seems that a loop time of around 1/15000 seconds is the absolute maximum, but ideally you should do it a bit faster than that.

With common PCM WAVs, we'll run into a problem at this point. An 8-bit PCM WAV has samples which can take 256 different volume levels, take the more popular 16-bit ones and you've already got 65536 levels. How are we supposed to control timing that precisely in our loop? 1/15000 seconds corresponds to around 233 cycles on the ZX Spectrum. The fastest output command - OUT (n),A - takes 11 cycles, which means we can squeeze at most 21 of those into the loop - and that's not taking into account all the tasks we need to perform besides outputting. So how do we output 256 or even 65536 levels? The answer is: We don't. Instead, we'll reduce the sample depth (that is, the number of possible volume levels) to a suitable level. This will obviously degrade sound quality, but hey, it's better than nothing.
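The cycle budget mentioned above can be checked with a few lines of Python. The 3.5 MHz clock is an assumption on my part (it is the usual figure for the 48K Spectrum); the other numbers are taken from the text.

```python
CPU_HZ = 3_500_000      # assumed 48K Spectrum CPU clock
LOOP_RATE_HZ = 15_000   # slowest usable loop rate from the text
OUT_CYCLES = 11         # OUT (n),A

cycles_per_loop = CPU_HZ // LOOP_RATE_HZ
max_outs = cycles_per_loop // OUT_CYCLES

print(cycles_per_loop, max_outs)  # prints: 233 21
```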

As far as the Spectrum is concerned, 10 levels seems to be a convenient choice. You might be able to do more with clever code (or on a faster machine), but for the purpose of this tutorial, let's keep it at 10. That is, if we want to output just a single waveform. But of course we want to mix multiple waveforms at variable pitches, let's say two of them. In this case, our source PCM waveforms should have 5 volume levels.

As you might have already guessed, we'll need to develop our own PCM data format to encode these 5 levels. What this format will look like depends on your sound loop code as well as the device you're targeting - anything goes to make things as fast as possible. On the Spectrum, we may take two things into account:

- bit 4 sets the output state (let's ignore the details for now...)
- we have a fast command available for rotating the accumulator.

So, our sample bytes might look like this:

volume  binary    hex
level   76543210
______________________
  0%    00000000  #00
 25%    00010000  #10
 50%    00011000  #18
 75%    00011100  #1c
100%    00011110  #1e

The reasoning behind this may not be self-evident, but it'll become clear when we look at a possible sound loop.
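To make the table a bit more tangible, here is a small Python model of what happens when such a byte is sent to the port, rotated left, sent again, and so on. It relies only on the two facts listed above (bit 4 drives the output, and the accumulator can be rotated quickly). The function names and the choice of 4 outputs per sample are assumptions for this sketch.

```python
def rlca(a):
    """8-bit rotate left, like the Z80 RLCA instruction."""
    return ((a << 1) | (a >> 7)) & 0xFF


def beeper_states(sample_byte, outputs=4):
    """Bit 4 of each value sent to the port across `outputs` OUTs."""
    states = []
    a = sample_byte
    for _ in range(outputs):
        states.append((a >> 4) & 1)
        a = rlca(a)
    return states


for byte in (0x00, 0x10, 0x18, 0x1C, 0x1E):
    print(hex(byte), beeper_states(byte))
```

Each extra 1 bit to the right of bit 4 keeps the beeper on for one more output, so the five bytes give 0, 1, 2, 3 and 4 "on" outputs out of 4.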

Unfortunately, this custom PCM format still won't allow us to create a sound loop that is fast enough, so let's apply another restriction - use waveforms with a fixed length of 256 byte-sized samples. You'll see in a moment why this comes in handy.

Our sound loop might look like this:

  set up sample pointer channel 1                             ld bc,waveform1
  set base frequency ch1                                      ld de,noteval1
  clear add counter ch1                                       ld hl,0
                                                              exx
  set up sample pointer channel 2                             ld bc,waveform2
  set base frequency ch2                                      ld de,noteval2
  clear add counter ch2                                       ld hl,0
  set timer                                                   ld ix,0

loop:
  load channel 1 sample byte to accumulator                   ld a,(bc)
  output accu to beeper                                       out (#fe),a
  rotate left accumulator                                     rlca
  output accu to beeper                                       out (#fe),a
  rotate left accumulator                                     rlca
  output accu to beeper                                       out (#fe),a
  rotate left accumulator                                     rlca
  output accu to beeper                                       out (#fe),a
  add base frequency ch1 to counter ch1                       add hl,de 
    IF counter overflows, advance sample pointer ch1          ld a,c \ adc a,0 \ ld c,a
                                                              exx
  load channel 2 sample byte to accumulator                   ld a,(bc)
  output accu to beeper                                       out (#fe),a
  rotate left accumulator                                     rlca
  output accu to beeper                                       out (#fe),a
  rotate left accumulator                                     rlca
  output accu to beeper                                       out (#fe),a
  rotate left accumulator                                     rlca
  output accu to beeper                                       out (#fe),a
  add base frequency ch2 to counter ch2                       add hl,de
    IF counter overflows, advance sample pointer ch2          ld a,c \ adc a,0 \ ld c,a
    
  decrement timer and loop if not 0                           dec ix \ ld a,ixh \ or ixl \ jp nz,loop

Now you also see why limiting waveforms to 256 bytes is useful - this way, we can loop through them without ever having to reset the sample pointer, which of course saves time.

However, there's a whole array of problems with this code. First of all, it's still quite slow - 218 cycles. Secondly, you can see that the last output from each channel lasts significantly longer than the first 3. A bit of difference in length is actually not a big problem, but in this case, the last frame is 3 times longer - that's simply too much. Thirdly and most critically, I/O contention has not been taken care of (this mainly concerns the Speccy, of course).

If you've followed the discussion in this thread, you'll have noticed that I normally don't pay as much attention to I/O contention as other coders, but in this case, aligning the outputs to 8 t-state boundaries does make a huge difference. I'll let you figure this out on your own though. Check my wtfx code if you need further inspiration.

I will tell you one important trick for speeding up the sound loop though. Credits for this one go to sorchard from World of Dragon.

In the above example, we're actually using 24-bit frequency resolution, since we're keeping track of the overflow from adding our 16-bit counters. But 16 bits are quite enough to generate a sufficiently accurate 7-8 octave scale. So in the above example, instead of doing "ld a,c \ adc a,0 \ ld c,a" to update the sample pointer, you could simply do "ld c,h", saving a whopping 22 cycles in total (11 per channel). The high byte of our add counter thus becomes the low byte of our sample pointer. The downside of this is that our waveforms need to be simple - e.g. just one iteration of a basic wave (triangle, saw, square, etc.). It's less of a problem than it sounds though, as you won't be creating really complex waveforms in 256 bytes anyway. And for a kick drum or noise, you can simply use a frequency value <256, making sure that you step through every sample in the waveform.

And that's all for now, hope you find the information useful, and as always, let me know if you find any errors or have any further suggestions/ideas.

30 (edited by Shiru 2016-02-06 10:20:42)

Re: Tutorial: How to Write a 1-Bit Music Routine

Maybe I missed something, but isn't it easier to just use the MSB of the 16-bit accumulator as the LSB of the sample pointer? Did that before.

I mean,

add hl,de
ld c,h
ld b,MSB      ;all samples aligned to 256 bytes
ld a,(bc)
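
For readers who find the accumulator-to-pointer idea hard to picture, here is a minimal Python model of it. The function name step_wavetable and the placeholder table are assumptions for this sketch; the table simply stores its own index so that the fetched "sample" shows which entry was read.

```python
def step_wavetable(acc, divider, table):
    """Advance a 16-bit phase accumulator and fetch one sample.

    Models `add hl,de / ld c,h / ld a,(bc)`: the high byte of the
    accumulator is the index into a 256-byte, page-aligned waveform.
    """
    acc = (acc + divider) & 0xFFFF
    return acc, table[acc >> 8]


table = list(range(256))  # placeholder waveform: sample value == index
acc = 0
indices = []
for _ in range(4):
    acc, sample = step_wavetable(acc, 0x0180, table)
    indices.append(sample)

print(indices)  # prints: [1, 3, 4, 6]
```

With a divider of #0180 the index advances by 1.5 entries per step on average, and the 16-bit wrap-around of the accumulator restarts the waveform without any pointer reset.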

website - 1bit music - other music - youtube - bandcamp - patreon - twitter (latest news there)

31

Re: Tutorial: How to Write a 1-Bit Music Routine

Yes, sure, that's what I'm trying to explain in the last paragraph. I didn't use it in the example code, because I wanted to first show a general, all-purpose approach before talking about possible optimizations, and because the logic behind the accu-MSB -> sample pointer LSB method may not be so obvious for beginners without some additional explanation (at least it wasn't for me when I was told about it).

Re: Tutorial: How to Write a 1-Bit Music Routine

Oh, sorry, my bad, missed that last part.


33 (edited by utz 2017-07-01 18:39:11)

Re: Tutorial: How to Write a 1-Bit Music Routine

Part 11: Sound Tricks - Noise, Phasing, SID Sound, Earth Shaker, Duty Modulation

Digital/PCM sound is a powerful and flexible tool, but unfortunately it tends to consume a lot of RAM. So in this part of the tutorial, let's go back to the good old Pulse Interleaving technique, and talk about various tricks that can be used to spice up its sound.


Noise

Historically speaking, 1-bit routines have always been lacking in terms of percussive effects. However, converting a tone generator into a simple noise generator is surprisingly simple, and costs a mere 8 cycles on Z80-based systems. Consider the usual way we generate square wave tones:

   add hl,de           ;add base frequency divider (DE) to channel accumulator (HL)
   ld a,h              ;grab hi-byte of channel accumulator
   cp DUTY             ;compare against duty threshold value
   sbc a,a             ;set A to 0 or 0xFF depending on result
   out (#fe),a         ;output A to beeper port

Now, in order to generate noise instead of tones, one would obviously need to randomize the value held by the channel accumulator. But pseudo-random number generators are slow, so how do we do it? The answer is to simply reduce the quality of the PRNG as much as possible. Believe it or not, adding a single

   rlc h

after the add hl,de operation will provide enough entropy to create a convincing illusion of white noise. But it will do so only if it is fed a suitable frequency divider as a seed. I usually use a fixed value of 0x2174 in my routines. Other values are possible of course, and may give slightly different results, though most values will just generate nasty glitch sounds instead of noise.

There's a nice side effect that you get for free - you can create the illusion of controlling the volume of the noise by changing the duty threshold. Changing the pitch of the noise however is much more difficult, and requires the use of additional counters. I'm still looking for an efficient way of doing it, if you have any ideas please let me know.

Last note, you can of course also rotate the hi-byte of the frequency divider instead of the accu. The result of that however is almost guaranteed to be a glitch sound rather than noise.
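
If you want to play with seeds without firing up an emulator, the noise generator can be modelled in Python roughly like this. The function names are made up for this sketch, and the model says nothing about how the result actually sounds - it only reproduces the register arithmetic.

```python
def rlc8(x):
    """8-bit rotate left circular, like the Z80 `rlc h`."""
    return ((x << 1) | (x >> 7)) & 0xFF


def noise_bits(steps, divider=0x2174, duty=0x80, acc=0):
    """Beeper states from `add hl,de / rlc h / ld a,h / cp duty`."""
    bits = []
    for _ in range(steps):
        acc = (acc + divider) & 0xFFFF                 # add hl,de
        acc = (rlc8(acc >> 8) << 8) | (acc & 0xFF)     # rlc h
        bits.append(1 if (acc >> 8) < duty else 0)     # cp duty -> carry
    return bits


print(noise_bits(32))
```

Lowering the duty argument makes the 1 states rarer, which is the "volume" side effect described above.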


Phasing

The Phasing technique was developed by Shiru, and is used to generate the signature sound of his Phaser1-3 engines. It comes at a rather heavy cost in terms of cycle count and register usage, but its power and flexibility undoubtedly outweigh those drawbacks.

For regular tone generation, we use a single oscillator to generate the square wave (represented by the add hl,de operation). The main idea of Phasing, on the other hand, is to use two oscillators, and mix their outputs into a single signal. The mixing can be done via a binary XOR of the two oscillator outputs (the method used in Phaser1), or via a binary OR or AND (added in Phaser2/3).

   add hl,de           ;OSC 1: add base freq divider (DE) to channel accu (HL) as usual
   ld a,h              ;grab hi-byte of channel accumulator
   cp DUTY1            ;compare against duty threshold value
   sbc a,a             ;set A to 0 or 0xFF depending on result
   ld b,a              ;preserve result in B
   
   exx                 ;shadow register set, yay
   add hl,de           ;OSC 2: exactly the same operation as above
   ld a,h              ;grab hi-byte of channel accumulator
   cp DUTY2            ;compare against duty threshold value
   sbc a,a             ;set A to 0 or 0xFF depending on result
   exx                 ;back to primary register set
   
   xor b               ;combine output of OSC 1 and 2. (xor|or|and)
   out (#fe),a         ;output A to beeper port

As you can see, this method offers a wide range of parameters that affect timbre. The most important one, from which the technique derives its name, is the phase offset between the two oscillators. To make use of this feature, simply initialize the OSC1 accu to a different value than the OSC2 accu, e.g. initialize HL to 0 and HL' to a non-zero value. Especially in conjunction with a slight offset between the OSC1 and OSC2 base dividers, some surprisingly complex timbres can be produced.

Side note: By using a binary OR to mix the signal and keeping the duty thresholds down to a reasonable level, the two oscillators can be used as independent tone generators. This method is used to mix channels in Squeeker and derived engines.
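
Here is a rough Python model of the two-oscillator mix, in case you want to experiment with parameters offline. The function name phaser and its arguments are assumptions for this sketch; it mirrors the assembly above but ignores timing entirely.

```python
def phaser(steps, div1, div2, duty1, duty2, phase2=0, mix="xor"):
    """Mix two square-wave oscillators into one beeper stream."""
    acc1, acc2 = 0, phase2
    out = []
    for _ in range(steps):
        acc1 = (acc1 + div1) & 0xFFFF                 # OSC 1: add hl,de
        acc2 = (acc2 + div2) & 0xFFFF                 # OSC 2: add hl,de
        a = 0xFF if (acc1 >> 8) < duty1 else 0x00     # cp DUTY1 / sbc a,a
        b = 0xFF if (acc2 >> 8) < duty2 else 0x00     # cp DUTY2 / sbc a,a
        if mix == "xor":
            mixed = a ^ b
        elif mix == "or":
            mixed = a | b
        else:
            mixed = a & b
        out.append(mixed)
    return out


# Same divider and duty, no phase offset: XOR cancels everything out.
print(set(phaser(64, 0x0123, 0x0123, 0x80, 0x80)))
```

With identical dividers, a phase offset of #8000 and both thresholds at #80, the two oscillators are exact complements, so the XOR output is permanently high. Anything in between gives the timbre changes described above.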


SID Sound

This effect, which derives its name from the key sound that can be heard in many of the early SID tunes, is formed by a simple duty cycle sweep. The velocity of the sweep is in sync with the frequency of the tone generator. Basically, every time the channel accumulator overflows, the duty threshold is increased or decreased. As with noise, this is trivial to pull off and costs only a few cycles. Using the standard tone generation procedure, we can implement it as follows:

   add hl,de           ;add base frequency divider (DE) to channel accumulator (HL)
   sbc a,a             ;set A to 0 or 0xFF depending on result
   add a,c             ;add duty threshold (C)
   ld c,a              ;update duty threshold value (C = C - 1 if add hl,de carried)
   cp h                ;compare duty threshold value against hi-byte of channel accu
   sbc a,a             ;set A to 0 or 0xFF depending on result
   out (#fe),a         ;output A

As you can see, this operation costs a mere 4 cycles compared to the standard procedure without duty cycle sweep.
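
The sweep itself is easy to follow in a Python model of the code above. The function name sid_sweep is an assumption for this sketch; the comments name the instruction each line stands in for.

```python
def sid_sweep(steps, divider, duty, acc=0):
    """Model the duty sweep: the threshold drops by 1 per overflow."""
    outs = []
    for _ in range(steps):
        total = acc + divider
        carry = total > 0xFFFF              # add hl,de
        acc = total & 0xFFFF
        a = 0xFF if carry else 0x00         # sbc a,a
        duty = (a + duty) & 0xFF            # add a,c / ld c,a
        h = acc >> 8
        outs.append(0xFF if duty < h else 0x00)   # cp h / sbc a,a
    return outs, duty


outs, final_duty = sid_sweep(8, 0x4000, 0x40)
print(outs, hex(final_duty))
```

With a divider of #4000 the accumulator overflows every fourth step, so after 8 steps the threshold has dropped from #40 to #3e.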


Earth Shaker

This effect is named after the game Earth Shaker, which used a rather unusual sound routine with two semi-independent tone channels, written by Michael Batty. As an actual method of generating multi-channel sound, it is of limited practicality, but it can be applied as an effect to regular Pulse Interleaving at a minimal cost. The core concept here is to continually modulate the duty threshold within the sound loop. Depending on the ratio of the duty cycle change vs the oscillator speed, the result can be a nice chord, phatness, or - in most cases - gruesome disharmony that will strike fear in the hearts of even the most accustomed 1-bit connoisseurs. A simple implementation, as used in HoustonTracker 2 for example, looks like this:

   add hl,de           ;add base frequency divider (DE) to channel accumulator (HL)
   ld a,c              ;load duty threshold (C)
   add a,DUTY_MOD      ;add duty threshold modifier
   ld c,a              ;store new duty threshold
   cp h                ;compare duty threshold value against hi-byte of channel accu
   sbc a,a             ;set A to 0 or 0xFF depending on result
   out (#fe),a         ;output A to beeper port

Duty Modulation

The aforementioned SID sound and Earth Shaker effects are actually basic implementations of a family of effects that may best be described as "Duty Modulation". As a first step into the world of Duty Modulation, let's take the Earth Shaker effect and modify it to change the duty threshold in sync with the main oscillator.

   add hl,de           ;add base frequency divider (DE) to channel accumulator (HL)
   sbc a,a             ;set A to 0 or 0xFF depending on result
   and DUTY_MOD        ;set A to 0 or DUTY_MOD
   xor c               ;XOR with current duty threshold (C)
   ld c,a              ;store new duty threshold
   cp h                ;compare duty threshold value against hi-byte of channel accu
   sbc a,a             ;set A to 0 or 0xFF depending on result
   out (#fe),a         ;output A to beeper port

By syncing the modulation in this way, the nasty glitches of the Earth Shaker effect can be avoided entirely (but also, no chords will be produced). Instead, we can now control harmonic components that share an octave relation with the base note. In other words, we can amplify over- and undertones at will, as long as they are a multiple of 12 half-tones away from the main note.
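
A small Python model of the synced version shows how the threshold flips between two values, once per accumulator overflow. The function name and the returned lists are assumptions made for this sketch.

```python
def synced_duty_mod(steps, divider, duty, duty_mod, acc=0):
    """Toggle the duty threshold by XOR on every accumulator overflow."""
    thresholds, outs = [], []
    for _ in range(steps):
        total = acc + divider
        carry = total > 0xFFFF                  # add hl,de
        acc = total & 0xFFFF
        a = (0xFF if carry else 0x00) & duty_mod    # sbc a,a / and DUTY_MOD
        duty ^= a                               # xor c / ld c,a
        thresholds.append(duty)
        outs.append(0xFF if duty < (acc >> 8) else 0x00)   # cp h / sbc a,a
    return thresholds, outs


thresholds, outs = synced_duty_mod(4, 0x8000, 0x20, 0x40)
print([hex(t) for t in thresholds])  # prints: ['0x20', '0x60', '0x60', '0x20']
```

Since the threshold only changes when the main oscillator wraps, the modulation period is always an exact multiple of the tone period.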


Things can be pushed even further by decoupling the sync and using a second oscillator to time the duty threshold updates.

   exx
   add hl,de           ;independent oscillator for timed duty threshold updates
   exx
   sbc a,a             ;set A to 0 or 0xFF depending on result
   and DUTY_MOD        ;set A to 0 or DUTY_MOD
   xor c               ;XOR with current duty threshold (C)
   ld c,a              ;store new duty threshold
   
   add hl,de           ;add base frequency divider (DE) to channel accumulator (HL)
   cp h                ;compare duty threshold value against hi-byte of channel accu
   sbc a,a             ;set A to 0 or 0xFF depending on result
   out (#fe),a         ;output A to beeper port

This way, we can create the octave effects from the previous example (by setting the "duty" oscillator to the same value as the main tone oscillator), as well as Earth Shaker style chords, while also gaining better control over the latter. Additionally, some interesting slow-running timbre changes can be achieved by setting the duty oscillator to a frequency near (but not equal to) the main oscillator.

The usefulness of this approach might seem a bit questionable considering the hefty cost in CPU cycles and register usage. However, the required code is almost the same as the one used for the Phasing technique, so with a tiny amount of self-modifying code, it can be implemented in a Phaser style engine at virtually no extra cost.

There's also an added bonus when combining this technique with the noise generator explained above. By setting the duty threshold to the same value as the duty modifier, the duty oscillator can be used as a tone generator, meaning you can actually mix noise and tone on the same channel!

That's all for this time. If you know of any other cool tricks please post them here!

Re: Tutorial: How to Write a 1-Bit Music Routine

A note on the Phaser technique. It also allows you to control the duty, even in its simplest form, as seen in Phaser1, without the extra CPs. Set both oscillators to the same frequency, but reset their phases to different values, one to 0, the other to 32768 or less (32768 gives 50% duty).
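
A rough Python model of this, assuming each oscillator simply toggles its output bit whenever its accumulator overflows and the two bits are XORed together. The function name xor_duty is made up for this sketch.

```python
def xor_duty(divider, phase, steps):
    """Fraction of steps where the XOR of two toggling oscillators is high."""
    acc1, acc2 = 0, phase
    out1 = out2 = 0
    high = 0
    for _ in range(steps):
        acc1 += divider
        if acc1 > 0xFFFF:        # overflow: toggle oscillator 1
            acc1 &= 0xFFFF
            out1 ^= 1
        acc2 += divider
        if acc2 > 0xFFFF:        # overflow: toggle oscillator 2
            acc2 &= 0xFFFF
            out2 ^= 1
        high += out1 ^ out2
    return high / steps


print(xor_duty(0x0100, 0x8000, 512))  # prints: 0.5
print(xor_duty(0x0100, 0x4000, 512))  # prints: 0.25
```

In this model the duty comes out as phase / 65536, so a phase offset of 32768 gives the 50% mentioned above.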


35

Re: Tutorial: How to Write a 1-Bit Music Routine

Good point. I still haven't found a useful application for this, though.
What I came up with some time ago was this:

   ld bc,DIVIDER     ;DIVIDER < #1000
   ld hl,0
   ld de,!0
loop
   add hl,bc
   ex de,hl
   add hl,bc
   ex de,hl
   ld a,h
   xor d
   out (#fe),a
   ...

It's obviously pretty fast, and 12-bit resolution more or less does the job. But as such, it still seems quite a waste. I wonder if something more useful can be derived from it.

36 (edited by Shiru 2016-10-13 16:05:53)

Re: Tutorial: How to Write a 1-Bit Music Routine

Well, one application of this fact is extra timbre by doing duty modulation in simple Phaser-like engines:

    ld hl,0
    ld ix,1024
    ld bc,200
    
loop
    add hl,bc
    jr c,$+4
    jr $+4
    xor 16
    add ix,bc
    jr c,$+4
    jr $+4
    xor 16
    out (#fe),a
    
    inc ix  ;that's the duty modulation
    
    jp loop
Post's attachments

test.sna 48.03 kb, 8 downloads since 2016-10-13 


37

Re: Tutorial: How to Write a 1-Bit Music Routine

Hmm... so it basically does the "SID sound" thing, but at a different rate. Very interesting idea, looks unassuming but has a lot of potential.
Would be an interesting challenge to write a pulse interleaving engine that doesn't use the "compare counter hi-byte against duty threshold" method.

38

Re: Tutorial: How to Write a 1-Bit Music Routine

   ld hl,0
   ld de,#8000
   ld bc,#7
    
loop
   add hl,bc
   ex de,hl
   adc hl,bc    ;nice switch for toggling duty mod on/off
   ex de,hl
   ld a,h
   xor d
   out (#fe),a
   jr loop

19 byte 48K Spectrum intro xD

Post's attachments

test.tap 109 b, 5 downloads since 2016-10-13 


Re: Tutorial: How to Write a 1-Bit Music Routine

That's like the hypnotoad. Pretty cool.


40

Re: Tutorial: How to Write a 1-Bit Music Routine

Hehehe, yes.
Anyway, let's continue this discussion here, if you like.

41

Re: Tutorial: How to Write a 1-Bit Music Routine

Part 12: Synthesizing Basic Waveforms: Rectangle, Triangle, Saw

In this chapter, I'm going to explain how to synthesize different waveforms without the use of samples or wavetables.


For generating waveforms other than a rectangle/pulse on a 1-bit output, we need to be able to output multiple volume levels. In part 10, we looked at some methods for outputting PCM samples and wavetables. We concluded that in the 1-bit domain, time is directly related to volume. The longer we keep our 1-bit output "on" within a fixed-length frame, the higher the volume produced by the speaker cone will be. We can use this knowledge to write a very efficient rendering loop that will generate 8 volume levels with just 3 output commands:

  calculate 3-bit volume
  output (volume & 1) for t cycles
  output (volume & 2) for 2t cycles
  output (volume & 4) for 4t cycles and loop

As you can see, the trick here is to double the amount of cycles taken after each consecutive output command in the loop. An implementation of this for the ZX Spectrum beeper could look like this:

  ld c,#fe
  ld hl,0
  ld de,frequency_divider
loop
  add hl,de         ;11                   ;update frequency counter as usual
  ;...              ;y                    ;do some magic to calculate 3-bit volume
  ;...              ;z                    ;put it in bit 4-6 of register A
                                          ;so now bit 4 of A = volume & 1
  out (c),a         ;12: x+10+11+y+z=64   ;output to beeper
  rrca              ;4                    ;now bit 4 of A = volume & 2
  out (c),a         ;12: 4+12=16          ;output
  ds 4              ;16                   ;timing
  rrca              ;4                    ;now bit 4 of A = volume & 4
  out (c),a         ;12: 16+4+12=32       ;output
  ;...              ;x                    ;update timer etc.
  jp loop           ;10                   ;loop

If you count the cycles, you'll notice that this loop takes exactly 112 cycles. Which means we can easily add a second channel in the same manner, which brings the total cycle count to 224 - perfect for a ZX beeper routine. Side note: If necessary, you can cheat a little and reduce the 64-cycle output to 56 cycles, without much impact on the sound.
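
To double-check the weighting, here is a short Python sketch that adds up the "on" time for each 3-bit volume, using the 16/32/64 cycle slots from the loop above. The names SLOT_CYCLES and on_cycles are assumptions for this sketch.

```python
SLOT_CYCLES = (16, 32, 64)   # durations of the three outputs (t = 16)


def on_cycles(volume):
    """Cycles per frame during which the beeper is on for a 3-bit volume."""
    return sum(
        cycles
        for bit, cycles in enumerate(SLOT_CYCLES)
        if volume & (1 << bit)
    )


for v in range(8):
    print(v, on_cycles(v))
```

The "on" time comes out as 16 cycles times the volume, so the 8 levels are evenly spaced across the 112-cycle frame.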


Anyway, we will use this framework as the basis for our waveform generation. So let's talk about the "magic" part.

The easiest of the basic waveforms is the saw wave. How so, you may ask? Well, the saw wave is actually right in front of your nose. Look at the first command in the sound loop - ADD HL,DE. Say we set the frequency divider in DE to 0x100. What happens to the H register? It is incremented by 1 each sound loop iteration, before wrapping around to 0 eventually. Ok, by now you might have guessed where this is going. If you haven't, then plot it out on a piece of paper - the value of H goes on the y-axis, and the number of loop iterations goes on the x-axis. Any questions? As you can see, our saw wave is actually generated for free while we update our frequency counter (thanks to Shiru for pointing this out to me). We just need to put it into A, and rotate once to get it into the right position.

  add hl,de         ;update frequency counter
  ld a,h            ;now 3-bit volume is in bit 5-7 of A
  rrca              ;now it's in bit 4-6
  out (c),a         ;output as above
  ...

Doing a triangle wave is a little more tricky. In fact, being the lousy mathematician that I am, it took me quite a while to figure this out. Ok, here's how it's done. We've already got the first half of our triangle wave done - it's the same as the saw wave. The second half is where the trouble starts - instead of increasing the volume further as we do for the saw wave, we want to decrease it again. So we could do something ugly like

  add hl,de
  ld a,h            ;check if we've passed the half-way point of the saw
  rla               ;aka H >= 0x80
  jp c,_invert_volume
  ...
reentry
  rrca
  out (c),a
  ...
  jp loop
_invert_volume
  ;A = -H

There's a more elegant way that does the same thing without the need for conditional jumps.

  add hl,de
  ld a,h
  rla
  sbc a,a          ;if h >= 0x80, A = 0xff, else A = 0
  xor h            ;0 xor H = H, 0xff xor H = -H - 1
  out (c),a        ;result is already in bit 4-6, no need to rotate
  ...

We can simply ignore the off-by-one error on H >= 0x80, since we don't care about the lower 4 bits anyway.
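
If you'd rather not plot this by hand, the following Python sketch computes the 3-bit volume for the saw and triangle variants from the value of H. The function names are assumptions for this sketch; the comments point to the instructions being modelled.

```python
def saw_volume(h):
    """Saw: the 3-bit volume is simply bits 5-7 of H."""
    return (h >> 5) & 7


def triangle_volume(h):
    """Triangle: invert H in the second half, then take bits 4-6."""
    mask = 0xFF if h >= 0x80 else 0x00   # rla / sbc a,a
    a = h ^ mask                         # xor h
    return (a >> 4) & 7


print([triangle_volume(h) for h in range(0, 256, 16)])
```

The triangle rises from 0 to 7 while H goes from #00 to #7f, and mirrors back down to 0 while H goes from #80 to #ff.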

Last but not least, a word about rectangle waves. Of course, rectangle waves happen naturally on a 1-bit output, unless you force it to do something else. Which we are doing in this case, so how do we get things back to "normal"? Well, to get a square wave, we simply have to remove the XOR H from the previous code example. Which means that with just two bytes of self-modifying code, we can create a routine that will render saw, triangle, or square waves on demand:

  add hl,de
  ld a,h
  rla
  
  ;saw    |     tri     |    rect
  rra     |   sbc a,a   |   sbc a,a
  rrca    |   xor h     |   nop

  out (c),a
  ...

You'll notice that even with timer updates, register swapping, etc. you'll still have some free cycles left. Which should, of course, be put to some good use - see part 11 if you need some inspiration.

Re: Tutorial: How to Write a 1-Bit Music Routine

Thanks a lot for the write-up, now I get how it works. Pretty clever and cool stuff. 1-bit synthesis findings are still able to amaze me, after so many engines and years.


43 (edited by Shiru 2017-06-18 13:47:58)

Re: Tutorial: How to Write a 1-Bit Music Routine

I thought I'd make a note on the row transition noise. There is a way to minimize it. The idea is to avoid row transitions as such. Rather than storing the song as a set of rows, with constant skipping of empty fields, store the delays between any actual changes that take place. This should keep the sound loop tight, and the longer gaps when actual changes happen will be masked by the fact that these changes actually change/restart a note, or trigger a drum, i.e. introduce a major change in the sound anyway.

I.e., rather than:

C-1 ...
... ...
... ...
... ...
... C-2
... ...
C-1 E-2

have:

C-1 ... > run sound loop for four rows without parsing anything
... C-2 > run sound loop for two rows
C-1 E-2
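
On the converter side, this could be sketched in Python roughly as follows. The function name compress_rows and the tuple layout are assumptions for this sketch, not an existing tracker format.

```python
def compress_rows(rows, empty="..."):
    """Merge each row with the empty rows that follow it.

    Returns a list of (row, length) pairs, where length is the number
    of rows the sound loop should run before the next change.
    """
    events = []
    for row in rows:
        if events and all(cell == empty for cell in row):
            last_row, length = events[-1]
            events[-1] = (last_row, length + 1)
        else:
            events.append((row, 1))
    return events


pattern = [
    ("C-1", "..."),
    ("...", "..."),
    ("...", "..."),
    ("...", "..."),
    ("...", "C-2"),
    ("...", "..."),
    ("C-1", "E-2"),
]
print(compress_rows(pattern))
```

For the pattern above this gives three events with lengths 4, 2 and 1.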


44

Re: Tutorial: How to Write a 1-Bit Music Routine

Hmm, I believe in most cases, especially with engines with few channels, there is actually at least one change every row. Also, as you know I always use a per-row speed setting. In that case it's something that should be handled on the tracker/converter end - if there is no change, simply increase the previous row length. Nevertheless, I'm sure there will be other cases where this "store only changes" approach is useful. Either way, it's a good reminder to keep thinking about song data structure. Got kind of sloppy with that myself. I'm perhaps too content with my current stack-read/skip-on-flag approach.

45 (edited by perevalovds 2020-12-02 03:55:56)

Re: Tutorial: How to Write a 1-Bit Music Routine

Hello. I have a question about Part 10 ("PCM replay").
The proposed scheme uses five volume levels:
  0%    00000000  #00
 25%    00010000  #10
 50%    00011000  #18
 75%    00011100  #1c
100%    00011110  #1e
I am wondering whether the quality would improve with a more delicate on/off scheme, I mean, one where the "0"s and "1"s are spread more uniformly:
00010000
00010100
00010101
00011110
Does it make sense?

46

Re: Tutorial: How to Write a 1-Bit Music Routine

From what I remember of my experiments back in the day, there is no noticeable benefit to distributing the 1s like this, except for many consecutive 1s, e.g. 11101110 is sometimes a bit cleaner than 11111100. I'd say it's not worth spending extra CPU cycles on, though.
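
To make the "consecutive 1s" point concrete, here is a small Python sketch (the helper names are just for illustration) that compares two patterns by how many 1s they contain and by their longest unbroken run of 1s. It says nothing about how the speaker actually responds; it only shows that the two example patterns carry the same amount of "on" time but differ in run length.

```python
def ones(pattern):
    """Number of set bits in an 8-bit pattern."""
    return bin(pattern & 0xFF).count("1")


def longest_run(pattern):
    """Length of the longest unbroken run of 1s (not treated as cyclic)."""
    longest = current = 0
    for bit in format(pattern & 0xFF, "08b"):
        current = current + 1 if bit == "1" else 0
        longest = max(longest, current)
    return longest


for p in (0b11101110, 0b11111100):
    print(format(p, "08b"), ones(p), longest_run(p))
# prints:
# 11101110 6 3
# 11111100 6 6
```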

Part 10 is actually a bit outdated; in part 12 you will find a better synthesis method that can also be adapted to PCM playback. In a sense, the same caveat about consecutive 1s applies to that, though, e.g. having two 3-bit outputs will be cleaner than having one 4-bit output, but having one 4-bit output saves 24 cycles...

In any case there are some interesting effects that can be observed in this regard. This is because the expansion of a speaker diaphragm isn't linear, so at volumes >50% it will react differently than <50%. So far nobody has been able to really make use of that, though.

47 (edited by chupo_cro 2023-08-08 04:30:25)

Re: Tutorial: How to Write a 1-Bit Music Routine

Hi, excellent tutorial!

While reading I noticed a few things that might or might not need correcting. In the Pulse Interleaving routine, most of the time the loop takes both JR NZ,SKIPx jumps, so A still holds IXH OR IXL from the end of the previous iteration, and that is what gets output to port 254. That could be avoided by moving LD A,H and LD A,L above the JR NZ,SKIPx instructions.

In the Variable Pulse Width routine, I think the state changes when the counter goes from #7FFF to #8000, not from #8000 to #8001: carry is 1 while H is between #00 and #7F, and becomes 0 once H reaches #80. The other reason the state can't change between #8000 and #8001 is that we only check the high byte, so a change in the low byte alone can't change the state.
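
This can be checked with a tiny Python model of the `ld a,h / cp duty / sbc a,a / and mask` sequence. The function name and the `mask` default of #10 are assumptions for the sake of the sketch; only the carry behaviour of `cp` and `sbc a,a` is modelled.

```python
def speaker_bit(counter, duty=0x80, mask=0x10):
    """Value written to the port for a given 16-bit counter value."""
    h = (counter >> 8) & 0xFF       # ld a,h  : only the high byte is tested
    carry = h < duty                # cp duty : carry is set when A < duty
    a = 0xFF if carry else 0x00     # sbc a,a : A = #ff if carry, else #00
    return a & mask                 # and mask


before = speaker_bit(0x7FFF)   # H = #7f, carry set
at = speaker_bit(0x8000)       # H = #80, carry clear
after = speaker_bit(0x8001)    # low byte changed only, same result as #8000
```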

In Achieving Simple PCM with Wavetable Synthesis there are multiple RCLA instructions where there should be RLCA.

Chupo_cro

48

Re: Tutorial: How to Write a 1-Bit Music Routine

Thanks for spotting and reporting these! I've corrected No. 2 and 3.

I'm inclined to leave the part on Pulse Interleaving as is. You are of course correct in that moving the ld a,ixh/l up would save 8t here, but I'm concerned that it might make it harder to see the connection with the pseudo-code on the left. In the end a proper player should not be written like this anyway, because you'd still end up with an all-too-large, 10t timing difference between the skip-taken and skip-not-taken paths.
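
For reference, the 10t figure can be checked by counting t-states along both paths between the `jr nz,skipN` and the following `out`. A minimal Python sketch, assuming `ld a,h` / `ld a,l` sit above the jump, so that the not-taken path consists of `xor #10`, `ld h,a` and `ld b,c` (or the channel 2 equivalents); the timings are the standard Z80 values.

```python
# Standard Z80 instruction timings in t-states.
T = {
    "jr_taken": 12,      # jr nz,skip with the jump taken
    "jr_not_taken": 7,   # jr nz,skip falling through
    "xor_n": 7,          # xor #10
    "ld_r_r": 4,         # ld h,a / ld b,c
}

skip_taken = T["jr_taken"]
skip_not_taken = T["jr_not_taken"] + T["xor_n"] + 2 * T["ld_r_r"]
difference = skip_not_taken - skip_taken   # 22 - 12
```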

Re: Tutorial: How to Write a 1-Bit Music Routine

Hi, I didn't mean moving LD A,IXH/L to save T-states. I meant moving the instructions that read the current speaker states back into the A register, because checking whether IX is zero changes A and would completely destroy the sound. And even if A weren't changed by that check, the routine wouldn't update the speaker states of channel 1 and channel 2 when the JR NZ jumps are taken, which is most of the time. This is what I meant:

            ld h,0
            ld l,0
soundloop   dec b
            ld a,h                  ; I moved this above JR NZ
            jr nz,skip1
            xor #10
            ld h,a
            ld b,c
skip1       out (#fe),a
            dec d
            ld a,l                  ; I moved this above JR NZ
            jr nz,skip2
            xor #10
            ld l,a
            ld d,e
skip2       out (#fe),a
            dec ix
            ld a,ixh                ; These destroy A register which is holding the speaker state in...
            or ixl                  ; ...case of both JR NZ jumps are taken which is most of the time
            jr nz,soundloop
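
The behaviour of the loop above can be modelled in a few lines of Python. This is only a sketch: the function name is made up, and it assumes B and D are preloaded with their reload values C and E, which the listing doesn't show. It records the byte sent to port #fe by each of the two `out` instructions per iteration.

```python
def pulse_interleave(c, e, length):
    """Model of the corrected loop; returns a list of (out1, out2) per pass."""
    h = l = 0                      # ld h,0 / ld l,0
    b, d = c, e                    # assumption: counters start at reload value
    ix = length
    writes = []
    while True:
        b = (b - 1) & 0xFF         # dec b
        a = h                      # ld a,h (does not touch the flags)
        if b == 0:                 # jr nz,skip1 not taken
            a ^= 0x10              # xor #10
            h = a                  # ld h,a
            b = c                  # ld b,c
        out1 = a                   # skip1: out (#fe),a
        d = (d - 1) & 0xFF         # dec d
        a = l                      # ld a,l
        if d == 0:                 # jr nz,skip2 not taken
            a ^= 0x10
            l = a
            d = e
        out2 = a                   # skip2: out (#fe),a
        writes.append((out1, out2))
        ix = (ix - 1) & 0xFFFF     # dec ix
        if ix == 0:                # ld a,ixh / or ixl / jr nz,soundloop
            break
    return writes


writes = pulse_interleave(c=2, e=3, length=6)
```

With these values channel 1 toggles every second pass and channel 2 every third, and every value written is either #00 or #10, i.e. the speaker state is output on every pass whether or not the jumps are taken.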

How exactly did you mean to move LD A, IXH/L to save 8 T-states?

Chupo_cro

50

Re: Tutorial: How to Write a 1-Bit Music Routine

D'oh! I was looking at a different part of the tutorial. Thanks for your patience, I've corrected the error now.