next gen engine ideas
Wow utz, something completely different again
Excellent, sounds like a guitar sample
Yes, that's the idea - except it's not a sample, it's generated in realtime!
Anyway, getting closer to a two-channel implementation. Still need to get rid of some 20 t's of overhead, though.
Ultimately it'd be great to come up with a more efficient way of implementing this, because even just some simple additional modifications of the delay line can go a long way with this algorithm. Also, using a non-random source (for example a square wave) can produce some great results as well, but again it takes too much time with my current implementation.
Edit: Probably the way to go forward is to render at a lower rate. Did some tests with rendering at 7812.5 Hz, it doesn't sound too bad (and will give more flexibility, I hope) - see test2.tap.
More progress on this. Separated the code that generates the attack transient, so it's now possible to use arbitrary source material. The attached example uses a 50:50 square. 2 channels @ 7812.5 Hz. Some parasitic noise remains; I'm afraid it'll be difficult to reduce it further.
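For reference, the general algorithm at work here appears to be Karplus-Strong: a delay line is seeded with a burst of source material (noise, or the square wave mentioned above) and smoothed as it recirculates. A minimal C sketch of the textbook version, with made-up parameters; the actual engine of course does this in Z80 under far tighter constraints:

#include <stdint.h>
#include <stdlib.h>

#define DELAY_LEN 128                   /* delay length sets the pitch */

int main(void)
{
    int16_t line[DELAY_LEN];
    /* seed the delay line; this is what shapes the attack transient */
    for (int i = 0; i < DELAY_LEN; i++)
        line[i] = (rand() & 1) ? 32767 : -32768;
    for (long t = 0, pos = 0; t < 100000; t++) {
        int16_t out = line[pos];
        /* two-point average: the classic KS lowpass in the feedback loop */
        line[pos] = (int16_t)((line[pos] + line[(pos + 1) % DELAY_LEN]) / 2);
        pos = (pos + 1) % DELAY_LEN;
        (void)out;  /* on the beeper, the sign of out would drive the speaker */
    }
    return 0;
}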
Another game that seemingly has the same weird and rare engine as Galaxy Force: Count Duckula. Same year, but it seems to have totally different authors.
Nice work utz, another new sound for the beeper. Reminds me of a marimba
Great work utz! Wish I knew enough about 1bit music programming and/or music generally to make a doom metal album with this
Thanks guys! I actually started working on an engine to incorporate this a while back, but got a bit stuck trying to reduce noise. I do want to continue with this at some point though. But first, I'm off to my traditional end-of-the-year trash album recording session.
Shiru: Interesting find! I'm especially intrigued by the noise/hihat sounds. Say, didn't we have another thread for this stuff? Can't seem to find it now.
I thought we had one, but I wasn't able to locate it, and this one has my post mentioning Galaxy Force and a few other interesting old engines. Better to split it into a separate thread anyway.
Recently I saw an interesting comment on YouTube where a guy explained his understanding of how Wham and other interleaving engines work. It's an interesting way to look at things. He considers the logic of such engines to be:
- when both channels output 0, the engine outputs 0 to the speaker, so the output weight is 0
- when both channels output 1, the engine outputs 1 to the speaker, so the output weight is 1
- when the channel outputs differ, the engine outputs an alternating sequence of 1s and 0s at its maximum possible sample rate, which gives an effective output weight of 0.5
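A rough C model of that logic (my paraphrase of the comment, not code from any actual engine):

#include <stdio.h>

int main(void)
{
    int flip = 0;
    for (int t = 0; t < 32; t++) {
        int ch1 = (t / 8) & 1;     /* slower square wave */
        int ch2 = (t / 4) & 1;     /* faster square wave */
        int out;
        if (ch1 == ch2)
            out = ch1;             /* agreement: weight 0 or 1 */
        else
            out = flip ^= 1;       /* disagreement: alternate at full rate, weight 0.5 */
        printf("%d", out);
    }
    putchar('\n');
    return 0;
}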
That's basically the idea behind engines like zbmod, or, more relevantly when talking about pulse interleaving, octode2k16. The interleaving of the various pulse trains is abstracted into an actual calculated output weight. So far, so good.
However, there are side effects. If the channel weight is kept near the maximum for consecutive loop iterations, the speaker membrane will build up additional pressure because there is not enough time for it to return to its rest state. You can hear this clearly in o2k16, when all or almost all channels are busy. I tried to abuse this effect as an engine feature, but it is very hard to control.
When doing actual interleaving, this problem is much less pronounced, since most often there will still be long enough periods for the speaker cone to retract even when channel weight is high. So by interleaving two "binary trees" (like in pytha, for example), we can mitigate the "ramping" issue, while still maintaining a sizeable number of possible channels. However, this also has side effects, as demonstrated here: http://randomflux.info/1bit/viewtopic.p … 1396#p1396 I still haven't figured out what exactly is causing this phenomenon. My guess is that possibly the common assumption about using 8t alignment to mitigate IO contention is not correct, to start with. But probably that isn't the whole picture either. I do want to investigate further with a "perfectly aligned" engine at some point, though. Just need to find some time and motivation for it.
Very impressive research, nice sound! I think 1-bit sound is more advanced than AY today (no surprise). And because there's no reliable ratio between the CPU and AY clocks, and AY register access is clumsy, there's no need to bother with this PSG... or is there?
Has anybody experimented with the volume registers on the AY and some kind of advanced 1-bit mixing? The logarithmic nature of the volume curve might help solve some membrane-pressure side effects...
Z.
Seems the biggest mistake in many of my newer engines is not masking bit 3 (the MIC bit on port #FE). Worked wonders for the KS thing in any case.
I tried some simple 1-bit mixing on the Game Boy a while back, so I can confirm it generally works. On the other hand, I currently don't have a machine with an AY, so I have less motivation to try it. Maybe when I get my Next hur hur... I tried to make a combined AY+beeper engine, but it turns out the volume difference is huge and also varies a lot between models, so I haven't investigated further in this direction either.
Currently I'm looking a lot into data encoding. I have a sort of el cheapo floating point format now which allows me to encode 12-bit frequency dividers in 8 bits. It costs 8 additional cycles in the sound loop, but saves a register. Something similar should be doable for 16-bit dividers. I'm also experimenting with a new song data format that isn't based on a pattern/sequence structure but rather on a dictionary-based approach. Parsing such data has a small overhead compared to seq/ptn data (it needs an additional register pair for decoding and is slightly slower than just popping values from the stack), but first tests are promising: on average 10-30% smaller data than the traditional approach. However, I need to test it with more data to be sure. I'm thinking about grabbing a large set of files from modarchive and building test material from that. Anyway, I'll of course post more on this once it's progressed a bit further.
On ZX beeper, note data is commonly encoded based on one of the following principles:
A. 8-bit frequency dividers or countdown values, stored directly.
B. 8-bit indices into a lookup table holding 16-bit frequency dividers.
C. 16-bit frequency dividers, stored directly.
D. 12-bit frequency dividers, stored directly (as 16 bits).
Method A is efficient in terms of data size, but has well-known limitations regarding note range and detune. Method B is also size-efficient, but the table lookup is inevitably slow, which is a problem for pulse-interleaving engines because of row transition noise. It also requires an additional register for parsing. Method C is size-inefficient, but allows for fast and efficient parsing. Method D has similar constraints to method C, but is slightly more size-efficient, as additional information can be stored in the upper 4 bits.
I have been using method D in several of my later engines, and generally regard it as a good compromise, except of course in cases where higher precision is needed. However, the question is whether there is a better solution.
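For illustration, method D in C terms (field names are my own, the actual layout may differ per engine):

#include <stdint.h>

typedef struct {
    uint16_t divider;   /* 12-bit frequency divider */
    uint8_t  extra;     /* whatever fits in the upper 4 bits */
} note_t;

static note_t unpack_note(uint16_t word)
{
    note_t n = { (uint16_t)(word & 0x0fff), (uint8_t)(word >> 12) };
    return n;
}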
First of all, it is safe to say that the most significant bit of a 12/16-bit divider is hardly significant at all. The relevant notes are in a range that not only isn't very useful musically, but also cannot be reproduced accurately by most existing beeper engines.
More importantly though, it should be noted that for higher notes, precision actually becomes less important, for a number of reasons. One of them is psychoacoustics: humans are naturally bad at distinguishing between high frequencies. The key reason is of a more technical nature, though.

Imagine we have determined the frequency divider of an arbitrarily chosen reference note A-9 to be 0x7a23. That means the divider for A-8 is 0x3d11, A-7 gives 0x1e88, and so forth, until we arrive at a divider value of 0x7a for A-1. And here comes the funny part. We know that for each octave, frequencies will double. So if A-1 is 0x7a, then A-2 is 0xf4, A-3 is 0x1e8... A-9 is 0x7a00. Wait, what? Didn't we just determine that A-9 is 0x7a23? Well, that depends on how you look at it. Musically speaking, 0x7a23 may be the correct value when thinking in terms of a system where A-4 = 440 Hz. However, in our magic little beeper world, 0x7a00 is just as correct, as it perfectly satisfies the requirement that frequencies should double with each octave.

Hence we can conclude that for A-9, we don't actually need the precision of the lower 8 bits, so we could just store the higher 8 bits and ignore the lower byte altogether. However, for A-1, we very much do need those lower bits. So the point is that either way, 8 bits of precision (or even 7 bits, as demonstrated in the above example) is sufficient for encoding a large range of note frequencies. We just need to be flexible in what these 8 bits represent. Sounds like a use case for a floating point format to you? Well, it sure does to me.
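To make the rounding argument concrete, a quick sanity check in C (nothing engine-specific here):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t divider = 0x7a23;            /* reference divider for A-9 */
    for (int oct = 9; oct > 1; oct--)     /* halve down to A-1 */
        divider >>= 1;
    printf("A-1 = 0x%x\n", divider);      /* 0x7a */
    for (int oct = 1; oct < 9; oct++)     /* double back up to A-9 */
        divider <<= 1;
    printf("A-9 = 0x%x\n", divider);      /* 0x7a00, not 0x7a23 */
    return 0;
}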
However, the problem is that decoding a real floating point format would be even slower than doing a table lookup. So that's no good. Instead, I propose we cheat a little (as so often in 1-bit). We need something that uses 8 bits, is fast to decode, but still gives us some floating-point-like behaviour. Considering an engine using 12-bit dividers, I propose the following 8-bit note data format:
Bit 7 is the "exponent". If it is reset, the remaining bits are assumed to represent bits 1..7 of the actual 12-bit divider, with all other bits being 0. If the "exponent" is set, the remaining bits are assumed to represent bits 3..9 of the actual divider. As discussed, we don't actually care about the highest bit of the divider, and for practical reasons we will ignore the second most significant bit as well. The choice of what bit 7 represents is of course arbitrary; if it suits your implementation better, there are no drawbacks to inverting the meaning whatsoever.
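In C terms, encoding and decoding would look roughly like this (helper names and the encode threshold are my assumptions, not part of the format description above):

#include <stdio.h>
#include <stdint.h>

static uint8_t fp_encode(uint16_t divider)
{
    if (divider < 0x100)                      /* fits in bits 1..7 */
        return (uint8_t)(divider >> 1);       /* "exponent" bit clear */
    return (uint8_t)(0x80 | ((divider >> 3) & 0x7f));  /* bits 3..9 */
}

static uint16_t fp_decode(uint8_t note)
{
    if (note & 0x80)
        return (uint16_t)((note & 0x7f) << 3);  /* bits 3..9 restored */
    return (uint16_t)(note << 1);               /* bits 1..7 restored */
}

int main(void)
{
    /* round trips lose bit 0 (low form) or bits 0..2 (high form) */
    printf("0x0f4 -> 0x%x\n", fp_decode(fp_encode(0x0f4)));  /* 0xf4  */
    printf("0x1e8 -> 0x%x\n", fp_decode(fp_encode(0x1e8)));  /* 0x1e8 */
    return 0;
}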
Now, the nice thing is that we do not need to fully decode our "el cheapo" floating point format during parsing, as it costs just 8 cycles to decode it just-in-time. What's even better though, doing so saves a precious register!
ld a,b          ;B = oscillator accumulator, low byte
add a,c         ;add note value (see below for how C is set up)
ld b,a
sbc a,a         ;A = #ff if the add carried, #00 otherwise
fp_switch
add a,d         ;self-mod: add a,d | xor a,d
ld d,a          ;D = the "extended accumulator"
out (#fe),a     ;bit 4 of the output drives the speaker
Where B is our accumulator, C is our note value (previously left-shifted if it signifies bits 1..7 of the divider, or with bit 7 masked otherwise), and D is the "extended accumulator", which either accumulates the overflow from the 8-bit add (if the note value signifies bits 1..7 of the divider) or serves as the extended bit (if the note value signifies bits 3..9).
Attached to this post, you will find the full source code, an example note table, and a demo. At this point, there are 3 major issues that need to be addressed.
- Middle E is detuned. This can probably be rectified by shifting the note table a little, though this method is bound to produce some slight detune around the point where the "upper" table section starts.
- Parsing is quite ugly atm. I'm sure it can be made more efficient, but I haven't found a good method yet.
- The current way of calculating the output state gets in the way of applying other effects. I believe basic duty control should be possible (via the "phase offset" method), but other things like duty sweeps might be more difficult. My hope is that this instead opens up possibilities for other tricks that I might not have thought of yet.
Well, that's basically all I wanted to share for now. It's probably possible to do something similar for 16-bit dividers, but most likely it won't work just-in-time. Other than that, I'm very curious to hear your thoughts on this. Is it useful at all? Any cool tricks that we can do with this? Any improvements for the implementation? Please let me know.
Ha, nobody expects the Spanish Inqui... I mean this new trick I discovered!
So basically I found a way to approximate a sine wave with just two square waves. This means it's cheaper to implement than what Pytha uses to generate its triangle wave. Quality isn't as good as Pytha (mainly because it seems almost impossible to properly align outputs to 8t multiples), but hey, how about squeezing in a 3rd channel? There are also some sweet overtone tricks you can do with this, as can be heard at the end of this short demo track. The third channel is a regular pulse channel with duty cycle control/sweep. I was hoping I could do 3 sine channels, but I ran out of cycles. Either I'd need an extra register pair, or I'd need to split the frequency counter updates across two loop iterations, which would probably degrade sound quality quite a bit. Overall the technique seems to be less flexible than the Pytha method as well, but I haven't explored it all that much yet.
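One classic construction that would fit this description (a guess on my part, not necessarily what the engine actually does): two square waves at the same frequency, offset by a sixth of a period, so the 3rd and 9th harmonics cancel in the 3-level sum. A host-side C sketch of that idea:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t acc1 = 0, acc2 = 65536 / 6;   /* second square offset by T/6 */
    uint16_t step = 1200;                  /* frequency adder */
    for (int i = 0; i < 64; i++) {
        acc1 += step;
        acc2 += step;
        /* 3-level output 0/1/2; on hardware the two bits would come
           from two outputs or from pulse interleaving */
        printf("%d", (acc1 < 32768) + (acc2 < 32768));
    }
    putchar('\n');
    return 0;
}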
I always enjoy those small tunes you create for your new engines!
Nice sound, it rather sounds like some sample effects.
Thanks mate! Been missing you around here lately, glad you're alive and kicking
I suspect I'm actually rehashing some old idea; I've just been experimenting with the phase shift to control volumes yet again. This snippet produces interesting harmonics as the volume rises very smoothly:
ld hl,0         ;adder 1 accumulator
ld bc,1200      ;frequency
ld ix,0         ;adder 2 accumulator
ld a,0          ;output state
loop1
ld d,0          ;inner loop counter, 256 iterations
loop
add hl,bc
jp nc,$+5       ;skip toggle unless adder 1 overflows
xor 16          ;toggle the speaker bit (inverts the phase)
add ix,bc
jp nc,$+4       ;skip reset unless adder 2 overflows
xor a           ;reset the output to 0
out (#fe),a
inc d
jp nz,loop
inc hl ;slowly increase the volume
jp loop
Volume control here is made of two 16-bit adders that get the same 16-bit value added (3 register pairs per oscillator). The first one inverts the phase, the second one resets it. Volume is controlled by the distance between the initial values of the adders.
Hmmm, isn't this rather a (very fine-grained) duty sweep? Check what happens when you replace inc hl with inc h... I think for phase-shift controlled volume you'd need 2 outputs.
However, there are indeed some interesting harmonics going on here. No idea where they come from, but they do remind me of that time when I unsuccessfully tried to make a 15ch pwm engine: https://bitrotlabel.bandcamp.com/track/pad5
I really wonder where these harmonics come from.
Very interesting sound Shiru. Don't think I've ever heard volume increases that smooth in a 1-bit engine before!
Yeah, it seems this technique has potential that is worth exploring further.
For now, here is the same thing with two channels and slides; it demonstrates tons of harmonics.
Aye, it took a while for ye olde brain. I was fixated on the "volume control" part, but it's really all about the harmonics, right?
In that case, a word of caution: emulators are often misleading in that respect, actual hardware can sound quite different. Generally the 48K tends to be noisier, and harmonics tend to get buried under the noise. Anyway, if you're going to explore further in this direction, I can help with some hw recordings.
The question remains where these harmonics come from in the first place. Is it because at these marginal pulse widths, threshold overflow errors from the frequency calculation become more significant?
When I worked on those multi-core engines with many volume levels, I also noticed another effect. If there are several consecutive frames with high total volume (mimicking a DC offset), some sort of volume ramping will occur, gradually turning rectangular waves into saws. Octode2k16 is perhaps the best example of this. I tried to control this effect, but it was very unreliable. I still wonder though if it can be used somehow.
The super fine volume/duty effect is certainly real and can be used, that's what I was referring to as having potential (will do something in this regard a bit later).
As for the harmonics, yes, I suspect most of it comes from the downsampling filter in emulators; there seems to be some unwanted oscillation at high frequencies. Still not clear if there is more to it.
So if I understand correctly, the following code would be equivalent:
add ix,bc       ;phase accumulator
ld a,ixl
add a,l         ;low bytes first: carry from IXL+L...
ld a,ixh
adc a,h         ;...then high bytes: carry out = carry of IX+HL
sbc a,a         ;A = #ff while IX+HL overflows 16 bits
and #10         ;keep only the speaker bit
out (#fe),a
So as I was just about to fall asleep last night, it occurred to me that it's possible to implement a low-pass filter with variable cut-off, without using multiplication.
Assuming an engine that supports multiple volume levels per channel, the idea is as follows:
current_volume = 0
max_volume_delta = x

loop {
    new_volume = n   // use whatever standard method here
    if abs(current_volume - new_volume) > max_volume_delta {
        if new_volume > current_volume {
            new_volume = current_volume + max_volume_delta
        } else {
            new_volume = current_volume - max_volume_delta
        }
    }
    current_volume = new_volume   // remember the filtered level
    ...   // output sound
}
This could potentially be used to add filter envelope emulation to existing synthesis algorithms such as Phaser. The question though is how to efficiently implement that big fat nested conditional.
It's fairly easy for a saw wave generator, because we can assume that when abs(current_volume - new_volume) > max_volume_delta is true, then new_volume < current_volume is also true in a real-world implementation (in other words, max_volume_delta is never exceeded on the rising edge of the saw). So that's what the attached example demonstrates.
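In C terms, the saw-wave special case boils down to a one-sided clamp. A minimal model with made-up names, not the attached Z80 source:

#include <stdio.h>

static int lp_saw(int current, int target, int max_delta)
{
    /* only the wrap-around drop of the saw can exceed max_delta,
       so the rising edge needs no check */
    if (current - target > max_delta)
        return current - max_delta;   /* slew-limit the falling edge */
    return target;
}

int main(void)
{
    int vol = 0;
    for (int t = 0; t < 256; t++) {
        int saw = (t * 4) & 0xff;     /* raw saw, rising in steps of 4 */
        vol = lp_saw(vol, saw, 8);    /* smaller max_delta = lower cutoff */
        printf("%d\n", vol);          /* rounded-off saw = low-pass effect */
    }
    return 0;
}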
Very interesting sound utz