Topic: Adding vocal track

Hello,
maybe you are aware of this already, but to sum it up: it seems it should be extremely straightforward to add vocal abilities to any 1-bit engine, either by means of microsamples (probably even a 16-bit "bitmask" held in a register and shifted with ADD HL,HL is enough to express the basic in-tune harmonics that distinguish vowels), or by using otherwise unused AY channels and mixing them in additively.
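To make the bitmask idea concrete, here is a minimal Python sketch of a looping 16-bit microsample. It is only a model, not code from any existing engine: the name bitmask_stream is hypothetical, and it assumes the bit shifted out by ADD HL,HL is fed back into bit 0 so the pattern loops (on a real Z80 that would need an extra instruction or a periodic reload).

```python
def bitmask_stream(mask, n_bits):
    """Return n_bits 1-bit output samples from a looping 16-bit pattern."""
    hl = mask & 0xFFFF
    out = []
    for _ in range(n_bits):
        carry = (hl >> 15) & 1             # the bit ADD HL,HL would shift into carry
        hl = ((hl << 1) & 0xFFFF) | carry  # assumption: carry is fed back so the pattern loops
        out.append(carry)                  # this bit would drive the beeper
    return out


# A square-ish pattern with two pulses per 16-bit period.
print(bitmask_stream(0b1111000011110000, 16))
```

Different masks give different harmonic content at the same period, which is what would distinguish one vowel from another.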

Viznut's engine
https://www.youtube.com/watch?v=BFo7IEy92No

Agemixer's Freestyler on SID MOS6581
https://www.youtube.com/watch?v=R5bzqhp1Jqk

The famous speech in U96 songs was actually the STSPEECH.TOS utility on the Atari ST, utilising the AY-3-891x
https://www.youtube.com/watch?v=_dG2zekU2ak
https://www.youtube.com/watch?v=k7FMfy3YgN0
https://www.atari-forum.com/viewtopic.php?t=2254

Kecal, a routine by an unknown author (it's NOT Oldsoft - Zdeněk Starý), pioneered microsamples
https://worldofspectrum.org/archive/sof … -2-oldsoft

And recently I found this; it looks like it uses a sequence of 32 of the Game Boy's 4-bit samples:
https://www.youtube.com/watch?v=8g9AN5PPYQE
https://gbdev.gg8.se/wiki/articles/Game … d_hardware
http://www.devrs.com/gb/files/hosted/GBSOUND.txt

Are you aware of any other routines of this kind? I think this is a very unexplored area...

Zilogat0r

P.S.: UTZ, great posts about engine techniques and computer music history! I'm a biiig fan of what you are doing.

Re: Adding vocal track

At the moment we have plenty of engines with wavetable and sample playing capabilities.

utz did a great range of wavetable engines that use 256-byte looped samples, which is enough to represent vowels; I did a number of engines with sample-based percussion of various kinds.

Just a few days ago I released SquatE, which is very close to your Squeeker (thanks a ton for this thing!), plus it features a sample channel that plays along without interrupting the tone channels. This was done exactly with vocal and pitched instrumental samples in mind. It can be done even better: a song with vocal parts would need good compression, and 1-bit samples seem to compress well enough with basic RLE, so the next step would be an engine with an on-the-fly RLE sample decoder and a larger number of samples (currently the number is limited to 7-15). Perhaps even with a sample offset effect.
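As a rough illustration of why basic RLE suits 1-bit samples, here is a small Python sketch. It is not SquatE's format: the function names and the layout (alternating run lengths starting with a run of zeros, one byte per run) are assumptions made for the example.

```python
def rle_encode(bits, max_run=255):
    """Encode a 1-bit sample as alternating run lengths, starting with a run of 0s."""
    runs = []
    current = 0
    count = 0
    for b in bits:
        if b == current and count < max_run:
            count += 1
        elif b == current:
            # Run is longer than one byte can hold: emit it, then a zero-length
            # run of the opposite value so the alternation stays intact.
            runs.append(count)
            runs.append(0)
            count = 1
        else:
            runs.append(count)
            current = b
            count = 1
    runs.append(count)
    return runs


def rle_decode(runs):
    """Expand alternating run lengths back into bits (first run is 0s)."""
    bits = []
    current = 0
    for r in runs:
        bits.extend([current] * r)
        current ^= 1
    return bits


sample = [1, 1, 0, 0, 0, 1]
print(rle_encode(sample))
```

A decoder of this shape only needs a counter and a toggle of the output bit per run, which is why it looks plausible to run on the fly inside a sound loop.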

As for a real-time vowel synth, I think we've discussed this possibility at some point somewhere around here. This is surely yet to be done.


Re: Adding vocal track

Great thread, I've been interested in trying to add vocal samples to my beeper tunes for ages.
I'll check out the updated Squat engine.

Here are a couple of my tunes in which I've managed to use vocal samples in Phaser 3, although these samples are interrupting.
https://youtu.be/pVvuPjC8W5Y
https://youtu.be/ap5hNtAVsG8

The most impressive example I've seen is Lman's C64 music
https://youtu.be/HnWntppFOuA

Jammer has done some amazing stuff too (samples again though)
https://multistylelabs.bandcamp.com/track/left-right

This Atari ST stuff got my interest too:
https://stumusic.bandcamp.com/album/3chnls4bit

Re: Adding vocal track

I've actually been thinking about making an engine with a dedicated vocal/speech synth track for quite some time now. 256-byte wavetables are indeed well-suited for this. What I'd really want to do though is formant synthesis. Haven't been able to pull it off yet, though.

Re: Adding vocal track

Go for it utz. Would be a brilliant feature for beeper tracks :)

Got me thinking, remember this track on BOTB ?

https://battleofthebits.org/arena/Entry … %21/16668/

Re: Adding vocal track

Shiru wrote:

Just a few days ago I released SquatE, which is very close to your Squeeker (thanks a ton for this thing!), plus it features a sample channel that plays along without interrupting the tone channels. This was done exactly with vocal and pitched instrumental samples in mind.

My pleasure. Btw. is there some demo song showing SquatE's abilities with all the bells and whistles? Those non-interrupting samples, especially, must be great.

Shiru wrote:

and 1-bit samples seem to be compressible with basic RLE well enough, so the next step would be an engine with on-fly RLE sample decoder

Definitely. For background chords in the Amiga minor/major style. But still, I think that vocals are even cheaper to play. If 32 samples are enough on the GB, 16 are probably enough as well... and this way, the looping and separation could probably be trivialised. It's also of little importance whether the voice sounds harsh or distorted by quantisation - one could rather take it as a feature... we all love robotic voices, especially when they are in tune :)

AtariTufty wrote:

Here's a couple of my tunes

Great list. Everything there was new to me, nice to see such huge advances in this technique.

utz wrote:

What I'd really want to do though is formant synthesis

Do you know LPC fundamentals and this marvelous document? https://cnx.org/contents/swFM2W46@5.12: … troduction
It describes the development of the famous 80s Speak & Spell toy, which used the first LPC voice synthesizer chip. I think that while LPC is fully computable on the ZX (you can nicely interleave the processing of each LPC value with the pulse width decrement etc.), microsamples are better suited for this. We really need just a few of them, they are ultrashort, they might be 1-bit, and there's no computational overhead at all...

Re: Adding vocal track

Zilog wrote:

Do you know LPC fundamentals and this marvelous document? https://cnx.org/contents/swFM2W46@5.12: … troduction

Fantastic, just what I need! I briefly looked at LPC before, but was like "meh, too much maths". But in recent months I've been working on my math skills so I might be able to tackle this now.

Shiru wrote:

RLE

If we wanted to go full PCM instead of PWM (3-bit should be possible and give quite decent speech quality), I recently read that dictionary-based audio compression apparently works well for low-res audio, and was used quite a bit in old Amiga demos. So, chop up the audio into blocks of, say, 8 samples, create a dictionary of those, (optionally) massage the dictionary a bit to eliminate similar blocks, and replace the audio stream with pointers into the dictionary. Simple enough to decode imo.
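The dictionary scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the optional step of merging similar blocks is left out, so only exact duplicates are shared.

```python
def dict_compress(samples, block_len=8):
    """Split samples into fixed-size blocks and replace each by a dictionary index."""
    dictionary = []   # unique blocks, in order of first appearance
    index_of = {}     # block -> position in dictionary
    pointers = []     # one dictionary index per block of the stream
    for start in range(0, len(samples), block_len):
        block = tuple(samples[start:start + block_len])
        if block not in index_of:
            index_of[block] = len(dictionary)
            dictionary.append(block)
        pointers.append(index_of[block])
    return dictionary, pointers


def dict_decompress(dictionary, pointers):
    """Rebuild the sample stream by concatenating the referenced blocks."""
    out = []
    for p in pointers:
        out.extend(dictionary[p])
    return out


# 3-bit PCM values (0..7); a looped vowel repeats the same blocks many times.
stream = [0, 1, 2, 3, 4, 5, 6, 7] * 3 + [7] * 8
dictionary, pointers = dict_compress(stream)
print(len(dictionary), pointers)
```

Decoding is just a table lookup plus a copy per block, which fits the "simple enough to decode" remark.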

Regarding microsamples, there is one issue with fixing the length to 256 samples. You essentially end up with 3 different use cases:
- vowels: arbitrary length, must be looped
- k, t, p etc: one-shot samples with fixed length
- s, th, kh: noise, arbitrary length.
The main issue here is the noisy ones: if you use a 256-byte looped sample for them, you cannot control the pitch, because it won't sound like noise at anything but the lowest stepping speed.
So to me it seems like the best approach is to detect the "type" of sound beforehand, and just synthesize the noisy ones, and perhaps the plosives as well. That's how I ended up with "hey, if we can synthesize the vowels as well, then we don't actually need samples".
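For the "just synthesize the noisy ones" part, a 1-bit noise source can be as small as a shift register with feedback. The sketch below is an assumption, not the generator used in any of the engines mentioned: it is a 16-bit Galois LFSR with the commonly cited tap mask 0xB400, and the name noise_bits and the seed are arbitrary.

```python
def noise_bits(n, seed=0xACE1):
    """Return n pseudo-random 1-bit samples from a 16-bit Galois LFSR."""
    lfsr = seed & 0xFFFF
    if lfsr == 0:
        raise ValueError("seed must be non-zero, or the register stays stuck at 0")
    out = []
    for _ in range(n):
        lsb = lfsr & 1
        lfsr >>= 1
        if lsb:
            lfsr ^= 0xB400   # feedback taps (assumed choice for this example)
        out.append(lsb)      # output bit would drive the beeper
    return out


burst = noise_bits(64)
print(sum(burst), "ones in", len(burst), "bits")
```

Because the sequence is generated rather than looped from a short table, a fricative can run for any length without the repetition becoming audible as pitch.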

8 (edited by Zilog 2020-09-21 11:59:45)

Re: Adding vocal track

utz wrote:
Zilog wrote:

Do you know LPC fundamentals and this marvelous document? https://cnx.org/contents/swFM2W46@5.12: … troduction

Fantastic, just what I need! I briefly looked at LPC before, but was like "meh, too much maths". But in recent months I've been working on my math skills so I might be able to tackle this now.

Actually, the math behind it is a bit misleading. In reality, it's dead simple. You just need an array of a few items (either a ring buffer, because then there's no need to copy or shift anything, or an array with fixed indexes that can be hardcoded into the loop), and you compute a weighted sum of the previous items; this sum is the new item.

On its own, such an array would never leave its initial all-zeroes state, so it's also fed with an excitation signal at the desired frequency (spikes, because they already contain the upper harmonics, which are typically suppressed in the predicted output, because it's a weighted average). That's all.
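The description above can be written down as a short Python sketch. The names (lpc_synthesize, pitch_period, gain) and the coefficient values are illustrative assumptions, not taken from the linked document; real coefficients would come from analysing recorded speech.

```python
def lpc_synthesize(coeffs, n_samples, pitch_period, gain=1.0):
    """All-pole synthesis: each output is a weighted sum of previous outputs
    plus a spike excitation once every pitch_period samples."""
    history = [0.0] * len(coeffs)          # previous outputs, most recent first
    out = []
    for n in range(n_samples):
        excitation = gain if n % pitch_period == 0 else 0.0
        y = excitation + sum(a * h for a, h in zip(coeffs, history))
        history = ([y] + history)[:len(coeffs)]   # newest value pushes out the oldest
        out.append(y)
    return out


# One coefficient only, to keep the numbers easy to follow by hand.
print(lpc_synthesize([0.5], 6, pitch_period=4))
# With no excitation the buffer never leaves the all-zero state.
print(lpc_synthesize([0.5], 4, pitch_period=4, gain=0.0))
```

Changing pitch_period changes the pitch of the voice, while the coefficients shape the vowel, so the two can be controlled independently.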

Have a look also here: https://cnx.org/contents/wh_aQ2UJ@20/Sp … -Synthesis

Re: Adding vocal track

Haha, as usual. A giant scary-looking math formula boils down to a rather trivial real-world solution :D
Ok, so this does indeed look like something that can be done on the Spectrum. I'll look into this once I have some free time (end of the year, hopefully). Unless you want to try, of course! It'd be great to see a new engine from you!