taxonomy of early digital synthesizers
Old 3rd April 2013
  #1
Lives for gear
 
acreil's Avatar
 


I've kind of touched on this topic in various threads, but I haven't really presented a clear overall picture, and I think most people have no idea what I'm going on about. I also noticed a considerable amount of confusion in "Won't computers produce a convincing analog sound someday?" (which I didn't see earlier because I avoid such threads), so I think it's worth elaborating a little bit more.

The basic idea is this: certain early digital synthesizers played back low resolution waveforms at jitter-free variable sample rates (derived from a high frequency clock by dividing it by an integer) with inadequate signal reconstruction. The result is that there are audible image frequencies (copies of the harmonic series transposed to higher frequencies). Technically they're not supposed to be there, but the result is a nice "exciter" effect. This is distinct from aliasing, as these are extended high harmonics rather than inharmonic tones.



A discrete time signal is basically an amplitude modulated impulse train (mid left of image) with a periodic spectrum (lower left). The DAC holds a constant value until the next sample (mid center), which filters the periodic spectrum with a sinc function (lower center). To be theoretically proper, the extra higher frequency stuff is supposed to be removed by a reconstruction filter, resulting in a smooth signal (mid right) and correctly reproduced spectrum (lower right). But many vintage digital synths don't do this (or do a poor job of it), which instead results in a bright, gritty sound.
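
To put rough numbers on the sinc weighting described above, here's a minimal Python sketch. It assumes an ideal zero order hold lasting exactly one sample period and a waveform stored at n samples per period (so the sample rate is n times the fundamental). The names zoh_gain and to_db are made up for illustration.

```python
import math

def zoh_gain(k, n):
    """Gain of an ideal zero order hold at harmonic k of the fundamental,
    for a waveform stored at n samples per period (sample rate = n * f0).
    This is |sinc(k/n)|, with nulls at multiples of the sample rate."""
    x = math.pi * k / n
    return 1.0 if x == 0 else abs(math.sin(x) / x)

def to_db(gain):
    """Convert a linear gain to decibels."""
    return 20.0 * math.log10(gain)

# Fundamental and its first two images for a 64 sample sine wave
for k in (1, 63, 65):
    print(k, round(to_db(zoh_gain(k, 64)), 1))
```

With no reconstruction filter at all, the first images of a 64 sample sine (harmonics 63 and 65) come out roughly 36 dB below the fundamental. Fewer samples per period push the images lower in frequency and higher in level.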

Modern sample playback techniques, which use phase accumulators and high quality interpolation, do a reasonably good job of producing the "correct" signal, but this doesn't have a gritty vintage sound because the image frequencies aren't there. The result is that low resolution samples just sound dull. Using drop-sample interpolation (equivalently known as truncation, nearest neighbor interpolation, etc.) retains the image frequencies, but this approach suffers from jitter/aliasing. Reasonable results can be obtained if a high sample rate is used (like oversampling in software), but it's still not as good as the jitter-free divide-by-n case.
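
For reference, this is roughly what I mean by a phase accumulator with drop-sample interpolation. It's a bare-bones Python sketch, not any particular synth's implementation; drop_sample_osc and its parameters are hypothetical names.

```python
import math

def drop_sample_osc(table, freq, sample_rate, num_samples):
    """Fixed-rate phase accumulator with drop-sample (truncating) lookup."""
    n = len(table)
    increment = freq * n / sample_rate   # table steps per output sample
    phase = 0.0
    out = []
    for _ in range(num_samples):
        out.append(table[int(phase)])    # truncate the phase: no interpolation
        phase = (phase + increment) % n  # wrap, keeping the fractional part
    return out

# 64 sample sine table played at 750 Hz with a 48 kHz output rate
table = [math.sin(2 * math.pi * i / 64) for i in range(64)]
out = drop_sample_osc(table, 750.0, 48000.0, 128)
```

At 750 Hz the increment is exactly 1, so every table entry is played once and there's no jitter. At most other frequencies the increment is fractional, entries get repeated or skipped at irregular intervals, and that irregularity is the jitter/aliasing.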

Also, note that the divide-by-n architectures have limited frequency resolution:

For clock frequency Fclk and playback sample rate fp, the pitch resolution in cents is 1200*log((n+1)/n)/log(2) for n = floor(Fclk/fp). For playback frequency f and waveform resolution m (samples per period), fp = f*m, so the pitch resolution becomes worse as the waveform resolution is increased.

So if you're playing back a sample at 50 kHz on an Emulator II, the pitch resolution is 8.63 cents, or the worst case error is 4.32 cents. That's not very good.
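
The Emulator II figure can be checked with a few lines of Python. This just evaluates 1200*log2((n+1)/n) with n = floor(Fclk/fp); the function name is made up for illustration.

```python
import math

def divide_by_n_resolution_cents(f_clk, f_playback):
    """Pitch step in cents between divider values n and n+1,
    where n = floor(f_clk / f_playback)."""
    n = math.floor(f_clk / f_playback)
    return 1200.0 * math.log2((n + 1) / n)

# Emulator II: 10 MHz clock, 50 kHz playback rate, so n = 200
step = divide_by_n_resolution_cents(10e6, 50e3)
print(round(step, 2), round(step / 2, 2))  # prints 8.63 4.32
```

The worst case error is half the step because the nearest available divider value is at most half a step away from the target pitch.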

Good performance for high resolution waveforms at high pitches would require extremely high clock frequencies. This is why some systems (PPG Wave 2 and Synclavier FM) automatically downsample the waveforms for higher octaves (by incrementing the waveform address by 2, 4, 8, etc. samples). This distorts the waveform but ensures that pitch resolution is the same for each octave. Samplers similarly have limited transposition ranges so that several samples must be used to span the keyboard.

Here's a basic overview of the different classes. The examples included all use a zero order hold DAC or drop-sample interpolation:

High frequency VCO (no jitter/aliasing, theoretically sort of the ideal case, but too complicated for polyphonic implementation)

RMI Harmonic Synthesizer (uses Walsh functions; 32 samples per period)
Early digital drum machines (Linn LM-1, Linndrum, Oberheim DMX, Sequential Circuits DrumTraks, Roland TR-909)
Early digital delays
Early pitch shifters (Eventide H910, H949, H969)

High frequency VCO, divide-by-n clock generation (no jitter/aliasing, VCO only used for fine tuning, vibrato and pitch bend, thus pitch control is smooth but limited)

Most divide-down TOS organs and DCO synthesizers (clock frequency typically 500 kHz to 2 MHz)
PPG Wave 2 (128 samples per period; waveforms downsampled by powers of two for higher octaves, so the highest notes should just be square waves)
Gleeman Pentaphonic (32 samples per period)
Emu Emulator I (11 MHz master clocks, independent VCOs for upper and lower keyboard split)

High frequency fixed clock, divide-by-n clock generation (no jitter/aliasing, limited frequency resolution, most examples can be implemented in software with Fs=Fclk and constrained playback frequencies)

Atari 10444D TIA (30 kHz; spectacularly poor frequency resolution)
Atari CO12294 POKEY (64 kHz in the most commonly used mode)
Nintendo 2A03 (1.79 MHz; triangle wave is 32 samples per period)
Wersi MK1, EX20, DX10, EX10R (3 MHz; probably 256 samples per period, multisamples, waveforms downsampled for higher octaves)
Hudson HuC6280 (3.58 MHz; 32 samples per period)
Konami SCC (3.58 MHz; 32 samples per period)
Commodore Amiga (3.58 MHz)
Roland Juno 106 (4 MHz; obviously I'm referring to clock pulse generation only)
OSC OSCar (4 MHz with phase locked loop frequency multiplier; 256 samples per period)
Sequential Circuits Prophet 2000 (6 MHz)
Wersi DX400/DX500 (8 MHz; probably 256 samples per period)
Wersi CD600/CD700/CD800/CD900 (8 MHz; probably 256 samples per period plus sampled attack transients)
Akai S900/S950 (8 MHz; probably also applies to S612, S700/S750 etc.)
Korg DSS-1 (8 MHz; drawn/additive waveforms are 512 samples per period, multisampled for each octave)
Emu Emulator II (10 MHz)
Kurzweil 250 (10 MHz)
Emu Emulator III (20 MHz; includes 2x oversampling)

Fixed rate phase accumulator with drop-sample interpolation (good frequency resolution, basically equivalent to a trivial software implementation)

In this category the sound quality depends a great deal on the sample rate, use of multisamples and waveform resolution. ~128 samples per period results in a desirably gritty sound, but the resulting image frequencies require a high sample rate (~200 kHz minimum) to avoid severe aliasing. Consequently the DW-8000 sounds reasonably clean and not particularly gritty while the Evolver sounds gritty but has a huge amount of aliasing.

Yamaha GS-1 (23 kHz?)
Emu SP-12 (~26 kHz)
Ensoniq Mirage (31.25 kHz)
DK Synergy (32 kHz)
Con Brio ADS 200 (35.2 kHz, 4096 samples per period)
Keytek CTS-2000 (128 samples per period, multisamples)
Ensoniq ESQ-1 (41.666 kHz, uses multisamples)
Dave Smith Instruments Evolver (I think 48 kHz, 128 samples per period)
Yamaha DX7 (49.097 kHz)
Korg DW-8000 (50 kHz; uses multisamples, bandlimited/oversampled tables up to 1024 samples per period)
Allen Digital Computer Organ (MOS series)/RMI Keyboard Computer (83.333 kHz, 32 samples per period)
PPG Wave 2.3 (195.313 kHz, 128 samples per period. I haven't confirmed but this should apply to the 2.2 as well)
Dynacord ADD One (something like 250 kHz)
Sequential Circuits Prophet VS (probably 250 or 500 kHz, 128 samples per period)
Casio Consonant-Vowel and SD keyboards (600 kHz?, 16 samples per period)
Emu Emax (~1 MHz)
MOS 6581 SID (0.985 or 1.023 MHz)

High frequency fixed clock, jittery variable clock generation (jitter is equivalent to or worse than drop-sample interpolation at the same sample rate, pitch resolution is worse)

NED Synclavier FM (401.929 kHz; 256 samples per period, waveforms downsampled by powers of two for higher octaves)
NED Synclavier polyphonic sample playback (effective sample rate ~1.6 MHz)
Fairlight CMI (17.145865 MHz, 128 samples per period. This applies to all models, but the CMI III should have slightly less jitter)

Any corrections or additions to this list would be appreciated. I can possibly also determine sample rate and waveform resolution from submitted test recordings.
Old 3rd April 2013
  #2
Lives for gear
 
wwjd's Avatar
so, the new stuff is better accuracy, but not as (error prone) vintage sounding? k, I'll buy that. But I think it is easier to take a clean signal and dirty it up than a dirty signal and clean it up. A certain specific "sound/color" of vintage digital is simply one of thousands of available FX to be mixed in.
I still have some vintage stuff and notice how "bad" it sounds. Back in the day, it tried its hardest to be as real and accurate sounding as it could - that was the intention of sampling: real live sounds from a keyboard - it fell short of that but was still musically usable.

I don't know where this thread is heading but wanted to reply.
Old 4th April 2013
  #3
Lives for gear
 
Mefistophelees's Avatar
You left out the Amiga.

I think it falls under:
"High frequency fixed clock, divide-by-n clock generation"

4 channels of 8 bit playback with options for AM and FM.

The clock frequency depended on whether it was NTSC or PAL. On later models the max playback rate went up to around 55KHz in VGA mode.

Early on you would have multiple tunes put onto floppies along with a player. 880K was not a lot of room so the sounds would be sampled at 8KHz.

This gave quite a distinctive sound.

For higher quality you could sample up to 28KHz (NTSC: 31KHz) and the standard format IFF-8SVX files supported 1 sample per octave over 5 octaves.
Old 4th April 2013
  #4
Lives for gear
 
fanriffic's Avatar
 

Old 4th April 2013
  #5
What about SNES APU (SPC700) which is related to MOS6502? Apparently it runs at 2.048 MHz and uses that strange BRR audio compression. Listening to its music now on modern equipment, it's very clear that there's loads of aliasing and artefacts. Might be from the BRR compression or its nasty integrated BBD.

Eight stereo channels each with BBD. DAC Playback at 32k.

Also OPL2 YMF262 & YAC512 DAC which runs off a 14.318 MHz clock. Since they're yammies probably it's not too far off from the mighty 6OP DX7.

Nine mono channels of 2-OP FM.

Good topic.
Old 4th April 2013
  #6
Lives for gear
 
loujudson's Avatar
Maybe you can answer a question for me. I'm neither a player nor programmer, just a "simple audio engineer" and don't want to hijack the thread, but a simple question if I may:

I do a lot of radio production and mixing and mastering, and I always monitor the audio in SpectraFoo, which gives me a frequency and level readout in several visual forms. Many many older synth parts contain a significant (if low level, around -51) steady 16kHz tone, not audible most times, but definitely there in a lot of them, on records and CDs, visible as a single narrow tone. What does that come from, a specific type of synth, certain brand, or what? I can post a screenshot if it isn't obvious.

Thanks!

L
Old 4th April 2013
  #7
Lives for gear
 
ben_allison's Avatar
wut?
Old 4th April 2013
  #8
I'd guess it's some sort of power related leakage, DC ripple, rectifier (DC regulator) or switching frequency resonance of some kind. I'm guessing probably somebody else knows.

I can hear up to 17k and if it's quiet enough I can clearly hear most cheapo AC-DC wall-warts switching.
Old 4th April 2013
  #9
Lives for gear
 
wwjd's Avatar
hmmm I too used to hear that in the older CDs. My assumption was digital recording/mastering snafu. I never noticed it was the synths. My very uneducated guess was it was recorded using old (not yet great) A/D, then saved that way, then mastered that way, then played back with the still old (not yet great) D/A converters.
Just my reasoning.
I have very few CDs now, and none of mine had that high pitch. It was on the occasional song on some compilation CDs I no longer have.... but I know the sound you speak of. Could have also been some goofy copy protection scheme where they insert some high frequency above that, that screws up making digital copies? Like they used to do with subsonics and phone tones in the past?

Do you have one of those old CDs laying around that does that? Would be cool if you would rip part of a track to WAV or 320CBR mp3 and let us give a listen
Old 4th April 2013
  #10
Lives for gear
 
acreil's Avatar
 

Quote:
Originally Posted by Mefistophelees View Post
You left out the Amiga.

I think it falls under:
"High frequency fixed clock, divide-by-n clock generation"
You're right. The clock frequency is 3.58 MHz. I had skimmed some documentation previously, but didn't read closely enough to find that.

I'd also like to add the Hudson HuC6280 (TurboGrafx-16). It's divide-by-n, 3.58 MHz, 32 samples per period (edit: so is the Konami SCC chip). But I get errors now when I try to edit the post... (edit: fixed)

Quote:
Originally Posted by lain2097 View Post
What about SNES APU (SPC700) which is related to MOS6502? Apparently it runs at 2.048 MHz and uses that strange BRR audio compression.
I think that uses linear interpolation or something, so I left it out. Tons of stuff uses linear interpolation, and I think it's not really worth writing about.

Quote:
Also OPL2 YMF262 & YAC512 DAC which runs off a 14.318 MHz clock. Since they're yammies probably it's not too far off from the mighty 6OP DX7.
Datasheet says the sample rate is 49.7 kHz. The input clock frequency by itself usually isn't enough to figure out the output sample rate, since it takes some number of clocks to generate each output sample. There's more about the Yamaha OPL chips here: Adlib / OPL2 / YM3812

Quote:
Originally Posted by loujudson View Post
I do a lot of radio production and mixing and mastering, and I always monitor the audio in SpectraFoo, which gives me a frequency and level readout in several visual forms. Many many older synth parts contain a significant (if low level, around -51) steady 16kHz tone, not audible most times, but definitely there in a lot of them, on records and CDs, visible as a single narrow tone. What does that come from, a specific type of synth, certain brand, or what? I can post a screenshot if it isn't obvious.
I dunno
Old 4th April 2013
  #11
Lives for gear
 

The 16khz signal is even present on things completely free of synths, like e.g. movie scores.

It was old CRT (tube) screens, either in the recording room or leaking into the console circuitry in the control room.
Old 4th April 2013
  #12
Lives for gear
 

Great information, acreil! So would the bad frequency resolution be the same (for corresponding tones) across the frequency range? I'll have to measure my old AKAIs now.
Old 4th April 2013
  #13
Quote:
Originally Posted by loujudson View Post
Maybe you can answer a question for me. I'm neither a player nor programmer, just a "simple audio engineer" and don't want to hijack the thread, but a simple question if I may:

I do a lot of radio production and mixing and mastering, and I always monitor the audio in SpectraFoo, which gives me a frequency and level readout in several visual forms. Many many older synth parts contain a significant (if low level, around -51) steady 16kHz tone, not audible most times, but definitely there in a lot of them, on records and CDs, visible as a single narrow tone. What does that come from, a specific type of synth, certain brand, or what? I can post a screenshot if it isn't obvious.

Thanks!

L
Clock noise?
Old 4th April 2013
  #14
Lives for gear
 

Quote:
Originally Posted by jimmyklane View Post
Clock noise?
No.

"CRTs used for television operate with horizontal scanning frequencies of 15,734 Hz (for NTSC systems) or 15,625 Hz (for PAL systems).[56] These frequencies are at the upper range of human hearing and are inaudible to many people; however, some people (especially children) will perceive a high-pitched tone near an operating television CRT.[57] The sound is due to magnetostriction in the magnetic core and periodic movement of windings of the flyback transformer. Compare to the low-frequency noise (50 Hz or 60 Hz) of mains hum."


Cathode ray tube - Wikipedia, the free encyclopedia
Old 4th April 2013
  #15
Lives for gear
 
acreil's Avatar
 

Quote:
Originally Posted by living sounds View Post
Great information, acreil! So would the bad frequency resolution be the same (for corresponding tones) across the frequency range? I'll have to measure my old AKAIs now.
No, it's worse at high frequencies for divide-by-n systems. I'll post some examples when I get some stuff together.
Old 4th April 2013
  #16
Lives for gear
 
ben_allison's Avatar
Quote:
Originally Posted by living sounds View Post
"CRTs used for television operate with horizontal scanning frequencies of 15,734 Hz (for NTSC systems) or 15,625 Hz (for PAL systems).[56] These frequencies are at the upper range of human hearing and are inaudible to many people; however, some people (especially children) will perceive a high-pitched tone near an operating television CRT.[57] The sound is due to magnetostriction in the magnetic core and periodic movement of windings of the flyback transformer. Compare to the low-frequency noise (50 Hz or 60 Hz) of mains hum."
Yeah I was always able to tell if someone in my house was watching TV, when I was out front of the house, because of that almost imperceptibly high whistling.
Old 4th April 2013
  #17
Lives for gear
 
wwjd's Avatar
Quote:
Originally Posted by living sounds View Post
The 16khz signal is even present on things completely free of synths, like e.g. movie scores.

It was old CRT (tube) screens, either in the recording room or leaking into the console circuitry in the control room.
While I can hear that 16k tone and it kinda makes sense, I find it hard to believe professional studio engineers missed that sound being there from the mixing to the final release? Maybe old analog could not reproduce 16k as loud as more accurate digital, but still.... engineers didn't suck that bad. At least not when I was doing studio work in the 80s.
I believe there was something else at play than VERY failed mix/master/release allowing CRT noise in.
Old 4th April 2013
  #18
Lives for gear
 
loujudson's Avatar
Thanks for all the guesses, and again sorry if I'm stealing the thread, but it is exactly 16k, and comes and goes with the mix and occurs even on Lps recorded in radio programs produced from Lps on 7.5 IPS 2 track tape. I'm off to a gig today but will find and upload samples later...

I did a bit of TV in the 70s and that flyback transformer whine drove me out of it.
Old 4th April 2013
  #19
Lives for gear
 

Quote:
Originally Posted by wwjd View Post
While I can hear that 16k tone and it kinda makes sense, I find it hard to believe professional studio engineers missed that sound being there from the mixing to the final release? Maybe old analog could not reproduce 16k as loud as more accurate digital, but still.... engineers didn't suck that bad. At least not when I was doing studio work in the 80s.
I believe there was something else at play than VERY failed mix/master/release allowing CRT noise in.
Check the mixes, it often matches the CRT frequency perfectly. I've found all kinds of stuff buried in mixes, it's often very audible on my high resolution playback system. For example, there is even some weird high frequency artifact that reminds me of Apogee noise shaping in Michael Jackson's "Earth Song", I can hear it easily here and it shows up in the spectrum analyzer. All kinds of cuts and edits are audible in many, many prized recordings. The point is that performance always trumps minor technical inadequacies. The same goes for workflow in a busy, costly professional analog studio environment.

People were also often mixing on Auratones and similar low-range speakers where it wouldn't show up. In the pre-digital era tools to surgically remove artifacts weren't readily available.

And even today removing the artifact usually does more harm than keeping it in. I'd rather have that sinewave than the problems created by a steep notch filter. Anyway, it's not audible most of the time. And even when it is the sinewave so high up in the spectrum doesn't affect the listening experience much IMO, even if you can hear that high and are using a high resolution playback system. As long as it's there in the background at a low level. It can become a problem with punches and edits, here's an example where the artifact becomes quite audible (vocal overdubs, I think):

Tasmin Archer 'Ripped Inside' - YouTube

EDIT: Wrong song, the one with the obvious noise punched in is "Hero" from the same album, couldn't find it online, unfortunately.
Old 4th April 2013
  #20
Lives for gear
 
ben_allison's Avatar
Yeah trying to notch that out is likely to create more problems than it fixes.
Old 4th April 2013
  #21
Oli
Lives for gear
 
Oli's Avatar
 

Quick google search shows examples of CRT frequencies in audio recordings, as well as other foreign tones. Would like to see the 16kHz example.

edit - what exactly is meant by jitter in this context?
Old 4th April 2013
  #22
Lives for gear
 
Starspawn's Avatar
 

What about the ensoniqs?
I remember the EPS16+/ASR had some odd optional ways of sampling, one of them simply claimed it would make the sound better I think, and didn't really explain well what it did :D
Though the regular sampling was explained well with diagrams.
Old 4th April 2013
  #23
Lives for gear
 

I love how the S-950 takes a drum loop and pitches it down 1 or 2 octaves. All these beautiful frequencies come into the greater hearing range in a way that the E-64 or any software sampler just can not do. It may be 12 bit but there is NO aliasing to my ears going on.
Old 4th April 2013
  #24
Lives for gear
 

Quote:
Originally Posted by Captain Proton View Post
I love how the S-950 takes a drum loop and pitches it down 1 or 2 octaves. All these beautiful frequencies come into the greater hearing range in a way that the E-64 or any software sampler just can not do. It may be 12 bit but there is NO aliasing to my ears going on.
That's due to the fact that the S950 does not resample in the digital domain to change pitch. Each of the 8 voices has its own DAC, the pitch is changed by adjusting the clock rate of these DACs. It can sound better than software this way, since high quality sample rate conversion (especially real-time) needs a lot of computing power, and the associated digital filtering is not transparent. The S950 has an analog low pass filter (switched-capacitor) after the DAC chip to get rid of high frequency artifacts a little below Nyquist, which also doubles as an "effect filter".

12 bit has nothing to do with aliasing but is relevant in regards to quantization noise.
Old 4th April 2013
  #25
Lives for gear
 

Checked out my AKAIs, the S950 was off 3 cents worst case in the higher registers. However, maximum transposition range is only 2 octaves.

As predicted the S1000 was pitch perfect.
Old 5th April 2013
  #26
Lives for gear
 
acreil's Avatar
 

Quote:
Originally Posted by Starspawn View Post
What about the ensoniqs?
I've only seen the documentation for the EPS. It's linear interpolation. I think the output sample rate is about 30 kHz with 20 note polyphony (it can be set to 12 or 16 for higher sample rates). Probably the other models are similar.

Quote:
Originally Posted by Oli View Post
edit - what exactly is meant by jitter in this context?
Similar to aliasing. The Fairlight CMI, for example, uses binary rate multipliers (7497) to generate the sample rates. These effectively drop pulses from the incoming clock, introducing jitter. I think it's not appropriate here to call it aliasing when the waveform address is incremented by a constant amount at a jittery sample rate, but the result is more or less the same (spurious sidebands). It's important to note though that I'm referring only to jitter resulting from the sample rate generation process rather than any other non-ideal effects or noise or whatever.

Quote:
Originally Posted by living sounds View Post
As predicted the S1000 was pitch perfect.
The S1000 is supposed to use very good interpolation. But allegedly no one seemed to notice or appreciate it, and they decided to use ****ty linear interpolation in the S2000/S3000.
Old 5th April 2013
  #27
Lives for gear
 
acreil's Avatar
 

I'm going to talk a bit about emulating divide-by-n sample playback since it's free of aliasing. I've attached some sound samples. These are -0 dBFS (LOUD!), so be careful.

Sample #01 plays a sine wave at a range of different resolutions from 1024 down to 2 samples per period. There's no aliasing; all the added high harmonics are images. I think the range of 32 to 256 samples per period is most interesting.

For the next samples I'll use a resolution of 64 samples per period, playing pitch sweeps from C1 to C8 (about 33 to 4186 Hz). For 64 samples per period, you have control over the first 32 harmonics, with the higher harmonics being image frequencies. The fundamental frequency re-appears at harmonics 63, 65, 127, 129, etc. and the 31st harmonic is mirrored at harmonics 33, 95, 97...
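
The image harmonic numbers follow a simple pattern: for a waveform stored at n samples per period, harmonic k reappears at m*n - k and m*n + k for m = 1, 2, 3... Here's a small Python sketch of that (image_harmonics is just an illustrative name):

```python
def image_harmonics(k, n, count):
    """First `count` image harmonic numbers for harmonic k of a waveform
    stored at n samples per period: m*n - k and m*n + k for m = 1, 2, ..."""
    images = []
    m = 1
    while len(images) < count:
        images.extend([m * n - k, m * n + k])
        m += 1
    return images[:count]

print(image_harmonics(1, 64, 4))   # prints [63, 65, 127, 129]
print(image_harmonics(31, 64, 3))  # prints [33, 95, 97]
```

Since all of these are integer multiples of the fundamental, the images stay harmonic no matter what pitch is played, as long as the playback rate itself is jitter-free.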

The sample rate is 48 kHz. That's far lower than any of the good quality systems I mentioned previously, so the defects are exaggerated.

Sample #02 uses drop-sample interpolation. The image frequencies are audible, but there's a large amount of aliasing too. And this is just a sine wave; other waveforms would be much worse.

Sample #03 uses 4 point cubic Lagrange interpolation, more or less representative of modern samplers and software. This isn't a fantastic interpolation algorithm, but it's already enough to mostly eliminate the image frequencies.

Sample #04 simulates a divide-by-n architecture. It's restricting playback frequencies to integer divisions of the sample rate. The frequency resolution is terrible, but there's no aliasing.

I'll follow up with some improvements.
Attached Files

01_sine_resolution.mp3 (393.8 KB, 6295 views)

02_sine_64_drop.mp3 (1.15 MB, 6353 views)

03_sine_64_interp.mp3 (1.15 MB, 6323 views)

04_sine_64_divide.mp3 (1.15 MB, 6317 views)

Old 6th April 2013
  #28
Lives for gear
 

By "fixed HF clock, divide-by-n", you mean that, given an 8 MHz clock and a desired output frequency of 8 KHz, a sample is output every 1000 clock ticks?
Old 6th April 2013
  #29
Lives for gear
 
acreil's Avatar
 

Quote:
Originally Posted by niklasni1 View Post
By "fixed HF clock, divide-by-n", you mean that, given an 8 MHz clock and a desired output frequency of 8 KHz, a sample is output every 1000 clock ticks?
Yes. There's a counter (Intel 8253 or similar) that's set to some value according to the desired playback rate; it counts down by one for each clock pulse. When it reaches zero it sends an output pulse and resets the counter value. So it's a constant increment, variable modulus counter. And that's why the frequency resolution is limited, because the counter modulus can only be 999, 1000, 1001, etc.
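
In software terms the counter behaves something like the sketch below. This is a simplified model for illustration, not the actual 8253 programming interface, and the function names are made up.

```python
def divide_by_n_pulses(n, num_clocks):
    """Clock tick numbers at which a divide-by-n counter fires.
    The counter is loaded with n, counts down by one per clock pulse,
    and emits an output pulse and reloads when it reaches zero."""
    counter = n
    pulses = []
    for tick in range(1, num_clocks + 1):
        counter -= 1
        if counter == 0:
            pulses.append(tick)
            counter = n
    return pulses

def achievable_rates(f_clk, n_values):
    """Output sample rates available for the given counter moduli."""
    return [f_clk / n for n in n_values]

print(divide_by_n_pulses(1000, 3000))
print([round(r, 1) for r in achievable_rates(8e6, [999, 1000, 1001])])
```

With an 8 MHz clock the nearest neighbours of 8 kHz are about 8008 Hz and 7992 Hz. Nothing in between is available, but the pulses are perfectly evenly spaced.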

I can steal a diagram from a Synclavier patent (4108035) to illustrate the other case (phase accumulator or fractional-n divider: variable increment, constant modulus). The Synclavier actually isn't the best example since it uses a really odd design with a variable increment AND variable modulus, but the diagram works well enough if you just pay attention to the "remainder" part and the spacing between the output pulses (7, 6, and 6 clock pulses). An output pulse is generated when the counter overflows, but it keeps a remainder rather than resetting to 0. This means that the generated pulses don't come at a constant rate (i.e. there's jitter). There's an integer number of clock pulses between each output pulse, but the jitter means that the long term average frequency is more accurate than the divide-by-n case. You can compare this to dithering.

The difference between a fractional-n frequency divider (as in the Synclavier) and a phase accumulator (which is much more common) is that the divider uses the clock pulse to advance one sample while the phase accumulator uses the accumulator contents to directly address the waveform.
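
Here's a minimal Python sketch of the variable increment, constant modulus case. The increment of 3 and modulus of 19 are values I picked to reproduce the 7, 6, 6 spacing; they are not the Synclavier's actual numbers, and the function name is hypothetical.

```python
def fractional_n_pulse_spacing(increment, modulus, num_pulses):
    """Clock ticks between successive output pulses of an accumulator that
    adds `increment` on every clock and, on reaching `modulus`, emits a
    pulse and keeps the remainder instead of resetting to zero."""
    acc = 0
    ticks = 0
    spacings = []
    while len(spacings) < num_pulses:
        acc += increment
        ticks += 1
        if acc >= modulus:
            acc -= modulus          # keep the remainder
            spacings.append(ticks)
            ticks = 0
    return spacings

print(fractional_n_pulse_spacing(3, 19, 6))  # prints [7, 6, 6, 7, 6, 6]
```

The long term average spacing is 19/3 clocks, which no single integer divider can give you. The price is that individual pulses arrive up to a clock period early or late.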
Attached Thumbnails
taxonomy of early digital synthesizers-synclavier.png  
Old 6th April 2013
  #30
Lives for gear
 
acreil's Avatar
 

It's clear in samples #02 and #04 that a 48 kHz sample rate isn't nearly high enough for good sound quality (and while #03 sounds ok, it's not what I'm going for). But for divide-by-n playback, how much oversampling would be needed for acceptable pitch resolution? Given that the waveform resolution is 64 samples per period, the playback sample rate at C8 is about 268 kHz. For 1024x oversampling (which is absurdly high; about 49 MHz), the pitch resolution at C8 is about 9.4 cents.

The PPG Wave 2 gets around this by incrementing the waveform index by 2, 4, 8, etc. samples for higher octaves. This introduces harmonic distortion (folding the higher harmonics down on top of the lower harmonics), but improves the frequency resolution for higher octaves. Sample #05 demonstrates this. At the highest frequency there are only 2 samples per period, so the highest required playback rate is only about 8372 Hz. This can achieve the same pitch resolution of 9.4 cents with only 32x oversampling (about 1.5 MHz). But this is still computationally expensive.
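
The octave downsampling can be sketched like this in Python. The 8372 Hz rate limit is chosen to match the numbers above (C8 at 2 samples per period); it's an assumption for illustration and not taken from the PPG's firmware, and the function names are made up.

```python
def octave_stride(freq, table_len=64, max_rate=8372.0):
    """Smallest power-of-two address increment that keeps the playback
    sample rate (freq * table_len / stride) at or below max_rate,
    never going below 2 samples per period."""
    stride = 1
    while freq * table_len / stride > max_rate and stride < table_len // 2:
        stride *= 2
    return stride

def downsampled_table(table, stride):
    """Waveform as actually played: every stride-th table entry."""
    return table[::stride]

print(octave_stride(100.0), octave_stride(4186.0))  # prints 1 32
```

Under these assumptions the full 64 sample table is used up to about C3, and the increment doubles for each octave above that, so the divider value n (and therefore the pitch resolution) covers the same range in every octave.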

Frequency resolution can be improved further if more harmonic distortion is tolerated. Divide-by-n systems constrain playback frequencies such that each playback sample period is an integer number of clock pulses. But actually inharmonic aliasing is still avoided if the total waveform period is constrained to an integer number of samples. In this case, aliasing is reflected back down onto the harmonic series. The reproduced waveform isn't necessarily exactly what it's supposed to be (for instance the duty cycle of a square wave may be distorted, introducing even harmonics), but it's still harmonic. This is illustrated in Sample #06. This actually sounds like the aliasing in Sample #02, but that's only because the frequency is swept. At any fixed frequency there are no inharmonic tones, but the harmonic distortion varies as a function of frequency. It's sort of "quantized aliasing".

The two approaches can be sort of combined so that the frequency resolution is the same for each octave, and harmonic distortion is minimized (sample #07 and corresponding spectrogram). In addition to constraining the playback frequencies such that there are an integer number of samples per waveform period, it's also possible to maintain an integer number of samples per half period, or some other fraction of the total period. This correspondingly reduces frequency resolution, but it's useful as it allows the waveforms in the lower octaves to be more accurate. It's also good for looped samples, as those may have multiple periods in a single loop.

This permits an adjustable tradeoff. If the frequency is constrained to an integer number of samples every two consecutive periods, this adds a subharmonic for the highest octave (visible on the bottom right of the spectrogram but not particularly audible) but doubles the frequency resolution. It's still awful (about 77 cents at worst) for 48 kHz, but it allows the same 9.4 cent resolution to be achieved with only 8x oversampling (384 kHz). That's not too bad.
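
The resolution figures for the integer period constraint can be reproduced with the Python sketch below. It assumes the worst case is at the top of the sweep (C8, taken as 4186 Hz) and that the constraint is an integer number of output samples per `periods` waveform periods; the function name is made up.

```python
import math

def period_constrained_step_cents(sample_rate, freq, periods=1):
    """Pitch step in cents when `periods` consecutive waveform periods
    must span an integer number of output samples."""
    n = math.floor(periods * sample_rate / freq)
    return 1200.0 * math.log2((n + 1) / n)

c8 = 4186.0
print(round(period_constrained_step_cents(48000.0, c8, periods=2)))      # about 77 cents
print(round(period_constrained_step_cents(384000.0, c8, periods=2), 1))  # about 9.4 cents
```

Doubling `periods` has the same effect on the step size as doubling the sample rate, which is where the tradeoff between subharmonics and oversampling comes from.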
Attached Thumbnails
taxonomy of early digital synthesizers-07_spectrogram.jpg  
Attached Files

05_sine_64_downsample.mp3 (1.15 MB, 6223 views)

06_sine_64_integer.mp3 (1.15 MB, 6169 views)

07_sine_64_uniform_integer.mp3 (1.15 MB, 6182 views)

๐Ÿ“ Reply
