Good dither practices, what are yours?
Old 11th February 2019
  #661
Lives for gear
 

Quote:
Originally Posted by chrisj View Post
Thank you. So I'd like to ask, most specifically:

Did you scale the dither by the exponent (so that for any area of sustained mantissa the dither accurately linearizes it)? Or do you mean that you picked a dither level, perhaps for 24 bit, and used that?
This is a good question with possibly a complex answer. I'm sorry if these replies might get long, but I do feel people may be misunderstanding some of it - because it's a pain in the arse to visualise it. It's particularly difficult to visualise it if coming from the most popular 'bitwise' perception of it all.

It's necessary to think of it in terms of numerical values rather than 'bits' to grasp it successfully - because all the samples in a data stream are in fact only numerical values - until they eventually arrive in the real world as signals.

The quick answer is that the dither is added in the float domain - such that the result will produce the least distortion when the signal is truncated to the target 24 or 16bit fixed point domain in the real world.

Here follows the explanation as to why:

In a fixed point data stream the error due to quantisation over the real world whole range is constant and fixed - each value is subject to the same constant and fixed 'lattice' error in the output. It only has significant bits and no exponent scaling. This means that we can completely remove the harmonic error by statistically randomising the boundaries of the representation with dither and achieve complete decorrelation between the signal and the randomised error - because the boundaries do not change their significance in the grand overall scheme of things :-)

But in float representation the error mechanism is different. In this case each value, however large or small, is represented by a constant significant term (the mantissa) at the same precision. E.g. 24 significant bits. This is why it can be scaled up and down - without loss of accuracy. It is a convenient numerical format - but it is not a signal format until it's converted back into the real world. The universe is not a floating point entity :-)

What this means is that we cannot simply dither the mantissa on each value (the accuracy part) because its error contribution changes dynamically depending on where it is in the scaling range. I.e. adding noise to the mantissa will cause the noise to be correlated with the signal - and will therefore be yet another unwanted 'error' when it's converted back into the real world. In concept - adding anything at all to the mantissa voids the whole point of a floating point representation - because it constitutes a random 'error' in the math representation itself.

BTW - this is why adding dither to conversions from 64-bit to 32-bit floating point data is a bad idea and gets you nothing of advantage.

Ok - so clearly floating point data is not ideal for an audio signal representation in the real world - because it can't be made harmonically perfect as a real world signal. So what do we do about it?

Firstly and most luckily, in the real world all signals have a known scale in order to be of any use to us. So in our case we have a flat out full signal range which is known (+/-1.0; 0dBFS), a noise floor which we want to keep as low as possible, and a distortion figure (harmonic inaccuracy) which we want to do our best to minimise. We have a real scale in the real world within which we operate - which we call signal :-)

Ok, so looking at the existing error in say a 32bit float data stream with a 24bit mantissa, all we can say is that in any full level signal which is constrained to our +/-1.0 scaling there will be an element of error at around -140dB down (due to the mantissa length) in the largest sample values, the significance of which will depend on how large each sample is in respect to the overall result.

It's complex, but the overall result for constant flat out signals is around the performance of 24bit fixed point, with everything less than flat out being somewhat (and usefully) better..

Now since we cannot change all the above, the best thing we can do is to minimise the distortion of the conversion to real world signal by dithering that.. Hence the dither on the output of some plugins :-)
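(For anyone who wants to see the shape of that last step in code, here is a minimal sketch of TPDF dither applied in the float domain ahead of truncation to a 24-bit fixed-point output. The +/-1.0 full-scale assumption, the rounding and the clamping are illustrative choices, not any particular product's implementation.)

Code:
// Minimal sketch: TPDF dither added in the float domain before quantising
// to a 24-bit fixed-point output. Scaling/rounding details are assumptions.
#include <cstdint>
#include <cmath>
#include <random>

int32_t ditherTo24Bit(double sample, std::mt19937& rng)
{
    const double scale = 8388608.0;      // 2^23 steps per unit, full scale +/-1.0
    const double lsb   = 1.0 / scale;    // one 24-bit LSB

    // TPDF: sum of two independent rectangular sources of +/-0.5 LSB each.
    std::uniform_real_distribution<double> rect(-0.5, 0.5);
    const double dither = (rect(rng) + rect(rng)) * lsb;

    // Add dither, then quantise to the fixed-point grid.
    long q = std::lround((sample + dither) * scale);

    // Clamp to the legal 24-bit signed range.
    if (q >  8388607) q =  8388607;
    if (q < -8388608) q = -8388608;
    return static_cast<int32_t>(q);
}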


Ok sorry to make this post even longer - but I must comment on the effect of FP within DAWs and how errors may or may not stack up. Some people have asked this:

It's best to imagine that a DAW is basically a socking great math processor, with stuff being passed between processes and results being added up in various proportions and combinations in various orders.
At first sight it all looks like a great bucket of potential errors building up relentlessly, where it all gets worse and worse the more stuff you do with it - until it all sounds like crap - as we would naturally expect in the real world?

However - in every single processing operation (maths function, storage whatever) and in every single data transfer, the accuracy of each and every sample value is passed with the same precision (i.e no loss of accuracy) - however large and small such values are - and however much they are passed around - and however large and convoluted the whole thing becomes.

This means that unless it's busted, there is actually no possibility of further degradation throughout the whole thing - provided everything stays in the host processor until the end!!

Now this is the real power of non-real world number representations like floating point, because absolutely no fixed point math or real world physical system can ever achieve this - ever...!!

If your mix is sounding bad the more stuff you put into it, it may be because the stuff itself is flawed, inappropriate and/or low quality processing. But it sure as hell isn't because of the precision of the system itself..
Old 11th February 2019
  #662
Lives for gear
Quote:
Originally Posted by Paul Frindle View Post
Cut for brevity only.

If your mix is sounding bad the more stuff you put into it, it may be because the stuff itself is flawed, inappropriate and/or low quality processing. But it sure as hell isn't because of the precision of the system itself..
OK I 'think' this info is making at least 50% sense to my very non-mathematical brain. So basically, in a practical application, are you saying that if working in 32-bit floating point and staying fully ITB, there is no need to dither internally (assuming the plugins are working correctly) until the very end at print and monitoring? You would dither the bounced tracks?

If you are in 32-bit floating point but then sending stuff out to analog gear (after changing gain or adding its processing) and summing on a console, you would dither the output of each track because it's destined for the analog real world? Am I following you or way off? If you are in that situation, would it be better to work in a 24-bit session, or is 32-bit floating point the same?

Thanks for your expertise....
Old 11th February 2019
  #663
Lives for gear
 

Quote:
Originally Posted by Moondog007 View Post
OK I 'think' this info is making at least 50% sense to my very non-mathematical brain. So basically, in a practical application, are you saying that if working in 32-bit floating point and staying fully ITB, there is no need to dither internally (assuming the plugins are working correctly) until the very end at print and monitoring? You would dither the bounced tracks?
Yes in essence that is true. Plugins that use dither destined for 24bit fixed point will still pass on a 32bit float result to following processing. In this case the point of the dither is to establish a legal dynamic range (i.e. compressors, limiters and other dynamic processors). So for instance in our DSM the dither noise signal at 24bits establishes the suitable dynamic range - and - produces dither for 24bit fixed point if stuff goes out to the real world.

Quote:
If you are in 32-bit floating point but then sending stuff out to analog gear (after changing gain or adding its processing) and summing on a console, you would dither the output of each track because it's destined for the analog real world? Am I following you or way off? If you are in that situation, would it be better to work in a 24-bit session, or is 32-bit floating point the same?

Thanks for your expertise....
Yes.

IF you go out to the analogue world (or the digital AES/SPDIF world) you will be forcibly decoding the floating point data to fixed point - and all the advantages of floating point are lost. So in every case dither should be required to clobber the truncation distortion of the fixed point destination.
Every time you fly out to external gear and back in again accuracy will be lost. True conformity is only preserved for stuff running in the host processor..

Last edited by Paul Frindle; 11th February 2019 at 11:59 PM..
Old 12th February 2019
  #664
Airwindows
 
chrisj's Avatar
Quote:
Originally Posted by Paul Frindle View Post
Yes in essence that is true. Plugins that use dither destined for 24bit fixed point will still pass on a 32bit float result to following processing. In this case the point of the dither is to establish a legal dynamic range (i.e. compressors, limiters and other dynamic processors). So for instance in our DSM the dither noise signal at 24bits establishes the suitable dynamic range - and - produces dither for 24bit fixed point if stuff goes out to the real world.

Yes.

IF you go out to the analogue world (or the digital AES/SPDIF world) you will be forcibly decoding the floating point data to fixed point - and all the advantages of floating point are lost. So in every case dither should be required to clobber the truncation distortion of the fixed point destination.
Every time you fly out to external gear and back in again accuracy will be lost. True conformity is only preserved for stuff running in the host processor..
I see. You've answered my question (didn't end up being that complicated really) and I'm rather pleased, 'cos I was entirely prepared to assume you'd thought up every thought I had, first and better.

So Bob wasn't quite correct in what he in turn thought. You weren't using scaled dither values according to the exponent: you're using a set value that would correctly dither 24 bit, and failing that it's setting a default noise level. It's a different choice from trying to dither the mantissa for the floating point (since it is in fact a tiny chunk of fixed point, at 256 distinct loudnesses for 32 bit floats, with a sign bit).

Like I said in post #630, I feel that there isn't such a thing as 'self dithering' by simply making the noise louder (I had the experience of effortlessly ABXing bitcrushed pure noise right up until I TPDF dithered that same pure noise and found it impossible to tell the difference any more), and since I can demonstrate behavior that's just like the difference between dithered and undithered (run any test tone through DitherFloat and use the top control to add a huge offset before assigning the result to a float, then subtract the same huge offset again, and you can hear both the truncated and the dithered versions… the plugin is open source and very simple inside), I'm going to say that this claim:

"you can't dither to a floating point number"

no matter who makes it, is subject to re-evaluation.

Dither is a particular application of (ideally) two random sources at the amplitude of one LSB of your fixed point system (which can be used in a highpassing configuration - thanks Paul for that lovely refinement that saves us an extra rand() call) which entirely removes truncation artifacts for that fixed point system.
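(A minimal sketch of that highpassed TPDF idea, assuming the usual trick of differencing successive rectangular random values so only one new draw is needed per sample. This is an illustration, not the Airwindows source.)

Code:
// Sketch of highpassed TPDF dither using one new random draw per sample:
// current rectangular value minus the previous one gives a triangular PDF
// with first-difference (6 dB/oct) highpass shaping. Illustrative only.
#include <random>

class HighpassTPDF
{
public:
    explicit HighpassTPDF(unsigned seed = 1) : rng(seed), prev(0.0) {}

    // Returns dither in units of the target LSB; multiply by the LSB and
    // add to the signal before quantising.
    double next()
    {
        std::uniform_real_distribution<double> rect(-0.5, 0.5);
        const double current = rect(rng);
        const double dither  = current - prev;  // TPDF, highpass shaped
        prev = current;
        return dither;
    }

private:
    std::mt19937 rng;
    double prev;
};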

If you use a nice low frequency sine that will show truncation artifacts as a characteristic buzz on the sine, you can try this yourself with DitherFloat, and see that you can indeed dither to a floating point number. (nobody sensible has ever suggested that you can reliably hear this on a 32 bit float; I think some people are fugitively hearing the loudest, roughly 25 bit outside quarters of the waveform being changed and getting all carried away. My contention is that dither is always better than truncation under all circumstances, whether you hear it that time or not, and I updated my entire library in accordance with what I thought was right, so all the Airwindows plugins now dither to the floating point buss.)

As a useful thought experiment (never mind whether my experience with unwanted DC offsets predisposes me to this thought experiment! ) consider this: you don't have to have an oscillating waveform centered around 0 to have dither. You can have a floating point output that is only a single, straight line that forever increases. That's an output too.

You can use DitherFloat to examine what would happen to your straight line (ever increasing by infinitesimal amounts per sample), and exactly the same principles apply as they do to waveforms centered around floating point 0.0.

With the increasing straight line, you will gradually drift into much larger numbers where the exponent's scaling the mantissa right up. As you do, you get 'stairsteps' from truncation (reconstructed audio output does NOT have stairsteps but this is what happens to data when you truncate to a fixed point format on low frequency events! Then the audio output is a bandlimited version of the 'stairsteps' with finite slew rate limited to the Nyquist theorem's dictates).

And if you used my DitherFloat or any alternate way to apply TPDF dither scaled correctly to the number's exponent, you will get, every step of the way, a linearized output that is the original input plus a background noise. Filter it, and you get the original ever-increasing line again, to an accuracy set by how intensely you filtered out high frequency material. You can only do that with correctly applied dither, but it works on straight lines just as well as on waveforms.

You absolutely can apply floating point dither and it works as dither. The variance in amplitude (which can be so striking as to seem like its own random source) is correlated to the EXPONENT, not the mantissa or the waveform being transmitted. This isn't as good as showing no fluctuation at all, but on the other hand it means your dither (linearizing) noise gets right out of the way for nearly all sample values in musical data, so it's like you get the linearizing effect you want from dither but for free. You don't have to have a set noise floor, you can have a noise floor scaled to what is needed to linearize changes in the mantissa.
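(Here is a rough sketch of what exponent-scaled dither into a 32-bit float can look like: the dither amplitude follows the exponent of each sample so it always sits at the float mantissa's LSB. The frexp/ldexp scaling and the 24-bit mantissa figure are illustrative assumptions; this is not the DitherFloat source.)

Code:
// Sketch of exponent-scaled TPDF dither when narrowing double data to
// 32-bit float. Not the DitherFloat source; scaling is an assumption.
#include <cmath>
#include <random>

float ditherToFloat(double sample, std::mt19937& rng)
{
    if (sample == 0.0) return 0.0f;

    int e = 0;
    std::frexp(sample, &e);                      // sample = m * 2^e, 0.5 <= |m| < 1

    const double lsb = std::ldexp(1.0, e - 24);  // one LSB of a 24-bit mantissa here

    std::uniform_real_distribution<double> rect(-0.5, 0.5);
    const double dither = (rect(rng) + rect(rng)) * lsb;   // TPDF, +/-1 LSB peak

    // The cast to float performs the quantisation being dithered.
    return static_cast<float>(sample + dither);
}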

There's a demo, and any low frequency sine run through that demo will tell the whole story. The source is open. In fact the strictly dither-related part is public domain (Unlicense). Honest, Paul, it really does work, try it.
Old 12th February 2019
  #665
Motown legend
 
Bob Olhsson's Avatar
 

Thanks, Paul and Chris for the clarification.
Old 12th February 2019
  #666
Lives for gear
 

Quote:
Originally Posted by chrisj View Post
Cut for brevity only.

There's a demo, and any low frequency sine run through that demo will tell the whole story. The source is open. In fact the strictly dither-related part is public domain (Unlicense). Honest, Paul, it really does work, try it.
Yes - one can do that, but the dither is then modulated by the signal.. This isn't something I would want to adopt to be honest.

It can appear to be stable if you simply add a large DC offset to the signal in order to force a change in the exponent.

How do you feel this is advantageous within a normal signal range up to +/-1 flat out, where the greatest peak errors are -140dB down?
Old 12th February 2019
  #667
Airwindows
 
chrisj's Avatar
Quote:
Originally Posted by Paul Frindle View Post
Yes - one can do that, but the dither is then modulated by the signal.. This isn't something I would want to adopt to be honest.

It can appear to be stable if you simply add a large DC offset to the signal in order to force a change in the exponent.

How do you feel this is advantageous within a normal signal range up to +/-1 flat out, where the greatest peak errors are -140dB down?
I feel that the exponent is constantly changing. Every 6dB you go to a finer (or coarser) granularity on the mantissa, by design. So I don't want a loud noise component when the mantissa is actually more than fine enough to handle very subtle changes (far, far more than -140dB down), but at times when the peak errors are much higher in level relative to +/-1 (which is generally what we're dealing with) I do want that stuff dithered. I would say the quantization noise is also modulated by the signal (because it's scaled by the exponent, just like the dither) and so the only correct answer is to dither scaled to the exponent.

It's there to entirely remove quantization noise. I can do that (without a fixed noise floor, even). In practical terms I'm not concerned with variations in this noise level because (a) it does work to remove truncation artifacts and linearize signals, particularly low frequency signals that spend continuous time near +1 or -1 as they contain sine-like components, and (b) when scaling the dither to match the exponent, it stays more than -140dB below the signal in ALL conditions. If you compress up the faintest quietnesses, they will be dithered but the dither will always be that much quieter than the signal. If you get a +6dB noise that suddenly produces higher dither levels (and the way I've got it, it retains the scaling for the subsequent highpassed TPDF component, so it always highpasses optimally) then the sound you made will tend to desensitize your ears and mask the dithering noise just when it's become significantly louder.

That part would be emphatically asserted by the lossy-compression guys, who'll insist that none of us can hear any of this and nobody should dither at all. I just like handling wordlength differently and get a sound accordingly.

So the answer is, I'm not concerned whether it's a stable noise. The quantization noise itself is also characteristic, and that isn't stable either. It'll generate known artifacts (folks like me and Bob Olhsson peculiarly hate truncation artifacts, 'stairsteps' on low frequency phenomena that aren't supposed to be there and are periodic) and at floating point these artifacts aren't at stable levels at all.

Since I can successfully and exactly remove those truncation artifacts (even when they're at all different amplitudes) I choose to do so, and consider it an improvement over not removing the artifacts. I feel that over the long run it will end up benefitting the sound to make that choice. Otherwise you are either progressively incurring quantization artifacts on the raw audio content, or incurring quantization artifacts on an artificial noise floor (and in my experience I have heard quantization on raw noise, unless it was dithered). I would prefer (do prefer: I've acted on this) to dither away the artifacts at whatever level they present themselves.
Old 12th February 2019
  #668
Very interesting to hear Paul Frindle's take on this. It's entirely consistent with the best understanding at the time he was making those choices for Sony.

Paul's approach does make some assumptions about the eventual output scaling. These assumptions were reasonably safe at the time he made them, but I don't think they are safe now that some audio data "escapes" in one or another 32-bit form. Today we need to leave output dithering to whoever is doing the actual conversion to fixed point. That may still be us, if we're monitoring at 24-bit or burning a 16-bit CD reference. But when sending floating-point files to a mastering engineer (mine takes them), don't dither them to 24 bits. Nor do I think it's safe to assume that all software uses 1.0 for full scale. It's the most common choice, but I remember encountering audio software that used a different full-scale reference. Folks writing file interchange code had to account for this.

I tend to agree with Chris's argument that the signal-noise correlation introduced by exponentially-scaled dither is too far below the instantaneous signal level to warrant concern. But if that's true, then nothing is gained by using TPDF dither in the exponentially-scaled case -- rectangular PDF dither should be adequate.

Last edited by David Rick; 12th February 2019 at 03:01 AM.. Reason: add: not all code uses 1.0 full scale
Old 12th February 2019
  #669
Airwindows
 
chrisj's Avatar
Quote:
Originally Posted by David Rick View Post
I tend to agree with Chris's argument that the signal-noise correlation introduced by exponentially-scaled dither is too far below the instantaneous signal level to warrant concern. But if that's true, then nothing is gained by using TPDF dither in the exponentially-scaled case -- rectangular PDF dither should be adequate.
That's a really good point, actually. The second moment of dithering from TPDF is all about preventing the audio data from modulating the noise level, and if the noise level's going to be modulating for entirely separate reasons…

I have a feeling that TPDF brings more than that… that the reason it doesn't modulate on audio data is that it's doing a better job of producing linearity, something that's still applicable.

The real reason I'm not going 'awk!' and revising 150 plugins again, though, is that if I choose TPDF I can get a very tidy little highpassed dither going on. But I think you're right and I'd overlooked that: much of the reason for choosing TPDF isn't relevant in that case. Got me there.

In the event that I do another mega update due to finding a much faster rand() to use (a planned improvement) I'll have to look hard at using flat dither, because your observation is incredibly relevant.
Old 12th February 2019
  #670
Lives for gear
 

Quote:
Originally Posted by chrisj View Post
Cut for brevity.

Since I can successfully and exactly remove those truncation artifacts (even when they're at all different amplitudes) I choose to do so, and consider it an improvement over not removing the artifacts. I feel that over the long run it will end up benefitting the sound to make that choice. Otherwise you are either progressively incurring quantization artifacts on the raw audio content, or incurring quantization artifacts on an artificial noise floor (and in my experience I have heard quantization on raw noise, unless it was dithered). I would prefer (do prefer: I've acted on this) to dither away the artifacts at whatever level they present themselves.
You must of course do what you think is best :-)

But for me I would definitely not interfere with the floating point math concept and action - until it gets converted to fixed point - for all the reasons in my post, not least of all the last comment about accuracy throughout the whole DAW.

So now I have to go away and ask myself what effect this may have on our own plugin product - if you guys start spraying random inconsistencies around the place within your DAW - because you reckon you can hear truncation errors on full scale signals which are smaller than -140dB below the programme? Another day - another challenge, it keeps us on our toes :-)

Thanks for the heads up.. :-)
Old 12th February 2019
  #671
Airwindows
 
chrisj's Avatar
Quote:
Originally Posted by Paul Frindle View Post
But in float representation the error mechanism is different. In this case each value however large or small is represented by constant significant term (the mantissa) at the same precision. E.g. 24 significant bits. This is why it can be scaled up and down - without loss of accuracy. It is a convenient numerical format - but it is not a signal format until it's converted back into the real world. The universe is not a floating point entity :-)

What this means is that we cannot simply dither the mantissa on each value (the accuracy part) because its error contribution changes dynamically depending on where it is in the scaling range. I.e. adding noise to the mantissa will cause the noise to be correlated with the signal - and will therefore be yet another unwanted 'error' when its converted back into the real world. In concept - adding anything at all to the mantissa voids the whole point of a floating point representation - because it constitutes a random 'error' in the math representation itself.
In what way does this not describe fixed point itself?

I'm not going to worry about the objections of stinkyfingers: this is the full explanation I got, and you can decide for yourself if 'the residual of dither is louder, therefore it is wrong' is a good argument.
Quote:
Dither float is too loud, bright, and dirty.
I will post some proof(s) when I get my internet back...

*when compared to Reaper rendering a 64 bf to 32 bf track/file, Reaper’s “noise floor” is lower and cleaner than DF. (by at least 1 bit, often more)

**this is very easy to test/prove for anyone, just do a null test with 64/32 bf files. Measure/listen to residual.
That's from the last page of the Ultimate Plugin Analysis thread, and I'm glad I found it because stinky wasn't the only one in that discussion. I'd said "Dither is always louder than truncation, but truncation is strongly patterned (even on floating point mantissas) and dither is not. Try with a sine" and sleepcircle did just that and posted a comparison of residuals. I think this will be of interest to you, Paul, in your apparent worries that my dithering will break and hurt your plugins and the world (I simply don't get why you went there, in a world where other vendors are putting out noise generators built into their plugins for 'analog simulation' purposes that can't be switched off - thinking of Slate more than Sony there).

This is a hugely amplified wave file of residuals only, comparing Reaper's raw truncation to 32 bit float with the output of DitherFloat run with no offset. That's what sleepcircle did. It's significant to the extent that each truncation to 32 bit is significant (i.e. not much: you were saying there was no significance at all hence any alteration to the floating point was undesirable).

I find it's a mixed bag and could perhaps stand improvement, such as with David Rick's observations: the 30 Hz sine test (a very real-world type of audio) produces extremely objectionable artifacts when truncated, characteristic of fixed point truncation. The DitherFloat version is plainly more noiselike, though the modulation is apparent (I don't know if I would behave like it was wrecking the joint, considering that the truncated version is a comparable loudness and way more annoying). Then, a sine sweep interacts with the highpass in DitherFloat, causing undesirable birdies compared to the cleaner behavior of the raw truncation: again, this is amplified residuals and I'd contrast the behavior with the behavior on the 30 Hz tone.

Stinkyfingers did post a picture explaining his emphatic conclusion: he's made a special test tone of a pulse wave constructed out of Fibonacci sequences, to place more activity at lower bits in a patterned way. Sleepcircle added,
Quote:
EDIT: Yep, looks like f144_64.wav results in nearly the exact same situation with 24-bit truncation vs a very very standard 24-bit TPDF dither. Partly broken up by noise but not completely gone. I'm going to try 1664hz.wav next.

EDIT: Yup. 1664hz.wav gives a regular 24-bit TPDF dither the same problem, and also results in strangely dither-like noisy residuals with pure 24-bit truncation. Looks like it's not so much a problem with 32-bit floating point dither as it is a problem with any kind of dither + those specific tones.
The picture of Stinkyfingers' special tone contrasting 32 bit truncation with DitherFloat is also available: his conclusion is that because highpass dithering this Fibonacci tone causes the highs to be louder, therefore DitherFloat is broken forever, and so he's been posting 'I tested this extensively and it's totally broken and doesn't work at all', more than once, without showing his work. To which I would respond, if you like all artifacts 1kHz and lower to be 6 to 12 dB louder than the dithered version as well as periodic according to your own posted picture, that's your privilege, but I would prefer not to see those low frequency spikes that clearly illustrate objectionable digital distortions: stuff that floating point dither imperfectly but completely removes. (imperfectly, in that these louder truncation distortions are replaced by noise just as they are in fixed point/TPDF, but it's not at all uniform noise and that is a handicap)

On the other hand, by design the less-than-ideal noise is also going to be far quieter than fixed-point dithering noise literally all of the time for signals not exceeding +/-1.0, so calling out the fire brigade seems excessive.

Everyone is still advocating dithering to 16 bit for CD output. Some (such as myself and Bob Olhsson and, I think, more than a few others) are happy to generalize that to all fixed point formats including 24 bit. Some think self-dithering is a thing, others do not… and some are really put out by the notion of dithering to 32 bit, but I would have you remember that it is directly comparable (for instance with residual tests) to the loudness of 32 bit float truncation, so if you're not worried about the one and consider it insignificant or absent, you've got no grounds for worrying at all about what dithering would do to it. It's all at very similar amplitudes, so the extent to which you're even worried about what I'm doing is the extent to which you're acknowledging the existing truncation is a thing. Individual instances of this can't be observed by listening. If it builds up to become a problem, so will the truncation and that's just what some of us are saying.
Attached Thumbnails: 796810d1549106806-lets-do-ultimate-plugin-analysis-thread-er.gif
Attached Files: 32bit residual comparison.mp3 (3.19 MB, 1014 views)


Last edited by chrisj; 12th February 2019 at 12:01 PM..
Old 12th February 2019
  #672
Lives for gear
 
stinkyfingers's Avatar
 



whatever...it's no surprise you don't understand. if you did, then you would...
Old 12th February 2019
  #673
Gear Guru
 
Karloff70's Avatar
 

Wow, this is getting even better again. Thank you Paul for sharing your knowledge here so comprehensively!!

I had a feeling that what would come back after asking the opening question would probably make me change my habits but this thread went off and grew massive legs. More learning than I have done on gs for ages. Thanks all you geniuses for spelling out the underlying dithering realities which it turns out rather fundamentally affect the whole digital audio experience and its real world outcomes.
Old 12th February 2019
  #674
Gear Maniac
 

I made a little experiment today.

In Reaper I set up a tone generator that outputs 64-bit floating point. I set the lowest amplitude I could in the generator to see what happens towards the LSB.
I ran that through a Limiter No6 instance with all the modules bypassed to act as a 32-bit FP junction.

I then tried with and without dither before the plugin, and also without anything.

Here are some pics:
one without anything prior to the Limiter No6 plug,
one with Reaper's 32-bit TPDF,
and one with DitherFloat.

I also did the same with higher amplitude from the generator. The results were similar, but with lower IMD around the fundamental.
Attached Thumbnails: low64bfp-32bfp.jpg, low64bfp-reaper-32bfp.jpg, lowg4bfp-ditherfloat-32bfp.jpg

Last edited by 5.333V; 12th February 2019 at 08:28 PM..
Old 12th February 2019
  #675
Airwindows
 
chrisj's Avatar
Quote:
Originally Posted by 5.333V View Post
i made a little experiment today.
Seems right. It dithers away the truncation artifacts (shown in picture 1 as that pseudo-noisefloor made up entirely of artifacts) while producing a few quirks from amplitude shifts: pretty sure those come from the transitions between exponent levels and can't be removed. Completely removes sidebands around the test tone. It looks to me like your pictures depict (a) truncation artifacts, (b) fixed point dither, (c) floating point dither. It also looks like on the whole the FP dithered output is quieter and evener than the truncation behavior, and both are way quieter than the fixed point dither.

Also seems like if you do incur transitions between exponent levels those will turn into harmonic (not inharmonic) distortion, as the remaining stuff is distinctly harmonically related to the test tone. I'd bet you that the distribution of these harmonics has everything to do with the amplitude of the test sine, and that they stay harmonically related as you change the pitch of the tone.

Checks out. I'd call that a successful experiment
Old 12th February 2019
  #676
Gear Maniac
 

Yes, you are right on every point.

The distortions in the fixed TPDF are like 70-80dB higher than DitherFloat.
And as you say, the distortions are pure harmonics, exclusively even harmonics. It's so low that it's ridiculous to compare, but I would trade IMD for harmonic distortion any day!
The scale in my pics doesn't mean much because I have gained a lot with BitShiftGain after the Limiter No6, just so I could get a reading.

DitherFloat does a good job as 24-bit dither prior to 24-bit truncation as well.

I haven't made any listening tests yet though..

Left and right are uncorrelated as well, just like we want it. Ain't that right, Bob?
Old 12th February 2019
  #677
Airwindows
 
chrisj's Avatar
Quote:
Originally Posted by 5.333V View Post
DitherFloat does a good job as 24-bit dither prior to 24-bit truncation as well.
Uh nope, it doesn't do that at all
Old 13th February 2019
  #678
Gear Maniac
 

Maybe I'm interpreting this the wrong way then.

I did a similar test, but this time I truncated that 64-bit FP sine wave directly to 24-bit fixed.

The first pic is direct truncation,
next is with DitherFloat,
and last I did a reference with Reaper dither.
Attached Thumbnails: 64bfp-24bfxd.jpg, 64bfp-ditherfloat-24bfxd.jpg, 64bfp-reaper-24bfxd.jpg
Old 13th February 2019
  #679
Gear Nut
 

Quote:
Originally Posted by stinkyfingers View Post


whatever...it's no surprise you don't understand. if you did, then you would...
Surely, everyone on every side of a debate always thinks that about the other side.
Old 13th February 2019
  #680
Gear Maniac
 

I've read this whole thread and it's full of incredible knowledge.

We could simplify it by adding details -
1) origin raw file resolution and sampling rate.
2) DAW mixer behind-the-scenes dithering (meaning just playback) with or without moving a knob.
3) track bounce dithering (same raw file res)
4) track export dithering (same raw file res)
5) actual truncating of the raw file on track bounce/export.

All I can add is this bit from a review in SOS.
When Samplitude made a change back in version 6.0 I heard degradation on playback (not track bounce) and gave them a call. Here is what happened.

"I think the big thing about the sound quality is to make no mistakes. You must not do mistakes in the DSP. It's a big goal, and a lot of errors and not-clever routines are done by a lot of parties on the market, and people who are trained to hear audio will discover these immediately. Six or seven years ago, we had a patch for a new Samplitude version, and one day an American guy called us and said 'Hey, you did something wrong in your program. It sounds bad now.' We measured, and did tests, and after a long time we found out that in the 24th bit of the audio in going from floating-point arithmetic that we do internally down to the sample level through a 24-bit converter, we forgot the dithering. I personally could not hear this, to be honest — but you can measure it, and in a program as huge as Samplitude, you have a thousand points where you can make a mistake of this sort.”

I think it's program dependent and since dithering is hard to detect most code writers just skimp on it. This is why some say Samplitude resolves/feels better.

btw - when they kept raising the price I moved over to Reaper and it sounds more grainy (rock'n roll) than Samplitude. Also Reaper doesn't stack as well on high track counts (I'm talking playback - no knob turning).
Old 13th February 2019
  #681
Gear Nut
 

Quote:
Originally Posted by Quantumphysics View Post
btw - when they kept raising the price I moved over to Reaper and it's sounds more grainy (rock'n roll) than Samplitude. Also Reaper doesn't stack as well on high track counts (I'm talking playback - no knob turning)
What do you mean 'doesn't stack as well'--is there something I could be doing to avoid the problem? Some best practice or another?


It's peculiar because Samplitude only does 32-bit floating point mixing, and Reaper does its mixing in a full 64 bits.
Old 4 weeks ago
  #682
Motown legend
 
Bob Olhsson's Avatar
 

I understand Samplitude has always done 80-bit mixing or whatever the limit of the CPU is. It supports 64-bit float files too.
Old 4 weeks ago
  #683
Gear Nut
 

Are you quite sure? I searched 'samplitude 80-bit' and all I found was stuff like this 'Why Sequoia is a better DAW than most' thread, where they were told by Magix that it was 32-bit/64-bit, and various people saying that other people said that it was 80-bit, or that they heard that it was, but no actual documentation anywhere.
Old 4 weeks ago
  #684
Lives for gear
 

Quote:
Originally Posted by Bob Olhsson View Post
I understand Samplitude has always done 80-bit mixing or whatever the limit of the CPU is. It supports 64-bit float files too.
Hi Bob... I don't think the accuracy of the host mixers is in question to be honest? Anyone making a DAW in the host processing domain would have to go to a hell of a lot of trouble to get it to create distortion..

We do some double precision stuff within plugins when needed, but the host is more than likely to be passing 32bit float values.

I don't want to get into this too deeply - as there are still some lingering misunderstandings about what floating point and dither actually are? Floating point data and calculations are not at all like fixed point.

I did some quick tests on Reaper, which seems to be passing 32bit float values throughout.

1) I generated a 100Hz tone at just less than flat out.
2) I then passed this through 3 instances of the Oxford EQ with 3 bands on each (9 bands in total), notching out the 100Hz signal to -180dB.
3) I then bumped up the result by +120dB so that the artefacts could be seen on the Bluecat FFT plugin. Which means that the bottom line on the FFT plot is equivalent to -240dB
4) Where I needed to run plots with 24bit dither, I simply used our DSM to add the dither. So those plots are with the DSM processing as well.
5) I then ran some plots included here.

From the first plot without dither we can see that the artefacts are all less than -180dB below the signal - even though the signal chain has passed signals at entirely different levels and there is a lot of processing going on. I would posit that this is indeed good enough for any musical signal - and the chances of hearing any of this whilst listening to any flat out signal are zero..

The second plot shows this with 24bit dither added by passing it through the DSM plugin. We can see that harmonic artefacts are still below -180dB, although of course the signal now has the dither noise on it. The sum total of the extra DSM processing hasn't caused any significant excess harmonic distortion.

The 3rd and 4th plots show the signal present on the master after the mixer (summing), and we can see that it's similar with harmonics still around the -180dB level with the DSM dither out or in.

The 5th plot shows what happens if the generator signal is reduced to -40dB. The errors have also dropped by the same amount and are coming out at around -220dB. This is because the floating point data and calculation presents every single sample value at the same precision - however large or small it happens to be. This means that your low level signals are treated exactly the same as your high levels. Your piccolo playing solo in the orchestra will have as much precision as the orchestra playing full pelt - and there will be no raggy reverb tails.

The 6th plot shows the same thing with 24bit dither. Essentially all harmonic errors are below the noise floor of the dither.

As we can all appreciate, this is definitely NOT what happens with fixed point, where the proportion of error increases as the signal level decreases.

Additionally and crucially; since the whole darned DAW is floating point - there will be no additional harmonic degradation however many channels are mixed together and however many busses you have running here there and everywhere, provided that you stay in the box. :-)

Now this kind of performance is streets better than the OXF-F3 console we made, which (sadly) had a 32bit fixed point processor. Just one single EQ band with a narrow cut -20dB at 100Hz came out at about -98dB!
It is many orders of magnitude better than any analogue gear could ever reach.
And all this with 3 decent full EQ plugins and one FFT based compressor in the signal path!

So I would like to suggest in all honesty and with great respect to everyone, that all this hoo-ha involving dither and floating point is a waste of time and a worrying distraction for people trying to relax, be creative and get on with their art.. Art is what it's all about - that's the only reason we do any of this.

We do not even actually have a harmonic distortion or a dither problem at all.. :-)
Attached Thumbnails: float-error-clean.jpg, float-error-24bit-dither.jpg, float-error-24bit-dither-master.jpg, float-error-clean-master.jpg, float-error-clean-40db.jpg, float-error-clean-40db-24bit-dither.jpg

Last edited by Paul Frindle; 4 weeks ago at 08:19 PM..
Old 4 weeks ago
  #685
The rest of the story

Quote:
Originally Posted by Quantumphysics View Post
All I can add is this bit from a review in SOS.
When Samplitude made a change back in version 6.0 I heard degradation on playback (not track bounce) and gave them a call. Here is what happened.

"...one day an American guy called us and said 'Hey, you did something wrong in your program. It sounds bad now.' We measured, and did tests, and after a long time we found out that in the 24th bit of the audio in going from floating-point arithmetic that we do internally down to the sample level through a 24-bit converter, we forgot the dithering. I personally could not hear this, to be honest — but you can measure it, and in a program as huge as Samplitude, you have a thousand points where you can make a mistake of this sort.”
That was a rather famous event in Samplitude/Sequoia history. One of the complaints about sound quality came from a recording client of mine. Third-party analysis tools weren't widely available in those days, so I had to write my own scripts in MATLAB to prove what was happening. It took a while because MATLAB's wave file import functions weren't bit-transparent in those days (they are now). I found some import/export code written by a researcher at a university, modified it a bit, and tested it to be correct. Then I set about doing performance comparisons between the new Samplitude release and the previous one. There was a clear difference between the error spectra. To prove what was causing it, I wrote my own truncation and dithering scripts. After I posted my results to the support forum, the Samplitude folks jumped on the problem and fixed it in a maintenance release. That was more than 15 years ago. BTW, even with the tools I built back then, I could clearly see a performance difference between 32-bit and 64-bit floating point.

So now you know the rest of the story. My reason for diving so deeply into quantization and dither theory was to keep an important client happy. The album in question is still making him money on Spotify. Keep this in mind the next time someone tells you that worrying about this stuff is a waste of time.

David L. Rick
Seventh String Recording
Old 4 weeks ago
  #686
Motown legend
 
Bob Olhsson's Avatar
 

Quote:
Originally Posted by SleepCircle View Post
are you quite sure? i searched 'samplitude 80-bit' and all i found was stuff like this Why Sequoia is a better DAW than most where they were told by magix that it was 32-bit/64-bit, and various people saying that other people said that it was 80-bit, or that they heard that it was, but no actual documentation anywhere
One of their developers mentioned it in their forum some years ago. It was never advertised.
Old 4 weeks ago
  #687
Gear Nut
 

Oh, thank you, Paul, for checking! I'm certain Reaper does 64-bit stuff, though - if only because I've used Stillwell Audio's Bitter before, to check an unrelated thing, and it registered all 64 bits as being active. In addition, you can pick the bit depth that you want Reaper to mix in, and it goes up to 64. Maybe Blue Cat's FreqAnalyst doesn't actually receive 64-bit data? I know that MeldaAnalyzer doesn't.

I guess I'm not too worried, though. I take an interest in getting the best fidelity I can, but I also realize that in the end it's the music itself that'll make or break the album.

One of my favourite and most personally memorable songs was from Shinobi III, on the SEGA Genesis.

YouTube

Nobody can tell ME that's hi-fi, but I still love it.
Old 4 weeks ago
  #688
Airwindows
 
chrisj's Avatar
Quote:
Originally Posted by Paul Frindle View Post
The 5th plot shows what happens if the generator signal is reduced to -40dB. The errors have also dropped by the same amount and are coming out at around -220dB. This is because the floating point data and calculation presents every single sample value at the same precision - however large or small it happens to be. This means that your low level signals are treated exactly the same as your high levels. Your piccolo playing solo in the orchestra will have as much precision as the orchestra playing full pelt - and there will be no raggy reverb tails.
Indeed, quite the opposite.

Floating point means your piccolo, and your reverb tails, get MORE precision than the top 6dB of your recording, by a huge margin. Every single sample value gets more precision the closer it is to 0, and the quiet stuff like piccolos and reverb tails by definition are closer to silence. You can never lose out in floating point by going more quiet: since the listener experience is always grounded in a playback system of -1.0 to 1.0, we experience this heightening of precision as samples get quieter, as the perception of infinite resolution. Even compression won't change that: you can only bring up samples to the listening level at which point they'll still have the same precision they'd have had if they started out loud.

Floating point is real good at 'quiet', and reverb tails.

Same is true of zero-centered stuff blowing way past clipping: turn them down and they have the precision again. The reason my offset trick works is that it's adding DC and then casting to 32 bit float and subtracting the DC again. In normal experience you wouldn't have this and nothing you could do would really change this apparently infinite resolution.

On the other hand, the outer part of your -1.0 to 1.0 is always 25 bit fixed point. And we already know that if you repeatedly truncate to fixed point, there's a cost. The floating point number is like a seashell of mantissa precisions, ever expanding by 2X each time the range it covers expands. From 1.0 back down to 0 you have 25 bit, 26, 27, 28, 29, 30, 31… on and on until FLT_MIN, which is 1.175494e-38, one hell of a lot of dB below clipping.

If you went louder, you'd have to go up to +16777216 before the gap between adjacent floating point numbers becomes as large as the entire reproducible audio range: that's the point where you can no longer represent every integer, and the next number is +16777218, then +16777220 and so on. Later, the numbers begin stepping up by 4, then 8, then 16 and so on. (If you scaled them down to audio range you'd still have 25 bit precision at a minimum. This is for thread readers, not really for Paul, who knows all this already.)
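(Anyone who wants to check those step sizes can do it in a couple of lines; std::nextafter shows the gap between adjacent 32-bit floats near 1.0 and near 2^24.)

Code:
// Spacing of adjacent 32-bit floats near full scale (1.0) and near 2^24.
#include <cstdio>
#include <cmath>

int main()
{
    const float one = 1.0f;
    const float big = 16777216.0f;   // 2^24

    std::printf("step above 1.0       : %.10e\n",
                std::nextafterf(one, 2.0f) - one);     // ~1.19e-07 (2^-23)
    std::printf("step above 16777216  : %.1f\n",
                std::nextafterf(big, 3.0e7f) - big);   // 2.0: integers start skipping
    return 0;
}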

That's why my choices around dithering floats (such as the buss on every Mac CoreAudio application, and plenty of older VST implementations) are never about helping the reverb tails or piccolos. Those don't need any help; relative to 1.0 and -1.0 their precision is fantastic! It's strictly about helping out that outer 1/2 of the usable audio sample range: 0.5 to 1.0, and -0.5 to -1.0. And to a lesser extent, 0.25 to 0.5 (and -0.25 to -0.5) and so on.

And even then, though it is dithering a mantissa that is a region of fixed point data of known precision, ain't nobody going to hear changes at the 25 bit level reliably, especially when they by definition have to be happening while audio of not less than -6dBFS is playing. That is an extremely important caveat. This is not and cannot be happening against a quiet background. That precision level (that I'm seeking to dither) doesn't exist unless samples are happening that are louder than -6dbFS, and any lossy-compression guy will tell you correctly that masking will conceal any artifacts happening, so you'll not directly hear them, probably not even if the audio is very deep bass that lacks distracting overtones (a best case scenario for hearing this stuff)

I agree with Paul on the basic imperceptibleness of any of this as a single operation.

I just don't want the truncation aspect to build up, including on those loudest samples. DAWs have long used huge numbers of floats in their calculations, as have plugins, and I feel this is where 'ITB sound' comes from. If you completely get rid of known issues like truncation and aliasing, ITB sound is unrecognizably different from the 'DAW sound' of the early days of digital workstations. The more people overprocess, the more this truncation-erosion has affected the tonality of audio. We can do better.
Old 4 weeks ago
  #689
Allow me to remind everyone that logarithmic scales such as the decibel are tools meant to help illustrate and compare enormous, rather difficult to imagine ratios. It's easy to underestimate the true ratio between, say, 10dB, 50dB and 100dB. 100dB certainly is not twice 50dB; it's ~316x "more" in amplitude.

The same happens with bit depth. 48 bit fixed point is not "roughly twice" 24 bit. It really has 2^24 (or 16,777,216) times more steps than 24 bit. Enormous! All this is easy to underestimate.

Don't forget this logarithmic nature when interpreting plots or bit depths.
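(A tiny arithmetic check of those ratios, for anyone who wants to verify them:)

Code:
// dB values and bit depths are logarithmic; the underlying ratios are huge.
#include <cstdio>
#include <cmath>

int main()
{
    std::printf("50 dB as an amplitude ratio : %.0f\n", std::pow(10.0, 50.0 / 20.0)); // ~316
    std::printf("50 dB as a power ratio      : %.0f\n", std::pow(10.0, 50.0 / 10.0)); // 100000
    std::printf("steps, 48-bit vs 24-bit     : %.0f\n", std::pow(2.0, 24.0));         // 16777216
    return 0;
}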


There's a saying in stats: "anything looks linear on a log-log plot with thick marker pen" (and certainly vice versa)

Last edited by FabienTDR; 4 weeks ago at 09:24 PM..
Old 4 weeks ago
  #690
Quote:
Originally Posted by Bob Olhsson View Post
One of their developers mentioned it in their forum some years ago. It was never advertised.
I can vouch for seeing that discussion.

Intel's floating-point co-processor allows use of 80-bit intermediate register variables. They are truncated to 64-bits when exiting the co-processor. But short-term state variables can use 80-bit precision, and a careful DSP engineer can wring out some extra performance using this fact. The primary benefit is in recursive calculations like high-Q IIR filtering.

Paul Frindle just pointed out the difficulty of making narrow notch filters with good noise performance on a processor with limited precision. I once had to design a bi-quadratic notch filter that was "zero" at 80 Hz and -0.1 dB at 79 and 81 Hz. Working on a 16-bit fixed-point DSP processor, I was happy to get "zero" to be -18 dB. The only reason I could do that well is that I had a 40-bit accumulator available for intermediate results.
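(For readers following along, here is a minimal sketch of the general idea: keep the recursive filter state at higher precision than the audio path so the feedback error doesn't recirculate at full level. The coefficients are placeholders, not David's 80 Hz notch or Paul's Oxford EQ.)

Code:
// Transposed direct form II biquad with double-precision state, float I/O.
// Coefficients are placeholders (pass-through); set them from your design.
struct Biquad
{
    double b0 = 1.0, b1 = 0.0, b2 = 0.0;   // feed-forward (a0 assumed = 1)
    double a1 = 0.0, a2 = 0.0;             // feedback
    double z1 = 0.0, z2 = 0.0;             // state kept at higher precision

    float process(float in)
    {
        const double out = b0 * in + z1;
        z1 = b1 * in - a1 * out + z2;
        z2 = b2 * in - a2 * out;
        return static_cast<float>(out);
    }
};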

Paul writes:
Quote:
So I would like to suggest in all honesty and with great respect to everyone, that all this hoo-ha involving dither and floating point is a waste of time and a worrying distraction for people trying to relax, be creative and get on with their art.. Art is what it's all about - that's the only reason we do any of this.
The difficulty with this thread is that it started out being about "best practice" for working audio engineers helping clients with their art, but then morphed into a dialog between advanced DSP folks on the question of "What's the best we can possibly do?" I apologize if we've ended up alarming a lot of working folks while examining our own dirty laundry. They shouldn't be alarmed, but rather reassured, that algorithm designers worry about such stuff.

We (the DSP guys and gals) should worry about such stuff. That's how the state of the art advances: step by painful step. We already know the result of asking "What's the least we can do?" instead of "What's the best?": It's called MPEG layer 3.

David L. Rick