20th September 2019
Yes, I too struggle with the reality of physics and the numbers, in that I sing between 6" (rare) and 1.5 feet from the mic, which directly affects the total RTL (hardware + distance) and thus the comb filtering. In general I have few issues with 1ms RTL from the hardware, but it is so hard to get a fully native system to run there. I am sure 44.1 kHz is better from the computer POV. Whenever I engage 2+ms from the hardware it is a big tonal change, and vocals are where it is most significant.
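For a rough sense of how much the mic distance adds, here is a minimal sketch that converts distance to time of flight and adds it to the hardware RTL. The speed of sound (343 m/s, roughly room temperature) and the function names are my own assumptions for illustration, not measurements from this thread.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at roughly 20 C
METERS_PER_INCH = 0.0254


def acoustic_delay_ms(distance_inches: float) -> float:
    """Time of flight from mouth to mic, in milliseconds."""
    distance_m = distance_inches * METERS_PER_INCH
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def total_latency_ms(distance_inches: float, hardware_rtl_ms: float) -> float:
    """Acoustic delay plus hardware round-trip latency, in milliseconds."""
    return acoustic_delay_ms(distance_inches) + hardware_rtl_ms


if __name__ == "__main__":
    for inches in (6, 18):  # 6 inches and 1.5 feet
        acoustic = acoustic_delay_ms(inches)
        total = total_latency_ms(inches, 1.0)  # 1 ms hardware RTL
        print(f"{inches} in: {acoustic:.2f} ms acoustic, {total:.2f} ms total")
```

Under those assumptions, 6" works out to roughly 0.44ms and 1.5 feet to roughly 1.33ms of acoustic delay, so with 1ms of hardware RTL the total lands somewhere between about 1.4ms and 2.3ms depending on where you stand.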

The 0.3ms I quote is analog to analog (measured with a scope and function generator) through the Apogee FPGA FX. Antelope is 0.3ms plus some for each added AFX.
What is curious to me is that the same physics doesn't seem to affect us so negatively in actual acoustic situations, e.g. singing in the shower, where early reflections are probably coming in at 3-6ms to start.

Switching over to the science of studio design, people like Thomas @ Northward put a lot of work into a dense listener-to-listener ambience (not speaker to listener, but rather your own noises coming back to you) in an otherwise dead room (the room is dead from a speaker-to-listener perspective, save the floor bounce). The theory is that we live almost all of our lives with early reflections (and floor bounce), and if we don't have them it puts us into an anxious state. Having some early reflections that are reasonably diffused eases the mind and opens up the ears. Same basic theory with NE control rooms.

This is interesting to me with regard to RTL because I wonder if the problem is actually the delay, or the unnatural state of having a single specular reflection. Also, if the hypothesis is that it is comb filtering that bothers us, this is a problem related only to a pronounced specular reflection that isn't mitigated by other early reflections with different flight paths. I.e. the difference between a shower and cans with RTL (plus acoustic latency) is mainly that the shower has myriad early reflections, and the cans only have the one. Then we add reverb...
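To put a number on how "pronounced" a single reflection has to be before the comb gets deep, here is a rough sketch of the textbook two-path case: a signal summed with one delayed copy of itself at some relative level. Treating monitoring as exactly two paths is my simplification, and the function name is made up for illustration.

```python
import math


def comb_ripple_db(reflection_level_db: float) -> float:
    """Peak-to-notch ripple (dB) when a signal sums with one delayed copy.

    reflection_level_db is the copy's level relative to the direct sound
    (negative means quieter). Equal levels give infinitely deep notches.
    """
    g = 10.0 ** (reflection_level_db / 20.0)  # dB to linear amplitude ratio
    if g == 1.0:
        return math.inf
    # Peaks reach (1 + g), notches fall to |1 - g|.
    return 20.0 * math.log10((1.0 + g) / abs(1.0 - g))


if __name__ == "__main__":
    for level_db in (-3.0, -6.0, -10.0, -20.0):
        print(f"copy at {level_db:+.0f} dB: {comb_ripple_db(level_db):.1f} dB ripple")
```

In this idealized case a copy 10dB down still leaves about 5.7dB between peaks and notches, while a copy 20dB down leaves under 2dB. It says nothing about how many overlapping reflections smear the pattern, only how strong one lone reflection needs to be for the comb to matter.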

Something more thought out in terms of space emulation, like the Klang:Fabrik, is really interesting to me. I think typical reverb presets with longer decay times naturally get mixed too low to properly emulate a dense ER field. So you get a specular reflection with a dense reverb field that is 10dB or more below the first reflection (or RTL + acoustic delay), and there could still be comb filtering. But on the science side, there is PLENTY of comb filtering starting at just 1ms (voice to mic delay only) and quite a bit by 2ms total (the acoustic latency plus any RTL)...
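As a sanity check on the 1ms and 2ms figures, here is a small sketch that lists where the notches fall when a signal is summed with a same-polarity copy delayed by a given time. The notches sit at odd multiples of 1/(2 x delay). The same-polarity, single-copy model and the 5 kHz cutoff are assumptions for illustration.

```python
def comb_notches_hz(delay_ms: float, max_hz: float = 5000.0) -> list:
    """Notch frequencies (Hz) for a signal summed with one same-polarity
    copy delayed by delay_ms, listed up to max_hz."""
    delay_s = delay_ms / 1000.0
    notches = []
    k = 0
    while True:
        # Cancellation happens where the delay equals an odd number of half periods.
        freq = (2 * k + 1) / (2.0 * delay_s)
        if freq > max_hz:
            break
        notches.append(freq)
        k += 1
    return notches


if __name__ == "__main__":
    for delay_ms in (1.0, 2.0):
        first_three = [round(f) for f in comb_notches_hz(delay_ms)[:3]]
        print(f"{delay_ms} ms delay: first notches near {first_three} Hz")
```

With 1ms the first notch sits at 500 Hz and they repeat every 1 kHz; with 2ms the first is at 250 Hz and they repeat every 500 Hz, so the comb is already well inside the vocal range in both cases.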

Really just thinking out loud here, but it's hard for me to wrap my head around not being able to live with a couple of ms before the first reflection, when in reality we spend all of our lives outside of headphones like that.