...No need for such elaborate steps!
By placing the measurement object on a rigid, unobstructed ground surface such as a driveway or parking lot away from any walls or boundary surfaces, and placing the measurement microphone flush with the ground, an accurate approximation of the anechoic response of the object together with its image source may be obtained.
But there is much more. You are opening a much larger can of worms than you realize. And with further examination, I might suggest that the 'frequency response' may actually be one of the last things you really want to know about the speaker.
I would suggest you look at the impulse response and then derive the various alternative views from it (the frequency response is just a transform of the impulse response). The frequency response on its own will tell you simply that 'lots of stuff' is going on, none of which can be reliably discerned from the frequency response view.
What do I mean by this?
First, are you measuring in the near or far field? This is important because in the near field you are seeing more of the individual driver interaction: the superposition of source signals that are offset in time. The offsets come from the differing acoustic centers of physically separate, non-coincident drivers, as well as from the acoustic astigmatism of the acoustic origin of each driver. (If you are not familiar with this, the apparent source of the signal is not necessarily the voice coil, but is often shifted forward, and to complicate matters the origin physically shifts with frequency.)
The importance of this? As the various sources (not to mention the virtual diffractive edge sources) combine (superpose), the result is comb filtering in the frequency domain and polar lobing in the spatial planes.
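To put rough numbers on the comb filtering, here is a minimal Python/NumPy sketch. It assumes two idealised sources of equal level, a speed of sound of 343 m/s, and a hypothetical 10 cm difference in path length; none of these values come from a real speaker, and the function names are mine.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def comb_response_db(freqs_hz, path_difference_m, c=SPEED_OF_SOUND):
    """Level (dB) of two equal-level sources summed with a path-length offset."""
    delay_s = path_difference_m / c
    freqs = np.asarray(freqs_hz, dtype=float)
    summed = 1.0 + np.exp(-2j * np.pi * freqs * delay_s)
    magnitude = np.maximum(np.abs(summed), 1e-12)  # avoid log(0) at exact nulls
    return 20.0 * np.log10(magnitude)

def notch_frequencies(path_difference_m, count=3, c=SPEED_OF_SOUND):
    """First few null frequencies: odd multiples of c / (2 * path difference)."""
    k = np.arange(count)
    return (2 * k + 1) * c / (2.0 * path_difference_m)

offset_m = 0.10  # hypothetical 10 cm difference in acoustic path length
notches = notch_frequencies(offset_m)
print("notch frequencies (Hz):", notches)
print("level at notches (dB):", comb_response_db(notches, offset_m))
print("level at 0 Hz (dB):", comb_response_db([0.0], offset_m))
```

The nulls fall wherever the offset equals an odd number of half wavelengths, so the smaller the offset, the higher in frequency the first notch lands.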
Speakers with larger driver offsets using passive crossovers will suffer even more (and this is a primary reason for active crossovers with delay settings used to bring the measured driver signal offsets into alignment - at least with respect to one plane).
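As a rough idea of the delay values involved, this Python sketch converts an acoustic offset into a delay setting. The 343 m/s speed of sound, the 48 kHz sample rate and the 5 cm offset are assumptions for illustration only; the offset you actually dial in should come from measurement.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def alignment_delay_ms(acoustic_offset_m, c=SPEED_OF_SOUND):
    """Delay (ms) to apply to the nearer driver to cancel a path offset."""
    return 1000.0 * acoustic_offset_m / c

def delay_in_samples(acoustic_offset_m, sample_rate_hz=48000, c=SPEED_OF_SOUND):
    """The same delay expressed in samples at an assumed DSP sample rate."""
    return acoustic_offset_m / c * sample_rate_hz

offset_m = 0.05  # hypothetical 5 cm offset between acoustic centers
print(f"{alignment_delay_ms(offset_m):.3f} ms")
print(f"{delay_in_samples(offset_m):.2f} samples at 48 kHz")
```

A single delay value like this only lines the drivers up along one axis, which is why the alignment holds with respect to one plane.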
These differences become less critical in the far field, as the driver offsets become small relative to the distance/time of travel to the far field listening position. But the effects of spatial polar lobing will be amplified, as the spacing between lobes increases with radial distance.
Additionally, impulse based measurements can often be windowed to remove the interference, enabling a more atomistic examination of the isolated response minus the destructive interaction.
Thus, there is much more beneficial information to be gained by first measuring the system in the time domain and then deriving the frequency domain view from this base information. The frequency response simply provides a partial snapshot of the resultant complex interaction of all of the various factors, without providing any insight into what the various factors are or how they are interacting.
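A minimal Python/NumPy sketch of the idea, using a synthetic impulse response rather than a real measurement: an idealised direct arrival plus one reflection at half level 5 ms later. The sample rate, sample positions and the helper names (`magnitude_db`, `gate`) are all assumptions for illustration; this is not how ARTA or any particular package does it internally.

```python
import numpy as np

sample_rate = 48000  # Hz, assumed
n = 4096

# Synthetic impulse response: direct sound plus a single reflection.
impulse = np.zeros(n)
impulse[100] = 1.0   # direct arrival (idealised)
impulse[340] = 0.5   # reflection, 240 samples (5 ms) later

def magnitude_db(ir):
    """Magnitude response (dB) obtained from an impulse response via the FFT."""
    spectrum = np.fft.rfft(ir)
    return 20.0 * np.log10(np.maximum(np.abs(spectrum), 1e-12))

def gate(ir, start, stop, fade=32):
    """Keep ir[start:stop], fading out over the last `fade` samples."""
    window = np.zeros(len(ir))
    window[start:stop] = 1.0
    window[stop - fade:stop] = np.hanning(2 * fade)[fade:]  # falling half-Hann
    return ir * window

full_db = magnitude_db(impulse)                  # direct + reflection
gated_db = magnitude_db(gate(impulse, 0, 300))   # reflection windowed out

ripple_full = full_db.max() - full_db.min()
ripple_gated = gated_db.max() - gated_db.min()
print(f"ripple with reflection: {ripple_full:.2f} dB")
print(f"ripple after gating:    {ripple_gated:.2f} dB")
```

The catch is that a shorter gate means coarser frequency resolution (roughly the inverse of the gate length), so the windowed view cannot be trusted at low frequencies.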
ARTA is an excellent place to begin. Just get it!!!!
There is lots more to this, and I realize that I have provided a very abbreviated 100,000 foot view, but trust me, time based measurements are the way to go, and ARTA is definitely a worthwhile investment, easily capable of matching packages costing literally 10x as much!
Below is an illustration generated in EASE that shows an example of the comb filtering and polar lobing, in this case based upon two single offset drivers at various fractions of a wavelength. Note: This occurs with separate drivers in their common passbands in the crossover region as well as with separate speakers, repeating at each increase in the order of magnitude (driver-driver, speaker-speaker, array-array...and all of the combinations and permutations available as well!). And then you get to look at this relationship in terms of the superposition of signals between the direct source and all of the virtual reflected sources inherent in a bounded space, e.g. a room!!!
And once you realize that, the polar response and Q of a speaker become even more critical, as one comes to understand the importance of controlled Q in minimizing the destructive interference between sources and within rooms. That interference causes so many of the anomalies that show up in the frequency response, which simply indicates that many more critical factors are involved in the actual response than is normally assumed...and more often than not, these factors are also mixed up in what we assume is a strict measurement of the speaker itself in the frequency response. After all, how many speakers are used in an anechoic chamber, where multiple sources (at each order of magnitude) and boundaries are not a factor?
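Separately from the EASE illustration, the two-source lobing can be roughed out numerically. This Python/NumPy sketch assumes two idealised, equal-level, in-phase point sources in free space, observed in the far field, with the angle measured from the axis perpendicular to the line joining them. The function name and the spacings are mine, and this is not the EASE model.

```python
import numpy as np

def two_source_polar_db(angles_deg, spacing_wavelengths):
    """Far-field level (dB re. on-axis) of two equal in-phase point sources."""
    theta = np.radians(np.asarray(angles_deg, dtype=float))
    # Phase difference between the two arrivals at angle theta.
    phase = 2.0 * np.pi * spacing_wavelengths * np.sin(theta)
    level = np.abs(1.0 + np.exp(1j * phase)) / 2.0  # normalised to on-axis
    return 20.0 * np.log10(np.maximum(level, 1e-12))

angles = np.arange(-90, 91, 5)
for spacing in (0.25, 0.5, 1.0, 2.0):  # spacing as a fraction of a wavelength
    pattern = two_source_polar_db(angles, spacing)
    deepest = angles[np.argmin(pattern)]
    print(f"spacing {spacing} wavelengths: deepest dip near {deepest} degrees, "
          f"{pattern.min():.1f} dB")
```

At a quarter wavelength the pattern only droops a few dB off axis. At half a wavelength a null appears at 90 degrees, and every further increase in spacing pulls more nulls and lobes into the pattern.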