Quote:
Originally Posted by Casey
How you handle internal block delays is your own problem.
It is the alignment issue that must be attended to. Delays cannot be multiples of some block size.
So, and Dale alludes to this, some head and tail processing must take place to allow delays to be of arbitrary length.
Really, this is just for the record, I know you understand this!
Processing by block is probably fine, but processing by sample probably doesn't cost much more once the samples have been moved into internal RAM. Ideally the choice should be whatever the programmer wants, not a limitation imposed by the hardware. Even block processing can be tricky with SDRAM, since the SDRAM page length probably doesn't line up with the desired delay line lengths.
What I would be tempted to do is start with an internal RAM delay line equal to the desired delay line length (making the basic delay line non-tappable except at the output end), then subtract however many 'blocks' you need out of the middle and park them in SDRAM, leaving maybe 100 or 200 samples in internal SRAM. Then you can DMA into and out of the middle of the array, although care must be taken about delay line wraparound, since the DMA controller must be programmed to reload its address register once it gets to the end. The DSP's modulo addressing mode takes care of an arbitrary-length circular buffer in code, but the DMA controller may not be that sophisticated. If the delay is less than, say, 300 samples or so, then don't bother moving the data around - just leave it in SRAM.
There would be a bit of work in the memory management control software, but it only needs to run to completion at least once every 'N' samples, where 'N' is the transfer block size (which may or may not be equal to the algorithm block size). In this way, a delay line can be interpolated at the end (in SRAM), and it can be an arbitrary length. The algorithm that takes the reverb descriptor tables and allocates memory based on the programmed reverb size would need to manage setting all of this up. You could also process sample-by-sample if you wanted to, or in blocks. The SRAM delay line then acts as the arbitrary length adjustment, and also as a (manually maintained) data cache.
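To make the bookkeeping concrete, here is a rough C sketch of the two calculations that scheme needs: how many whole transfer blocks to park in SDRAM for a given delay length, and how to cut a transfer in two when it crosses the end of the circular buffer. The names (delay_split, plan_delay, wrap_split, split_at_wrap) and the parameters are made up for illustration, and it assumes a DMA controller that can only do linear transfers. It is not tied to any particular DSP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of one delay line split between SRAM and SDRAM. */
typedef struct {
    size_t sdram_blocks;  /* whole transfer blocks parked in SDRAM */
    size_t sram_samples;  /* samples kept in internal SRAM */
} delay_split;

/* Delays shorter than min_split stay entirely in SRAM. Otherwise as many
 * whole blocks as possible go to SDRAM while at least sram_min samples
 * remain in SRAM, so the SRAM part absorbs the arbitrary remainder. */
delay_split plan_delay(size_t delay, size_t block,
                       size_t sram_min, size_t min_split)
{
    delay_split s = {0, delay};
    assert(block > 0);
    if (delay < min_split || delay <= sram_min)
        return s;
    s.sdram_blocks = (delay - sram_min) / block;
    s.sram_samples = delay - s.sdram_blocks * block;
    return s;
}

/* Lengths of the (at most two) linear segments of one transfer. */
typedef struct {
    size_t first_len;   /* samples from start up to the buffer end */
    size_t second_len;  /* samples continuing from index 0, or 0 */
} wrap_split;

/* A transfer of count samples starting at index start in a circular
 * buffer of len samples is cut at the buffer end, since a linear DMA
 * controller cannot wrap on its own. */
wrap_split split_at_wrap(size_t start, size_t count, size_t len)
{
    wrap_split w;
    size_t to_end;
    assert(len > 0 && start < len && count <= len);
    to_end = len - start;
    w.first_len = count < to_end ? count : to_end;
    w.second_len = count - w.first_len;
    return w;
}
```

For example, a 1000-sample delay with 64-sample blocks and a 100-sample SRAM minimum gives 14 blocks in SDRAM and 104 samples in SRAM, while a 250-sample delay stays entirely in SRAM.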
I know there are other ways to do this, but it should work OK. Another option is to have two circular buffers. The write buffer is, say, double the transfer block size. Once the write pointer gets halfway, write out the first half. Once it wraps around, write out the other half. The read buffer would be similar, and the same length. The phase offset between the read and write pointers, plus the length of the bulk SDRAM delay, would determine the delay line length. I can only think of maybe five ways of managing the SDRAM to SRAM data shuffle, and a couple of ways of doing the calculations in-place.
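The write side of the two-buffer option might look something like the sketch below. This is only an illustration: pingpong, pingpong_push and XFER_BLOCK are invented names, the block size is tiny so the behaviour is easy to follow, and the actual DMA kick-off is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

#define XFER_BLOCK 4  /* hypothetical transfer block size, kept tiny here */

/* Write-side staging buffer: twice the transfer block size. */
typedef struct {
    float  data[2 * XFER_BLOCK];
    size_t wr;  /* next write index */
} pingpong;

/* Store one sample. Returns 0 or 1 when that half has just been filled
 * and can be handed to the DMA controller, or -1 if nothing is ready. */
int pingpong_push(pingpong *p, float x)
{
    assert(p != NULL);
    p->data[p->wr] = x;
    p->wr = (p->wr + 1) % (2 * XFER_BLOCK);
    if (p->wr == XFER_BLOCK)
        return 0;   /* write pointer reached halfway: first half is full */
    if (p->wr == 0)
        return 1;   /* write pointer wrapped: second half is full */
    return -1;
}
```

While one half is being flushed to SDRAM, the algorithm keeps writing into the other, so the flush has a full block period to complete. The read buffer would mirror this, with the DMA filling the half that the read pointer is not currently in.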
From what I can tell from the datasheets, there are three main time delays in SDRAM that matter. The first and the last are obvious: the open page and close page operations. The other is the delay from address supply to data available, and that applies to in-page accesses. To make matters worse, even if your block size aligns to the SDRAM page length (which it should), you will still have to either misalign the writes (opening and closing two pages for one operation) or misalign the reads, unless the delay length is also a multiple of the block size (which in turn aligns with the SDRAM page size). SDRAM also does not necessarily do well when you try to interleave writes and reads to the same bank, though that depends on the SDRAM controller built into the DSP.
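The misalignment cost is easy to check with a bit of arithmetic. The helper below is a hypothetical sketch (pages_touched is not from any vendor library) that counts how many SDRAM pages a single block transfer has to open, given a start address and a page size, both in words.

```c
#include <assert.h>
#include <stddef.h>

/* Number of SDRAM pages touched by a transfer of count words starting
 * at word address addr, with page words per page. Each page touched
 * costs one open and one close operation. */
size_t pages_touched(size_t addr, size_t count, size_t page)
{
    assert(page > 0);
    if (count == 0)
        return 0;
    return (addr + count - 1) / page - addr / page + 1;
}
```

Assuming a 256-word page purely for illustration, a 256-word block written at address 0 touches one page, but the same block read back at an offset of 104 words (the leftover of a delay length that is not a multiple of the block size) touches two.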
This is interesting. I think one thing that's missing among people developing software these days is an understanding of how the software actually has to deal with the chips.