Yes, it would be a sparse matrix. It may not be as 'efficient' to do it that way, but I'm looking at the hardware and trying to figure out the most efficient way of doing it - or deciding if I need to rework the processing hardware. The last thing you want to do is have a very beautiful PC board all put together, then find out the algorithms you want to run won't run efficiently on the hardware. An option I was thinking of was to put more than one DSP on there, but for a DIY project, multiprocessing seems a bit overkill.
What I found on the last design was:
- The DSP was adequate to do very complex loop-based reverbs
- Long loop times were not accommodated particularly well (special effects and sampling, not reverbs) - and there was a barrier at 256k samples because of the address line arrangement of the DSP56366.
- The SRAM performed adequately but did add a wait state for every access. More on-chip RAM (zero wait state) would be better.
- I did not have digital audio I/O or USB capabilities, and I wanted to add them - especially the digital I/O.
- The host processor (MCF51AC256) was not really fast enough to do modulation, and the DSP was busy enough processing audio that I didn't want to burden it.
- The I2C interface to the front panel was ok but added too many chips to the front panel PC boards.
- Though the SRAM was nice, it also cost $50 for a measly 512k of it.
- The linear regulators were ugly so I'm moving to a switcher for the main PSU. The PC board now has extra ferrite filters to get rid of the hash, plus separate analogue and digital 3.3V regulators.
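To put the 256k-sample barrier from the list above in perspective, here is a quick back-of-envelope sketch of how long a single loop that limit allows. The sample rates are assumptions for illustration (the actual rate isn't stated here), and it assumes one mono sample per memory word.

```python
# Rough estimate: longest loop that fits under a 256k-sample address limit.
MAX_SAMPLES = 256 * 1024  # 262144 samples, the barrier mentioned above


def max_loop_seconds(sample_rate_hz, max_samples=MAX_SAMPLES):
    """Longest single mono loop, in seconds, at the given sample rate."""
    return max_samples / sample_rate_hz


if __name__ == "__main__":
    # Assumed sample rates, not taken from the actual design.
    for rate in (44100, 48000, 96000):
        print(f"{rate} Hz: {max_loop_seconds(rate):.2f} s")
```

A stereo loop stored as two words per frame would get half of that, which is plenty for reverbs but tight for sampling.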
So I upgraded the host processor to an MCF52259, ran parallel I/O off of it to the display panel, upgraded the DSP to a DSP56720, added AES digital I/O, a USB connector, and added a ton of SDRAM. The problem with SDRAM is that its random access time is sucky, but it's cheap, and it bursts really quickly. Everything else is a significant step upwards, except for memory access time, which is about a factor of 10 worse (for the first access) - then it's on par. I'm just finishing up the PC board in my CAD package now. But I want to make sure that the algorithms can run efficiently before I get the new boards made up. I picked the same families of processors so I can reuse most of the code, and all of the development tools.
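The SDRAM trade-off above can be roughed out before committing to boards. This is only a sketch of a simple cost model, not a measurement: it takes the "factor of 10 for the first access, then on par" figure at face value, counts cost in units of one access on the old SRAM, and ignores refresh and row/bank effects. The function names are made up for illustration.

```python
import math

# Costs are relative to one access on the old SRAM (assumed model, not measured).
FIRST_ACCESS_COST = 10.0  # "about a factor of 10 worse" for the first access
BURST_WORD_COST = 1.0     # "on par" once the burst is running


def avg_cost_per_word(burst_len, first=FIRST_ACCESS_COST, rest=BURST_WORD_COST):
    """Average cost per word when burst_len consecutive words are read at once."""
    if burst_len < 1:
        raise ValueError("burst_len must be at least 1")
    return (first + (burst_len - 1) * rest) / burst_len


def min_burst_for(target_avg, first=FIRST_ACCESS_COST, rest=BURST_WORD_COST):
    """Smallest burst length whose average cost per word is <= target_avg."""
    if target_avg <= rest:
        raise ValueError("target must be above the per-word burst cost")
    return math.ceil((first - rest) / (target_avg - rest))


if __name__ == "__main__":
    for n in (1, 4, 8, 32):
        print(f"burst of {n:2d}: {avg_cost_per_word(n):.2f}x SRAM cost per word")
    print("burst needed for <= 1.5x:", min_burst_for(1.5))
```

Under this model, an algorithm that pulls single samples from scattered addresses pays the full penalty on every read, while one that processes audio in blocks and reads each tap as a contiguous run gets close to the old SRAM cost. Whether a given algorithm can be restructured that way is the thing to check before the boards go out.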