Old 24th April 2009   #77
dale116dot7

Thread Starter
I've been trying to figure out if there's a more efficient way of doing this:

move x:(r3)+,x1 ; grab number of allpasses
move x:(r3)+,x0 ; grab coefficient
move x:(r3)+,r4 ; grab start of allpass pointer...
do x1,allpassloop
move x:(r3)+,r5 ; grab end of allpass pointer
move y:(r4+n4),a ; get input (start of allpass)
move y:(r5+n5),y0 ; get end of allpass.. generates pipeline stall by using r5
macr -x0,y0,a y0,b ; a = input - coeff*end (rounded); copy end data to b
move a,y1 ; save result for now... generates pipeline stall by reading accumulator
macr y1,x0,b x:(r3)+,x0 ; mac to generate output, next coeff
move a,y:(r4+n4) ; first mac, send back to input
move x:(r3)+,r4 ; grab start of allpass pointer, generates pipeline stall by fetching pointer r4 being reused at top of loop
move b,y:(r5+n5) ; store end of allpass
allpassloop:

It's an allpass, if you guys haven't guessed. It takes about 15 clock cycles per iteration (one iteration per allpass), which seems pretty high to me. I'm used to the AL3201, which does it in two cycles. Too many stalls: the assembler is talking and I'm hearing it, but not understanding. Any ideas? Any DSP56300 programmers out there?
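For reference, the arithmetic in the loop body can be modelled in plain C roughly as follows. This is only a sketch under some assumptions: the `Allpass` struct, `allpass_pass`, and the flat `mem[]` array are hypothetical names invented for illustration, doubles stand in for the 24-bit fractional data (so the `macr` rounding is not modelled), and the n4/n5 offsets and any modulo addressing are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-stage table entry; the real table lives in X memory
   and is walked with r3. */
typedef struct {
    size_t start; /* offset of the allpass input  (r4 in the assembly) */
    size_t end;   /* offset of the allpass output (r5 in the assembly) */
    double g;     /* allpass coefficient          (x0 in the assembly) */
} Allpass;

/* Run `count` allpass stages in place over mem[], which stands in for
   the delay memory in Y space. Per stage:
     v   = in - g * d   (first macr, written back to the start)
     out = d  + g * v   (second macr, written back to the end)   */
static void allpass_pass(double *mem, const Allpass *ap, size_t count)
{
    assert(mem != NULL && ap != NULL);
    for (size_t i = 0; i < count; i++) {
        double in  = mem[ap[i].start];   /* move y:(r4+n4),a  */
        double d   = mem[ap[i].end];     /* move y:(r5+n5),y0 */
        double v   = in - ap[i].g * d;   /* macr -x0,y0,a     */
        double out = d + ap[i].g * v;    /* macr y1,x0,b      */
        mem[ap[i].start] = v;            /* move a,y:(r4+n4)  */
        mem[ap[i].end]   = out;          /* move b,y:(r5+n5)  */
    }
}
```

Each stage needs two reads, two multiply-accumulates, and two writes. The second multiply depends on the result of the first, which is the dependency the `move a,y1` transfer in the assembly is working around.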