There are two mistakes in your program.
Reading from the wrong register after return
Let's look at this SQUARE subroutine:
SQUARE
ADD R4,R5,#0 ;R4 <- multiplier
AND R6,R6,#0 ;R6 <- 0, sq
;inner loop
AGAIN
ADD R6,R6,R4
ADD R5,R5,#-1 ;decrement counter
BRp AGAIN ;check end of calculation
RET
You don't have any comments here about how the subroutine should be called, or what it returns. This makes it difficult to see what's wrong. Here's what you should write above this subroutine:
; SQUARE subroutine
; Squares an integer value
;
; Parameters:
; R5: Value to be squared
; Return value:
; R6: Squared value
; Notes:
; This subroutine tramples R4 and R5
Why? If you're trying to debug the calling code, it's much easier to do so if you know how the subroutine is supposed to be called, and what assumptions you made while writing it. For example, you might make a change to the outer loop to make use of R4 as a temporary variable, and be confused as to why that value is being overwritten. (If you want to really go the extra mile, also document whether the subroutine supports zero or negative values as an argument.)
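To make the point about zero and negative arguments concrete, here is a rough Python model of SQUARE. It is Python rather than LC-3 only so that it can be run without a simulator. The names to_signed16 and square_model are made up for this illustration, and the model assumes 16-bit two's-complement registers and mirrors the subroutine one instruction at a time.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def square_model(r5):
    """Hypothetical model of SQUARE: R5 is the argument, R6 the result."""
    r4 = r5                          # ADD R4,R5,#0
    r6 = 0                           # AND R6,R6,#0
    while True:                      # AGAIN
        r6 = to_signed16(r6 + r4)    # ADD R6,R6,R4
        r5 = to_signed16(r5 - 1)     # ADD R5,R5,#-1
        if r5 <= 0:                  # BRp AGAIN falls through when R5 <= 0
            return r6                # RET


print(square_model(3))    # 9
print(square_model(0))    # 0
print(square_model(-2))   # -2
```

In this model, zero happens to give the right answer, but a negative argument comes back unchanged instead of squared, because the loop body always runs once and the counter is never positive. That is the kind of limitation worth one line in the header comment.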
Now that we have this comment, the issue in the main code is clear:
LDR R4,R1,#0 ;element ->R4
LDR R5,R1,#0 ;counter
JSR SQUARE
ADD R2,R2,R4 ;ans = ans + element
We load the value to be squared into R5, and call SQUARE. Then, SQUARE writes the squared value into R6, and we... add the value from R4. That's not correct. The ADD should read from R6, not R4.
(As a side issue, LDR R4,R1,#0 is unnecessary here, because the SQUARE subroutine ignores the value in R4. Another benefit of commenting your code!)
So we can correct this code to:
LDR R5,R1,#0 ;counter
JSR SQUARE
ADD R2,R2,R6 ;ans = ans + element squared
DONE does not store to ans
Let's look at the outer loop condition:
LOOP LDR R4,R1,#0 ;element -> R4
ADD R4,R4,-1
BRn DONE ;if R1 < 0, condition fails
So we load from the address pointed to by R1, and check whether the value is less than or equal to zero. (The comment is incorrect, by the way: the test is on the element that R1 points to, not on R1 itself, and it also triggers on zero.)
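If the "subtract one, then branch on negative" idiom is unfamiliar, this small Python model shows which element values take the branch. It is only a sketch: branches_to_done and to_signed16 are hypothetical names, and it assumes 16-bit two's-complement arithmetic with the condition codes set from the result of the ADD.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def branches_to_done(element):
    """Hypothetical model of the loop test on one element."""
    r4 = to_signed16(element - 1)    # ADD R4,R4,-1
    return r4 < 0                    # BRn DONE is taken when the result is negative


for value in (3, 1, 0, -5, -32768):
    print(value, branches_to_done(value))
# 3 False
# 1 False
# 0 True
# -5 True
# -32768 False
```

The last line is the one corner case: subtracting one from x8000 wraps around to x7FFF, which is positive, so that single value does not end the loop.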
So what is at DONE?
DONE HALT ;halt
It halts immediately, without storing R2 to your result!
So, the fix is to move the line which stores R2 to ans, so that it's run after you branch to DONE.
DONE
ST R2, ans
HALT ;halt
Here's a complete code listing of the fixed program:
; Program to calculate Euclidean sum of numbers stored at location x4000
;
.ORIG x3000
LD R1,a ;first element address
LD R2,zero ;ans -> R2 initialized to 0
;while (element at R1 is positive)
LOOP
LDR R4,R1,#0 ;element -> R4
ADD R4,R4,#-1
BRn DONE ;if element <= 0, leave the loop
;loop body
LDR R5,R1,#0 ;counter
JSR SQUARE
ADD R2,R2,R6 ;ans = ans + element squared
ADD R1,R1,#1 ;prepare for next element
BR LOOP ;another iteration
; SQUARE subroutine
; Squares an integer value
;
; Parameters:
; R5: Value to be squared
; Return value:
; R6: Squared value
; Notes:
; This subroutine tramples R4 and R5
SQUARE
ADD R4,R5,#0 ;R4 <- multiplier
AND R6,R6,#0 ;R6 <- 0, sq
;inner loop
AGAIN
ADD R6,R6,R4
ADD R5,R5,#-1 ;decrement counter
BRp AGAIN ;check end of calculation
RET
;
zero .FILL 0
a .FILL x4000 ;a has the address of first location
ans .BLKW 1 ;reserve location for ans
DONE
ST R2, ans
HALT ;halt
.END
Here's how I tested the fixed program:
$ lc3sim -quiet prog_fixed.obj
x0289 x0FFB BRNZP TRAP_LOOP
Loaded "prog_fixed.obj" and set PC to x3000
(lc3sim) memory x4000 3
Wrote x0003 to address x4000.
(lc3sim) continue
x0289 x0FFB BRNZP TRAP_LOOP
(lc3sim) translate ans
Address ans has value x0009.
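As a cross-check on that result, here is a rough Python model of the whole fixed program. It is a sketch, not a simulator: run_model is a made-up name, memory is a plain dict, and it assumes that every word you have not written reads as 0 (which is what ends the loop after the first element) and that the values are small and positive, so 16-bit wraparound is ignored.

```python
def run_model(memory, start=0x4000):
    """Hypothetical model of the fixed program; returns the value stored to ans."""
    r1 = start                       # LD R1,a
    r2 = 0                           # LD R2,zero
    while True:                      # LOOP
        element = memory.get(r1, 0)  # LDR R4,R1,#0 (unset words assumed to be 0)
        if element - 1 < 0:          # ADD R4,R4,-1 / BRn DONE
            break
        r5 = element                 # LDR R5,R1,#0
        r4, r6 = r5, 0               # SQUARE: ADD R4,R5,#0 / AND R6,R6,#0
        while True:                  # AGAIN
            r6 += r4                 # ADD R6,R6,R4
            r5 -= 1                  # ADD R5,R5,#-1
            if r5 <= 0:              # BRp AGAIN
                break
        r2 += r6                     # ADD R2,R2,R6
        r1 += 1                      # ADD R1,R1,#1
    return r2                        # DONE: ST R2, ans


print(f"x{run_model({0x4000: 3}):04X}")   # x0009
```

With a second element, for example run_model({0x4000: 3, 0x4001: 4}), the model returns 25, which is a quick way to predict what the simulator should show before you try more inputs.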