Tuesday, November 25, 2014

Understanding 6502 assembly on the Commodore 64 - (15) The Zero Page

Many people learning 6502 will often hear about the mystical and revered Zero Page area of memory, and all of the magical things it can do.  Well, there is no trickery behind it. The concept is actually quite easy to understand. We can get through this in a single post.









Consider for a minute how we have been addressing memory so far, and what is happening within the inner workings of the processor.

lda #$55      -     A9    55
sta $0C00     -     8D    00    0C



On the first line, A9 is the opcode for LDA immediate. It loads the hex value 55 into the Accumulator. This takes 2 cycles.

On the second line, 8D is the opcode for STA absolute. It stores the value in the Accumulator to 0C00, which is written as 00 0C because the 6502 is little endian (low byte first). This takes 4 cycles: three to fetch the opcode and the two address bytes, and one for the actual storing of the data.


Our first line took 2 bytes, our second line took 3 bytes for a grand total of 5 bytes and 6 cycles.  I know this is obvious, but bear with me a moment.

The zero page is the memory area $0000 - $00FF. Conceptually it is no different from any other area of memory. What makes zero page particularly interesting is how the 6502 processor can deal with it. The 6502 allows us to express an address within zero page as one byte, leaving off the 00 MSB of the memory address. Let's look at our initial example in zero page. For my small examples now and going forward I will be using 00FB through 00FE for demonstrations when I can. This area of memory has been set aside on the C64 as unused and available for machine language programs. Yes, that's right, 4 whole bytes.



lda #$55       -     A9    55
sta $FB        -     85    FB

A simple move to zero page, with no other optimizations, reduced the code size from 5 bytes to 4. The use of zero page also reduced our cycles on the second line from 4 to 3.

This is a 20% reduction in code size (5 bytes to 4) and about a 17% reduction in processor cycles (6 to 5), all of which was accomplished simply by using zero page as storage.

So, where did that 85 come from? When we use an assembler we simply type STA $FB, but when it is broken down into machine code, the assembler sees that we are using zero page and produces the zero page OPCODE for it. With 85, the processor expects only one address byte to follow it, not 2 as with 8D.
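The same pairing of opcodes shows up for the other memory instructions used in this series. The short listing below is only an illustration: the opcode bytes in the comments are the ones that appear in the assembled dumps in this post, and the cycle counts are the standard 6502 timings for absolute and zero page addressing.

```asm
; Absolute vs zero page forms of the same instructions.
; The assembler picks the zero page opcode when the address fits in one byte.

*=$C000

        sta $0C00       ; 8D 00 0C   absolute,  3 bytes, 4 cycles
        sta $FB         ; 85 FB      zero page, 2 bytes, 3 cycles

        lda $033C       ; AD 3C 03   absolute,  3 bytes, 4 cycles
        lda $FB         ; A5 FB      zero page, 2 bytes, 3 cycles

        and $0345       ; 2D 45 03   absolute,  3 bytes, 4 cycles
        and $FC         ; 25 FC      zero page, 2 bytes, 3 cycles

        sty $0345       ; 8C 45 03   absolute,  3 bytes, 4 cycles
        sty $FC         ; 84 FC      zero page, 2 bytes, 3 cycles
```

Every one of these saves one byte and one cycle each time it runs, which is why the savings grow with the number of memory references.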

At best, zero page is only 256 bytes long. Unfortunately, on the C64 and other computers, a lot of the zero page is used by BASIC. If we can forgo the use of BASIC, we end up with quite a bit of room available. Consider a program that had many routines and loops requiring saving to particular memory addresses. How much would zero page save us?

Let's take our optimized binary converter from the last chapter.



; C64 Hex to Binary display converter optimized
; 64TASS assembler style code for 6502
; Jordan Rubin 2014 http://technocoma.blogspot.com
;
; Takes the HEX value in OURHEXVALUE, converts it to Binary for display
; on the screen as a binary number.   
   
*=$C000 ; SYS 49152 to begin

OURHEXVALUE = #$55 ; Enter the Hex value to be converted here

OURHEXNUM = $033C  ; This is where the constant OURHEXVALUE will be stored
TESTBYTE = $0345   ; This is where our test byte will be stored for lsr
BIT7 = $0708       ; This is the location of the 7th bit, required room for
               ; 8 contiguous bytes after the starting address
               ; using 0708 dumps it right to screen ram, bottom center

nop


INIT:

lda OURHEXVALUE ; this will be our test number
sta OURHEXNUM   ; we will store the test number here permanently
ldy #$80        ; Our first bit test for bit 7 must be 10000000 $80
sty TESTBYTE    ; Store our initial test byte here
ldx #$00        ; Initialize X for our loop

CONVERTION:

lda OURHEXNUM   ; load our test hex number, this is a constant
and TESTBYTE    ; mask it with our test byte
cmp #$00        ; is the result 00?
bne STORE1      ; No, branch to STORE1
lda #$30        ; Load the display value of 0 into A
jmp CONTINUE    ; jump to CONTINUE

STORE1:

lda #$31       ; Load the display value of 1 into A

CONTINUE:

sta BIT7,x     ; store A to the current storage memory location
inx            ; Increment X for our loop
lda TESTBYTE   ; load testbyte into A
lsr            ; divide it by 2
sta TESTBYTE   ; store new testbyte back to its memory area
cpx #$08       ; is X=8?
bne CONVERTION ; No, loop back to CONVERTION
brk

Our current code, as shown, occupies memory area C000 to C02D [46 bytes].

Executing it in Virtual 6502 shows it required 319 cycles to complete.


ea a9 55 8d 3c 03 a0 80 8c 45 03 a2 00 ad 3c 03 2d 45 03 c9 00 d0 05 a9 30 4c 1e c0 a9 31 9d 08 07 e8 ad 45 03 4a 8d 45 03 e0 08 d0 e0 00




Now let's make a few changes to it, right at the top:

OURHEXVALUE = #$55 ; Enter the Hex value to be converted here
OURHEXNUM = $FB    ; This is where the constant OURHEXVALUE will be stored
TESTBYTE = $FC     ; This is where our test byte will be stored for lsr
BIT7 = $0708       ; This is the location of the 7th bit, required room for
               ; 8 contiguous bytes after the starting address
               ; using 0708 dumps it right to screen ram, bottom center


Now we'll reassemble it.
Our new code, as shown, occupies memory area C000 to C027 [40 bytes].
Executing it in Virtual 6502 shows it required 285 cycles to complete.

ea a9 55 85 fb a0 80 84 fc a2 00 a5 fb 25 fc c9 00 d0 05 a9 30 4c 1a c0 a9 31 9d 08 07 e8 a5 fc 4a 85 fc e0 08 d0 e4 00





From 46 Bytes to 40 Bytes was a code size reduction of 13%
From 319 cycles to 285 cycles was a cycle reduction of over 10.5%
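
As a cross-check, both cycle totals can be reproduced by hand. The listing below is a sketch that annotates each instruction with its cycle count for the absolute version (left) and the zero page version (right). It assumes the standard documented 6502 timings, OURHEXVALUE = $55 (four 0 bits and four 1 bits), and no page-crossing penalties, since every branch stays inside page $C0. The counts are a hand tally, not output from Virtual 6502.

```asm
; Cycle tally: absolute ($033C/$0345) | zero page ($FB/$FC)
; The closing BRK is left out, matching the totals quoted above.

*=$C000

OURHEXNUM = $FB         ; $033C in the absolute version
TESTBYTE  = $FC         ; $0345 in the absolute version
BIT7      = $0708

        nop             ; 2 | 2
        lda #$55        ; 2 | 2
        sta OURHEXNUM   ; 4 | 3
        ldy #$80        ; 2 | 2
        sty TESTBYTE    ; 4 | 3
        ldx #$00        ; 2 | 2    setup total: 16 | 14
CONVERTION:
        lda OURHEXNUM   ; 4 | 3
        and TESTBYTE    ; 4 | 3
        cmp #$00        ; 2 | 2
        bne STORE1      ; 3 when taken (1 bit), 2 when not taken (0 bit)
        lda #$30        ; 2 | 2    0 bits only
        jmp CONTINUE    ; 3 | 3    0 bits only
STORE1:
        lda #$31        ; 2 | 2    1 bits only
CONTINUE:
        sta BIT7,x      ; 5 | 5    an absolute,X store always takes 5
        inx             ; 2 | 2
        lda TESTBYTE    ; 4 | 3
        lsr             ; 2 | 2
        sta TESTBYTE    ; 4 | 3
        cpx #$08        ; 2 | 2
        bne CONVERTION  ; 3 | 3    2 on the final pass, when it falls through
        brk             ; not counted

; One pass for a 0 bit: 39 | 35.  One pass for a 1 bit: 37 | 33.
; Absolute:  16 + 4*39 + 4*37 - 1 = 319
; Zero page: 14 + 4*35 + 4*33 - 1 = 285
```

The only lines that differ between the two columns are the five that touch OURHEXNUM or TESTBYTE, and three of those sit inside the loop, so they are paid for on every one of the 8 passes.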


The bigger the program and the more references to memory, the greater the benefit of using zero page.



Monday, November 24, 2014

Understanding 6502 assembly on the Commodore 64 - (14) Space and cycle optimization

With the C64 and other 8 bit computers, limited in speed and storage space, it is important to optimize code to be as small and as fast as possible. While our simple binary conversion requires neither, it's good to see how much we can reduce the size of the code and the number of cycles required to execute it. I did not include every single possible optimization the world has to offer, so do not contact me with your idea. This is primarily about optimizing the flow of the program, and not about optimizations that might confuse beginning 6502 programmers.


You are, however, welcome to share your thoughts or techniques below. I will remind people that contacting someone online does not excuse you from having good manners. Interact with people as you would interact with a stranger in person.








This is our program as it was. I've left the NOP in so as not to change our initial results from chapter 13. We will also have to replace RTS with BRK to accurately count cycles on Virtual 6502.


; C64 Hex to Binary display converter
; 64TASS assembler style code for 6502
; Jordan Rubin 2014 http://technocoma.blogspot.com
;
; Takes the HEX value in OURHEXVALUE, converts it to Binary for display
; on the screen as a binary number.   
   
*=$C000 ; SYS 49152 to begin

OURHEXVALUE = #$55 ; Enter the Hex value to be converted here
OURHEXNUM = $033C  ; This is where the constant OURHEXVALUE will be stored
TESTBYTE = $0345   ; This is where our test byte will be stored for lsr
BIT7 = $0708       ; This is the location of the 7th bit, required room for
                  ; 8 contiguous bytes after the starting address
                 ; using 0708 dumps it right to screen ram, bottom center

nop

INIT:
lda OURHEXVALUE     ; this will be our test number
sta OURHEXNUM       ; we will store the test number here permanently
ldy #$80            ; Our first bit test for bit 7 must be 10000000 $80
sty TESTBYTE        ; store our initial test byte here
ldx #$00            ; Initialize X for our loop

CONVERTION:
lda OURHEXNUM   ; load our test hex number, this is a constant
and TESTBYTE    ; mask it with our test byte
cmp #$00        ; is the result 00?
bne STORE1      ; No, branch to STORE1
beq STORE0      ; Yes, branch to STORE0
CONTINUE:
inx            ; Increment X for our loop
lda TESTBYTE   ; load testbyte into A
lsr            ; divide it by 2
sta TESTBYTE   ; store new testbyte back to its memory area
cpx #$08       ; is X=8?
bne CONVERTION ; No, loop back to CONVERTION
brk

STORE0:
lda #$30       ; Load the display value of 0 into A
sta BIT7,x     ; store A to the current storage memory location
jmp CONTINUE   ; jump to CONTINUE

STORE1:
lda #$31       ; Load the display value of 1 into A
sta BIT7,x     ; store A to the current storage memory location
jmp CONTINUE   ; jump to CONTINUE


Writing optimized code from the start can be daunting. It may be better to write working code first, such as that above, and then optimize it.

Our current code, as shown, occupies memory area C000 to C035 [54 bytes].
Executing it in Virtual 6502 shows it required 349 cycles to complete.

ea a9 55 8d 3c 03 a0 80 8c 45 03 a2 00 ad 3c 03 2d 45 03 c9 00 d0 17 f0 0d e8 ad 45 03 4a 8d 45 03 e0 08 d0 e8 60 a9 30 9d 08 07 4c 19 c0 a9 31 9d 08 07 4c 19 c0






Let's see if we can improve upon this. It's not a big program, and there isn't much to do, but there is enough. Remember, we could save a byte and two cycles by getting rid of the NOP, but we'll keep it so we can view the code in the monitor. Let's move on to the real stuff.

INIT:
      This is rather straightforward, with no waste, so we'll leave it alone.

STORE0: and STORE1:

     Both have the instruction sta BIT7,x just before jumping to CONTINUE. Why not move that instruction to the top line of CONTINUE? Both STORE0 and STORE1 will execute it anyway. This won't save us any cycles, but it will save us some space.

[OLD]

CONVERTION:
lda OURHEXNUM   ; load our test hex number, this is a constant
and TESTBYTE    ; mask it with our test byte
cmp #$00        ; is the result 00?
bne STORE1      ; No, branch to STORE1
beq STORE0      ; Yes, branch to STORE0

CONTINUE:
inx            ; Increment X for our loop
lda TESTBYTE   ; load testbyte into A
lsr            ; divide it by 2
sta TESTBYTE   ; store new testbyte back to its memory area
cpx #$08       ; is X=8?
bne CONVERTION ; No, loop back to CONVERTION
brk

STORE0:
lda #$30       ; Load the display value of 0 into A
sta BIT7,x     ; store A to the current storage memory location
jmp CONTINUE   ; jump to CONTINUE

STORE1:
lda #$31       ; Load the display value of 1 into A
sta BIT7,x     ; store A to the current storage memory location
jmp CONTINUE   ; jump to CONTINUE



[NEW]

CONVERTION:
lda OURHEXNUM   ; load our test hex number, this is a constant
and TESTBYTE    ; mask it with our test byte
cmp #$00        ; is the result 00?
bne STORE1      ; No, branch to STORE1
beq STORE0      ; Yes, branch to STORE0

CONTINUE:
sta BIT7,x     ; store A to the current storage memory location
inx            ; Increment X for our loop
lda TESTBYTE   ; load testbyte into A
lsr            ; divide it by 2
sta TESTBYTE   ; store new testbyte back to its memory area
cpx #$08       ; is X=8?
bne CONVERTION ; No, loop back to CONVERTION
brk

STORE0:
lda #$30       ; Load the display value of 0 into A
jmp CONTINUE   ; jump to CONTINUE

STORE1:
lda #$31       ; Load the display value of 1 into A
jmp CONTINUE   ; jump to CONTINUE



Now that this is done, let's look further at our code. We can see in CONVERTION that two possible branches exist: STORE0 or STORE1.

CONVERTION:
lda OURHEXNUM   ; load our test hex number, this is a constant
and TESTBYTE    ; mask it with our test byte
cmp #$00        ; is the result 00?
bne STORE1      ; No, branch to STORE1
beq STORE0      ; Yes, branch to STORE0


If there are only 2 possibilities, it seems like a waste to branch to both based on our tests when we can have a normal program flow and throw 1 exception.

Why not keep our exception, bne STORE1, and put our STORE0 code right below it so the program just continues on?

We'll have to move the code so that STORE0 is directly under CONVERTION.

CONVERTION:
lda OURHEXNUM   ; load our test hex number, this is a constant
and TESTBYTE    ; mask it with our test byte
cmp #$00        ; is the result 00?
bne STORE1      ; No, branch to STORE1
lda #$30       ; Load the display value of 0 into A
jmp CONTINUE   ; jump to CONTINUE

STORE1:
lda #$31       ; Load the display value of 1 into A

CONTINUE:
sta BIT7,x     ; store A to the current storage memory location
inx            ; Increment X for our loop
lda TESTBYTE   ; load testbyte into A
lsr            ; divide it by 2
sta TESTBYTE   ; store new testbyte back to its memory area
cpx #$08       ; is X=8?
bne CONVERTION ; No, loop back to CONVERTION
brk

Essentially there is no more STORE0 routine. In our new program flow A will be $30 unless it branched to STORE1, which makes A $31. Both CONVERTION and STORE1 ultimately lead to CONTINUE.

 Let's look at the final optimized program. We kept the NOP in for a fair comparison between the old and new code.

; C64 Hex to Binary display converter optimized
; 64TASS assembler style code for 6502
; Jordan Rubin 2014 http://technocoma.blogspot.com
;
; Takes the HEX value in OURHEXVALUE, converts it to Binary for display
; on the screen as a binary number.   
   
*=$C000 ; SYS 49152 to begin

OURHEXVALUE = #$55 ; Enter the Hex value to be converted here
OURHEXNUM = $033C  ; This is where the constant OURHEXVALUE will be stored
TESTBYTE = $0345   ; This is where our test byte will be stored for lsr
BIT7 = $0708       ; This is the location of the 7th bit, required room for
               ; 8 contiguous bytes after the starting address
               ; using 0708 dumps it right to screen ram, bottom center

nop

INIT:
lda OURHEXVALUE ; this will be our test number
sta OURHEXNUM   ; we will store the test number here permanently
ldy #$80        ; Our first bit test for bit 7 must be 10000000 $80
sty TESTBYTE    ; Store our initial test byte here
ldx #$00        ; Initialize X for our loop

CONVERTION:
lda OURHEXNUM   ; load our test hex number, this is a constant
and TESTBYTE    ; mask it with our test byte
cmp #$00        ; is the result 00?
bne STORE1      ; No, branch to STORE1
lda #$30        ; Load the display value of 0 into A
jmp CONTINUE    ; jump to CONTINUE

STORE1:
lda #$31       ; Load the display value of 1 into A

CONTINUE:
sta BIT7,x     ; store A to the current storage memory location
inx            ; Increment X for our loop
lda TESTBYTE   ; load testbyte into A
lsr            ; divide it by 2
sta TESTBYTE   ; store new testbyte back to its memory area
cpx #$08       ; is X=8?
bne CONVERTION ; No, loop back to CONVERTION
brk

Our current code, as shown, occupies memory area C000 to C02D [46 bytes].
Executing it in Virtual 6502 shows it required 319 cycles to complete.


ea a9 55 8d 3c 03 a0 80 8c 45 03 a2 00 ad 3c 03 2d 45 03 c9 00 d0 05 a9 30 4c 1e c0 a9 31 9d 08 07 e8 ad 45 03 4a 8d 45 03 e0 08 d0 e0 00


We put our code in and change the start address. Then click load memory.






Then we click show memory, click the green PC and change it to C000.



Clicking continuous run, we see how many cycles are required before it breaks.




So, completing the same function, we went from

[54 bytes] 349 Cycles to complete
to
[46 bytes] 319 Cycles to complete

Reducing the code size by 8 bytes (almost 15%) and reducing the processor cycles by 30 (about 8.6%).






Saturday, November 22, 2014

Understanding 6502 assembly on the Commodore 64 - (13) Watching the 6502 Think

I wrote a small program today. It takes a HEX value and displays its binary value on the screen, or writes it to any other contiguous memory location. I kept it simple, and while simple, it is still highly effective and adaptable. What is more interesting, though, is stepping through each instruction as the processor works towards the end of the program. That's what we're going to do. This is very hardware neutral code: changing OURHEXNUM, TESTBYTE, and BIT7 to your preferred memory area will allow this to run on any 6502 based computer.









Here's the program:

; C64 Hex to Binary display converter
; 64TASS assembler style code for 6502
; Jordan Rubin 2014 http://technocoma.blogspot.com
;
; Takes the HEX value in OURHEXVALUE, converts it to Binary for display 
; on the screen as a binary number.   
   
*=$C000 ; SYS 49152 to begin

OURHEXVALUE = #$55 ; Enter the Hex value to be converted here

OURHEXNUM = $033C  ; This is where the constant OURHEXVALUE will be stored
TESTBYTE = $0345   ; This is where our test byte will be stored for lsr
BIT7 = $0708       ; This is the location of the 7th bit, required room for
                   ; 8 contiguous bytes after the starting address
                 ; using 0708 dumps it right to screen ram, bottom center

nop                ; added only to pad the monitor in vice for this demo

INIT:

lda OURHEXVALUE    ; this will be our test number
sta OURHEXNUM      ; we will store the test number here permanently
ldy #$80           ; Our first bit test for bit 7 must be 10000000 $80
sty TESTBYTE       ; store our initial test byte here
ldx #$00           ; Initialize X for our loop

CONVERTION:

lda OURHEXNUM   ; load our test hex number, this is a constant
and TESTBYTE    ; mask it with our test byte
cmp #$00        ; is the result 00?
bne STORE1      ; No, branch to STORE1
beq STORE0      ; Yes, branch to STORE0
CONTINUE:
inx            ; Increment X for our loop
lda TESTBYTE   ; load testbyte into A
lsr            ; divide it by 2
sta TESTBYTE   ; store new testbyte back to its memory area
cpx #$08       ; is X=8?
bne CONVERTION ; No, loop back to CONVERTION
rts

STORE0:

lda #$30       ; Load the display value of 0 into A
sta BIT7,x     ; store A to the current storage memory location
jmp CONTINUE   ; jump to CONTINUE

STORE1:

lda #$31       ; Load the display value of 1 into A
sta BIT7,x     ; store A to the current storage memory location
jmp CONTINUE   ; jump to CONTINUE


I could optimize it further, but for now it looks rather elegant. I wrote this for an easy display of some outputs I need off of a serial interface control. We'll work with this more in the future. For now, let's run it!!!!





You'll see that at the top of the program, OURHEXVALUE was $55. You can see that upon execution the binary value of $55 is displayed on the screen. Changing OURHEXVALUE will yield a different result when the program is executed. I know, it seems pretty useless in this form. In its actual form, another program would use it by:

Loading A with the HEX value
Loading X with the MSB of the output memory location bit 7, and
Loading Y with the LSB of the output memory location bit 7
JSR to beginning of where you store this program

We just created a kernel level function for another program to use within the computer!!!!!!
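
A minimal sketch of what such a callable routine might look like, assuming the calling convention listed above (value in A, MSB of the output address in X, LSB in Y). The label HEX2BIN and the names PTR, VALUE and MASK are made up for this sketch, as is the choice of $FB-$FE for storage. It also uses indirect indexed addressing, sta (PTR),y, and branches on the result of AND directly instead of using CMP, so it is not the listing above with a wrapper added. Treat it as an illustration only.

```asm
; Hypothetical callable version of the converter.
; In:  A = value to convert
;      X = MSB of the output address, Y = LSB of the output address
; Out: 8 bytes of $30/$31 written starting at the output address

PTR   = $FB             ; and $FC: 16-bit pointer to the output area
VALUE = $FD             ; copy of the value being converted
MASK  = $FE             ; single-bit test mask, starts at %10000000

*=$C000

; --- example caller, SYS 49152 ---
        lda #$55        ; value to convert
        ldx #$07        ; MSB of the output address $0708
        ldy #$08        ; LSB of the output address $0708
        jsr HEX2BIN
        rts             ; back to BASIC

; --- the routine ---
HEX2BIN:
        sta VALUE       ; keep the value to convert
        sty PTR         ; low byte of the output address
        stx PTR+1       ; high byte of the output address
        lda #$80
        sta MASK        ; start by testing bit 7
        ldy #$00        ; Y is now the output index, 0 to 7
NEXTBIT:
        lda VALUE
        and MASK        ; Z flag is set if the tested bit is 0
        beq ZERO
        lda #$31        ; display value of 1
        bne STORE       ; always taken, A is never 0 here
ZERO:
        lda #$30        ; display value of 0
STORE:
        sta (PTR),y     ; write the digit to output address + Y
        iny
        lsr MASK        ; move the test bit one place to the right
        bne NEXTBIT     ; after 8 shifts MASK is 0 and we fall through
        rts
```

Because the mask itself runs out after eight shifts, this version needs no separate loop counter compare; the original listing keeps the counter in X and tests it with CPX, which is easier to follow in the monitor.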

OK, so I want to show you a nifty trick. This is the basis for this whole chapter.

With the screen showing our result, go up to your menu, to machine, and select monitor. Bringing up the monitor is really unimpressive, I know. Let's see what amazing things we can do here. Our program has executed and ended, so let's re-run it in the monitor and see what happens with our value $55.

First we need to move the program counter to the memory area of our program:

r pc=c000

Then we type step.



I added an instruction at the beginning, NOP. It does nothing, literally nothing, but its existence allowed me to pad the monitor to step nicely from C000. With our monitor, let's watch the program run from beginning to end. I'll highlight the interesting parts. Pressing enter will execute 1 instruction at a time.




Remember, we will not see our LABELS here, the processor uses the actual memory addresses from the code we wrote.

Remember
OURHEXVALUE = #$55 ; Enter the Hex value to be converted here
OURHEXNUM = $033C  ; This is where the constant OURHEXVALUE will be stored
TESTBYTE = $0345   ; This is where our test byte will be stored for lsr
BIT7 = $0708       ; This is the location of the 7th bit, required room for
                   ; 8 contiguous bytes after the starting address
                 ; using 0708 dumps it right to screen ram, bottom center
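
To connect the labels to what the monitor shows, here is a small sketch of how the three branches in the trace below are encoded. The label addresses are read off the trace itself; the arithmetic in the comments is the standard 6502 relative branch rule, where the offset byte is added to the address of the instruction following the branch.

```asm
; Label addresses when the listing is assembled at $C000:
;   INIT       = $C001
;   CONVERTION = $C00D
;   CONTINUE   = $C019
;   STORE0     = $C026
;   STORE1     = $C02E

*=$C015
        bne $C02E       ; D0 17: next instruction is at $C017, $C017 + $17 = $C02E
        beq $C026       ; F0 0D: next instruction is at $C019, $C019 + $0D = $C026

*=$C023
        bne $C00D       ; D0 E8: next instruction is at $C025, $E8 is -24 ($18),
                        ;        $C025 - $18 = $C00D
```

This is why a branch is only 2 bytes while a JMP needs 3: the branch stores a signed one-byte distance rather than a full address.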


I didn't annotate everything. The first two iterations are annotated; thereafter I pointed to certain areas to make the code easier to follow. Note bit 1 of our flags, the Z bit, which shows as Z when set and as . when clear after values are tested. This controls the flow of the whole program!

A move of the process is included below

(C:$e5d4) r pc=c000
(C:$e5d4) step
.C:c001  A9 55       LDA #$55       - A:00 X:00 Y:0A SP:f2 ..-...Z.   10476655
(C:$c001) 
.C:c003  8D 3C 03    STA $033C      - A:55 X:00 Y:0A SP:f2 ..-.....   10476657
(C:$c003) 
.C:c006  A0 80       LDY #$80       - A:55 X:00 Y:0A SP:f2 ..-.....   10476661
(C:$c006) 
.C:c008  8C 45 03    STY $0345      - A:55 X:00 Y:80 SP:f2 N.-.....   10476663
(C:$c008) 
.C:c00b  A2 00       LDX #$00       - A:55 X:00 Y:80 SP:f2 N.-.....   10476667
(C:$c00b) X counter is 0, A is our convert number, TESTBYTE is stored as 10000000
.C:c00d  AD 3C 03    LDA $033C      - A:55 X:00 Y:80 SP:f2 ..-...Z.   10476669
(C:$c00d) 
.C:c010  2D 45 03    AND $0345      - A:55 X:00 Y:80 SP:f2 ..-.....   10476673
(C:$c010) 
.C:c013  C9 00       CMP #$00       - A:00 X:00 Y:80 SP:f2 ..-...Z.   10476677  
(C:$c013)The AND turns A to 00 if 0 or anything else if 1 
.C:c015  D0 17       BNE $C02E      - A:00 X:00 Y:80 SP:f2 ..-...ZC   10476679  
(C:$c015) If not 00 jump to C02E, but it is, ignore
.C:c017  F0 0D       BEQ $C026      - A:00 X:00 Y:80 SP:f2 ..-...ZC   10476681
(C:$c017) If is 00 jump to C026 
.C:c026  A9 30       LDA #$30       - A:00 X:00 Y:80 SP:f2 ..-...ZC   10476684
(C:$c026) Here we are at C026, load A with the screen value of 0, #$30
.C:c028  9D 08 07    STA $0708,X    - A:30 X:00 Y:80 SP:f2 ..-....C   10476686
(C:$c028) Store it to location BIT7+0 
.C:c02b  4C 19 C0    JMP $C019      - A:30 X:00 Y:80 SP:f2 ..-....C   10476691
(C:$c02b) Jump to CONTINUE label
.C:c019  E8          INX            - A:30 X:00 Y:80 SP:f2 ..-....C   10476694
(C:$c019) Make X, or cycle counter 1, X=X+1
.C:c01a  AD 45 03    LDA $0345      - A:30 X:01 Y:80 SP:f2 ..-....C   10476696
(C:$c01a) Load TESTBYTE into A
.C:c01d  4A          LSR A          - A:80 X:01 Y:80 SP:f2 N.-....C   10476700
(C:$c01d) Divide A by 2
.C:c01e  8D 45 03    STA $0345      - A:40 X:01 Y:80 SP:f2 ..-.....   10476702
(C:$c01e) Store our new value in TESTBYTE
.C:c021  E0 08       CPX #$08       - A:40 X:01 Y:80 SP:f2 ..-.....   10476706
(C:$c021) Does X = 8?
.C:c023  D0 E8       BNE $C00D      - A:40 X:01 Y:80 SP:f2 N.-.....   10476708
(C:$c023) NO? ok jump to beginning of our loop 
.C:c00d  AD 3C 03    LDA $033C      - A:40 X:01 Y:80 SP:f2 N.-.....   10476711
(C:$c00d) 
.C:c010  2D 45 03    AND $0345      - A:55 X:01 Y:80 SP:f2 ..-.....   10476715
(C:$c010) X counter is 1, A is our convert value, TESTBYTE is stored as 01000000
.C:c013  C9 00       CMP #$00       - A:40 X:01 Y:80 SP:f2 ..-.....   10476719
(C:$c013) The AND turns A to 00 if 0 or anything else if 1
.C:c015  D0 17       BNE $C02E      - A:40 X:01 Y:80 SP:f2 ..-....C   10476721
(C:$c015) If not 00 jump to C02E, so we jump
.C:c02e  A9 31       LDA #$31       - A:40 X:01 Y:80 SP:f2 ..-....C   10476724
(C:$c02e) Here we are at C02E, load A with the screen value of 1, #$31
.C:c030  9D 08 07    STA $0708,X    - A:31 X:01 Y:80 SP:f2 ..-....C   10476726
(C:$c030) Store it to location BIT7+1, because x is now 1
.C:c033  4C 19 C0    JMP $C019      - A:31 X:01 Y:80 SP:f2 ..-....C   10476731
(C:$c033) 
.C:c019  E8          INX            - A:31 X:01 Y:80 SP:f2 ..-....C   10476734
(C:$c019) Make X, or cycle counter 2, X=X+1
.C:c01a  AD 45 03    LDA $0345      - A:31 X:02 Y:80 SP:f2 ..-....C   10476736
(C:$c01a) Load TESTBYTE into A
.C:c01d  4A          LSR A          - A:40 X:02 Y:80 SP:f2 ..-....C   10476740
(C:$c01d) Divide A by 2
.C:c01e  8D 45 03    STA $0345      - A:20 X:02 Y:80 SP:f2 ..-.....   10476742
(C:$c01e) Store our new value in TESTBYTE
.C:c021  E0 08       CPX #$08       - A:20 X:02 Y:80 SP:f2 ..-.....   10476746
(C:$c021) Does X = 8?
.C:c023  D0 E8       BNE $C00D      - A:20 X:02 Y:80 SP:f2 N.-.....   10476748
(C:$c023) NO? ok jump to beginning of our loop
.C:c00d  AD 3C 03    LDA $033C      - A:20 X:02 Y:80 SP:f2 N.-.....   10476751
(C:$c00d) 
.C:c010  2D 45 03    AND $0345      - A:55 X:02 Y:80 SP:f2 ..-.....   10476755
(C:$c010) X counter is 2, A is our convert value, TESTBYTE is stored as 00100000
.C:c013  C9 00       CMP #$00       - A:00 X:02 Y:80 SP:f2 ..-...Z.   10476759
(C:$c013) 
.C:c015  D0 17       BNE $C02E      - A:00 X:02 Y:80 SP:f2 ..-...ZC   10476761
(C:$c015) 
.C:c017  F0 0D       BEQ $C026      - A:00 X:02 Y:80 SP:f2 ..-...ZC   10476763
(C:$c017) 
.C:c026  A9 30       LDA #$30       - A:00 X:02 Y:80 SP:f2 ..-...ZC   10476766
(C:$c026) 
.C:c028  9D 08 07    STA $0708,X    - A:30 X:02 Y:80 SP:f2 ..-....C   10476768
(C:$c028) 
.C:c02b  4C 19 C0    JMP $C019      - A:30 X:02 Y:80 SP:f2 ..-....C   10476773
(C:$c02b) 
.C:c019  E8          INX            - A:30 X:02 Y:80 SP:f2 ..-....C   10476776
(C:$c019) Make X, or cycle counter 3, X=X+1
.C:c01a  AD 45 03    LDA $0345      - A:30 X:03 Y:80 SP:f2 ..-....C   10476778
(C:$c01a) 
.C:c01d  4A          LSR A          - A:20 X:03 Y:80 SP:f2 ..-....C   10476782
(C:$c01d) 
.C:c01e  8D 45 03    STA $0345      - A:10 X:03 Y:80 SP:f2 ..-.....   10476784
(C:$c01e) 
.C:c021  E0 08       CPX #$08       - A:10 X:03 Y:80 SP:f2 ..-.....   10476788
(C:$c021) 
.C:c023  D0 E8       BNE $C00D      - A:10 X:03 Y:80 SP:f2 N.-.....   10476790
(C:$c023) 
.C:c00d  AD 3C 03    LDA $033C      - A:10 X:03 Y:80 SP:f2 N.-.....   10476793
(C:$c00d) 
.C:c010  2D 45 03    AND $0345      - A:55 X:03 Y:80 SP:f2 ..-.....   10476797
(C:$c010) X counter is 3, A is our convert value, TESTBYTE is stored as 00010000
.C:c013  C9 00       CMP #$00       - A:10 X:03 Y:80 SP:f2 ..-.....   10476801
(C:$c013) 
.C:c015  D0 17       BNE $C02E      - A:10 X:03 Y:80 SP:f2 ..-....C   10476803
(C:$c015) 
.C:c02e  A9 31       LDA #$31       - A:10 X:03 Y:80 SP:f2 ..-....C   10476806
(C:$c02e) 
.C:c030  9D 08 07    STA $0708,X    - A:31 X:03 Y:80 SP:f2 ..-....C   10476808
(C:$c030) 
.C:c033  4C 19 C0    JMP $C019      - A:31 X:03 Y:80 SP:f2 ..-....C   10476813
(C:$c033) 
.C:c019  E8          INX            - A:31 X:03 Y:80 SP:f2 ..-....C   10476816
(C:$c019) Make X, or cycle counter 4, X=X+1
.C:c01a  AD 45 03    LDA $0345      - A:31 X:04 Y:80 SP:f2 ..-....C   10476818
(C:$c01a) 
.C:c01d  4A          LSR A          - A:10 X:04 Y:80 SP:f2 ..-....C   10476822
(C:$c01d) 
.C:c01e  8D 45 03    STA $0345      - A:08 X:04 Y:80 SP:f2 ..-.....   10476824
(C:$c01e) 
.C:c021  E0 08       CPX #$08       - A:08 X:04 Y:80 SP:f2 ..-.....   10476828
(C:$c021) 
.C:c023  D0 E8       BNE $C00D      - A:08 X:04 Y:80 SP:f2 N.-.....   10476830
(C:$c023) 
.C:c00d  AD 3C 03    LDA $033C      - A:08 X:04 Y:80 SP:f2 N.-.....   10476833
(C:$c00d) 
.C:c010  2D 45 03    AND $0345      - A:55 X:04 Y:80 SP:f2 ..-.....   10476837
(C:$c010) X counter is 4, A is our convert value, TESTBYTE is stored as 00001000
.C:c013  C9 00       CMP #$00       - A:00 X:04 Y:80 SP:f2 ..-...Z.   10476841
(C:$c013) 
.C:c015  D0 17       BNE $C02E      - A:00 X:04 Y:80 SP:f2 ..-...ZC   10476843
(C:$c015) 
.C:c017  F0 0D       BEQ $C026      - A:00 X:04 Y:80 SP:f2 ..-...ZC   10476845
(C:$c017) 
.C:c026  A9 30       LDA #$30       - A:00 X:04 Y:80 SP:f2 ..-...ZC   10476848
(C:$c026) 
.C:c028  9D 08 07    STA $0708,X    - A:30 X:04 Y:80 SP:f2 ..-....C   10476850
(C:$c028) 
.C:c02b  4C 19 C0    JMP $C019      - A:30 X:04 Y:80 SP:f2 ..-....C   10476855
(C:$c02b) 
.C:c019  E8          INX            - A:30 X:04 Y:80 SP:f2 ..-....C   10476858
(C:$c019) Make X, or cycle counter 5, X=X+1
.C:c01a  AD 45 03    LDA $0345      - A:30 X:05 Y:80 SP:f2 ..-....C   10476860
(C:$c01a) 
.C:c01d  4A          LSR A          - A:08 X:05 Y:80 SP:f2 ..-....C   10476864
(C:$c01d) 
.C:c01e  8D 45 03    STA $0345      - A:04 X:05 Y:80 SP:f2 ..-.....   10476866
(C:$c01e) 
.C:c021  E0 08       CPX #$08       - A:04 X:05 Y:80 SP:f2 ..-.....   10476870
(C:$c021) 
.C:c023  D0 E8       BNE $C00D      - A:04 X:05 Y:80 SP:f2 N.-.....   10476872
(C:$c023) 
.C:c00d  AD 3C 03    LDA $033C      - A:04 X:05 Y:80 SP:f2 N.-.....   10476875
(C:$c00d) 
.C:c010  2D 45 03    AND $0345      - A:55 X:05 Y:80 SP:f2 ..-.....   10476879
(C:$c010) X counter is 5, A is our convert value, TESTBYTE is stored as 00000100
.C:c013  C9 00       CMP #$00       - A:04 X:05 Y:80 SP:f2 ..-.....   10476883
(C:$c013) 
.C:c015  D0 17       BNE $C02E      - A:04 X:05 Y:80 SP:f2 ..-....C   10476885
(C:$c015) 
.C:c02e  A9 31       LDA #$31       - A:04 X:05 Y:80 SP:f2 ..-....C   10476888
(C:$c02e) 
.C:c030  9D 08 07    STA $0708,X    - A:31 X:05 Y:80 SP:f2 ..-....C   10476890
(C:$c030) 
.C:c033  4C 19 C0    JMP $C019      - A:31 X:05 Y:80 SP:f2 ..-....C   10476895
(C:$c033) 
.C:c019  E8          INX            - A:31 X:05 Y:80 SP:f2 ..-....C   10476898
(C:$c019) Make X, or cycle counter 6, X=X+1
.C:c01a  AD 45 03    LDA $0345      - A:31 X:06 Y:80 SP:f2 ..-....C   10476900
(C:$c01a) 
.C:c01d  4A          LSR A          - A:04 X:06 Y:80 SP:f2 ..-....C   10476904
(C:$c01d) 
.C:c01e  8D 45 03    STA $0345      - A:02 X:06 Y:80 SP:f2 ..-.....   10476906
(C:$c01e) 
.C:c021  E0 08       CPX #$08       - A:02 X:06 Y:80 SP:f2 ..-.....   10476910
(C:$c021) 
.C:c023  D0 E8       BNE $C00D      - A:02 X:06 Y:80 SP:f2 N.-.....   10476912
(C:$c023) 
.C:c00d  AD 3C 03    LDA $033C      - A:02 X:06 Y:80 SP:f2 N.-.....   10476915
(C:$c00d) 
.C:c010  2D 45 03    AND $0345      - A:55 X:06 Y:80 SP:f2 ..-.....   10476919
(C:$c010) X counter is 6, A is our convert value, TESTBYTE is stored as 00000010
.C:c013  C9 00       CMP #$00       - A:00 X:06 Y:80 SP:f2 ..-...Z.   10476923
(C:$c013) 
.C:c015  D0 17       BNE $C02E      - A:00 X:06 Y:80 SP:f2 ..-...ZC   10476925
(C:$c015) 
.C:c017  F0 0D       BEQ $C026      - A:00 X:06 Y:80 SP:f2 ..-...ZC   10476927
(C:$c017) 
.C:c026  A9 30       LDA #$30       - A:00 X:06 Y:80 SP:f2 ..-...ZC   10476930
(C:$c026) 
.C:c028  9D 08 07    STA $0708,X    - A:30 X:06 Y:80 SP:f2 ..-....C   10476932
(C:$c028) 
.C:c02b  4C 19 C0    JMP $C019      - A:30 X:06 Y:80 SP:f2 ..-....C   10476937
(C:$c02b) 
.C:c019  E8          INX            - A:30 X:06 Y:80 SP:f2 ..-....C   10476940
(C:$c019) Make X, or cycle counter 7, X=X+1
.C:c01a  AD 45 03    LDA $0345      - A:30 X:07 Y:80 SP:f2 ..-....C   10476942
(C:$c01a) 
.C:c01d  4A          LSR A          - A:02 X:07 Y:80 SP:f2 ..-....C   10476946
(C:$c01d) 
.C:c01e  8D 45 03    STA $0345      - A:01 X:07 Y:80 SP:f2 ..-.....   10476948
(C:$c01e) 
.C:c021  E0 08       CPX #$08       - A:01 X:07 Y:80 SP:f2 ..-.....   10476952
(C:$c021) 
.C:c023  D0 E8       BNE $C00D      - A:01 X:07 Y:80 SP:f2 N.-.....   10476954
(C:$c023) 
.C:c00d  AD 3C 03    LDA $033C      - A:01 X:07 Y:80 SP:f2 N.-.....   10476957
(C:$c00d) 
.C:c010  2D 45 03    AND $0345      - A:55 X:07 Y:80 SP:f2 ..-.....   10476961
(C:$c010) X counter is 7, A is our convert value, TESTBYTE is stored as 00000001
.C:c013  C9 00       CMP #$00       - A:01 X:07 Y:80 SP:f2 ..-.....   10476965
(C:$c013) 
.C:c015  D0 17       BNE $C02E      - A:01 X:07 Y:80 SP:f2 ..-....C   10476967
(C:$c015) 
.C:c02e  A9 31       LDA #$31       - A:01 X:07 Y:80 SP:f2 ..-....C   10476970
(C:$c02e) 
.C:c030  9D 08 07    STA $0708,X    - A:31 X:07 Y:80 SP:f2 ..-....C   10476972
(C:$c030) 
.C:c033  4C 19 C0    JMP $C019      - A:31 X:07 Y:80 SP:f2 ..-....C   10476977
(C:$c033) 
.C:c019  E8          INX            - A:31 X:07 Y:80 SP:f2 ..-....C   10476980
(C:$c019) Make X, or cycle counter 8, X=X+1
.C:c01a  AD 45 03    LDA $0345      - A:31 X:08 Y:80 SP:f2 ..-....C   10476982
(C:$c01a) 
.C:c01d  4A          LSR A          - A:01 X:08 Y:80 SP:f2 ..-....C   10476986
(C:$c01d) 
.C:c01e  8D 45 03    STA $0345      - A:00 X:08 Y:80 SP:f2 ..-...ZC   10476988
(C:$c01e) 
.C:c021  E0 08       CPX #$08       - A:00 X:08 Y:80 SP:f2 ..-...ZC   10476992
(C:$c021) X is equal to 8, we no longer loop
.C:c023  D0 E8       BNE $C00D      - A:00 X:08 Y:80 SP:f2 ..-...ZC   10476994
(C:$c023) End of program, return to BASIC
.C:c025  60          RTS            - A:00 X:08 Y:80 SP:f2 ..-...ZC   10476996
(C:$c025) 






Here's another thing to do that's really cool.

Go to Virtual 6502 and paste the program in. Here it is assembled:


ea a9 55 8d 3c 03 a0 80 8c 45 03 a2 00 ad 3c 03 2d 45 03 c9 00 d0 17 f0 0d e8 ad 45 03 4a 8d 45 03 e0 08 d0 e8 60 a9 30 9d 08 07 4c 19 c0 a9 31 9d 08 07 4c 19 c0


Set the start address hex to C000






Click LOAD MEMORY then
Click SHOW MEMORY



Click the Green PC on the top left and type in C000, click OK



Click Continuous run and watch the program execute; total cycles should be 356. Click Look up me # and type in 0708. You will see the screen codes for 01010101, a combination of 30 and 31.













