Reading FAT data Part 2

Let’s just recap a little before we move on. In FAT the SD Card is divided into clusters. Each cluster has the same size, which is a power of 2 times the SD sector size (1 sector = 512 bytes). On my current FAT32 formatted SD card the cluster size is 4096 bytes or 8 sectors. A cluster represents the minimum size of data you can allocate on the card and files always take up a whole number of clusters (which might be more than the actual file size).

The FAT32 system uses 4 byte numbers for the clusters, starting with $00000000, $00000001, $00000002, etc. They are stored in little endian in the file allocation table (FAT) and you can see an example of this here. A file is a simple chain of clusters containing its data and that chain can be found in the FAT. The table below shows an example of a FAT for a file starting in cluster 3 and continuing to cluster 6. Each cluster entry contains the number of the next cluster in the chain. In the example below the value for cluster 3 can be found in bytes 12-15 (4 byte values per cluster) and contains the number 4, which is the next cluster in the chain. The chain continues to cluster 6 where we encounter the end of file marker (in this case $0FFFFFFF). Note that clusters in a chain do not have to be consecutive numbers.

Offset  +0  +1  +2  +3  +4  +5  +6  +7
$0000   F0  FF  FF  0F  FF  FF  FF  0F
$0008   FF  FF  FF  0F  04  00  00  00
$0010   05  00  00  00  06  00  00  00
$0018   FF  FF  FF  0F
Example FAT with a file starting in cluster 3 and continuing through cluster 6
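
For anyone who wants to check the example on a PC, here is a minimal Python sketch (not part of the 6502 code) that follows the chain in the table above. The names fat_entry and follow_chain are made up for illustration, and the end-of-chain threshold $0FFFFFF8 is the usual FAT32 convention, assumed here.

```python
import struct

# The 28 bytes of the example FAT above (clusters 0..6, 4 bytes each).
FAT_BYTES = bytes.fromhex(
    "F0FFFF0F" "FFFFFF0F"   # clusters 0 and 1 (reserved entries)
    "FFFFFF0F" "04000000"   # cluster 2 (end of chain), cluster 3 -> 4
    "05000000" "06000000"   # cluster 4 -> 5, cluster 5 -> 6
    "FFFFFF0F"              # cluster 6: end-of-file marker
)

EOF_MIN = 0x0FFFFFF8  # assumption: values from here up end a chain


def fat_entry(fat, cluster):
    """Return the little-endian 32-bit FAT entry for a cluster."""
    (value,) = struct.unpack_from("<I", fat, cluster * 4)
    return value & 0x0FFFFFFF  # the top 4 bits are not part of the number


def follow_chain(fat, first_cluster):
    """Collect cluster numbers until an end-of-chain value is read."""
    chain = []
    cluster = first_cluster
    while cluster < EOF_MIN:
        chain.append(cluster)
        cluster = fat_entry(fat, cluster)
    return chain


print(follow_chain(FAT_BYTES, 3))  # [3, 4, 5, 6]
```

The sketch does no error handling, so a free (zero) entry in a damaged chain would make it loop forever.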

As mentioned in the previous post the root directory can typically be found in cluster 2 and contains top level directory and file information. Each directory entry is 32 bytes long and can contain short file names, long file names, directories, etc. If it is a short file entry then the first 8 bytes contain the name and the next 3 the file extension. The first cluster of the file is stored in bytes 26, 27, 20 and 21 of the directory entry (don’t ask why … read the Wikipedia article 🙂).
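
As an illustration of that odd byte order, here is a small Python sketch that pulls the name, attribute, first cluster and size out of a 32 byte short-name entry. The function name parse_short_entry and the sample entry are invented for the example. The size field in bytes 28-31 is the one the loading code reads later on.

```python
import struct


def parse_short_entry(entry):
    """Decode the fields of a 32 byte short-name directory entry."""
    if len(entry) != 32:
        raise ValueError("a directory entry is 32 bytes long")
    (cluster_hi,) = struct.unpack_from("<H", entry, 20)  # bytes 20, 21
    (cluster_lo,) = struct.unpack_from("<H", entry, 26)  # bytes 26, 27
    (size,) = struct.unpack_from("<I", entry, 28)        # bytes 28..31
    return {
        "name": entry[0:8].decode("ascii").rstrip(),
        "ext": entry[8:11].decode("ascii").rstrip(),
        "attribute": entry[11],
        "first_cluster": (cluster_hi << 16) | cluster_lo,
        "size": size,
    }


# Made-up entry: BOOT.PRG, 5000 bytes, first cluster $00010203.
sample = bytearray(32)
sample[0:11] = b"BOOT".ljust(8) + b"PRG"   # short names are space padded
sample[11] = 0x20                          # "archive" attribute
struct.pack_into("<H", sample, 20, 0x0001)
struct.pack_into("<H", sample, 26, 0x0203)
struct.pack_into("<I", sample, 28, 5000)

info = parse_short_entry(bytes(sample))
print(info["name"], info["ext"], hex(info["first_cluster"]), info["size"])
```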

The last remaining thing to mention before we get to the code is that you need the address of a given sector on the SD card to read it with command 17. E.g. sector 5 is at address 5*512 = $00000A00. Since FAT gives us cluster numbers we need a way to convert from those to sector addresses. In my previous post we used the information in the MBR and boot sector to find the address of the first root cluster, and we also know its cluster number (typically 2). We can use this information to find the start address of any given cluster.

cluster_address = root_start_address + (cluster_number - root_cluster_number) * cluster_byte_size
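
The formula is easy to try out in Python. This is only a sketch for checking numbers: ROOT_START is a made-up address, and the defaults (root cluster 2, 8 sectors per cluster) match the card described above.

```python
SECTOR_SIZE = 512  # bytes per SD card sector


def cluster_address(cluster_number, root_start_address,
                    root_cluster_number=2, sectors_per_cluster=8):
    """Byte address of the first sector of a cluster."""
    cluster_byte_size = sectors_per_cluster * SECTOR_SIZE
    offset = (cluster_number - root_cluster_number) * cluster_byte_size
    return root_start_address + offset


ROOT_START = 0x00400000  # hypothetical address of the first root cluster

print(hex(cluster_address(2, ROOT_START)))  # the root cluster itself
print(hex(cluster_address(3, ROOT_START)))  # one cluster (4096 bytes) later
```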

With zp_temp storing a 4 byte cluster number the following code returns the address in sd_file_sector. For the math helper functions see my previous post or the full source code.

.cluster_adr    sec                    ; cluster number - root cluster number 
                lda zp_temp+0
                sbc sd_root_cluster+0 
                sta zp_temp+0
                lda zp_temp+1
                sbc sd_root_cluster+1 
                sta zp_temp+1
                lda zp_temp+2
                sbc sd_root_cluster+2 
                sta zp_temp+2
                lda zp_temp+3
                sbc sd_root_cluster+3 
                sta zp_temp+3
                
                ldx #8                 ; 512 = 2 power 9 (loop always adds 1) 
                lda sd_cluster_size 
                beq .cfe_end           ; error handling  
@power2         inx                    ; shift right to calculate which power of 2 the cluster size is  
                lsr 
                bcc @power2            ; carry will be set when we encounter first (and only) bit 
                
                jsr m.shift2w          ; offset address to this cluster (multiply by 512 * sectors/cluster)
                
                ldx #<sd_root_sector   ; copy root start address ... 
                ldy #<sd_file_sector                                                                            
                jsr m.move2w_osd 
        
                ldx #<sd_file_sector   ; ... add offset to the right sector for this cluster number                                                                                  
                ldy #>sd_file_sector 
                jmp m.add2w
                ; rts

Note that the commented rts statement is just the way I remind myself that the code returns after the jump. It is one of the numerous small 6502 optimizations to replace “jsr, rts” with “jmp”, which saves 9 cycles (jsr and rts take 6 cycles each, jmp takes 3) and 1 byte. Probably overkill in this context, but it’s second nature for me.
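
The @power2 loop in .cluster_adr can look a bit cryptic, so here is a Python model of it (a sketch, not the real code). It starts the count at 8, adds one per pass, and stops when the single set bit of the sectors-per-cluster value is shifted out, so the result is the shift count that multiplies by the cluster size in bytes.

```python
def shift_count(sectors_per_cluster):
    """Model of the @power2 loop: number of left shifts for one cluster."""
    if sectors_per_cluster == 0:
        raise ValueError("cluster size of zero is an error")  # beq .cfe_end
    shifts = 8                       # ldx #8
    value = sectors_per_cluster      # lda sd_cluster_size
    while True:
        shifts += 1                  # inx
        carry = value & 1            # lsr moves bit 0 into carry
        value >>= 1
        if carry:                    # bcc @power2 loops while carry is clear
            return shifts


for sectors in (1, 2, 4, 8, 16, 32, 64, 128):
    print(sectors, shift_count(sectors), 1 << shift_count(sectors))
```

For 8 sectors per cluster the count is 12, and 2 to the power 12 is 4096 bytes.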

Let’s get into the meat of the code for loading a file. As noted previously my implementation only supports files in the root directory. I might later extend it to support sub-directories, but for most usage the root directory limitation should not be an issue. I also only support short file names, which must be stored in sd_file_name in the 8+3 format: the name padded with spaces to 8 characters followed by the 3 character extension, without the dot, e.g. “boot    prg” for a file named boot.prg.

sd.load_file    jsr sd.load_root                        ; load first root sector                        
                bcc .stlb_end                           ; carry clear = failure 

                
@check_sector   ldx #(SD_SECTOR_SIZE/F32_ENTRY_SIZE)    ; number of entries per sector (is 16 assuming 512 byte sectors)
                lda #<sd_data_buffer
                sta zp_source+0
                lda #>sd_data_buffer
                sta zp_source+1

                
@cmp_attribute  ldy #11                                 ; file attribute 
                lda (zp_source),y
                and #%00011000                          ; check volume label or directory (will include LFN)
                bne @next_entry                         ; ... then skip                         
                dey
                
@cmp_file_name  lda sd_file_name,y                      ; check short name only (11 chars)
                cmp (zp_source),y
                bne @next_entry 
                dey 
                bpl @cmp_file_name
                bmi @found_file                         ; if all letters match file name we have the right entry 

                
@next_entry     clc                                     ; advance to the next entry in the root 
                lda zp_source+0
                adc #F32_ENTRY_SIZE
                sta zp_source+0
                bcc @move_entry_end
                inc zp_source+1
@move_entry_end                 
                dex                                     ; have we checked this entire sector? 
                bne @cmp_attribute
                
@load_sector    jsr sd.load_rootx                       ; load next sector                      
                bcs @check_sector                       ; carry clear = failure or end of root  
                                
.stlb_error     lda #SD_FILE_ERR                        ; return file error code 
                sta sd_cmd_dat+0 
.boot_error     clc                                     ; clear carry to indicate failure 
                rts      

Hopefully this is straightforward. Bytes 0-10 of a directory entry contain the short file name and are compared against our target file name stored in sd_file_name. Perhaps the only thing to comment on is the file attribute in byte 11, which allows us to skip the comparison if the entry is a directory, long file name, or similar. Any of those entries will have bits 3 and/or 4 of the attribute set. The code simply goes through each sector of the root and loads a new sector when needed. When a match is found it breaks out of the loop; if no match is found by the end of the root directory it returns an error.
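
The same scan over one 512 byte sector can be sketched in Python as below. The helper names (short_name, make_entry, find_entry) and the three sample entries are invented for the example. The mask is the same %00011000 used in the assembly.

```python
ENTRY_SIZE = 32
SKIP_MASK = 0b00011000   # bit 3 = volume label (also set for LFN), bit 4 = directory


def short_name(name, ext):
    """Build the 11 byte 8+3 name: space padded, no dot."""
    return name.ljust(8) + ext.ljust(3)


def make_entry(name_8_3, attribute):
    """Build a mostly empty 32 byte entry for the demonstration."""
    entry = bytearray(ENTRY_SIZE)
    entry[0:11] = name_8_3
    entry[11] = attribute
    return bytes(entry)


def find_entry(sector, name_8_3):
    """Return the offset of the matching file entry, or None."""
    for offset in range(0, len(sector), ENTRY_SIZE):
        entry = sector[offset:offset + ENTRY_SIZE]
        if entry[11] & SKIP_MASK:        # label, LFN or directory: skip
            continue
        if entry[0:11] == name_8_3:      # compare all 11 name bytes
            return offset
    return None


sector = b"".join([
    make_entry(b"A".ljust(11, b"\x00"), 0x0F),      # long file name entry
    make_entry(short_name(b"DOCS", b""), 0x10),     # directory
    make_entry(short_name(b"BOOT", b"PRG"), 0x20),  # the file we want
]).ljust(512, b"\x00")

print(find_entry(sector, short_name(b"BOOT", b"PRG")))
```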

The root directory is loaded sector by sector using the following function. Calling sd.load_root loads the first sector and subsequent calls to sd.load_rootx load the next sector, returning carry clear once we have gone through all sectors of the root. Note it uses the zp_temp+5 to zp_temp+7 variables to keep track of where it is and those cannot be clobbered between calls. It also assumes that no other SD card commands or similar are called, so the sd_cmd_dat structure is unchanged between calls. As a reminder I have allocated 512 bytes starting at sd_data_buffer to contain the sector data loaded by calling sd.get_block.

sd.load_root    ldx #3
@root_cluster   lda sd_root_cluster,x      ; copy cluster number of root to zp_temp 0-3
                sta zp_temp,x
                dex 
                bpl @root_cluster
                jsr sd.get_fchain          ; get fat chain for root folder  
                bcc .sdlf_end              ; error if carry is clear 
                
                ldy #0                     ; first cluster   
                
.sdlr_sector1   ldx #0                     ; store cluster number 
@copy_cn        lda sd_fat_chain,y 
                sta zp_temp,x 
                iny 
                inx
                cpx #4
                bne @copy_cn
                
                jsr .chk_fat_entry         ; check if we are at end of file 
                bcs .clear_carry           ; ... if so return carry clear 
                
                sty zp_temp+7              ; store/update index to next entry 
                lda sd_cluster_size        ; number of sectors to load per cluster  
                sta zp_temp+5
                                
                jsr .cluster_adr           ; convert cluster number in zp_temp to address (returned in sd_file_sector)
@load_fsector   ldx #<sd_file_sector       ; get sector in this cluster of file                                    
                ldy #>sd_file_sector 
                jmp sd.get_block
                ; rts 


sd.load_rootx   ldy zp_temp+7              ; index to current entry in fat chain (if we need to load a new cluster)     
                dec zp_temp+5              ; sectors left in this cluster?
                beq .sdlr_sector1          ; ... if not get next cluster  
                                
@next_sector    ldx #<(sd_cmd_dat+1)       ; add 512 to current sector/load address                                                                                                                                               
                ldy #>(sd_cmd_dat+1)     
                jsr m.add2w_512 

@load_sector    jmp sd.get_blockx          ; load next sector                      
                ; rts   

The root directory can be spread over multiple clusters, similar to a file, and as such has its own FAT chain. The following function will find the chain of any file (or directory) given the first cluster number in zp_temp (4 bytes). It returns the chain in the sd_fat_chain buffer. Note that since I only have 32K RAM the maximal cluster chain length of a file will be 4 * 32K / 512 = 256 bytes (4 bytes per entry; the smallest cluster size is 512 bytes or one sector). Most cards will have cluster sizes greater than 1 sector, but for worst case I have allocated 256 bytes of buffer for the cluster chains.

sd.get_fchain   ldx #3                ; copy first cluster number (4 bytes) into fat chain    
@first_entry    lda zp_temp,x 
                sta sd_fat_chain,x 
                dex 
                bpl @first_entry
                                
                ldx #4 
@calc_sector    stx zp_temp+6         ; index to entry in fat-chain table (XR = 4 first time we get here) 
                                
                ldx #2 
                jsr m.shift2w         ; multiply by 4 to get address offset to the right fat (4 bytes per cluster entry)  
                
                lda zp_temp+0         ; index into sector where this cluster numbers entry is located
                sta zp_temp+4         ; temp+4/temp+5 = lo/hi byte of offset, range 0-508
                lda zp_temp+1
                and #%00000001 
                sta zp_temp+5                    
                
                lda #$00              ; truncate to get offset to address of start of fat sector 
                sta zp_temp+0                    
                lda zp_temp+1
                and #%11111110 
                sta zp_temp+1                    
                        
                ldx #<sd_fat_sector   ; copy fat sector start address ... 
                ldy #<sd_file_sector                                            
                jsr m.move2w_osd 
                
                ldx #<sd_file_sector  ; ... add offset to the right fat sector for this entry                                                                                 
                ldy #>sd_file_sector 
                jsr m.add2w
                
                ldx #3  
@sector_loaded  lda sd_file_sector,x  ; did we just load this sector? (compare to address in sd command data)
                cmp sd_cmd_dat+1,x 
                bne @load_fat_sctr    ; ... if not then load the new one 
                dex 
                bpl @sector_loaded    ; ... otherwise no need to load it again (next cluster in the chain is often the sequential next one)
                
@set_source     clc                   ; location of next cluster number in FAT sector data 
                lda #<sd_data_buffer
                adc zp_temp+4         ; low byte of index into FAT sector 
                sta zp_source+0
                lda #>sd_data_buffer
                adc zp_temp+5         ; hi byte of index into FAT sector 
                sta zp_source+1
                
                ldx zp_temp+6         ; index to entry in fat-chain table
                ldy #0
                                         
@get_cluster    lda (zp_source),y     ; get next cluster number in chain 
                sta sd_fat_chain,x    ; store in next location in FAT chain 
                sta zp_temp,y 
                inx 
                beq @error            ; overflow in fat chain means file is too long
                iny
                cpy #4
                bne @get_cluster         
                
                jsr .chk_fat_entry    ; check if normal cluster value or EOF  
                bcc @calc_sector      ; carry clear = normal cluster -> follow the chain 
                rts     
                
              
@load_fat_sctr  ldx #<sd_file_sector  ; get fat sector that holds the next cluster of this file chain                    
                ldy #>sd_file_sector 
                jsr sd.get_block
                bcs @set_source       ; no error -> continue  
                
                
@error          ldx #3                ; if overflow we have an issue  
                lda #$FF
@eof_loop       sta sd_fat_chain,x    ; store end-of-file marker as first cluster @ error 
                dex 
                bpl @eof_loop
 
                clc                   ; clear carry to indicate error                 
@end            rts   

The first thing this code does is to copy the start cluster value into the FAT chain we are creating. The start cluster of a file is obviously the first one we will load. Then it calculates where in the FAT to look for the chain we are after. If e.g. the start cluster is 312, then its entry is located 312*4 = 1248 = $4E0 bytes into the FAT. That will be in the 3rd sector of the FAT (first sector contains bytes 0-511, second 512-1023, third 1024-1535, etc.) and at offset 224 in that sector (1248-1024=224). So in this example the address of the FAT sector we are after is fat_sector_address + 1024 and the number of the next cluster in the chain will be the 4 byte value located at offset 224.

The code then simply loads that sector and begins building the chain by reading the value at the offset we calculated (224 in this example) and saves it in the next spot in the sd_fat_chain buffer/table. It then uses this value to go find the next cluster in the chain, etc. Since clusters in a chain are often consecutive I check to see if the next cluster is in the sector we already loaded; if it is, there is no need to load it again from the SD card.

If we for some reason have an error I store the end-of-file marker in the first slot of the chain, which would indicate that there are no clusters in the chain.
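
The offset arithmetic from the cluster 312 example can be checked with a few lines of Python. This is a sketch only: locate_fat_entry and the FAT_START value are made up, and the masks mirror the and #%11111110 / and #%00000001 steps in the assembly.

```python
SECTOR_SIZE = 512


def locate_fat_entry(cluster, fat_start_address):
    """Return (address of the FAT sector, offset of the entry inside it)."""
    byte_offset = cluster * 4                            # 4 bytes per entry
    sector_part = byte_offset & ~(SECTOR_SIZE - 1)       # multiple of 512
    offset_in_sector = byte_offset & (SECTOR_SIZE - 1)   # 0..508
    return fat_start_address + sector_part, offset_in_sector


FAT_START = 0x00404000  # hypothetical address of the first FAT sector

address, offset = locate_fat_entry(312, FAT_START)
print(hex(address - FAT_START), offset)
```

Two clusters whose entries give the same sector address share a FAT sector, which is the case the assembly detects to avoid reloading.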

Every time a cluster number is read we need to check if it is an actual cluster number or if we have reached the end of the file. Valid data cluster numbers run from $00000002 to $0FFFFFEF, a bad cluster is indicated by $?FFFFFF7, and values higher than that indicate end-of-file. The following code does that check and returns carry clear for a normal cluster number.

.chk_fat_entry  lda zp_temp+3                
                cmp #$0F            ; High byte >= $0F?
                bcc .cfe_end        ; ... if not then normal cluster number 
                
                lda zp_temp+2       ; middle bytes are $FF for special/reserved values
                cmp #$FF
                bne .cfe_end        ; carry is clear here, since value < $FF
                lda zp_temp+1                            
                cmp #$FF
                bne .cfe_end
                
                lda zp_temp+0
                cmp #$F0            ; highest valid cluster value is $EF 

.cfe_end        rts   

Zero indicates a free cluster and would not be a valid value for a file chain (note to self: I really should check for this as well and return an error). Technically, when carry is set I should also check that the value really is the end-of-file marker, but since a bad cluster cannot be part of a file chain we should be OK here.
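
For comparison, here is a Python sketch of a stricter classification that also covers the free and bad cluster cases mentioned above. It is not what the 6502 routine does (that one only separates values below $0FFFFFF0 from the rest). The function name and the returned labels are invented, and the ranges follow the usual FAT32 conventions.

```python
def classify_fat_entry(value):
    """Classify a 32-bit FAT32 entry (only the low 28 bits count)."""
    value &= 0x0FFFFFFF
    if value == 0:
        return "free"        # not valid inside a file chain
    if value == 1:
        return "reserved"
    if value <= 0x0FFFFFEF:
        return "next"        # normal cluster number: follow the chain
    if value <= 0x0FFFFFF6:
        return "reserved"
    if value == 0x0FFFFFF7:
        return "bad"         # bad cluster marker
    return "eof"             # $0FFFFFF8..$0FFFFFFF: end of chain


for sample in (0x00000004, 0x0FFFFFFF, 0x0FFFFFF7, 0x00000000):
    print(hex(sample), classify_fat_entry(sample))
```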

We can now find the cluster chain for the root directory (and any file or directory) which allows us to cycle through its entries and look for the desired file name. Once we find the right entry we can use it to extract the first cluster of the file. The number of the first cluster in a file is stored in bytes 26, 27, 20 and 21 of the directory entry and the 4 byte file size is stored in bytes 28-31 ($1C-$1F). The following code extracts this information once we have found a match, and creates the cluster chain of that file.

@found_file     ldy #$1F                    ; file size 
                ldx #3                                  
@get_file_size  lda (zp_source),y
                sta sd_file_size,x  
                dey 
                dex 
                bpl @get_file_size

                lda sd_file_size+3          ; check if file size is over max limit 
                ora sd_file_size+2          ; high word must be zero 
                bne .stlb_error         
@get_file_pages ldx sd_file_size+1          ; calculate file size in number of pages (1 page = 256 bytes) 
                lda sd_file_size+0
                beq @set_file_pages         ; do we have remaining bytes, beyond number of full pages?  
                inx                         ; ... then increase page number by one 
                cpx #(>SD_MAX_FILESIZE)+1   ; more pages than max size allows? 
                bcs .stlb_error 
@set_file_pages stx sd_file_pages           ; number of pages in file (remaining to load)

        
@first_cluster  ldy #$1A                    ; low word of first cluster (bytes 26, 27)  
                lda (zp_source),y
                sta zp_temp+0
                iny  
                lda (zp_source),y
                sta zp_temp+1
                ldy #$14                    ; high word of first cluster (bytes 20, 21)  
                lda (zp_source),y
                sta zp_temp+2
                iny  
                lda (zp_source),y
                sta zp_temp+3
                                
@get_fchain     jsr sd.get_fchain           ; get fat chain of clusters for this file 
                bcc .sdlf_end               ; error if carry is clear 

Note that I also do a check to see if the file size is larger than 32K and return an error if it is. Since my system only has 32K there is no reason to load a larger file. In addition I calculate the number of pages (1 page = 256 bytes) to load. The reason for this will be clear once we go through the function I use to load the actual file into the system.
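
The page calculation is just a rounded-up division by 256, as this Python sketch shows. MAX_FILE_SIZE is an assumed stand-in for SD_MAX_FILESIZE (its real value is not given here) and the function names are invented.

```python
MAX_FILE_SIZE = 0x8000  # assumption: a 32K limit, standing in for SD_MAX_FILESIZE


def pages_needed(file_size):
    """Number of 256 byte pages needed to hold file_size bytes."""
    pages = file_size >> 8          # full pages = high byte(s) of the size
    if file_size & 0xFF:            # leftover bytes need one more page
        pages += 1
    return pages


def check_size(file_size):
    """Reject files above the assumed limit, else return the page count."""
    if file_size > MAX_FILE_SIZE:
        raise ValueError("file too large to load")
    return pages_needed(file_size)


print(check_size(5000))   # 19 full pages plus 136 leftover bytes
print(check_size(512))
```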

Finally we are ready to load the actual data from the file into memory. But since this blog post is already too long I will do this in the next one. As always feel free to comment, ask questions and point out mistakes, etc.

Reading FAT data Part 1

OK, with the SD card initialized in SPI mode it is finally time to read some data. To do this we need to understand the FAT format and I found the following links helpful as I was getting into the specifics. Note in my implementation I only support FAT32.

As mentioned in the previous posts the data is stored in 512 byte blocks on the SD card, which is the smallest increment of data you can read. To get a block of data from the SD card you can send it command 17 with the argument being the address of the block (4 byte address). Once the command is sent we get the usual R1 response and then have to wait until the card is ready to transmit data, which it indicates by sending the OK token ($FE). The following function accomplishes this (and calls the sd.send_cmd function described in my previous post).

.sd_block_cmd   stx zp_source+0       ; low byte in XR
                sty zp_source+1       ; high byte in YR

                ldy #3                ; move block address to command buffer 
@set_cmd_data   lda (zp_source),y
                sta sd_cmd_dat+1,y 
                dey 
                bpl @set_cmd_data
                
.sd_block_cmdx  lda #17               ; entry point if sd_cmd_dat structure set directly 
                jsr sd.send_cmd       ; command 17 = read a single block (arg = block number)
                bne @end  
                
                ldy #0
@wait_data      jsr spi.get_byte      ; wait until card is ready to send block data 
                cmp #$FF 
                bne @return
                dey
                bne @wait_data
                
@return         cmp #$FE              ; did we get 'ok' token ($FE) or error?
@end            rts    

The function needs a pointer to the 4 byte block address passed in XR/YR and will return ‘equal’ (Zero flag set) on success. Assuming that it returns without an error we are now ready to read 512 bytes of data from the specified block on the SD card. The following function reads the 512 bytes of data into a memory buffer that in my current implementation is $0300 – $04FF. To change it simply set sd_data_buffer to the location in memory you want to use.

sd.get_block    jsr .sd_block_cmd     ; send 'read block' command to SD card          
                bne .sdgb_error

.sd_get_bdata   ldx #0 
                
@first_256      lda VIA1_IFR          ; check IRQ flag
                and #%00000100        ; check for SR flag  
                beq @first_256          
                lda VIA1_SR           ; get data 
                sta sd_data_buffer,x 
                inx 
                bne @first_256
                                
                ; ldx #0              ; read last 256 bytes into buffer 
@last_256       lda VIA1_IFR          ; check IRQ flag
                and #%00000100        ; check for SR flag  
                beq @last_256           
                lda VIA1_SR           ; get data 
                sta sd_data_buffer+$100,x 
                inx 
                bne @last_256
                
                jsr spi.get_byte      ; get CRC 
                jsr spi.get_byte                                        

                sec                   ; set carry on success 
                rts 
                

sd.get_blockx   jsr .sd_block_cmdx    ; address already set in sd_cmd_dat structure
                beq .sd_get_bdata

.sdgb_error     lda #SD_FILE_ERR      ; return file error code 
                sta sd_cmd_dat+0 
                clc                   ; clear carry on failure  
                rts 

I have chosen not to call my generic spi.get_byte function while reading the data, but implement it inline for speed, and I also have two 256 byte loops to read the 512 bytes of data for the same reason. If you want to optimize for code size you could call spi.get_byte from a single loop instead, but I have optimized for speed of reading the data. Note the small optimization of not setting XR = 0 in the second loop, since it will have that value exiting the first loop. Once the 512 bytes are received the card sends a two byte CRC code which I simply ignore, but if you want error detection you could implement logic for that.

The loop takes 22 cycles to complete each pass, which is slightly longer than the 16 cycles it takes to read a byte using the system clock to drive the shift register clock on the VIA. This implementation has the advantage of being able to work at different clock speeds, since it waits for the flag to be set, but if you know you are running at a fixed speed you could optimize it to run ~25% faster, e.g. with the following:

@first_256      lda VIA1_SR             
                sta sd_data_buffer,x
                nop        
                inx 
                bne @first_256  

I have not tested this yet, and at exactly 16 cycles the timing might be too tight.

The first data we want to read from a FAT formatted card is the Master Boot Record (MBR). The MBR is located on block 0. To get the address of any block we have to multiply by the block size (512 or $200) … of course for block 0 this is still $00000000 😉

sd.load_mbr     lda #$00                     ; load block 0 on SD card = MBR 
                sta sd_cmd_dat+4                                        
                sta sd_cmd_dat+3                                        
                sta sd_cmd_dat+2                                        
                sta sd_cmd_dat+1 
                jsr sd.get_blockx

@boot_adr       lda #$00                     ; calculate & store boot sector address (little endian)  
                sta sd_boot_sector+0         ; address = LBA * $200 (sector size)
                lda sd_data_buffer+$1BE+8    ; location of first partition info ($1BE) + LBA (#sectors between MBR and first sector in partition) 
                asl 
                sta sd_boot_sector+1
                lda sd_data_buffer+$1BE+9 
                rol 
                sta sd_boot_sector+2
                lda sd_data_buffer+$1BE+10 
                rol 
                sta sd_boot_sector+3

If you want more info on what is stored on the MBR there is a quick overview here and more info in the Wikipedia article linked above. I am assuming that we only have one partition (or at least my current implementation will only support the first partition on the card) and the only piece of information we are after in the MBR is the address of the boot sector. The first partition info starts at offset $1BE and the number of blocks/sectors between the MBR and the boot sector is stored as a 4 byte value starting in byte 8 of the partition info. Multiply this value by 512 and you have the boot sector address.
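
In Python the same lookup is a one-liner with struct, which makes a handy cross-check when debugging with a card image on a PC. This is a sketch: the function name and the sample LBA of 8192 are made up. Note that the assembly only uses the low 3 bytes of the LBA, which is enough as long as the resulting address fits in 32 bits.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 0x1BE   # first partition entry in the MBR


def boot_sector_address(mbr):
    """Byte address of the first partition's boot sector."""
    (lba,) = struct.unpack_from("<I", mbr, PARTITION_TABLE_OFFSET + 8)
    return lba * SECTOR_SIZE


# Made-up MBR: everything zero except the LBA of the first partition.
mbr = bytearray(SECTOR_SIZE)
struct.pack_into("<I", mbr, PARTITION_TABLE_OFFSET + 8, 8192)

print(hex(boot_sector_address(bytes(mbr))))
```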

I should really also check that this partition is active and that it is FAT32 formatted, but for now I will leave that up to the reader.

The boot sector contains more information that we will need to read files from the SD card. Most importantly we want to find the location of the File Allocation Table (FAT) and its length. We also need the number of sectors per cluster for the FAT formatting of the card. If you have formatted a USB stick you might have noticed an “allocation unit size” which is the smallest unit size for the file format. It will be a power of 2 multiplier of the SD block size, e.g. an allocation unit size of 4096 bytes is 8 * 512. This means that the smallest size of FAT data you can write to the card in this example is 4096 bytes or 8 blocks of 512 bytes. Every file will be saved as a whole number of clusters with any remaining bytes “wasted”, so a 5000 byte file will be 2 clusters long, in this example, with 3,192 bytes unused in the second cluster. What a luxury on an 8 bit system, but obviously FAT32 was designed for systems where a couple of thousand bytes can be considered a pittance 🙂
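
The 5000 byte example works out as follows in Python (a tiny sketch with made-up function names, using the 4096 byte allocation unit from the text as the default):

```python
def clusters_used(file_size, cluster_size=4096):
    """Whole number of clusters needed for file_size bytes (rounded up)."""
    return -(-file_size // cluster_size)   # ceiling division


def slack_bytes(file_size, cluster_size=4096):
    """Bytes allocated to the file but not used by it."""
    return clusters_used(file_size, cluster_size) * cluster_size - file_size


print(clusters_used(5000), slack_bytes(5000))
```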

In addition to the location of the FAT we also want to find the location of the root sector where information about file and folder names, etc. is stored. This location is not directly stored in the boot sector, but can be calculated since it starts immediately after the FAT sectors. For error correction etc. there is more than one FAT (typically 2), so you need to multiply the length of a FAT by the number of FATs to find the root sector location. Lastly we want the root cluster number, which typically is 2 but could be a different value. We need this since the FAT stores the cluster numbers of files and we will have to convert those cluster numbers to addresses of blocks on the SD Card.

sd.load_bsctr   ldx #<sd_boot_sector       ; get boot sector 
                ldy #>sd_boot_sector 
                jsr sd.get_block
                bcc .stlb_end              ; carry clear = failure         

@sector_size    lda sd_data_buffer+$0B     ; check that sector size is 512 
                bne .init_sd.error                      
                lda sd_data_buffer+$0C           
                cmp #$02
                bne .init_sd.error                      

@reserved       lda sd_data_buffer+$0E     ; sectors to first FAT (number of reserved sectors)
                sta zp_temp+0   
                lda sd_data_buffer+$0F                  
                sta zp_temp+1   
                lda #$00
                sta zp_temp+2   
                jsr m.mult512              ; convert from sector length to address length 
                
                ldx #<sd_boot_sector       ; copy boot sector address to fat sector address... 
                ldy #<sd_fat_sector             
                jsr m.move2w_osd 
                
                ldx #<sd_fat_sector        ; ... and add reserved sector length                            
                ldy #>sd_fat_sector 
                jsr m.add2w

                ldx #3
@fat_length     lda sd_data_buffer+$24,x   ; sector length of FAT (for FAT32)
                sta sd_fat_length,x 
                sta zp_temp,x   
                dex 
                bpl @fat_length
                jsr m.mult512              ; convert from sector length to address length  

                ldx #<sd_fat_sector        ; copy fat sector address to root sector address... 
                ldy #<sd_root_sector            
                jsr m.move2w_osd                

                lda sd_data_buffer+$10     ; number of FAT's 
                sta zp_temp+5
@add_fat_length ldx #<sd_root_sector       ; add FAT length once per FAT (number of FATs in zp_temp+5)             
                ldy #>sd_root_sector 
                jsr m.add2w
                dec zp_temp+5
                bne @add_fat_length             
                
@cluster_size   lda sd_data_buffer+$0D     ; sectors / cluster 
                sta sd_cluster_size
                               
                ldx #3 
@root_cluster   lda sd_data_buffer+$2C,x   ; root cluster number (typically 2, but is theoretically a 2word)
                sta sd_root_cluster,x 
                dex
                bpl @root_cluster 
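
As a cross-check of the address arithmetic in sd.load_bsctr, here is a Python sketch that reads the same boot sector offsets ($0B, $0D, $0E, $10, $24, $2C) and derives the FAT and root addresses. The function name, the returned dictionary keys and all sample values are invented for the example.

```python
import struct

SECTOR_SIZE = 512


def parse_boot_sector(boot, boot_address):
    """Extract the fields used above and derive the FAT/root addresses."""
    (bytes_per_sector,) = struct.unpack_from("<H", boot, 0x0B)
    if bytes_per_sector != SECTOR_SIZE:
        raise ValueError("only 512 byte sectors are supported")
    (reserved_sectors,) = struct.unpack_from("<H", boot, 0x0E)
    (fat_length,) = struct.unpack_from("<I", boot, 0x24)     # in sectors
    (root_cluster,) = struct.unpack_from("<I", boot, 0x2C)
    number_of_fats = boot[0x10]
    fat_address = boot_address + reserved_sectors * SECTOR_SIZE
    root_address = fat_address + number_of_fats * fat_length * SECTOR_SIZE
    return {
        "sectors_per_cluster": boot[0x0D],
        "fat_address": fat_address,
        "root_address": root_address,
        "root_cluster": root_cluster,
    }


# Made-up boot sector: 8 sectors/cluster, 32 reserved sectors, 2 FATs of
# 1000 sectors each, root directory in cluster 2.
boot = bytearray(SECTOR_SIZE)
struct.pack_into("<H", boot, 0x0B, 512)
boot[0x0D] = 8
struct.pack_into("<H", boot, 0x0E, 32)
boot[0x10] = 2
struct.pack_into("<I", boot, 0x24, 1000)
struct.pack_into("<I", boot, 0x2C, 2)

info = parse_boot_sector(bytes(boot), 0x00400000)
print(hex(info["fat_address"]), hex(info["root_address"]))
```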

I created a couple of helper math routines to help with multiplying and adding 4 byte (double word) numbers. Nothing special about these, but just for completeness they are listed below.

I will stop this post here. I had intended to cover everything needed to load a file, but it is already getting too long and I am late in publishing. So the next blog post will cover the FAT and root sector, and finally we will load a file. However if you are impatient everything is in the complete code listing with a decent number of comments and you can always ask questions in the comments.

;--------------------------------------------------------------------------
; copy 2word address to another location, both in os_data range   
; source/dest low byte pointer in XR/YR 
;--------------------------------------------------------------------------

m.move2w_osd    lda #>os_data                           ; both source and destination in the same page range 
m.move2w        sta zp_source+1
                sta zp_dest+1

                stx zp_source+0                         ; low byte of source/dest       
                sty zp_dest+0

                ldy #3
@move           lda (zp_source),y 
                sta (zp_dest),y 
                dey 
                bpl @move
                
                rts 
                
                
;--------------------------------------------------------------------------
; add 2word zp_temp to (zp_source) and store in (zp_source). 
; pointer to other 2word in XR/YR 
;--------------------------------------------------------------------------

m.add2w_512     lda #$00
                sta zp_temp+0                           ; store 512 in zp_temp
                sta zp_temp+2
                sta zp_temp+3
                lda #$02 
                sta zp_temp+1 

                
m.add2w         stx zp_source+0                         ; low byte in XR
                sty zp_source+1                         ; high byte in YR               
                ldy #0                                  ; note: cannot run a loop, since cpy would affect flag 

                clc
                lda (zp_source),y 
                adc zp_temp+0
                sta (zp_source),y 
                iny
                lda (zp_source),y
                adc zp_temp+1
                sta (zp_source),y 
                iny
                lda (zp_source),y
                adc zp_temp+2 
                sta (zp_source),y 
                iny
                lda (zp_source),y
                adc zp_temp+3 
                sta (zp_source),y

                rts 


;--------------------------------------------------------------------------
; left shift 2word in zp_temp. number of shifts in XR 
;--------------------------------------------------------------------------     

m.shift2w       asl zp_temp+0           
                rol zp_temp+1
                rol zp_temp+2
                rol zp_temp+3 
                dex 
                bne m.shift2w 
                rts 

                
;--------------------------------------------------------------------------
; multiply 2word in zp_temp with 512 
;--------------------------------------------------------------------------

m.mult512       asl zp_temp+0           
                rol zp_temp+1
                rol zp_temp+2
                lda zp_temp+2 
                sta zp_temp+3 
                lda zp_temp+1 
                sta zp_temp+2 
                lda zp_temp+0 
                sta zp_temp+1 
                lda #$00                                 
                sta zp_temp+0                                           
                rts
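
To see why m.mult512 works, here is a byte-level Python model of it (a sketch for checking values on a PC, not a replacement for the routine). It shifts the low three bytes left by one bit and then moves every byte up one position, which is a multiplication by 2 * 256 = 512. Like the assembly, it ignores whatever was in the top byte, so the input must be below 2 to the power 23.

```python
def mult512(value):
    """Model of m.mult512 acting on the four little-endian bytes of zp_temp."""
    b = list(value.to_bytes(4, "little"))                # zp_temp+0 .. zp_temp+3
    shifted = (b[0] | (b[1] << 8) | (b[2] << 16)) << 1   # asl / rol / rol
    b[3] = (shifted >> 16) & 0xFF                        # zp_temp+2 -> zp_temp+3
    b[2] = (shifted >> 8) & 0xFF                         # zp_temp+1 -> zp_temp+2
    b[1] = shifted & 0xFF                                # zp_temp+0 -> zp_temp+1
    b[0] = 0                                             # low byte becomes zero
    return int.from_bytes(bytes(b), "little")


for sectors in (1, 32, 1000):
    print(sectors, mult512(sectors))
```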