Accessing the SD Card

Now that we can send and receive bytes using the SPI protocol, it is time to begin accessing the SD card itself. In this post I will go through what is needed to initialize the SD card, and in the next post I will go into the FAT format so we can read files from it. My implementation is not fully general, but it should support most newer cards, and you can find the entire code here.

I found this series of blog posts from Lucky Resistor helpful in understanding the SD commands, responses, and data format. The pictures were especially helpful in getting an overview of what is being sent back and forth and I would highly encourage you to read Part 2 and Part 3 in combination with this post of mine. There is also a great overview here that would be a good pre-read to this post and the whole initialization process is summarized in this flow chart.

The procedure to initialize the SD Card is as follows

  • Put card into SPI mode by sending at least 74 clock pulses
  • Send Command 0 repeatedly until card is idle (resets)
  • Send Command 8 and check for the right response
  • Send ACMD41 command to initialize newer / high capacity cards
  • Send Command 58 to check working voltage range
  • Finally send Command 16 to ensure block size is 512 bytes

The first step is sending at least 74 clock pulses (I send 80, as the shift register triggers 8 clock pulses every time we send a byte). For this first step you have to set CS (slave select for the SD card) high, and MOSI should also be 1 during this procedure. For all “normal” commands you should of course select the SD card by pulling CS low, but the first step is different. The following code achieves this (see SPI Part 2 for spi.send_byte etc., or you can find the full code here):

sd.init         jsr spi.set_output              ; setup for SPI output 
                lda #(SPI_OUTPUT+SPI_NOSS)      ; set SR to output and SD card CS = 1 (special first time init)
                sta VIA1_PORTB                
                                
                ldx #10                         ; send 80 clock signals to initialize SPI mode for SD card      
@init0          lda #$FF                        ; MOSI = 1 and CS = 1 
                jsr spi.send_byte
                dex 
                bne @init0

Note that this initialization should be run at an SPI clock speed of 100-400 kHz. You can subsequently (on newer cards) increase the speed up to about 25 MHz. Check my previous post for notes on how to vary the SPI clock speed and how to set it to the maximum of 1/2 the system clock speed.

Sending a command to the SD card follows a 6 byte format. The first byte is the command number with bit 7 = 0 and bit 6 = 1. E.g. for command 8 the first byte is $48. This is then followed by 4 bytes that are the argument for the command, and finally a CRC (error checking) byte. The CRC byte is optional for SPI mode, with some exceptions that we cover later. Before sending a command you should check that the SD card is idle, indicated by receiving $FF. After sending the command the SD card will send one of several responses, depending on the command. The first byte of the response (at least for the commands used here) will contain several flags and have bit 7 = 0. For many commands this is the only response (denoted R1).

sd.send_cmd     ora #$40            ; set bit 6 = 1 to mark a command byte
                sta sd_cmd_dat+5    ; store at end of command data (we send from end due to dex)
                
@set_crc        ldx #$FF            ; standard CRC for most commands                
                cmp #$48            ; cmd8 has a different CRC
                bne @store_crc                           
                ldx #$87            ; crc for cmd8                 
@store_crc      stx sd_cmd_dat+0    ; store at start of command data (we send from end due to dex)


@check_idle     jsr spi.set_input
                jsr spi.get_byte
                cmp #$FF            ; idle = $FF
                bne .sdcmd_error    ; not idle: return negative value (bit 7 set)
                
@wait           lda VIA1_IFR        ; check IRQ flag
                and #%00000100      ; check for SR flag  
                beq @wait
                

sd.send_cmdx    jsr spi.set_output  ; setup SPI to output 
                                                
                ldx #5
@send_cmd       lda sd_cmd_dat,x 
                jsr spi.send_byte 
                dex 
                bpl @send_cmd

       
sd.get_response jsr spi.set_input

                ldx #10             ; wait for response (should be within 8 bytes)             
@get_byte       jsr spi.get_byte
                bpl .sdcmd_return   ; first bit of first byte of response will always be 0  
                dex                 ; retry 
                bne @get_byte   

.sdcmd_error    ora #$80            ; return negative value on error 
                
.sdcmd_return   sta sd_cmd_dat+0    ; save return value for later retrieval 
                rts 

A couple of comments on the above. Since a valid R1 response from the SD card always has bit 7 = 0, I return a negative value in case of a failure. You will also notice that the command bytes are stored in the “opposite” sequence (e.g. the first byte to send is stored in the 6th byte of the command data). I do this in general to optimize my indexed loops, so I don't have to do a compare at the end. Doing “dex, bne” saves both time and space over “inx, cpx #val, bne”. Most of the time it is possible to design your code so a loop runs from the max value down to zero.
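To double-check the CRC constants and the reversed storage order, here is a small host-side Python sketch (it is not part of the 6502 code). The helper names crc7 and build_command are made up for this illustration, and the CRC is assumed to be the standard 7-bit command CRC with polynomial x^7 + x^3 + 1.

```python
def crc7(data: bytes) -> int:
    """CRC-7 (polynomial x^7 + x^3 + 1) over the first five command bytes."""
    crc = 0
    for byte in data:
        for shift in range(7, -1, -1):          # feed bits MSB first
            bit = (byte >> shift) & 1
            top = (crc >> 6) & 1
            crc = (crc << 1) & 0x7F
            if bit ^ top:
                crc ^= 0x09
    return crc


def build_command(cmd: int, arg: int) -> bytes:
    """Return the 6 command bytes in the order they go out on the wire."""
    body = bytes([0x40 | (cmd & 0x3F)]) + arg.to_bytes(4, "big")
    return body + bytes([(crc7(body) << 1) | 1])   # CRC in bits 7-1, last bit = 1


cmd0 = build_command(0, 0x00000000)
cmd8 = build_command(8, 0x000001AA)
print(cmd0.hex(" "))   # 40 00 00 00 00 95
print(cmd8.hex(" "))   # 48 00 00 01 aa 87

# Storing the frame reversed lets a loop count X from 5 down to 0
# and still send the bytes in wire order.
stored = cmd8[::-1]
sent = bytes(stored[x] for x in range(5, -1, -1))
assert sent == cmd8
```

The two CRC bytes match the $95 and $87 constants used in the assembly, which is why those two commands can be hardcoded rather than computed on the 6502.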

With the code for sending a command to the SD card in place we can now proceed with the actual initialization process. Firstly (and perhaps most importantly) we send command 0 repeatedly until the card is in idle state. I found that if a previous command was not completed this can take quite a number of retries. For instance, if you reset the computer while there are still hundreds of bytes left to receive from a read command, command 0 needs to be sent dozens of times to get to idle state.

Since the card might not initially be in idle mode we can't wait for the idle byte to be received, as we generally do when sending commands. Therefore I jump to sd.send_cmdx to skip this step in the code below. Also, for the initialization process it is recommended to send the correct CRC byte ($95) for command 0.

                ldy #100             ; number of retries to get card in idle state   
                
@cmd0           lda #0               ; command 0 (go to idle) - send without waiting for idle byte! 
                sta sd_cmd_dat+4                                        
                sta sd_cmd_dat+3                                        
                sta sd_cmd_dat+2                                        
                sta sd_cmd_dat+1                                
                ldx #$95             ; crc for cmd0 
                stx sd_cmd_dat+0     ; store at start of command data (we send from end due to dex)
                ora #$40             ; set the command bit 
                sta sd_cmd_dat+5     ; store at end of command data (we send from end due to dex)
                jsr sd.send_cmdx     ; call send command w/o checking idle 
                cmp #$01                                        
                beq @cmd8            ; returns 1 when in idle state
                
                dey                  ; retry cmd0 
                bne @cmd0       
                beq .init_sd.error
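For reference, the flags in the R1 byte can be decoded as in the following host-side Python sketch. The bit names follow the usual SPI-mode R1 layout as I understand it; decode_r1 is a hypothetical helper and not part of the 6502 code.

```python
R1_FLAGS = (
    "in idle state",         # bit 0
    "erase reset",           # bit 1
    "illegal command",       # bit 2
    "command CRC error",     # bit 3
    "erase sequence error",  # bit 4
    "address error",         # bit 5
    "parameter error",       # bit 6
)


def decode_r1(r1: int) -> list:
    """Return the names of the flags set in an R1 response byte."""
    if r1 & 0x80:
        # Bit 7 is always 0 in a valid R1, so this is not a response at all.
        return ["no valid response"]
    return [name for bit, name in enumerate(R1_FLAGS) if r1 & (1 << bit)]


print(decode_r1(0x01))   # ['in idle state']
print(decode_r1(0x05))   # ['in idle state', 'illegal command']
print(decode_r1(0x00))   # []
```

A response of $01 is what the command 0 loop above waits for, and $00 (no flags set) is what a fully initialized card returns.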

Next step is to “send if cond” or command 8 with a special argument of $000001AA and the correct CRC code of $87. The $AA is a pattern that the card should repeat back on success. It could be any pattern, but $AA is the standard one ($AA = %10101010). If we have a type 2+ card the 4th byte of the response should be our pattern (response to command 8 is the R1 byte plus 4 more). If we have an older card then the initialization process is slightly different and I have chosen not to support this (at least for now).

@cmd8           ;lda #$00                
                ;sta sd_cmd_dat+4                                        
                ;sta sd_cmd_dat+3                                        
                lda #$01                
                sta sd_cmd_dat+2                                        
                lda #$AA               ; pattern for response                          
                sta sd_cmd_dat+1                        
                lda #8                 ; command 8 = send if cond  
                jsr sd.send_cmd                                 
                ;cmp #$01              ; skip error checking as response will trigger error if there is one 
                ;bne @error                             
                jsr spi.get_byte       ; response should be $00, $00, $01, $AA 
                jsr spi.get_byte
                jsr spi.get_byte
                jsr spi.get_byte
                cmp #$AA 
                bne .init_sd.error     

Since the command data already contains zeros in the 3rd and 4th location from command 0, I skip setting them. It’s just second nature for me to optimize for size and speed in 6502 assembly, but I leave the code commented for readability and in case the preceding code should change and the assumption is no longer valid. I also skip checking if we got the right R1 response (should be $01), since if there is an error then we won't get the right pattern back either and that will trigger the error. Again, it is just second nature to save bytes and cycles.
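The structure of the command 8 response can be sketched on a host machine like this. It is a hedged illustration in Python: check_cmd8_response is a hypothetical helper, and unlike the assembly above it also looks at the R1 byte and the voltage nibble, which is an extra check that the 6502 code deliberately skips.

```python
def check_cmd8_response(resp: bytes, pattern: int = 0xAA) -> bool:
    """resp is the R1 byte followed by the 4 trailing response bytes."""
    if len(resp) != 5:
        return False
    r1, _version, _reserved, voltage, echo = resp
    if r1 & 0x04:
        # Illegal command flag: most likely an older card without command 8.
        return False
    # Low nibble $1 = 2.7-3.6 V accepted; last byte echoes our pattern.
    return (voltage & 0x0F) == 0x01 and echo == pattern


print(check_cmd8_response(bytes([0x01, 0x00, 0x00, 0x01, 0xAA])))   # True
print(check_cmd8_response(bytes([0x05, 0xFF, 0xFF, 0xFF, 0xFF])))   # False
```

Checking only the echoed pattern, as the assembly does, catches the same failures in practice, because a card that rejects the command will not return $AA in the last byte.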

With a newer card we then need to send it a special application command to indicate that we support high capacity (SDHC and SDXC) cards. Command 55 tells the card that the next command is an application command. The application command we need is ACMD41 with bit 30 set in its argument ($40000000). This sequence might also have to be repeated a number of times until the SD card has executed the command. When the card is ready the ACMD41 command will return zero.

                ldy #200             ; number of retries until card is ready 
        
@cmd55          lda #$00             ; command arguments 
                sta sd_cmd_dat+4                                        
                sta sd_cmd_dat+3                                        
                sta sd_cmd_dat+2                                        
                sta sd_cmd_dat+1 
                lda #55              ; command 55 = application command   
                jsr sd.send_cmd
                ;cmp #$01
                ;bne @error                             
                
@amc41          lda #$40             ; command arguments 
                sta sd_cmd_dat+4                                        
                ;lda #$00 
                ;sta sd_cmd_dat+3                                       
                ;sta sd_cmd_dat+2                                       
                ;sta sd_cmd_dat+1 
                lda #41              ; ACMD41 when following command 55  
                jsr sd.send_cmd         
                beq @cmd58
                
                dey                  ; retry until card is ready 
                bne @cmd55      

The final steps are to send command 58 (read OCR) to check that the card supports the right voltage supply ranges. This is optional and could be skipped entirely if your supply voltage is 2.7V to 3.6V. I still send the command just in case, but ignore the response. Lastly we need to send command 16 to set the block size to 512 bytes to ensure compatibility with FAT. This is done by setting the argument to $00000200 = 512.

@cmd58          lda #$00 
                sta sd_cmd_dat+4                                        
                ;sta sd_cmd_dat+3                                       
                ;sta sd_cmd_dat+2                                       
                ;sta sd_cmd_dat+1 
                lda #58                  ; command 58 = read OCR    
                jsr sd.send_cmd         
                bne .init_sd.error 
                jsr spi.get_byte         ; don't use this response for anything 
                jsr spi.get_byte         ; it could be used to check that the voltage is ok for this card
                jsr spi.get_byte
                jsr spi.get_byte

                                                        
@cmd16          ;lda #$00 
                ;sta sd_cmd_dat+4                                       
                ;sta sd_cmd_dat+3                                       
                ;sta sd_cmd_dat+1 
                lda #$02                  ; 512 bytes size = $00000200
                sta sd_cmd_dat+2                                        
                lda #16                   ; command 16 = set block size     
                jsr sd.send_cmd         
                bne .init_sd.error   
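If you did want to use the four OCR bytes returned by command 58, they could be decoded roughly as in this host-side Python sketch. The bit positions are the ones I understand the OCR register to use (bit 31 power-up done, bit 30 high capacity, bits 15-23 voltage windows); decode_ocr and the sample value $C0FF8000 are assumptions for illustration, not values read from a real card.

```python
def decode_ocr(ocr_bytes: bytes) -> dict:
    """Decode the 4 OCR bytes that follow the R1 byte of command 58."""
    ocr = int.from_bytes(ocr_bytes, "big")
    return {
        "power_up_done": bool(ocr & (1 << 31)),
        "high_capacity": bool(ocr & (1 << 30)),      # SDHC/SDXC when set
        "supports_3v3": bool(ocr & (0b11 << 20)),    # 3.2-3.3 V or 3.3-3.4 V
    }


sample = bytes([0xC0, 0xFF, 0x80, 0x00])   # hypothetical response
print(decode_ocr(sample))
```

The high capacity bit matters later on: as far as I know, high capacity cards address data in 512 byte blocks, whereas older standard capacity cards use byte addresses.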

With this the card should be initialized and ready to read and write to using the FAT file format. I repeat this initialization each time I begin an operation with the SD card, just in case the card has been removed and/or replaced with another one. There might be a way of detecting that a card has been ejected so that you can limit the initialization to once. If someone knows how to do this I would love to hear it in the comments.
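The whole command sequence can be summarized in a short host-side Python sketch. FakeCard is a hypothetical stand-in for a newer card and only mimics the responses described in this post. The 80 clock pulses and the SPI transfer itself are left out, so this is a sketch of the control flow under those assumptions, not a driver.

```python
class FakeCard:
    """Hypothetical card model: needs a few ACMD41 polls before it is ready."""

    def __init__(self, polls_until_ready=3):
        self.polls_left = polls_until_ready
        self.app_cmd = False
        self.log = []

    def command(self, cmd, arg):
        """Return (r1, extra_bytes) for a command number and 32-bit argument."""
        self.log.append(cmd)
        is_app, self.app_cmd = self.app_cmd, False
        if cmd == 0:
            return 0x01, b""
        if cmd == 8:
            return 0x01, bytes([0x00, 0x00, 0x01, arg & 0xFF])
        if cmd == 55:
            self.app_cmd = True
            return 0x01, b""
        if cmd == 41 and is_app:
            self.polls_left -= 1
            return (0x00 if self.polls_left <= 0 else 0x01), b""
        if cmd == 58:
            return 0x00, bytes([0xC0, 0xFF, 0x80, 0x00])
        if cmd == 16:
            return 0x00, b""
        return 0x05, b""                      # idle + illegal command


def init_card(card, cmd0_retries=100, acmd41_retries=200):
    """Run the command sequence; return True when the card ends up ready."""
    for _ in range(cmd0_retries):             # command 0 until idle
        r1, _extra = card.command(0, 0)
        if r1 == 0x01:
            break
    else:
        return False

    r1, extra = card.command(8, 0x000001AA)   # check the echoed pattern
    if len(extra) != 4 or extra[3] != 0xAA:
        return False

    for _ in range(acmd41_retries):           # command 55 + ACMD41 until ready
        card.command(55, 0)
        r1, _extra = card.command(41, 0x40000000)
        if r1 == 0x00:
            break
    else:
        return False

    r1, _ocr = card.command(58, 0)            # read OCR, response ignored
    if r1 != 0x00:
        return False

    r1, _extra = card.command(16, 512)        # block size 512 bytes
    return r1 == 0x00


card = FakeCard()
print(init_card(card), card.log)
```

The retry counts of 100 and 200 are the same ones used in the assembly; a card that never becomes ready makes init_card return False instead of looping forever.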

The full code, including reading files in the FAT format, can be found here. Feel free to skip ahead if you can't wait for the next post on the FAT file format.

SPI Part 2

In this post we will take a look at the basic SPI code and then subsequently at the more complex SD card and FAT file format implementation. Before we get going some definitions. The way my address decoding logic is set up the VIA registers have the following addresses and names that I will use in the code. Should be easy to modify for your system (see VIA documentation for more details)

VIA1_PORTB      = $F100      ; address of port b
VIA1_PORTA      = $F101      ; address of port a
VIA1_DDRB       = $F102      ; data direction register for port b
VIA1_DDRA       = $F103      ; data direction register for port a
VIA1_T1CL       = $F104      ; timer 1 counter low 
VIA1_T1CH       = $F105      ; timer 1 counter high
VIA1_T1LL       = $F106      ; timer 1 latch low
VIA1_T1LH       = $F107      ; timer 1 latch high
VIA1_T2CL       = $F108      ; timer 2 counter low 
VIA1_T2CH       = $F109      ; timer 2 counter high
VIA1_SR         = $F10A      ; shift register 
VIA1_ACR        = $F10B      ; auxiliary control register 
VIA1_PCR        = $F10C      ; peripheral control register 
VIA1_IFR        = $F10D      ; interrupt flag register 
VIA1_IER        = $F10E      ; interrupt enable register
VIA1_ORA        = $F10F      ; same as port A, except "no handshake"

My initialization code disables the shift register interrupt, sets port B to output and sets the T2 clock frequency (for the fastest speed set SPI_CLKN = 0).

spi.init        lda #%00000100  ; disable shift-register interrupt
                sta VIA1_IER     
        
                lda #$FF        ; set port B to output
                sta VIA1_DDRB
        
                lda #SPI_CLKN   ; set number of counts for T2/clock
                sta VIA1_T2CL

                rts           

You can set an interrupt to trigger when 8 bits have been sent/received, but I have chosen to have the load/save running in the “main thread”. I have certain kernel update routines in IRQ (e.g. updating screen, clock, etc.) that will then interrupt when needed. Hence the SR interrupt is disabled. You could easily choose to do this differently.

Since we are using the shift-register for both MOSI and MISO communication (see my SPI Part 1 post) we need to set up both the SR register and the buffer chip for either output or input. For the SR register we do this by setting bits 2-4 in the VIA auxiliary control register (ACR) to either %001 (shift in under control of T2) or %101 (shift out under control of T2).

The input line (connected to PB0) or the output line (connected to PB1) on the buffer is enabled by pulling it low (obviously only pull one low and the other high … I know I really should implement this in HW). This will enable or disable the respective gates on the buffer.

Finally, pull the SD card's Slave Select low to indicate it is the device we are communicating with (SS1 is connected to PB2). E.g. writing %xxxxxx10 to Port B will disable output (bit 1 = 1) and enable input (bit 0 = 0). And %xx1110xx will set SS1 = 0 to select the SD card and deselect SS2-4 by setting them to 1.

spi.set_input   lda VIA1_ACR
                and #%11100011    ; mask out SR control bits
                ora #%00000100    ; SR in under control of T2 
                sta VIA1_ACR          
     
                lda #%00111010    ; buffer input & SD card CS = 0 
                sta VIA1_PORTB          
                rts 

                
spi.set_output  lda VIA1_ACR
                and #%11100011    ; mask out SR control bits
                ora #%00010100    ; SR out under control of T2
                sta VIA1_ACR

                lda #%00111001    ; buffer output & SD card CS = 0
                sta VIA1_PORTB
                rts
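The bit patterns used in the two routines above can be reproduced with a small host-side Python sketch. The function names are hypothetical, and the pin assignment (PB0 input enable, PB1 output enable, PB2-PB5 for SS1-SS4, all active low) is taken from the description above.

```python
SR_MODE_MASK = 0b00011100            # ACR bits 2-4 select the shift register mode
SR_IN_T2, SR_OUT_T2 = 0b001, 0b101   # shift in / out under control of T2


def acr_value(current_acr: int, sr_mode: int) -> int:
    """Replace only the shift register mode bits, keep the other ACR bits."""
    return (current_acr & ~SR_MODE_MASK & 0xFF) | (sr_mode << 2)


def port_b_value(receive: bool, selected_ss: int = 1) -> int:
    """Build the Port B byte: buffer direction in bits 0-1, selects in bits 2-5."""
    buffer_bits = 0b10 if receive else 0b01           # the enabled line is 0
    ss_bits = 0b1111 & ~(1 << (selected_ss - 1))      # the selected device is 0
    return (ss_bits << 2) | buffer_bits


print(f"{acr_value(0x00, SR_IN_T2):08b}")     # 00000100
print(f"{acr_value(0x00, SR_OUT_T2):08b}")    # 00010100
print(f"{port_b_value(receive=True):08b}")    # 00111010
print(f"{port_b_value(receive=False):08b}")   # 00111001
```

Masking before setting the mode bits is what the "and #%11100011" does in the assembly, so the timer settings in the rest of the ACR are left alone.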

So far all we have done is to set up things and we need the code to actually send a byte over SPI. Note that any reading or writing to the shift register (SR) clears the respective flag in the IFR. Once 8 bits have been sent or received the flag (bit 2 in IFR) is set to signal that data is ready and if the interrupt is enabled it will also trigger an interrupt.

spi.send_byte   sta VIA1_SR      ; send data to SR (also clears SR flag in IFR)
@wait           lda VIA1_IFR     ; check IFR flags
                and #%00000100   ; isolate SR flag  
                beq @wait        ; wait until done sending byte 
                rts
                

spi.get_byte    lda VIA1_IFR     ; check IFR flag
                and #%00000100   ; isolate SR flag  
                beq spi.get_byte ; wait until SR flag is set (when previous shift operation is completed)
                lda VIA1_SR      ; get data (also clears SR flag in IFR)
                rts 

You have to be a little careful, as you can only clear the SR flag in IFR by reading or writing to the shift register. For SD card communication you will always send bytes before receiving (more about this in the next blog post where we will dive into the actual SD card and FAT format). You will notice that my send_byte function sends first and then waits (until the byte has been sent), whereas the get_byte function waits first and then reads. This works because of this sequence, but if you are not careful you could be caught in a loop waiting for a flag that is never set or cleared.
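The flag behaviour can be illustrated with a very simplified host-side Python model. ShiftRegisterModel is a hypothetical toy, not an emulation of the VIA: it only captures that an access to SR clears the flag and starts 8 shifts, and that the flag is set after the eighth shift.

```python
class ShiftRegisterModel:
    """Toy model of the SR flag (IFR bit 2) in a T2-clocked shift mode."""

    def __init__(self):
        self.flag = False
        self.bits_left = 0

    def access(self):
        """A read or write of SR clears the flag and starts an 8-bit transfer."""
        self.flag = False
        self.bits_left = 8

    def tick(self):
        """One shift clock; the flag is set once the eighth bit has moved."""
        if self.bits_left:
            self.bits_left -= 1
            if self.bits_left == 0:
                self.flag = True


def wait_for_flag(sr, max_ticks=1000):
    """Poll like the @wait loop; return the ticks waited, or None on timeout."""
    for ticks in range(max_ticks + 1):
        if sr.flag:
            return ticks
        sr.tick()
    return None


sr = ShiftRegisterModel()
print(wait_for_flag(sr))   # None: nothing started a transfer, so we wait forever
sr.access()
print(wait_for_flag(sr))   # 8: the flag appears after eight shift clocks
```

The first call shows the trap described above: waiting for the flag without a preceding access to SR never finishes on real hardware, which is why get_byte relies on a send (or an earlier read) having come first.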

Ideally you want to do other things while the SR register is sending/receiving, so for sending the above can certainly be optimized. Later you will see in some of my code where I send or receive large quantities of data that I have made these optimizations.

At the maximal T2 clock speed it will take 32 clock cycles to send/receive a byte, which can conveniently be used for e.g. storing the previous byte in memory. If you count the cycles of the code you execute between reads and writes you wouldn’t even have to check the SR flag before triggering the next byte to be sent/received (but if you are not careful you could trigger the shift register before it is ready for the next byte).

These were the basics of sending and receiving bytes over SPI and in my next blog posts I will go over the specifics of communicating with an SD card in SPI mode.

Note 1: As I was writing this blog post I realized the maximum read/write speed is attained by driving the SPI clock signal from PHI2 rather than Timer 2. The speed is then fixed at half the system clock speed and sending/receiving a byte takes 16 clock cycles, rather than 32 or more with T2. To do this you simply set bits 2-4 of ACR to %010 for receiving and %110 for sending. I tested this with my SD card and it works nicely. However, if you need variable speeds for SPI communication with other devices (that might not be as fast) you still want to be able to change the SPI clock speed through T2.
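To put the two modes in perspective, the cycle counts translate into an upper bound on throughput as in this small Python calculation. It assumes a 1 MHz system clock as an example and ignores all software overhead between bytes, so real transfer rates will be lower.

```python
def max_bytes_per_second(phi2_hz: int, cycles_per_byte: int) -> float:
    """Upper bound on throughput if a new byte starts every cycles_per_byte."""
    return phi2_hz / cycles_per_byte


for label, cycles in (("T2 with N = 0", 32), ("PHI2", 16)):
    print(label, max_bytes_per_second(1_000_000, cycles))
# T2 with N = 0 31250.0
# PHI2 62500.0
```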

Note 2: The IDE I used for coding supports standard 6502 instructions, but not the extended 65C02 instruction set, so there are places where my code could be optimized. This is a project for me to begin at some point. Suggestions for good 65C02 assemblers and/or IDEs are very welcome. For my C64 coding I have been using the excellent C64Studio from Georg Rottensteiner.

SPI Part 1

I wanted to have a modern storage solution for my 6502 homebrew computer and that naturally leads to SD cards. The easiest way to interact with an SD card is probably in SPI mode, with the added benefit that you can later add more SPI devices. The web is full of intros to the SPI protocol, incl. https://www.circuitbasics.com/basics-of-the-spi-communication-protocol/

In short we will need a clock line (CLK), serial out (MOSI), serial in (MISO), and a “Slave Select” (SS) line for each device. At least for now I chose to set it up so I can have 3 slave devices connected to the SPI port, controlled by PB2-PB4 on the VIA.

VIA / SPI

The most important design decision was to use the VIA timer to create the SPI clock signal. The T2 timer can create a clock signal that is output on CB1 and the frequency is PHI2 / (2*(N+2)), where N is the value you set T2 to count down from (and PHI2 is the system clock). Choosing N = 0, the SPI clock becomes 1/4 of the system clock, or e.g. 250 kHz if you are using a 1 MHz clock for the 6502.
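The formula is easy to play with in a few lines of Python. The sketch below assumes that N is an 8-bit value (0-255), and uses the 400 kHz upper limit for SD card initialization as an example target; the function names are made up for illustration.

```python
def spi_clock_hz(phi2_hz: float, n: int) -> float:
    """SPI clock produced by T2 for a given system clock and count N."""
    return phi2_hz / (2 * (n + 2))


def smallest_n_at_or_below(phi2_hz: float, max_hz: float):
    """Smallest N (fastest clock) that does not exceed max_hz, or None."""
    for n in range(256):
        if spi_clock_hz(phi2_hz, n) <= max_hz:
            return n
    return None


print(spi_clock_hz(1_000_000, 0))                     # 250000.0
print(smallest_n_at_or_below(1_000_000, 400_000))     # 0
print(smallest_n_at_or_below(4_000_000, 400_000))     # 3
```

So with a 1 MHz system clock N = 0 is already within the initialization range, while a 4 MHz system would need N = 3 or higher during initialization.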

Most other solutions I could find on the web manually created a clock signal by the 6502 writing directly to the VIA. Two writes to set the clock high/low and one read for each bit is at least 12 clock cycles, so the max frequency using this method is 1/12th the system clock frequency. Using T2 (and the VIA’s shift register) we should get at least 3X the speed.

The other advantage is that T2 can be set up to control the VIA’s shift register with bits being sent or received on CB2 and as such the 6502 processor does not have to read every single bit; it can just e.g. read a full byte once the shift register is full. This further increases speed of communications, rather than having the 6502 read and shift every bit.

However, there is only one shift register (that can be set to either send or receive) and the SPI protocol calls for separate lines for sending and receiving bits. So I chose to add a buffer to switch between MOSI and MISO being routed to CB2 on the VIA. I use PB0 and PB1 to set which of the lines is enabled on the buffer, and I have pulled MOSI high when the output on the buffer is disabled. This works since, at least for the SD card, there is no simultaneous sending and receiving of bits.

I could of course reduce that to only using one pin on the VIA to control output (MOSI) or input (MISO) by adding a not-gate, but at least for now I have not added the extra chip. If I need more glue logic for something else later I certainly will change this (it does annoy me a little to “waste” a pin on the VIA that could be used for something else).

Hopefully this helps explain the HW setup of the SPI interface and in my next blog post I will describe the code for implementing the SPI protocol.