Article 2807 of comp.sys.apple2.programmer:
Newsgroups: comp.sys.apple2.programmer
Path: winternet.com!uunet!tcsi.tcs.com!agate!ames!waikato!comp.vuw.ac.nz!actrix.gen.nz!dempson
From: dempson@atlantis.actrix.gen.nz (David Empson)
Subject: Re: Sweet 16?
Message-ID:
Sender: news@actrix.gen.nz (News Administrator)
Organization: Actrix Information Exchange
Date: Sat, 1 Apr 1995 04:00:53 GMT
References: <3ldt51$i4q@cello.gina.calstate.edu>
X-Nntp-Posting-Host: atlantis.actrix.gen.nz
Lines: 238

In article <3ldt51$i4q@cello.gina.calstate.edu>,
Jason Kennerly wrote:
> anyone care to post a list of opcodes and such, so that we may experiment
> with it??

I've never seen any official documentation on Sweet-16: the best is
probably Steve Wozniak's article in Byte, back in 1977, but I've never
seen that (or been able to get hold of it).

I cobbled together my own notes many years ago, from looking at the
disassembly and experimenting with Merlin (8-bit versions can assemble
Sweet-16 code with appropriate directives).

Digging out my venerable "red book" (not the Apple one - my own
handwritten notes from high school)...


Architecture
------------

Sweet-16 provides 16 16-bit registers, called R0, R1, R2, ..., RF.
They are implemented using the first 32 bytes of the 6502 zero page, in
the usual 6502 low/high order.  You can pass data between 6502 and
Sweet-16 code by copying values in and out of the appropriate registers.

R0 is the accumulator.
R1 to R11 are general purpose registers which may be used in any way by
the Sweet-16 program.
R12 is the Sweet-16 stack pointer (used for Sweet-16 subroutine calls).
R13 is the compare result register (used by the CPR instruction).
R14 is used by Sweet-16 to point to the register holding the last result.
R15 is the Sweet-16 program counter.


Instruction Set
---------------

The instruction set is mostly orthogonal - nearly all instructions can
refer to any register.  All opcodes are one byte, with a one or two byte
operand for some instructions.
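
For illustration only, the register layout described under Architecture
can be sketched in a few lines of Python.  This assumes register Rn
occupies zero page addresses 2n (low byte) and 2n+1 (high byte), which
follows from 16 registers filling the first 32 bytes in low/high order.
The names reg_addr, read_reg and write_reg, and the bytearray standing
in for the zero page, are hypothetical and not part of Sweet-16 itself.

```python
def reg_addr(n):
    """Return the (low, high) zero page addresses of register Rn.

    Assumption: Rn lives at 2*n (low byte) and 2*n + 1 (high byte).
    """
    if not 0 <= n <= 15:
        raise ValueError("Sweet-16 only has registers R0 to R15")
    return 2 * n, 2 * n + 1


def read_reg(zero_page, n):
    """Read the 16-bit value of Rn from a 256-byte zero page image."""
    lo, hi = reg_addr(n)
    return zero_page[lo] | (zero_page[hi] << 8)


def write_reg(zero_page, n, value):
    """Write a 16-bit value into Rn, low byte first."""
    lo, hi = reg_addr(n)
    zero_page[lo] = value & 0xFF
    zero_page[hi] = (value >> 8) & 0xFF


# Hypothetical stand-in for the 6502 zero page.
zero_page = bytearray(256)
write_reg(zero_page, 1, 0x1234)   # R1 is stored at $02-$03
```

With this layout, 6502 code that wants to hand a pointer to a Sweet-16
routine in R1 would store its low byte at $02 and its high byte at $03.
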
The mnemonics I'm listing here are valid for Merlin/Big Mac and match
the labels used in the Sweet-16 listing.  Other assemblers (e.g. LISA)
may use different mnemonics.

Opcode      Instruction

00          RTN           Return to 6502 mode
01 xx       BR   dest     Unconditional relative branch
02 xx       BNC  dest     Branch if no carry
03 xx       BC   dest     Branch if carry
04 xx       BP   dest     Branch if plus
05 xx       BM   dest     Branch if minus
06 xx       BZ   dest     Branch if zero
07 xx       BNZ  dest     Branch if not zero
08 xx       BM1  dest     Branch if minus 1 (FFFF)
09 xx       BNM1 dest     Branch if not minus 1
0A          BKS           Break into system monitor (hits a BRK
                          instruction in the Sweet-16 interpreter)
0B          RS            Return from subroutine
0C xx       BS   dest     Branch to subroutine

Opcodes 0D, 0E and 0F are not implemented (NOP, referred to as NUL in
the Sweet-16 listing).

1n xx yy    SET  Rn,yyxx  Set Sweet-16 register n to 16-bit value.
2n          LD   Rn       Load Sweet-16 accumulator (R0) from Rn.
3n          ST   Rn       Store Sweet-16 accumulator (R0) to Rn.
4n          LD   @Rn      Byte load R0 from address pointed to by Rn
                          and increment Rn.  R0 high byte is cleared.
5n          ST   @Rn      Byte store R0 (low byte only) to address
                          pointed to by Rn and increment Rn.
6n          LDD  @Rn      Word load R0 from address pointed to by Rn
                          and increment Rn by 2.
7n          STD  @Rn      Word store R0 to address pointed to by Rn
                          and increment Rn by 2.
8n          POP  @Rn      Decrement Rn and byte load R0 from address
                          pointed to by Rn.  R0 high byte is cleared.
9n          STP  @Rn      Decrement Rn and byte store R0 to address
                          pointed to by Rn.
An          ADD  Rn       Add Rn to R0 (result in R0).
Bn          SUB  Rn       Subtract Rn from R0 (result in R0).
Cn          POPD @Rn      Decrement Rn by 2 and word load R0 from
                          address pointed to by Rn.
Dn          CPR  Rn       Compare Rn to R0 (subtract result in RD).
En          INR  Rn       Increment Rn.
Fn          DCR  Rn       Decrement Rn.


Some of the registers and instructions may require a little extra
explanation:

As can be deduced from the POP/POPD instructions, the stack works quite
differently to the 6502.  Any register may be used as a "stack pointer"
for accessing data.
R12 is automatically used by Sweet-16 for the BS and RS instructions.

Stacks grow upward: ST @Rn, STD @Rn or BS will store the operand value
at the stack pointer then increment the stack pointer.  POP @Rn,
POPD @Rn and RS will decrement the stack pointer then fetch the value.

Before using the Sweet-16 BS/RS instructions, the stack pointer must be
set to a sensible value, i.e. into any area in the bottom 48k which can
be set aside as a buffer for the Sweet-16 stack.  It should be
initialized to the base address in this area.

The STP @Rn instruction is an oddity: it reverses the stack direction,
and can be used in conjunction with LD @Rn and LDD @Rn to implement a
stack that grows downward.  The pointer register should be set one byte
past the end of the "stack" - the first STP will decrement the pointer.

R13 is used to hold the result of a CPR instruction and is not touched
at other times.

R14 is special: its low byte is never used, and its high byte is used
for two functions:

1. After register instructions (opcodes 1n to Fn), the high byte of R14
   is set to the number of the operand register times two (i.e. the
   zero page offset to the register).  This is register 0 for most
   instructions, except SET, INR, DCR and CPR.

2. After ADD, SUB and CPR, bit 0 is used to hold the carry out of the
   high byte.

R14H is preserved over all branch instructions (opcodes 0x) and over a
return to 6502 mode (unless the 6502 code changes it).

R14H is used by the various branch instructions.  BNC and BC are only
valid after an ADD, SUB or CPR instruction ("carry" will always be false
after any other register instruction).  BP, BM, BZ, BNZ, BM1 and BNM1
directly test the last accessed register, looking at bit 15 in the case
of BP/BM, or the whole register value in the other instructions.

Branch offsets work like the 6502: an 8-bit signed offset counting from
the opcode byte of the next instruction.


Here is some example Sweet-16 code, to add two tables of numbers.
Before calling this routine, the 6502 code must set R1 ($02-$03),
R2 ($04-$05) and R3 ($06-$07) to point to the two source tables and the
destination table, and R4 ($08-$09) to the number of entries.

add_tables:
                                ; Starting out in 6502 mode.
20 89 F6    JSR  $F689
                                ; Now we're in Sweet-16 mode.
add_loop:
61          LDD  @R1            ; Get word pointed to by R1 into R0
35          ST   R5             ; Put it in R5
62          LDD  @R2            ; Get word pointed to by R2 into R0
A5          ADD  R5             ; Add R5 to R0
73          STD  @R3            ; Store result indirect through R3
F4          DCR  R4             ; Decrement the count
07 F8       BNZ  add_loop       ; Keep going until it is zero
00          RTN                 ; Exit Sweet-16 mode
                                ; Now we're back in 6502 mode.
60          RTS

Note that the LDD @R1, LDD @R2 and STD @R3 instructions will increment
the pointer registers by two.

As you can see from the above example, 16-bit arithmetic can be done
with much less code using Sweet-16 than by doing the same operation in
6502 code (though Sweet-16 will be slower).

Here is an equivalent 6502 routine, again doing 16-bit adds.  I am not
assuming that we are adding less than 256 bytes - if this assumption
could be made, then the 6502 code would be simpler.

add_tables:
A0 00       LDY  #$00
A6 08       LDX  $08            ; Get low byte of count into X
F0 02       BEQ  add_loop       ; Don't inc high byte if low byte is zero
E6 09       INC  $09            ; Pre-increment high-byte to save time
add_loop:
18          CLC
B1 02       LDA  ($02),Y        ; Get low byte
71 04       ADC  ($04),Y        ; Add low byte
91 06       STA  ($06),Y        ; Store low byte of sum
C8          INY
B1 02       LDA  ($02),Y        ; Get high byte
71 04       ADC  ($04),Y        ; Add high byte
91 06       STA  ($06),Y        ; Store high byte of sum
C8          INY
D0 06       BNE  no_hi_inc
E6 03       INC  $03            ; Increment high byte of pointers
E6 05       INC  $05
E6 07       INC  $07
no_hi_inc:
CA          DEX
D0 E6       BNE  add_loop
C6 09       DEC  $09            ; All pages done?
D0 E2       BNE  add_loop
60          RTS

6502 version: 39 bytes.
Sweet-16 version: 13 bytes.

Obviously, a more complicated example would emphasize the difference
more than this.

By comparison, a 65816 implementation of the same code is as follows.

add_tables:
A0 00 00    LDY  #$0000         ; Set Y to offset
A6 08       LDX  $08            ; Get count into X
add_loop:
18          CLC
B1 02       LDA  ($02),Y        ; Get word
71 04       ADC  ($04),Y        ; Add word
91 06       STA  ($06),Y        ; Store sum
C8          INY
C8          INY
CA          DEX
D0 F4       BNE  add_loop
60          RTS

This requires 18 bytes, so it isn't much larger than the Sweet-16 code
(I'm assuming the 65816 is already in 16-bit native mode), but the 65816
version will be MUCH faster than the Sweet-16 equivalent.

In fact, the 65816 version could be shortened and sped up further by
adding the table backwards: replace the entry code with LDA $08, DEC A,
ASL A, TAY, and replace the end of loop with DEY, DEY, BPL add_loop
(assuming less than 32k is being added, which is very likely).

There are a few assumptions in the 6502 and 65816 versions, e.g. that
subsequent code is not relying on the zero page pointers and count
being updated (which will be done by the Sweet-16 version).

--
David Empson

dempson@actrix.gen.nz
Snail mail: P.O. Box 27-103, Wellington, New Zealand