g | x | w | all

Bytes	Lang	Time	Link
nan	SUBLEQ	250819T123153Z	howerj
056	Julia	231104T172224Z	MarcMush
052	ARM64 machine code	231104T154846Z	Nate Eld
055	Ruby	230926T141657Z	Kirill L
083	R*	230926T075324Z	Dominic
022	Z80 machine code	230926T014448Z	v-rob
033	05AB1E	230925T074211Z	Kevin Cr
017	x8664 machine code	230924T143830Z	Peter Co
062	Python 3.11	230923T114654Z	Joao-3
058	Python 3	230923T145929Z	Jonathan
057	JavaScript ES6	230922T225531Z	Arnauld
045	Charcoal	230922T224742Z	Neil
094	Python3	230922T221100Z	Ajax1234

SUBLEQ, 557 bytes (space separated decimals) / 149 cells.

This implements a SUBLEQ interpreter within SUBLEQ (it also handles I/O for interpreters that have it). It can be prepended to any SUBLEQ program, including itself. Note that because it uses 149 cells on SUBLEQ machines that are 8-bits and can only address 256 cells this may cause problems when attempting to simulate programs that are too large, or attempting to recursively use this interpreter.

15 15 3 
145 144 6 
144 15 9 
144 144 12 
114 114 15 
0 144 18 
144 114 21 
144 144 24 
15 15 27 
114 144 30 
144 15 33 
144 144 36 
147 15 42 
148 114 42 
147 145 45 
60 60 48 
145 144 51 
144 60 54 
144 144 57 
115 115 60 
0 144 63 
144 115 66 
144 144 69 
60 60 72 
115 144 75 
144 60 78 
144 144 81 
147 60 87 
148 115 87 
147 145 90 
105 105 93 
145 144 96 
144 105 99 
144 144 102 
146 146 105 
0 144 108 
144 146 111 
144 144 114 
0 0 123 
147 145 120 
144 144 0 
145 145 126 
146 144 129 
144 145 132 
144 144 135 
147 146 -1 
148 145 141 
144 144 0 
0 149 0 
-1 -149

Julia, 56 bytes

<(x,i=0)=(x[1+x[i+2]]-=x[1+x[i+1]])>0 ? x<i+3 : x<x[i+3]

Attempt This Online!

mutates the input and ends by erroring

ARM64 machine code, 52 bytes

The SUBLEQ machine's memory is an array of signed 64-bit cells. This function takes a pointer to the array in x0, and the length of the array in cells in x1. It modifies the array in place. This matches standard ARM64 calling conventions for a C function void subleq(uint64_t *mem, size_t len);.

aa0003e2
a8c11043
f8408445
f8637806
f8647807
eb0600e7
f8247807
8b050c09
9a82d122
cb000043
eb010c7f
54fffec3
d65f03c0

Assembly source:

    .text
    .global subleq1
    .balign 4
    // 64 bit cells
    // here IP is a pointer
subleq1:
    // x0 = base of machine memory array
    // x1 = length of array
    // x2 = program counter
    mov x2, x0                  // pc=0
next_insn:
    ldp x3, x4, [x2], #16       // x3 = a, x4 = b, pc += 2
    ldr x5, [x2], #8            // x5 = c, pc += 1
    // now pc points to tentative next instruction
    ldr x6, [x0, x3, lsl #3]    // x6 = *a
    ldr x7, [x0, x4, lsl #3]    // x7 = *b
    subs x7, x7, x6             // x7 = *b - *a, set flags
    str x7, [x0, x4, lsl #3]    // store back to b
    add x9, x0, x5, lsl #3      // compute new pc corresponding to c
    csel x2, x9, x2, le         // select new pc if we had *b <= *a
    
    // check if pc still in range.
    // This could also be done with a cmp/ccmp pair, but same length.
    // Can save one instruction here if 
    sub x3, x2, x0
    // x3 = offset of pc in bytes.  
    cmp x3, x1, lsl #3
    // Check the unsigned result, so that negative pc corresponds to
    // very large value in x3, definitely higher than x1.
    b.lo next_insn

    ret

I couldn't find much clever golfing to do here. The load-store architecture hurts here as we need three instructions to do a read-modify-write of memory.

(Atomic LDADD is a possibility if we have ARMv8.2, despite being much more inefficient, but we'd need another instruction to negate the value from a as there is no LDSUB, and it wouldn't set the flags so that would cost one more. If the condition to jump were *b - *a < 0 instead of <=, we could branch on the sign bit with tbz. LDADD also lacks the fancy addressing modes that we want to use.)

I took the same interpretation as Peter Cordes that signed overflow in the subtraction is undefined behavior, so that we can branch on *b <= *a instead of *b - *a <= 0.

It was also unclear what should happen if you jump to a negative address. This version will consider it successful termination. If you allow it to be UB, you can shave off one instruction (4 bytes) by changing the second argument (in x1) to be a pointer to the end of the array, instead of a length count, and then just change the loop condition to cmp x2, x1; b.hs next_insn.

I considered keeping an integer instead of a pointer as the program counter. It saves the add instruction to compute the new pc for c, but costs another one to add 3 for the no-jump case, because loads can't post-increment an index register.

I also had some thoughts about using 32-bit cells so that all three elements of the instruction could be loaded with one 128-bit ldp, but unpacking from the high half of a 64-bit register is too much extra work.

Ruby, 55 bytes

->x,i=0{i=(x[x[i+1]]-=x[x[i]])>0?i+3:x[i+2]until !x[i]}

Attempt This Online!

Updates input array in-place.

R*, 84 83 bytes

f=\(p,i=1)`if`(i>sum(p|1),p,f(p,`if`((p[j[2]]=diff(p[j<-p[i+0:2]+1]))<1,j[3],i+3)))

Attempt This Online!* or Try it online using function instead of the shorter \ notation.

*ATO uses a version of R (version 4.3.1) in which the if function errors when the condition has length >1. Versions between 4.1.0 and <4.2.0 just give a warning for this without halting with an error, and the header in the ATO link redefines the if function to mimic this behaviour.
It would cost +3 bytes to adjust the program to run in all versions of R ≥4.1.

Z80 machine code, 22 bytes

44 4E 2C 0A 5F 4E 2C 0A 93 02 3D F2 1C 01 6E 2D 2C 7D BA D0 18 EB

Explanation of disassembled code:

; hl = Pointer to start of program. Must be aligned to 256 byte boundary.
; d = Length of the program in bytes.
subleq:
    ld b, h                     ; We want to use bc as a data pointer, so align
                                ; it to the same boundary as hl.
loop:
    ld c, (hl)                  ; Get a pointer to the first operand in bc and
    inc l                       ; advance the instruction pointer.

    ld a, (bc)                  ; Put the first operand in e.
    ld e, a

    ld c, (hl)                  ; Get the second operand pointer in the same
    inc l                       ; way as the first.

    ld a, (bc)                  ; Put the second operand in a and subtract e
    sub e                       ; from it, storing the result back where it
    ld (bc), a                  ; came from.

    dec a                       ; If the subtracted value was positive, skip
    jp p, no_jump               ; the jump.

    ld l, (hl)                  ; Load our new instruction pointer from the
    dec l                       ; third parameter. Decrement immediately since
                                ; we will increment next no matter what.
no_jump:
    inc l                       ; Advance the instruction pointer.

    ld a, l                     ; If we're past the end of the program, return.
    cp d
    ret nc

    jr loop                     ; Otherwise, continue our interpretation loop.

This is a plain Z80 subroutine called via call with registers as parameters. This subleq processor is 8-bit, both in data and addressing. Jump locations are interpreted as unsigned rather than signed, although comparisons are still signed. Do note that values are not exactly two's complement as $80 is interpreted as positive rather than negative; however, the challenge does not specify this, and it should not affect the Turing-completeness of the implementation.

Example code to call the subroutine with the provided data:

    ld hl, prog_start
    ld d, prog_end - prog_start
    call subleq

    halt

    .align 256
prog_start:
    .db 3, 4, 3
    .db 6, 13, 9
    .db 6, 3, -3
    .db 7, 8, 3
prog_end:

You can paste these chunks of assembly into https://www.asm80.com/ to run them. Or, pull out your favorite Z80 system and run it there.

05AB1E, 34 33 bytes

[¾Ig@#Ðü3¾èRćU¬VèÆ©Yǝ®0›i3ë.µX}F¼

Try it online.

Explanation:

[         # Start an infinite loop:
 ¾        #  Push counter variable `¾` (which starts at 0)
  Ig      #  Push the input-length
    @     #  Pop both, and if `¾` is >= the input-length:
     #    #   Stop the infinite loop
          #   (after which the resulting list is output implicitly as result)
 Ð        #  Triplicate the current list
          #  (which will be the implicit input-list in the first iteration)
  ü3      #  Pop one copy and push its overlapping triplets
    ¾è    #  Pop and get the current Subleq-triplet [L[a],L[b],L[c]] at index `¾` (a)
  R       #  Reverse this triplet [L[a],L[b],L[c]] to [L[c],L[b],L[a]]
   ć      #  Extract its head; pop and push remainder-list and first item separately
    U     #  Pop and store L[c] in variable `X`
    ¬     #  Push the first item of the pair (without popping): L[b]
     V    #  Pop and store L[b] in variable `Y`
    è     #  Index [L[b],L[a]] in another copy of the list: [L[L[b]],L[L[a]]]
     Æ    #  Reduce this pair by subtracting: L[L[b]]-L[L[a]]
      ©   #  Store it in variable `®` (without popping)
       Y  #  Push L[b] from variable `Y`
        ǝ #  Insert L[L[b]]-L[L[a]] at index L[b] into the list
    i     #  If
 ®        #  L[L[b]]-L[L[a]] from variable `®`
  0›      #  is positive (>0):
     3    #   Push a 3
    ë     #  Else (it's <=0):
     .µ   #   Reset the counter variable `¾` to 0
       X  #   And push L[c] from variable `X`
    }     #  Close the if-else statement
     F    #  Loop that many times (either 3 or L[c]):
      ¼   #   Increase the counter variable by 1 every iteration

x86(-64) machine code, 17 bytes

A function that simulates a SUBLEQ machine with 8-bit cells, returning to the caller with the array modified. Callable from C with the x86-64 SysV calling convention as void subleq8(int8_t *end, int8_t *aligned_start); (Like a .begin() / .end() range, the end pointer start+length.)

start must be aligned by 256 so its low 8 bits are 0. This allows indexing the array by replacing the low byte with x86 partial-register stuff, instead of needing to zero-extend the guest byte into a separate register and use an indexed addressing mode. (A similar trick is useful on 8-bit systems like AVR where two 8-bit registers form a 16-bit pointer: if you don't cross a 256-byte boundary, you can skip adc into the high byte; if the array is aligned by 256 you can just replace the low byte.)

start must be in the low 32 bits of virtual address space, e.g. the x32 ABI, mmap(MAP_32BIT), or simply a static array (e.g. in .data) in a non-PIE Linux executable. Or if you run the same machine code in 32-bit mode.

               ;;; array in ESI, aligned by 256 (so the low byte can be replaced to index into it)
               ;;; end-pointer (array+len) in EDI
               
               subleq8:
 89F0              mov  eax, esi               ; copy high bytes to EAX so merging a new low byte generates an address

               .nextinst:
 AC                lodsb                       ; fetch address A into AL, forming an x86 pointer in RAX
 8A 10             mov    dl, [rax]               ; deref A
 AC                lodsb                       ; fetch address B
 28 10             sub    [rax], dl            ; SUB and set FLAGS
 AC                lodsb                       ; fetch branch target address; RSI points at next instruction
 0F 4E F0          cmovle  esi, eax            ; update guest PC according to FLAGS
               
 39 FE             cmp    esi, edi             ; like cmp sil, dil  but range-shifted: both regs have the same high bytes
 72 F2             jb  .nextinst               ; len must be at most 127 to still consider *all* negative byte values as unsigned > 0x7f
 C3                ret

Instead of checking for the subtraction result being negative or zero, we actually check if [b] was less-or-equal than [a] before the subtraction, using FLAGS set by the sub instruction (like cmp). This is the same as checking the result against 0 if signed overflow didn't happen, otherwise it's the opposite. With Python arbitrary-precision integers, signed overflow is impossible. (There is no x86 FLAGS condition for negative-or-zero so it would take an extra 3 bytes, either for cmp byte [rax], 0 or for cmovs + cmove instead of cmovle. Like most ISAs with signed-compare conditions, x86 does have a le less-or-equal condition that checks SF != OF || ZF in the same predicate.)

Apparently even Wikipedia defines a SUBLEQ OISC in terms of comparing against zero, rather than the usual computer-architecture meaning of a less-or-equal condition after a subtraction. :/ If signed wrap-around (overflow) doesn't happen, the two variants of the architecture are equivalent. Unless / until there's a test-case that requires checking the sign of the result instead of the LE condition generated by subtraction, I'm going to leave my answer this way. I'm pretty confident this is also Turing complete and probably not significantly different for implementing most basic arithmetic, if any of that comes near signed overflow.

The max program length is 127 or 128 bytes if you want any negative address to end the program. 126 is the highest multiple of 3, if your program doesn't do misaligned jumps and doesn't want to keep some data past the end of the last instruction. (It effectively treats the program counter as unsigned, looping do {} while(PC < end), so a negative program counter is an unsigned number in the upper half of range, above any signed-positive length.) Otherwise, max program length = 255, exiting only on PC=255, although that's untested and rollover of the high bytes of RSI could happen, but maybe only with misaligned jumps, e.g. to PC=254 so fetching the full instruction would get to PC=257, with lodsb carrying into the higher bytes of RSI.

(With len limited to 128, misaligned jumps are no problem. e.g. a program might use its first instruction to jump to address 4 and then use the first 4 bytes purely as data, eventually falling off the end when PC = 128.)

Tested with the test-case in the question; it's correct for that but it doesn't AFAIK test for off-by-one errors (<= vs. < in the SUBLEQ operation), or for exiting via a taken branch to a negative or past-the-end address. I think my program is correct, e.g. cmovle treats "equal" the same as "less-than", and the test-case does exercise less-than causing the branch to be taken.

x86(-64) machine code, 21 bytes

Simulates a 32-bit SUBLEQ machine. Writing a 32-bit register zero-extends to the full 64-bit register so even if we were willing to align the SUBLEQ program by 2^32 that wouldn't all the same partial-register trickery.

Even starting at absolute address 0 wouldn't work, because SUBLEQ addressing is in terms of cells, like a word-addressable machine. But x86 is a byte-addressable architecture, so SUBLEQ addresses have to be scaled by 4 to get x86-64 byte offsets since we're using 4-byte cells.

In x86-64 SysV, void subleq32(int32_t *end, int32_t *start). I think the start pointer might need to be 4GiB aligned to properly handle all jumps to negative addresses, since we only compare the low 32 bits after doing x86_address = base + target*4. With a larger base, some negative targets could wrap. And due to the scaling by 4, the most negative value we detect is -2^30. We shift out the high bits of some larger numbers, although some larger-magnitude numbers will happen to have a bit that gets shifted to the sign-bit position, too.

So maybe it's only really able to correctly simulate a 30-bit SUBLEQ machine, or 29-bit or something. But still much more than 8-bit, which is the point.

                   ;; word-addressable SUBLEQ machine, where the components of an instruction are still 1 guest address apart
                   ;; despite them being farther apart in x86-64
               ;;; base address in ESI
               ;;; end address in EDI  (base+length in bytes, the one-past-end address.  An x86 address, not guest words)
               subleq32:
 89F1              mov  ecx, esi   ; save the base for later use.
               
               .nextinst:
 AD                lodsd                     ; fetch address A
 8B1481            mov    edx, [rcx+rax*4]   ; and deref; we don't need it after this
 AD                lodsd                     ; fetch address B
 291481            sub    [rcx+rax*4], edx   ; update [B] and set FLAGS
 AD                lodsd                     ; fetch branch target address; RSI points at next instruction for non-taken
 7F03              jnle  .not_taken
 8D3481            lea    esi, [rcx+rax*4]   ; scale the SUBLEQ target address to an x86 address
                   .not_taken:

 39FE              cmp    esi, edi
 72EE              jb  .nextinst       ; unsigned compare catches negative as well as past-the-end
 C3                ret

I also tested this with dd instead of db for the initial array, using GDB print (int[12])prog32 after running, or display the same expression while single-stepping. This was my test caller:

section .data
prog32:
    dd 3, 4, 3,
    dd 6, 13, 9,
    dd 6, 3, -3,
    dd 7, 8, 3
prog32_end:

section .text
global _start
_start:
    mov edi, prog32_end
    mov esi, prog32
    call subleq32

    int3              ; software breakpoint

    xor edi, edi
    mov eax, 231
    syscall           ; x86-64 Linux exit_group(0)

x86(-64) machine code for a byte-addressable SUBLEQ, 17 bytes

Untested, but it's the same logic as the 8-bit version, with the same instructions except 32-bit operand-size. (So writing full registers instead of merging into the low byte).

The SUBLEQ program's cells are 4 SUBLEQ addresses wide, matching the x86 addressing. One way to make the test program run the same is to scale all its starting values by 4, like 12, 16, 12 | 24, 52, 36 | .... If you do misaligned load/store or jumps, it will expose the fact that it's a little-endian machine. (SUBLEQ doesn't have byte loads, only word loads.)

This requires the array to start at absolute address 0. Or in 32-bit mode, DS:0, so you could plausibly imagine some segmented code that sets the DS segment base for passing arrays. (Some "string" instructions use the ES segment, but lodsd uses ds:esi).

                   ;; byte-addressable SUBLEQ machine
               ;;; base address = 0
               ;;; end address = len in EDI
               subleq32b:
 31F6              xor  esi, esi               ; start address = 0, can't justify having the caller pass it since we depend on it being 0
               
               .nextinst:
 AD                lodsd                     ; fetch address A
 8B10              mov    edx, [rax]            ; and deref; we don't need it after this
 AD                lodsd                     ; fetch address B
 2910              sub    [rax], edx         ; update [B] and set FLAGS
 AD                lodsd                     ; fetch branch target address; RSI points at next instruction for non-taken
 0F4EF0            cmovle  esi, eax
               
 39FE              cmp    esi, edi
 72F2              jb  .nextinst       ; unsigned compare catches negative as well as past-the-end
 C3                ret

A 16-bit version of this would be highly plausible in 16-bit mode where segmentation is normal, then it wouldn't be too weird to take an array at ds:0. Same machine code except for the ModRM.rm field that encodes [rax]... except there is no [ax] addressing mode, only [bx], [di], or [si] for single-register 16-bit addressing modes that use the DS segment. So that might cost a couple 1-byte xchg instructions, perhaps with BX.

16-bit addressing modes don't allow a scale factor so they're not very convenient for a word-addressable SUBLEQ.

Python 3.11, 92 86 85 62 bytes

This is my answer, so don't suggest copying from other answers.

def f(p,i=0):
 p[k:=p[i+1]]-=p[p[i]];f(p,(i+3,p[i+2])[p[k]<1])

Attempt This Online!

Takes in a program as a list, mutates the original list with an error (you need a variable and try/except to access the results).

Explanation:

def f(p,i=0):

Define the main function, p is the program to be ran and initialize the instruction pointer as i.

 p[k:=p[i+1]]-=p[p[i]];

Subtract A from B and store in B. Also assigns B as k to shave off bytes.

f(p,(i+3,p[i+2])[p[k]<1])

If B is less than 1, then recursively call with i as C, otherwise add 3.

Python 3, 58 bytes

def f(s,i=0):a,b,*r=s[i:];s[b]-=s[a];f(s,[i+3,*r][s[b]<1])

A recursive function that updates a Subleq program, the list of integers s, in place terminating with a ValueError.

Try it online!

JavaScript (ES6), 57 bytes

Modifies the input array.

f=(m,p=0)=>1/m[p]&&f(m,(m[m[p+1]]-=m[m[p]])>0?p+3:m[p+2])

Try it online!

Commented

f = (             // f is a recursive function taking:
  m,              //   m[] = memory array
  p = 0           //   p = program counter
) =>              //
1 / m[p]          // abort if m[p] is undefined
&&                // otherwise:
  f(              //   do a recursive call:
    m,            //     pass m[]
    (             //
      m[m[p + 1]] //     subtract from m[m[p + 1]] ...
      -= m[m[p]]  //     ... the value stored at m[m[p]]
    ) > 0 ?       //     if the result is positive:
      p + 3       //       continue execution at p + 3
    :             //     else:
      m[p + 2]    //       jump at m[p + 2]
  )               //   end of recursive call

Charcoal, 45 bytes

Ｗ‹ⅈＬθ«≔§θ⊕ⅈι§≔θι⁻§θι§θ§θⅈ¿›§θι⁰Ｍ³→Ｊ§θ⁺²ⅈ⁰»⭆¹θ

Try it online! Link is to verbose version of code. Explanation:

Ｗ‹ⅈＬθ«

Repeat while the instruction pointer is within range.

≔§θ⊕ⅈι

Get the B parameter into a variable.

§≔θι⁻§θι§θ§θⅈ

Subtract the element at the A parameter from the element at the B parameter.

¿›§θι⁰

If the element at the B parameter is still positive, ...

Ｍ³→

... move to the next instruction, otherwise...

Ｊ§θ⁺²ⅈ⁰

.... jump to the C parameter.

»⭆¹θ

Output the final array.

Python3, 94 bytes:

def f(t,i=0):
 if i>=len(t):return t
 t[t[i+1]]-=t[t[i]];return f(t,[t[i+2],i+3][t[t[i+1]]>0])

Try it online!