| Bytes | Lang | Time | Link |
|---|---|---|---|
| nan | SUBLEQ | 250819T123153Z | howerj |
| 056 | Julia | 231104T172224Z | MarcMush |
| 052 | ARM64 machine code | 231104T154846Z | Nate Eld |
| 055 | Ruby | 230926T141657Z | Kirill L |
| 083 | R* | 230926T075324Z | Dominic |
| 022 | Z80 machine code | 230926T014448Z | v-rob |
| 033 | 05AB1E | 230925T074211Z | Kevin Cr |
| 017 | x8664 machine code | 230924T143830Z | Peter Co |
| 062 | Python 3.11 | 230923T114654Z | Joao-3 |
| 058 | Python 3 | 230923T145929Z | Jonathan |
| 057 | JavaScript ES6 | 230922T225531Z | Arnauld |
| 045 | Charcoal | 230922T224742Z | Neil |
| 094 | Python3 | 230922T221100Z | Ajax1234 |
SUBLEQ, 557 bytes (space separated decimals) / 149 cells.
This implements a SUBLEQ interpreter within SUBLEQ (it also handles I/O for interpreters that have it). It can be prepended to any SUBLEQ program, including itself. Note that because it uses 149 cells on SUBLEQ machines that are 8-bits and can only address 256 cells this may cause problems when attempting to simulate programs that are too large, or attempting to recursively use this interpreter.
15 15 3
145 144 6
144 15 9
144 144 12
114 114 15
0 144 18
144 114 21
144 144 24
15 15 27
114 144 30
144 15 33
144 144 36
147 15 42
148 114 42
147 145 45
60 60 48
145 144 51
144 60 54
144 144 57
115 115 60
0 144 63
144 115 66
144 144 69
60 60 72
115 144 75
144 60 78
144 144 81
147 60 87
148 115 87
147 145 90
105 105 93
145 144 96
144 105 99
144 144 102
146 146 105
0 144 108
144 146 111
144 144 114
0 0 123
147 145 120
144 144 0
145 145 126
146 144 129
144 145 132
144 144 135
147 146 -1
148 145 141
144 144 0
0 149 0
-1 -149
Julia, 56 bytes
<(x,i=0)=(x[1+x[i+2]]-=x[1+x[i+1]])>0 ? x<i+3 : x<x[i+3]
mutates the input and ends by erroring
ARM64 machine code, 52 bytes
The SUBLEQ machine's memory is an array of signed 64-bit cells. This function takes a pointer to the array in x0, and the length of the array in cells in x1. It modifies the array in place. This matches standard ARM64 calling conventions for a C function void subleq(uint64_t *mem, size_t len);.
aa0003e2
a8c11043
f8408445
f8637806
f8647807
eb0600e7
f8247807
8b050c09
9a82d122
cb000043
eb010c7f
54fffec3
d65f03c0
Assembly source:
.text
.global subleq1
.balign 4
// 64 bit cells
// here IP is a pointer
subleq1:
// x0 = base of machine memory array
// x1 = length of array
// x2 = program counter
mov x2, x0 // pc=0
next_insn:
ldp x3, x4, [x2], #16 // x3 = a, x4 = b, pc += 2
ldr x5, [x2], #8 // x5 = c, pc += 1
// now pc points to tentative next instruction
ldr x6, [x0, x3, lsl #3] // x6 = *a
ldr x7, [x0, x4, lsl #3] // x7 = *b
subs x7, x7, x6 // x7 = *b - *a, set flags
str x7, [x0, x4, lsl #3] // store back to b
add x9, x0, x5, lsl #3 // compute new pc corresponding to c
csel x2, x9, x2, le // select new pc if we had *b <= *a
// check if pc still in range.
// This could also be done with a cmp/ccmp pair, but same length.
// Can save one instruction here if
sub x3, x2, x0
// x3 = offset of pc in bytes.
cmp x3, x1, lsl #3
// Check the unsigned result, so that negative pc corresponds to
// very large value in x3, definitely higher than x1.
b.lo next_insn
ret
I couldn't find much clever golfing to do here. The load-store architecture hurts here as we need three instructions to do a read-modify-write of memory.
(Atomic LDADD is a possibility if we have ARMv8.2, despite being much more inefficient, but we'd need another instruction to negate the value from a as there is no LDSUB, and it wouldn't set the flags so that would cost one more. If the condition to jump were *b - *a < 0 instead of <=, we could branch on the sign bit with tbz. LDADD also lacks the fancy addressing modes that we want to use.)
I took the same interpretation as Peter Cordes that signed overflow in the subtraction is undefined behavior, so that we can branch on *b <= *a instead of *b - *a <= 0.
It was also unclear what should happen if you jump to a negative address. This version will consider it successful termination. If you allow it to be UB, you can shave off one instruction (4 bytes) by changing the second argument (in x1) to be a pointer to the end of the array, instead of a length count, and then just change the loop condition to cmp x2, x1; b.hs next_insn.
I considered keeping an integer instead of a pointer as the program counter. It saves the add instruction to compute the new pc for c, but costs another one to add 3 for the no-jump case, because loads can't post-increment an index register.
I also had some thoughts about using 32-bit cells so that all three elements of the instruction could be loaded with one 128-bit ldp, but unpacking from the high half of a 64-bit register is too much extra work.
Ruby, 55 bytes
->x,i=0{i=(x[x[i+1]]-=x[x[i]])>0?i+3:x[i+2]until !x[i]}
Updates input array in-place.
R*, 84 83 bytes
f=\(p,i=1)`if`(i>sum(p|1),p,f(p,`if`((p[j[2]]=diff(p[j<-p[i+0:2]+1]))<1,j[3],i+3)))
Attempt This Online!* or Try it online using function instead of the shorter \ notation.
*ATO uses a version of R (version 4.3.1) in which the if function errors when the condition has length >1. Versions between 4.1.0 and <4.2.0 just give a warning for this without halting with an error, and the header in the ATO link redefines the if function to mimic this behaviour.
It would cost +3 bytes to adjust the program to run in all versions of R ≥4.1.
Z80 machine code, 22 bytes
44 4E 2C 0A 5F 4E 2C 0A 93 02 3D F2 1C 01 6E 2D 2C 7D BA D0 18 EB
Explanation of disassembled code:
; hl = Pointer to start of program. Must be aligned to 256 byte boundary.
; d = Length of the program in bytes.
subleq:
ld b, h ; We want to use bc as a data pointer, so align
; it to the same boundary as hl.
loop:
ld c, (hl) ; Get a pointer to the first operand in bc and
inc l ; advance the instruction pointer.
ld a, (bc) ; Put the first operand in e.
ld e, a
ld c, (hl) ; Get the second operand pointer in the same
inc l ; way as the first.
ld a, (bc) ; Put the second operand in a and subtract e
sub e ; from it, storing the result back where it
ld (bc), a ; came from.
dec a ; If the subtracted value was positive, skip
jp p, no_jump ; the jump.
ld l, (hl) ; Load our new instruction pointer from the
dec l ; third parameter. Decrement immediately since
; we will increment next no matter what.
no_jump:
inc l ; Advance the instruction pointer.
ld a, l ; If we're past the end of the program, return.
cp d
ret nc
jr loop ; Otherwise, continue our interpretation loop.
This is a plain Z80 subroutine called via call with registers as parameters. This subleq processor is 8-bit, both in data and addressing. Jump locations are interpreted as unsigned rather than signed, although comparisons are still signed. Do note that values are not exactly two's complement as $80 is interpreted as positive rather than negative; however, the challenge does not specify this, and it should not affect the Turing-completeness of the implementation.
Example code to call the subroutine with the provided data:
ld hl, prog_start
ld d, prog_end - prog_start
call subleq
halt
.align 256
prog_start:
.db 3, 4, 3
.db 6, 13, 9
.db 6, 3, -3
.db 7, 8, 3
prog_end:
You can paste these chunks of assembly into https://www.asm80.com/ to run them. Or, pull out your favorite Z80 system and run it there.
05AB1E, 34 33 bytes
[¾Ig@#Ðü3¾èRćU¬VèÆ©Yǝ®0›i3ë.µX}F¼
Explanation:
[ # Start an infinite loop:
¾ # Push counter variable `¾` (which starts at 0)
Ig # Push the input-length
@ # Pop both, and if `¾` is >= the input-length:
# # Stop the infinite loop
# (after which the resulting list is output implicitly as result)
Ð # Triplicate the current list
# (which will be the implicit input-list in the first iteration)
ü3 # Pop one copy and push its overlapping triplets
¾è # Pop and get the current Subleq-triplet [L[a],L[b],L[c]] at index `¾` (a)
R # Reverse this triplet [L[a],L[b],L[c]] to [L[c],L[b],L[a]]
ć # Extract its head; pop and push remainder-list and first item separately
U # Pop and store L[c] in variable `X`
¬ # Push the first item of the pair (without popping): L[b]
V # Pop and store L[b] in variable `Y`
è # Index [L[b],L[a]] in another copy of the list: [L[L[b]],L[L[a]]]
Æ # Reduce this pair by subtracting: L[L[b]]-L[L[a]]
© # Store it in variable `®` (without popping)
Y # Push L[b] from variable `Y`
ǝ # Insert L[L[b]]-L[L[a]] at index L[b] into the list
i # If
® # L[L[b]]-L[L[a]] from variable `®`
0› # is positive (>0):
3 # Push a 3
ë # Else (it's <=0):
.µ # Reset the counter variable `¾` to 0
X # And push L[c] from variable `X`
} # Close the if-else statement
F # Loop that many times (either 3 or L[c]):
¼ # Increase the counter variable by 1 every iteration
x86(-64) machine code, 17 bytes
A function that simulates a SUBLEQ machine with 8-bit cells, returning to the caller with the array modified. Callable from C with the x86-64 SysV calling convention as void subleq8(int8_t *end, int8_t *aligned_start); (Like a .begin() / .end() range, the end pointer start+length.)
start must be aligned by 256 so its low 8 bits are 0. This allows indexing the array by replacing the low byte with x86 partial-register stuff, instead of needing to zero-extend the guest byte into a separate register and use an indexed addressing mode. (A similar trick is useful on 8-bit systems like AVR where two 8-bit registers form a 16-bit pointer: if you don't cross a 256-byte boundary, you can skip adc into the high byte; if the array is aligned by 256 you can just replace the low byte.)
start must be in the low 32 bits of virtual address space, e.g. the x32 ABI, mmap(MAP_32BIT), or simply a static array (e.g. in .data) in a non-PIE Linux executable. Or if you run the same machine code in 32-bit mode.
;;; array in ESI, aligned by 256 (so the low byte can be replaced to index into it)
;;; end-pointer (array+len) in EDI
subleq8:
89F0 mov eax, esi ; copy high bytes to EAX so merging a new low byte generates an address
.nextinst:
AC lodsb ; fetch address A into AL, forming an x86 pointer in RAX
8A 10 mov dl, [rax] ; deref A
AC lodsb ; fetch address B
28 10 sub [rax], dl ; SUB and set FLAGS
AC lodsb ; fetch branch target address; RSI points at next instruction
0F 4E F0 cmovle esi, eax ; update guest PC according to FLAGS
39 FE cmp esi, edi ; like cmp sil, dil but range-shifted: both regs have the same high bytes
72 F2 jb .nextinst ; len must be at most 127 to still consider *all* negative byte values as unsigned > 0x7f
C3 ret
Instead of checking for the subtraction result being negative or zero, we actually check if [b] was less-or-equal than [a] before the subtraction, using FLAGS set by the sub instruction (like cmp). This is the same as checking the result against 0 if signed overflow didn't happen, otherwise it's the opposite. With Python arbitrary-precision integers, signed overflow is impossible. (There is no x86 FLAGS condition for negative-or-zero so it would take an extra 3 bytes, either for cmp byte [rax], 0 or for cmovs + cmove instead of cmovle. Like most ISAs with signed-compare conditions, x86 does have a le less-or-equal condition that checks SF != OF || ZF in the same predicate.)
Apparently even Wikipedia defines a SUBLEQ OISC in terms of comparing against zero, rather than the usual computer-architecture meaning of a less-or-equal condition after a subtraction. :/ If signed wrap-around (overflow) doesn't happen, the two variants of the architecture are equivalent. Unless / until there's a test-case that requires checking the sign of the result instead of the LE condition generated by subtraction, I'm going to leave my answer this way. I'm pretty confident this is also Turing complete and probably not significantly different for implementing most basic arithmetic, if any of that comes near signed overflow.
The max program length is 127 or 128 bytes if you want any negative address to end the program. 126 is the highest multiple of 3, if your program doesn't do misaligned jumps and doesn't want to keep some data past the end of the last instruction. (It effectively treats the program counter as unsigned, looping do {} while(PC < end), so a negative program counter is an unsigned number in the upper half of range, above any signed-positive length.) Otherwise, max program length = 255, exiting only on PC=255, although that's untested and rollover of the high bytes of RSI could happen, but maybe only with misaligned jumps, e.g. to PC=254 so fetching the full instruction would get to PC=257, with lodsb carrying into the higher bytes of RSI.
(With len limited to 128, misaligned jumps are no problem. e.g. a program might use its first instruction to jump to address 4 and then use the first 4 bytes purely as data, eventually falling off the end when PC = 128.)
Tested with the test-case in the question; it's correct for that but it doesn't AFAIK test for off-by-one errors (<= vs. < in the SUBLEQ operation), or for exiting via a taken branch to a negative or past-the-end address. I think my program is correct, e.g. cmovle treats "equal" the same as "less-than", and the test-case does exercise less-than causing the branch to be taken.
x86(-64) machine code, 21 bytes
Simulates a 32-bit SUBLEQ machine. Writing a 32-bit register zero-extends to the full 64-bit register so even if we were willing to align the SUBLEQ program by 2^32 that wouldn't all the same partial-register trickery.
Even starting at absolute address 0 wouldn't work, because SUBLEQ addressing is in terms of cells, like a word-addressable machine. But x86 is a byte-addressable architecture, so SUBLEQ addresses have to be scaled by 4 to get x86-64 byte offsets since we're using 4-byte cells.
In x86-64 SysV, void subleq32(int32_t *end, int32_t *start). I think the start pointer might need to be 4GiB aligned to properly handle all jumps to negative addresses, since we only compare the low 32 bits after doing x86_address = base + target*4. With a larger base, some negative targets could wrap. And due to the scaling by 4, the most negative value we detect is -2^30. We shift out the high bits of some larger numbers, although some larger-magnitude numbers will happen to have a bit that gets shifted to the sign-bit position, too.
So maybe it's only really able to correctly simulate a 30-bit SUBLEQ machine, or 29-bit or something. But still much more than 8-bit, which is the point.
;; word-addressable SUBLEQ machine, where the components of an instruction are still 1 guest address apart
;; despite them being farther apart in x86-64
;;; base address in ESI
;;; end address in EDI (base+length in bytes, the one-past-end address. An x86 address, not guest words)
subleq32:
89F1 mov ecx, esi ; save the base for later use.
.nextinst:
AD lodsd ; fetch address A
8B1481 mov edx, [rcx+rax*4] ; and deref; we don't need it after this
AD lodsd ; fetch address B
291481 sub [rcx+rax*4], edx ; update [B] and set FLAGS
AD lodsd ; fetch branch target address; RSI points at next instruction for non-taken
7F03 jnle .not_taken
8D3481 lea esi, [rcx+rax*4] ; scale the SUBLEQ target address to an x86 address
.not_taken:
39FE cmp esi, edi
72EE jb .nextinst ; unsigned compare catches negative as well as past-the-end
C3 ret
I also tested this with dd instead of db for the initial array, using GDB print (int[12])prog32 after running, or display the same expression while single-stepping. This was my test caller:
section .data
prog32:
dd 3, 4, 3,
dd 6, 13, 9,
dd 6, 3, -3,
dd 7, 8, 3
prog32_end:
section .text
global _start
_start:
mov edi, prog32_end
mov esi, prog32
call subleq32
int3 ; software breakpoint
xor edi, edi
mov eax, 231
syscall ; x86-64 Linux exit_group(0)
x86(-64) machine code for a byte-addressable SUBLEQ, 17 bytes
Untested, but it's the same logic as the 8-bit version, with the same instructions except 32-bit operand-size. (So writing full registers instead of merging into the low byte).
The SUBLEQ program's cells are 4 SUBLEQ addresses wide, matching the x86 addressing. One way to make the test program run the same is to scale all its starting values by 4, like 12, 16, 12 | 24, 52, 36 | .... If you do misaligned load/store or jumps, it will expose the fact that it's a little-endian machine. (SUBLEQ doesn't have byte loads, only word loads.)
This requires the array to start at absolute address 0. Or in 32-bit mode, DS:0, so you could plausibly imagine some segmented code that sets the DS segment base for passing arrays. (Some "string" instructions use the ES segment, but lodsd uses ds:esi).
;; byte-addressable SUBLEQ machine
;;; base address = 0
;;; end address = len in EDI
subleq32b:
31F6 xor esi, esi ; start address = 0, can't justify having the caller pass it since we depend on it being 0
.nextinst:
AD lodsd ; fetch address A
8B10 mov edx, [rax] ; and deref; we don't need it after this
AD lodsd ; fetch address B
2910 sub [rax], edx ; update [B] and set FLAGS
AD lodsd ; fetch branch target address; RSI points at next instruction for non-taken
0F4EF0 cmovle esi, eax
39FE cmp esi, edi
72F2 jb .nextinst ; unsigned compare catches negative as well as past-the-end
C3 ret
A 16-bit version of this would be highly plausible in 16-bit mode where segmentation is normal, then it wouldn't be too weird to take an array at ds:0. Same machine code except for the ModRM.rm field that encodes [rax]... except there is no [ax] addressing mode, only [bx], [di], or [si] for single-register 16-bit addressing modes that use the DS segment. So that might cost a couple 1-byte xchg instructions, perhaps with BX.
16-bit addressing modes don't allow a scale factor so they're not very convenient for a word-addressable SUBLEQ.
Python 3.11, 92 86 85 62 bytes
This is my answer, so don't suggest copying from other answers.
def f(p,i=0):
p[k:=p[i+1]]-=p[p[i]];f(p,(i+3,p[i+2])[p[k]<1])
Takes in a program as a list, mutates the original list with an error (you need a variable and try/except to access the results).
Explanation:
def f(p,i=0):
Define the main function, p is the program to be ran and initialize the instruction pointer as i.
p[k:=p[i+1]]-=p[p[i]];
Subtract A from B and store in B. Also assigns B as k to shave off bytes.
f(p,(i+3,p[i+2])[p[k]<1])
If B is less than 1, then recursively call with i as C, otherwise add 3.
Python 3, 58 bytes
def f(s,i=0):a,b,*r=s[i:];s[b]-=s[a];f(s,[i+3,*r][s[b]<1])
A recursive function that updates a Subleq program, the list of integers s, in place terminating with a ValueError.
JavaScript (ES6), 57 bytes
Modifies the input array.
f=(m,p=0)=>1/m[p]&&f(m,(m[m[p+1]]-=m[m[p]])>0?p+3:m[p+2])
Commented
f = ( // f is a recursive function taking:
m, // m[] = memory array
p = 0 // p = program counter
) => //
1 / m[p] // abort if m[p] is undefined
&& // otherwise:
f( // do a recursive call:
m, // pass m[]
( //
m[m[p + 1]] // subtract from m[m[p + 1]] ...
-= m[m[p]] // ... the value stored at m[m[p]]
) > 0 ? // if the result is positive:
p + 3 // continue execution at p + 3
: // else:
m[p + 2] // jump at m[p + 2]
) // end of recursive call
Charcoal, 45 bytes
W‹ⅈLθ«≔§θ⊕ⅈι§≔θι⁻§θι§θ§θⅈ¿›§θι⁰M³→J§θ⁺²ⅈ⁰»⭆¹θ
Try it online! Link is to verbose version of code. Explanation:
W‹ⅈLθ«
Repeat while the instruction pointer is within range.
≔§θ⊕ⅈι
Get the B parameter into a variable.
§≔θι⁻§θι§θ§θⅈ
Subtract the element at the A parameter from the element at the B parameter.
¿›§θι⁰
If the element at the B parameter is still positive, ...
M³→
... move to the next instruction, otherwise...
J§θ⁺²ⅈ⁰
.... jump to the C parameter.
»⭆¹θ
Output the final array.
Python3, 94 bytes:
def f(t,i=0):
if i>=len(t):return t
t[t[i+1]]-=t[t[i]];return f(t,[t[i+2],i+3][t[t[i+1]]>0])