| Bytes | Lang | Time | Link |
|---|---|---|---|
| 010 | Python | 250130T074640Z | Lucenapo |
| 003 | Grass | 230623T011053Z | bluswimm |
| 1225 | Perl | 230109T155340Z | Andy A. |
| 002 | Pyt | 230108T201507Z | Kip the |
| 006 | C# | 220823T095654Z | Acer |
| 007 | Knight | 220821T045327Z | Sampersa |
| 002 | Chicken | 211125T013134Z | Fmbalbue |
| 004 | Jelly | 200719T204343Z | fireflam |
| 019 | PowerPC for XBox 360 | 220819T215024Z | NoLonger |
| 008 | Lean Mean Bean Machine | 211105T151145Z | Mayube |
| 1164 | x86 32bit machine code | 170720T201742Z | Peter Co |
| 003 | Add++ | 211116T180048Z | Fmbalbue |
| 003 | KonamiCode | 211112T171946Z | Ginger |
| nan | Rust | 210704T175354Z | Aiden4 |
| 003 | Swift | 210923T164429Z | Bbrk24 |
| 010 | Pip Classic | 210922T201738Z | DLosc |
| 003 | Google Sheets | 210809T123726Z | General |
| 002 | Dis | 210807T230746Z | user1004 |
| 003 | Desmos | 210318T150151Z | Ethan Ch |
| 002 | NDBall | 201102T135617Z | Aspwil e |
| 002 | Zig 0.6.0 | 200721T194828Z | pfg |
| 026 | Scratch 1.x | 200720T194637Z | qarz |
| 011 | JavaScript | 180214T043312Z | iovoid |
| 005 | Turing Machine Code | 191115T145634Z | ouflak |
| 007 | JavaScript | 191211T183622Z | null |
| 002 | Unreadable | 191126T154427Z | Robin Ry |
| 002 | Milky Way | 191126T121620Z | user8505 |
| 034 | Keg | 191126T070343Z | user8505 |
| 010 | Python | 170720T050651Z | KSab |
| 002 | Shakespeare Programming Language | 191113T204009Z | Hello Go |
| 027 | Ruby | 190124T032942Z | CG One H |
| 004 | AWK | 170720T152907Z | Robert B |
| 006 | Go | 190810T043440Z | Purple P |
| nan | TIBasic 83+/84+/SE | 180302T182143Z | bb94 |
| 012 | INTERCAL | 190404T012347Z | Unrelate |
| 003 | Runic Enchantments | 190228T194751Z | Draco18s |
| 005 | Ink | 190228T165351Z | Sara J |
| 007 | PCRE Regex | 190126T052214Z | Deadcode |
| 004 | ECMAScript Regex | 190126T053607Z | Deadcode |
| 016 | Powershell | 181119T215744Z | Veskah |
| 005 | Rockstar | 181119T182910Z | Mayube |
| 003 | Husk1 | 181122T180501Z | ბიმო |
| 003 | Scratch scratchblocks2 | 181121T211842Z | W. K. |
| 008 | JavaScript Node.js | 181101T022344Z | Shieru A |
| 015 | Literate Haskell | 181101T044011Z | Ørj |
| 002 | FRACTRAN | 181101T024458Z | Conor O& |
| 005 | AutoHotkey | 180301T235540Z | nelsontr |
| 002 | SmileBASIC | 180301T200340Z | 12Me21 |
| 006 | Ly | 170810T000009Z | LyricLy |
| 003 | Gaia | 170808T151309Z | Business |
| 002 | BrainHack a variation of BrainFlak | 170720T132647Z | Riley |
| 007 | JavaScript | 170720T051932Z | user4180 |
| 002 | Ada | 170721T165152Z | xaambru |
| 008 | COBOL GNU | 170721T132427Z | PhilDenf |
| 003 | APL and MATL and Fortran | 170720T071127Z | Adá |
| 002 | Commodore 64 Basic | 170720T215733Z | Mark |
| 002 | VBA | 170720T185208Z | Taylor R |
| 014 | Fortran | 170720T173326Z | Steadybo |
| 004 | S.I.L.O.S | 170720T130718Z | Rohan Jh |
| 016 | C# | 170720T084205Z | TheLetha |
| 007 | CJam | 170720T080637Z | Erik the |
| 016 | C clang | 170720T062127Z | Anders K |
| 004 | Java | 170720T064848Z | user4180 |
| 018 | Free Pascal | 170720T053953Z | tsh |
| 006 | Pyth | 170720T051559Z | isaacg |
| 002 | Changeling | 170720T050103Z | Dennis |
Python (10 bytes)
'''"""\xx
The \xx cannot be in a comment due to the newline. If it is in a string, the error is 'unicodeescape' codec can't decode bytes in position 2-3: truncated \xXX escape.
If it is not in a string, the error is unexpected character after line continuation character.
Grass, 3 bytes
wWv
Grass programs are written using only w, W, and v; all other characters are ignored.
w is necessary because any characters placed before the program's first w are ignored.
W is expected to be followed by a sequence of w characters to form a function application; however, it is instead followed by a v, which causes a parse error.
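A rough way to see why wWv fails is a shape check on the program text. The sketch below is not a Grass interpreter. It assumes the commonly described grammar: a program is a list of segments separated by v, each segment is an optional run of w followed by zero or more applications of the form W+ w+. The helper name looks_like_valid_grass is made up for illustration.

```python
import re

# One segment between v separators: an optional abstraction header (w*)
# followed by zero or more applications, each a run of W then a run of w.
# This grammar is an assumption, not taken from a reference implementation.
SEGMENT = re.compile(r"w*(?:W+w+)*")

def looks_like_valid_grass(source: str) -> bool:
    """Hypothetical shape check for a Grass program."""
    # Every character other than w, W and v is ignored.
    code = "".join(ch for ch in source if ch in "wWv")
    # Anything placed before the first w is ignored as well.
    start = code.find("w")
    if start == -1:
        return False
    segments = code[start:].split("v")
    return all(SEGMENT.fullmatch(seg) is not None for seg in segments)

print(looks_like_valid_grass("wWWwwww"))  # one abstraction with one application
print(looks_like_valid_grass("wWv"))      # W is followed by v instead of w
```

Under these assumptions, whatever is placed before the snippet, the W still ends its segment without a following w, so the segment never matches.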
Perl, 12 bytes 25 bytes
Tried to cut all strings. Not really short.
.
"'/%+)}]>#
=pod
=cut
.
- First newline to exit comments (there are no multiline comments)
- then . in first column to exit format sections
- end strings (there are strings like "...", '...' and like qw/.../, but instead of '/' other chars can be used)
- start a POD section (=pod at the beginning of the line)
- end the POD section (=cut at the beginning of the line)
- . is not in any comment and is invalid.
After __DATA__ or __END__ there is no code. There starts a data section.
Hope this is correct (my first step on code golf ...)
Pyt, 2 bytes
ɬĹ
Just one amongst many.
It's pretty simple, actually: Pyt has no comment markers, so every valid character when using Pyt's proprietary encoding (not UTF-8) does something.
ɬ pushes the string "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
Ĺ returns the least common multiple of the top two elements on the stack
You can replace ɬ with either ɫ or ʊ (only the lowercase and uppercase letters, respectively), and you can replace Ĺ with the vast majority of the functions in Pyt, as most of them only work on numbers.
There is another 2-byte family of solutions:
do
where d is any digit (0-9), and o is either ҏ or Ƒ. ҏ palindromizes a string or array and does not work on numbers, and Ƒ flattens the array at the top of the stack, but errors if the top of the stack isn't an array.
And there is a singleton (not in a family of solutions):
ɔŕ
ɔ clears the stack, and then ŕ pops the top of the stack (and errors out if it is empty).
C#, 6 bytes
"*/#
# is only used in C# to specify a preprocessor directive. Preprocessor directives are only valid if the # is the first non-whitespace character on its line. As a result, it doesn't even matter if the specified directive exists or not (so the shortest solution is to not specify any directive name).
The only way to prevent # being interpreted as a directive is to put it in a comment or string.
It is not possible to comment out # because // is prevented by the preceding newline and /* is prevented by */. It is not possible to enclose it in a string because multiline strings must start with @", but there is no way to put @ in front of a string that would capture # (@\n" is not valid).
Knight, 7 bytes
Technically, Knight programs aren't allowed to contain anything outside of \r, \n, \t, or <space>-~, so you could just submit <NUL> for a one-byte solution.
Additionally, each Knight program is exactly a single expression; handling anything after that expression is up to the implementation. So, you could prefix any program with TRUE (a valid expression), after which anything that follows would be undefined behaviour. However, that kind of defeats the spirit of the question as well IMO, so let's assume that we're actually in some arbitrary expression:
#"'#'
$
The idea is since $ is not a legal function, it cannot appear outside of strings.
- If there's no leading code, then the comment removes the rest of the first line.
- If we're embedded in a comment, our entire first line will be commented out, leaving a lone $, which is illegal.
- If we're inside a " string, then the first " will close it, which is then followed by the legal '#' string.
- If we're inside a ' string, then the first ' will close it, with the remaining ' being commented out by the #, leaving $ by itself.
Chicken, 2 bytes
ih
Thanks to jimmy23013
Jelly, 3 4 bytes
»»
€
Alternative 4 bytes:
»
€
+1 byte because emanresu A found an issue with the old approach
Jelly has no comments, and every line in the code is parsed whether or not it is reachable. The only way to prevent some code from being executed is by putting it in a string literal. There are different types of string literals based on the ending character, but almost all start with “. This is countered by », which terminates a string (and interprets it as a dictionary-compressed string).
The only two string literal types that do not terminate at the first » are ⁾ and ⁽, which each process precisely the next two characters. However this still leaves the newline unescaped.
Thus € (each/map) will always be executed on a link (line) of its own. It tries to get a link from its left; since none exists, the interpreter errors.
PowerPC for XBox 360, 19 bytes
~
J,e~
J,e~
J,e~
J,
The operative sequence is ~␊J,, in which the linefeed may be replaced with any byte less than 32 and the J with any byte congruent to 2 mod 8.
If this sequence appears on a four-byte boundary (which it will, in the above string) on an executable page, it codes an xdcbt (prefetch to L1 cache, skipping L2); any (even speculative) jump to it will then break cache coherency, probably leading to a crash soon after. Since the branch predictor even predicts computed jump destinations, this will happen sooner or later, see here.
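The alignment claim can be checked mechanically. The sketch below locates every copy of the four-byte sequence in the 19-byte string and then decodes it, assuming big-endian 32-bit instruction words and the usual X-form layout (primary opcode in the top 6 bits, extended opcode in bits 1 to 10 counted from the low end). It only inspects bit fields and does not emulate a CPU.

```python
import struct

# The 19-byte string from above, and the operative four-byte sequence.
snippet = b"~\nJ,e~\nJ,e~\nJ,e~\nJ,"
pattern = b"~\nJ,"

# Offsets at which the operative sequence starts inside the string.
offsets = [i for i in range(len(snippet)) if snippet.startswith(pattern, i)]
# Each copy sits at a different residue mod 4, so one of them lands on a
# four-byte boundary no matter where the string itself is placed.
residues = sorted(off % 4 for off in offsets)

# Decode the sequence as one big-endian instruction word (assumed layout).
(word,) = struct.unpack(">I", pattern)
primary_opcode = word >> 26
extended_opcode = (word >> 1) & 0x3FF

print(len(snippet), offsets, residues)
print(hex(word), primary_opcode, extended_opcode)
```

Primary opcode 31 with extended opcode 278 is the dcbt family in the PowerPC ISA, which is consistent with the xdcbt reading described above.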
I wish I had been able to force an all-printable version, but LF is pretty close.
Lean Mean Bean Machine, 8 bytes
OO
0/
&
Begins with a leading newline.
Unfortunately this is basically an entire invalid program, but it can be inserted into larger programs, and there's no way to my knowledge to prevent the runtime error.
LMBM creates a 'marble' (aka instruction pointer) at every O in the code, and there is no way to prevent this (which now that I think about it causes some issues if you need an O in a string). & is the division operator in LMBM.
In short, this snippet causes a ZeroDivisionError: division by zero runtime error.
x86 32-bit machine code, 11 bytes (and future-proof 64-bit)
90 90 90 90 90 90 90 90 90 0f 0b
This is times 9 nop / ud2. It's basically a NOP sled, so it still runs as 0 or more nops and then ud2 to raise an exception, regardless of how many of the 0x90 bytes were consumed as operands to a preceding opcode. Other single-byte instructions (like times 9 xchg eax, ecx) would work, too.
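The NOP-sled property can be illustrated with a toy model. This sketch does not decode x86; it only assumes that a preceding instruction swallows some number of leading bytes as operands, with 9 taken as the upper bound (SIB byte plus disp32 plus imm32), and checks what is left to execute.

```python
NOP = 0x90
UD2 = bytes([0x0F, 0x0B])
snippet = bytes([NOP] * 9) + UD2

# Assumed upper bound on bytes a preceding instruction can swallow:
# SIB (1) + disp32 (4) + imm32 (4).
MAX_CONSUMED = 9

def remaining_after(consumed: int) -> bytes:
    """Bytes still executed after `consumed` leading bytes become operands."""
    return snippet[consumed:]

for consumed in range(MAX_CONSUMED + 1):
    rest = remaining_after(consumed)
    nops_left = len(rest) - len(UD2)
    # Whatever is left is always zero or more NOPs followed by an intact ud2.
    assert rest == bytes([NOP] * nops_left) + UD2
    print(consumed, nops_left, rest.hex(" "))
```

With only 8 NOPs the same loop would reach a case where the 0f of ud2 is swallowed too, which is the situation the disassembly further down shows.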
x86 64-bit machine code, 10 bytes (current CPUs)
There are some 1-byte illegal instructions in 64-bit mode, until some future ISA extension repurposes them as prefixes or parts of multi-byte opcodes in 64-bit mode only, separate from their meaning in 32-bit mode. 0x0e is illegal in 64-bit mode on current CPUs (tested on Intel Skylake). It's push cs in 32-bit mode; AMD64 freed up a bunch of opcodes for future use but as of yet neither Intel nor AMD have added any 64-bit-only extension to use them. But they still could, which is why this isn't future-proof; some future CPU could decode 0x0e as an opcode or prefix.
0e 0e 0e 0e 0e 0e 0e 0e 0e 0e
Rules interpretation for executable machine code:
The bytes can't be jumped over (like the "not parsed" restriction), because CPUs don't raise exceptions until they actually try to decode/execute (non-speculatively).
Illegal means always raises an exception, for example an illegal-instruction exception. (Real programs can catch that with an exception handler on bare metal, or install an OS signal handler, but I think this captures the spirit of the challenge.)
It works because a shorter byte-string ending in ud2 could appear as an imm32 and/or part of the addressing mode for another instruction, or split across a pair of instructions. It's easiest to think about this in terms of what you could put before the string to "consume" the bytes as part of an instruction, and leave something that won't fault.
I think an instruction can consume at most 9 bytes of arbitrary stuff: a SIB byte, a disp32, and an imm32. i.e. the first 2 bytes of this instruction can consume 8 NOPs and a ud2, but not 9.
c7 84 4b 00 04 00 00 78 56 34 12 mov dword [rbx+rcx*2+0x400],0x12345678
Can't beat 9 nops:
db 0xc7, 0x84 ; opcode + mod/rm byte: consumes 9 bytes (SIB + disp32 + imm32)
times 9 nop ; 1-byte xchg eax, ecx or whatever works, too
ud2
----
b: c7 84 90 90 90 90 90 90 90 90 90 mov DWORD PTR [rax+rdx*4-0x6f6f6f70],0x90909090
16: 0f 0b ud2
64-bit mode:
c7 84 0e 0e 0e 0e 0e 0e 0e 0e 0e mov DWORD PTR [rsi+rcx*1+0xe0e0e0e],0xe0e0e0e
0e (bad)
But the bytes for 8 NOPs + ud2 (or times 9 db 0x0e) can appear as part of other insns:
db 0xc7, 0x84 ; defender's opcode + mod/rm that consumes 9 bytes
times 8 nop ; attacker code
ud2
times 10 nop ;; defender's padding to be consumed by the 0b opcode (2nd half of ud2)
----
18: c7 84 90 90 90 90 90 90 90 90 0f mov DWORD PTR [rax+rdx*4-0x6f6f6f70],0xf909090
23: 0b 90 90 90 90 90 or edx,DWORD PTR [rax-0x6f6f6f70]
29: 90 nop
2a: 90 nop
...
KonamiCode, 3 bytes
]()
Explanation:
First, the ] closes any comments this string might be placed in. Then, the () attempts to pass an empty number to the comment closing sign. Using the pynami interpreter, this will always result in a "This command doesn't have a parameter!" error, even if you put it inside a comment.
Rust, 65540 bytes
`"###... a lot more #s ###
That is a backtick, a quote, 65535 #s, and an invisible U+202A.
Per this advisory, there is an issue with Unicode bidirectional overrides that can cause Unicode-aware editors to misrender the code, allowing bad actors to hide malicious code in plain sight. Starting in Rust 1.56.1, rustc contains a mitigation for that vulnerability, preventing the use of the relevant codepoints by default anywhere in the source code. However, the text_direction_codepoint_in_literal lint can be set to allow or warn to let you use them in string literals.
String literals are tricky to break out of, mainly because raw string literals are designed to allow any valid UTF-8 within them. However, take a look at rustc's lexer and you will find something interesting: no more than 65535 hashtags are permitted in delimiters. Simply adding the quote followed by the long string of hashtags is enough to break any attempts at surrounding it in strings, and the U+202A is always invalid anywhere but a string.
However, there is another problem: someone could use the starting quote to begin a string, and hide the nastiness that way. This is where the backtick comes into play: it is an error anywhere but a literal or a comment, thus making it truly impossible for Rust to contain this code.
If you're wondering whether there is an easier solution that fails after the lexer, you'd be disappointed: anything that passes the lexer can be thrown away by a macro, and tacking on a #[cfg(nope)] will make anything that parses disappear.
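Since the string is too long to show, here is a small sketch that rebuilds it from the description and checks the byte count. It assumes the pieces appear in the order listed (backtick, quote, hashes, U+202A) and that the file is UTF-8 encoded; the variable names are only illustrative.

```python
# Pieces of the string, in the order described above.
BACKTICK = "`"
QUOTE = '"'
HASHES = "#" * 65535   # the hash count given in the description
BIDI = "\u202a"        # U+202A LEFT-TO-RIGHT EMBEDDING

snippet = BACKTICK + QUOTE + HASHES + BIDI
encoded = snippet.encode("utf-8")

# U+202A takes three bytes in UTF-8; everything else is one byte each.
print(len(BIDI.encode("utf-8")), len(encoded))
```

The total of 1 + 1 + 65535 + 3 bytes matches the 65540 in the heading.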
edit: an actually valid answer this time
Swift, 4 3 bytes
There are control characters in here, so I'll just give their ASCII values: 0A 22 00.
Most of the ways that other C-family languages do this don't work in Swift, because Swift allows nested block comments.
The null character always generates at least a warning. However, if it's in the middle of whitespace or a line comment, it's only a warning, not an error. Additionally, Xcode lets you put most control characters inside of a block comment and just rolls with it.
Emphasis on most. For some reason, if you have a null byte in the middle of a block comment or string literal, the Swift compiler gets lost and can't find the end, no matter where it is. This causes an error (and also breaks Xcode's syntax highlighting). So, this is three bytes:
- 0A -- a newline, to make sure we're not in a line comment or single-line string literal.
- 22 -- ", the start of a single-line string literal, unless it's inside a multi-line string literal or block comment. This does not end a multi-line string literal if it started inside of one.
- 00 -- a null byte, to prevent the block comment or string literal from terminating.
The original answer used /* (open block comment) rather than ".
Pip Classic, 10 bytes
;\"
;`
*;
I think this should be solid, but I welcome anyone to prove me wrong.
How?
A semicolon at the start of a line begins a single-line comment, so our code is effectively
*;
and Pip complains because * is a binary operator and needs a left operand.
If we add any expression before the snippet, we still get an error because * needs a right operand. No operand can be supplied because * is followed by the expression terminator ;.
We can't wrap the snippet in a string, because every possible string delimiter (", \", and `) is matched by a delimiter in the snippet. (In a regular "-delimited string, the backslash is not an escape character, so it doesn't mess anything up.)
Finally, we can't comment out the last line of the snippet because Pip Classic doesn't have block comments.
Google Sheets, Excel; 3 bytes
For a single-cell Formula (starts with + or =). Otherwise, you can just put whatever arbitrary text you want into a cell.
Here it is: 1"1
- A string has to be enclosed in "s. A double quote must be escaped by "".
- The " has to be a string token boundary.
- If it's the end, following a string with a 1 is illegal.
- If it's the beginning, preceding a string with a 1 is also illegal.
Dis, 2 bytes.
))
How it works
- ( to the nearest ) is a comment.
- But nothing can match the second ) above in this syntax.
- Thus it is a syntax error.
Desmos, 3 bytes
\
(linefeed, backslash, linefeed)
I recently said "Pasting in invalid-formatted text [into Desmos] simply does nothing." As I recently discovered, this isn't quite true. When pasting in multiple lines, it's happy to throw errors if one of them is invalid. Now, this works even if some of the other lines are blank, causing them to be ignored. Therefore, we make a \ on its own line, which is guaranteed to throw an error.
Logically, we should be able to knock this down to two bytes (\, LF), as no legal line can end with a \. However, Desmos strangely interprets this as an empty list (normally represented as []), for reasons I don't understand. This makes that two-byte string an unusual almost-illegal string, where the only valid program containing that string is the string itself.
NDBall, 2 bytes
Specifically in NDBallSim V1.0.1
##
A hashtag is a memory cell instruction and requires a direction, so two of them in a row anywhere will cause the error
NDBall Parse ERROR @ LINE (whatever line): Memory cell requires a direction ex: #>12
This double-character trick can actually be done with many more chars; to be exact, all of these: )(,}{|><+-pP$%Y][E and newline
This is actually all (discounting digits) of the usable chars in the language itself, because it never has a case where you use the same char twice in a row
Zig 0.6.0, 2 bytes
`
(\n`)
A newline will reset the tokenizer state, escaping any comments or multiline strings and erroring for single line strings. Backticks are not a valid character except in strings.
Scratch (1.x, except 1.2 beta), scratchblocks syntax, 26 bytes
when gf clicked
say(()/()
Leading new line ensures that "when gf clicked" will not be in a comment, so that what's below it will run.
This errors when run in the Stage, because the Stage cannot use the say block.
This errors when run in a sprite by itself, because a divide by zero is attempted. (When a number argument is blank, it is read as 0.)
This errors when run in a sprite when ) or a new line is added after it, because it doesn't change what the existing code does.
This errors when run in a sprite when something else is added after it, because it creates an undefined block which causes an error when run.
1.2 beta is excluded because it featured comment blocks (different from modern comments), which supported multiline, which this could be put into.
Versions past 1.4, including the Experimental Viewer, are excluded because dividing by zero does not cause an error.
JavaScript, 11 characters
`
`*/}'"`\u)
The backticks make sure to kill template strings, the quotes get rid of strings, the newline avoids commented lines, the end of comment avoids block comments, and the last backtick and escape (with a ) to avoid appending numbers or /) try to start an invalid string.
Turing Machine Code 5 bytes
Assuming block editing isn't allowed:
0
Or with the symbols showing:
< cr >< lf >
0
< cr >< lf >
Without block editing, it is impossible to stick this behind the comment symbol (';'), as the '0' will end up on the next line anyway. There is no block commenting in Turing Machine Code, a fact that is taken advantage of in other answers here as well. This patch of code would not only not run, it would kill the whole program before it can begin to execute, no matter where it is placed.
JavaScript, 7 bytes
*/
#`#
- Adding a // at the beginning will still not work because of the leading newline, leaving the second line uncommented.
- Adding a /* will not comment out the string completely because of the closing */ that completes it, leaving the # exposed.
- Adding ` will not quote the string completely, because of the ` that completes a string, leaving the # exposed.
- Regular expressions won't work because of the # following the / character.
- The / following * cannot be parsed as a regular expression, as regular expressions cannot have newlines.
Try it!
clicky.onclick=a=>{console.clear();console.log(eval(before.value+" */\n#`#"+after.value));}
textarea {
font-family: monospace;
}
<button onclick="console.clear();">Clear console</button>
<br>
<textarea id=before placeholder="before string"></textarea>
<pre><code> */
#`#</code></pre>
<textarea id=after placeholder="after string"></textarea>
<br>
<button id=clicky>Evaluate</button>
Unreadable, 2 bytes
''
All Unreadable commands must be of the form '""…", with one ' followed by 1 to 10 "s. Having two successive 's anywhere in the program leads to error: parser failed: invalid command (0): '.
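The token rule above is easy to express as a regular expression. The following is a minimal sketch that only checks the shape of the tokens; it assumes a program consists solely of ' and " characters and does not check whether each command receives the right number of arguments. The function name is hypothetical.

```python
import re

# One command token: an apostrophe followed by 1 to 10 double quotes.
COMMAND = "'" + '"{1,10}'
PROGRAM = re.compile(f"(?:{COMMAND})*")

def tokens_well_formed(program: str) -> bool:
    """Hypothetical check: is the program a sequence of valid command tokens?"""
    return PROGRAM.fullmatch(program) is not None

print(tokens_well_formed("'\"'\"\""))         # two well-formed tokens
print(tokens_well_formed("''"))               # the 2-byte illegal string
print(tokens_well_formed("'\"" + "''" + '"')) # still broken when embedded
```

Because the first ' of the pair can never be followed by a ", no surrounding text can turn it into a valid token.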
Milky Way, 2 bytes
Tries to execute an undefined opcode. Milky Way does not have comments. The newline is for ending strings.
)
Keg, 3 4 bytes
Fixed a for loop bug noted by @Jono2906
)ø.
Explanation
\n Terminate a line-comment
) End a for loop
ø Clear the stack
. Try to print the TOS item; the stack is empty, so this raises an error.
Python, 10 bytes (not cpython)
?"""?'''?
Edit:
Due to @feersum's diligence in finding obscure ways to break the Python interpreter, this answer is completely invalidated for any typical CPython environment as far as I can tell! (Python 2 and 3 for both Windows and Linux) I do still believe that these cracks will not work for PyPy on any platform (the only other Python implementation I have tested).
Edit 2:
In the comments @EdgyNerd has found this crack taking advantage of a non-ascii encoding declaration! This seems to decode to print(""). I don't know exactly how this was found but I imagine the way to fix this sort of exploit would maybe be to try different combinations of any invalid characters where the ?s are, and find one that doesn't behave well with any encoding.
Note the leading newline. Cannot be commented out due to the newline, and no combination of triple quoted strings should work if I thought about this correctly.
@feersum in the comments seems to have completely broken any CPython program on Windows as far as I can tell by adding the 0x1A character to the beginning of a file. It seems that maybe (?) this is due to the way this character is handled by the operating system, apparently being translated to an EOF as it passes through stdin because of some legacy DOS standard.
In a very real sense this isn't an issue with Python but with the operating system. If you create a Python script that reads the file and uses the builtin compile on it, it gives the more expected behavior of throwing a syntax error. PyPy (which probably does just this internally) also throws an error.
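The compile experiment can be sketched roughly as follows. To stay self-contained it passes strings to compile directly instead of reading a file, which is an assumption about how the original test was run. It only tries a handful of wrappers, so it proves nothing, and it does not cover the encoding-declaration crack mentioned above.

```python
# The snippet with its leading newline, as described above.
SNIPPET = "\n?\"\"\"?'''?"

def compiles(source: str) -> bool:
    """Return True if the builtin compile accepts the source text."""
    try:
        compile(source, "<candidate>", "exec")
    except SyntaxError:
        return False
    return True

# A few attempts at hiding the snippet; the labels are only descriptive.
candidates = {
    "bare": SNIPPET,
    "with 0x1A prefix": "\x1a" + SNIPPET,
    "after a line comment": "#" + SNIPPET,
    "wrapped in triple double quotes": '"""' + SNIPPET + '"""',
    "wrapped in triple single quotes": "'''" + SNIPPET + "'''",
}

for label, source in candidates.items():
    print(label, compiles(source))
```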
Shakespeare Programming Language, 2 bytes
.:
Explanation: If this string is in the title of the play, the . ends it and the : is not a valid character name. Similar problems occur in an act and scene name. No character can speak a line beginning with :, and the . will end a Recall statement, which can otherwise create a comment.
Ruby, 58 23 27 bytes AND proof of impossibility at bottom
This snippet is valid in any Ruby version prior to Ruby 2.3 (when heredocs were added):
=end
)}end/;[}'"\//;[}#{]}
Old version (invalid):
=end
)}end/;kill(Process.pid,-9)'"\//;kill Process.pid,-9
This cannot be used anywhere in a Ruby program except after __END__.
Proof of impossibility
This solution is impossible in any Ruby version after and including Ruby 2.3.
Any Ruby solution can be inserted into this snippet and function as a valid program:
<<'string'
# Insert code here
string
With this particular snippet, you can add (note leading newline)
string
to your solution in order to invalidate the program. However, changing the "name" of the heredoc will again invalidate your solution. Heredocs can have infinite placeholders, meaning a solution accounting for all of them would be infinitely long. Thus an answer in Ruby 2.3+ is impossible.
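The argument can be sketched as a small generator: for any finite snippet there is always a heredoc name that does not occur as one of its lines. The sketch below only builds the wrapper text and does not run Ruby; wrap_in_heredoc and the string, string1, ... naming scheme are made up for illustration.

```python
def wrap_in_heredoc(snippet: str) -> str:
    """Wrap `snippet` in a single-quoted heredoc whose terminator it cannot contain."""
    lines = set(snippet.split("\n"))
    # Try string, string1, string2, ... until one is not a line of the snippet.
    name = "string"
    counter = 0
    while name in lines:
        counter += 1
        name = f"string{counter}"
    return f"<<'{name}'\n{snippet}\n{name}\n"

# A snippet that tries to close the default heredoc name early.
print(wrap_in_heredoc("\nstring\n"))
```

A finite snippet has finitely many lines, so the loop always terminates, which is the core of the impossibility claim.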
Thanks to histocrat for pointing this out.
AWK, 4 bytes
/
Since AWK doesn't have a way to write multi-line comments, we need 2 newlines before and 1 after / to prevent it from being commented out or turned into a regex, e.g. by adding 1/. The most common error message is: unexpected newline or end of string.
Go, 6 bytes
*/```
The grave accent (`) marks a raw string literal, inside which all characters except `, including newlines and backslashes, are interpreted literally as part of the string. Three `'s in a row are the core: adjacent string literals are invalid and ` always closes a ` string, so there's no way to make sense of them. I had to use 3 more bytes for anti-circumvention, a newline so we can't be inside a single-line comment or a normal quoted string, and a */ so we can't be inside a multi-line comment.
TI-Basic (83+/84+/SE, 24500 bytes)
A
(24500 times)
TI(-83+/84+/SE)-Basic does syntax checking only on statements that it reaches, so even 5000 End statements in a row can be skipped with a Return. This, in contrast, cannot fit into the RAM of a TI-83+/84+/SE, so no program can contain this string. Being a bit conservative with the character count here.
The original TI-83 has 27000 bytes of RAM, so you'll need 27500 As in that case.
TI-Basic (89/Ti/92+/V200, 3 bytes)
"
Newline, quote, newline. The newline closes any comments (and disallows embedding the illegal character in a string, since AFAIK multiline string constants are not allowed), the other newline disallows closing the string, and the quote gives a syntax error.
You can get to 2 bytes with
±
without the newline, but I'm not sure whether this counts because ± is valid only in string constants.
INTERCAL, 12 bytes
DOTRYAGAINDO
INTERCAL's approach to syntax errors is a bit special. Essentially, an invalid statement won't actually error unless the program tries to execute it. In fact, the idiomatic syntax for comments is to start them with PLEASE NOTE, which really just starts a statement, declares that it isn't to be executed, and then begins it with the letter E. If your code has DODO in the middle of it, you could prepend DOABSTAINFROM(1)(1) and tack any valid statement onto the end and you'll be fine, if it's DODODO you can just bend execution around it as (1)DON'TDODODOCOMEFROM(1). Even though INTERCAL lacks string literal syntax for escaping them, there's no way to use syntax errors to create an illegal string, even exhausting every possible line number with (1)DO(2)DO...(65535)DODODO, since it seems that it's plenty possible to have duplicate line numbers with COME FROM working with any of them.
To make an illegal string, we actually need to use a perfectly valid statement: TRY AGAIN. Even if it doesn't get executed, it strictly must be the last statement in a program if it's in the program at all. 12 bytes is, to my knowledge, the shortest an illegal string can get using TRY AGAIN, because it needs to guarantee that there is a statement after it (executed or not) so DOTRYAGAIN is just normal code, and it needs to make sure that the entire statement is indeed TRY AGAIN, so TRYAGAINDO doesn't work because it can easily be turned into an ignored, normal syntax error: DON'TRYAGAINDOGIVEUP, or PLEASE DO NOT TRY TO USE TRYAGAINDO NOT THAT IT WOULD WORK. No matter what you put on either side of DOTRYAGAINDO, you'll error, with either ICL993I I GAVE UP LONG AGO, ICL079I PROGRAMMER IS INSUFFICIENTLY POLITE, or ICL099I PROGRAMMER IS OVERLY POLITE.
Runic Enchantments, 3 bytes
Ṷ
One of many possible variations.
Runic utilizes Unicode combining characters in an "M modifies the behavior of C" fashion (where C is a command). As such, no two modifiers are allowed to modify the same command and the parser will throw an error if such an occurrence is found.
Similarly, certain commands that redirect the IP cannot be modified in any way, due to the existence of direction modifying modifier characters (and both in the same cell makes no sense).
There is no way to escape or literal-ize the string to make it valid. Tio link contains a ; in order to bypass the higher-priority "no terminator" error.
Ink, 5 bytes
*/{}
Leading newline ends single-line comments.
*/ ends multi-line comments. Thanks to the leading newline, you can't put a / in front of it to make it the start of a comment rather than the end of one.
{ and } enclose things meant to be parsed, rather than simply printed. If there's nothing between them, the compiler gets sad because it Expected some kind of logic, conditional or sequence within braces: { ... } but saw '}'. This happens even inside string literals, so there's no need to check if we're inside one of those.
PCRE Regex, 6 7 bytes
\E
)](+
Any string not containing \E would be legal inside a \Q...\E literal sequence. By starting this one with \E, we break out of such a sequence if we were in one. And if we weren't in one, but are preceded by a \, then it will be treated as a literal \E, and we'll still be guaranteed not to be inside a \Q...\E.
(+ is not part of any legal group structure, and will generate a compile-time error ("Incomplete group structure" and/or "The preceding token is not quantifiable" / "quantifier does not follow a repeatable item") anywhere other than:
1. Within a \Q...\E. We've handled this. We can't be inside one thanks to the \E.
2. Immediately after a \. But since we have it immediately after a \E, it can't be immediately after a \. Note that \E alone is valid, even if there was no \Q before it.
3. Inside a #... style comment. This can only happen in free-spacing mode, but this can be turned on by (?x) anywhere in a regex, so we need to handle it.
4. Inside a (?#...) style comment.
5. Inside a character class, e.g. [(+] or [\Q\E(+], or even [\Q\E](+], which will be treated as [](+], which, since an empty character class is not part of PCRE syntax (except in PCRE2 with the PCRE2_ALLOW_EMPTY_CLASS option enabled), is treated as a character class consisting of ](+ (the beginning of a character class is the only place where ] does not need to be escaped).
Because of #3, we need a newline, to break out of any #... comment we may be in.
Because of #4, we need a ), to break out of any (?#...) comment we may be in.
And it is because of #5 that we need to put ] in front of the (+. This closes any character class we might have been in. If we hadn't already put a newline and/or a ) to close a potential comment, we'd need this to be ]], because a character class can't be empty and ] is allowed to be the first character in a class without being escaped. In any case, thanks to having a character before our ], it even works if a range was started, e.g. [!-\E)](+.
Edit: Silly me, didn't protect it from being inside comments. Fixed.
ECMAScript Regex, 4 bytes
]](+
This is quite a bit easier than PCRE. There's no \Q...\E, no free-spacing mode, and no comments. But if we used just ](+ we could still be inside a character class and have our ] escaped, as [\](+] which would be treated as a character class of ](+. So we still need the double ] to make sure we exit any character class we may have been in, which works even if a range was started, e.g. [!-]](+.
(+ is illegal in any context other than a character class, and will give an error message such as "Nothing to repeat" or "Incomplete group structure" / "The preceding token is not quantifiable".
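As a rough illustration, the same wrappers can be tried with Python's re module. This is only a stand-in: Python's regex flavor is not ECMAScript (for example, it handles an empty [] differently), so the sketch shows the shape of the argument rather than the behavior of a JavaScript engine. The contexts list is an arbitrary selection.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

SINGLE = "](+"   # the shorter candidate that can be defeated
DOUBLE = "]](+"  # the 4-byte answer

# Escaping the lone ] inside a character class hides the shorter candidate.
print(compiles("[\\" + SINGLE + "]"))

# (prefix, suffix) pairs tried around the 4-byte answer.
contexts = [("", ""), ("[\\", "]"), ("[!-", ""), ("", "a)")]
for prefix, suffix in contexts:
    print(repr(prefix), repr(suffix), compiles(prefix + DOUBLE + suffix))
```

In each of these contexts the second ] ends up outside any character class, leaving (+ to trigger a "nothing to repeat" style error.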
Powershell, 10 8 12 14 13 14 16 bytes
-2 byte thanks to Mazzy finding a better way to break it
+4 -1 bytes thanks to IsItGreyOrGray
$#>
'@';
"@";
@=
I hope this works. ' and " to guard against quotes, #> to break the block-comment, new lines to stop the single-line comment, both '@ and "@ to catch another style of strings, and then starts an improper array to throw a syntax error.
The logic being: they can't use either set of quotes to get in; they can't block-comment it out; if @" is used, it'll create a here-string which can't have a token afterwards; and if they leave it alone, it'll try to make a broken array. This statement wants to live so hard, I keep finding even more holes in the armor.
Rockstar, 4 5 bytes
Crossed out 4 is still 4 :(
)
"""
Rockstar is a very... wordy language.
While " can be used to define a string, such as Put "Hello" into myVar, to my knowledge there is no way for 3 quotes to appear outside of a comment, and the close paren ensures that won't happen either (Comments in Rockstar are enclosed in parentheses, like this).
Rockstar also has a poetic literal syntax, in which punctuation is ignored, so the newline makes sure that the 3 quotes are the start of a line of code, which should always be invalid.
Husk1, 3 bytes
◊
Explanation
The newlines force ◊ to be parsed as a supposed built-in, however since it's not (yet) implemented the parsing fails with unexpected "\9674" or an error because of empty lines.
Note: Initially I tried to force an inference failure, but the type-checking is done lazily and one can easily "un-break" programs by adding a valid main function.
1: The code might work at some point in the future. So to be precise, any version of Husk as of before the date of this post (i.e. at least up to commit 0806b9d).
Scratch (scratchblocks2), 3 bytes
There's no such thing as an error in scratchblocks2 - just red-colored blocks - but this can't be expressed in actual Scratch, so I think it's OK.
<?
Leading newline to avoid this just being commented or ::ed out.
Then a predicate block with its label starting with ? - there's no such block.
JavaScript (Node.js), 9 8 bytes
`*/
\u`~
I think this should be illegal enough.
Previous JS attempts in other answers
;*/\u) By @Cows quack
As an ES5 answer this should be valid, but in ES6 wrapping the code with a pair of backticks wrecks this. As a result valid ES6 answers must involve backticks.
` `*/}'"`\u! By @iovoid
This is an improved version involving backticks. However, a single / after the code breaks this (it becomes a template literal being multiplied by a regex, useless but syntactically valid). @Neil suggested changing ! to ). This should theoretically work because adding / at the end no longer works (due to a malformed regex).
Explanation
`*/
\u`~
//`*/
\u`~
and
/*`*/
\u`~
Blocks comments by introducing illegal escape sequences
``*/
\u`~
Blocks initial backtick by introducing non-terminated RegExp literal
console.log`*/
\u`~
Blocks tagged template literals by introducing an expected operator between two backticks
Literate Haskell, 15 bytes
Repairing a deleted attempt by nimi.
\end{code}
5
>
nimi's original attempt is the last two lines, based on Literate Haskell not allowing > style literate code to be on a neighboring line to a literate comment line (5 here). It failed because it can be embedded in a comment in the alternate ("LaTeX") literate coding style:
\begin{code}
{-
5
>
-}
\end{code}
However, the \begin{code} style of Literate Haskell does not nest, neither in itself nor in {- -} multiline comments, so by putting a line with \end{code} just before the line with the 5, that workaround fails, and I don't see a different one.
FRACTRAN, 2 bytes
()
Since FRACTRAN doesn't have any way of including comments or literals (AFAICT), this will always cause an error when inserted into any valid program: every part of a valid program must be a valid fraction, and this string can never be part of a valid fraction.
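As a rough illustration of that argument, the sketch below models a FRACTRAN source as a whitespace- or comma-separated list of fractions. The exact surface syntax differs between interpreters, so the separator rule and the name is_fraction_list are assumptions made for this example only.

```python
import re

# One fraction: decimal numerator, slash, decimal denominator.
FRACTION = re.compile(r"[0-9]+/[0-9]+")


def is_fraction_list(source):
    """Return True if every token in the source is a plain fraction."""
    tokens = source.replace(",", " ").split()
    return bool(tokens) and all(FRACTION.fullmatch(tok) for tok in tokens)


valid_program = "17/91 78/85 19/51"

# Insert "()" at every position of a valid program; none of the results parse.
broken_programs = [
    valid_program[:i] + "()" + valid_program[i:]
    for i in range(len(valid_program) + 1)
]
any_survivor = any(is_fraction_list(p) for p in broken_programs)
```

Under this model no insertion point rescues the string, because neither parenthesis can appear inside a token that matches the fraction pattern.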
AutoHotkey, 5 bytes
` is the escape character. You can only escape a " when assigning it to a variable.
\n*/ prevents it from being commented out or assigned to a variable.
*/`"
SmileBASIC, 2 bytes
!
Nothing continues past the end of a line, so all you need is a line break followed by something which can't be the start of a statement. ! is the logical not operator, but you aren't allowed to ignore the result of an expression, so even something like !10 would be invalid (while X=!10 works, of course)
Similar things will work in any language where everything ends at the end of a line, as long as it parses the code before executing it.
There are a lot of alternative characters that could be used here, so I think it would be more interesting to list the ones that COULD be valid.
@ is the start of a label, for example, @DATA; ( could be part of an expression like (X)=1 which is allowed for some reason; any letter or _ could be a variable name X=1, function call LOCATE 10,2, or keyword WHILE 1; ' is a comment; and ? is short for PRINT.
Ly, 6 bytes
\""{)
(note the leading newline)
The newline prevents line comments, Ly doesn't have block comments, the \"" ensures that all open string literals will close, and the unmatched brackets raise the error.
Gaia, 3 bytes
#“
Each line in Gaia is a separate function, so the newline ensures that the code starts at the beginning of a function. Even putting a newline in a string literal will start a new function, since Gaia allows omitting closing quotes. In addition, all functions are parsed before execution, so adding additional functions below won't help.
The # is a meta, which has to directly follow an operator. At the start of the function, there is no operator, so it's a syntax error.
The “ is an opening quote for string literals. It's there because Gaia also allows omitting the opening quote of strings at the start of a function. If this opening quote wasn't here, you could write #” which is entirely legal.
Brain-Hack (a variation of Brain-Flak), 3 2 bytes
Thanks to Wheat Wizard for pointing out that Brain-Hack doesn't support comments, saving me a byte.
(}
JavaScript, 7 bytes
;*/\u)
Note the leading newline.
- \u) is an invalid Unicode escape sequence, and this is why this string is invalid
- Adding a // at the beginning will still not work because of the leading newline, leaving the second line uncommented
- Adding a /* will not comment out the string completely because of the closing */ that completes it, leaving the \u) exposed
- As stated by @tsh, the bottom line can be turned into a regex by having a / after the string, so by having the ) in front of the \u, we can ensure that the regex literal will always be invalid
- As stated by @asgallant, one could do 1||1(string)/ to avoid having to evaluate the regex. The semicolon at the beginning of the second line stops that from happening by terminating the expression 1||1 before it hits the second line, thus forcing a SyntaxError with the ;*.
Try it!
clicky.onclick=a=>{console.clear();console.log(eval(before.value+"\n;*/\\u)"+after.value));}
textarea {
font-family: monospace;
}
<button onclick="console.clear();">Clear console</button>
<br>
<textarea id=before placeholder="before string"></textarea>
<pre><code>
*/\u)</code></pre>
<textarea id=after placeholder="after string"></textarea>
<br>
<button id=clicky>Evaluate</button>
Ada - 2 bytes
I think this should work:
_
That's newline-underscore. Newline terminates comments and isn't allowed in a string. An underscore cannot follow whitespace; it used to be allowed only after letters and numbers, but the introduction of Unicode made things complicated.
COBOL (GNU), 8 bytes
THEGAME
First, a linefeed to prevent you from putting my word in a commented line.
Then, historically, COBOL programs were printed on coding sheets, and the compiler relies heavily on 80-character limited lines. There are no multiline comments, and the first 6 characters of each line are comments (often used as editable line numbers), so you can put almost anything there, AFAIK. I chose THEGAM at the beginning of the next line.
Then, the 7th symbol in any line only accepts a very restricted list of characters: space (no effect), asterisk (comments the rest of the line), hyphen, slash. There may be others, but certainly not E.
The error given by GnuCobol, for instance, is:
error: invalid indicator 'E' at column 7
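The column rule can be sketched as a toy check. This only models the description above: the indicator set contains just the four characters named in this answer (real compilers accept a few more), and the function name indicator_error is invented for the example.

```python
# Indicator characters named in the answer; a real compiler knows a few more.
KNOWN_INDICATORS = {" ", "*", "-", "/"}


def indicator_error(line):
    """Return an error message if column 7 holds an unknown indicator, else None."""
    if len(line) < 7:
        return None  # the line is too short to have a column 7
    indicator = line[6]  # columns 1-6 are the sequence area; column 7 is index 6
    if indicator in KNOWN_INDICATORS:
        return None
    return f"invalid indicator '{indicator}' at column 7"


message = indicator_error("THEGAME")
```

Because the leading linefeed forces THEGAME to start in column 1, whatever precedes it cannot shift the E out of column 7.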
Also, you just lost the game.
APL and MATL and Fortran, 3 bytes
'
Newline, Quote, Newline always throws an error since block comments do not exist:
- APL: unbalanced quotes
- MATL: string literal not closed
- Fortran: Invalid character in name
Commodore 64 Basic, 2 bytes
B
(that's a newline followed by the letter "B").
Any line in a Commodore 64 program must begin with either a line number or a BASIC keyword, and stored programs only permit line numbers. There are no keywords beginning with "B" (or "H", "J", "K", "Q", "X", "Y", or "Z").
VBA, 2 Bytes
A linefeed followed by an underscore. The _ functions as the line continuation character in VBA, and there is nothing in the line directly to the left of or above the line continuation. Coupled with VBA's lack of multiline comments, this means that it will always throw the compile-time error Compile Error: Invalid character.
_
Fortran, 14 bytes
end program
e
No multiline comments or preprocessor directives in Fortran.
S.I.L.O.S, 4 bytes
Silos is competitive \o/
x+
S.I.L.O.S runs on a two-pass interpreter / compiler. Before execution a "compiler" attempts to simplify the source into an array describing the source. Each line is treated separately. x+a is an assignment operator that will add a to the value of x and store it into x. However, with the operand missing, the "compiler" will break. Therefore, we take this string and add a new line before and after, ensuring it's on its own line and breaks the compiler.
C#, 16 bytes
*/"
#endif<#@#>
Works because:
- // comment won't work because of the new line
- /* comment won't work because of the */
- You can't have constants in the code alone
- Adding #if false to the start won't work because of the #endif
- The " closes any string literal
- The <#@#> is a nameless directive so fails for T4 templates.
- The new line tricks it so having / at the start won't trick the */
Each variation fails with a compilation error.
C (clang), 16 bytes
*/
#else
#else
*/ closes any /* comment, and the leading space makes sure we didn’t just start one. The newline closes any // comment and breaks any string literal. Then we cause an #else without #if or #else after #else error (regardless of how many #if 0s we might be inside).
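The nesting argument can be illustrated with a toy tracker for conditional directives. It is only a model of the bookkeeping a preprocessor has to do, not clang's implementation, and the name check_conditionals and the error strings are assumptions for this sketch.

```python
def check_conditionals(directives):
    """Return an error string for malformed #if/#else/#endif nesting, else None."""
    stack = []  # one flag per open #if: True once its #else has been seen
    for directive in directives:
        if directive == "#if":
            stack.append(False)
        elif directive == "#else":
            if not stack:
                return "#else without #if"
            if stack[-1]:
                return "#else after #else"
            stack[-1] = True
        elif directive == "#endif":
            if not stack:
                return "#endif without #if"
            stack.pop()
    return None


# Two consecutive #else lines fail at every nesting depth, including depth 0.
errors_by_depth = [
    check_conditionals(["#if"] * depth + ["#else", "#else"])
    for depth in range(5)
]
```

At depth 0 the first #else is already an error; at any deeper level the first #else is absorbed by the innermost #if and the second one trips the check.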
Java, 4 bytes
;\u;
This is an invalid Unicode escape sequence and will cause an error in the compiler.
error: illegal unicode escape
Free Pascal, 18 bytes
*)}{$else}{$else}
First close all possible comments, then handle conditional compile.
Please comment here if I forgot something.
Pyth, 6 bytes
¡¡$¡"¡
¡ is an unimplemented character, meaning that if the Pyth parser ever evaluates it, it will error out with a PythParseError. The code ensures this will happen on one of the ¡s.
There are three ways a byte can be present in a Pyth program, and not be parsed: In a string literal (" or .", which are parsed equivalently), in a Python literal ($) and immediately after a \.
This code prevents \ from making it evaluate without error, because that only affects the immediately following byte, and the second ¡ errors.
$ embeds the code within the $s into the compiled Python code directly. I make no assumptions about what might happen there.
If the program reaches this code in a $ context, it will end at the $, and the ¡ just after it will make the parser error. Pyth's Python literals always end at the next $, regardless of what the Python code might be doing.
If the program starts in a " context, the " will make the string end, and the final ¡ will make the parser error.
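The case analysis above can be checked with a toy scanner that tracks only the three "not parsed" contexts named in this answer. It is a deliberately simplified model, not Pyth's parser (for instance, it ignores escapes inside strings), and the state names and the function hits_unimplemented are made up for the sketch.

```python
def hits_unimplemented(payload, state):
    """Return True if some ¡ in payload is reached while in normal code context.

    state is the context the surrounding program leaves us in:
    "code", "escaped" (right after a backslash), "string" or "python".
    """
    for ch in payload:
        if state == "escaped":
            state = "code"  # a backslash only protects the single next byte
        elif state == "string":
            if ch == '"':
                state = "code"
        elif state == "python":
            if ch == "$":
                state = "code"
        else:  # normal code context
            if ch == "¡":
                return True
            if ch == "\\":
                state = "escaped"
            elif ch == '"':
                state = "string"
            elif ch == "$":
                state = "python"
    return False


PAYLOAD = '¡¡$¡"¡'
results = {
    start: hits_unimplemented(PAYLOAD, start)
    for start in ("code", "escaped", "python", "string")
}
```

In this model every starting context ends with the scanner meeting a ¡ in code context, which matches the four cases walked through above.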
Changeling, 2 bytes
That's two linefeeds. Valid Changeling must always form a perfect square of printable ASCII characters, so it cannot contain two linefeeds in a row.
The error is always a parser error and always the same:
This shape is unpleasant.
accompanied by exit code 1.
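A minimal sketch of the shape rule, assuming it means "N lines of exactly N printable ASCII characters each" and nothing more. That reading is taken only from the sentence above; the real interpreter may treat trailing newlines or other details differently, and is_perfect_square is a name invented here.

```python
def is_perfect_square(source):
    """Return True if source is N lines of exactly N printable ASCII characters."""
    lines = source.split("\n")
    side = len(lines)
    return all(
        len(line) == side and all(" " <= ch <= "~" for ch in line)
        for line in lines
    )


square = "abc\ndef\nghi"

# Insert two linefeeds at every position of a valid square.
broken = [square[:i] + "\n\n" + square[i:] for i in range(len(square) + 1)]
any_survivor = any(is_perfect_square(s) for s in broken)
```

Two adjacent linefeeds always create an empty line (or an empty leading or trailing line), and an empty line can never have the same length as the number of lines.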