Bytes	Lang	Time	Link
nan		210907T153358Z	user4180
nan		240409T123151Z	guest430
nan	\L	240528T104808Z	user4180
nan		240420T231633Z	guest430
nan	Okay	210914T140407Z	user4180
nan		231122T185825Z	Philippo
nan	i	210915T104630Z	user4180
nan		210914T135617Z	user4180
nan		210123T104813Z	user4180
nan		210321T025903Z	user1004
nan		210320T092455Z	user4180
nan		150915T190254Z	F. Hauri
nan		180622T105342Z	user4180
nan	Append a newline in one byte	171209T035809Z	Jordan
nan		171109T000109Z	Jordan
nan	I know this is an old thread	170726T064230Z	Philippo
nan	Expanding upon this tip answer	170406T201232Z	seshouma
000	There's no builtin arithmetic	150605T121220Z	Toby Spe
086	If not explicitly banned by the question	150611T160700Z	Digital
nan	In sed	160913T103958Z	seshouma
nan	Let's talk about the t and T commands	160829T172101Z	seshouma
nan	Instead of clearing the pattern space with s/.*//	160829T152424Z	seshouma
nan		150605T112010Z	Toby Spe
nan	As mentioned in man sed GNU	150722T174517Z	Dennis
nan	If you need to use labels then for sure you'll want your label names to be as short as possible. In fact taken to the extreme	150605T174237Z	Digital
nan	The GNU sed documentation describes the s command as "sed's Swiss Army Knife". But if all you want to do is replace all instances of one character with another	150605T173901Z	Digital
nan	Consider using extended regex syntax in GNU sed. The r option costs one byte in scoring	150605T112301Z	Toby Spe

Unary-UCD-Decimal conversions

The programs below are adapted from solutions in anarchy golf. The solutions below work on GNU sed 4.2.2, where empty labels are allowed. It is possible I've missed better solutions on anarchy golf.

Unary to Decimal, `sed -r` at 47 bytes

From solutions by %20 and tails. Works only for integers other than 0. The chosen unary digit (a in the below program) occurs 3 times in the program.

:
s/a/<<123456789a01>/
s/(.)<.*\1(a?.).*>/\2/
t

Try it online! and (bash wrapper for easier testing)

If you have a prefix (< here, occurs 4 times) before your number, you can make 0 show up with just 2 more bytes:

:
s/<|a/<<123456789&01>/
s/(.)<.*\1(a?.).*>/\2/
t

Try it online! (test cases show off how it functions)

Decimal to UCD, `sed` at 39 bytes

From a solution by %20. Works on all integers, including negative ones.

:
s/[1-9]/&;/g
y/123456789/012345678/
t

Try it online!

Decimal to Unary (via UCD), `sed` 55 bytes

From a solution by tails. Works on all integers, including negative ones.

:
s/[1-9]/&;/g
y/123456789/012345678/
s/;0/9;/
t
s/0//g

Try it online! and (bash wrapper)

If you don't care about 0, you can golf a byte off of this:

:
s/\w/&#/g
y/0123456789/!!12345678/
s/#!/9#/
t
s/!//g

Try it online!

use the e flag on s for simplifying things

your output may depend on which shell sed is using. I'll only include things that work on TIO at time of writing.

I know there are lots of decimal to unary and back solutions on here, but they are all unreasonably long. here are some that are much shorter:

decimal to unary:

s/.*/echo {0..&}/e
s/\w//g

this sends echo {0..#} to the shell, which usually expands it into a list from 0 to #, with a space between each number. remove the numbers and you are left with the right number of spaces. works for any positive integer

unary to decimal:

s/.*/wc -L<<<&/e
s/.*/wc -L<<<'&'/e         #the above if you're using spaces as your unary char
s/.*/echo &|wc -L/e        #shells that don't support <<< redirects
s/.*/echo '&'|wc -L/e      # + using spaces

these send all the chars to wc, which with -L returns the length of the longest line (the <<< redirect and echo each add a newline to the end). the unary must make up the entire line, because the full text of the line after the swap is run as a command; if you leave something out, you end up trying to run something like loremipsumwc -L<<<' '

basic math (-E)

s/(.*) (.*)/echo $[\1+\2]/e   #two numbers separated by a space
s/(.*) (.*)/echo $[\1*\2]/e   #there's at least + - * / %

Try it online!

idk, go have fun

s/.*/for i in {1..&};{ echo $[&%$i];}/e

basically if a shell solution is a lot shorter than a sed solution it doesn't need to be that way
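If the shell that sed's e flag hands commands to is not bash (an assumption about your system, not something stated above), $[...] and {0..N} may not be available. A minimal sketch that sticks to POSIX $((...)) arithmetic and the echo|wc -L variant; the function names are made up for illustration, and GNU sed plus GNU wc are assumed:

```bash
# Hypothetical helpers, not part of the answer above.
# product: multiply two space-separated integers by passing
# `echo $((a*b))` to the shell through the e flag.
product() { sed -E 's/(.*) (.*)/echo $((\1*\2))/e'; }

# unary_len: unary to decimal; the whole line becomes the argument of echo.
unary_len() { sed 's/.*/echo &|wc -L/e'; }

echo '6 7' | product
echo 'aaaa' | unary_len
```

The pattern must still cover the entire line, since whatever is left in the pattern space after the substitution is what gets executed.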

`\L`, `\U`, `\E` for fractals in GNU sed

The case switching special sequences help in "toggling" a line. An example is the following challenge to produce a fractal X on anagol, whose shortest sed solution by mitchs et al is reproduced here.

s/^/X\n/
:
s/^.\{,27\}\n/&\L&\U&/mg
//s/[ X]/& &/g
s/x/ X /g
t

Try it online!

If the fractal uses another character to fill it, then at the end you can add a transliterate.

Store more in your hold space

To perform a loop without losing the number you used as your counter, you can use h;H to add your number to the hold space twice, and then decrement only the copy after the newline with x;s/\n./\n/;x. This essentially gives you two hold spaces instead of just one. When the loop finishes you still have your number, and you can even copy it again to use as many times as you need, like x;s/.*/&&/;x (note that this will copy the newline as well).
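Here is a rough sketch of that idea, assuming GNU sed (for z) and a unary input; the function name and the "ab" payload are made up for illustration. It appends "ab" once per unary digit and still has the original number available at the end:

```bash
# Hypothetical demo: the counter copy lives after the newline in hold space.
repeat_keep() {
  sed '
# hold space becomes "N\nN"; clear the pattern space for the output
h;H;z
:l
x
# no character left after the newline: the counter copy is used up
/\n./!b e
# decrement only the copy after the newline, then swap back
s/\n./\n/
x
s/$/ab/
b l
:e
# pattern space is "N\n", hold space is the built-up output
s/\n//;G;s/\n/ /
'
}

echo aaa | repeat_keep
```

For the input aaa this prints the preserved number followed by three copies of ab, separated by a space.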

Okay, here's a weird functioning of (GNU) sed's regex. For example, if the pattern space is only qqq, then the extended regex s/^|q|$/<&>/g gives <q><q><q>. Try it online!

I'm not sure, but I think it is because a character that matches also takes the empty strings surrounding it with it. Because the final q matches, that match also covers the end of the pattern space, so $ by itself doesn't get matched, as it would otherwise overlap with the previous match (and likewise for ^, and I think even for word boundaries and the like).

An example where this is useful is in the following (rather contrived) task:

an input of a should give ab,
and inputs matching /a(c+)/ should give ab\1b, for example, an input of accc should give abcccb

Without the trick I can get 11 bytes (with the -E flag)

s/a|c+/&b/g

but using it gives 10 bytes

s/a|$/&b/g

Try it online!

(Do let me know if there is a better example that uses this behaviour).
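Both observations can be checked from a shell. This is only a sketch assuming GNU sed with -E; the function names are invented for the demo:

```bash
# Hypothetical wrappers around the two substitutions discussed above.
wrap()     { sed -E 's/^|q|$/<&>/g'; }
append_b() { sed -E 's/a|$/&b/g'; }

echo qqq | wrap                 # no empty <> is added at either end
printf 'a\naccc\n' | append_b   # the lone "a" gets a single b
```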

Use back references

Most people know that you can mark some parts of your regex with \(...\) and later refer to them as \1 (or \2 for the second, and so on) in your substitution.

But you hardly ever see back references, using \1 in the regex itself. For extended regular expressions (option -E to sed), this has been removed from the standard, but GNU sed supports it anyhow, so it works at tio.run, for example.

For example, the ERE (.)\1 matches a doubled character, and (.).*\1 matches a character appearing twice in the pattern space. See here for an example of how this makes things easier (and much shorter!).

Complicated tasks like this one (looking for matches in a file), this one, or this one would be almost impossible without this feature.
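A small sketch of the two EREs mentioned above, assuming GNU sed; the function names are made up for illustration:

```bash
# mark_double: bracket the first doubled character on each line.
mark_double() { sed -E 's/(.)\1/<&>/'; }

# has_repeat: print only lines in which some character occurs twice.
has_repeat() { sed -En '/(.).*\1/p'; }

echo hello | mark_double
printf 'abc\nabca\n' | has_repeat
```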

Math with back references

And it's great for teaching sed to count. I know you can solve some of those problems using y, but as soon as you have more than one thing to increment or decrement, this is the way to go. See examples here or here or try the 151-byte decimal add online to examine how it works.

`i`, `a`, and `c`

(The answer focuses on GNU sed because GNU sed's invocation of these commands is slightly shorter than that of POSIX, but otherwise I think the functionality should be the same).

These insert to, append to, and change the pattern space respectively. The lines they add are printed and so are not edited into the pattern space, meaning that the program will not be able to manipulate these lines. But they are shorter than s, if you want to insert some unchanging lines. Compare the following two lines:

s/^/text\n/ # 11 bytes
itext       #  5 bytes

Now I want to focus on c. c text replaces the pattern space with text. This text will be printed immediately, since the c makes sed move on to the next line. This means that the commands following a call to c are effectively ignored. This behaviour of c can be useful in challenges where there are only a few possible outputs, particularly when combined with the conditional /.../.

An example is the 'Hello, World!' challenge, where c is shorter than s:

s/^/Hello, World!/ # 18 bytes
cHello, World!     # 14 bytes

Another example is this challenge (sed answer) to swap the strings Good and Bad. The outputs are restricted to being Good (when the input is Bad) or Bad (when the input is Good).

s gives 21 bytes:

s/Goo/Ba/;t;s/Ba/Goo/

Using c (in combination with /.../) gives 13 instead:

/B/cGood
cBad

If the input matches /B/, i.e. if the input is Bad, Good is printed and the program skips processing this input line. So the program only ever reaches the second line if the input doesn't match B, i.e. if the input is Good. In that case the output is set to Bad.
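To see the control flow in action, here is the 13-byte program wrapped in a shell function (the wrapper name is mine, and GNU sed's one-line form of c is assumed):

```bash
# Hypothetical wrapper around the two-line program above.
swap() {
  sed '/B/cGood
cBad'
}

printf 'Good\nBad\n' | swap
```

Each input line produces exactly one output line, because whichever c fires ends the cycle for that line.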

Combine `s` substitutions

The s command takes many bytes (4 + 1 for the statement separator), so combining them can save bytes.

An example: the following is at 17 bytes

s/\S+ //
s/\S+$//

while combining the two substitutions gives 15 bytes

s/^\S+ |\S+$//g
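As a sanity check, the two versions can be compared side by side. This sketch assumes GNU sed (for \S and -E), and the function names are only for illustration:

```bash
# two_step: the original pair of substitutions.
two_step() { sed -E -e 's/\S+ //' -e 's/\S+$//'; }

# combined: one substitution with alternation and the g flag.
combined() { sed -E 's/^\S+ |\S+$//g'; }

echo 'one two three' | two_step
echo 'one two three' | combined
```

The ^ in the combined version keeps the first alternative from also eating later words once g is in play.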

Make use of sed's line-handling ability

With flexible challenge I/O, it can pay to have input/output separated by newlines instead of any other character by taking advantage of sed's commands for handling lines (like D, N, n, G, H, P, s's m flag) instead of only being limited to s substitutions.

This can also open the opportunity for using D for looping instead of labels and goto, especially in sed versions that don't permit empty labels.

`#n` at first to imply `-n`

The "#" and the remainder of the line are ignored (treated as a comment), with the single exception that if the first two characters in the file are #n, the default output is suppressed; this is the equivalent of specifying -n on the command line.

Source: sed (from SUSv2)

This is useful if you prefer NOT to output something by default.

...But is it really useful? -n adds either 1 or 2 bytes but #n and LF adds 3.
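For completeness, a tiny sketch of what the #n form looks like when the script is passed as a single argument (GNU sed assumed; the function name is made up):

```bash
# The script starts with "#n" followed by a newline, which acts like -n.
first_only() {
  sed '#n
1p'
}

printf 'a\nb\n' | first_only
```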

The `L` command in old GNU sed versions

Used, for example, in the second solution in https://codegolf.stackexchange.com/a/220633/. Older versions of GNU sed like GNU sed 4.2.2 have the L command, which was later removed in newer versions. From the archived docs,

L n

This GNU sed extension fills and joins lines in pattern space to produce output lines of (at most) n characters, like fmt does; if n is omitted, the default as specified on the command line is used. This command is considered a failed experiment and unless there is enough request (which seems unlikely) will be removed in future versions.

Mostly useless step:

y|A-y|B-z|

This will only translate A to B and y to z (... and - to - ;), but nothing else, so

sed -e 'y|A-y|B-z|' <<<'Hello world!'

will just return:

Hello world!

You could ensure this will be useless, for example by using this on lower-case hexadecimal values (containing only 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e or f.)

A little worse:

sed '; ;/s/b;y|A-y|B-z|;s ;s/ //; ; ;' <<<'Hello world'
Hello world

Why did this not suppress the space?

Empty regexes are equivalent to the previously encountered regex

(Thanks to Riley for discovering this from an anagol submission.)

Here is an example where we are tasked with creating 100 @s in an empty buffer.

s/$/@@@@@@@@@@/;s/.*/&&&&&&&&&&/ # 32 bytes
s/.*/@@@@@@@@@@/;s//&&&&&&&&&&/  # 31 bytes

The second solution is 1 byte shorter and uses the fact that empty regexes are filled in with the last encountered regex. Here, for the second substitution, the last regex was .*, so the empty regex here will be filled with .*. This also works with regexes in /conditionals/.

Note that it is the previously encountered regex, so the following would also work.

s/.*/@@@@@@@@@@/;/@*/!s/$/@/;s//&&&&&&&&&&/

The empty regex gets filled with @* instead of $ because s/$/@/ is never reached.
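A quick way to confirm that the shorter form really yields 100 characters, sketched for GNU sed with a made-up function name:

```bash
# hundred: the second substitution reuses .* through the empty regex.
hundred() { sed 's/.*/@@@@@@@@@@/;s//&&&&&&&&&&/'; }

out=$(echo | hundred)
echo "${#out}"
```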

Append a newline in one byte

The G command appends a newline and the contents of the hold space to the pattern space, so if your hold space is empty, instead of this:

s/$/\n/

You can do this:

G

Prepend a newline in three bytes

The H command appends a newline and the contents of the pattern space to the hold space, and x swaps the two, so if your hold space is empty, instead of this:

s/^/\n/

You can do this:

H;x

This will pollute your hold space, so it only works once. For two more bytes, though, you could clear your pattern space before swapping, which is still a savings of two bytes:

H;z;x
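Both newline tricks can be compared against their s equivalents. A sketch assuming GNU sed (z is a GNU extension) and an initially empty hold space; the function names are mine:

```bash
# append_nl: G appends "\n" plus the (empty) hold space.
append_nl()  { sed 'G'; }

# prepend_nl: H;z;x prepends a newline and leaves the hold space empty,
# so it keeps working on every input line.
prepend_nl() { sed 'H;z;x'; }

printf 'a\nb\n' | append_nl | wc -l
printf 'a\nb\n' | prepend_nl
```

With plain H;x the second input line would already pick up the first one from the hold space, which is the pollution mentioned above.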

Read the whole input at once with `-z`

Often you need to operate on the whole input at once instead of one line at a time. The N command is useful for that:

:
$!{N;b}

...but usually you can skip it and use the -z flag instead.

The -z flag makes sed use NUL (\0) as its input line separator instead of \n, so if you know your input won’t contain \0, it will read all of the input at once as a single “line”:

$ echo 'foo
> bar
> baz' | sed -z '1y/ao/eu/'
fuu
ber
bez

Try it online!

I know this is an old thread, but I just found those clumsy decimal to UCD converters, with almost a hundred bytes, some even messing the hold space or requiring special faulty sed versions.

For decimal to UCD I use (68 bytes; former best posted here 87 bytes)

s/$/\n9876543210/
:a
s/\([1-9]\)\(.*\n.*\)\1\(.\)/\3x\2\1\3/
ta
P;d

UCD to decimal is (also 66 bytes; former best posted here 96)

s/$/\n0123456789/
:a
s/\([0-8]\)x\(.*\n.*\)\1\(.\)/\3\2\1\3/
ta
P;d

\n in the replacement is not portable. You can use a different character instead and save two bytes, but you'll need more bytes to remove the appendix instead of P;d; see next remark. Or, if your hold space is empty, do G;s/$/9876543210/ without byte penalty.
If you need further processing, you'll need some more bytes for s/\n.*// instead of P;d.
You could save two bytes each for those buggy old GNU sed versions
No, you can't save those six backslashes as extended regular expressions don't do backreferences

Expanding upon this tip answer, regarding the conversions between decimal and plain unary number formats, I present the following alternative methods, with their advantages and disadvantages.

Decimal to plain unary: 102 + 1(r flag) = 103 bytes. I counted \t as a literal tab, as 1 byte.

h
:
s:\w::2g
y:9876543210:87654321\t :
/ /!s:$:@:
/\s/!t
x;s:-?.::;x
G;s:\s::g
/\w/{s:@:&&&&&&&&&&:g;t}

Try it online!

Advantage: it is 22 bytes shorter and, as a bonus, it works with negative integers as input

Disadvantage: it overwrites the hold space. However, since it's more likely that you'd need to convert the input integer right at the start of the program, this limitation is rarely felt.

Plain unary to decimal: 102 + 1(r flag) = 103 bytes

s:-?:&0:
/@/{:
s:\b9+:0&:
s:.9*@:/&:
h;s:.*/::
y:0123456789:1234567890:
x;s:/.*::
G;s:\n::
s:@::
/@/t}

Try it online!

Advantage: it is 14 bytes shorter. This time both tip versions work for negative integers as input.

Disadvantage: it overwrites the hold space

For a complicated challenge, you'll have to adapt these snippets to work with other information that may exist in the pattern space or hold space, besides the number to convert. The code can be golfed more, if you know you only work with positive numbers or that zero alone is not going to be a valid input / output.

An example of such a challenge answer, where I created and used these snippets, is the Reciprocal of a number (1/x).

There's no built-in arithmetic, but calculations can be done in unary or in unary-coded decimal. The following code converts decimal to UCD, with x as the unit and 0 as the digits separator:

s/[1-9]/0&/g
s/[5-9]/4&/g
y/8/4/
s/9/4&/g
s/4/22/g
s/[37]/2x/g
s/[26]/xx/g
s/[1-9]/x/g

and here's the conversion back to decimal:

s/0x/-x/g
s/xx/2/g
y/x/1/
s/22/4/g
s/44/8/g
s/81/9/g
s/42/6/g
s/21/3/g
s/61/7/g
s/41/5/g
s/-//g

These are both taken from an answer to "Multiply two numbers without using any numbers".

Plain old unary can be converted using this pair of loops from this answer to "{Curly Numbers};", where the unit is ;. I've used v and x to match Roman for 5 and 10; b comes from "bis".

# unary to decimal
:d
/;/{
s/;;;;;/v/g
s/vv/x/g
/[;v]/!s/x\+/&0/
s/;;/b/g
s/bb/4/
s/b;/3/
s/v;/6/
s/vb/7/
s/v3/8/
s/v4/9/
y/;bvx/125;/
td
}

# Decimal to unary
:u
s/\b9/;8/
s/\b8/;7/
s/\b7/;6/
s/\b6/;5/
s/\b5/;4/
s/\b4/;3/
s/\b3/;2/
s/\b2/;1/
s/\b1/;0/
s/\b0//
/[^;]/s/;/&&&&&&&&&&/g
tu

If not explicitly banned by the question, the consensus for this meta question is that numerical input may be in unary. This saves you the 86 bytes of decimal to unary as per this answer.

In sed, the closest thing to a function that you can have is a label. A function is useful because you can execute its code multiple times, thus saving a lot of bytes. In sed however you would need to specify the return label and as such you can't simply call this "function" multiple times throughout your code the way you would do it in other languages.

The workaround I use is to add in one of the two memories a flag, which is used to select the return label. This works best when the function code only needs a single memory space (the other one).

Example showing what I mean: taken from a project of mine to write a small game in sed

# after applying the player's move, I overwrite the pattern space with the flag "P"
s/.*/P/
b check_game_status
:continue_turn_from_player
#code

b calculate_bot_move
:return_bot_move
# here I call the same function 'check_game_status', but with a different flag: "B"
s/.*/B/
b check_game_status
:continue_turn_from_bot
#code (like say 'b update_screen')

:check_game_status   # this needs just the hold space to run
#code
/^P$/b continue_turn_from_player
/^B$/b continue_turn_from_bot

The labels should be golfed of course to just one letter, I used full names for a better explanation.

Let's talk about the t and T commands. Although they are explained in the man page, it's easy to forget the details and introduce bugs accidentally, especially when the code gets complicated.

Man page statement for t:

If a s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label.

Example showing what I mean: Let's say you have a list of numbers and you want to count how many negatives there are. Partial code below:

1{x;s/.*/0/;x}                   # initialize the counter to 0 in hold space
s/-/&/                           # check if number is negative
t increment_counter              # if so, jump to 'increment_counter' code block
b                                # else, do nothing (start a next cycle)

:increment_counter
#function code here

Looks ok, but it's not. If the first number is positive, that code will still think it was negative, because the jump done via t for the first line of input is performed regardless, since there was a successful s substitution when we initialized the counter! Correct is: /-/b increment_counter.

If this seemed easy, you could still be fooled when doing multiple jumps back and forth to simulate functions. In our example the increment_counter block of code for sure would use a lot of s commands. Returning back with b main might cause another check in "main" to fall in the same trap. That is why I usually return from code blocks with s/.*/&/;t label. It's ugly, but useful.
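The pitfall is easy to reproduce. In this sketch (GNU sed assumed; function names and the marker text are invented for the demo), both scripts tag the lines they consider negative:

```bash
# buggy: the s in the initialisation sets the flag that t later tests.
buggy() {
  sed -e '1{x;s/.*/0/;x}' -e 's/-/&/' -e 't neg' -e 'b' \
      -e ':neg' -e 's/$/ <- counted/'
}

# fixed: branch on the address instead of relying on the t flag.
fixed() {
  sed -e '1{x;s/.*/0/;x}' -e '/-/b neg' -e 'b' \
      -e ':neg' -e 's/$/ <- counted/'
}

printf '5\n-3\n' | buggy
printf '5\n-3\n' | fixed
```

Only the first input line is affected in the buggy version, since that is the only cycle in which the counter initialisation runs.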

Instead of clearing the pattern space with s/.*//, use the z command (lowercase) if you go with GNU sed. Besides the lower byte count, it has the advantage that it won't start the next cycle as the command d does, which can be useful in certain situations.

When repeatedly replacing in a loop:

:loop
s/foo/bar/g
tloop

it's usually unnecessary to replace globally, as the loop will eventually replace all occurrences:

# GNU sed
:
s/foo/bar/
t

Note also the GNU extension above: a label can have an empty name, saving more precious bytes. In other implementations, a label cannot be empty, and jumping without a label transfers flow to the end of script (i.e. same as n).

As mentioned in man sed (GNU), you can use any character as a delimiter for regular expressions by using the syntax

\%regexp%

where % is a placeholder for any character.

This is useful for commands like

/^http:\/\//

which are shorter as

\%^http://%

What is mentioned in the GNU sed manual but not in man sed is that you can change the delimiters of s/// and y/// as well.

For example, the command

ss/ssg

removes all slashes from the pattern space.
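Both delimiter tricks in runnable form, as a sketch for GNU sed with made-up function names:

```bash
# strip_slashes: s with "s" itself as the delimiter; regex "/", empty replacement.
strip_slashes() { sed 'ss/ssg'; }

# http_only: address with % as delimiter, so the slashes need no escaping.
http_only() { sed -n '\%^http://%p'; }

echo 'a/b/c' | strip_slashes
printf 'http://x\nftp://y\n' | http_only
```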

If you need to use labels then for sure you'll want your label names to be as short as possible. In fact taken to the extreme, you may even use the empty string as a label name:

:    # define label ""
p    # print pattern space
b    # infinite loop! - branch to label ""

The GNU sed documentation describes the s command as "sed's Swiss Army Knife". But if all you want to do is replace all instances of one character with another, then the y command is what you need:

y/a/b/

is one char shorter than:

s/a/b/g

Consider using extended regex syntax (in GNU sed). The -r option costs one byte in scoring, but using it just once to eliminate the backslashes from a pair of \(...\) has already paid for itself.
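A side-by-side sketch of the same substitution in both syntaxes, assuming GNU sed; the function names are just for illustration. Each group costs two backslashes less in the extended form:

```bash
# bre: basic syntax, groups written as \(...\).
bre() { sed 's/\(a*\)\(b*\)/\2\1/'; }

# ere: extended syntax, groups written as (...).
ere() { sed -E 's/(a*)(b*)/\2\1/'; }

echo aabbb | bre
echo aabbb | ere
```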
