| Bytes | Lang | Time | Author |
|---|---|---|---|
| 26 | Haskell + hgl | 240724T135452Z | Wheat Wizard |
| 16 | sed 4.2.2 | 240725T052036Z | Digital Trauma |
| 43 | JavaScript (ES6) | 240724T081140Z | Arnauld |
| 82 | Python 3.8 (pre-release) | 240724T095658Z | Jitse |
| 30 | Charcoal | 240725T083936Z | Neil |
| 105 | Setanta | 240725T071331Z | bb94 |
| 29 | Wolfram Language (Mathematica) | 240724T225458Z | att |
| 8 | Nekomata | 240725T021308Z | alephalpha |
| 9 | QuadR ≡ | 240724T085533Z | Adám |
| 14 | Brachylog | 240724T223135Z | Unrelated String |
| 18 | Perl 5 + pl | 240724T205349Z | Dom Hastings |
| 15 | Brachylog | 240724T203657Z | DLosc |
| 11 | Retina 0.8.2 | 240724T173832Z | Neil |
| 48 | R | 240724T103921Z | pajonk |
| 10 | Japt | 240724T093257Z | Shaggy |
| 8 | 05AB1E (legacy) | 240724T083148Z | Kevin Cruijssen |
Haskell + hgl, 26 bytes
yyc$sk$h_>~l2 aT(mY<xys)ʃ
This uses the adapted rule from Nitrodon, which has not been proven correct, but at least has not been proven incorrect.
Explanation
yyc: Repeat a function until reaching a fixed point.
sk: Replace all matches of a parser with their results.
h_: Parse some string.
>~: Feed the result of that parse into a function to create a second parser.
mY<xys: Parse some number of characters from the input.
aT: Ignore the result of that and give the result of ...
ʃ: Parse the input string and return it.
This finds occurrences of xyx where y is made up of a subset of the characters of x, and replaces them with x until a fixed point is reached. y can potentially be empty here.
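As a concreteness check, the rule can be sketched in Python (my own illustration, not the hgl code; `reduce_adapted` and `collapse_once` are names I've made up): scan for an occurrence of x + y + x with the characters of y a subset of those of x, collapse it to a single x, and repeat to a fixed point.

```python
def collapse_once(s):
    """Find one occurrence of x + y + x, where the characters of y form
    a subset of those of x (y possibly empty), and replace it by a
    single x.  Return None if no such occurrence exists."""
    n = len(s)
    for i in range(n):                              # start of the first x
        for lx in range(1, (n - i) // 2 + 1):       # length of x
            x = s[i:i + lx]
            for ly in range(n - i - 2 * lx + 1):    # length of y
                if (set(s[i + lx:i + lx + ly]) <= set(x)
                        and s[i + lx + ly:i + 2 * lx + ly] == x):
                    return s[:i] + x + s[i + 2 * lx + ly:]
    return None

def reduce_adapted(s):
    # Repeat the collapse until no occurrence remains (a fixed point).
    while (t := collapse_once(s)) is not None:
        s = t
    return s
```

For example, "abaab" collapses with x = "ab", y = "a" to "ab", while "abcab" is left alone because 'c' does not occur in "ab".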
Reflection
This is ok. I'm not too happy with it but I initially expected much worse. This really has more of a glue problem than anything else.
- I could add `l2 aT` and `l2 aK`; they will probably be used again sometime.
- I have `(?*>)` and `(<*?)` for `aT<py` and `aK^.py`, and `(+*>)` and `(<*+)` for `aT<so` and `aK^.so`. I could use `(**>) = aT<my` and `(<**) = aK^.py`.
The above are minor fixes, but I think the thing that would really fix the glue problem is having a version of (>~) which just returns the result from the left-hand side instead of the right-hand side. So:
(<~) = f' $ fb < ap pM
In which case this could look like:
yyc$sk$h_<~(mY<xys)>~ʃ
Although this wouldn't make this answer shorter than implementing either of the above, it does seem like the solution lightest on glue, and an operation with large reuse potential.
Old versions
Here are some old versions which use the incorrect algorithm given in the question and currently used by all the other answers:
Parser, ~~21~~ 13 bytes
yyc$sk$h_>~ʃ
Explanation
yyc: Repeat until reaching a fixed point.
sk: Apply a parser as many times as possible to substrings.
h_>~ʃ: Parse some string twice. This returns the parsed string only once.
No parser, 21 bytes
yyc$mBl<(he~<<gr)<<pt
Explanation
yyc: Repeat until reaching a fixed point.
pt: Get all partitions.
gr: Group each partition into groups of contiguous equal elements.
he: Get the first element of each group.
mBl: Get the smallest result.
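The same pipeline can be sketched in Python (my illustration, with my own helper names; I'm assuming "smallest" means shortest, as in the 05AB1E answer below): enumerate partitions, collapse runs of equal adjacent pieces, and keep the shortest join.

```python
from itertools import groupby

def partitions(s):
    """All ways to split s into contiguous nonempty pieces."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        for rest in partitions(s[i:]):
            yield [s[:i]] + rest

def step(s):
    # For each partition, keep one copy per run of equal adjacent
    # pieces (gr then he), join, and take the smallest result (mBl).
    return min((''.join(k for k, _ in groupby(p)) for p in partitions(s)),
               key=len)

def reduce_nonparser(s):
    # yyc: iterate the step until a fixed point.
    while (t := step(s)) != s:
        s = t
    return s
```

Note the exponential cost: a string of length n has 2^(n-1) partitions, which matches the reflection below about this approach being inefficient.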
Reflection
Although I only have two versions of this answer I am splitting the reflection into three based on three potential solutions to this challenge.
Parser version
I am happy with the parser version. I was annoyed by the operator precedence between (~<) and (#|)/(++) in the original version of the answer, but I'm not sure it's actually wrong in general, and now it's no longer a problem.
In the interest of coming up with something to improve here:
- Maybe `yyc<sk` could have a builtin? I think this idiom of "apply a parser repeatedly to substrings until reaching a fixed point" could come up again. I just really kind of don't like making more parser consumers.
- Weird as it is, I think `fb ʃ` could have a builtin. This isn't the first time I've used it; you can also use it for things like palindrome checking. Although I don't think it's useful enough to justify a 2-byte name, and with a 3-byte name it wouldn't even save anything here.
Regex version
I don't think I could reasonably do this with regex right now, since I haven't implemented back-referencing yet. This is just yet another push to do that.
Non-parser version
I am rather frustrated with trying to build a non-parser version here. There is a lot to improve on this front.
- A contiguous nub function would be useful.
- I would like an "unpairs" function that basically undoes what `pA` does.
- `sps` finds all the splits into two pieces, but here I'd like all partitions into 3 pieces. It'd be good to have a builtin which takes a number, \$n\$, and finds all the partitions into \$n\$ pieces, with presets for small numbers like 3. `pST eL3` almost does what I want, but it returns a list, which is a bit annoying. It's also inefficient and 7 bytes long.
Python 3.8 (pre-release), 82 bytes
lambda s:(n:=len(s))==[s:=s.replace((x:=s[i%n:i%~n])+x,x)for i in range(n**3)]or s
-14 bytes thanks to xnor
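For readers decoding the golfed lambda: it repeatedly replaces a doubled slice x+x by a single x, running enough iterations to guarantee a fixed point. A rough ungolfed equivalent (my own helper names; the lambda's particular slice-scanning order is not reproduced here):

```python
def step(s):
    # Find some substring x whose doubling x+x occurs in s, and
    # replace the doubled occurrence(s) by a single x.
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            x = s[i:j]
            if x + x in s:
                return s.replace(x + x, x)
    return s

def reduce_doubles(s):
    # Iterate until the string stops changing.
    while (t := step(s)) != s:
        s = t
    return s
```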
Charcoal, 30 bytes
W⌈ΦEθ⌈ΦEλ✂θμλ¹№θײμκ≔⪫⪪θײιιθθ
Try it online! Link is to verbose version of code. Explanation:
W⌈ΦEθ⌈ΦEλ✂θμλ¹№θײμκ
While there are substrings of the input that appear doubled in the input, take the maximum, and...
≔⪫⪪θײιιθ
... deduplicate that substring from the input.
θ
Output the finally reduced string.
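The loop can be mirrored in Python (my sketch; I'm assuming Charcoal's maximum on strings is lexicographic, and `charcoal_like` is a name of my own):

```python
def charcoal_like(s):
    while True:
        # All substrings of s that appear doubled somewhere in s.
        doubled = [s[i:j]
                   for i in range(len(s))
                   for j in range(i + 1, len(s) + 1)
                   if s[i:j] * 2 in s]
        if not doubled:
            return s
        m = max(doubled)              # take the maximum such substring
        s = m.join(s.split(m * 2))    # split on m+m, rejoin with single m
```

The split/join on the last line is the same deduplication trick as the Charcoal ≔⪫⪪θײιιθ step.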
Setanta, 105 bytes
gniomh(s){le t idir(0,fad@s)le j idir(1,fad@s+1)le i idir(0,j){t=cuid@s(i,j)s=athchuir@s(t+t,t)}toradh s}
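As I read the Setanta (gniomh = function, fad = length, cuid = slice, athchuir = replace, toradh = return), this is three nested loops that replace every doubled slice t+t by t, with the outer loop acting as a pass counter. An approximate Python transliteration (my reading; loop bounds here are evaluated once per loop entry, which I believe is close enough for illustration):

```python
def setanta_like(s):
    for _ in range(len(s)):              # outer pass counter
        for j in range(1, len(s) + 1):   # slice end
            for i in range(j):           # slice start
                t = s[i:j]
                if t:                    # skip empty slices as s shrinks
                    s = s.replace(t + t, t)
    return s
```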
Wolfram Language (Mathematica), ~~31~~ 29 bytes
f[a___,b__,b__,c___]=f[a,b,c]
Input is a list of characters. Returns the characters of the reduced string, wrapped in f.
NonCommutativeMultiply (**) is a conveniently Flat undefined operator. Unfortunately, it needs to be Unprotected before we can add a definition. Try it online!
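In Wolfram pattern terms, a___ matches a possibly empty sequence and b__ a nonempty one, so the rule rewrites any split a+b+b+c with a doubled b. That can be imitated on a list of characters in Python (a sketch with a search order of my own; Mathematica's matcher prefers shorter sequences first, which this approximates):

```python
def f(chars):
    # Emulate f[a___, b__, b__, c___] = f[a, b, c]: look for a split
    # a + b + b + c with b nonempty and repeated, rewrite, and recurse
    # until no rule applies.
    n = len(chars)
    for i in range(n):                          # length of a
        for k in range(1, (n - i) // 2 + 1):    # length of b
            if chars[i:i + k] == chars[i + k:i + 2 * k]:
                return f(chars[:i + k] + chars[i + 2 * k:])
    return chars
```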
Nekomata, 8 bytes
ʷ{JYᵗƶ¿j
ʷ{JYᵗƶ¿j
ʷ{ Loop until failure:
J Nondeterministically split the input into parts
e.g. "QUAQUAUAL" may become ["QUA", "QUA", "UA", "L"]
Y Run length encode
e.g. ["QUA", "QUA", "UA", "L"] -> ["QUA", "UA", "L"], [2, 1, 1]
ᵗƶ Check that at least one count is greater than 1
e.g. [2, 1, 1] passes
¿ If the check passes, drop the counts; otherwise fail
e.g. ["QUA", "UA", "L"], [2, 1, 1] -> ["QUA", "UA", "L"]
j Join
e.g. ["QUA", "UA", "L"] -> "QUAUAL"
This may output the same result multiple times. You can add the -1 flag to only output the first result.
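A deterministic Python sketch of the same idea (my illustration; an exhaustive search stands in for Nekomata's nondeterministic choice, so the order in which splits are tried is an assumption):

```python
from itertools import groupby

def splits(s):
    """All ways to split s into contiguous nonempty parts (J)."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        for rest in splits(s[i:]):
            yield [s[:i]] + rest

def step(s):
    # One pass of the loop body J Y ᵗƶ ¿ j: find a split whose
    # run-length encoding has some count > 1, drop the counts, join.
    # Return None on failure.
    for parts in splits(s):
        runs = [(k, len(list(g))) for k, g in groupby(parts)]
        if any(c > 1 for _, c in runs):
            return ''.join(k for k, _ in runs)
    return None

def nekomata_like(s):
    # ʷ{...}: loop until the body fails.
    while (t := step(s)) is not None:
        s = t
    return s
```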
QuadR ≡, 9 bytes
Blatant rip-off of l4m2's golf of Arnauld's JavaScript solution — go upvote!
(.+)\1
\1
Brachylog, ~~15~~ 14 bytes
{c~c₂ᶠo∋~jᵗc}ˡ
I wouldn't be surprised if there's a better way to do this.
...But, I mostly just took Jitse's comment about requiring recursion as a challenge.
{ }ˡ Left reduce by:
c Concatenate the new element to the accumulator,
~c₂ then partition that result into two possibly-empty slices,
ᶠo∋ trying the partitions in lexicographic order.
~jᵗ Un-double the last slice,
c and concatenate the slices back together.
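In Python the left fold might look like this (my sketch with my own names; the order in which ~c₂ᶠo∋ tries partitions is approximated here by preferring the longest un-doublable tail):

```python
from functools import reduce

def absorb(acc, ch):
    # c: concatenate the next character onto the accumulator.
    s = acc + ch
    # ~c₂ / ~jᵗ: try splits prefix + tail; if the tail is some x
    # doubled, collapse the tail to a single x.
    for i in range(len(s) + 1):
        tail = s[i:]
        half = len(tail) // 2
        if tail and len(tail) % 2 == 0 and tail[:half] == tail[half:]:
            return s[:i] + tail[:half]
    return s

def brachylog_like(s):
    # { }ˡ: left reduce over the characters of s.
    return reduce(absorb, s, "")
```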
Brachylog, ~~16~~ 15 bytes
~c₃↺{h&~j}ʰ↻c↰|
Explanation
~c₃↺{h&~j}ʰ↻c↰|
~c₃ "Unconcatenate" the input string into three substrings
↺ Rotate the first element to the end
{ }ʰ Apply this predicate to the new first element:
h& Assert that it is nonempty
~j Assert that it consists of a substring repeated twice,
and return that substring
↻ Rotate the final element back to the beginning
c Concatenate the list back together
↰ Call the main predicate again on the result
| If there was no way to partition the input string that
satisfied the assertions, return the input unchanged
Retina 0.8.2, 11 bytes
+`(.+)\1
$1
Try it online! Link includes test cases. Explanation: Blatant rip-off of @Adám's rip-off of @l4m2's golf of @Arnauld's answer.
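The same program translates almost directly to Python's re module (the +` prefix is Retina's run-until-no-change loop; `collapse` is a name of my own):

```python
import re

def collapse(s):
    # Apply (.+)\1 -> \1 to every non-overlapping match, and repeat
    # whole passes until the string stops changing.
    while True:
        t = re.sub(r'(.+)\1', r'\1', s)
        if t == s:
            return s
        s = t
```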
Japt, 10 bytes
I didn't understand the spec at all, so this is based on the worked example & test cases.
e"(.+)%1"Ï
Try it (includes all test cases)
05AB1E (legacy), 8 bytes
Δ.œ€ÔJéн
Try it online or verify all test cases.
Explanation:
Δ # Loop until the result no longer changes:
.œ # Get all partitions of the current string
# (which will use the implicit input-string in the first iteration)
€ # Map over each partition:
Ô # Connected uniquify all parts in this partition
J # Join each connected uniquified partition back together
é # Sort these strings by length (shortest to longest)
н # Pop and keep the first/shortest one
# (after the loop, the result is output implicitly)
Uses the legacy version of 05AB1E instead of the latest version, since in the latest version the connected uniquify builtin Ô interprets stringified numbers as numbers instead of strings. E.g. with input 11.0+1+1e1, the legacy program correctly connected-uniquifies ["1","1",".0","+1","+1","e1"] to ["1",".0","+1","e1"] → "1.0+1e1", whereas the new version of 05AB1E will connected-uniquify ["1","1.0","+1","+1e1"] to ["1"] → "1", since it interprets them all as the same number (1): try it online.
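For concreteness, the legacy (string-equality) adjacent deduplication of the partition above can be reproduced with itertools.groupby, which also compares adjacent elements by plain equality (my illustration):

```python
from itertools import groupby

# Connected-uniquify one partition of "11.0+1+1e1" by string equality,
# matching the legacy behaviour described above.
parts = ["1", "1", ".0", "+1", "+1", "e1"]
result = ''.join(k for k, _ in groupby(parts))
print(result)   # 1.0+1e1
```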