| Bytes | Lang | Time | Link |
|---|---|---|---|
| 389 | C gcc | 250906T163953Z | ErikF |
| 307 | Python 3.8 prerelease | 250903T122703Z | V_R |
| 467 | sed recursive | 250905T111625Z | Jan Blum |
| 110 | Vyxal | 250903T095615Z | emanresu |
| 198 | JavaScript Node.js | 250903T080307Z | Arnauld |
| 114 | 05AB1E | 250903T072838Z | Kevin Cr |
| nan |  | 250904T040117Z | Anonymou |
| 113 | Jelly | 250903T225631Z | Jonathan |
| 208 | Ruby pl | 250903T155627Z | Value In |
| 126 | Charcoal | 250903T210855Z | Neil |
| 380 | Python 3.8 prerelease | 250903T043938Z | Ted |
C (gcc), 389 bytes
*a[]={"0%#%G3$ST'","AB%G4$BQg12&E$F3FT$gAT\"T4g!%DgAC",0,0,"AQ#6A3D%2gBWD5$GQDQ\"B'","0DQgA1ECE\"D\"#C&DQBG#C$DDQ%g\"DQEg$D#&DQ6","EC\"%DQ#5E2#S!F#33A%","\"3g#gT\"$#g'E5U6C\"S'gT'#g'E5U6C\"S\"",0,"0CGg$Q4EDE65ED$S!E3\"F#$BQ",0,0,0,"5DQ&D3$ED","0$g$4EE3&D3&2$E#6gDE$g$E3","AAFQ4\"%&EgBQ\"&&F"};p;int*f(char*s){char*t;for(t=a[*(int*)s/9%16];p+=*t&7,*t;t++)if(*t<96)s[p]="aiou"[*t/16-2];p=s;}
Ungolfed:
*a[]={
"0%#%G3$ST'",
"AB%G4$BQg12&E$F3FT$gAT\"T4g!%DgAC",
0,0,
"AQ#6A3D%2gBWD5$GQDQ\"B'",
"0DQgA1ECE\"D\"#C&DQBG#C$DDQ%g\"DQEg$D#&DQ6",
"EC\"%DQ#5E2#S!F#33A%",
"\"3g#gT\"$#g'E5U6C\"S'gT'#g'E5U6C\"S\"",
0,
"0CGg$Q4EDE65ED$S!E3\"F#$BQ",
0,0,0,
"5DQ&D3$ED",
"0$g$4EE3&D3&2$E#6gDE$g$E3",
"AAFQ4\"%&EgBQ\"&&F"
};
p;
int *f(char *s) {
char *t;
for(t = a[*(int*)s/9%16]; /* Get correction template */
p+=*t&7,*t; /* Advance the correction index to the next position; done on NUL */
t++)
if(*t<96) /* Skip (for deltas > 7) */
s[p]="aiou"[*t/16-2]; /* Replace character with correct vowel */
p=s;
}
I find the correction template from the hash int(x/9)%16, where x is the first four characters of the passed-in string read as a little-endian 32-bit integer (ASCII). Then, for each character in the template, I advance the current position by the delta stored in its bottom 3 bits (0 to 7) and write one of ['a', 'i', 'o', 'u', (skip)] there, selected by the upper bits. The skip entry is required because some of the gaps between corrections are greater than 7. The pointer to the mutated string is returned.
The encoder that I used to make each string: Try it online!
Python 3.8 (pre-release), 307 bytes (previously 332, 340, 354)
-14 thanks to collaboration between Value Ink and myself
Another -8 thanks to Value Ink
Another -25 thanks to stealing Arnauld's len(t)%38%17 trick
lambda t,j="".join:j([j(x)for x in zip(t.split("e"),j(["aieo ux"[int(y)]for y in oct(int('1EBAQH8S8PVSUWQ9357KMUXVS1OPMAGVT2NXF2FOBQFL3AGKYP8QUVBDS2XXPLSC4A44QH5WHPLJBPKZDC31OOHR1PR1J7110PCZZE9VJ78JYUKHMDZT834V15MY95WVH73TI5SLA27Y7HJV3IJ6ZG1WCV7NQROLCSK318KTT3PZQMZ',36))[2:]]).split()[len(t)%38%17]+"\n")])
Try it online!
Explanation
The basic algorithm is to store a list of the vowels for each quote, e.g. eiouaoiaoeo for the first one, and zip() that with the input having been split() at "e". I append \n at the end of each entry in the vowel list to make sure the length of the split input is the same as the length of the vowel string, which is required for zip() to produce the correct outputs. This means all of my outputs have a trailing newline, which is allowed by the rules.
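The split-and-zip step can be sketched on its own as follows. This is an illustrative, ungolfed version; `restore` is a hypothetical helper name, and the garbled input assumes every vowel of the quote was replaced by `e`.

```python
def restore(garbled, vowels):
    parts = garbled.split("e")        # one more part than there are e's
    fillers = vowels + "\n"           # pad so zip() does not drop the last part
    return "".join(part + vowel for part, vowel in zip(parts, fillers))

garbled = "when en deebt, sey netheng end meve en."
print(repr(restore(garbled, "eiouaoiaoeo")))
# prints: 'when in doubt, say nothing and move on.\n'
```

Without the padding character, `zip()` would stop after the eleventh pair and the final `n.` would be lost.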
The list of vowels is stored as a space-separated string (eiouaoiaoeo aieeaeeuaaaeeaeoeiueioaueaeeueaaeeaeoeiueioaua ...), recovered using split(). An extra character x is introduced as filler for the vowel strings we never need to use (see the modular arithmetic logic below). The string is read as an octal number using the mapping {"a":"0","i":"1","e":"2","o":"3"," ":"4","u":"5", "x": "6"}, and that number is encoded in base 36. This mapping was chosen to lead to the smallest possible number without producing leading 0s.
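A round trip through that encoding might look like the sketch below. The `pack` helper is an assumption (the answer only shows the decoding side), and the sample string is a shortened stand-in for the real list.

```python
ALPHABET = "aieo ux"    # position in this string = octal digit
B36 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def pack(vowel_lists):
    """Map each character to an octal digit, then write the number in base 36."""
    octal = "".join(str(ALPHABET.index(c)) for c in vowel_lists)
    n = int(octal, 8)
    out = ""
    while n:
        n, digit = divmod(n, 36)
        out = B36[digit] + out
    return out or "0"

def unpack(packed):
    """Inverse of pack(), using the same steps as the golfed lambda."""
    octal = oct(int(packed, 36))[2:]
    return "".join(ALPHABET[int(d)] for d in octal).split()

sample = "x eiouaoiaoeo"
print(unpack(pack(sample)))   # prints: ['x', 'eiouaoiaoeo']
print(unpack(pack("aie")))    # prints: ['ie']
```

The second call shows why leading 0s matter: `a` maps to digit 0, so a string that starts with `a` loses that character when it passes through an integer.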
The length of each quote mod 38 mod 17 is [1, 2, 11, 8, 6, 9, 0, 12, 4, 13] (Arnauld's finding). We use these numbers to index into the list of vowel strings, so the third text actually has its vowels stored at index 11 since the length of that text mod 38 mod 17 is 11. The indices go up to 13, but there are only 10 quotes, so there are some filler x entries instead of vowel strings in the list.
sed (recursive, so actually Bash), 467 bytes
sed "y/e/_/;s/^_/i/;$(sed 's:[A-Z]:\L&e:
h;s:[aeiou]:_:g;G;y:\n:/:;s:.*:s/&/g:'<<@
qua
afF
paC
toda
aK
Ky
Ny
ay
mov
ev
B
C
G
M
R
S
obl
ab
ad
al
am
ar
it
in
int
hot
hat
had
Man
anot
oN
and
urn
ran
Bin
Hn
thin
thos
tH
ing
cha
Td
Nd
with
easy
ears
rS
irst
eak
ion
tual
car
cau
Ta
good
ou
work
Wr
wa
vas
V
off
til
mil
all
huma
faT
erC
ontra
nsTr
sta
turk
fac
ory
Tr
ibL
limi
kic
duc
dus
suc
Bc
Pc
can
uny
Ls
iga
Gt
M,
up
t if
is
i'
\. i
y i
a
o
@
)"
(Note significant leading and trailing blanks.)
Strategy
- Find recurring patterns in the text (possibly making use of properties of the English language; there should be room for optimization here), and provide a list of these patterns to be applied in order
- Use a first sed pass to transform each of these patterns into a sed substitution command like
s/q__/qua/g
s/_ff_/affe/g
s/p_c_/pace/g
s/t_d_/toda/g
s/_k_/ake/g
s/k_y/key/g
...
- Compress the pattern list by encoding consonant+e (probably the most frequent two-letter groups in English) as an uppercase consonant, e.g. write M instead of me, and use the first sed pass to expand it back
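The strategy above can be sketched in Python as follows. This is only an approximation of the sed script: `expand`, `make_rule` and `restore` are hypothetical names, patterns are matched literally (sed treats them as regular expressions, which matters for entries such as `\. i`), and the whitespace-sensitive entries are not handled.

```python
import re

def expand(entry):
    # Mirrors s:[A-Z]:\L&e: (no g flag, so only the first capital is expanded).
    return re.sub(r"[A-Z]", lambda m: m.group().lower() + "e", entry, count=1)

def make_rule(entry):
    """Turn one list entry into a (pattern, replacement) pair."""
    replacement = expand(entry)
    pattern = re.sub(r"[aeiou]", "_", replacement)
    return pattern, replacement

def restore(garbled, entries):
    text = garbled.replace("e", "_")      # y/e/_/
    text = re.sub(r"^_", "i", text)       # s/^_/i/
    for entry in entries:                 # rules are applied in list order
        pattern, replacement = make_rule(entry)
        text = text.replace(pattern, replacement)
    return text

print(make_rule("afF"))            # prints: ('_ff_', 'affe')
print(restore("pece", ["paC"]))    # prints: pace
```

Because later rules see the output of earlier ones, the order of the list is part of the data, which is why the entries are not sorted alphabetically.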
Vyxal, 110 bytes (previously 115)
β«9LbṄ_₀A§¼K¤j‛P⌈ZaRUbİɖɽ?Ǒ°{›ɖuUoṖ₆‛›∇↳Ṗh|İ(ǓzĊ∷„≈ǐ⁋æ꘍∷←Π#†ǐĊ₌(Un+ɾḋṪD¾I*ṙ²₍£ż«⌈inβ3τṘ0€vB?⌈ƛf\ekvVΠµøDL}¨£iṄ
This is a dictionary compression-based approach. It's somewhat silly but also has some neat aspects, so I've decided to put it here.
Vyxal uses a dictionary-based string compression system: common words or sequences are replaced with a pair of non-ASCII characters - for example, `ƈṡ, ƛ€!` decompresses to Hello, World!. Unusually, Vyxal's string compressor is built into the language and can be used to compress strings at runtime. One interesting side effect of using a compressor based on an English dictionary is that it can be used to identify English.
If we take a word and consider every combination of vowels that could replace its es, then compare the results by how long they are when compressed, the shortest ones are going to be the ones that correspond most closely to English - so we expect the index of each actual word in its compression-sorted list of alternatives to be quite low.
For example, with the word wh*n, where the * could be any vowel, our options are whan, when, whin, whon, and whun. However, only when is a word in Vyxal's dictionary, and so it compresses to a 2-byte dictionary string, while the others compress to at least 3 bytes, so when is chosen as the most likely candidate and placed first in the list of options. Ties are broken by lexicographic order.
Extending this to the full input "when in doubt, say nothing and move on.", all the words aside from "in" and "on." show up first (index 0) in their lists, with "in" and "on." showing up at indices 2 and 3 because all of their options compress to the same length. We can then assign the list 0, 2, 0, 0, 0, 0, 0, 3 to this string, with each number representing the index of a word in its compression-sorted list of possibilities.
We can construct a similar list for each given sentence. The indices are usually very small - over half the words are guessed first in their lists, which is impressive considering there are 5 or 25 options for most words (one or two es); over 90% are within the top 4 options, and almost all remaining words show up below 20 - with the sole exception of obligated, which shows up at index 396 due to almost all possibilities compressing equally poorly.
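To make the ranking idea concrete, here is a toy Python sketch. Vyxal's real dictionary and compressor are not reproduced: `TOY_DICTIONARY` and `toy_cost` are hypothetical stand-ins that charge 2 for a dictionary word and one unit per character otherwise.

```python
from itertools import product

VOWELS = "aeiou"

def candidates(word):
    """Every way of replacing each e in the word with a vowel."""
    pieces = word.split("e")
    slots = len(pieces) - 1
    return ["".join(p + v for p, v in zip(pieces, combo + ("",)))
            for combo in product(VOWELS, repeat=slots)]

TOY_DICTIONARY = {"when", "move"}      # hypothetical stand-in for Vyxal's dictionary

def toy_cost(word):
    return 2 if word in TOY_DICTIONARY else len(word)

def ranked(word):
    # Sort by (stand-in) compressed length, ties broken lexicographically.
    return sorted(candidates(word), key=lambda w: (toy_cost(w), w))

print(candidates("when"))            # prints: ['whan', 'when', 'whin', 'whon', 'whun']
print(len(candidates("meve")))       # prints: 25
print(ranked("when").index("when"))  # prints: 0
print(ranked("en").index("in"))      # prints: 2
print(ranked("en.").index("on."))    # prints: 3
```

The last two lines reproduce the indices 2 and 3 quoted for "in" and "on.": when every candidate costs the same, the position is decided purely by alphabetical order.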
Now, we can compress these lists. The approach I've chosen here is to convert each index to bijective binary (using 1s and 2s as digits instead of 0s and 1s), and then delimit by 0s - for example, 0, 2, 0, 0, 0, 0, 0, 3 becomes _, 2, _, _, _, _, _, 11 (with _ representing the empty string) and then this is joined together to yield 0200000011.
I originally used binary here, but bijective binary is a fair bit more efficient - since leading zeros aren't ignored, we can fit more possible numbers into the same space. We can't quite convert this from ternary due to the leading zero, so instead I reverse the ternary string, giving 1100000020, and then convert from ternary, yielding 26250. This does remove any trailing zeroes from the string, but that's not a problem: these would correspond to trailing zeroes in the list of indices, which will automatically get turned into zeroes when the lists are zipped.
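The index-list encoding can be checked with a short Python sketch. The function names are invented for illustration; the arithmetic follows the description above.

```python
def to_bijective_binary(n):
    """Write n with digits 1 and 2 (0 becomes the empty string)."""
    digits = ""
    while n > 0:
        n, r = divmod(n - 1, 2)
        digits = str(r + 1) + digits
    return digits

def from_bijective_binary(s):
    value = 0
    for ch in s:
        value = value * 2 + int(ch)
    return value

def encode(indices):
    ternary = "0".join(to_bijective_binary(i) for i in indices)
    return int(ternary[::-1] or "0", 3)   # reversed so the leading 0 is not lost

def decode(number):
    digits = ""
    while number:
        number, r = divmod(number, 3)
        digits += str(r)                  # least significant first = reversed back
    return [from_bijective_binary(part) for part in digits.split("0")]

print(encode([0, 2, 0, 0, 0, 0, 0, 3]))   # prints: 26250
print(decode(26250))                      # prints: [0, 2, 0, 0, 0, 0, 0, 3]
print(decode(encode([1, 0, 0])))          # prints: [1]
```

The last line shows the dropped trailing zeroes mentioned above: they have to be restored implicitly when the list is paired up with the words.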
Doing this for each sentence yields an extremely compact representation of the correct word to assign to each index - however, we need to choose which one to decompress.
The approach I ended up taking here was using Vyxal's other form of compressed data - alphabetical strings. Each of the numbers from above can be converted to base 26 - for example, 144337 becomes ifnl - and we can join these with spaces into one string of letters, which Vyxal can compress.
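The base-26 step is easy to verify in Python. This sketch assumes a maps to 0 and z to 25, which is the convention that reproduces the ifnl example; `to_letters` and `from_letters` are hypothetical names.

```python
import string

def to_letters(n):
    """Write n in base 26 using a..z as digits 0..25."""
    letters = ""
    while n:
        n, r = divmod(n, 26)
        letters = string.ascii_lowercase[r] + letters
    return letters or "a"

def from_letters(s):
    value = 0
    for ch in s:
        value = value * 26 + string.ascii_lowercase.index(ch)
    return value

print(to_letters(144337))     # prints: ifnl
print(from_letters("ifnl"))   # prints: 144337
```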
We do, however, need to hash each input to a distinct index in the list of decompression cues. The approach I found for this, completely by accident, was to use the custom base decompression builtin β, which takes a string and decompresses it using another string as a key - and plugging the sentences into this twice yields a set of numbers that are distinct modulo 17. We can then just adjust the list of decompression cues so that they're at the appropriate index for their strings, with empty strings or junk in between - and that gets us a list of indices for a corresponding input sentence.
From there, all that's necessary is to convert those compressed forms into a list of indices, generate the list of possibilities for each input word, sort them, and choose the correct words!
β # Hash the input (see above)
i # and use that to select the correct decompression cue
«...«⌈ # from a compressed string containing all of them
nβ # parse from a base-26 string to an integer
3τ # and convert to ternary,
Ṙ # reverse to reintroduce any trailing zeroes
0€ # split on 0s
vB # and parse each digit sequence from binary, yielding a list of indices
?⌈ # take each word of the input
ƛ } # and over each one
f V # replace
\e # "e"
kv # with every vowel
Π # and take every possibility
µ } # and sort the possibilities by
L # length
øD # when compressed
¨£i # finally, index each number into its list of possibilities
Ṅ # and join by spaces
In terms of data compression, this is a pretty decent method! The only hardcoded data is the compressed string, which contains both all the cues and the ordering necessary to relate them to the sentences in 79 bytes (down from 85), compared to e.g. Kevin Cruijssen's 05AB1E answer using 95 bytes (down from 97) for similar data. It's difficult to make a fair comparison due to the decompression and extraction code required, but I think it's reasonable to say that this method is a bit shorter.
JavaScript (Node.js), 198 bytes
The source code contains unprintable characters.
s=>s.replace(/e/g,_=>"aeoiu"[(q=Buffer(`L9*(
)9U~B65~(RTf.RTf~~?eB=[6n~~B&(kR7r#~~1~>9~~q8?42n9
p2'3/~nAg7Z~Vn[*o_$`.split`~`[s.length%38%17])[k++/3]|q/5)%5],q=k=0)
Method
The quote is identified by its length \$n\$, which is turned into a lookup index with the modulo chain \$(n\bmod 38)\bmod 17\$.
The vowels of each quote are stored by groups of 3, with aeoiu mapped to \$0\$, \$1\$, \$2\$, \$3\$, \$4\$. Each group is encoded with an ASCII character in the range \$[0\dots5^3-1]=[0\dots124]\$.
The permutation of the vowels was chosen to minimize the final lookup string length, accounting for required character escaping and trailing 0s removal.
We use ~ (ASCII code 126) as a delimiter in the lookup string.
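The packing scheme can be sketched in Python like this. The `pack` function is an assumption about how the lookup string was built (the answer only shows the decoder), and `unpack` mirrors the `q = byte | q/5` trick in a more explicit form.

```python
ORDER = "aeoiu"     # vowel -> digit 0..4

def pack(vowels):
    """Store three vowels per byte as d0 + 5*d1 + 25*d2."""
    out = bytearray()
    for i in range(0, len(vowels), 3):
        group = vowels[i:i + 3]
        out.append(sum(ORDER.index(v) * 5 ** j for j, v in enumerate(group)))
    return bytes(out)

def unpack(data, count):
    """Read `count` vowels back, least significant base-5 digit first."""
    vowels, q = "", 0
    for k in range(count):
        if k % 3 == 0:
            # Reading past the end yields 0, like undefined | 0 in JavaScript.
            q = data[k // 3] if k // 3 < len(data) else 0
        else:
            q //= 5
        vowels += ORDER[q % 5]
    return vowels

quote = "when in doubt, say nothing and move on."
print(len(quote) % 38 % 17)        # prints: 1
print(pack("eiouaoiaoeo"))         # prints: b'B65\x0b'
print(unpack(b"B65\x0b", 11))      # prints: eiouaoiaoeo
```

The printable part `B65` appears to match slot 1 of the lookup string in the answer, with the final byte being one of the unprintable characters.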
Commented
s => // s = input string
s.replace( // replace in s ...
/e/g, // ... each vowel 'e'
_ => // with ...
"aeoiu"[ // ... another vowel, using an index in [0..4]
( q = // update q:
Buffer( // turn into a Buffer:
`...` // take the lookup string
.split`~`[ // split it at '~'
s.length // get the right part by applying the modulo
% 38 % 17 // chain to the length of the input string
] //
) // end of Buffer()
[k++ / 3] // attempt to read at k / 3 (then increment k)
// this is undefined if k is not a multiple of 3
| // bitwise OR with ...
q / 5 // ... the previous value divided by 5
) // (which also forces coercion to an integer)
% 5 // reduce modulo 5
], //
q = k = 0 // starting with q = 0 and k = 0
) // end of replace()
05AB1E, 114 bytes (previously 119, 115)
'e¡•œjÆ₅·¦ÎKмÌèζýMΛćáÕtImĆÎ&í»Ï£§çΘ¹ΣšF.θ,ÏTù¡YChí—?”'ι¿¡Ëˆ∞Ï1ǝ¬ªNâ3₆ד¿K·r~α0XÓ∍›õ®α:Í••3FÃŽn嵕₆в£žMRÅвI'e¢ù`.ιJ
-4 bytes thanks to a tip of @ValueInk †
Try it online or verify all test cases.
Explanation:
'e¡ '# Split the (implicit) input-string on "e"
•œjÆ₅·¦ÎKмÌèζýMΛćáÕtImĆÎ&í»Ï£§çΘ¹ΣšF.θ,ÏTù¡YChí—?”'ι¿¡Ëˆ∞Ï1ǝ¬ªNâ3₆ד¿K·r~α0XÓ∍›õ®α:Í•
'# Push compressed integer 33661166341730947323742594214991132931290889044884883702434500366942293627433085270923068711731105465016280482100881923417883481729608331506876022437314686732263433892129363251493019570801390120479604
•3FÃŽn嵕 # Push compressed integer 841821687692745
₆в # Convert the second to a base-36 list:
# [8,10,14,15,19,21,23,27,30,33]
£ # Split the larger integer into parts of that size:
# [33661166,3417309473,23742594214991,132931290889044,8848837024345003669,422936274330852709230,68711731105465016280482,100881923417883481729608331,506876022437314686732263433892,129363251493019570801390120479604]
žM # Push a constant string with the vowels (-y): "aeiou"
R # Reverse it to "uoiea" †
Åв # Convert each inner integer to a base-"uoiea" list,
# (aka, convert it to base-stringLength and index into the string):
# [["e","i","o","u","a","o","i","a","o","e","o"],["i","e","a","a","a","e","o","i","a","u","u","e","a","e"],["o","o","o","u","i","a","a","a","e","o","e","o","u","a","e","e","a","a","e","o"],["o","o","a","a","o","u","a","i","o","i","a","u","a","e","o","a","i","i","o","e","a"],["o","u","a","e","i","o","i","o","e","a","i","e","o","e","u","o","e","i","a","e","o","u","o","u","a","o","e","a"],["i","o","o","e","a","u","e","i","o","e","o","o","e","i","i","o","o","a","u","a","o","i","a","e","o","a","e","a","o","u"],["i","a","e","e","a","i","o","e","o","i","e","a","o","i","e","a","i","e","a","o","a","e","i","e","o","e","o","a","e","e","a","o","i"],["o","o","a","e","o","i","a","o","u","e","i","i","e","a","e","o","a","e","o","i","e","o","u","a","e","o","u","a","u","i","e","a","a","e","o","e","o","o"],["i","o","u","e","e","o","i","o","o","o","a","o","a","a","o","a","o","u","o","e","o","a","o","a","o","o","u","a","a","o","u","o","e","e","a","o","a","e","a","o","u","e","i"],["a","i","e","e","a","e","e","u","a","a","a","e","e","a","e","o","e","i","u","e","i","o","a","u","e","a","e","e","u","e","a","a","e","e","a","e","o","e","i","u","e","i","o","a","u","a"]]
I # Push the input-string again
'e¢ '# Pop and count the amount of "e" in it
ù` # Keep the list of vowels of that size
.ι # Interleave the split input and these vowels
J # Join everything together to a single string
# (which is output implicitly as result)
See this 05AB1E tip of mine (sections How to compress large integers? and How to compress integer lists?) to understand why •œjÆ₅·¦ÎKмÌèζýMΛćáÕtImĆÎ&í»Ï£§çΘ¹ΣšF.θ,ÏTù¡YChí—?”'ι¿¡Ëˆ∞Ï1ǝ¬ªNâ3₆ד¿K·r~α0XÓ∍›õ®α:Í• is 33661166341730947323742594214991132931290889044884883702434500366942293627433085270923068711731105465016280482100881923417883481729608331506876022437314686732263433892129363251493019570801390120479604; •3FÃŽn嵕 is 841821687692745; and •3FÃŽn嵕₆в is [8,10,14,15,19,21,23,27,30,33].
† Because 05AB1E can't compress integer lists with a leading 0 and one of the sentences starts with an a, I reverse the vowels string (and have adjusted the compressed integers) since there isn't a sentence starting with a u for its vowels (thanks @ValueInk for the tip).
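For readers who do not know 05AB1E, the decoding steps can be mirrored in Python. This is a sketch with invented helper names; only the first 18 digits of the large integer and the first two sizes are used, and the garbled input assumes every vowel was replaced by e.

```python
from itertools import zip_longest

def split_digits(number, sizes):
    """Cut a decimal digit string into consecutive integers of the given lengths."""
    parts, start = [], 0
    for size in sizes:
        parts.append(int(number[start:start + size]))
        start += size
    return parts

def to_base_string(n, alphabet="uoiea"):
    """Convert n to base len(alphabet) and index each digit into the alphabet."""
    out = ""
    while n:
        n, r = divmod(n, len(alphabet))
        out = alphabet[r] + out
    return out

def interleave(parts, vowels):
    return "".join(a + b for a, b in zip_longest(parts, vowels, fillvalue=""))

numbers = split_digits("336611663417309473", [8, 10])
vowel_lists = [to_base_string(n) for n in numbers]
print(vowel_lists)     # prints: ['eiouaoiaoeo', 'ieaaaeoiauueae']

garbled = "when en deebt, sey netheng end meve en."
chosen = [v for v in vowel_lists if len(v) == garbled.count("e")][0]
print(interleave(garbled.split("e"), chosen))
# prints: when in doubt, say nothing and move on.
```

With u as digit 0, a vowel list could only lose its first letter if it started with u, which is the point of reversing the vowel string.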
Could I do this in Excel?
- Truncate the input to its first 4 characters (with the LEFT function): fourchar = LEFT(input, 4)
- Then use a set of nested IF functions that compare against the expected four-character strings: "when", "whet", "ef y", and so on.
- Each branch of the nested IFs outputs the particular quote that corresponds to that four-character string.
Please...don't yell at me too hard. I am a total civilian office worker, not a coder. I just wondered if this approach would be allowed.
Jelly, 113 bytes
⁽¦Ụ,“¡ṆṚq“¡ṣ×fĠSn“¡ṡọo)tḄḣn“¡x€pƲḍµ¢mÞƁİ“Æ⁾Ṣ[ḥ81⁶ṡƽƲ⁹“£:ŻɓḊ5ŀtȦỴẆÐ`ı“Œ ƙĠ#wçñ⁾Ḅ“/⁺4ṣṄḷŀs,“_ṙ'Vḳ&“¡Ø⁾J8’ḥṃØẹɓṣ”eż
A full program that accepts a string and prints to stdout
How?
⁽¦Ụ,“...“...’ḥṃØẹɓṣ”eż - Main Link: list of characters, S
, ḥ - hash {S} using:
⁽¦Ụ - salt = 2436
“...“...’ - domain = [26983364, 462962559208611, 30266711103401497111, 353864496146005868907598699, 868734713992749156230688226138, 48270613457564664982186844774276, 76807421178203501785923, 740918920714374341545, 94703755492039, 4212081307]
(picking the appropriate one of these ten numbers)
ṃ - base-decompress with digits:
Øẹ - "aeiou"
(making the vowels to insert)
ɓ - start a new dyadic chain - f(S, VowelsToInsert)
ṣ”e - split {S} at 'e' characters
ż - zip with {VowelsToInsert}
- implicit, smashing print
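The base-decompression step can be reproduced in Python, under the assumption (consistent with the numbers above) that Jelly's 1-based, wrapping indexing makes digit 0 select the last character of "aeiou". The hash builtin is not reproduced here, so the sketch takes the chosen number as given; the helper names are hypothetical.

```python
from itertools import zip_longest

def base_decompress(n, alphabet="aeiou"):
    """Convert n to base 5; digits 1..4 pick a, e, i, o and digit 0 picks u."""
    digits = []
    while n:
        n, r = divmod(n, len(alphabet))
        digits.append(r)
    # alphabet[d - 1] wraps for d == 0, mimicking 1-based modular indexing.
    return "".join(alphabet[d - 1] for d in reversed(digits))

def restore(garbled, number):
    vowels = base_decompress(number)
    parts = garbled.split("e")
    return "".join(a + b for a, b in zip_longest(parts, vowels, fillvalue=""))

print(base_decompress(26983364))
# prints: eiouaoiaoeo
print(restore("when en deebt, sey netheng end meve en.", 26983364))
# prints: when in doubt, say nothing and move on.
```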
Ruby -pl, 208 bytes
Uses Arnauld's \$(length\bmod38)\bmod17\$ trick to determine which vowel string to substitute in. Each is represented by a base 36 string that is decoded and converted to its base 5 digits (conveniently, least significant first, meaning they can be substituted in by popping from the back instead of shifting from the front). The order of vowels uieoa was determined by starting with u at position 0 to prevent leading zero issues, and finding the permutation of the remaining vowels that resulted in the shortest combination of strings.
Theoretically can be shortened by using Ruby's pack functionality to compress into unprintables but it's annoying to use those so I'm not going to figure out if it actually saves bytes.
q=%w"77qjdokf9zmy0pa dj62i 95djlwoghwti299mr4e89 . 1whyxgw2yhdj8r . 4zxjnskmov5oq . waar9t p67ik2q81 . zpu3mijjcc0b2kk6chs 3kcyhql9mu ykj952di82qt7q02i"[~/$/%38%17].to_i(36).digits 5
gsub(/e/){'uieoa'[q.pop]}
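For comparison, here is a rough Python equivalent of the two Ruby lines. It is a sketch with hypothetical helper names; the packed string `dj62i` is the entry at index 1 of the list above, and the garbled input assumes every vowel was replaced by e.

```python
import re

def vowel_digits(packed):
    """Base-36 decode, then base-5 digits least significant first (like Integer#digits)."""
    n = int(packed, 36)
    digits = []
    while n:
        n, r = divmod(n, 5)
        digits.append(r)
    return digits

def restore(garbled, packed, order="uieoa"):
    digits = vowel_digits(packed)
    # pop() takes from the back, so the most significant digit is used first.
    return re.sub("e", lambda m: order[digits.pop()], garbled)

garbled = "when en deebt, sey netheng end meve en."
print(len(garbled) % 38 % 17)       # prints: 1
print(restore(garbled, "dj62i"))    # prints: when in doubt, say nothing and move on.
```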
Charcoal, 126 bytes
F⪪”{➙∧7⁼ec⊞_PJ‴⁴ζ↑r⬤Rα\`∨₂⪪V″◨N"«X↑⊗~χC﹪∕∕:≧⧴ΦΠ-ÞLηPχζ‖ïFbX⟧OβcZHFÞ⊟¤⪫⟲@⌊/↙J⍘▶&²(J≧⎚n⁼⬤⁵c↷⭆¶6~vt^″⟧η³νεJ⁻Q\” ¿⁼Lι№θe⭆⪪θe⁺κ§⁺ι¶λ
Try it online! Link is to verbose version of code. Explanation:
F⪪”...”
Loop over all the vowel strings extracted by splitting a compressed string.
¿⁼Lι№θe
If this string's length matches the number of es in the input...
⭆⪪θe⁺κ§⁺ι¶λ
... substitute the es with the letters of the string in turn. Uses @V_R's trick of appending a trailing newline and then zipping with the split input.
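The selection-by-count logic reads roughly like this in Python. The two vowel strings are taken from the 05AB1E answer's listing on this page rather than from the Charcoal compressed string, and `restore` is a hypothetical name.

```python
VOWEL_STRINGS = ["eiouaoiaoeo", "ieaaaeoiauueae"]   # sample data, not the full list

def restore(garbled, vowel_strings):
    parts = garbled.split("e")
    for vowels in vowel_strings:
        if len(vowels) == garbled.count("e"):       # the matching quote
            padded = vowels + "\n"                  # one filler per split part
            return "".join(part + padded[i] for i, part in enumerate(parts))
    return None

print(repr(restore("when en deebt, sey netheng end meve en.", VOWEL_STRINGS)))
# prints: 'when in doubt, say nothing and move on.\n'
```

This only works because each quote has a different number of vowels, so the count of es doubles as the lookup key.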
Python 3.8 (pre-release), 380 bytes
import zlib,base64
def f(s,r=""):
 for v in zlib.decompress(base64.b85decode("c${62u@S^T3<6uZ)(8lQ6siAT<m|_DJ|Qq*XI2x9=f#o?*tPs-W^)^(u{*JpK&c9>WUN=76=P8EIB;8M$PDj=Hv{_7CGxScdZ~3u;(L1G*3>eSX13Mf0qH|miL>$7kopK<sI8VnUke^&2hR9x9w|6^`+K!?T2OmS|1I?cBTsc*")).decode().split():
  if 2>len(x:=s.split("e"))-len(v):
   while x or v:
    if x:r+=x.pop(0)
    if v:r+=v[0];v=v[1:]
   return r
- iterates through strings of vowels, each string corresponding to a quote.
- when it finds one which is the correct length, it combines it with the consonants from the input string
- the vowels are stored as a compressed string.
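The storage format and the matching loop can be sketched with the standard library alone. The `pack` helper and the two sample vowel strings are assumptions for illustration (the answer only contains the decoding side and its own blob); the length test is my reading of `2>len(x)-len(v)`, which picks the first vowel string at least as long as the number of es and so seems to rely on the strings being ordered by length.

```python
import base64
import zlib
from itertools import zip_longest

def pack(vowel_strings):
    """Join with spaces, deflate, and wrap in base85 so the blob is printable."""
    return base64.b85encode(zlib.compress(" ".join(vowel_strings).encode()))

def unpack(blob):
    return zlib.decompress(base64.b85decode(blob)).decode().split()

def restore(garbled, vowel_strings):
    parts = garbled.split("e")
    for vowels in vowel_strings:
        if len(parts) - len(vowels) < 2:      # same test as 2>len(x)-len(v)
            # Alternate part, vowel, part, vowel, ... until both run out.
            return "".join(a + b for a, b in zip_longest(parts, vowels, fillvalue=""))
    return None

samples = ["eiouaoiaoeo", "ieaaaeoiauueae"]
blob = pack(samples)
print(unpack(blob))    # prints: ['eiouaoiaoeo', 'ieaaaeoiauueae']
print(restore("when en deebt, sey netheng end meve en.", unpack(blob)))
# prints: when in doubt, say nothing and move on.
```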