| Bytes | Lang | Time | Link |
|---|---|---|---|
| 251 | Setanta | 240719T203134Z | bb94 |
| 195 | Lexurgy | 220119T070008Z | bigyihsu |
| 438 | C gcc | 220118T080206Z | Alexandr |
| 086 | Python 3 | 220114T153050Z | Jakque |
| 051 | Jelly | 220114T132149Z | Jonathan |
| 055 | Charcoal | 220114T123027Z | Neil |
| 064 | Perl 5 p | 220114T211748Z | Xcali |
| 313 | TypeScript type system | 220114T192202Z | Merlin04 |
| 047 | Pip | 220114T152555Z | DLosc |
| 049 | 05AB1E | 220114T151652Z | Kevin Cr |
| 047 | Retina 0.8.2 | 220114T121400Z | Neil |
Setanta, 260 251 bytes
Sure, this could be done a lot shorter in Raku, but what’s the fun in that?
gniomh(f){s:=0m:=""le i idir(0,fad@f){l:=f[i]c:=aimsigh@(go_liosta@"nmptkswlj"())(l)+1v:=aimsigh@(go_liosta@"aeiou"())(l)+1b:=0ma c{b=s==1s=(s&c<2&3)|1}no ma v{b=s==2s=2}no b=1ma b|aimsigh@["wu","wo","ji","ti","nm","nn"](m+l)+1{s=0bris}m=l}toradh s>1}
−9 bytes because whoops Setanta isn’t Raku
Lexurgy, 195 bytes
Lexurgy is a tool made for conlangers for applying sound changes, so this is perfect for this challenge! (and here I am bashing it into code golf)
Outputs the original word if it's valid Toki Pona, and an empty string otherwise.
Extremely slow version:
Class c {m,n,p,t,k,s,w,l,j}
Class v {a,e,i,o,u}
a:
{({j,t} i),(w {o,u}),({m,n} {m,n}),!@c&!@v}=>`
{(!n&@c @c),(@v @v)}=>` *
!@v&!n=>`/_ $
n=>`/$ _ $
c propagate:
[]=>`/{` _,_ `}
d:
`=>*
Much faster version, 199 bytes:
Class c {m,n,p,t,k,s,w,l,j}
Class v {a,e,i,o,u}
a:
{j,t} i=>`
w {o,u}=>`
{m,n} {m,n}=>`
!n&@c @c=>` *
@v @v=>` *
!@v&!n=>`/_ $
n=>`/$ _ $
!@c&!@v=>`
c propagate:
[]=>`/{` _,_ `}
d:
`=>*
Ungolfed:
Class cons {m,n,p,t,k,s,w,l,j}
Class vow {a,e,i,o,u}
remove-forbidden:
{j,t} i => ` # ji, ti
w {o,u} => ` # wo, wu
{m,n} {m,n} => ` # mn, mm, etc
!n&@cons @cons => ` * # no consecutive consonants
@vow @vow => ` * # no consecutive vowels
!@vow&!n => ` / _ $ # ending with a vowel or n
n => ` / $ _ $ # nothing of length 1
Then:
!@cons&!@vow => ` # convert any invalid character
Then propagate:
[] => ` / {` _, _ `} # spread the invalid
Then:
` => * # delete the invalid
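For readers without Lexurgy, the ungolfed rules above can be approximated in Python. This is only a rough sketch under assumptions, not a translation of Lexurgy's semantics: the name `filter_word` is hypothetical, and instead of marking and propagating the invalid symbol it simply returns the empty string as soon as any rule fires.

```python
CONS = set("mnptkswlj")
VOW = set("aeiou")
# ji, ti, wo, wu, plus every {m,n} {m,n} pair from the rule list
FORBIDDEN = ("ji", "ti", "wo", "wu", "mm", "mn", "nm", "nn")


def filter_word(word):
    """Return word unchanged if no rule fires, else "" (hypothetical helper)."""
    # convert any invalid character
    if not word or any(ch not in CONS | VOW for ch in word):
        return ""
    # forbidden sequences
    if any(pair in word for pair in FORBIDDEN):
        return ""
    for a, b in zip(word, word[1:]):
        # no consecutive consonants, unless the first one is n
        if a in CONS and a != "n" and b in CONS:
            return ""
        # no consecutive vowels
        if a in VOW and b in VOW:
            return ""
    # the word must end with a vowel or n
    if word[-1] not in VOW and word[-1] != "n":
        return ""
    # a bare n is not a word
    if word == "n":
        return ""
    return word
```

Like the Lexurgy program, `filter_word("toki")` gives back `"toki"`, while `filter_word("tiki")` gives `""`.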
C (gcc), 438 bytes
#define R return
int c(l){char a[]={'n','m','p','t','k','s','w','l','j'};for(int i=0;i<9;i++)if(l==a[i])R 1;R 0;}
int v(l){R l==97||l==101||l==105||l==111||l==117?1:0;}
int f(char* s){int i,a,b;for(i=0;*s!=0;s++,i++){a =*s;b=*(s+1);if(!(c(a)||v(a))||((a=='j'||a=='t')&&b=='i'||a=='w'&&(b=='u'||b=='o')||a=='n'&&(b=='n'||b=='m'))||(c(a)&&c(b)&&a!='n')||(v(a)&&v(b))) R 0;}if(i==1&&c(*(s-1))) R 0;if(*s==0&&v(*(s-2))&&*(s-1)!='n') R 0;R 1;}
Explanation:
#define R return
// function to detect a consonant
int c(l){char a[]={'n','m','p','t','k','s','w','l','j'};for(int i=0;i<9;i++)if(l==a[i])R 1;R 0;}
// function to detect a vowel
int v(l){R l==97||l==101||l==105||l==111||l==117?1:0;}
int f(char* s){int i,a,b;for(i=0;*s!=0;s++,i++)
{
a =*s;b=*(s+1);
if(!(c(a)||v(a))|| // reject any character that is not an allowed letter
((a=='j'||a=='t')&&b=='i'||a=='w'&&(b=='u'||b=='o')||a=='n'&&(b=='n'||b=='m'))|| // reject the sequences ji, ti, wu, wo, nn and nm
(c(a)&&c(b)&&a!='n')|| // reject 2 consecutive consonants, unless the first is 'n'
(v(a)&&v(b))) // reject 2 consecutive vowels
R 0;
}
if(i==1&&c(*(s-1))) R 0; // reject a single-letter word that is a consonant
if(*s==0&&v(*(s-2))&&*(s-1)!='n') R 0; // reject a word that ends in a consonant other than 'n'
R 1;
}
Python 3, 97 88 86 bytes
lambda x:re.sub("((?!ji|wu|wo|ti|.*n[nm])(^|[j-npstw])[aeiou]n?)*$","",x)>""
import re
Returns False for a valid word and True for an invalid one.
Thanks to @14m2 for -2 bytes
How it works:
- At each syllable, we check for ji|wu|wo|ti and prevent any capture if it is present. We also check for the presence of either nn or nm further in the word.
- If none of these is present, we capture the syllable (consonant + vowel (+ n)).
- All the captured syllables are replaced by the empty string.
- We then check whether the result is greater than the empty string (True, invalid word) or equal to the empty string (False, valid word).
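The steps above can be tried out with a small sketch that wraps the same regex in a named function. The names `PATTERN` and `is_invalid` are hypothetical and only there for readability; the regex itself is copied from the answer.

```python
import re

# Same regex as in the answer: a run of valid syllables anchored at the end.
PATTERN = r"((?!ji|wu|wo|ti|.*n[nm])(^|[j-npstw])[aeiou]n?)*$"


def is_invalid(word):
    # Strip the run of valid syllables; anything left over means the word is invalid.
    return re.sub(PATTERN, "", word) > ""


for w in ["toki", "pona", "ji", "nn"]:
    print(w, is_invalid(w))
```

This prints False for toki and pona, and True for ji and nn.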
Jelly, 56 51 bytes
+1 to cater for strict IO (two distinct outputs rather than truthy vs falsey being allowed)
“jtklmnpsw”,ØẹŒpṖṖ¬3,8¦p”n;ƊṗⱮLẎF€⁾mnyw⁾nnƲÐḟḊ€;$e@
A (very inefficient) monadic Link that yields 0 when the input string is not a Toki Pona word and 1 when it is.
(Don't) Try it online! (it's so inefficient it'll only complete for words of length three or less!)
...but here is a test-suite that has all tests except the four syllable pankulato that (a) limits to three base-syllables, rather than that of the number of characters in the input string and (b) only calls the word-generating code once for all (hence the e@ has been moved out to the footer).
How?
We construct a list containing ALL valid Toki Pona words constructed from at most length(input) syllables and check if the input is in there.
Yep that's soooo nasty, but without easy regex access I imagine it's the golfiest way.
“jtklmnpsw”,ØẹŒpṖṖ¬3,8¦p”n;Ɗṗ - (partial) Link: integer (from below!)
“jtklmnpsw” - "jtklmnpsw"
Øẹ - "aeiou"
, - pair
Œp - Cartesian product
ṖṖ - pop off "wu" and "wo"
3,8¦ - apply to indices 3 & 8 ("ji" & "ti"):
¬ - logical NOT (replace these with [0,0] (integers))
Ɗ - last three links as a monad:
”n - 'n'
p - Cartesian product (appends 'n' to each)
; - concatenate
ṗ - Cartesian power (the integer)
...ⱮLẎF€⁾mnyw⁾nnƲÐḟḊ€;$e@ - (continued) Link: string, S
... L - length of S
...Ɱ - map across [1..length(S)] with:
... - code above -> base-syllable combos of each length
Ẏ - tighten
F€ - flatten each
Ðḟ - filter discard those for which:
Ʋ - last four links as a monad:
⁾mn - "mn"
y - translate (convert ms to ns)
⁾nn - "nn"
w - index of first occurrence (or zero)
$ - last two links as a monad:
Ḋ€ - dequeue each
; - concatenate
@ - with swapped arguments:
e - S exists in there?
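The same brute-force idea can be sketched in Python for anyone who does not read Jelly. This is an illustration under assumptions, not a port: the name `is_word` is hypothetical, and the syllable count is capped at `(len(s) + 1) // 2` instead of `len(s)`, which is enough because every syllable except a dequeued first one has at least two letters. It is still exponential, so only short inputs are practical.

```python
from itertools import product


def is_word(s):
    # Base syllables: consonant + vowel, minus the four forbidden pairs.
    base = [c + v for c, v in product("jtklmnpsw", "aeiou")
            if c + v not in ("ji", "ti", "wo", "wu")]
    # Each base syllable may also carry a trailing n.
    syllables = base + [b + "n" for b in base]
    words = set()
    for k in range(1, (len(s) + 1) // 2 + 1):
        for combo in product(syllables, repeat=k):
            w = "".join(combo)
            # Discard candidates containing nn or nm.
            if "nn" in w or "nm" in w:
                continue
            words.add(w)      # consonant-initial word
            words.add(w[1:])  # dequeued copy: vowel-initial word
    return s in words
```

For example, `is_word("toki")` and `is_word("anpa")` are True, while `is_word("ji")` is False.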
Charcoal, 59 58 55 bytes
∧θ¬⊙⪪”&↧q1o⁺VPα”²№θι≔aeiouηF⮌θ¿№ηι≔⁻”&↧ï⁸t∕p№t⟦”ηη¿⁻ιn⎚
Try it online! Link is to verbose version of code. Explanation:
∧θ¬⊙⪪”&↧q1o⁺VPα”²№θι
Check that the word doesn't contain any of the illegal letter pairs contained in the compressed string.
≔aeiouη
Start by expecting the last character to be a vowel.
F⮌θ
Loop over the word in reverse.
¿№ηι
If we see an expected letter, ...
≔⁻”&↧ï⁸t∕p№t⟦”ηη
... then flip the set of expected letters by subtracting it from the string of all the legal Toki Pona letters grouped into vowels and consonants.
¿⁻ιn
Otherwise, if the current letter is not an n, ...
⎚
... then erase any previous validity there might have been.
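The reverse scan described above can be sketched in Python. It mirrors the steps as explained rather than decoding the Charcoal program: the name `scan_reverse` is hypothetical, and the contents of the two compressed strings are assumptions (the six pairs ji, ti, wo, wu, nn, nm, and the fourteen Toki Pona letters).

```python
VOWELS = set("aeiou")
# Assumption: the second compressed string holds all legal letters.
LETTERS = set("aeioujklmnpstw")
# Assumption: the first compressed string holds these six pairs.
ILLEGAL_PAIRS = ("ji", "ti", "wo", "wu", "nn", "nm")


def scan_reverse(word):
    # The word must be non-empty and free of illegal letter pairs.
    if not word or any(pair in word for pair in ILLEGAL_PAIRS):
        return False
    # Start by expecting the last character to be a vowel.
    expected = VOWELS
    for ch in reversed(word):
        if ch in expected:
            # Flip between expecting vowels and expecting consonants.
            expected = LETTERS - expected
        elif ch != "n":
            # Unexpected letter that is not an n: the word is rejected.
            return False
    return True
```

Here `scan_reverse("pankulato")` is True: the n before k is skipped because a vowel was expected at that point and the letter is an n.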
TypeScript type system, 313 bytes
type v="a"|"e"|"i"|"o"|"u";type i<T>=T extends""?1:T extends`${Exclude<`${"m"|"n"|"p"|"t"|"k"|"s"|"w"|"l"|"j"}${v}`,"ji"|"wu"|"wo"|"ti">}${infer r}`?i<r>extends 1?1:r extends`n${infer e}`?e extends`${"n"|"m"}${any}`?0:i<e>:0:0;type o<T>=T extends`${v}${infer p}`?i<p>extends 1?1:p extends`n${infer r}`?i<r>:0:i<T>
This is written entirely with TypeScript types - the o type outputs 1 if the input parameter is a valid word and 0 if it is not. There's probably some room for further golfing.
Pip, 56 53 47 bytes
-3 bytes by porting Neil's Retina answer
X<>"jiwuwotinnnm"NIa&a~=+:`^|[j-nptsw]`+XV.`n?`
Returns 1 for a valid word, 0 for an invalid word. Attempt This Online!
Explanation
At its core, this solution works similarly to Neil's Retina answer:
- The input does not contain any of the illegal sequences ji, wu, wo, ti, nn, or nm; AND
- The input fully matches the regex ((^|[j-nptsw])[aeiou]n?)+
First half:
X<>"jiwuwotinnnm"NIa
"jiwuwotinnnm" That string
<> Grouped into pairs of characters
X Converted to a regex that matches any of those pairs
NI Does not match in
a The command-line argument
Second half:
a~=+:`^|[j-nptsw]`+XV.`n?`
`^|[j-nptsw]` That regex
+ Wrapped in a non-capturing group and followed by
XV Built-in regex `[aeiou]`
. Followed by
`n?` That regex
+: Apply the + quantifier to the above wrapped in n.c. group
a~= Command-line argument fully matches that regex
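The two conditions translate almost directly to Python's `re` module. The sketch below is illustrative only; the names `ILLEGAL`, `SHAPE` and `is_toki_pona` are hypothetical, while the two regexes are the ones given in the explanation.

```python
import re

# First half: any of the illegal pairs.
ILLEGAL = re.compile("ji|wu|wo|ti|nn|nm")
# Second half: one or more syllables of the form (consonant) vowel (n).
SHAPE = re.compile(r"((^|[j-nptsw])[aeiou]n?)+")


def is_toki_pona(word):
    # Valid when no illegal pair occurs and the whole word matches the shape.
    return not ILLEGAL.search(word) and SHAPE.fullmatch(word) is not None
```

With this, `is_toki_pona("toki")` is True and `is_toki_pona("ji")` is False.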
05AB1E, 49 bytes
„nn„nm‚åà≠×ε.•2Ñ|qγù•žM⨨D27SèKD'n««N>ãJ}˜D€¦«Iå
Port of @JonathanAllan's Jelly answer, but even slower.. :/
Outputs 1/0 for accept/reject respectively.
Try it online.
As is, it's too slow for a test suite, but by adding 2äн between the × and ε (to map over half the input length instead), we can verify all but the longest few truthy and falsey test cases, in separate test suites.
Explanation:
„nn„nm‚ # Push pair ["nn","nm"]
åà≠ # Check that NEITHER is present in the (implicit) input
× # 'Multiply' it by the (implicit) input-string
# (the input if truthy; "" if falsey)
ε # Map over the characters:
.•2Ñ|qγù• # Push compressed string "jtklmnpsw"
žM # Push builtin vowels "aeiou"
â # Pop both, and create a list of all possible char-pairs
¨¨ # Remove the last two ("wu" and "wo")
D # Duplicate the list
27S # Push pair [2,7]
è # Index those into the copy: ["ji","ti"]
K # Remove those as well
D # Duplicate the list again
'n« '# Append an "n" to each string
« # Merge the two lists together
N # Push the 0-based map-index
> # Increase it by 1 to make it 1-based
ã # Cartesian product this index on the list of syllables
J # Join each inner list together to a string
}˜ # After the map: flatten the list of lists
D # Duplicate the list
€¦ # Remove the first consonant from each
« # Merge the two lists together
Iå # Check if the input-string is in this list
# (after which the result is output implicitly)
See this 05AB1E tip of mine (section How to compress strings not part of the dictionary?) to understand why .•2Ñ|qγù• is "jtklmnpsw".
Retina 0.8.2, 48 47 bytes
A`ji|nm|nn|ti|wu|wo
^((^|[j-npstw])[aeiou]n?)+$
Try it online! Link includes test cases. Edit: Saved 1 obvious byte thanks to @ovs. Explanation:
A`ji|nm|nn|ti|wu|wo
Delete invalid inputs.
^((^|[j-npstw])[aeiou]n?)+$
Match valid inputs that weren't invalidated above.