| Bytes | Lang | Time | Link |
|---|---|---|---|
| 026 | Uiua | 250304T122009Z | noodle p |
| 016 | 05AB1E | 250304T084247Z | Kevin Cr |
| 072 | Zsh | 250304T080815Z | roblogic |
| 133 | Tcl | 180427T130842Z | sergiol |
| 023 | Pip s | 250227T224532Z | DLosc |
| 016 | Vyxal S | 210805T150552Z | Aaroneou |
| 093 | Python | 210805T115252Z | ryno |
| 016 | Japt v2.0a0 S | 201113T151954Z | Shaggy |
| 037 | Perl 5 n | 201113T155049Z | Xcali |
| 055 | K ngn/k | 210714T181337Z | coltim |
| 040 | Ruby | 210714T075051Z | ovs |
| 022 | Husk | 201114T031017Z | Razetime |
| 131 | Python 3 | 201113T153944Z | Jitse |
| 056 | grep and awk | 160410T030558Z | joeytwid |
| 103 | Python | 160408T214113Z | orlp |
| nan | PHP | 160408T144728Z | Xanderha |
| 040 | 05AB1E | 160407T055007Z | Adnan |
| 107 | Javascript ES6 | 160408T101651Z | Qwertiy |
| 023 | MATL | 160407T083433Z | Luis Men |
| 076 | Ruby 76 Bytes | 160407T192052Z | knut |
| 023 | Pyth | 160407T071436Z | Jakube |
| 071 | JavaScript ES6 | 160407T045419Z | user8165 |
| 128 | C# LINQPAD | 160407T072030Z | mnsr |
| 043 | Seriously | 160408T021813Z | user4594 |
| 122 | C | 160407T090800Z | mIllIbyt |
| 120 | PHP 120bytes | 160407T112140Z | user5286 |
| 138 | Python 3.5 | 160407T093105Z | R. Kap |
| 039 | Perl 6 | 160407T084922Z | Ven |
| 172 | Lua | 160407T071511Z | Katenkyo |
| 028 | Retina | 160407T062000Z | Kobi |
| 031 | Jelly | 160407T061231Z | Dennis |
| 045 | Retina | 160407T035707Z | Sp3000 |
| 102 | Julia | 160407T050005Z | Alex A. |
| nan | Vim | 160407T041352Z | DJMcMayh |
Uiua, 26 bytes
/$"_ _"▽◰⌵⊸≡◇⊢⊢⍉regex$ \w+
Try it: Uiua pad
Without regex:
Uiua, 28 bytes
/$"_ _"▽◰⌵⊜⊃⊢□↥⌵±⤙⊸∊⊂@_+@0⇡9
Try it: Uiua pad
05AB1E, 16 bytes
žjмS¡õK.¡нl}€нðý
Try it online or verify all test cases.
Explanation:
žj # Push constant "abc...xyzABC...XYZ012...789_"
м # Remove all those characters from the (implicit) input-string
S # Convert what remains into a list of characters
¡ # Split the (implicit) input-string by those characters
õK # Remove all empty strings ""
.¡ # Group all words/numbers/underscores by:
н # Their first character
l # Converted to lowercase
}€н # After the group by: map over each group, and leave its first word
ðý # Join the list with space delimiter
# (after which the result is output implicitly)
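The steps above map fairly directly onto a higher-level language. The following is only a rough Python sketch of the same pipeline, assuming ASCII input; the name `first_words` and the `WORD_CHARS` constant are made up for illustration and are not part of the answer.

```python
import string

# Hypothetical constant: the same character set that žj pushes.
WORD_CHARS = string.ascii_letters + string.digits + "_"

def first_words(text):
    # Hypothetical helper, not part of the 05AB1E answer.
    # Characters of the input that are not word characters (the separators).
    separators = {ch for ch in text if ch not in WORD_CHARS}
    # Split the input on every separator and drop the empty pieces.
    words = "".join(" " if ch in separators else ch for ch in text).split()
    # Group by lowercased first character. Dicts keep insertion order and
    # setdefault never overwrites, so each key keeps the first word seen.
    groups = {}
    for word in words:
        groups.setdefault(word[0].lower(), word)
    return " ".join(groups.values())

print(first_words("Ferulas flourish in gorgeous gardens."))  # Ferulas in gorgeous
```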
Zsh, 72 bytes
for i (${@//[^0-9A-Za-z_]}){n=$i[1]:l;((${#n:|P}))&&printf $i\ ;P+=($n)}
Similar to my split string solution, but here we need more code to clean up the input.
Tcl, 133 bytes
proc F {s D\ {}} {lmap w $s {regsub -all \[^\\w] $w "" f
if {[set k [string tol [string in $f 0]]]ni$D} {dict se D $k $f}}
dict v $D}
Pip -s, 23 bytes
{YLC@ayNIl&lPBy}FIa@+XW
Explanation
{YLC@ayNIl&lPBy}FIa@+XW
a ; Command-line argument
@ ; Find each regex match of
+ ; one or more consecutive
XW ; word characters (alphanumeric + underscore)
{ }FI ; Filter the matches by this function:
a ; The match
@ ; First character
LC ; Lowercased
Y ; Store that in the y variable
l ; List of unique first letters (initially empty)
yNI ; Truthy if y is not in l, falsey otherwise
& ; If truthy, then
lPBy ; push y onto the list and return its new value
; (which is truthy because it's a nonempty list)
; Output, space-separated (-s flag)
Vyxal S, 16 bytes
kr‛ _+↔⇩⌈:vhÞU*'
Explanation:
kr‛ _+↔ # Remove any non A-Z,a-z,0-9,_, or space chars
⇩ # Lowercase
⌈ # Split on spaces
: # Duplicate
vh # Get the first letter of each
ÞU # Nub Sieve (Unique mask)
* # Multiply mask with list of words
' # Remove all empty strings
# 'S' flag - join top of stack with spaces and print
Python, 93 bytes
import re
s=""
for w in re.findall("\w+",input()):
if(a:=w[0].lower())not in s:s+=a;print(w)
Japt v2.0a0 -S, 19 16 bytes
f/\w+/
üÈÎvÃmÎnU
f/\w+/\nüÈÎvÃmÎnU :Implicit input of string U > "Ferulas flourish in gorgeous gardens."
f/\w+/ :Match /\w+/g > ["Ferulas","flourish","in","gorgeous","gardens"]
\n :Reassign to U
ü :Group and sort by
È :Passing each through the following function
Î : First character > ["F","f","i","g","g"]
v : Lowercase > ["f","f","i","g","g"]
à :End function > [["Ferulas","flourish"],["gorgeous","gardens"],["in"]]
m :Map
Î : First element > ["Ferulas","gorgeous","in"]
nU :Sort by index in U > ["Ferulas","in","gorgeous"]
:Implicit output joined with spaces > "Ferulas in gorgeous"
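For readers who don't know Japt, the trace above can be followed in Python. This is a sketch under the assumption that a stable sort followed by `itertools.groupby` is an acceptable stand-in for ü; `first_words` is a hypothetical name.

```python
import re
from itertools import groupby

def first_words(text):
    # Hypothetical helper mirroring the Japt steps, not a translation of Japt itself.
    words = re.findall(r"\w+", text)              # f/\w+/
    key = lambda w: w[0].lower()                  # first character, lowercased
    # sorted() is stable, so inside each group the words keep their input order.
    groups = [list(g) for _, g in groupby(sorted(words, key=key), key=key)]
    firsts = [g[0] for g in groups]               # first element of each group
    # Grouping reordered the words, so sort them back by position in the match list.
    return " ".join(sorted(firsts, key=words.index))

print(first_words("Ferulas flourish in gorgeous gardens."))  # Ferulas in gorgeous
```

The final sort is needed because grouping puts "gorgeous" ahead of "in", exactly as in the intermediate result shown in the trace.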
K (ngn/k), 55 bytes
{" "/w@.*'" "_=_*'w:" "\c(c:`c$&2!&48,27\3988544219)?x}
- (c:`c$&2!&48,27\3988544219) a compressed version of "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz", stored in c
- c(...)?x replace all non-alphanumeric/underscore characters with spaces
- w:" "\ split the converted input on spaces, storing in w
- =_*'w build a dictionary mapping the distinct (lowercased) leading characters to their words' position(s) within the input sentence
- " "_ remove/ignore spaces
- w@.*' retrieve the first word beginning with each distinct character...
- " "/ ...joining them together with spaces (to be implicitly returned)
Ruby, 40 bytes
->s{s.scan(/\w+/).uniq{|x|x.ord|32}*' '}
String#ord returns the codepoint of the first character in a string. Bitwise OR with 32 maps uppercase codepoints to their lowercase counterparts and 95 (_) to 127, while leaving digits and lowercase letters unchanged.
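The bit trick is easy to check outside Ruby. Below is a small Python sketch; `fold` and `first_words` are hypothetical names, and `re.ASCII` is used on the assumption that Ruby's `\w` only matches ASCII word characters here.

```python
import re

def fold(word):
    # Hypothetical helper: OR-ing with 32 sets bit 5 of the first code point.
    return ord(word[0]) | 32

# 'A' -> 97, 'a' -> 97, '7' -> 55 (bit 5 already set), '_' -> 127
for ch in "A", "a", "7", "_":
    print(ch, ord(ch), fold(ch))

def first_words(text):
    # Hypothetical helper: keep the first word for each folded key, like uniq with a block.
    seen = set()
    kept = []
    for word in re.findall(r"\w+", text, flags=re.ASCII):
        key = fold(word)
        if key not in seen:
            seen.add(key)
            kept.append(word)
    return " ".join(kept)
```

Mapping `_` to 127 is harmless because no other word character folds to that value.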
Husk, 22 bytes
wÖ€₁¹m←ko_←₁
mf§|='_□w
The '_' requirement is a bit annoying, otherwise a pretty interesting challenge.
Explanation
function ₁: filter nonalphanumerics and split
mf§|='_□w
w split on spaces
mf map and filter each letter in the words by:
□ is it alphanumeric?
§| or:
='_ is it an underscore?
main function:
wÖ€₁¹m←ko_←₁
₁ format the input
ko key the words on:
_← first letter, lowercased
m← map each to first word
Ö order by:
€₁¹ their index in the formatted input
w join back with spaces
Python 3, 131 bytes
lambda s:' '.join(sorted({w[0].lower():w for w in''.join(c*(c.isalnum()or c in'_ ')for c in s).split()[::-1]}.values(),key=s.find))
Solution without regex.
grep and awk, 68 56 bytes
The script:
echo `grep -o '\w*'|awk '!x[tolower(substr($0,1,1))]++'`
Explanation:
- grep -o matches the legal words, printing each on its own line.
- awk takes the first letter of each line with substr, makes it lowercase, and then increments a hashtable entry with that key. If the value was unset before the increment, the line is printed.
- echo ... turns the lines back into words.
I previously tried to create a solution without awk, using uniq, sort, grep and bash but fell just short. History in the edits.
Thanks to Dennis for some improvements I missed.
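The `!x[key]++` idiom (test the old count, then increment it) is the heart of the script. A Python sketch of the same idea, with a `defaultdict` standing in for awk's auto-initialising array; `first_words` is a hypothetical name and the regex is an assumed equivalent of the grep call.

```python
import re
from collections import defaultdict

def first_words(text):
    # Hypothetical helper, not a literal translation of the shell pipeline.
    seen = defaultdict(int)                  # missing keys read as 0, like awk's x[]
    kept = []
    for word in re.findall(r"\w+", text):    # grep -o: one word per match
        key = word[:1].lower()               # tolower(substr($0,1,1))
        if not seen[key]:                    # !x[key]: true only on the first visit
            kept.append(word)
        seen[key] += 1                       # the post-increment in x[key]++
    return " ".join(kept)                    # echo folds the lines into one

print(first_words("Take all first words for each letter... this is a test"))
```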
Python, 103 bytes
import re
lambda s,d=[]:[w for w in re.findall("\w+",s)if(d.append(w.lower()[0])or d[-1])not in d[:-1]]
PHP
Inspired by the use of regex in most of the answers, I originally tried to do this without using regex at all just to show off a neat variation, but the sticking point of not having clean strings as input ruined that idea. Sad.
With function wrapper, 89 bytes
function f($s){foreach(preg_split('/\W/',$s)as$w)$c[lcfirst($w)[0]]++?:$v.=" $w";echo$v;}
Without function wrapper (needing $s pre-declared), 73 bytes
foreach(preg_split('/\W/',$s)as$w)$c[lcfirst($w)[0]]++?:$v.=" $w";echo$v;
Explanation:
foreach(preg_split('/\W/',$s)as$w)$c[lcfirst($w)[0]]++?:$v.=" $w";echo$v;
preg_split('/\W/',$s) Break input on all non-word characters
foreach( as$w) Loop through each 'word'
lcfirst($w)[0] Take the first letter of the lowercase version of the word
$c[ ]++?: Increment an array element with a key of that letter after checking if it's false-y (0)
$v.=" $w"; Add the word if the letter wasn't found (if the previous condition evaluated to false)
echo$v; Print the new string to screen.
My only regret is that I couldn't find a faster way of checking/converting letter case.
05AB1E, 40 bytes
Code:
94L32+çJžj-DU-ð¡""Kvy¬Xsl©åï>iX®«Uy}\}ðý
Explanation:
We first generate all characters which should be deleted from the input string using 94L32+ç (Try here). We join this string using J and remove [a-zA-Z0-9_] which is stored in žj (Try here). We remove all the characters that are in the second string from the first string, which will leave us:
!"#$%&'()*+,-./:;<=>?@[\]^`{|}~
That can also be tested here. We Duplicate this and store it in X with the U command. We then remove all the characters that are in this string from the input. We then split on spaces using ð¡ and remove all empty strings (using ""K). We now have this.
This is the clean version of the input, which we will work with. We map over each element using v. This uses y as the string variable. We take the first character of the string using ¬ and push X, which contains a string with all forbidden characters (!"#$%&'()*+,-./:;<=>?@[\]^`{|}~). We check if the lowercase version of the first character, (which will also be ©opied to the register), is in this string using å. Covered by this part: ï>i, if the first letter doesn't exist in the string of forbidden characters (X), we append this letter to the list of forbidden characters (done with X®«U) and we push y on top of the stack.
Finally, when the strings are filtered, we join the stack by spaces with ðý.
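The notable trick here is that one string does double duty: it starts as the punctuation to strip and then also collects the first letters already used. A Python sketch of that idea, assuming ASCII input; `first_words` is a hypothetical name.

```python
def first_words(text):
    # Hypothetical helper illustrating the shared "forbidden" string.
    # ASCII 33..126 without [a-zA-Z0-9_]: the punctuation string shown above.
    forbidden = "".join(
        chr(c) for c in range(33, 127)
        if not (chr(c).isalnum() or chr(c) == "_")
    )
    # Punctuation is deleted, not replaced, so only spaces separate words.
    cleaned = "".join(ch for ch in text if ch not in forbidden)
    kept = []
    for word in cleaned.split(" "):
        if not word:
            continue                  # skip empty strings left by repeated spaces
        first = word[0].lower()
        if first not in forbidden:
            forbidden += first        # the same string now records used letters
            kept.append(word)
    return " ".join(kept)

print(first_words("Look ^_^ .... There are 3 little dogs :)"))  # Look _ There are 3 dogs
```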
Javascript ES6, 108 107 chars
107 chars, result string is trimmed
r=s=>s.split``.reverse().join``
f=s=>r(r(s).replace(/\b\w*(\w)\b(?=.*\1\b)/gi,'')).replace(/\W+/g,' ').trim()
Test:
["Take all first words for each letter... this is a test",
"Look ^_^ .... There are 3 little dogs :)",
"...maybe some day 1 plus 2 plus 20 could result in 3"
].map(f) + '' == [
"Take all first words each letter is",
"Look _ There are 3 dogs",
"maybe some day 1 plus 2 could result in 3"
]
MATL, 23 bytes
'\w+'XXtck1Z)t!=XRa~)Zc
This borrows Jakube's idea of using a regexp for removing unwanted characters and splitting at the same time.
Input is a string enclosed by single quotes.
Explanation
'\w+'XX % find words that match this regexp. Gives a cell array
t % duplicate
c % convert into 2D char array, right-padded with spaces
k % make lowercase
1Z) % get first column (starting letter of each word)
t!= % duplicate, transpose, test for equality: all combinations
XR % set diagonal and below to 0
a~ % true for columns that contain all zeros
) % use as a logical index (filter) of words to keep from the original cell array
Zc % join those words by spaces
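The matrix step may be easier to follow with explicit indices. This is a plain-Python sketch of the same pairwise comparison (no array library, quadratic on purpose); `first_words` is a hypothetical name.

```python
import re

def first_words(text):
    # Hypothetical helper mirroring the MATL steps.
    words = re.findall(r"\w+", text)
    firsts = [w[0].lower() for w in words]
    n = len(words)
    # eq[i][j] is True when words i and j share a first letter. Keeping only
    # i < j mirrors XR, which zeroes the diagonal and everything below it.
    eq = [[i < j and firsts[i] == firsts[j] for j in range(n)] for i in range(n)]
    # Column j is all False exactly when no earlier word starts with that letter.
    keep = [not any(eq[i][j] for i in range(n)) for j in range(n)]
    return " ".join(w for w, k in zip(words, keep) if k)

print(first_words("Ferulas flourish in gorgeous gardens."))  # Ferulas in gorgeous
```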
Ruby 76 Bytes
s;f={};s.scan(/(([\w])[\w]*)/).map{|h,i|f[j=i.upcase]?nil:(f[j]=!p; h)}.compact.*' '
Or with method definition 88 bytes
def m s;f={};(s.scan(/((\w)\w*)/).map{|h,i|f[j=i.upcase]?nil:(f[j]=1; h)}-[p]).*' ';end
Ungolfed and with unit test:
def m_long(s)
#found - Hash with already found initials
f={}
#h=hit, i=initial, j=i.upcase
s.scan(/(([\w\d])[\w\d]*)/).map{|h,i|
f[j=i.upcase] ? nil : (f[j] = true; h)
}.compact.join(' ')
end
#true == !p
#~ def m(s)
#~ f={};s.scan(/(([\w\d])[\w\d]*)/).map{|h,i|f[j=i.upcase]?nil:(f[j]=!p; h)}.compact.join' '
#~ end
def m s;f={};s.scan(/(([\w\d])[\w\d]*)/).map{|h,i|f[j=i.upcase]?nil:(f[j]=!p; h)}.compact.join' ';end
#~ s = "Ferulas flourish in gorgeous gardens."
#~ p s.split
require 'minitest/autorun'
class FirstLetterTest < Minitest::Test
def test_1
assert_equal("Ferulas in gorgeous",m("Ferulas flourish in gorgeous gardens."))
assert_equal("Ferulas in gorgeous",m_long("Ferulas flourish in gorgeous gardens."))
end
def test_2
assert_equal("Take all first words each letter is",m("Take all first words for each letter... this is a test"))
assert_equal("Take all first words each letter is",m_long("Take all first words for each letter... this is a test"))
end
def test_3
assert_equal("Look _ There are 3 dogs",m("Look ^_^ .... There are 3 little dogs :)"))
assert_equal("Look _ There are 3 dogs",m_long("Look ^_^ .... There are 3 little dogs :)"))
end
def test_4
assert_equal("maybe some day 1 plus 2 could result in 3",m("...maybe some day 1 plus 2 plus 20 could result in 3"))
assert_equal("maybe some day 1 plus 2 could result in 3",m_long("...maybe some day 1 plus 2 plus 20 could result in 3"))
end
end
Pyth, 23 bytes
J:z"\w+"1jdxDJhM.grhk0J
Try it online: Demonstration or Test Suite
J:z"\w+"1 finds all the words in the input using the regex \w+ and stores them in J.
.grhk0J groups the words by their lowercase first letter, hM takes the first from each group, xDJ sorts these words by their index in the input string, and jd puts spaces between them.
JavaScript (ES6), 73 71 bytes
s=>s.match(u=/\w+/g).filter(w=>u[n=parseInt(w[0],36)]?0:u[n]=1).join` `
Saved 2 bytes thanks to @edc65!
Test
var solution = s=>s.match(u=/\w+/g).filter(w=>u[n=parseInt(w[0],36)]?0:u[n]=1).join` `;
var testCases = [
"Ferulas flourish in gorgeous gardens.",
"Take all first words for each letter... this is a test",
"Look ^_^ .... There are 3 little dogs :)",
"...maybe some day 1 plus 2 plus 20 could result in 3"
];
document.write("<pre>"+testCases.map(t=>t+"\n"+solution(t)).join("\n\n")+"</pre>");
C# (LINQPAD) - 136 128 bytes
var w=Util.ReadLine().Split(' ');string.Join(" ",w.Select(s=>w.First(f=>Regex.IsMatch(""+f[0],"(?i)"+s[0]))).Distinct()).Dump();
Seriously, 43 bytes
6╙¬▀'_+,;)-@s`;0@Eùk`M┬i;╗;lrZ`i@╜í=`M@░' j
The lack of regex capabilities made this much more difficult than it needed to be.
Explanation:
6╙¬▀'_+,;)-@s`;0@Eùk`M┬i;╗;lrZ`i@╜í=`M@░' j
6╙¬▀ push digits in base 62 (uppercase and lowercase letters and numbers)
'_+ prepend underscore
,;) push two copies of input, move one to bottom of stack
- get all characters in input that are not letters, numbers, or underscores
@s split input on all occurrences of non-word characters
`;0@Eùk`M for each word: push the first letter (lowercased)
┬i transpose and flatten (TOS is list of first letters, then list of words)
;╗ push a copy of the first letters list to register 0
;lrZ zip the list of first letters with their positions in the list
`i@╜í=`M for each first letter: push 1 if that is the first time the letter has been encountered (first index of the letter matches its own index) else 0
@░ filter words (take words where corresponding element in the previous list is truthy)
' j join on spaces
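The filtering criterion, "the first index of this letter equals its own index", can be sketched in Python as follows. Unlike the Seriously program this uses a regex for the splitting step, purely for brevity; `first_words` is a hypothetical name.

```python
import re

def first_words(text):
    # Hypothetical helper showing the first-index test only.
    words = re.findall(r"\w+", text)
    firsts = [w[0].lower() for w in words]
    # firsts.index(c) is the position of the first word starting with c, so
    # the comparison is True only for the first occurrence of each letter.
    mask = [firsts.index(c) == i for i, c in enumerate(firsts)]
    return " ".join(w for w, m in zip(words, mask) if m)

print(first_words("Look ^_^ .... There are 3 little dogs :)"))  # Look _ There are 3 dogs
```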
C, 142 132 122 bytes
10 bytes lighter thanks to @tucuxi!
b[200],k;main(c){for(;~c;isalnum(c)|c==95?k&2?:(k|=!b[c|32]++?k&1?putchar(32):0,7:2),k&4?putchar(c):0:(k&=1))c=getchar();}
Prints a trailing space after the last output word.
PHP 120bytes
function a($s){foreach(preg_split('/\W/',$s)as$w)if(!$o[ucfirst($w[0])]){$o[ucfirst($w[0])]=$w;}return implode(" ",$o);}
This generates a bunch of warnings but that's fine.
Python 3.5, 138 bytes:
import re;lambda o,t=[]:''.join([y[0]for y in[(u+' ',t.append(u[0].lower()))for u in re.sub('\W+',' ',o).split()if u[0].lower()not in t]])
Basically, what's happening is..
- Using a simple regular expression, the program replaces all the characters, except lowercase or uppercase letters, digits, or underscores in the given string with spaces, and then splits the string at those spaces.
- Then, using list comprehension, create a list that iterates through all the words in the split string, and add the first letters of each word to list "t".
- In the process, if the current word's first letter is NOT already in the list "t", then that word and a trailing space are added to the current list being created. Otherwise, the list continues on appending the first letters of each word to list "t".
- Finally, when all words in the split have been iterated through, the words in the new list are joined into a string and returned.
Perl 6, 39 bytes
{.words.grep({!%.{.substr(0,1).lc}++})}
Lua, 172 Bytes
It ended up way longer than I wanted...
t={}(...):gsub("[%w_]+",function(w)b=nil for i=1,#t
do b=t[i]:sub(1,1):lower()==w:sub(1,1):lower()and 1 or b
end t[#t+1]=not b and w or nil end)print(table.concat(t," "))
Ungolfed
t={} -- initialise the accepted words list
(...):gsub("[%w_]+",function(w)-- iterate over each group of alphanumericals and underscores
b=nil -- initialise b (boolean->do we have this letter or not)
for i=1,#t -- iterate over t
do
b=t[i]:sub(1,1):lower() -- compare the first char of t's i word
==w:sub(1,1):lower() -- and the first char of the current word
and 1 -- if they are equal, set b to 1
or b -- else, don't change it
end
t[#t+1]=not b and w or nil -- insert w into t if b isn't set
end)
print(table.concat(t," ")) -- print the content of t separated by spaces
Retina, 28 bytes:
M!i`\b(\w)(?<!\b\1.+)\w* ¶
- M! - Match each word and print all words separated by newlines.
- i - Ignore case.
- \b(\w) - Capture the first letter of each word.
- (?<!\b\1.+) - After matching the letter, check that there wasn't a previous word starting with the same letter. \1.+ ensures at least two characters, so we are skipping the current word.
- \w* - Match the rest of the word.
The above matches only words - all other characters are removed.
- ¶\n - Replace newlines with spaces.
Retina, 45 bytes
i`\b((\w)\w*)\b(?<=\b\2\w*\b.+) \W+ ^ | $
Simply uses a single regex to remove later words starting with the same \w character (case insensitive with the i option), converts runs of \W to a single space, then removes any leading/trailing space from the result.
Edit: See @Kobi's answer for a shorter version using M!`
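Python's `re` module only accepts fixed-width lookbehind, so this regex cannot be ported as written. One workaround, borrowed from the reverse-the-string trick in the JavaScript answer on this page, is to reverse the input and use a lookahead instead. A sketch, with `first_words` as a hypothetical name:

```python
import re

def first_words(text):
    # Hypothetical helper: lookahead on the reversed text replaces the lookbehind.
    rev = text[::-1]
    # In the reversed text a word's last character is its original initial.
    # Drop a word when that character also ends a later reversed word, i.e.
    # when an earlier word in the original starts with the same letter.
    rev = re.sub(r"\b\w*(\w)\b(?=.*\1\b)", "", rev, flags=re.IGNORECASE)
    # Undo the reversal, squeeze runs of non-word characters, trim the ends.
    return re.sub(r"\W+", " ", rev[::-1]).strip()

print(first_words("Ferulas flourish in gorgeous gardens."))  # Ferulas in gorgeous
```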
Julia, 165 155 151 129 102 bytes
g(s,d=[])=join(filter(i->i!=0,[(c=lcfirst(w)[1])∈d?0:(d=[d;c];w)for w=split(s,r"\W",keep=1<0)])," ")
This is a function that accepts a string and returns a string.
Ungolfed:
function g(s, d=[])
# Split the string into an array on unwanted characters, then for
# each word, if the first letter has been encountered, populate
# this element of the array with 0, otherwise note the first letter
# and use the word. This results in an array of words and zeros.
x = [(c = lcfirst(w)[1]) ∈ d ? 0 : (d = [d; c]; w) for w = split(s, r"\W", keep=1<0)]
# Remove the zeros, keeping only the words. Note that this works
# even if the word is the string "0" since 0 != "0".
z = filter(i -> i != 0, x)
# Join into a string and return
return join(z, " ")
end
Saved 53 bytes with help from Sp3000!
Vim 57 keystrokes
:s/[^a-zA-Z_ ]//g<cr>A <cr>ylwv$:s/\%V\c<c-v><c-r>"\h* //eg<c-v><cr>@q<esc>0"qDk@q
Explanation:
:s/[^a-zA-Z_ ]//g #Remove all invalid chars.
A <cr> #Enter insert mode, and enter
#a space and a newline at the end
ylwv$:s/\%V\c<c-v><c-r>"\h* //eg<c-v><cr>@q<esc> #Enter all of this text on the
#next line
0 #Go to the beginning of the line
"qD #Delete this line into register
#"q"
k@q #Run "q" as a macro
#Macro
ylw #Yank a single letter
v$ #Visual selection to end of line
:s/ #Substitute regex
\%V\c #Only apply to the selection and
#ignore case
<c-v><c-r>" #Enter the yanked letter
\h* #All "Head of word" chars
#And a space
// #Replace with an empty string
eg #Continue the macro if not found
#Apply to all matches
<c-v><cr> #Enter a <CR> literal
@q<esc> #Recursively call the macro
I'm really disappointed by how long this one is. The "Invalid" chars (everything but a-z, A-Z, _ and space) really threw me off. I'm sure there's a better way to do this:
:s/[^a-zA-Z_ ]//g
Since \h matches all of that except for the space, but I can't figure out how to put the metachar in a range. If anyone has tips, I'd love to hear em.