| Bytes | Lang | Time | Link |
|---|---|---|---|
| 264 | Tcl | 230410T114159Z | 138 Aspe |
| 176 | C GCC | 230208T230318Z | AnrimO |
| 015 | 05AB1E | 230210T075212Z | Kevin Cr |
| 018 | Jelly | 230209T023434Z | Unrelate |
| 049 | Retina 0.8.2 | 230208T221212Z | Neil |
| 027 | APLDyalog Unicode | 230208T222219Z | att |
| 072 | JavaScript Node.js | 230208T185204Z | Conor O& |
| 039 | Raku | 230208T220123Z | Sean |
| 041 | Perl 5 | 230208T211154Z | Kjetil S |
| 095 | Python | 230208T182935Z | mousetai |
Tcl, 506 264 bytes
Modified from @Kjetil S's answer
Saved 242 bytes thanks to @sergiol
Golfed version, try it online!
proc I {s i\ 0} {lmap m [regexp -all -inline -indices {\d+} $s] {append r [string ra $s $i [set b [lindex $m 0]]-1][expr [set n [string ra $s $b [set e [lindex $m 1]]]]+1e9].$n
set i $e+1}
append r [string ra $s $i e]
list $r}
proc f x\ y {string co [I $x] [I $y]}
Ungolfed version
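proc increment_numbers {str} {
    set result ""
    set idx 0
    # Visit every run of digits as a {start end} index pair.
    foreach match [regexp -all -inline -indices -- {\d+} $str] {
        set start [lindex $match 0]
        set end [lindex $match 1]
        # Copy the non-digit text that precedes this run.
        append result [string range $str $idx [expr {$start - 1}]]
        set num [string range $str $start $end]
        # Emit num + 1e9 as a sort key (fixed width for values below 9e9),
        # followed by the original digits.
        append result [expr {$num + 1e9}].$num
        set idx [expr {$end + 1}]
    }
    # Copy whatever follows the last run of digits.
    append result [string range $str $idx end]
    return $result
}
proc f {x y} {
    set x [increment_numbers $x]
    set y [increment_numbers $y]
    return [string compare $x $y]
}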
foreach test {
    {abc abx LT}
    {abx abc GT}
    {abx abx EQ}
    {ab abc LT}
    {ab ab10 LT}
    {ab10c ab9x GT}
    {ab9x ab10c LT}
    {15x 16b LT}
    {16b 15x GT}
    {852 9 GT}
    {1,000 9 LT}
    {1.000 9 LT}
    {20.15.12 20.19.12 LT}
    {20.15.12 6.99.99 GT}
    {15k19 15w12 LT}
} {
    lassign $test x y exp
    set got [f $x $y]
    switch $got {
        -1 {set got LT}
        0 {set got EQ}
        1 {set got GT}
    }
    set status [expr {$exp eq $got ? "ok" : "NOT OK"}]
    puts [format "%-6s x: %-8s y: %-8s expected: %s got: %s" $status $x $y $exp $got]
}
C (GCC), 230 178 176 bytes
edit: - bytes thanks to c--, -6 thanks to pan
*p="1234567890";i,j;f(char*x,char*y){x[i=strcspn(x,p)]+y[j=strcspn(y,p)]?strncmp(x,y,i>j?j:i)?:i-j?:(i=strspn(x+=i,p))-strspn(y+=j,p)?:atoi(x)-atoi(y)?:f(x+i,y+i):strcmp(x,y);}
Returns positive for GT, negative for LT and 0 for EQ (same output as strcmp)
* p = "1234567890";
i, j;
f(char * x, char * y) {
    x[i = strcspn(x, p)] + y[j = strcspn(y, p)] ? // locate the first digit in each string; true if either string has one
        strncmp(x, y, i > j ? j : i) ? : // the text before the first digit differs: return that comparison
        i - j ? : // one non-digit prefix is longer: the longer prefix wins
        (i = strspn(x += i, p)) - strspn(y += j, p) ? : // the longer run of digits wins
        atoi(x) - atoi(y) ? : // same length: the larger number wins
        f(x + i, y + i) : // numbers are equal: compare whatever follows them
        strcmp(x, y); // neither string has a digit: plain strcmp
}
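The chain of conditionals above can be read as a loop. The following Python sketch is only an illustration of that reading, not the author's code; the helper names `first_digit`, `digit_run`, `sign` and `natural_cmp` are made up for this example.

```python
DIGITS = "0123456789"

def first_digit(s):
    """Index of the first digit in s, or len(s) if there is none (like strcspn)."""
    for k, ch in enumerate(s):
        if ch in DIGITS:
            return k
    return len(s)

def digit_run(s):
    """Length of the leading run of digits in s (like strspn)."""
    k = 0
    while k < len(s) and s[k] in DIGITS:
        k += 1
    return k

def sign(a, b):
    """-1, 0 or 1 depending on how a compares with b."""
    return (a > b) - (a < b)

def natural_cmp(x, y):
    while True:
        i, j = first_digit(x), first_digit(y)
        if i == len(x) and j == len(y):
            return sign(x, y)                 # no digits left: plain comparison
        m = min(i, j)
        c = sign(x[:m], y[:m])                # text before the first digit
        if c:
            return c
        if i != j:
            return sign(i, j)                 # longer non-digit prefix wins
        x, y = x[i:], y[j:]
        n, k = digit_run(x), digit_run(y)
        if n != k:
            return sign(n, k)                 # longer run of digits wins
        c = sign(int(x[:n]), int(y[:n]))      # same length: compare the values
        if c:
            return c
        x, y = x[n:], y[n:]                   # equal numbers: continue after them
```

Unlike `atoi`, Python integers do not overflow, so this sketch does not share the C version's limit on the size of the embedded numbers.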
Previous attempt:
int f(char*x,char*y){int c,i,j;for(;;){i=strcspn(x,"1234567890");j=strcspn(y,"1234567890");if(!*(x+i)&&!*(y+j)) return strcmp(x,y);if(c=strncmp(x,y,i>j?j:i))return c;if(c=atoi(x+=i)-atoi(y+=j))return c;for(;isdigit(*x);x++,y++);}}
05AB1E, 16 15 bytes
ÙΣ.γd}εÐdigs»]k
Input as a pair of strings. Outputs [0,1] for \$LT\$; [1,0] for \$GT\$; and [0] for \$EQ\$.
Try it online or verify all test cases.
Explanation:
Ù # Uniquify the (implicit) input-pair (for the EQ test cases)
Σ # Sort the pair by:
.γ # Adjacent group the substrings by:
d # Is it a (non-negative) number
}ε # After the adjacent-group-by: map over each part:
Ð # Triplicate the current part
di # Pop one, and if it's a (non-negative) number:
g # Pop another, and push its length
s # Swap so the number is before the length on the stack
» # Join the stack (the length & number) with newline delimiter
] # Close the if-statement, map, and sort-by
k # Get the indices of this sorted pair into the (implicit) input-pair
# (which is output implicitly as result)
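As a rough illustration of the explanation above, here is a Python sketch of the same idea: group adjacent characters by whether they are digits, replace each digit run with its length joined to the run by a newline, sort by that key, and report the positions of the sorted strings in the original pair. The names `sort_key` and `compare` are hypothetical, and the sketch assumes that comparing Python lists of strings orders the keys the same way 05AB1E's sort-by does.

```python
from itertools import groupby

def sort_key(s):
    parts = []
    # Adjacent characters are grouped by "is this a digit?".
    for is_digit, run in groupby(s, key=str.isdigit):
        run = "".join(run)
        # Digit runs become "length\nrun"; other runs are kept unchanged.
        parts.append(f"{len(run)}\n{run}" if is_digit else run)
    return parts

def compare(pair):
    unique = list(dict.fromkeys(pair))       # uniquify, keeping order
    ordered = sorted(unique, key=sort_key)   # sort by the transformed parts
    return [pair.index(s) for s in ordered]  # indices into the input pair
```

Because the length is compared as text in this sketch, it is only meaningful for digit runs shorter than ten characters.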
Jelly, 23 20 18 bytes
œ-œlLɗÐƤ€ØDoṚ$ż"¹Ġ
Takes a list of two inputs, and returns [[1], [2]] for LT, [[1, 2]] for EQ, and [[2], [1]] for GT. (The test footer converts these because my brain couldn't handle checking the test cases otherwise.)
A band-aid fix to a solution that otherwise always compares digits as greater than non-digits.
ÐƤ For every prefix (largest first) of
€ each of the inputs,
L get the length of
œ- the multiset difference of the suffix and
œl ɗ ØD the suffix with leading digits removed.
o Replace zeroes in either result with
Ṛ$ corresponding elements of the other result,
ż"¹ then zip each result with the corresponding input.
Ġ Group indices, sorted by value.
Retina 0.8.2, 49 bytes
\d+
$.&$*10$&
^
$%'¶
O`¶.*
^(.*)(¶\1)*(¶.*)*$
$#2
Try it online! Takes newline-separated input and outputs 0, 1 or 2 for GT, LT and EQ respectively, but the link is to a test suite that splits on tabs and translates the output to >, < or = for convenience. Explanation:
\d+
$.&$*10$&
Prefixes each run of digits with a run of 1s of the same length and a 0. This maintains lexicographical sort order when comparing numbers with non-numbers while making numbers sort by length and then lexicographically.
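A small Python sketch of this substitution may make the effect easier to see. The function name `retina_key` is made up for this example, and `[0-9]` is used in place of `\d` to keep the sketch to ASCII digits.

```python
import re

def retina_key(s):
    # Each run of digits gets a prefix of as many 1s as it has digits,
    # then a 0, then the original digits.
    return re.sub(r"[0-9]+", lambda m: "1" * len(m[0]) + "0" + m[0], s)
```

With this key, `ab10c` becomes `ab11010c` and `ab9x` becomes `ab109x`, so a plain string comparison puts the two-digit number after the one-digit number.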
See below for the explanation of the rest of the program. Previous 44 byte version compared digit strings numerically rather than by length:
\d+
$*10
^
$%'¶
O`¶.*
^(.*)(¶\1)*(¶.*)*$
$#2
Try it online! Takes newline-separated input and outputs 0, 1 or 2 for GT, LT and EQ respectively, but the link is to a test suite that splits on tabs and translates the output to >, < or = for convenience. Explanation:
\d+
$*10
Replace all embedded runs of digits with a run of that number of 1s followed by a 0. This maintains lexicographical sort order when comparing numbers with non-numbers while making numbers sort numerically.
^
$%'¶
Make a copy of the first string.
O`¶.*
Sort the modified strings lexicographically.
^(.*)(¶\1)*(¶.*)*$
$#2
Count how many strings match the first string. If both strings were equal, then they will both match. If the first string was less than the second, then it will match and the other string will not. If the first string was greater than the second, then there will not be any additional matches.
I include the previous newline in the sort string as non-empty strings are easier to process in Retina, but note that the above rule works whether or not the copy is included in the sort; with the copy, the possible results are EEE (all equal), FFS (first sorts before second), SFF (first sorts after second), while without the copy, the third result is FSF, where the first two strings are still different.
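The sort-and-count step can be sketched in Python as follows. This is an illustration only: `retina_result` is a hypothetical name, it expects the two strings after the digit substitution has been applied, and it assumes Python's string ordering matches the ordering Retina uses for the sort.

```python
def retina_result(first, second):
    # Line 1 is the untouched copy of the first string; the remaining
    # two lines are the first and second strings in sorted order.
    lines = [first] + sorted([first, second])
    count = 0
    # Count how many of the following lines repeat line 1 consecutively.
    for line in lines[1:]:
        if line != lines[0]:
            break
        count += 1
    return count  # 0 for GT, 1 for LT, 2 for EQ
```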
APL(Dyalog Unicode), 27 bytes SBCS
(⍋>⍒)-⍤⌈⍥≢/↑¨¨'\d+|.'⎕S'&'¨
Tacit function that returns 0 1, 0 0, 1 0 for LT, EQ, GT, respectively. Input f x y.
'\d+|.'⎕S'&'¨ split into numbers/others
-⍤ ↑¨¨ left-pad splits to
⌈⍥≢/ max length of inputs
(⍋>⍒) compare
JavaScript (Node.js), 74 72 bytes
Saved 2 bytes thanks to Arnauld! Bugfix thanks to tsh.
x=>y=>((a=(g=s=>b=s.replace(/\d+/g,n=>0+n.padStart(16)))(x))>g(y))-(a<b)
Try it online! Gives -1 for LT, 0 for EQ, and 1 for GT. Call as f(a)(b).
Assumes the integers in the strings are safe integers (in this case, positive integers ≤ \$2^{53}-1\$, i.e., 16 decimal digits or fewer). This transforms both input strings by left-padding each such integer with spaces to 16 characters and prepending a zero, then uses the native string comparison, which gives the correct results. (Padding with spaces rather than 0s distinguishes 0 from 000; in such cases, the length of the digit string should be taken into consideration.)
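For readers less familiar with `padStart`, the same transformation can be sketched in Python with `str.rjust`. The names `pad_key` and `compare` are hypothetical, and the sketch keeps the same 16-digit assumption as the answer.

```python
import re

def pad_key(s):
    # Left-pad each run of digits with spaces to 16 characters and
    # put a "0" in front of it.
    return re.sub(r"[0-9]+", lambda m: "0" + m[0].rjust(16), s)

def compare(x, y):
    a, b = pad_key(x), pad_key(y)
    return (a > b) - (a < b)  # -1 for LT, 0 for EQ, 1 for GT
```

A space sorts before every digit, so a shorter digit run always compares as smaller than a longer one, which is why `0` and `000` are not treated as equal.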
Raku, 39 bytes
&[cmp]o*».&{[m:g/\d+|./».&{+$_//$_}]}
*».&{ ... } is an anonymous function that maps its list-of-strings argument over the braced expression.
m:g/\d+|./ breaks a string up into a list of matches, each either a group of digits or a single other character.
».&{ ... } maps each of those lists over the braced expression.
+$_ // $_ tries to convert each match into a number with the + operator. If that fails, the defined-or operator // replaces the error with the original value.
[ ... ] wraps each list in an Array.
&[cmp] is a reference to the built-in cmp operator, which operates on arrays of heterogeneous data types just as specified in the problem statement. It returns one of the enumerated values Less, Same, or More.
o composes those two functions together.
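Python has no built-in comparison for mixed integers and strings, so a sketch of the same approach has to spell that rule out. The version below is an approximation under a stated assumption: two integers compare numerically, and any other pair is compared as strings, which is how this sketch models Raku's `cmp` on mixed types. The names `tokenize`, `cmp_item` and `natural_cmp` are made up for this example.

```python
import re

def tokenize(s):
    # Split into runs of digits (converted to int) and single other characters.
    tokens = re.findall(r"[0-9]+|.", s, re.S)
    return [int(t) if "0" <= t[0] <= "9" else t for t in tokens]

def cmp_item(a, b):
    # Assumption: numbers compare numerically, anything else as strings.
    if not (isinstance(a, int) and isinstance(b, int)):
        a, b = str(a), str(b)
    return (a > b) - (a < b)

def natural_cmp(x, y):
    tx, ty = tokenize(x), tokenize(y)
    for a, b in zip(tx, ty):
        c = cmp_item(a, b)
        if c:
            return c
    # All shared positions are equal: the shorter token list is smaller.
    return (len(tx) > len(ty)) - (len(tx) < len(ty))
```

Because digit runs are converted to numbers, this sketch treats `007` and `7` as equal.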
Perl 5, 41 bytes
sub{s/\d+/$&+1e9.$&/ge for@_;pop cmp pop}
Returns -1 for GT, 0 for EQ and 1 for LT.
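The substitution can be pictured with a short Python sketch. It is an interpretation of the one-liner, not a translation checked against Perl: each run of digits is replaced by its value plus 1e9 followed by the original digits, which gives every number below 9e9 a ten-digit key that sorts numerically under string comparison. The names `perl_key` and `perl_result` are hypothetical.

```python
import re

def perl_key(s):
    # value + 1e9, then the original digits (mirrors $&+1e9.$&)
    return re.sub(r"[0-9]+", lambda m: str(int(m[0]) + 10**9) + m[0], s)

def perl_result(x, y):
    a, b = perl_key(x), perl_key(y)
    # "pop cmp pop" takes the last argument first, so the second
    # string is compared against the first.
    return (b > a) - (b < a)
```

The reversed operand order is what produces -1 for GT and 1 for LT.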
Python, 95 bytes
-X bytes thanks to WheatWizzard and Conor O'Brian
def f(x):a,b=[[(len(i),i)for i in re.findall('\d+|.',y)]for y in x];return(a>b)-(a<b)
import re
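An ungolfed reading of the function above, as a sketch: every token (a run of digits or a single other character) is paired with its length, so that digit runs compare by length first and then by their text. The name `natural_cmp` is made up for this example; like the golfed version, it takes the two strings as one pair.

```python
import re

def natural_cmp(pair):
    # Build one list of (length, token) tuples per input string.
    a, b = (
        [(len(token), token) for token in re.findall(r"\d+|.", s)]
        for s in pair
    )
    # Lists of tuples compare element by element.
    return (a > b) - (a < b)  # -1 for LT, 0 for EQ, 1 for GT
```

For example, `natural_cmp(("ab10c", "ab9x"))` compares `(2, "10")` with `(1, "9")` at the third token and reports the first string as greater.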