| Bytes | Lang | Time | Link |
|---|---|---|---|
| nan | Rust | 250421T021147Z | 138 Aspe |
| 066 | Wolfram Language Mathematica | 220217T025326Z | att |
| 031 | 05AB1E | 210402T164804Z | Kevin Cr |
| 141 | JavaScript ES6 | 210314T231648Z | Arnauld |
| 205 | Python 3 | 210315T215754Z | Noodle9 |
| 030 | Brachylog | 210315T153252Z | xash |
| nan | J | 210315T031226Z | Jonah |
| 077 | Retina | 210315T114116Z | Neil |
Rust, 740 631 619 608 516 500 489 483 bytes
Saved 109+12+11+92+16+11+6=257 bytes thanks to @ceilingcat
Golfed version. Attempt This Online!
use std::collections::HashSet;fn f(I:&str)->Vec<usize>{let(n,mut a,mut r,mut u,mut c)=(I.len(),vec![],vec![],HashSet::new(),0);for i in 0..n{for j in i..n{a.push(I[i..=j].into());c+=1}}let b=a.clone();for z in a{u.insert(z);}while c>0{r.push(c);let mut s=HashSet::new();for x in u{for y in&b{let p=format!("{x}{y}{x}");if I.contains(&p){s.insert(p);}}}c=0;for p in&s{if""<p&&p.len()<=n{let mut h=HashSet::new();for P in 0..n{h.insert(I[P..].find(p).map(|i|P+i));}c+=h.len()-1}}u=s}r}
Ungolfed version. Attempt This Online!
use std::collections::HashSet;
use std::iter::FromIterator; // Required for collect::<HashSet<_>>()
/// Counts occurrences of a substring in text using the Python find-based logic.
/// This method correctly counts overlapping occurrences based on the start index found.
///
/// Equivalent to the Python `count_occurrences_original_method`.
fn count_occurrences(text: &str, substring: &str) -> usize {
if text.is_empty() || substring.is_empty() {
return 0;
}
let text_length = text.len();
let sub_length = substring.len();
// If substring is longer than text, it cannot be found
if sub_length > text_length {
return 0;
}
let mut found_indices_set: HashSet<Option<usize>> = HashSet::new();
// Iterate through possible start positions in the text
for start_pos in 0..text_length {
// text.find() searches the *entire* string starting from byte 0.
// We need the equivalent of Python's text.find(substring, start_pos),
// which finds the first occurrence *at or after* start_pos.
// We can achieve this by searching within the slice starting from start_pos.
let search_slice = &text[start_pos..];
let found_in_slice = search_slice.find(substring);
// If found, map the index relative to the slice back to the original text's index.
let found_index_option = found_in_slice.map(|index_in_slice| start_pos + index_in_slice);
// Add the result (Some(index) or None) to the set.
// The Python code implicitly adds -1 if find fails after the last match;
// here, `None` represents that "not found from this start_pos onwards".
found_indices_set.insert(found_index_option);
}
// The number of actual occurrences is len(set) - 1.
// If substring is not found at all, the set will contain only `None`, len is 1. 1 - 1 = 0.
// If found k > 0 times, the set contains k `Some(index)` values + possibly `None`.
// The length will be k or k+1. Subtracting 1 gives k.
// Use saturating_sub to prevent underflow if len is 0 (shouldn't happen here).
let count = found_indices_set.len().saturating_sub(1);
count
}
/// Calculates a sequence of counts related to repeating substring patterns.
///
/// Equivalent to the Python `f_explicit`.
fn f_explicit_rust(input_string: &str) -> Vec<usize> {
let w = input_string;
let n_w = w.len();
// Handle empty string case early
if n_w == 0 {
return Vec::new();
}
// 1. Generate all possible substrings (owned Strings)
let mut all_substrings_list: Vec<String> = Vec::new();
for i in 0..n_w {
for j in i..n_w {
// Use byte slicing [i..=j] for direct equivalent of Python's w[i:j+1]
// This assumes simple ASCII/UTF-8 where byte indices align with characters okay.
// For complex Unicode, a char-based approach would be needed.
let substring = w[i..=j].to_string(); // Create an owned String
all_substrings_list.push(substring);
}
}
// 2. Keep a copy of the initial list of substrings ('v' in original)
// Need to clone as the original list might be consumed or modified implicitly later
let base_substrings_for_combination: Vec<String> = all_substrings_list.clone();
// 3. Initialize the result list
let mut results: Vec<usize> = Vec::new();
// Initial 'n' is the total count of all generated substrings (including duplicates)
let mut current_iteration_count = all_substrings_list.len();
// 4. Initialize the collection of *unique* substrings for pattern generation ('s' in original)
// We start with the unique set derived from the initial list.
// Convert the Vec<String> into a HashSet<String>. This consumes the Vec.
let mut current_unique_substrings: HashSet<String> = HashSet::from_iter(all_substrings_list);
// 5. Main loop: continues as long as the count is positive
while current_iteration_count > 0 {
// 6. Record the count for this iteration
results.push(current_iteration_count);
// 7. Generate the next set of pattern substrings ('s' update)
let mut next_pattern_substrings_set: HashSet<String> = HashSet::new();
// Iterate through unique substrings from the current generation ('sub1')
// Borrow elements from the set to avoid moving them.
for sub1 in current_unique_substrings{
// Iterate through all original substrings ('sub2' from 'base_substrings_for_combination')
// Borrow elements from the base list.
for sub2 in &base_substrings_for_combination {
// Form the combined pattern: sub1 + sub2 + sub1
// Using format! creates a new owned String.
let combined_pattern = format!("{}{}{}", sub1, sub2, sub1);
// Check if this pattern exists within the original input string 'w'
// `w.contains()` takes a &str pattern.
if w.contains(&combined_pattern) {
// Insert the owned String into the set.
next_pattern_substrings_set.insert(combined_pattern);
}
}
}
// 8. Recalculate the count 'n' for the *next* iteration based on the new set
let mut next_iteration_total_occurrences = 0;
// Iterate through the newly found unique patterns ('pattern_sub')
// Borrow elements from the set.
for pattern_sub in &next_pattern_substrings_set {
// Count occurrences of this pattern in the original string 'w'
// Pass string slices (&str) to the helper function.
let occurrences = count_occurrences(w, pattern_sub);
next_iteration_total_occurrences += occurrences;
}
// 9. Update state for the next iteration
current_iteration_count = next_iteration_total_occurrences;
// Move ownership of the newly created set to become the current set for the next loop.
current_unique_substrings = next_pattern_substrings_set;
}
// 10. Return the collected counts
results
}
fn main() {
let tests = vec![
"", "P", "MOM", "alfalfa", "aha aha!", "xxOOOxxxxxOOOxxx", "3123213332331112232",
"2122222111111121221", "13121311213121311212", "344112222321431121114",
"141324331112324224341", "344342224124222122433", "331123321122132312321",
"11221112211212122222122", "2222222112211121112212121", "AAAaaAAAAAAAaAaAAaAaAAAAA",
"1100000010010011011011111100", "1111221211122111111221211121",
];
let expected: Vec<Vec<usize>> = vec![
vec![], vec![1], vec![6, 1], vec![28, 8], vec![36, 9, 1], vec![136, 64, 9],
vec![190, 55, 1], vec![190, 80, 3], vec![210, 114, 25, 2], vec![231, 48],
vec![231, 49], vec![231, 61, 1], vec![231, 74], vec![276, 149, 4],
vec![325, 166, 6], vec![325, 198, 30], vec![406, 204], vec![406, 261, 56, 2],
];
let checker = ['\u{274c}', '\u{2713}']; // [Cross mark, Check mark]
println!("Testing Rust explicit function:");
for (t, e_vec) in tests.iter().zip(expected.iter()) {
let s_list = f_explicit_rust(t);
let success = s_list == *e_vec; // Compare the Vec<usize> results
println!("\"{}\" -> {:?} {}", t, s_list, checker[success as usize]);
// assert!(success); // Optionally panic on failure
}
}
Wolfram Language (Mathematica), 66 bytes
Counts@Level[z@@@Subsequences@#,-3]&
z[Longest@a___,__,a___]:=z@a!
Returns an Association with corresponding values. Prepend List@@ for list output.
05AB1E, 31 bytes
āεo<VŒε.œYùεĆ4ô€¨yªεÂQ}P}1å]O0Ü
Try it online or verify some more test cases (pretty slow, so the larger test cases are omitted in the test suite).
Explanation:
Step 1: Loop over the possible sizes \$n\$ of the \$Z_n\$ based on the input-length.†
We can calculate the valid \$n\$ to check with the following formula: \$m = \left\lceil\log_2(L)\right\rceil\$, where \$L\$ is the input-length, and the result \$m\$ indicates the range \$[1,m]\$ to check for Zimin 'words': \$Z_{n=1}^m\$.
In 05AB1E code, this would have resulted in g.²îLεo<VŒε.œYùεĆ4ô€¨yªεÂQ}P}1å]O0Ü (35 bytes): try it online.
† Since we have to strip trailing 0s at the end in case no \$Z\$ were found for them anyway, we simply don't calculate this maximum. Instead, we'll just check in the range \$[1,L]\$ for Zimin words. Less efficient in terms of performance, but 4 bytes shorter, which is all we care about with code-golf. :)
We can also calculate the amount of parts in the resulting Zimin 'word' (\$Z_n\$), which is the oeis sequence A000225: \$2^n-1\$.
ā # Push a list in the range [1, (implicit) input-length]
ε # Map over each of these integers:
o # Take 2 to the power this integer
< # Decrease it by 1
V # Pop and store this in variable `Y`
Step 2: For each of these amount of parts \$Y\$, we check how many substrings of the input are valid Zimin 'words' (\$Z_Y\$).
Step 2a: We do this by first generating all substrings of the input, and mapping over them:
Œ # Get all substrings of the (implicit) input-list
ε # Inner map over each substring:
Step 2b: Then we'll get all partitions of the current substring of exactly \$Y\$ amount of parts:
.œ # Get all partitions of this subtring
Yù # Only leave the partitions of `Y` amount of parts
Step 2c: We then check for each of these (correctly sized) partitions of this substring if it's a valid Zimin 'word'.
We do this by checking two things:
- Is the current list of parts in this partition a palindrome?
- Is every \$ABA\$ part within the current partition a palindrome?
As for step 2c2, we do so by first adding a dummy trailing item to the partition; then splitting the partition-list into sublists of size 4; then we remove the trailing item from each of these sublists; after which we can check for each \$[A,B,A]\$ sublist whether it's a palindrome.
ε # Inner map over each remaining partition:
Ć # Enclose the partition; append its own first part at the end
4ô # Split the partition into parts of size 4
۬ # Remove the final part of each quartet
yª # Append the full partition to this list of triplets as well
ε # Inner map over this list of lists:
# Check if the current list is a palindrome by:
 # Bifurcating: short for Duplicate & Reverse copy
Q # Check if the list and reversed copied list are the same
} # After this inner-most map:
P # Check if all were truthy by taking the product
Alternatively for step 2c2, Ć4ô€¨ could have been 4ô3δ∍ instead: Split the partition-list into sublists of size 4 (with 3 in the trailing sublist); shorten each sublist to size 3 by removing trailing item(s); same as above. Try it online.
Step 2d: And we then check if there is at least one valid Zimin 'word' (\$Z_Y\$) among the mapped partitions for this substring:
} # After the map over the partitions of the current substring:
1å # Check if any were truthy by checking if it contains a 1
# (NOTE: we can't use the max builtin `à` here like we usually do when
# we want to mimic an `any` builtin, because lists could be empty here,
# resulting in an empty string)
Step 3: Now we sum the amount of truthy results per size \$Y\$:
] # Close all remaining nested maps
O # Sum the inner-most lists together
Step 4: And finally we clean-up all trailing 0s (both the \$\geq m\$ sizes we've checked in our golfed approach, as well as any trailing \$Z\$ within the \$[1,m]\$ range that simply resulted in \$0\$), before outputting the result:
0Ü # Remove any trailing 0s from this list
# (after which the resulting list is output implicitly)
JavaScript (ES6), 148 142 141 bytes
Saved 6 bytes thanks to @user81655
Expects an array of characters.
f=(a,o=[],s='')=>a.map(c=>{for(s+=c,i=p='';s.match(`^${p+='(.+)'+p.replace(/\(.../g,_=>"\\"+n++)}$`);n=1)o[+i]=-~o[i++]})+a?f(a.slice(1),o):o
How?
For each substring in the input string, we apply the following patterns while they match:
P₀ ^(.+)$
P₁ ^(.+)(.+)\1$
P₂ ^(.+)(.+)\1(.+)\1\2\1$
P₃ ^(.+)(.+)\1(.+)\1\2\1(.+)\1\2\1\3\1\2\1$
...
Apart from the delimiters ^ and $ which are ignored below, these patterns are built recursively as follows:
\$P_0\$ is
(.+)\$P_{k+1}\$ is \$P_{k}\$, followed by
(.+), followed by \$P_k\$ with each \$n\$-th group(.+)replaced with the back reference\{n}(1-indexed)e.g. for \$P_2 \rightarrow P_3\$:
(.+)(.+)\1(.+)\1\2\1 -> (.+)(.+)\1(.+)\1\2\1 (.+)(.+)(.+)\1(.+)\1\2\1 \1 \2 \3
It's worth noting that:
- if a substring matches some \$P_i\$, it also matches all \$P_j,\:j<i\$
- if a substring doesn't match some \$P_i\$, it also won't match any \$P_j,\:j>i\$
We keep track of the results in the array o[], which is eventually returned. Whenever \$P_i\$ matches a substring, we increment o[i].
Python 3, 205 bytes
def f(w):
R=range(len(w));s=[w[i:j+1]for i in R for j in R[i:]];v=[*s];n=len(s);r=[]
while n:r+=[n];s={i+j+i for i in s for j in v if i+j+i in w};n=sum(len({w.find(a,b)for b in R})-1for a in s)
return r
Quite verbose.
Inputs a string and returns a list of the number of substrings that are instances \$Z_1, Z_2, \dots, Z_n\$.
Brachylog, 30 bytes
{s{h∧1|~cṪ↺₁hʰb=h↰ᶠ⌉+₁}ᵘ}ᶠcọtᵐ
{s … }ᶠfor every substring:{h∧1| …}on every non-empty string, return 1. Also try …~cinput split into groups, so that …Ṫ↺₁…b=it unifies with[A,B,A](this works, too, but is sadly longer).hʰB is non-emptyh↰ᶠtakeAand call the predicate recursively, finding all results⌉+₁get the highest number and add 1ᵘFind all unique values. So foraha ahawe get[1,2,3]from[aha aha, aha, a].cọtᵐjoin the results from all the substrings, count the occurrences of each number, and return them
J, 63 58 53 52 51 47 bytes
_1}.1#.a:~:[:([#~[e.&,(],,)&.>/)^:a:~@;<@(<\)\.
-2 bytes thanks to Bubbler
-3 bytes thanks to FrownyFrog for golfier way to create all substrings of a string.
how
We'll use MOM as our example:
[:...;<@(<\)\.All substrings of the input, boxed.┌─┬─┬─┬──┬──┬───┐ │M│O│M│MO│OM│MOM│ └─┴─┴─┴──┴──┴───┘(...)^:a:~Using the substrings as both the left and right args~, apply the verb in parens until a fixed point, keeping track of each iteration's results^:a:.(...(],,)&.>/)To create the next "level" of candidates, create all possible combinations that look like<level_n><orig substring><level_n>wherelevel_nis any element from the current level, and<orig substring>is one of the original input substrings (level 0). Then the 1st level looks like:┌─────┬─────┬─────┬───────┬───────┬─────────┐ │MMM │OMO │MMM │MOMMO │OMMOM │MOMMMOM │ ├─────┼─────┼─────┼───────┼───────┼─────────┤ │MOM │OOO │MOM │MOOMO │OMOOM │MOMOMOM │ ├─────┼─────┼─────┼───────┼───────┼─────────┤ │MMM │OMO │MMM │MOMMO │OMMOM │MOMMMOM │ ├─────┼─────┼─────┼───────┼───────┼─────────┤ │MMOM │OMOO │MMOM │MOMOMO │OMMOOM │MOMMOMOM │ ├─────┼─────┼─────┼───────┼───────┼─────────┤ │MOMM │OOMO │MOMM │MOOMMO │OMOMOM │MOMOMMOM │ ├─────┼─────┼─────┼───────┼───────┼─────────┤ │MMOMM│OMOMO│MMOMM│MOMOMMO│OMMOMOM│MOMMOMMOM│ └─────┴─────┴─────┴───────┴───────┴─────────┘[#~[e.&,...Check if each original substring is an element of this candidate list[e.&,, and use that to filter the original substrings[#~. In the above example, onlyMOMpasses.After the above process iterates to its fixed point (no more matches), the result is:
┌───┬─┬─┬──┬──┬───┐ │M │O│M│MO│OM│MOM│ ├───┼─┼─┼──┼──┼───┤ │MOM│ │ │ │ │ │ ├───┼─┼─┼──┼──┼───┤ │ │ │ │ │ │ │ └───┴─┴─┴──┴──┴───┘a:~:Since we don't want to count empty boxes, we perform an elementwise test for "not equal to the empty boxa:", giving us a 0-1 matrix:1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 01#.Sum rows:6 1 0_1}.And kill the last element:6 1
Retina, 77 bytes
K`(.+)
/\(/{G`
%"$+"~`^
w`
"¶0"(Am`^0
K`
$+2
(.*?\))+(.*)$
$&¶$&(.+)$2\$#1$2
Try it online! No test suite because of the advanced use of result history. Explanation:
K`(.+)
Start by considering matches of Z₁.
/\(/{`
Repeat while the working area contains a (.
G`
Do nothing. This is here solely to record the value at the start of each pass of the loop in the stage history, so that it can be referred to later on.
%"$+"~`^
w`
Processing each line of regex separately, evaluate it as an overlapped count command on the original input, thereby returning a list of counts.
"¶0"(`
If the list contains a 0 (strictly speaking this can only happen on the second pass, as I use a newline to check that it's a leading 0)...
Am`^0
... then delete all 0 values (including the first if the input string was empty), otherwise:
K`
$+2
Retrieve the value saved in step 2. (Because Retina numbers stages in post-order traversal, this is the G command.) Unfortunately $ doesn't work directly in a K command, so the easiest way to do this is to delete the result and then substitute the empty string with the saved value.
(.*?\))+(.*)$
$&¶$&(.+)$2\$#1$2
Append an additional regex that matches the next Zimin word.