g | x | w | all

Bytes	Lang	Time	Link
188	Python	250529T203613Z	Willow W
425	Python3	250511T160158Z	Ajax1234
182	JavaScript ES7	230220T021908Z	Arnauld
021	05AB1E	230220T171017Z	The Thon
195	JavaScript Node.js	230220T155132Z	l4m2
017	Vyxal	230220T131451Z	AndrovT
044	Charcoal	230220T100330Z	Neil
051	K ngn/k	230220T094540Z	ovs

Python, 188 bytes

P=lambda A,n:n and[[A[:i]]+B for i in range(1,len(A)-n+1)for B in P(A[i:],n-1)]or[[A]]
f=lambda L,c:min(P(sorted(L),c-1),key=lambda X:sum((y-sum(Y)/len(Y))**2/len(Y)for Y in X for y in Y))

Attempt This Online!

The function P(A,n) recursively computes all ordered partitions of a list A into n+1 clusters, meaning P([0,1,2,3,4],1) will contain [[0],[1,2,3,4]] and [[0,1,2],[3,4]] for example, but not [[0,1,3],[2,4]]. Since we're trying to minimize the variance, unordered partitions can always be improved by an ordered partition (as long as we sort the input list L), so we only need the check those.

f(L,c) is the function that find the best partition. It takes the list L and the number of clusters c as inputs, computes all the ordered partitions with P(sorted(L),c-1), and returns the partition X that minimizes the sum of variances sum((y-sum(Y)/len(Y))**2/len(Y)for Y in X for y in Y). Of course, since all partitions have the same number of clusters, minimizing this sum is the same as minimizing the mean variance.

Python3, 425 bytes

Long, but designed to be efficient on larger values of \$l\$ and \$c\$.

v=lambda x:sum(sum((i-sum(j)/len(j))**2 for i in j)/len(j)for j in x)
def f(s,c):
 L,V=0,-1
 q=[(dict(enumerate(s)),[],[])]
 for a,b,s in q:
  if len(s)+1==c and not a:
   W=v(s+[b])
   if V==-1 or W<V:L=s+[b];V=W
   continue
  if len(s)+1>c:continue
  for i in[*a]:
   A={**a}
   if[]==b or A[i]>=b[-1]:
    O=[(b+[K:=A.pop(i)],[]),([K],[b]*bool(b))]
    for o,B in O:
     if V==-1 or v(s+B+[o])<=V:q+=[(A,o,s+B)]
 return L

Try it online!

JavaScript (ES7), 182 bytes

Expects (list)(c).

a=>m=c=>eval("for(q=c**a.length;p=a.slice(-c).map((_,j)=>a.filter((_,i)=>!(q/c**i%c^j))),q--;)p.some(A=>A.map(v=>s+=(v-eval(A.join`+`)/w)**2/w,w=A.length)=='',s=0)|s>m||(m=s,o=p);o")

Try it online!

Commented

This is a version without eval() for readability.

a =>                     // a[] = input list
m =                      // initialize m to a non-numeric value
c => {                   // c = number of clusters
  for(                   // main loop:
    q = c ** a.length;   //   start with q = c ** a.length
    p = a.slice(-c)      //   p[] = partition of length c
    .map((_, j) =>       //   for each value in p[] at index j:
      a.filter((_, i) => //     for each value in a[] at index i:
        !(               //       keep the value if
          q / c ** i % c //       floor(q / c ** i) mod c
          ^ j            //       is equal to j
        )                //
      )                  //     end of filter()
    ),                   //   end of map()
    q--;                 //   stop once q = 0 has been processed
  )                      //
  p.some(A =>            //   for each array A[] in p[]:
    A.map(v =>           //     for each value v in A[]:
      s += (             //       add to s:
        v -              //         v minus
        eval(A.join`+`)  //         the sum of all values in A[]
        / w              //         divided by w (the length of A[])
      ) ** 2 / w,        //       squared and divided by w again
      w = A.length       //       initialize w
    )                    //     end of map()
    == '',               //     trigger the some() if A[] was empty
    s = 0                //     start with s = 0
  )                      //   end of some(); if the result is falsy
  | s > m ||             //   and we don't have s > m:
    (m = s, o = p);      //     update the minimum and the output array
                         // (implicit end of for)
  return o               // return the output array
}                        //

05AB1E, 21 bytes

{.œʒg²Q}ΣεDÅA-nÅA}O}н

Try it online!

Port of AndrovT's Vyxal answer.

Explanation

{.œʒg²Q}ΣεDÅA-nÅA}O}н  # Implicit input
{                      # Sort the first input
 .œʒ   }               # Filter its partitions by:
    g                  #  Length of list
     ²Q                #  Equals the second nput
        Σ          }н  # Minimum by:
         ε       }     #  Map:
           ÅA          #   Get mean
          D  -         #   Subtract
              n        #   Square
               ÅA      #   Get mean
                  O    #  Sum
                       # Implicit output

JavaScript (Node.js), 195 bytes

f=([c,...x],n,L=P=x.slice(-n).map(_=>[]))=>1/c?L.map((_,i)=>f(x,0,L.map(x=>i--?x:[...x,c])))&&P:P=B(P)<B(L)?P:L;A=x=>x.map(t=>X+=++Y&&t,X=Y=0)+x?X/Y:1/0;B=L=>A(L.map(x=>A(x.map(v=>(v-A(x))**2))))

Try it online!

Vyxal, 17 bytes

søṖ'L⁰=;‡ƛṁ-²ṁ;∑∵

Try it Online!

s                 # sort
 øṖ               # all partitions
   '   ;          # filter by:
    L             #   length
      =           #   is equal to
     ⁰            #   last input
        ‡       ∵ # minimum by:
         ƛ    ;   #   map:
          ṁ-      #     subtract mean
            ²     #     square
             ṁ    #     mean
               ∑  #   sum

Charcoal, 45 44 bytes

Ｉ⊟⌊ＥΦＥＸηＬθＥηΦθ⁼λ﹪÷ιＸηξη⌊ι⟦ΣＥι∕ΣＥλΣＸ⁻νλ²ＸＬλ²ι

Try it online! Link is to verbose version of code. Explanation:

ＥＸηＬθＥηΦθ⁼λ﹪÷ιＸηξη

All permutations of the input list elements into c clusters...

Φ...⌊ι

... where no cluster is empty...

Ｅ...⟦ΣＥι∕ΣＥλΣＸ⁻νλ²ＸＬλ²ι

... calculate the doubled variance of each cluster...

Ｉ⊟⌊...

... output the clusters with the minimum.

Edit: Saved 1 byte by calculating the doubled variance using the last alternative variance formula for finite populations:

$$ \textrm{Variance(X)} = \frac{1}{2N^2} \cdot \sum_{i=1}^{N} \sum_{j=1}^{N} (x_i - x_j)^2 $$

K (ngn/k), 51 bytes

{.y@*<+/'x''a*a:y-x''y@:='+z\&z=#'?'+!z|^y}{+/x%#x}

Try it online!