Bytes  Lang      Time                  Link
363    C++       2017-06-28 01:08:53Z  Jerry Je
335    Python 3  2022-03-21 17:42:17Z  des54321
338    Python 2  2015-11-20 20:50:35Z  TFeld

C++, 363 bytes

Thanks to @ceilingcat for some very nice golfing; the answer is now even shorter.

#import<bits/stdc++.h>
#define b t.push_back(l.substr(0
using namespace std;main(int p,char**a){int i=0,c=atoi(a[1]),w=atoi(a[2]);deque<string>t;for(string l;getline(cin,l);)for(;p=l.find_last_of(" \n",w),l=~p&&p>w-4?b,p)),&l[p+1]:w/l.size()?b,999)),"":(b,w-1)+"-"),&l[w-1]),l[0];);for(p=~-t.size()/c+1;i<p*c;)cout<<left<<setw(w+4)<<t[i/c+i++%c*p]<<"\n"+(i%c>0);}

Python 3, 336 339 335 bytes

@Jerry Jeremiah's recent activity on this question inspired me to try it myself and see whether Python 3 could beat the old Python 2 answer. After some painful golfing, I managed to get this slightly shorter:

def F(c,w,s):
 r,R,i,e=[""],[],0," "
 for j in s.split(e):
  while 1:
   if(a:=len(r[i]))+len(j)<=w:r[i]+=j.replace("\n",e*(w-a))+e;break
   elif a+min(3,w)>w:r[i]+=e*(w-a+1)
   else:r[i]+=j[:w-a-1]+"- ";j=j[w-a-1:]
   i+=1;r+=[""]
 h=-(-len(r)//c);r+=[e]*c;exec(c*"R.append(r[:h]);del r[:h];");return"\n".join(map("   ".join,zip(*R)))

Takes three arguments: columns, width, and text. Despite my best efforts to fix it, every newline in the input must have a leading and a trailing space.
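
One way to satisfy the newline requirement is to preprocess the text before calling F. The following is only a sketch: pad_newlines is a hypothetical helper that is not part of the answer, and it assumes single newlines (blank lines made of consecutive newlines are not specially handled).

```python
import re

def pad_newlines(text):
    # Hypothetical helper (not part of the golfed answer): make sure every
    # newline has exactly one space on each side, so that splitting on " "
    # yields "\n" as a word of its own.
    return re.sub(r" *\n *", " \n ", text)

print(pad_newlines("one\ntwo").split(" "))  # ['one', '\n', 'two']
```

The result could then be passed as the third argument, for example F(2, 10, pad_newlines(text)).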

Edit +3 bytes: Thanks @jezza_99 for pointing out that it was erroring out. It turns out my golf of assigning w-a to b didn't work, because the assignment wasn't in code that ran on every loop iteration.

Edit -4 bytes: -3 from @Sylvester Kruin spotting that I could save bytes by defining a variable for " ", and another -1 from cleaning up the line that pads r so that zip() doesn't truncate the text when some of the columns in the final row aren't filled.

Ungolfed code and explanation

def F(c,w,s):
    r,R,i=[""],[],0
    for j in s.split(" "): # Loop through each word in the input
        while 1: # Until this word has been fully added
            if (a:=len(r[i]))+len(j)<=w: # If there's room for the full word
                r[i]+=j.replace("\n"," "*(w-a))+" " # Add it plus a trailing space; if the word is a newline, replace it with enough spaces to finish the line
                break # Move on to the next word
            elif a+min(3,w)>w: # If fewer than min(3,w) characters remain on this line
                r[i]+=" "*(w-a+1) # Pad the rest of the line with spaces
            else: # Need to hyphenate
                r[i]+=j[:w-a-1]+"- " # Add as much of the word as fits, plus the hyphen
                j=j[w-a-1:] # Trim off the part of the word we just added
            i+=1 # Jump to the next line
            r+=[""] # Initialize that line to the empty string
    h=-(-len(r)//c) # Determine the column height (ceiling division)
    r+=[""]*c # Pad with empty strings so zip() doesn't truncate the text if some of the columns are incomplete
    for j in range(c): # Convert the flat list into a list containing each column
        R.append(r[:h])
        del r[:h]
    return "\n".join( # Join the output lines with newlines
                    map("   ".join, # Use map() to .join the columns of each line with 3 spaces
                                    zip(*R))) # Zip the columns to get a list of output lines
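
The last few steps (ceiling division for the column height, padding, and transposing with zip()) can be tried in isolation. This is a minimal sketch rather than the answer's exact code: to_rows is a hypothetical name, and it pads only as many entries as needed instead of appending c of them.

```python
def to_rows(lines, c):
    # Ceiling division: -(-n // c) rounds n / c up, giving the column height.
    h = -(-len(lines) // c)
    # Pad so that the last column is as tall as the others.
    padded = lines + [""] * (h * c - len(lines))
    # Cut the flat list into c columns of height h.
    cols = [padded[k * h:(k + 1) * h] for k in range(c)]
    # zip(*cols) transposes the columns into output rows.
    return list(zip(*cols))

lines = ["a", "b", "c", "d", "e"]
print(to_rows(lines, 2))                 # [('a', 'd'), ('b', 'e'), ('c', '')]
# Without padding, zip() stops at the shortest column and "c" is lost:
print(list(zip(lines[:3], lines[3:])))   # [('a', 'd'), ('b', 'e')]
```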

Python 2, 346 338 bytes

i,C,W=input()
r=[]
for l in [x or' 'for x in i.split('\n')]:
 while l:
  if' '==l[0]:l=l[1:]
  w=l[:W];x=W;s=w.rfind(' ')+1
  if max(W-3,0)<s:w=w[:s];x-=W-s
  elif x<len(l)and' 'not in l[x:x+1]:w=w[:-1]+'-';x-=1
  r+=[w];l=l[x:]
r=[s.ljust(W)for s in r+['']*(C-1)]
print'\n'.join('    '.join(s)for s in zip(*zip(*[iter(r)]*((len(r))/C))))

Input as 'string',C,W
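
For context, Python 2's input() evaluates the line it reads as a Python expression, which is why a single line such as 'some text',2,10 unpacks directly into i,C,W. A rough Python 3 sketch of the same input format, assuming ast.literal_eval as a stand-in for that evaluation (parse_args is a hypothetical name, not part of the answer):

```python
import ast

def parse_args(line):
    # Assumption: the line has the form 'string',C,W as described above.
    # literal_eval only accepts literals, so it parses the tuple without
    # executing arbitrary code.
    text, columns, width = ast.literal_eval(line)
    return text, columns, width

print(parse_args("'some text',2,10"))  # ('some text', 2, 10)
```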