Bytes | Lang | Time | Link
195 | Swift 5.9 | 240403T004035Z | macOSist
273 | Python 3 | 240403T052819Z | Minko_Mi
103 | Perl 5 -pF\" | 240403T144512Z | Xcali
78 | QuadR i≡ | 240403T043942Z | Adá
111 | Retina 0.8.2 | 240403T083840Z | Neil

Swift 5.9, 195 bytes (previously 246, 243, 224)

let f={($0+"").replacing(/\b(?i:um|uh|like|you know)([., ])/){" "+$0.1}.replacing(/\ +(,| )/){$0.1}.trimmingPrefix(" ").replacing(/(^|\. )(\w)([\w-]+)/){$0.1+$0.2.uppercased()+$0.3.lowercased()}}

If only Swift regexes supported lookbehinds...

Python 3, 273 bytes

import re
s = input().split(". ")
g = re.sub
def f(x):
    x=g(r'\s*,',',',g(r'\s+',' ',g(r'(?<!\")\b(um|uh|like|you know)\b(?!\")', '', x.lstrip(), flags=re.IGNORECASE)))
    return x[0].upper()+x[1:]+"."
s = [f(x) for x in s]
s[-1] = s[-1][:-1]
print(" ".join(s))

I'm not sure whether I'm printing the result correctly, or whether printing is required at all.

Perl 5 -pF\", 103 bytes

map++$i%2&&s/\b(um|uh|like|you know)\b//gi,@F;$_=join'"',@F;s/^ *| +?(?=,| )//g;s/(^|\. )\K\S+/\u\L$&/g

Try it online!

QuadR i≡, 78 bytes (previously 80)

−2 thanks to inspiration from Neil's Retina answer.

"[^"]*"
^ +| (?= |,)|\b(um|uh|like|you know)\b
(^|\. )([a-z])(\w*)
&

\1\u2\l3

Try it online!

Will be explained once the spec is fully settled.

This is equivalent to the Dyalog APL function '"[^"]*"' '^ +| (?= |,)|\b(um|uh|like|you know)\b' '(^|\. )([a-z])(\w*)'⎕R'&' '' '\1\u2\l3'⍠1⍣≡

Retina 0.8.2, 111 bytes

i`\b(um|uh|like|you know)\b(?=(([^"]*"){2})*[^"]*$)

^ +| +( |,)
$1
T`L`l`(^|\. )[^ .]+
T`l`L`(^|\. )[^\w .]*\w

Try it online! Link includes test cases. Explanation:

i`\b(um|uh|like|you know)\b(?=(([^"]*"){2})*[^"]*$)

Delete filler words that aren't quoted.
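
For readers who don't use Retina, here is a rough Python sketch of the same lookahead trick. It assumes the double quotes in the input are balanced, so a filler word is outside quotes exactly when an even number of quotes follows it. The names FILLER and drop_unquoted_fillers are made up for illustration and are not part of the answer.

```python
import re

# A filler word counts as unquoted when the rest of the text contains an even
# number of double quotes (assumption: quotes are balanced).
FILLER = re.compile(
    r'\b(um|uh|like|you know)\b(?=(([^"]*"){2})*[^"]*$)',
    re.IGNORECASE,
)

def drop_unquoted_fillers(text: str) -> str:
    """Delete filler words outside double quotes; spacing is left untouched."""
    return FILLER.sub("", text)

print(drop_unquoted_fillers('Um, he said "um, like, no" and, like, left.'))
# prints: , he said "um, like, no" and, , left.
```

The stray leading comma and doubled spaces are expected at this point; the later stages deal with whitespace and capitalisation.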

^ +| +( |,)
$1

Delete leading spaces and spaces before spaces or commas.
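
The same stage might look like this in Python. This is only a sketch under the assumption that the input is a single line; tidy_spaces is a hypothetical name. The callback returns the captured space or comma, or nothing when the leading-spaces branch matched, which mirrors the $1 replacement.

```python
import re

def tidy_spaces(text: str) -> str:
    """Drop leading spaces and any spaces that precede a space or a comma."""
    # Group 1 is unset when the '^ +' branch matched, so fall back to "".
    return re.sub(r'^ +| +( |,)', lambda m: m.group(1) or "", text)

print(tidy_spaces("  , he said  hi , ok"))
# prints: , he said hi, ok
```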

T`L`l`(^|\. )[^ .]+
T`l`L`(^|\. )[^\w .]*\w

Title case the first word of each sentence.
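
A Python approximation of the two transliteration stages, again just a sketch with a hypothetical function name: the first substitution lowercases the first word of each sentence, and the second uppercases its first word character, skipping any leading punctuation such as an opening quote.

```python
import re

def title_first_words(text: str) -> str:
    """Title-case the first word of each sentence (sentences split on '. ')."""
    # Stage 1: lowercase the whole first word of every sentence.
    text = re.sub(r'(^|\. )[^ .]+', lambda m: m.group(0).lower(), text)
    # Stage 2: uppercase the first word character, allowing leading punctuation.
    return re.sub(r'(^|\. )[^\w .]*\w', lambda m: m.group(0).upper(), text)

print(title_first_words('hELLO there. "wow" he said. ok'))
# prints: Hello there. "Wow" he said. Ok
```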