Differentiability of Lipschitz functions

The regularity of a Lipschitz function \(f\colon \mathbb R^n\to \mathbb R\) is a rich topic. Lipschitz functions are continuous, but they need not be differentiable everywhere. However, it is not hard to convince yourself that the set of points of non-differentiability cannot be too large. Quantifying how large the non-differentiability set of a Lipschitz function can be was one of the motivating questions of Lebesgue’s development of measure theory.

Definition 18 (Total variation of a function)

Let \(f\colon [a,b]\to \mathbb R\). The total variation of \(f\), \(Vf\colon [a,b]\to [0,\infty]\) is defined by

\[Vf(x)= \sup \sum_{i=1}^n |f(t_i)-f(t_{i-1})|\]

where the supremum ranges over all \(n\in\mathbb N\) and all partitions \(a=t_0 < t_1 <\ldots <t_n=x\).

If \(Vf(b)<\infty\), \(f\) is said to have bounded variation (BV).
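The supremum in the definition can be explored numerically. The following is a minimal sketch, not part of the notes: it evaluates the partition sums for \(\sin\) on \([0,2\pi]\), whose total variation is \(4\). The helper names `variation_sum` and `uniform_partition` are hypothetical, and a sum over any single partition is only a lower bound for the supremum.

```python
import math

def variation_sum(f, points):
    """Sum of |f(t_i) - f(t_{i-1})| over consecutive points of a partition."""
    return sum(abs(f(t1) - f(t0)) for t0, t1 in zip(points, points[1:]))

def uniform_partition(a, x, n):
    """The n + 1 equally spaced points a = t_0 < ... < t_n = x."""
    return [a + (x - a) * i / n for i in range(n + 1)]

# A coarse partition misses the turning points of sin and undercounts.
coarse = variation_sum(math.sin, uniform_partition(0.0, 2 * math.pi, 3))

# A fine partition (which contains pi/2 and 3*pi/2 up to rounding)
# gives a sum very close to the total variation 4.
fine = variation_sum(math.sin, uniform_partition(0.0, 2 * math.pi, 1000))

print(coarse, fine)
```

Refining a partition never decreases the sum, by the triangle inequality, which is why the finer grid gives the larger value here.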

Definition 19 (Absolutely continuous function)

A function \(f\colon [a,b]\to \mathbb R\) is absolutely continuous (AC) if for any \(\epsilon>0\) there exists a \(\delta>0\) such that, for any pairwise disjoint intervals \((a_1,b_1),(a_2,b_2),\ldots\subset [a,b]\) with \(\sum_i |b_i-a_i|<\delta\), we have \(\sum_i |f(b_i)-f(a_i)|<\epsilon\).

Note that AC functions are BV, and Lipschitz functions are AC (see Example 33). Also, if \(f\) is BV then \(Vf\) and \(Vf-f\) are non-decreasing. If \(f\) is AC then so are \(Vf\) and \(Vf-f\), see Example 34.

Theorem 12 (Lebesgue)

Let \(f\colon [a,b]\to \mathbb R\) be absolutely continuous. Then \(f\) is differentiable \(\mathcal L^1\) almost everywhere. Moreover, for any \(x>y \in [a,b]\),

\[f(x)-f(y)= \int_y^x f' \, \mathrm{d}\mathcal L^1.\]

Proof. By Example 34, \(f=Vf-(Vf-f)\) is a difference of two non-decreasing AC functions, so it suffices to assume that \(f\) is non-decreasing. In this case define a measure \(\mu\) on \([a,b]\) using the Carathéodory construction with \(F\) the set of compact intervals and \(\zeta([c,d])= f(d)-f(c)\). This defines a finite Borel measure such that \(\mu([c,d])=f(d)-f(c)\) for all intervals \([c,d]\subset [a,b]\). Indeed, for any \(\delta>0\), we may cover \([c,d]\) by finitely many intervals \([c,c_1],[c_1,c_2],\ldots,[c_k,d]\) of width \(\delta'\leq\delta\). Writing \(c_0=c\) and \(c_{k+1}=d\), the sum telescopes, showing

\[\mu([c,d]) \leq \sum_{i=0}^{k} \bigl(f(c_{i+1})-f(c_i)\bigr) = f(d)-f(c).\]

The reverse inequality holds because \(f\) is non-decreasing.

Note that \(\mu\ll \mathcal L^1\). Indeed, given \(\epsilon>0\), let \(\delta>0\) be given by the definition of \(f\) being absolutely continuous. If \(\mathcal L^1(N)=0\), we may cover \(N\) by countably many non-overlapping closed intervals \(I_i\) such that \(\sum_i\mathcal L^1(I_i)<\delta\). In particular \(\sum_i \zeta(I_i)\leq\epsilon\) and hence \(\mu(N)\leq\epsilon\). Since \(\epsilon>0\) is arbitrary, \(\mu(N)=0\). Therefore, by the Radon–Nikodym theorem,

\[\mu = \int \frac{\, \mathrm{d}\mu}{\, \mathrm{d}\mathcal L^1}\, \mathrm{d}\mathcal L^1,\]

with \(\, \mathrm{d}\mu/\, \mathrm{d}\mathcal L^1 \in L^1(\mathcal L^1)\).

By the Lebesgue differentiation theorem, for any Lebesgue point \(x\) of \(\, \mathrm{d}\mu/\, \mathrm{d}\mathcal L^1\),

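\[\begin{split}\lim_{t\to 0}\frac{f(x+t)-f(x)}{t} &=\lim_{t\to 0} \frac{\mu([x,x+t])}{t}\\ &= \lim_{t\to 0} \frac{1}{t} \int_x^{x+t} \frac{\, \mathrm{d}\mu}{\, \mathrm{d}\mathcal L^1}\, \mathrm{d}\mathcal L^1\\ &= \frac{\, \mathrm{d}\mu}{\, \mathrm{d}\mathcal L^1}(x).\end{split}\]

Thus \(f'=\, \mathrm{d}\mu/\, \mathrm{d}\mathcal L^1\) \(\mathcal L^1\)-almost everywhere and, for \(x>y\), \(f(x)-f(y)=\mu([y,x])=\int_y^x f'\, \mathrm{d}\mathcal L^1\).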
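As a numerical illustration of the identity in Theorem 12 (a sketch, not a proof), take \(f(x)=\sqrt x\) on \([0,1]\). This example is chosen here and does not appear in the notes: \(f\) is absolutely continuous but not Lipschitz, and \(f'(t)=1/(2\sqrt t)\) is unbounded yet integrable. The helper `midpoint_integral` is a hypothetical name for a plain midpoint Riemann sum, which converges slowly near the singularity at \(0\).

```python
import math

def midpoint_integral(g, y, x, n=100_000):
    """Midpoint Riemann sum approximating the integral of g over [y, x]."""
    width = (x - y) / n
    return width * sum(g(y + (i + 0.5) * width) for i in range(n))

def f(t):
    return math.sqrt(t)

def f_prime(t):
    # Derivative of sqrt, defined for t > 0; the midpoints never hit t = 0.
    return 0.5 / math.sqrt(t)

y, x = 0.0, 1.0
lhs = f(x) - f(y)                       # f(x) - f(y)
rhs = midpoint_integral(f_prime, y, x)  # approximates the integral of f'

print(lhs, rhs)
```

The two values agree only up to the discretisation error of the Riemann sum. The point is that the unbounded derivative still integrates to the increment of \(f\).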

Theorem 13 (Rademacher)

Any Lipschitz \(f\colon \mathbb R^n\to \mathbb R\) is differentiable \(\mathcal L^n\) almost everywhere.

Proof. For notational simplicity, we prove the case \(n=2\).

For each \(y\in \mathbb R\), \(x\mapsto f(x,y)\) is a Lipschitz function \(\mathbb R\to\mathbb R\) and so is differentiable \(\mathcal L^1\)-a.e. That is, for every \(y\), \(\partial_1 f(x,y)\) exists for \(\mathcal L^1\)-a.e. \(x\). By Fubini’s theorem, \(\partial_1 f\) exists \(\mathcal L^2\)-a.e. Similarly, \(\partial_2 f\) exists almost everywhere too.

Fix \(\epsilon>0\). For \(D\in \mathbb Q^2\) and \(j\in \mathbb N\) let

\[X_{D,j}=\{x: |f(x+h e_i)-f(x)-D_i h|<\epsilon |h|,\ \forall 0<|h|<1/j,\ i=1,2\}.\]

These are Borel sets. Further, for \(D\in\mathbb Q^2\), if

\[|\partial_1f(x)-D_1|<\epsilon/2 \quad \text{and} \quad |\partial_2f(x)-D_2|<\epsilon/2,\]

then \(x\in X_{D,j}\) for sufficiently large \(j\). That is,

\[X^\epsilon=\bigcup_{D\in \mathbb Q^2}\bigcup_{j\in\mathbb N}X_{D,j}\]

is a set of full measure.

Fix \(D\in \mathbb Q^2\) and \(j\in\mathbb N\). Let \(x\in X_{D,j}\) be a density point of \(X_{D,j}\), and let \(R>0\) be such that

\[\mathcal L^n(B(x,r)\cap X_{D,j}) \geq (1-\epsilon^n)\mathcal L^n(B(x,r))\]

for all \(0<r<R\). In particular, for every \(y\in B(x,r)\) there exists \(y'\in X_{D,j}\) with

\[\|y-y'\|< \epsilon \|y-x\|.\]

Now let \(r<\min\{R,1/j\}\) and \(\|x-y\|<r\). Set \(h=y-x\) and \(\tilde y = x + h_1 e_1\), and choose \(\tilde{\tilde y} \in X_{D,j}\) with

\[\|\tilde y- \tilde{\tilde y}\| < \epsilon\|x-\tilde y\| \leq \epsilon \|x-y\|.\]

Also let \(y',y''\) lie on the same vertical line as \(\tilde{\tilde y}\), such that \(y'\) and \(\tilde y\) have the same vertical component, as do \(y''\) and \(y\). Then, since \(x\in X_{D,j}\),

\[|f(\tilde y)-f(x)-D_1 h_1| < \epsilon |h_1|=\epsilon \|x-\tilde y\|\leq \epsilon \|x-y\|; \tag{17}\]

since \(f\) is Lipschitz, with constant \(L\) say,

\[|f(\tilde y)-f(y')| \leq L\|\tilde y-y'\|\leq L\epsilon \|x-y\|; \tag{18}\]

since \(\tilde{\tilde y}\in X_{D,j}\),

\[|f(y'')-f(y')- D_2 h_2| \leq \epsilon \|y'-y''\|\leq \epsilon \|x-y\|; \tag{19}\]

and, since \(f\) is Lipschitz,

\[|f(y'')-f(y)| \leq L\|y''-y\| = L\|y'-\tilde y\| \leq L \|\tilde{\tilde y}-\tilde y\| \leq \epsilon L\|x-y\|. \tag{20}\]

By combining (17), (18), (19) and (20) with the triangle inequality,

\[|f(y)-f(x) - D\cdot h| \leq 2(1+L) \epsilon \|x-y\|.\]

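This is true for all \(y\) with \(\|x-y\|<r\), whenever \(x\in X_{D,j}\) is a density point of \(X_{D,j}\). Since almost every point of each \(X_{D,j}\) is a density point, and \(X^\epsilon\) is a countable union of these sets with full measure, the estimate holds, with \(D\) and \(r\) depending on \(x\), for \(\mathcal L^n\)-a.e. \(x\). Taking a countable intersection over \(\epsilon=1/k\), \(k\in\mathbb N\), concludes the proof.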


Example 33

Exercise 48. Prove that Lipschitz functions are AC and that AC functions are BV.

Example 34

Exercise 49. Let \(f\colon [a,b]\to \mathbb R\) be BV. Show that \(Vf\) and \(Vf-f\) are non-decreasing. If \(f\) is AC then show that \(Vf\) and \(Vf-f\) are AC.

Example 35

Exercise 50. Show that any monotonic \(f\colon \mathbb R\to \mathbb R\) is continuous except at countably many points.

Example 36

Exercise 51. In this exercise we will show that monotonic functions are differentiable almost everywhere.

Let \(f\colon [a,b]\to \mathbb R\) be non-decreasing. For each \(x\in (a,b)\) let

\[\underline Df(x) = \liminf_{h\to 0} \frac{f(x+h)-f(x)}{h} \quad\text{and}\quad \overline Df(x) = \limsup_{h\to 0} \frac{f(x+h)-f(x)}{h}.\]

Observe that the set of \(x\in(a,b)\) where \(f\) is not differentiable at \(x\) is the countable union, over \(p<q\in\mathbb Q\), of the sets

\[B_{p,q} := \{x\in (a,b): \underline Df(x) <p < q < \overline Df(x)\}.\]

We now fix \(p<q\in\mathbb Q\).

  1. Let

    \[\mathcal B =\{[x,x+h]:f(x+h)-f(x) <ph\}.\]

    Note that \(\mathcal B\) satisfies the hypotheses of the Vitali covering theorem (recall Theorem 43). Let \(\mathcal B'\) be a disjoint sub-cover obtained from the Vitali covering theorem with respect to Lebesgue measure and let \(S=\cup \mathcal B'\). Prove that

    \[\mathcal L^1(f(B_{p,q}\cap S)) \leq p \mathcal L^1(B_{p,q}\cap S).\]

    Note: this is the step where we require \(f\) to be monotonic.

  2. Similarly, prove that \(\mathcal L^1(f(B_{p,q}\cap S)) \geq q \mathcal L^1(B_{p,q} \cap S)\).

  3. Deduce that \(f\) is differentiable almost everywhere.

  4. Deduce that a BV function is differentiable almost everywhere.

However, BV functions need not satisfy the fundamental theorem of calculus:

Example 37

Exercise 52. Recall the definition of the Cantor set from Exercise 6. Define the Cantor function \(f\colon [0,1] \to [0,1]\) as follows. For each \(n\in\mathbb N\), define \(f_n \colon [0,1] \to [0,1]\) by

\[f_n(x) = \left(\frac{3}{2}\right)^n\mathcal L^1([0,x]\cap C_n).\]

Show that the \(f_n\) converge uniformly on \([0,1]\) to a monotonic, continuous function \(f\). For each \(x\in [0,1]\setminus C\), show that \(f'(x)=0\).

Thus, \(f\) is monotonic and hence BV, and has derivative \(0\) almost everywhere, but does not satisfy the fundamental theorem of calculus: \(f(1)-f(0)=1\), whereas \(\int_0^1 f'\, \mathrm{d}\mathcal L^1=0\).
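The approximations \(f_n\) to the Cantor function can be computed exactly with rational arithmetic. The sketch below assumes that \(C_n\) is the usual \(n\)-th stage of the middle-thirds construction, a union of \(2^n\) closed intervals of length \(3^{-n}\). The names `cantor_intervals`, `f_n` and `uniform_gap` are hypothetical. It checks that \(f_n(1)=1\), that \(f_n\) is constant on part of the removed middle third, and that successive approximations get closer on a finite grid. This is an illustration only, not a solution to the exercise.

```python
from fractions import Fraction

def cantor_intervals(n):
    """Closed intervals making up C_n, the n-th stage of the construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_stage = []
        for left, right in intervals:
            third = (right - left) / 3
            next_stage.append((left, left + third))    # keep the left third
            next_stage.append((right - third, right))  # keep the right third
        intervals = next_stage
    return intervals

def f_n(n, x):
    """(3/2)^n times the Lebesgue measure of [0, x] intersected with C_n."""
    x = Fraction(x)
    measure = sum(max(Fraction(0), min(x, right) - left)
                  for left, right in cantor_intervals(n))
    return Fraction(3, 2) ** n * measure

def uniform_gap(n, depth=4):
    """Largest |f_{n+1} - f_n| over the grid of multiples of 3^(-depth)."""
    grid = [Fraction(k, 3 ** depth) for k in range(3 ** depth + 1)]
    return max(abs(f_n(n + 1, x) - f_n(n, x)) for x in grid)

total = f_n(5, 1)
# Increment of f_6 across part of the removed middle third (1/3, 2/3).
middle_increment = f_n(6, Fraction(51, 100)) - f_n(6, Fraction(1, 2))
gaps = [uniform_gap(n) for n in range(3)]

print(total, middle_increment, gaps)
```

Because the grid is finite, `uniform_gap` only bounds the uniform distance from below. Proving uniform convergence on all of \([0,1]\) is the content of the exercise.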

Example 38

Exercise 53. In lectures we proved that the measure \(\mu\) associated to a non-decreasing AC function is absolutely continuous with respect to \(\mathcal L^1\). Prove the converse: for any finite measure \(\mu\ll\mathcal L^1\) on \([0,\infty)\), show that

\[f(x):= \int_0^x \frac{\, \mathrm{d}\mu}{\, \mathrm{d}\mathcal L^1}\, \mathrm{d}\mathcal L^1 = \mu([0,x])\]

defines an absolutely continuous function.

Up to now, we have considered points where functions are differentiable. We now consider points of non-differentiability (which are much more interesting).

Example 39

Show that the Cantor function is not differentiable at any point of the Cantor set.

Example 40

Let \(N\subset [0,1]\) satisfy \(\mathcal L^1(N)=0\).

  1. For each \(n\in\mathbb N\), iteratively construct a countable collection of open intervals \(\mathcal O_n\) such that, for each \(n\in\mathbb N\),

    • \(N\) is contained in the union of \(\mathcal O_n\);

    • for every \(I\in \mathcal O_n\) there exists \(J\in\mathcal O_{n-1}\) with \(I\subset J\);

    • for each \(I\in \mathcal O_{n-1}\),

      \[\mathcal L^1(I\cap \cup\{J:J\in \mathcal O_n\}) < 2^{-n} |I|.\]
  2. Let

    \[S= \bigcap_{n\in\mathbb N} \bigcup_{m>n} \cup\{J:J\in\mathcal O_m\},\]

    the “limsup” of the \(\mathcal O_n\) (\(S\) is the set of points that are contained in infinitely many intervals from the \(\mathcal O_n\)). In particular, \(S\supset N\).

    For each \(x\in [0,1]\setminus S\), let \(N(x)\) be the largest \(n\) for which there exists \(I\in \mathcal O_n\) with \(x\in I\). Define \(P(x)=1\) if \(N(x)\) is even, \(P(x)=0\) otherwise. Finally, for each \(x\in[0,1]\) define

    \[f(x) = \mathcal L^1(\{t\in [0,x] : P(t)=1\}).\]

    Show that \(f\) is Lipschitz, monotonic, and not differentiable at any point of \(N\). Hint: show that \(\underline D f(x)=0\) and \(\overline Df(x)=1\) for each \(x\in N\).