- Theorem
A linear transformation $L\colon\mathbb{R}^n\to\mathbb{R}^n$ amounts to multiplication by a uniquely determined matrix; that is, there exists a unique matrix $A\in\mathbb{R}^{n\times n}$ such that
$$\forall\vec v\in\mathbb{R}^n:\ L(\vec v)=A\vec v$$
- Proof
We set the column vectors
$$\begin{pmatrix}a_{1,j}\\a_{2,j}\\\vdots\\a_{n,j}\end{pmatrix}:=L(\vec e_j)$$
where $\vec e_1,\ldots,\vec e_n$ is the standard basis of $\mathbb{R}^n$. Then we define from this
$$A:=\begin{pmatrix}a_{1,1}&\cdots&a_{1,n}\\\vdots&\ddots&\vdots\\a_{n,1}&\cdots&a_{n,n}\end{pmatrix}$$
and note that for any vector $\vec v$ of $\mathbb{R}^n$ we obtain
$$A\vec v=A\left(\sum_{j=1}^{n}v_j\vec e_j\right)=\sum_{j=1}^{n}v_jA\vec e_j=\sum_{j=1}^{n}v_jL(\vec e_j)=L\left(\sum_{j=1}^{n}v_j\vec e_j\right)=L(\vec v)$$
Thus, we have shown existence. To prove uniqueness, suppose there were another matrix $B$ with the property that $B\vec v=L(\vec v)$ for all $\vec v\in\mathbb{R}^n$. Then in particular,
$$B\vec e_j=L(\vec e_j)=A\vec e_j\quad\text{for }j=1,\ldots,n$$
which already implies that $A=B$ (since all the columns of both matrices are identical).
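The construction in this proof is directly computable: column $j$ of $A$ is just $L(\vec e_j)$. Here is a minimal sketch in Python; the particular map `L` is a hypothetical example chosen for illustration.

```python
# Recover the matrix A of a linear map L: R^n -> R^n, as in the proof above:
# column j of A is L(e_j), where e_j is the j-th standard basis vector.

def matrix_of(L, n):
    """Return A as a list of rows with A[i][j] = a_{i,j}, column j being L(e_j)."""
    cols = []
    for j in range(n):
        e_j = [1.0 if i == j else 0.0 for i in range(n)]
        cols.append(L(e_j))
    # Transpose the list of columns into a list of rows.
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def matvec(A, v):
    """Ordinary matrix-vector product A v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

# A hypothetical linear map on R^2.
L = lambda v: [2 * v[0] + v[1], -v[0] + 3 * v[1]]

A = matrix_of(L, 2)
v = [1.5, -2.0]
assert A == [[2.0, 1.0], [-1.0, 3.0]]
assert matvec(A, v) == L(v)  # A v = L(v), as the theorem asserts
```

The uniqueness part of the proof is visible here too: any matrix agreeing with `L` on both basis vectors must have exactly these two columns.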
It is not immediately clear how one would generalize the derivative to higher dimensions. For, if we take the definition of the derivative at a point $x_0$,
$$\lim_{h\to 0}\frac{f(x_0+h)-f(x_0)}{h}$$
and insert vectors for $x_0$ and $h$, we would divide the whole expression by a vector, which is not defined.
Hence, we shall rephrase the definition of the derivative a bit and cast it into a form in which it can be generalized to higher dimensions.
- Theorem
Let $f\colon\mathbb{R}\to\mathbb{R}$ be a one-dimensional function and let $x_0\in\mathbb{R}$. Then $f$ is differentiable at $x_0$ if and only if there exists a linear function $l\colon\mathbb{R}\to\mathbb{R}$ such that
$$\lim_{h\to 0}\frac{\Big|f(x_0+h)-\big(f(x_0)+l(h)\big)\Big|}{|h|}=0$$
We note that according to the above, linear functions $l\colon\mathbb{R}\to\mathbb{R}$ are given by multiplication by a $1\times 1$ matrix, that is, a scalar.
- Proof
First assume that
is differentiable at
. We set
and obtain
![{\displaystyle {\frac {{\Big |}f(x_{0}+h)-{\big (}f(x_{0})+l(h){\big )}{\Big |}}{|h|}}=\left|{\frac {f(x_{0}+h)-f(x_{0})}{h}}-f'(x_{0})\right|}](https://wikimedia.org/api/rest_v1/media/math/render/svg/95e0ca993e6b47cbf52de3ee79631b4546953557)
which converges to 0 due to the definition of
.
Assume now that we are given an
such that
![{\displaystyle \lim _{h\to 0}{\frac {{\Big |}f(x_{0}+h)-{\big (}f(x_{0})+l(h){\big )}{\Big |}}{|h|}}=0}](https://wikimedia.org/api/rest_v1/media/math/render/svg/f44ad5c3c3b580cd63784b1b201e5d20548fc6b2)
Let
be the scalar associated to
. Then by an analogous computation
.
With the latter formulation of differentiability from the above theorem, we may readily generalize to higher dimensions, since division by the Euclidean norm of a vector is defined, and linear mappings are also defined in higher dimensions.
- Definition
A function $f\colon\mathbb{R}^m\to\mathbb{R}^n$ is called differentiable or totally differentiable at a point $x_0\in\mathbb{R}^m$ if and only if there exists a linear function $L\colon\mathbb{R}^m\to\mathbb{R}^n$ such that
$$\lim_{\vec h\to 0}\frac{\Big\|f(x_0+\vec h)-\big(f(x_0)+L(\vec h)\big)\Big\|}{\|\vec h\|}=0$$
We have already proven that this definition coincides with the usual one in the one-dimensional case (that is, $m=n=1$).
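The defining limit can be probed numerically: for a candidate differential, the quotient should shrink to 0 as $\|\vec h\|$ does. A minimal sketch, using the hypothetical example $f(x,y)=x^2+y^2$ at $x_0=(1,2)$ with candidate $L(h_1,h_2)=2h_1+4h_2$:

```python
import math

# Probe the defining limit of total differentiability for f(x, y) = x^2 + y^2
# at x0 = (1, 2). The candidate differential is L(h1, h2) = 2 h1 + 4 h2;
# whether it is correct is exactly what the quotient below tests.

def f(x, y):
    return x * x + y * y

def L(h1, h2):
    return 2.0 * h1 + 4.0 * h2

x0, y0 = 1.0, 2.0
quotients = []
for k in range(1, 6):
    h1 = h2 = 10.0 ** (-k)  # shrink h toward 0
    norm_h = math.hypot(h1, h2)
    residual = abs(f(x0 + h1, y0 + h2) - (f(x0, y0) + L(h1, h2)))
    quotients.append(residual / norm_h)

# The quotient shrinks roughly linearly in ||h||, consistent with limit 0.
assert all(q2 < q1 for q1, q2 in zip(quotients, quotients[1:]))
assert quotients[-1] < 1e-4
```

For this $f$ the residual is $2h^2$, so the quotient is $\sqrt{2}\,h$, which visibly tends to 0; a wrong candidate $L$ would leave the quotient bounded away from 0.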
We have the following theorem:
- Theorem
Let $U\subseteq\mathbb{R}^m$ be a set, let $x_0$ be an interior point of $U$, and let $f\colon U\to\mathbb{R}^n$ be a function differentiable at $x_0$. Then the linear map $L$ such that
$$\lim_{\vec h\to 0}\frac{\Big\|f(x_0+\vec h)-\big(f(x_0)+L(\vec h)\big)\Big\|}{\|\vec h\|}=0$$
is unique; that is, there exists only one such map $L$.
- Proof
Since $x_0$ is an interior point of $U$, we find $\epsilon>0$ such that $B_\epsilon(x_0)\subseteq U$. Let now $K$ be any other linear mapping with the property that
$$\lim_{\vec h\to 0}\frac{\Big\|f(x_0+\vec h)-\big(f(x_0)+K(\vec h)\big)\Big\|}{\|\vec h\|}=0$$
We note that for all vectors $\vec e_j$ of the standard basis, the points $x_0+\lambda\vec e_j$ for $0<\lambda<\epsilon$ are contained within $B_\epsilon(x_0)\subseteq U$. Hence, we obtain by the triangle inequality
$$\Big\|L(\vec e_j)-K(\vec e_j)\Big\|=\frac{\bigl\|L(\lambda\vec e_j)-K(\lambda\vec e_j)\bigr\|}{\|\lambda\vec e_j\|}\leq\frac{\Big\|f(x_0+\lambda\vec e_j)-\big(f(x_0)+L(\lambda\vec e_j)\big)\Big\|}{\|\lambda\vec e_j\|}+\frac{\Big\|f(x_0+\lambda\vec e_j)-\big(f(x_0)+K(\lambda\vec e_j)\big)\Big\|}{\|\lambda\vec e_j\|}$$
Taking $\lambda\to 0$, we see that $\|L(\vec e_j)-K(\vec e_j)\|=0$. Thus, $L$ and $K$ coincide on all basis vectors, and since every other vector can be expressed as a linear combination of those, by linearity of $L$ and $K$ we obtain $L=K$.
Thus, the following definition is justified:
- Definition
Let $f\colon U\to\mathbb{R}^n$ be a function (where $U$ is a subset of $\mathbb{R}^m$), and let $x_0$ be an interior point of $U$ such that $f$ is differentiable at $x_0$. Then the unique linear function $L\colon\mathbb{R}^m\to\mathbb{R}^n$ such that
$$\lim_{\vec h\to 0}\frac{\Big\|f(x_0+\vec h)-\big(f(x_0)+L(\vec h)\big)\Big\|}{\|\vec h\|}=0$$
is called the differential of $f$ at $x_0$ and is denoted $f'(x_0)$.
We shall first define directional derivatives.
- Definition
Let $f\colon\mathbb{R}^m\to\mathbb{R}^n$ be a function, let $x_0\in\mathbb{R}^m$, and let $\vec v\in\mathbb{R}^m$ be a vector. If the limit
$$\lim_{h\to 0}\frac{f(x_0+h\vec v)-f(x_0)}{h}$$
exists, it is called the directional derivative of $f$ at $x_0$ in direction $\vec v$. We denote it by $D_{\vec v}f(x_0)$.
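Since this limit is taken over the single scalar $h$, it can be approximated by an ordinary one-variable difference quotient. A minimal sketch (using a central difference, a standard variation of the definition, and the hypothetical example $f(x,y)=x^2+y^2$):

```python
# Approximate the directional derivative D_v f(x0) by a small-h difference
# quotient in the single variable h, as in the definition above.

def f(x, y):
    return x * x + y * y

def directional_derivative(f, x0, v, h=1e-6):
    # Central difference in h (more accurate than the one-sided quotient).
    fp = f(x0[0] + h * v[0], x0[1] + h * v[1])
    fm = f(x0[0] - h * v[0], x0[1] - h * v[1])
    return (fp - fm) / (2 * h)

d = directional_derivative(f, (1.0, 2.0), (1.0, 1.0))
# Analytically, D_v f(1, 2) = 2*1*1 + 2*2*1 = 6 for this f and v.
assert abs(d - 6.0) < 1e-6
```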
The following theorem relates directional derivatives and the differential of a totally differentiable function:
- Theorem
Let $f\colon\mathbb{R}^m\to\mathbb{R}^n$ be a function that is totally differentiable at $x_0$, and let $\vec v\in\mathbb{R}^m$ be a nonzero vector. Then $D_{\vec v}f(x_0)$ exists and is equal to $f'(x_0)(\vec v)$.
- Proof
According to the very definition of total differentiability, inserting $\vec h=h\vec v$ (so that $\|\vec h\|=|h|\cdot\|\vec v\|$ and $f'(x_0)(h\vec v)=h\,f'(x_0)(\vec v)$),
$$\lim_{h\to 0}\left\|\frac{f(x_0+h\vec v)-f(x_0)}{|h|\cdot\|\vec v\|}-\frac{h\,f'(x_0)(\vec v)}{|h|\cdot\|\vec v\|}\right\|=0$$
Hence,
$$\lim_{h\to 0}\left\|\frac{f(x_0+h\vec v)-f(x_0)}{|h|}-\frac{h\,f'(x_0)(\vec v)}{|h|}\right\|=0$$
by multiplying the above equation by $\|\vec v\|$. Noting that
$$\left\|\frac{f(x_0+h\vec v)-f(x_0)}{|h|}-\frac{h\,f'(x_0)(\vec v)}{|h|}\right\|=\left\|\frac{f(x_0+h\vec v)-f(x_0)}{h}-f'(x_0)(\vec v)\right\|$$
the theorem follows.
A special case of directional derivatives are partial derivatives:
- Definition
Let $\vec e_1,\ldots,\vec e_m$ be the standard basis of $\mathbb{R}^m$, let $x_0\in\mathbb{R}^m$, and let $f\colon\mathbb{R}^m\to\mathbb{R}^n$ be a function such that the directional derivatives $D_{\vec e_j}f(x_0)$ for $j=1,\ldots,m$ all exist. Then we set
$$\frac{\partial f}{\partial x_j}:=D_{\vec e_j}f(x_0)$$
and call it the partial derivative in the direction of $x_j$.
In fact, by writing down the definition of $D_{\vec e_j}f(x_0)$, we see that the partial derivative in the direction of $x_j$ is nothing else than the derivative of the one-variable function obtained from $f$ by holding all variables other than $x_j$ fixed, taken at the $j$-th coordinate of $x_0$. For instance, if
$$f(x,y,z)=x^2+4z^3+3xy$$
then
$$\frac{\partial f}{\partial x}=2x+3y\ ,\quad\frac{\partial f}{\partial y}=3x\ ,\quad\frac{\partial f}{\partial z}=12z^2$$
that is, when forming a partial derivative, we regard the other variables as constants and differentiate only with respect to the variable under consideration.
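The worked example can be checked numerically: perturbing one coordinate at a time and forming a difference quotient (a central difference here) approximates each partial derivative. A minimal sketch:

```python
# Check the worked example's partial derivatives by finite differences:
# f(x, y, z) = x^2 + 4 z^3 + 3 x y.

def f(x, y, z):
    return x * x + 4 * z ** 3 + 3 * x * y

def partial(f, point, j, h=1e-6):
    # Central difference in the j-th variable; all other variables held constant,
    # exactly as in the definition of a partial derivative.
    p_plus = list(point); p_plus[j] += h
    p_minus = list(point); p_minus[j] -= h
    return (f(*p_plus) - f(*p_minus)) / (2 * h)

x, y, z = 1.0, 2.0, 3.0
assert abs(partial(f, (x, y, z), 0) - (2 * x + 3 * y)) < 1e-5  # df/dx = 2x + 3y
assert abs(partial(f, (x, y, z), 1) - 3 * x) < 1e-5            # df/dy = 3x
assert abs(partial(f, (x, y, z), 2) - 12 * z ** 2) < 1e-4      # df/dz = 12z^2
```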
From the above, we know that the differential of a function $f\colon\mathbb{R}^m\to\mathbb{R}^n$ has an associated matrix representing the linear map thus defined (the argument given above for square matrices carries over verbatim to linear maps $\mathbb{R}^m\to\mathbb{R}^n$ and $n\times m$ matrices). Under a suitable continuity condition, we can determine this matrix from the partial derivatives of the component functions.
- Theorem
Let $f\colon\mathbb{R}^m\to\mathbb{R}^n$ be a function such that all partial derivatives exist at $x_0$ and are continuous in each component on $B_r(x_0)$ for a possibly very small, but positive $r$. Then $f$ is totally differentiable at $x_0$, and the differential of $f$ at $x_0$ is given by left multiplication by the matrix
$$J_f(x_0):=\begin{pmatrix}\dfrac{\partial f_1}{\partial x_1}&\cdots&\dfrac{\partial f_1}{\partial x_m}\\\vdots&\ddots&\vdots\\\dfrac{\partial f_n}{\partial x_1}&\cdots&\dfrac{\partial f_n}{\partial x_m}\end{pmatrix}$$
where $f=(f_1,\ldots,f_n)$ and all partial derivatives are evaluated at $x_0$.
The matrix $J_f(x_0)$ is called the Jacobian matrix.
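The Jacobian can be assembled numerically, column by column, from finite-difference partials of the component functions. A minimal sketch; the map $f(x,y)=(x^2y,\ x+y)$ is a hypothetical example, with $J_f=\begin{pmatrix}2xy&x^2\\1&1\end{pmatrix}$:

```python
# Assemble the Jacobian of f(x, y) = (x^2 y, x + y) from finite-difference
# partials: entry (i, j) holds d f_i / d x_j, evaluated at the given point.

def f(x, y):
    return (x * x * y, x + y)

def jacobian(f, point, n_out, h=1e-6):
    m = len(point)
    J = [[0.0] * m for _ in range(n_out)]
    for j in range(m):
        p_plus = list(point); p_plus[j] += h
        p_minus = list(point); p_minus[j] -= h
        fp, fm = f(*p_plus), f(*p_minus)
        for i in range(n_out):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)  # central difference
    return J

J = jacobian(f, (1.0, 2.0), 2)
# Analytically, J = [[2xy, x^2], [1, 1]] = [[4, 1], [1, 1]] at (1, 2).
expected = [[4.0, 1.0], [1.0, 1.0]]
assert all(abs(J[i][j] - expected[i][j]) < 1e-5 for i in range(2) for j in range(2))
```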
- Proof
We must show that
$$\lim_{\vec h\to 0}\frac{\Big\|f(x_0+\vec h)-\big(f(x_0)+J_f(x_0)\vec h\big)\Big\|}{\|\vec h\|}=0$$
Bounding the Euclidean norm by the sum of the absolute values of the components, we estimate
$$\frac{\Big\|f(x_0+\vec h)-\big(f(x_0)+J_f(x_0)\vec h\big)\Big\|}{\|\vec h\|}\leq\sum_{j=1}^{n}\frac{\left|f_j(x_0+\vec h)-\left(f_j(x_0)+\displaystyle\sum_{k=1}^{m}h_k\frac{\partial f_j}{\partial x_k}(x_0)\right)\right|}{\|\vec h\|}$$
We shall now prove that all summands of the last sum go to 0.
Indeed, let $j\in\{1,\ldots,n\}$. Writing again $\vec h=\sum_{k=1}^{m}h_k\vec e_k$, we obtain by the one-dimensional mean value theorem, first applied in the first variable, then in the second and so on, the succession of equations
$$f_j(x_0+h_1\vec e_1)-f_j(x_0)=\overbrace{(x_{0,1}+h_1-x_{0,1})}^{=h_1}\frac{\partial f_j}{\partial x_1}(x_0+t_1\vec e_1)$$
$$f_j(x_0+h_1\vec e_1+h_2\vec e_2)-f_j(x_0+h_1\vec e_1)=\overbrace{(x_{0,2}+h_2-x_{0,2})}^{=h_2}\frac{\partial f_j}{\partial x_2}(x_0+h_1\vec e_1+t_2\vec e_2)$$
$$\vdots$$
$$f_j(x_0+h_1\vec e_1+\cdots+h_m\vec e_m)-f_j(x_0+h_1\vec e_1+\cdots+h_{m-1}\vec e_{m-1})=\overbrace{(x_{0,m}+h_m-x_{0,m})}^{=h_m}\frac{\partial f_j}{\partial x_m}(x_0+h_1\vec e_1+\cdots+h_{m-1}\vec e_{m-1}+t_m\vec e_m)$$
for suitably chosen $t_k$ between $0$ and $h_k$. We can now sum all these equations together; the left-hand sides telescope, and we obtain
$$f_j(x_0+\vec h)-f_j(x_0)=\sum_{k=1}^{m}h_k\frac{\partial f_j}{\partial x_k}\left(x_0+\sum_{l=1}^{k-1}h_l\vec e_l+t_k\vec e_k\right)$$
Let now $\epsilon>0$. Using the continuity of the $\frac{\partial f_j}{\partial x_k}$ on $B_r(x_0)$, we may choose $\delta\in(0,r)$ such that
$$\left|\frac{\partial f_j}{\partial x_k}\left(x_0+\sum_{l=1}^{k-1}h_l\vec e_l+t_k\vec e_k\right)-\frac{\partial f_j}{\partial x_k}(x_0)\right|<\frac{\epsilon}{m}$$
for $k\in\{1,\ldots,m\}$, given that $\|\vec h\|<\delta$ (which we may assume, as $\vec h\to 0$). Hence, using $|h_k|\leq\|\vec h\|$, we obtain
$$\frac{\left|f_j(x_0+\vec h)-\left(f_j(x_0)+\displaystyle\sum_{k=1}^{m}h_k\frac{\partial f_j}{\partial x_k}(x_0)\right)\right|}{\|\vec h\|}\leq\frac{\|\vec h\|\cdot m\cdot\frac{\epsilon}{m}}{\|\vec h\|}=\epsilon$$
and thus the theorem follows.
- Corollary
If $f\colon\mathbb{R}^m\to\mathbb{R}$ is continuously differentiable at $x_0$ and $\vec v\in\mathbb{R}^m$, then
$$D_{\vec v}f(x_0)=\sum_{j=1}^{m}v_j\frac{\partial f}{\partial x_j}(x_0)$$
- Proof
$$D_{\vec v}f(x_0)=f'(x_0)(\vec v)=J_f(x_0)\vec v=\sum_{j=1}^{m}v_j\frac{\partial f}{\partial x_j}(x_0)$$
$\Box$
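The corollary, too, admits a quick numerical check: the directional derivative (a one-variable difference quotient) should match the weighted sum of the partial derivatives. A minimal sketch with the hypothetical example $f(x,y)=x^2+y^2$, $x_0=(1,2)$, $\vec v=(3,-1)$:

```python
# Check D_v f(x0) = sum_j v_j * (df/dx_j)(x0) for f(x, y) = x^2 + y^2.

def f(x, y):
    return x * x + y * y

def D(f, x0, v, h=1e-6):
    # Central-difference approximation of the directional derivative.
    return (f(x0[0] + h * v[0], x0[1] + h * v[1])
            - f(x0[0] - h * v[0], x0[1] - h * v[1])) / (2 * h)

x0, v = (1.0, 2.0), (3.0, -1.0)
dfdx, dfdy = 2 * x0[0], 2 * x0[1]     # exact partials of x^2 + y^2 at x0
lhs = D(f, x0, v)
rhs = v[0] * dfdx + v[1] * dfdy       # = 3*2 + (-1)*4 = 2
assert abs(lhs - rhs) < 1e-6
```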