Calculus/Directional derivatives and the gradient vector

Directional derivatives

Normally, a partial derivative of a function with respect to one of its variables, say x_j, is the derivative of the "slice" of that function parallel to the x_j axis.

More precisely, we can think of cutting a function f(x₁,...,x_n) in space along the x_j axis, keeping every variable except x_j constant.

From the definition, we have the partial derivative at a point p of the function along this slice as

{\partial \mathbf {f}  \over \partial x_{j}}=\lim _{t\rightarrow 0}{\mathbf {f} (\mathbf {p} +t\mathbf {e} _{j})-\mathbf {f} (\mathbf {p} ) \over t}

provided this limit exists.

Instead of the basis vector e_j, which corresponds to taking the derivative along the x_j axis, we can pick a vector d in any direction (usually taken to be a unit vector) and define the directional derivative of the function as

{\partial \mathbf {f}  \over \partial \mathbf {d} }=\lim _{t\rightarrow 0}{\mathbf {f} (\mathbf {p} +t\mathbf {d} )-\mathbf {f} (\mathbf {p} ) \over t}

where d is the direction vector.
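The limit definition can be approximated numerically. The following is a minimal sketch, assuming a small fixed step t stands in for the limit; the helper name directional_derivative, the function f and the point p are illustrative choices, not part of the text.

```python
import math

def directional_derivative(f, p, d, t=1e-6):
    """Approximate the limit definition using a small but finite step t."""
    shifted = [pi + t * di for pi, di in zip(p, d)]
    return (f(shifted) - f(p)) / t

# Hypothetical example function f(x, y) = x^2 + 3y.
def f(v):
    x, y = v
    return x**2 + 3 * y

p = [1.0, 2.0]
d = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # unit vector along the diagonal
approx = directional_derivative(f, p, d)
print(approx)  # close to 5/sqrt(2), the exact value for this f, p and d
```

Because t is finite, the result only approximates the limit; shrinking t too far eventually makes floating-point rounding dominate.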

Calculating directional derivatives from the limit definition is rather painful, but we have the following: if f : Rⁿ → R is differentiable at a point p and |d|=1, then

{\partial \mathbf {f}  \over \partial \mathbf {d} }=D_{\mathbf {p} }\mathbf {f} (\mathbf {d} )

Here D_p f denotes the derivative of f at p, a linear map, which is applied to the vector d. There is a closely related formulation which we'll look at in the next section.

Gradient vectors

The partial derivatives of a scalar function tell us how much it changes if we move along one of the axes. What if we move in a different direction?

We'll call the scalar function f, and use the chain rule to consider what happens if we move by an infinitesimal displacement dr=(dx,dy,dz).

df=dx{\frac {\partial f}{\partial x}}+dy{\frac {\partial f}{\partial y}}+dz{\frac {\partial f}{\partial z}}

This is the dot product of dr with a vector whose components are the partial derivatives of f, called the gradient of f

\operatorname {grad} \mathbf {f} =\nabla \mathbf {f} =\left({\frac {\partial \mathbf {f} (\mathbf {p} )}{\partial x_{1}}},\cdots ,{\frac {\partial \mathbf {f} (\mathbf {p} )}{\partial x_{n}}}\right)

We can then form the directional derivative at a point p in the direction d by taking the dot product of the gradient with d:

{\partial \mathbf {f} (\mathbf {p} ) \over \partial \mathbf {d} }=\mathbf {d} \cdot \nabla \mathbf {f} (\mathbf {p} )
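The dot-product formula can be tried out numerically. This sketch estimates the gradient with central differences, which is one common choice and not something the text prescribes; the helper names gradient and dot, the function f, the point p and the direction d are all illustrative.

```python
def gradient(f, p, h=1e-5):
    """Central-difference estimate of each partial derivative of f at p."""
    grad = []
    for j in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[j] += h
        backward[j] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical example: f(x, y, z) = x*y + z^2 at p = (1, 2, 3).
def f(v):
    x, y, z = v
    return x * y + z**2

p = [1.0, 2.0, 3.0]
d = [2 / 3, 1 / 3, 2 / 3]      # unit vector, since (4 + 1 + 4) / 9 = 1
grad_f = gradient(f, p)        # analytically (y, x, 2z) = (2, 1, 6)
via_gradient = dot(d, grad_f)  # analytically 4/3 + 1/3 + 4 = 17/3
```

Once the gradient at p is known, the directional derivative in any direction costs only one dot product, rather than a fresh limit for each direction.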

Notice that grad f looks like a vector multiplied by a scalar. This particular combination of partial derivatives is commonplace, so we abbreviate it to

\nabla =\left({\frac {\partial }{\partial x}},{\frac {\partial }{\partial y}},{\frac {\partial }{\partial z}}\right)

We can write the action of taking the gradient vector by writing this as an operator. Recall that in the one-variable case we can write d/dx for the action of taking the derivative with respect to x. This case is similar, but ∇ acts like a vector.

In n variables, the same operator is written as:

\nabla =\left({\frac {\partial }{\partial x_{1}}},{\frac {\partial }{\partial x_{2}}},\cdots ,{\frac {\partial }{\partial x_{n}}}\right)

Properties of the gradient vector

Geometry

Grad f(p) is a vector pointing in the direction in which f increases most steeply at p. |grad f(p)| is the rate of change of f in that direction, i.e. the steepest slope at that point.

For example, consider h(x, y)=x²+y². The level sets of h are concentric circles, centred on the origin, and

\nabla h=(h_{x},h_{y})=2(x,y)=2\mathbf {r}

grad h points directly away from the origin, at right angles to the contours.
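The steepest-slope property can be checked by brute force for this h. The sketch below samples unit directions around the circle and keeps the one with the largest directional derivative; the point p and the number of samples n are arbitrary illustrative choices.

```python
import math

def grad_h(x, y):
    """Gradient of h(x, y) = x^2 + y^2, which is 2(x, y)."""
    return (2 * x, 2 * y)

p = (1.0, 2.0)  # illustrative point, chosen arbitrarily
gx, gy = grad_h(*p)

# Try n evenly spaced unit vectors d = (cos a, sin a) and keep the one
# with the largest directional derivative d . grad h.
n = 3600
best_angle, best_rate = 0.0, -math.inf
for k in range(n):
    a = 2 * math.pi * k / n
    rate = math.cos(a) * gx + math.sin(a) * gy
    if rate > best_rate:
        best_angle, best_rate = a, rate

gradient_angle = math.atan2(gy, gx)  # direction of grad h
gradient_norm = math.hypot(gx, gy)   # |grad h|
```

Up to the sampling resolution, best_angle agrees with gradient_angle and best_rate agrees with gradient_norm, matching the claim that the gradient points along the steepest slope and its length is that slope.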

At a point p, (∇f)(p) is perpendicular to the level set {x | f(x)=f(p)} passing through p.

If dr points along the contours of f, where the function is constant, then df will be zero. Since df is the dot product of dr and grad f, that means that these two vectors, dr and grad f, must be at right angles, i.e. the gradient is at right angles to the contours.
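As a worked check using the example h(x, y)=x²+y² from above, parametrise the contour of radius R as r(θ)=(R cos θ, R sin θ), where R and θ are symbols introduced here only for illustration. A displacement along the contour is then

d\mathbf {r} =(-R\sin \theta ,R\cos \theta )\,d\theta

and its dot product with the gradient 2r vanishes:

dh=\nabla h\cdot d\mathbf {r} =2(R\cos \theta ,R\sin \theta )\cdot (-R\sin \theta ,R\cos \theta )\,d\theta =2R^{2}(-\cos \theta \sin \theta +\sin \theta \cos \theta )\,d\theta =0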

Algebraic properties

Like d/dx, ∇ is linear. For any pair of constants, a and b, and any pair of scalar functions, f and g

{\frac {d}{dx}}(af+bg)=a{\frac {d}{dx}}f+b{\frac {d}{dx}}g\quad \nabla (af+bg)=a\nabla f+b\nabla g
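Linearity can be spot-checked numerically. This is a sketch only: the functions f and g, the constants a and b, the point p and the central-difference helper gradient are all illustrative assumptions, and agreement at one point is a sanity check rather than a proof.

```python
import math

def gradient(func, p, h=1e-5):
    """Central-difference estimate of the gradient of func at p."""
    grad = []
    for j in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[j] += h
        backward[j] -= h
        grad.append((func(forward) - func(backward)) / (2 * h))
    return grad

# Hypothetical scalar functions of (x, y).
def f(v):
    x, y = v
    return x**2 * y

def g(v):
    x, y = v
    return math.sin(x) + y**3

a, b = 2.0, -3.0

def combo(v):
    return a * f(v) + b * g(v)

p = [0.5, 1.5]
lhs = gradient(combo, p)  # grad(af + bg)
rhs = [a * gf + b * gg for gf, gg in zip(gradient(f, p), gradient(g, p))]  # a grad f + b grad g
```

The two lists lhs and rhs agree to within floating-point rounding.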

Since ∇ acts like a vector, we can try taking its dot and cross product with other vectors, and with itself.