Multivariable Derivatives: Complete Guide
Multivariable calculus extends differentiation to functions of two or more variables. Instead of a single derivative, a function f(x, y) has partial derivatives — one for each variable — measuring how f changes as each variable changes independently.
The gradient vector ∇f collects all partial derivatives and points in the direction of steepest ascent. Directional derivatives generalize this to measure the rate of change in any direction.
Formula
∂f/∂x (hold y const) | ∂f/∂y (hold x const) | ∇f = (∂f/∂x, ∂f/∂y)
Partial Derivatives: Computing and Interpreting
Definition: The partial derivative ∂f/∂x at a point (a, b) is the ordinary derivative of f(x, b) with respect to x — you treat all other variables as constants.
Notation: ∂f/∂x, fₓ, ∂z/∂x, D₁f — all mean the same thing. Mixed partials: fₓᵧ = ∂²f/∂y∂x means differentiate first with respect to x, then y.
Computing example: f(x, y) = 3x²y + y³ − 5x. ∂f/∂x = 6xy − 5 (differentiate treating y as constant). ∂f/∂y = 3x² + 3y² (differentiate treating x as constant).
Geometric meaning: ∂f/∂x at (a, b) is the slope of the curve formed by slicing the surface z = f(x, y) with the plane y = b. Similarly ∂f/∂y is the slope in the y-direction at fixed x = a.
| Function f(x,y) | ∂f/∂x | ∂f/∂y |
|---|---|---|
| x²y³ | 2xy³ | 3x²y² |
| eˣ sin(y) | eˣ sin(y) | eˣ cos(y) |
| ln(x² + y²) | 2x/(x²+y²) | 2y/(x²+y²) |
| x³ + 3xy + y² | 3x² + 3y | 3x + 2y |
| sin(xy) | y·cos(xy) | x·cos(xy) |
| √(x² + y²) | x/√(x²+y²) | y/√(x²+y²) |
Higher-Order and Mixed Partial Derivatives
Second partial derivatives: fₓₓ = ∂²f/∂x² (differentiate twice w.r.t. x), fᵧᵧ = ∂²f/∂y², fₓᵧ = ∂²f/∂y∂x (mixed — first x then y).
Clairaut's Theorem: If f and its second partials are continuous, then the mixed partials are equal: fₓᵧ = fᵧₓ. This means the order of differentiation doesn't matter for most functions encountered in practice.
Example: f(x, y) = x²y³. fₓ = 2xy³. fₓₓ = 2y³. fₓᵧ = 6xy². fᵧ = 3x²y². fᵧₓ = 6xy². Confirmed: fₓᵧ = fᵧₓ = 6xy².
Laplacian: The operator ∇²f = fₓₓ + fᵧᵧ (sum of unmixed second partials) appears in physics — harmonic functions satisfy ∇²f = 0 (heat, fluid flow, electrostatics).
The Gradient Vector
Definition: For f(x, y), the gradient is ∇f = (∂f/∂x, ∂f/∂y). For f(x, y, z): ∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z). It is a vector-valued function.
Key property: ∇f at a point (a, b) points in the direction of steepest ascent of the surface z = f(x, y). Its magnitude |∇f| gives the rate of change in that steepest direction.
Level curves: The gradient ∇f is always perpendicular (normal) to the level curves f(x, y) = c. This is why gradient descent works — you always move perpendicular to contours.
Example: f(x, y) = x² + 4y². ∇f = (2x, 8y). At point (1, 1): ∇f = (2, 8). Direction: (2, 8)/|(2,8)| = (2, 8)/√68. The steepest slope is |∇f| = √68 ≈ 8.25.
Directional Derivatives
Definition: The directional derivative D_u f at (a, b) in direction of unit vector u = (u₁, u₂) is: D_u f = ∇f · u = fₓ u₁ + fᵧ u₂.
Interpretation: D_u f measures the instantaneous rate of change of f as you move from (a, b) in direction u. It generalizes ∂f/∂x (which is D_u f for u = (1, 0)) and ∂f/∂y (u = (0, 1)).
Maximum rate of change: The maximum value of D_u f is |∇f|, achieved when u points in the gradient direction. The minimum is −|∇f| (steepest descent). Zero rate of change when u is perpendicular to ∇f (moving along a level curve).
Example: f(x, y) = sin(x) + 2y at (π/2, 0). Direction: 30° from x-axis → u = (cos30°, sin30°) = (√3/2, 1/2). ∇f = (cos x, 2) = (0, 2) at (π/2, 0). D_u f = (0)(√3/2) + (2)(1/2) = 1.
Multivariable Chain Rule
Case 1 — one intermediate variable: If z = f(x, y), x = x(t), y = y(t), then dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt).
Case 2 — two intermediate variables: If z = f(x, y), x = x(s, t), y = y(s, t), then ∂z/∂s = (∂f/∂x)(∂x/∂s) + (∂f/∂y)(∂y/∂s) and similarly for ∂z/∂t.
Example: z = x² + y², x = cos(t), y = sin(t). dz/dt = 2x(−sin t) + 2y(cos t) = 2cos(t)(−sin t) + 2sin(t)(cos t) = 0. This makes sense: x² + y² = 1 is constant on the unit circle, so dz/dt = 0.
Implicit differentiation: For F(x, y) = 0, dy/dx = −(∂F/∂x)/(∂F/∂y) — the implicit function theorem in one variable.
Applications of Multivariable Derivatives
Optimization: Local extrema of f(x, y) require ∂f/∂x = 0 and ∂f/∂y = 0 simultaneously. Use the second derivative test (discriminant D = fₓₓfᵧᵧ − (fₓᵧ)²): D > 0 and fₓₓ > 0 → local min; D > 0 and fₓₓ < 0 → local max; D < 0 → saddle point.
Tangent planes: The tangent plane to z = f(x, y) at (a, b, f(a,b)) is: z = f(a,b) + fₓ(a,b)(x−a) + fᵧ(a,b)(y−b). This generalizes the tangent line from single-variable calculus.
Linear approximation: L(x, y) = f(a,b) + fₓ(a,b)(x−a) + fᵧ(a,b)(y−b). For small changes: Δf ≈ fₓ Δx + fᵧ Δy — used in error propagation.
Physics applications: Temperature gradient ∇T tells heat flow direction. Pressure gradient ∇P drives fluid flow (Navier-Stokes). In neural networks, backpropagation computes ∇Loss with respect to all weights via the chain rule.