Guide12 min read

Multivariable Derivatives: Complete Guide

Multivariable calculus extends differentiation to functions of two or more variables. Instead of a single derivative, a function f(x, y) has partial derivatives — one for each variable — measuring how f changes as each variable changes independently.

The gradient vector ∇f collects all partial derivatives and points in the direction of steepest ascent. Directional derivatives generalize this to measure the rate of change in any direction.

Formula

∂f/∂x (hold y const) | ∂f/∂y (hold x const) | ∇f = (∂f/∂x, ∂f/∂y)

∂f/∂x = Partial derivative with respect to x — treat y as constant∂f/∂y = Partial derivative with respect to y — treat x as constant∇f = Gradient vector — direction of steepest increaseD_u f = Directional derivative in direction u = ∇f · u

Partial Derivatives: Computing and Interpreting

Definition: The partial derivative ∂f/∂x at a point (a, b) is the ordinary derivative of f(x, b) with respect to x — you treat all other variables as constants.

Notation: ∂f/∂x, fₓ, ∂z/∂x, D₁f — all mean the same thing. Mixed partials: fₓᵧ = ∂²f/∂y∂x means differentiate first with respect to x, then y.

Computing example: f(x, y) = 3x²y + y³ − 5x. ∂f/∂x = 6xy − 5 (differentiate treating y as constant). ∂f/∂y = 3x² + 3y² (differentiate treating x as constant).

Geometric meaning: ∂f/∂x at (a, b) is the slope of the curve formed by slicing the surface z = f(x, y) with the plane y = b. Similarly ∂f/∂y is the slope in the y-direction at fixed x = a.

Function f(x,y)∂f/∂x∂f/∂y
x²y³2xy³3x²y²
eˣ sin(y)eˣ sin(y)eˣ cos(y)
ln(x² + y²)2x/(x²+y²)2y/(x²+y²)
x³ + 3xy + y²3x² + 3y3x + 2y
sin(xy)y·cos(xy)x·cos(xy)
√(x² + y²)x/√(x²+y²)y/√(x²+y²)

Higher-Order and Mixed Partial Derivatives

Second partial derivatives: fₓₓ = ∂²f/∂x² (differentiate twice w.r.t. x), fᵧᵧ = ∂²f/∂y², fₓᵧ = ∂²f/∂y∂x (mixed — first x then y).

Clairaut's Theorem: If f and its second partials are continuous, then the mixed partials are equal: fₓᵧ = fᵧₓ. This means the order of differentiation doesn't matter for most functions encountered in practice.

Example: f(x, y) = x²y³. fₓ = 2xy³. fₓₓ = 2y³. fₓᵧ = 6xy². fᵧ = 3x²y². fᵧₓ = 6xy². Confirmed: fₓᵧ = fᵧₓ = 6xy².

Laplacian: The operator ∇²f = fₓₓ + fᵧᵧ (sum of unmixed second partials) appears in physics — harmonic functions satisfy ∇²f = 0 (heat, fluid flow, electrostatics).

The Gradient Vector

Definition: For f(x, y), the gradient is ∇f = (∂f/∂x, ∂f/∂y). For f(x, y, z): ∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z). It is a vector-valued function.

Key property: ∇f at a point (a, b) points in the direction of steepest ascent of the surface z = f(x, y). Its magnitude |∇f| gives the rate of change in that steepest direction.

Level curves: The gradient ∇f is always perpendicular (normal) to the level curves f(x, y) = c. This is why gradient descent works — you always move perpendicular to contours.

Example: f(x, y) = x² + 4y². ∇f = (2x, 8y). At point (1, 1): ∇f = (2, 8). Direction: (2, 8)/|(2,8)| = (2, 8)/√68. The steepest slope is |∇f| = √68 ≈ 8.25.

Directional Derivatives

Definition: The directional derivative D_u f at (a, b) in direction of unit vector u = (u₁, u₂) is: D_u f = ∇f · u = fₓ u₁ + fᵧ u₂.

Interpretation: D_u f measures the instantaneous rate of change of f as you move from (a, b) in direction u. It generalizes ∂f/∂x (which is D_u f for u = (1, 0)) and ∂f/∂y (u = (0, 1)).

Maximum rate of change: The maximum value of D_u f is |∇f|, achieved when u points in the gradient direction. The minimum is −|∇f| (steepest descent). Zero rate of change when u is perpendicular to ∇f (moving along a level curve).

Example: f(x, y) = sin(x) + 2y at (π/2, 0). Direction: 30° from x-axis → u = (cos30°, sin30°) = (√3/2, 1/2). ∇f = (cos x, 2) = (0, 2) at (π/2, 0). D_u f = (0)(√3/2) + (2)(1/2) = 1.

Multivariable Chain Rule

Case 1 — one intermediate variable: If z = f(x, y), x = x(t), y = y(t), then dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt).

Case 2 — two intermediate variables: If z = f(x, y), x = x(s, t), y = y(s, t), then ∂z/∂s = (∂f/∂x)(∂x/∂s) + (∂f/∂y)(∂y/∂s) and similarly for ∂z/∂t.

Example: z = x² + y², x = cos(t), y = sin(t). dz/dt = 2x(−sin t) + 2y(cos t) = 2cos(t)(−sin t) + 2sin(t)(cos t) = 0. This makes sense: x² + y² = 1 is constant on the unit circle, so dz/dt = 0.

Implicit differentiation: For F(x, y) = 0, dy/dx = −(∂F/∂x)/(∂F/∂y) — the implicit function theorem in one variable.

Applications of Multivariable Derivatives

Optimization: Local extrema of f(x, y) require ∂f/∂x = 0 and ∂f/∂y = 0 simultaneously. Use the second derivative test (discriminant D = fₓₓfᵧᵧ − (fₓᵧ)²): D > 0 and fₓₓ > 0 → local min; D > 0 and fₓₓ < 0 → local max; D < 0 → saddle point.

Tangent planes: The tangent plane to z = f(x, y) at (a, b, f(a,b)) is: z = f(a,b) + fₓ(a,b)(x−a) + fᵧ(a,b)(y−b). This generalizes the tangent line from single-variable calculus.

Linear approximation: L(x, y) = f(a,b) + fₓ(a,b)(x−a) + fᵧ(a,b)(y−b). For small changes: Δf ≈ fₓ Δx + fᵧ Δy — used in error propagation.

Physics applications: Temperature gradient ∇T tells heat flow direction. Pressure gradient ∇P drives fluid flow (Navier-Stokes). In neural networks, backpropagation computes ∇Loss with respect to all weights via the chain rule.

Frequently Asked Questions