MODULE 0.1
| | |
|---|---|
| Difficulty | Beginner |
| Estimated Time | 3–4 hours |
| Prerequisites | High school algebra, basic trigonometry |
Differential equations are the language of change. Every model in physics, biology, engineering, and economics that describes how a quantity evolves over time or space ultimately rests on two pillars: calculus (the machinery of derivatives and integrals) and linear algebra (the framework for systems, transformations, and stability analysis).
Without fluency in differentiation rules you cannot even write down the equations; without integration techniques you cannot solve them analytically; and without matrix theory you cannot handle systems of equations or analyse stability of equilibria. This module rebuilds those foundations concisely so that every subsequent module can proceed at full speed.
If you can already differentiate products, integrate by parts, and compute eigenvalues confidently, skim the definitions and jump to the exercises. If not, work through every example — the time invested here will repay itself tenfold in Levels 1 through 4.
After completing this module you will be able to:
The derivative of a function \(f\) at a point \(x=a\) is defined as
$$ f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h}, $$provided the limit exists. When the limit exists for every point in an open interval, we say \(f\) is differentiable on that interval.
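The limit definition can be watched converging numerically. The sketch below uses the illustrative choice \(f(x) = x^2\) at \(a = 3\), where the exact derivative is \(f'(3) = 6\):

```python
# Numerical sketch of the limit definition for f(x) = x^2 at a = 3,
# where the exact derivative is 6 (an illustrative choice).
def difference_quotient(f, a, h):
    """The quotient (f(a+h) - f(a)) / h from the definition of f'(a)."""
    return (f(a + h) - f(a)) / h

f = lambda x: x**2
for h in [1e-1, 1e-3, 1e-5, 1e-7]:
    q = difference_quotient(f, 3.0, h)
    print(f"h = {h:.0e}: quotient = {q:.8f}, error = {abs(q - 6.0):.2e}")
```

For this particular \(f\) the quotient works out to exactly \(6 + h\), so the error shrinks linearly with \(h\).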
Let \(f\) be bounded on \([a,b]\). Partition the interval into \(n\) subintervals of width \(\Delta x_k\) with sample points \(x_k^*\). The definite integral is
$$ \int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{k=1}^{n} f(x_k^*)\,\Delta x_k, $$provided the limit exists and is independent of the choice of partition and sample points.
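A concrete Riemann sum makes the definition tangible. This sketch takes \(f(x) = x^2\) on \([0,1]\) (exact integral \(1/3\)) with equal subintervals and midpoint sample points — one of many valid choices of partition and sample points:

```python
import numpy as np

# Riemann sum for f(x) = x^2 on [0, 1] with midpoint sample points;
# the exact value is 1/3 (an illustrative choice of f and partition).
def riemann_sum(f, a, b, n):
    """Riemann sum with n equal subintervals, sampled at midpoints."""
    dx = (b - a) / n
    midpoints = a + (np.arange(n) + 0.5) * dx
    return np.sum(f(midpoints)) * dx

for n in [10, 100, 1000]:
    s = riemann_sum(lambda x: x**2, 0.0, 1.0, n)
    print(f"n = {n:>4d}: sum = {s:.8f}, error = {abs(s - 1/3):.2e}")
```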
If \(f\) is infinitely differentiable at \(x=a\), its Taylor series about \(a\) is
$$ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n. $$When \(a=0\) this is called the Maclaurin series. The series converges to \(f(x)\) inside the radius of convergence \(R\).
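Convergence of the partial sums can be observed directly. The sketch below uses \(\cos(x)\) at \(x=1\) (an illustrative choice; here \(R = \infty\), so any \(x\) works):

```python
import math

# Partial sums of the Maclaurin series for cos(x), evaluated at x = 1 —
# a sketch of convergence (the radius of convergence is infinite).
def cos_partial_sum(x, n_terms):
    """Sum of the first n_terms nonzero terms: (-1)^k x^(2k) / (2k)!."""
    return sum((-1)**k * x**(2*k) / math.factorial(2*k) for k in range(n_terms))

for n in [1, 2, 4, 8]:
    s = cos_partial_sum(1.0, n)
    print(f"{n} term(s): {s:.10f}, error = {abs(s - math.cos(1.0)):.2e}")
```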
An \(m \times n\) matrix \(A = [a_{ij}]\) is a rectangular array of real (or complex) numbers with \(m\) rows and \(n\) columns. If \(A\) is \(m\times p\) and \(B\) is \(p\times n\), their product \(C = AB\) is the \(m\times n\) matrix with entries
$$ c_{ij} = \sum_{k=1}^{p} a_{ik}\,b_{kj}. $$Matrix multiplication is associative and distributive but not commutative in general.
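The entrywise formula can be transcribed into explicit loops and compared against NumPy's `@` operator — a sketch for small matrices only (NumPy's implementation is vastly faster):

```python
import numpy as np

# The definition c_ij = sum_k a_ik b_kj, written as explicit loops.
def matmul_from_definition(A, B):
    m, p = A.shape
    p2, n = B.shape
    assert p == p2, "inner dimensions must agree"
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for k in range(p):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[4.0, 1.0], [2.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
print(np.allclose(matmul_from_definition(A, B), A @ B))  # True
print(np.allclose(A @ B, B @ A))                         # False: not commutative
```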
Let \(A\) be an \(n\times n\) matrix. A scalar \(\lambda\) is an eigenvalue of \(A\) if there exists a nonzero vector \(\mathbf{v}\) such that
$$ A\mathbf{v} = \lambda\mathbf{v}. $$The vector \(\mathbf{v}\) is called an eigenvector associated with \(\lambda\). Eigenvalues are found by solving the characteristic equation \(\det(A - \lambda I) = 0\).
A vector space over \(\mathbb{R}\) is a set \(V\) equipped with two operations — vector addition and scalar multiplication — satisfying the following axioms for all \(\mathbf{u}, \mathbf{v}, \mathbf{w} \in V\) and all \(\alpha, \beta \in \mathbb{R}\):
For a \(2\times 2\) matrix \(A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\), the determinant is \(\det(A) = ad - bc\). For an \(n\times n\) matrix the determinant is defined recursively by cofactor expansion along any row or column:
$$ \det(A) = \sum_{j=1}^{n} (-1)^{i+j}\, a_{ij}\, M_{ij}, $$where \(M_{ij}\) is the determinant of the \((n-1)\times(n-1)\) submatrix obtained by deleting row \(i\) and column \(j\). A matrix is invertible if and only if \(\det(A)\neq 0\).
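The recursive definition translates directly into code. This sketch expands along the first row; it runs in \(O(n!)\) time, so it is only for tiny matrices, but it cross-checks `np.linalg.det`:

```python
import numpy as np

# Cofactor expansion along the first row, transcribed from the definition
# (O(n!) — fine for tiny matrices, never for large ones).
def det_cofactor(A):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Minor M_{1j}: delete row 0 and column j
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_cofactor(minor)
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det_cofactor(A))       # 8.0
print(np.linalg.det(A))
```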
Part I. If \(f\) is continuous on \([a,b]\) and \(F(x) = \int_a^x f(t)\,dt\), then \(F\) is differentiable on \((a,b)\) and \(F'(x) = f(x)\).
Part II. If \(F\) is any antiderivative of \(f\) on \([a,b]\) (i.e., \(F' = f\)), then
$$ \int_a^b f(x)\,dx = F(b) - F(a). $$

If \(f\) has \(n+1\) continuous derivatives on an interval containing \(a\) and \(x\), then
$$ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + R_n(x), $$where the Lagrange remainder is
$$ R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1} $$for some \(c\) between \(a\) and \(x\). This provides an explicit error bound for polynomial approximations.
If \(A\) is a real symmetric \(n\times n\) matrix (\(A = A^T\)), then:
- all eigenvalues of \(A\) are real;
- eigenvectors corresponding to distinct eigenvalues are orthogonal;
- \(A\) is orthogonally diagonalisable: \(A = PDP^T\) with \(P\) orthogonal and \(D\) diagonal.
Problem. Evaluate \(\displaystyle\int \frac{x}{(x-1)(x+2)}\,dx\).
Write
$$ \frac{x}{(x-1)(x+2)} = \frac{A}{x-1} + \frac{B}{x+2}. $$Multiply both sides by \((x-1)(x+2)\):
$$ x = A(x+2) + B(x-1). $$Set \(x=1\): \(1 = A(3) \Rightarrow A = \frac{1}{3}\).
Set \(x=-2\): \(-2 = B(-3) \Rightarrow B = \frac{2}{3}\). Therefore
$$ \int \frac{x}{(x-1)(x+2)}\,dx = \frac{1}{3}\ln|x-1| + \frac{2}{3}\ln|x+2| + C. $$
Partial fractions reduce a complicated rational integrand into a sum of simple logarithmic integrals. This technique will reappear when solving linear ODEs via the Laplace transform (Level 1, Module 5) and when performing inverse Laplace transforms.
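The decomposition (and the antiderivative \(F(x) = \frac{1}{3}\ln|x-1| + \frac{2}{3}\ln|x+2|\) it produces) can be spot-checked numerically — a sketch, not a proof:

```python
import numpy as np

# Spot-check x/((x-1)(x+2)) = (1/3)/(x-1) + (2/3)/(x+2) at sample points,
# and verify that F(x) = (1/3)ln|x-1| + (2/3)ln|x+2| differentiates back
# to the integrand via central differences.
integrand = lambda x: x / ((x - 1) * (x + 2))
decomposition = lambda x: (1/3) / (x - 1) + (2/3) / (x + 2)
F = lambda x: (1/3) * np.log(np.abs(x - 1)) + (2/3) * np.log(np.abs(x + 2))

xs = np.array([2.0, 3.5, 5.0, -4.0])  # stay away from the poles x = 1, x = -2
print(np.allclose(integrand(xs), decomposition(xs)))   # True

h = 1e-6
F_prime = (F(xs + h) - F(xs - h)) / (2 * h)
print(np.allclose(F_prime, integrand(xs)))             # True
```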
Problem. Find all eigenvalues and eigenvectors of \(A = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}\).
The characteristic equation is \(\det(A - \lambda I) = (4-\lambda)(3-\lambda) - 2 = 0\). Expanding:
$$ \lambda^2 - 7\lambda + 10 = 0 \quad\Rightarrow\quad (\lambda - 5)(\lambda - 2) = 0. $$So \(\lambda_1 = 5\) and \(\lambda_2 = 2\).
Solve \((A - 5I)\mathbf{v} = \mathbf{0}\):
$$ \begin{pmatrix} -1 & 1 \\ 2 & -2 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. $$Both rows give \(-v_1 + v_2 = 0\), so \(v_2 = v_1\). Choosing \(v_1 = 1\):
$$ \mathbf{v}_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. $$Solve \((A - 2I)\mathbf{v} = \mathbf{0}\):
$$ \begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. $$This gives \(2v_1 + v_2 = 0\), so \(v_2 = -2v_1\). Choosing \(v_1 = 1\):
$$ \mathbf{v}_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}. $$The matrix \(A\) stretches vectors along the direction \((1,1)^T\) by a factor of 5 and along \((1,-2)^T\) by a factor of 2. In Level 1, eigenvalues of the coefficient matrix will determine the qualitative behaviour (growth, decay, oscillation) of solutions to systems of linear ODEs.
Problem. Find the Maclaurin series for \(e^x\) up to order 4, and bound the error when approximating \(e^{0.5}\).
Since all derivatives of \(e^x\) are \(e^x\), evaluated at \(a=0\) each gives \(f^{(n)}(0) = 1\). Thus:
$$ e^x \approx 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24}. $$Evaluating at \(x = 0.5\) gives \(P_4(0.5) = 1.6484375 \approx 1.64844\). The actual value is \(e^{0.5} \approx 1.64872\).
By Taylor's theorem, \(|R_4(0.5)| \le \frac{e^c}{5!}(0.5)^5\) for some \(c\in[0,0.5]\). Since \(e^c \le e^{0.5} < 2\):
$$ |R_4(0.5)| \le \frac{2}{120}\cdot\frac{1}{32} = \frac{2}{3840} \approx 0.00052. $$The actual error \(|1.64872 - 1.64844| = 0.00028\) is indeed within this bound.
Taylor polynomial approximations are the foundation of series solution methods for differential equations (Level 1, Module 6). Knowing how to bound the remainder tells you how many terms you need for a desired accuracy.
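The computation in Example 3 takes only a few lines to reproduce, which makes it easy to experiment with other values of \(x\) or higher orders:

```python
import math

# Reproduce Example 3: degree-4 Maclaurin polynomial for e^x at x = 0.5,
# the actual error, and the Lagrange bound 2/(5! * 2^5).
x = 0.5
p4 = sum(x**k / math.factorial(k) for k in range(5))  # 1 + x + ... + x^4/24
actual_error = abs(math.exp(x) - p4)
bound = 2 / math.factorial(5) * x**5    # uses e^c <= e^0.5 < 2 on [0, 0.5]

print(f"P4(0.5)        = {p4:.8f}")
print(f"exp(0.5)       = {math.exp(x):.8f}")
print(f"actual error   = {actual_error:.2e}")
print(f"Lagrange bound = {bound:.2e}")
```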
The following code blocks are designed to run in a Pyodide (browser-based Python) environment or in any standard Python 3 installation with NumPy. Copy each block and experiment with different inputs.
This lab demonstrates matrix multiplication, determinant computation, matrix inversion, and eigenvalue/eigenvector calculation.
import numpy as np
# Define two matrices
A = np.array([[4, 1],
              [2, 3]])
B = np.array([[1, 0],
              [0, 2]])
# Matrix multiplication
print("A @ B =")
print(A @ B)
print("\nB @ A =")
print(B @ A)
print("\nNote: A @ B != B @ A (non-commutativity)")
# Determinant
det_A = np.linalg.det(A)
print(f"\ndet(A) = {det_A:.4f}")
# Inverse
A_inv = np.linalg.inv(A)
print(f"\nA^(-1) =\n{A_inv}")
print(f"\nVerification A @ A^(-1) =\n{A @ A_inv}")
# Eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print(f"\nEigenvalues: {eigenvalues}")
print(f"Eigenvectors (as columns):\n{eigenvectors}")
# Verify: A @ v = lambda * v for first eigenpair
lam1 = eigenvalues[0]
v1 = eigenvectors[:, 0]
print(f"\nA @ v1 = {A @ v1}")
print(f"lambda1 * v1 = {lam1 * v1}")
print(f"Match: {np.allclose(A @ v1, lam1 * v1)}")
Expected output highlights: Eigenvalues are 5.0 and 2.0, matching our hand calculation in Example 2. The product \(A \cdot B\) differs from \(B \cdot A\), confirming non-commutativity.
We approximate the derivative of \(f(x) = \sin(x)\) using central differences and the integral of \(f(x) = e^{-x^2}\) using the trapezoidal rule.
import numpy as np
# --- Numerical Differentiation ---
def central_diff(f, x, h=1e-5):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)
f = np.sin
x0 = np.pi / 4
approx_deriv = central_diff(f, x0)
exact_deriv = np.cos(x0)
print("=== Numerical Differentiation ===")
print(f"f(x) = sin(x) at x = pi/4")
print(f"Approximate f'(x) = {approx_deriv:.10f}")
print(f"Exact f'(x) = cos(pi/4) = {exact_deriv:.10f}")
print(f"Absolute error = {abs(approx_deriv - exact_deriv):.2e}")
# --- Numerical Integration (Trapezoidal Rule) ---
def trapezoidal(f, a, b, n):
    """Composite trapezoidal rule with n subintervals."""
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    return h * (0.5 * y[0] + np.sum(y[1:-1]) + 0.5 * y[-1])
g = lambda x: np.exp(-x**2)
print("\n=== Numerical Integration ===")
print("Integral of exp(-x^2) from 0 to 1")
for n in [10, 100, 1000]:
    result = trapezoidal(g, 0, 1, n)
    print(f"  n = {n:>5d}: I = {result:.10f}")
# Compare with known value: integral = sqrt(pi)/2 * erf(1)
from math import erf, sqrt, pi
exact_integral = sqrt(pi) / 2 * erf(1)
print(f" Exact: I = {exact_integral:.10f}")
Expected output highlights: The central difference approximation of \(\cos(\pi/4)\) is accurate to about \(10^{-11}\). The trapezoidal rule converges to the exact value \(\frac{\sqrt{\pi}}{2}\,\text{erf}(1) \approx 0.7468241328\) as \(n\) increases.
Imagine a calculus student as a reinforcement-learning agent whose task is to evaluate integrals. We can formalise this as follows:
| RL Concept | Calculus Analogue |
|---|---|
| State | The current form of the integrand, e.g., \(\frac{x}{(x-1)(x+2)}\). The state encodes the algebraic structure: is it a product, a composition, a rational function, a trigonometric expression? |
| Action | The integration technique selected: substitution (\(u\)-sub), integration by parts, partial fractions, trigonometric identities, or "simplify first." The agent must choose one action per step. |
| Reward / Utility | A positive reward for reaching a closed-form antiderivative. A small negative cost for each step taken (penalising inefficient strategies). A large negative penalty for an incorrect manipulation (sign error, wrong decomposition form). |
| Learning / Policy Update | After many practice problems the student internalises a policy: "If I see a rational function with distinct linear factors, apply partial fractions. If I see a product of a polynomial and an exponential, use integration by parts." This policy maps states to actions and improves with experience — exactly like Q-learning or policy gradient. |
This framing reveals why practice is effective: each solved integral updates the student's internal policy, strengthening the association between structural features of the integrand (state) and the technique that leads most efficiently to a solution (optimal action). It also explains why students who memorise formulas without practising (no exploration) develop brittle policies that fail on novel integrands.
Exercise A1. Differentiate \(f(x) = x^2\,e^{3x}\,\sin(x)\) using the product rule (treat as a product of three functions).
Using the product rule for three functions \((uvw)' = u'vw + uv'w + uvw'\):
Let \(u = x^2\), \(v = e^{3x}\), \(w = \sin(x)\).
$$ f'(x) = 2x\,e^{3x}\sin(x) + 3x^2\,e^{3x}\sin(x) + x^2\,e^{3x}\cos(x). $$Factor out \(x\,e^{3x}\):
$$ f'(x) = x\,e^{3x}\bigl[2\sin(x) + 3x\sin(x) + x\cos(x)\bigr]. $$Exercise A2. Evaluate \(\displaystyle\int x^2\,e^{-x}\,dx\) using integration by parts (you will need to apply it twice).
First application: let \(u = x^2\), \(dv = e^{-x}\,dx\), so \(du = 2x\,dx\), \(v = -e^{-x}\).
$$ \int x^2 e^{-x}\,dx = -x^2 e^{-x} + 2\int x\,e^{-x}\,dx. $$Second application: let \(u = x\), \(dv = e^{-x}\,dx\), so \(du = dx\), \(v = -e^{-x}\).
$$ \int x\,e^{-x}\,dx = -x\,e^{-x} + \int e^{-x}\,dx = -x\,e^{-x} - e^{-x}. $$Combining:
$$ \int x^2 e^{-x}\,dx = -x^2 e^{-x} + 2\bigl(-x\,e^{-x} - e^{-x}\bigr) + C = -e^{-x}(x^2 + 2x + 2) + C. $$Exercise A3. Compute the determinant and inverse of \(A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{pmatrix}\).
Cofactor expansion along the first row:
$$ \det(A) = 2(3\cdot2 - 1\cdot1) - 1(1\cdot2 - 1\cdot0) + 0 = 2(5) - 1(2) = 8. $$Since \(\det(A) = 8 \neq 0\), \(A\) is invertible. The matrix of cofactors is:
$$ C = \begin{pmatrix} 5 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 5 \end{pmatrix}. $$The adjugate is \(\text{adj}(A) = C^T = \begin{pmatrix} 5 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 5 \end{pmatrix}\) (symmetric in this case). Therefore:
$$ A^{-1} = \frac{1}{8}\begin{pmatrix} 5 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 5 \end{pmatrix}. $$Exercise A4. Find the eigenvalues and eigenvectors of \(B = \begin{pmatrix} 3 & -1 \\ -1 & 3 \end{pmatrix}\). Verify that the eigenvectors are orthogonal.
Characteristic equation: \((3-\lambda)^2 - 1 = 0 \Rightarrow \lambda^2 - 6\lambda + 8 = 0 \Rightarrow (\lambda-4)(\lambda-2) = 0\).
So \(\lambda_1 = 4\), \(\lambda_2 = 2\).
For \(\lambda_1 = 4\): \((B - 4I)\mathbf{v} = \mathbf{0}\) gives \(-v_1 - v_2 = 0\), so \(\mathbf{v}_1 = \begin{pmatrix}1\\-1\end{pmatrix}\).
For \(\lambda_2 = 2\): \((B - 2I)\mathbf{v} = \mathbf{0}\) gives \(v_1 - v_2 = 0\), so \(\mathbf{v}_2 = \begin{pmatrix}1\\1\end{pmatrix}\).
Orthogonality check: \(\mathbf{v}_1 \cdot \mathbf{v}_2 = 1\cdot1 + (-1)\cdot1 = 0\). Confirmed (as expected by the Spectral Theorem, since \(B\) is symmetric).
Exercise C1. Using NumPy, verify that the inverse of \(A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{pmatrix}\) is \(\frac{1}{8}\begin{pmatrix} 5 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 5 \end{pmatrix}\) by computing np.linalg.inv(A) and checking that A @ A_inv is close to the identity.
Exercise C2. Write a Python function that uses the central difference formula to approximate the second derivative \(f''(x)\) via
$$ f''(x) \approx \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}. $$Test it on \(f(x)=\sin(x)\) at \(x=\pi/3\) and compare with the exact value \(f''(\pi/3) = -\sin(\pi/3) = -\frac{\sqrt{3}}{2}\).
import numpy as np
def second_derivative(f, x, h=1e-4):
    """Central difference approximation of f''(x)."""
    return (f(x + h) - 2*f(x) + f(x - h)) / h**2
x0 = np.pi / 3
approx = second_derivative(np.sin, x0)
exact = -np.sin(x0)
print(f"Approximate: {approx:.10f}")
print(f"Exact: {exact:.10f}")
print(f"Error: {abs(approx - exact):.2e}")
# Error should be on the order of h^2 = 1e-8
Exercise C3. Implement Simpson's rule in Python:
$$ \int_a^b f(x)\,dx \approx \frac{h}{3}\bigl[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 4f(x_{n-1}) + f(x_n)\bigr], $$where \(n\) is even and \(h = (b-a)/n\). Use it to compute \(\int_0^1 \frac{4}{1+x^2}\,dx\) (which equals \(\pi\)). How many subintervals are needed to get 8 correct decimal digits?
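Try the exercise yourself first; one possible implementation sketch is below (NumPy slicing picks out the odd- and even-indexed interior points):

```python
import numpy as np

# One possible Simpson's rule implementation (a sketch, not the only way).
def simpson(f, a, b, n):
    """Composite Simpson's rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    # y[1:-1:2] are the odd-indexed points (weight 4),
    # y[2:-1:2] the even-indexed interior points (weight 2)
    return h / 3 * (y[0] + 4 * np.sum(y[1:-1:2]) + 2 * np.sum(y[2:-1:2]) + y[-1])

f = lambda x: 4 / (1 + x**2)   # integral over [0, 1] equals pi
for n in [4, 16, 64, 256]:
    approx = simpson(f, 0.0, 1.0, n)
    print(f"n = {n:>4d}: I = {approx:.12f}, error = {abs(approx - np.pi):.2e}")
```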
Exercise C4. Use np.linalg.eig to find the eigenvalues of the \(4\times4\) matrix
$$ M = \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix}. $$
This tridiagonal matrix arises in finite-difference discretisations of second-order ODEs. The exact eigenvalues are \(\lambda_k = 2 - 2\cos\!\bigl(\frac{k\pi}{5}\bigr)\) for \(k=1,2,3,4\). Compare the numerical output with these exact values.
import numpy as np
M = np.array([[ 2, -1,  0,  0],
              [-1,  2, -1,  0],
              [ 0, -1,  2, -1],
              [ 0,  0, -1,  2]], dtype=float)
evals_num = np.sort(np.linalg.eig(M)[0].real)
evals_exact = np.sort([2 - 2*np.cos(k*np.pi/5) for k in range(1, 5)])
print("Numerical eigenvalues: ", evals_num)
print("Exact eigenvalues: ", evals_exact)
print("Max difference: ", np.max(np.abs(evals_num - evals_exact)))
Exercise G1. (Decision-making under uncertainty.) You are given the integral \(\int_0^{\infty} x^3\,e^{-2x}\,dx\). An agent must decide between two strategies: (a) integration by parts (applied repeatedly), or (b) recognising that this is a Gamma function \(\Gamma(4)/2^4\). Compute the integral both ways. Which strategy is more efficient? Design a heuristic rule that an integration-technique agent could use to detect when a Gamma function shortcut is available.
Method (b): Recognise \(\int_0^\infty x^{n-1} e^{-\alpha x}\,dx = \frac{\Gamma(n)}{\alpha^n}\). Here \(n=4\), \(\alpha=2\):
$$ \int_0^\infty x^3 e^{-2x}\,dx = \frac{\Gamma(4)}{2^4} = \frac{3!}{16} = \frac{6}{16} = \frac{3}{8}. $$Method (a): Three rounds of integration by parts yield the same answer \(\frac{3}{8}\) but take considerably longer.
Heuristic rule: "If the integrand has the form \(x^n e^{-\alpha x}\) on \([0,\infty)\) with \(n\) a positive integer and \(\alpha > 0\), use the Gamma function shortcut."
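The shortcut is easy to sanity-check numerically. The sketch below truncates the domain at \(x = 40\) (the neglected tail is astronomically small) and compares a trapezoidal estimate with \(\Gamma(4)/2^4 = 3/8\):

```python
import math

# Numerical check of the Gamma shortcut for the integral of x^3 e^(-2x)
# on [0, infinity), truncated to [0, 40] — a sketch, not a proof.
def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n subintervals."""
    h = (b - a) / n
    interior = sum(f(a + k * h) for k in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

f = lambda x: x**3 * math.exp(-2 * x)
numeric = trapezoid(f, 0.0, 40.0, 200_000)
exact = math.gamma(4) / 2**4   # = 3!/16 = 3/8

print(f"numeric      = {numeric:.10f}")
print(f"Gamma(4)/2^4 = {exact:.10f}")
```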
Exercise G2. (Exploration vs. exploitation.) When computing eigenvalues, an agent could (a) use the characteristic polynomial approach (exact but expensive for large \(n\)) or (b) use iterative numerical methods (approximate but scalable). For the matrix from Exercise C4, time both approaches in Python (use numpy.linalg.eig for the numerical method and numpy.polynomial for the characteristic polynomial). Plot computation time vs. matrix size for tridiagonal matrices of dimension \(n = 4, 8, 16, 32, 64, 128\). At what size does the numerical approach become clearly preferable? Frame this as an explore/exploit trade-off.
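A starting-point sketch for G2 is below. Note one wrinkle: `np.poly` would be circular here, since it builds the characteristic polynomial *from* the eigenvalues; this sketch therefore assumes the Faddeev–LeVerrier recurrence (a choice of this sketch, not prescribed by the exercise) to obtain the coefficients independently, then calls np.roots.

```python
import time
import numpy as np

# Compare the characteristic-polynomial route with np.linalg.eig on the
# -1, 2, -1 tridiagonal matrices from Exercise C4.
def charpoly_coeffs(A):
    """Coefficients [1, c1, ..., cn] of det(lambda I - A), Faddeev-LeVerrier."""
    n = A.shape[0]
    coeffs = [1.0]
    M = np.array(A, dtype=float)
    for k in range(1, n + 1):
        c = -np.trace(M) / k
        coeffs.append(c)
        M = A @ (M + c * np.eye(n))
    return coeffs

def tridiag(n):
    """The -1, 2, -1 tridiagonal matrix of dimension n."""
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Agreement check on the 4x4 case
poly_eigs = np.sort(np.roots(charpoly_coeffs(tridiag(4))).real)
direct_eigs = np.sort(np.linalg.eig(tridiag(4))[0].real)
print("char-poly roots:", poly_eigs)
print("np.linalg.eig:  ", direct_eigs)

# Rough timings; for large n the polynomial roots also become
# numerically unreliable, which is part of the explore/exploit story.
for n in [4, 16, 64, 128]:
    A = tridiag(n)
    t0 = time.perf_counter()
    np.roots(charpoly_coeffs(A))
    t1 = time.perf_counter()
    np.linalg.eig(A)
    t2 = time.perf_counter()
    print(f"n = {n:>3d}: char-poly {t1 - t0:.2e} s, eig {t2 - t1:.2e} s")
```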
1. (b)
2. (b) — by parts, \([xe^x]_0^1 - \int_0^1 e^x\,dx = e - (e-1) = 1\)
3. (c)
4. (a)
5. (b) — \(5\cdot4-3\cdot2=14\)
6. (b)
7. (b)
8. (b)
9. (b)
10. (c)
Build a Python script (or Jupyter notebook) that takes a real symmetric \(n\times n\) matrix as input and produces:
Test your script on:
M = np.random.randn(5,5); A = (M + M.T)/2

| Criterion | Points | Expectations |
|---|---|---|
| Correctness of eigendecomposition | 8 | Eigenvalues and eigenvectors match NumPy output; \(PDP^T\) reconstruction error < \(10^{-12}\). |
| Code quality and documentation | 6 | Functions have docstrings; variable names are descriptive; no unnecessary global state. |
| Spectral decomposition verification | 5 | Explicit numerical check of \(A = PDP^T\) printed for each test matrix. |
| Visualisation: eigenvalue bar chart | 4 | Clear labels, appropriate axis scaling, title. |
| Visualisation: ellipse plot (2D case) | 5 | Unit circle and its image shown; eigenvector directions annotated. |
| Testing on all three required matrices | 2 | Output shown for each test case. |
You now have the calculus and linear algebra toolkit required for the rest of the course. In the next module you will learn what a differential equation actually is, how to classify them, and what it means for a function to be a solution.