\part{Numerical differentiation and integration}
\chapter{Numerical differentiation}
\section{Forward difference}
\section{Backwards difference}
\section{Balanced difference}
\section{Higher order methods}


\chapter{Numerical integration}


\section{Newton-Cotes integration}

\subsection{Riemann sum}

The definition of a Riemann integrable function offers a natural 
\[\int^{b}_{a}f(x)dx \approx \sum^{n}_{k=1} f(x_k)h\]
\[h= \frac{b-a}{n}\]
\[x_k = a+ nh\]
\section{Trapezoidal rule}
To reduce the error, one can consider the use of trapezoids along the mesh; hence the integral is estimated as a linear function over each partiton rather than as a constant function.
\[\int^{b}_{a}f(x)dx \approx \sum^{n-1}_{k=1} f(x_k)h\]
\[h= \frac{b-a}{n}\]
\[x_k = a+ nh\]
\[f_k = \frac{f(x_k)+f(x_{k+1})}{2}\]


\subsection{Romberg integration}
One can perform a series acceleration technique known as \emph{Richardson extrapolation} to derive higher order schemes for numerical integration. The use of this technique in numerical integration is formalized as \emph{Romberg integration}.

\[h_n = \frac{b-a}{2^{n+1}\]

\[R(0,0)=h_0(f(a)+f(b))\]
\[R(n,0)=\frac{1}{2}R(n-1,0)+ 2h_n \sum^{2^n -1}_{k=1} f(a+(2k-1)h_{n-1}) \]
\[R(n,m)=R(n,m-1)+\frac{1}{4^m -1}(R(n,m-1)-R(n-1,m-1)) \]


\subsubsection{Refined trapezoidal rule}


One can remodel the trapezoidal rule as an algorithm where tolerance is specified rather than the mesh refinement. The crux of the idea is to double the refinement of the mesh until the absolute error from the previous iteration is within tolerance, however there are two standard methods for iterating the algorithm:

\begin{itemize}
\item Weighted sum of previous value and Riemann sum with 'doubly refined mesh'
\item Calculating trapezoidal rule for 'doubly refined mesh'
\end{itemize}


\[h_n= \frac{b-a}{2^{n}}\]
\[x_{n,k} = a+ kh_n\]
\[I_{0}=\frac{f(a)+f(b)}{2}(a-b)\]
\[I_{n+1}= \frac{I_n}{2} + \frac{h_{n+1}}{2} \sum_{i=1}^{2^{n+1}} f(x_{n+1,i})\]


\subsection{Simpson's rule}

The theory of Romberg integration allows us to develop higher order schemes than the trapezoidal rule (0-order). \emph{Simpson's rule} is the 1-order algorithm in the Romberg integration scheme, and is otherwise observed by fitting quadratics to each partition rather than lines.

\section{Gaussian integration}
\subsection{Gauss-Legendre integration}
An idea with strong link to functional analysis and the idea of orthogonal polynomials allows the realization of an extremely powerful method of numerical integration. We develop our theory within the Hilbert space $L^{2}([-1,1])$ (this just means we're going to consider integration problems on $[-1,1]$; one can do the change of variables later).

Consider a function $f$ that can be represented well by a polynomial interpolation $h$. Let $(P_n)_{n \in \mathbb{N}}$ be the Legendre polynomials. Recall that $\int^{1}_{-1}P_n (x)x^k dx = 0$

By Euclidean polynomial division, we may have $h(x)=q(x) P_n (x) +r(x)$, where

\[\int^{1}_{-1} f(x)dx \approx \int^{1}_{-1} h(x)dx \]
\[ = \int^{1}_{-1} q(x) P_n (x) +r(x)dx \]
\[ = \int^{1}_{-1} q(x) P_n (x) +r(x)dx \]
\[ = \int^{1}_{-1} r(x)dx \]

Since $r$ is a polynomial of degree less than $n$, it can be interpolated \emph{exactly} by some Lagrange interpolation.
\[ r(x) = \sum^{n}_{i=1} \ell_i (x)r(x_i) \]

\[ = \int^{1}_{-1} \sum^{n}_{i=1} \ell_i (x)r(x_i)dx \]
\[ =  \sum^{n}_{i=1} r(x_i) \int^{1}_{-1}\ell_i (x)dx \]


\[w_i = \int^{1}_{-1} \ell_i (x) dx \]

\[ =  \sum^{n}_{i=1} r(x_i) w_i dx \]

This is close, however we want to deal with the function $h$ instead of $r$; how can we pick the $x_i$ so that $r$ coincides with $h$? 
Returning to the equation $h(x_i)=q(x_i) P_n (x_i) +r(x_i)$, we can see that one way to do this is by letting $x_i$ be the zeroes of $P_n$!


\[ \int^{1}_{-1} f(x)dx \approx \sum^{n}_{i=1} f(x_i) w_i dx \]

With the change of variables, we get the following.
\[\int^{b}_{a} f(x)dx =  \approx \frac{b-a}{2} \sum^{n}_{i=0} w_i f(\frac{b-a}{2}u_i + \frac{a+b}{2})\]


\section{Monte Carlo integration}

To integrate for higher dimensions, one approach is to 'nest' singular dimension integrals by setting $n-1$ variables as constants, integrating over this function, changing these $n-1$ variables slightly, integrating again etc. Needless to say, this becomes quite computationally expensive.

Probability theory offers a unique approach to integration that converges relatively slowly, however scales well for higher dimensions. It is an interesting consequence of the law of large numbers that states that the volume ('size' of the domain of integration) multiplied by mapings of random points within that volume equals the desired integral. This is known as \emph{Monte Carlo (MC) integration}.

\[ \mathrm{plim}_{n\to \infty} \frac{\sum^{n}_{i=1} f(\mathbf{u}_i)}{n} = \int_{\Omega} f(\mathbf{x}) d\mathbf{x} \]

\begin{itemize}
\item $\Omega$ is the domain of integration (and simultaenously the sample space we will sample on)
\item $U \sim \mathrm{U}(\Omega)$ is a random variable with a continuous uniform distribution on $\Omega$
\item $(\mathbf{u}_i)$ is a sequence of random samples of $U$ 
\item $\lambda(\Omega)$ represents the 'volume' of $\Omega$ (technically the Lebesgue measure of $\Omega$)

Note that some programming packages only have functions that generate continuous uniformly distributed samples on $[0,1]$; in this case one would need to either find a way to transform uniform samples on $[0,1]$ to the desired $\Omega$, or do a change of variables to the function.


Assuming that we are taking purely random samples (realistically a pseudo-random sequence would be used); there are methods that seek to reduce the error introduced by the variance of the continuous uniform random variable, the various class of methods used for this purpose form the study of \emph{MC variance reduction}.

One can improve the performance of MC by using a random sampler that is designed to sample in a way that values do not 'clump' together; these are called quasi-random sequences. Quasi-random sequences aim to preserve some properties of randomness, however may add some conditional bias. For example one may want that samples are more 'evenly distributed' given a set of previous samples; the Sobol and Halton sequences have this purpose. 

\subsection{MC variance reduction}
\subsubsection{Antithetic variates}
\subsubsection{Control variates}
\subsubsection{Control variates}
\subsubsection{Stratified sampling}
Division of domain into subdomains to sample on
There is also a recursive version, which recurses the stratified sampling technique
\subsubsection{Importance sampling}
Conducting more sampling where the value is known to have more 'extreme values'. Uniform random variable scaled by the function to introduce bias.