\part{Fundamentals}


\chapter{Probability}







\begin{definition}
A \emph{trial} is an experiment that produces a result, formally defined by a probability space (this comes later).
\end{definition}

\begin{definition}
An \emph{outcome} $\omega$ is a possible result of a trial. 
\end{definition}

\begin{definition}
	An \emph{event} is a set of outcomes of a trial.
\end{definition}

\begin{definition}
The \emph{sample space} $\Omega$ is the set of all outcomes of a trial.
\end{definition}

\begin{definition}
The \emph{event space} $\mathcal{F}$ is the set of all considered events of a trial.
\end{definition}
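
To make these definitions concrete, here is a small illustration. The die is an assumption chosen for this sketch and is not part of the definitions above. Take a single roll of a six-sided die as the trial. Then one natural choice is
\[ \Omega = \{1,2,3,4,5,6\}, \qquad \omega = 4 \in \Omega, \qquad E = \{2,4,6\} \subseteq \Omega, \]
where $\omega = 4$ is one possible outcome and $E$ is the event `the roll is even'. If we decide to consider every subset of $\Omega$ as an event, the event space is the power set $\mathcal{F} = \mathcal{P}(\Omega)$, which contains $2^6 = 64$ events.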








\section{Frequentist interpretation of probability}



Probability theory is a field of mathematics inspired by the idea of randomness, so its definitions come from the way we choose to think about and interpret probability. The main question we need to answer is this: how will we actually measure the probability of some event occurring?

Two common interpretations are the following.
\begin{itemize}
	\item Frequentist probability, which interprets probability as \emph{relative frequency}.
	\item Bayesian probability, which interprets probability as \emph{credence}.
\end{itemize}


The \emph{frequentist philosophy} interprets probability as a \emph{relative frequency}: the limiting proportion of trials in which an event occurs as the number of trials conducted approaches infinity. This interpretation is rather objective, and is inspired by the law of large numbers (which will be covered later).






The frequentist approach is easier and more concrete to introduce, so we will begin our study of probability theory by creating definitions motivated by the idea of relative frequency. 

Imagine you run $n$ trials of an experiment and record the number of trials in which some event $A$ occurred. The ratio of the number of trials in which the event occurred to the total number of trials is known as the \emph{relative frequency} of the event.

\[ f(A,n) = \frac{\text{number of the first } n \text{ trials in which } A \text{ occurred}}{n} \]

We note that the relative frequency can never be below $0$ or greater than $1$. Strictly speaking, $f(A,n)$ is not a function on the domain $\mathcal{F} \times \mathbb{N}$, since the value of $f$ depends on the specific run of the experiment; if you run $n$ trials on one day and $n$ trials on another, you may observe different results.

That said, we hope that if we could theoretically run experiments with infinitely many trials, every run would converge to the same relative frequency. This limiting value is known as the \emph{probability} of the event.

\[ \text{Probability of } A = \lim_{n\to \infty} f(A,n) = \lim_{n \to \infty} \frac{\text{number of the first } n \text{ trials in which } A \text{ occurred}}{n} \]
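
The bounds on the relative frequency can be checked directly. As a sketch, write $N_A(n)$ for the number of the first $n$ trials in which $A$ occurs (this notation is introduced here for convenience only). Since $0 \leq N_A(n) \leq n$,
\[ 0 \leq f(A,n) = \frac{N_A(n)}{n} \leq 1 . \]
Some outcome in $\Omega$ occurs in every trial, so $N_\Omega(n) = n$ and $f(\Omega,n) = 1$. If $A$ and $B$ cannot occur in the same trial, then $N_{A \cup B}(n) = N_A(n) + N_B(n)$, and so
\[ f(A \cup B, n) = f(A,n) + f(B,n) . \]
Assuming the limits exist, these properties carry over to the limiting values as well.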


We desire to construct a mathematical framework that can handle probabilities and prove theorems about them; this is the art of probability theory. The probabilities `inserted' into our mathematical model for some application are based on empirical sampling, as in the Cairns example below. However, one must take great care to ensure that these values accurately represent the system being modelled; this is the art of statistics.

Probability theory is quite a powerful tool. In fact, we will eventually prove a theorem called the law of large numbers, which can in turn be used to show that our intuition, that all limits of relative frequencies of the same experiment converge to the same value, is consistent with the theory.


\subsection{Cairns}

Let's give a concrete example by trying to calculate the probability that more than 10\,mm of rain falls on any given summer day in Cairns.

During the 2025--2026 summer in beautiful Cairns, more than 10\,mm of rain fell on 27 of the 90 days of the season. Writing $R$ for the event that more than 10\,mm of rain falls on a given day, we can describe this as a relative frequency.

\[f(R,90) = \frac{27}{90}\]

This ratio describes how frequently it rained over 10\,mm during that one season.

To obtain the true probability, we would need data for infinitely many summers, past, present, and future. That is empirically impossible for me to do, so in reality I would approximate the probability by the relative frequency I managed to calculate.

\[ \text{Probability that on a given summer day it rains over 10mm in Cairns} \approx \frac{27}{90}\]
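
Carrying out the arithmetic for the figures above, with $R$ the heavy-rain event as in the earlier formula, gives
\[ f(R,90) = \frac{27}{90} = \frac{3}{10} = 0.3 , \]
so on this data roughly $30\%$ of summer days had more than 10\,mm of rain. The remaining $90 - 27 = 63$ days did not, which gives a relative frequency of $\frac{63}{90} = 0.7$ for the complementary event; the two relative frequencies sum to $1$. This is only an estimate from a single season and would likely change with more data.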






%The \emph{Bayesian philosophy} interprets probability as \emph{credence}; a level of condifence in comparsion to other events that the event will occur. This interpretation is rather subjective, and is inspired by Bayesian inference.



\section{Probability function}



Not all events are equal; events may differ in their `probabilities'.

\begin{definition}[Probability function]
The \emph{probability function} $\mathrm{Pr} : \mathcal{F} \to [0,1]$ maps each event to its probability of occurrence. It `measures' the probability of events and obeys the following 3 laws.
\begin{itemize}
	\item $\mathrm{Pr}(E) \geq 0$ for every event $E \in \mathcal{F}$
	\item $\mathrm{Pr}(\Omega)=1$
	\item For pairwise disjoint events $E_1, E_2, \dots$, one has $\mathrm{Pr}\left(\bigcup^{\infty}_{i=1}E_i\right) = \sum^{\infty}_{i=1} \mathrm{Pr}(E_i)$
\end{itemize}
These 3 conditions are sometimes called \emph{Kolmogorov's axioms}, since they define the nature of probability.
\end{definition}
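
Two simple consequences show how the axioms are used. This is a sketch which assumes that $\emptyset$ and the complement $A^c = \Omega \setminus A$ belong to $\mathcal{F}$, something not stated above. Taking $E_i = \emptyset$ for every $i$ in the third axiom gives $\mathrm{Pr}(\emptyset) = \sum^{\infty}_{i=1} \mathrm{Pr}(\emptyset)$, which is only possible for a finite value if
\[ \mathrm{Pr}(\emptyset) = 0 . \]
Next take $E_1 = A$, $E_2 = A^c$ and $E_i = \emptyset$ for $i \geq 3$. These are pairwise disjoint with union $\Omega$, so
\[ 1 = \mathrm{Pr}(\Omega) = \mathrm{Pr}(A) + \mathrm{Pr}(A^c) + 0 \quad\Longrightarrow\quad \mathrm{Pr}(A^c) = 1 - \mathrm{Pr}(A) . \]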


\begin{definition}[Independent pair of events]
A pair of events is \emph{independent} iff the occurrence of one event doesn't affect the probability of the other.
\[ A \perp B \iff \mathrm{Pr}(A\cap B)=\mathrm{Pr}(A) \mathrm{Pr}(B)\]
\end{definition}
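
As an illustration, consider a fair six-sided die where each outcome is assumed to have probability $\frac{1}{6}$ (a hypothetical example, not from the text above). Let $A = \{2,4,6\}$ and $B = \{1,2,3,4\}$, so $A \cap B = \{2,4\}$. Then
\[ \mathrm{Pr}(A)\,\mathrm{Pr}(B) = \frac{3}{6} \cdot \frac{4}{6} = \frac{1}{3} = \frac{2}{6} = \mathrm{Pr}(A \cap B), \]
so $A \perp B$. By contrast, with $C = \{4,5,6\}$ one has $\mathrm{Pr}(A \cap C) = \frac{2}{6} = \frac{1}{3}$ but $\mathrm{Pr}(A)\,\mathrm{Pr}(C) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$, so $A$ and $C$ are not independent.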






\section{Conditional probability}


From a frequentist point of view, the conditional probability of $A$ given $B$ can be seen as the relative frequency of $A$ among only those trials in which $B$ occurs.

Say we run $n$ trials, in $a$ of which $A$ occurs and in $b$ of which $B$ occurs. It is possible that both $A$ and $B$ occur in the same trial (i.e.\ $A \cap B$ occurs); denote the number of such trials by $c$.

The frequentist interpretation of conditional probability is then the number of trials where both $A$ and $B$ occur, out of the trials where $B$ is known to occur, so $\frac{c}{b}$ (assuming $b > 0$). Because this is equal to $\frac{c / n}{b / n}$, we can use the ratio of relative frequencies, and hence of the probabilities they approximate, instead of the ratio of trial counts without changing the final value.


\begin{definition}[Conditional probability]
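Let $A,B$ be events with $\mathrm{Pr}(B) > 0$. The \emph{conditional probability} of $A$ given $B$ is
\[\mathrm{Pr}(A|B) = \frac{\mathrm{Pr}(A \cap B)}{\mathrm{Pr}(B)}\]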
\end{definition}
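
For a quick worked instance, take a fair six-sided die (an assumed example with each outcome having probability $\frac{1}{6}$), with $A = \{2,4,6\}$ and $B = \{4,5,6\}$. Then $A \cap B = \{4,6\}$ and
\[ \mathrm{Pr}(A|B) = \frac{\mathrm{Pr}(A \cap B)}{\mathrm{Pr}(B)} = \frac{2/6}{3/6} = \frac{2}{3} . \]
Knowing that the roll is at least $4$ raises the probability of an even roll from $\frac{1}{2}$ to $\frac{2}{3}$.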




\begin{proposition}
Let $A,B$ be independent events with $\mathrm{Pr}(B)>0$, then $\mathrm{Pr}(A|B)=\mathrm{Pr}(A)$.
\end{proposition}
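
A short justification, assuming $\mathrm{Pr}(B) > 0$ so that the conditional probability is defined: by the definitions of conditional probability and of independence,
\[ \mathrm{Pr}(A|B) = \frac{\mathrm{Pr}(A \cap B)}{\mathrm{Pr}(B)} = \frac{\mathrm{Pr}(A)\,\mathrm{Pr}(B)}{\mathrm{Pr}(B)} = \mathrm{Pr}(A) . \]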



\begin{theorem}[Bayes' theorem]
Let $A,B \subseteq \Omega$ be events with $\mathrm{Pr}(A) > 0$ and $\mathrm{Pr}(B) > 0$, then the following relation holds.
\[\mathrm{Pr}(A|B)=\frac{\mathrm{Pr}(B|A)\mathrm{Pr}(A)}{\mathrm{Pr}(B)}\]
\end{theorem}
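
The relation can be derived from the definition of conditional probability. As a sketch, assume $\mathrm{Pr}(A) > 0$ and $\mathrm{Pr}(B) > 0$. Applying the definition twice, once conditioning on $B$ and once on $A$, gives
\[ \mathrm{Pr}(A|B)\,\mathrm{Pr}(B) = \mathrm{Pr}(A \cap B) = \mathrm{Pr}(B \cap A) = \mathrm{Pr}(B|A)\,\mathrm{Pr}(A) . \]
Dividing both sides by $\mathrm{Pr}(B)$ yields
\[ \mathrm{Pr}(A|B) = \frac{\mathrm{Pr}(B|A)\,\mathrm{Pr}(A)}{\mathrm{Pr}(B)} . \]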

