I was reading "Data Analysis" by D. S. Sivia and found the following fairly early on.

Suppose you have a posterior probability density function $P(x)$. One way to approximate it is with a Normal distribution, or equivalently by summarizing it with two parameters in the form $x = x_0 \pm \sigma$, where $x_0$ is the best estimate for $x$ and $\sigma$ is the standard deviation.

It is clear that the maximum of the posterior is given by $\left.\frac{dP}{dx}\right|_{x_0} = 0$ (and $\left.\frac{d^2P}{dx^2}\right|_{x_0} < 0$).

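A measure of the reliability of this best estimate can be obtained by computing the Taylor expansion of the logarithm of the posterior, $L = \ln P(x)$, about $x_0$:

$$L = L(x_0) + \frac{1}{2}\left.\frac{d^2L}{dx^2}\right|_{x_0}(x - x_0)^2 + \dots$$

where the second (linear) term is missing because $\left.\frac{dL}{dx}\right|_{x_0} = 0$, since $L$ is a monotonic function of $P$.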

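Now, the quadratic term dominates the Taylor series, and after rearranging we get:

$$P(x) \approx A \exp\left[\frac{1}{2}\left.\frac{d^2L}{dx^2}\right|_{x_0}(x - x_0)^2\right],$$

where $A$ is a normalization constant.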
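To make the approximation concrete, here is a minimal numerical sketch. It is not from the book's text above: the coin-bias posterior (flat prior, 30 heads in 100 tosses) and the helper names `log_posterior` and `gaussian_approximation` are assumptions chosen purely for illustration. It locates the maximum of the log-posterior, estimates the second derivative there by finite differences, and uses the fact that matching the exponent to that of a Gaussian gives a width of $\sigma = \left(-\left.\frac{d^2L}{dx^2}\right|_{x_0}\right)^{-1/2}$.

```python
import math


def log_posterior(h, heads=30, tosses=100):
    """Unnormalised log-posterior for a coin's bias h (hypothetical example, flat prior)."""
    return heads * math.log(h) + (tosses - heads) * math.log(1.0 - h)


def gaussian_approximation(log_p, lo, hi, grid=9999, step=1e-4):
    """Return (x0, sigma) from the quadratic Taylor term of log_p about its maximum."""
    # Locate the maximum on a uniform grid strictly inside (lo, hi).
    xs = [lo + (hi - lo) * i / (grid + 1) for i in range(1, grid + 1)]
    x0 = max(xs, key=log_p)

    # Central finite-difference estimate of the second derivative of L at x0.
    d2L = (log_p(x0 + step) - 2.0 * log_p(x0) + log_p(x0 - step)) / step**2
    if d2L >= 0:
        raise ValueError("second derivative is not negative: x0 is not a maximum")

    # Matching exp[(1/2) L''(x0) (x - x0)^2] to a Gaussian gives sigma = (-L'')^(-1/2).
    sigma = (-d2L) ** -0.5
    return x0, sigma


x0, sigma = gaussian_approximation(log_posterior, 0.0, 1.0)
print(f"x = {x0:.3f} +/- {sigma:.3f}")
```

For this particular posterior the result can be checked by hand: the maximum is at `heads / tosses = 0.3`, and the analytic width is $\sqrt{x_0(1 - x_0)/N} \approx 0.046$, which the finite-difference estimate should reproduce closely.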

We have obtained the Normal, or Gaussian, distribution. Note that