Another possibility for choosing the prior stems from automatic mechanisms that do not introduce subjectivity. These are also known as objective, conventional or default priors.
In this article we will analyze three such methods: Laplace's prior, Jeffreys' prior, and vague priors.
Laplace's prior is based on the Uniform distribution, in fact:
If \(\Theta = \{ \theta_1, ..., \theta_k \}\) is bounded and discrete, we can hypothesize a prior in the following way:
\(\rightarrow \pi_L(\theta_i) = \frac{1}{k} \space \space \space \forall i = 1,2, ... , k\), that is, a constant prior
But what happens if the support of \(\theta\) is not discrete or is unbounded?
In these cases, since the prior is constant, any constant will do, and for convenience it is taken to be 1:
\[ \pi_L(\theta)= c \propto 1\]
So, ultimately, this method prescribes a constant prior proportional to 1.
This prior is obviously improper, since on an unbounded support:
\[ \int_\Theta \pi_L(\theta) \space d\theta = \int_\Theta 1 \space d\theta = \infty \]
Actually, though, the impropriety of the prior does not matter in itself: it is the posterior that must be proper. If, using this prior, the posterior turns out to be proper, then Laplace's prior can be used.
Let's assume: \[f(x|\theta) \propto \theta^4 x^3 e^{-\theta x }\] such that \(x>0; \theta >0\)
We want to use Laplace's prior to construct the posterior:
\[ \pi_L(\theta) \propto 1\]
\[ \pi(\theta | \underline x )\propto f(\underline x|\theta) \space \space \pi_L(\theta) \propto \prod_{i=1}^n \theta^4 x_i^3 e^{-\theta x_i } = \\= \theta^{4 \space n} \bigg(\prod_{i=1}^n x_i \bigg)^3 e^{-\theta \sum x_i } \propto \\ \propto \theta^{4 \space n} e^{-\theta \sum x_i }\]
We recognize the kernel of a \(Gamma(4n+1, \space \sum x_i)\): the posterior is proper, so Laplace's prior can be used here.
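This can be checked numerically: under Laplace's prior, the normalized posterior kernel coincides with the \(Gamma(4n+1, \sum x_i)\) density. A minimal sketch, using hypothetical data:

```python
import numpy as np
from scipy import integrate, stats

# Hypothetical data (any positive values work): n = 5 observations.
x = np.array([0.8, 1.2, 0.5, 2.0, 1.1])
n, sx = len(x), x.sum()

# Unnormalized posterior kernel under Laplace's prior pi_L(theta) ∝ 1:
#   pi(theta | x) ∝ theta^(4n) * exp(-theta * sum(x))
def kernel(t):
    return t ** (4 * n) * np.exp(-t * sx)

# Normalize numerically and compare with the Gamma(4n + 1, sum(x)) density.
Z, _ = integrate.quad(kernel, 0, np.inf)
grid = np.linspace(0.5, 8.0, 40)
post_numeric = kernel(grid) / Z
post_gamma = stats.gamma.pdf(grid, a=4 * n + 1, scale=1 / sx)

assert np.allclose(post_numeric, post_gamma)
```

The assertion passes: the numerically normalized kernel and the Gamma density agree pointwise on the grid.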
Besides the problem of being improper, Laplace's prior can present a second inconvenience: it may not be invariant under reparameterization.
If, in the situation of the example just above, we had to estimate \(\lambda = g(\theta)\) instead of \(\theta\), could we use the same prior and simply reparameterize it? Yes, but only if the two routes below lead to the same result:
| Model 1 | Model 2 |
|---|---|
| Step 1a: create the induced model \[ (S_x , f(\underline x ; \theta), \Theta) \] | Step 1b: reparameterize \[\lambda = g(\theta)\] |
| Step 2a: elicitation \[ \text{find} \space \space \pi(\theta)\] | Step 2b: create the induced model \[ (S_x , f_1(\underline x ; \lambda), \Lambda) \] |
| Step 3a: reparameterize \[\lambda = g(\theta) \\ \theta = g^{-1}(\lambda) \\ \pi^*(\lambda) = \pi(g^{-1}(\lambda)) \space \bigg\vert \space \frac{\partial g^{-1}(\lambda)}{ \partial \lambda} \space \bigg\vert \] | Step 3b: elicitation \[ \text{find} \space \space \pi^{**}(\lambda)\] |
If the priors derived from these two routes \((\space \pi^{*}(\lambda) \space \text{and} \space \pi^{**}(\lambda)\space)\) are equal, then Laplace's prior is invariant under the reparameterization.
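A quick numerical sketch of the two routes, using the hypothetical reparameterization \(\lambda = g(\theta) = \theta^2\) on \(\theta > 0\) with a flat \(\pi(\theta) \propto 1\), shows that \(\pi^{*}\) and \(\pi^{**}\) differ:

```python
import numpy as np

# Hypothetical reparameterization lambda = g(theta) = theta^2, theta > 0,
# starting from the flat Laplace prior pi(theta) ∝ 1.
def g_inv(lam):            # theta = g^{-1}(lambda) = sqrt(lambda)
    return np.sqrt(lam)

def jacobian(lam):         # | d g^{-1}(lambda) / d lambda |
    return 0.5 / np.sqrt(lam)

# Route 1 (Steps 1a-3a): transform the flat prior on theta.
def pi_star(lam):
    return 1.0 * jacobian(lam)    # pi*(lambda) = pi(g^{-1}(lambda)) |J|

# Route 2 (Steps 1b-3b): apply Laplace's rule directly on lambda.
def pi_star_star(lam):
    return 1.0                    # pi**(lambda) ∝ 1

# pi* varies with lambda while pi** is constant, so the two routes
# disagree: Laplace's prior is not invariant under this g.
for lam in (0.5, 1.0, 4.0):
    print(lam, pi_star(lam), pi_star_star(lam))
```

Here \(\pi^{*}(\lambda) \propto \lambda^{-1/2}\) is not constant, while \(\pi^{**}(\lambda) \propto 1\) is, so Laplace's prior fails invariance for this nonlinear \(g\).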
This second methodology, Jeffreys' prior, provides a non-subjective elicitation rule based on Fisher's expected information:
\[ \pi_j(\theta) \propto \sqrt{I_A(\theta)} \]
Fisher information can be interpreted as the amount of information that an observable random variable \(X\) carries about an unobservable parameter \(\theta\) on which the probability distribution of \(X\) depends. It can therefore be read as a measure of the curvature of the log-likelihood around the maximum likelihood estimate of \(\theta\): a flat likelihood, with a modest second derivative, brings less information, whereas a more sharply curved one brings more. Expected Fisher information:
\[ I_A(\theta) = E \bigg[\space \bigg( \frac{\partial log(f(\underline x ; \theta))}{ \partial \theta} \bigg)^2\space \bigg] = - E \bigg[\space \bigg( \frac{\partial^2 log(f(\underline x ; \theta))}{ \partial \theta^2} \bigg)\space \bigg] \]
Observed Fisher information:
\[\mathscr{I}_A(\theta) = - \bigg( \frac{\partial^2 log(f(\underline x ; \theta))}{ \partial \theta^2} \bigg) \]
At an interpretative level, the expected information, which depends on the parameter but not on the sample, measures the information brought by a generic sample from the given experiment. The observed information, which is evaluated at the observed data (and typically at the maximum likelihood estimate \(\hat\theta\)), measures the information brought by the specific sample that was observed; it can be a good estimate of the expected Fisher information.
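The relationship between the two quantities can be illustrated with a small Monte Carlo sketch, using a hypothetical Poisson model (the same model as the example below); all numbers are illustrative:

```python
import numpy as np

# Monte Carlo sketch (hypothetical Poisson(theta) model, n = 1): the observed
# information x / theta^2 varies from sample to sample, but its average over
# many samples approaches the expected information 1 / theta.
rng = np.random.default_rng(0)
theta = 3.0
x = rng.poisson(theta, size=200_000)

observed_info = x / theta ** 2      # one observed information per sample
expected_info = 1 / theta

print(observed_info.mean(), expected_info)   # close to each other
```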
This prior is always invariant but not always proper.
Let's assume:
\[ X \sim Pois(\theta)\]
Assume \(n=1 \rightarrow f(\underline x;\theta)= \frac{e^{-\theta}\space \theta^x}{x!}\)
EXPECTED INFORMATION
\[ I_A(\theta) = - E \bigg[\space \bigg( \frac{\partial^2 log(f(\underline x ; \theta))}{ \partial \theta^2} \bigg)\space \bigg] = \\ = - E \bigg[\space \bigg( \frac{\partial^2 (-\theta+x \space \log \theta - \log{x!})}{ \partial \theta^2} \bigg)\space \bigg]= \\ = - E \bigg[\space - \frac{x}{\theta^2}\space \bigg] = \frac{E[x]}{\theta^2} = \\ = \frac{\theta}{\theta^2} = \frac{1}{\theta}\]
Therefore JEFFREYS' PRIOR:
\[ \pi_j(\theta) \propto \sqrt{I_A(\theta)}=\sqrt{\frac{1}{\theta}}=\theta^{-\frac{1}{2}}\]
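The result \(I_A(\theta) = 1/\theta\) can be sanity-checked numerically by computing \(-E\left[\frac{\partial^2 \log f}{\partial \theta^2}\right]\) as a sum over a truncated Poisson support (the cutoff 200 is an arbitrary but ample choice for these values of \(\theta\)):

```python
import numpy as np
from scipy import stats

# Numerical sanity check of I_A(theta) = 1/theta for the Poisson model:
# -E[d^2 log f / d theta^2] = E[x] / theta^2, computed by summing over a
# truncated support.
for theta in (0.7, 2.0, 5.0):
    k = np.arange(200)
    pmf = stats.poisson.pmf(k, theta)
    d2_loglik = -k / theta ** 2         # second derivative of log f(x; theta)
    info = -(pmf * d2_loglik).sum()     # expected Fisher information
    assert np.isclose(info, 1 / theta)
```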
This last technique (in my opinion somewhat less refined than the others) takes any known functional form defined on \(\Theta\) and imposes a very large variance.
For example, one can choose a Normal centered on the prior mean \(\theta_{\space 0}\) (though the centering is actually not essential) with a very large variance: \[ Normal(\theta_{\space 0}, \space \space 10^5) \]
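As an illustration, in a conjugate Normal model with known variance, such a vague prior gives practically the same posterior as the flat (Laplace) prior. A minimal sketch with hypothetical numbers:

```python
import numpy as np

# Conjugate Normal-Normal update with known variance (a sketch; all numbers
# are hypothetical). A vague Normal(theta0, 1e5) prior gives practically the
# same posterior mean as the flat (Laplace) prior, whose answer is x-bar.
x = np.array([9.8, 10.4, 10.1, 9.9])
n, sigma2 = len(x), 1.0
theta0, tau2 = 0.0, 1e5             # vague prior: huge variance

post_var = 1.0 / (1.0 / tau2 + n / sigma2)
post_mean = post_var * (theta0 / tau2 + x.sum() / sigma2)

print(post_mean, x.mean())          # nearly identical
```

The huge prior variance makes the prior precision \(1/\tau^2\) negligible, so the data dominate the posterior, which is the point of a vague prior.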