
Appendix Two


Expectation of Poisson data contributing to a measurement

Assume the configuration of Figure 12: two radioactive sources contribute to a single measurement $N$. We know the a-priori expected values $\bar{a}$ and $\bar{b}$ for each of the sources, and we know the measured count $N = a + b$. The question is to compute the expected values of $a$ and $b$ given $N$.

By definition, the expected value equals:

$$E(a | a + b = N) = \frac{\sum_{a=0}^\infty p(a | a + b = N)\, a}{\sum_{a=0}^\infty p(a | a + b = N)}$$

The denominator should be equal to 1, but if we keep it, the equation can also be applied when $p$ is only known up to a constant factor.

The prior distributions for $a$ and $b$ are:

$$p_{\bar{a}}(a) = e^{-\bar{a}} \frac{\bar{a}^a}{a!} \hspace{2cm} p_{\bar{b}}(b) = e^{-\bar{b}} \frac{\bar{b}^b}{b!}$$

After the measurement, we know that the first source can only have contributed $a$ counts if the other one happened to contribute $N - a$ counts. Dropping some constants this yields:

$$
\begin{align}
p(a | a + b = N) &\sim e^{-\bar{a}} \frac{\bar{a}^a}{a!} \;\; e^{-\bar{b}} \frac{\bar{b}^{N-a}}{(N-a)!}\\
&\sim \frac{\bar{a}^a}{a!} \frac{\bar{b}^{N-a}}{(N-a)!}
\end{align}
$$

Applying (1) yields:

$$E(a | a+b=N) = \frac{\sum_{a=0}^N \frac{\bar{a}^a}{a!} \frac{\bar{b}^{N-a}}{(N-a)!}\, a}{\sum_{a=0}^N \frac{\bar{a}^a}{a!} \frac{\bar{b}^{N-a}}{(N-a)!}}$$

Let us first look at the denominator:

$$\sum_{a=0}^N \frac{\bar{a}^a}{a!} \frac{\bar{b}^{N-a}}{(N-a)!} = \frac{(\bar{a} + \bar{b})^N}{N!} \hspace{1cm} \mbox{since } \left( \begin{array}{c} N\\a \end{array}\right) = \frac{N!}{a! (N-a)!}$$
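
Multiplying and dividing by $N!$ makes the binomial theorem explicit:

$$\sum_{a=0}^N \frac{\bar{a}^a}{a!} \frac{\bar{b}^{N-a}}{(N-a)!} = \frac{1}{N!} \sum_{a=0}^N \frac{N!}{a!(N-a)!}\, \bar{a}^a\, \bar{b}^{N-a} = \frac{(\bar{a}+\bar{b})^N}{N!}$$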

A bit more work needs to be done for the numerator:

$$
\begin{align}
\sum_{a=0}^N \frac{\bar{a}^a}{a!} \frac{\bar{b}^{N-a}}{(N-a)!}\, a & = \sum_{a=1}^N \frac{\bar{a}^a}{a!} \frac{\bar{b}^{N-a}}{(N-a)!}\, a \hspace{1cm} \mbox{summation can start from 1}\\
& = \sum_{a=1}^N \frac{\bar{a}^a}{(a-1)!} \frac{\bar{b}^{N-a}}{(N-a)!}\\
& = \bar{a} \sum_{a=1}^N \frac{\bar{a}^{a-1}}{(a-1)!} \frac{\bar{b}^{N-a}}{(N-a)!}\\
& = \bar{a} \sum_{a=0}^{N-1} \frac{\bar{a}^a}{a!} \frac{\bar{b}^{N-1-a}}{(N-1-a)!}\\
& = \bar{a} \frac{(\bar{a} + \bar{b})^{N-1}}{(N-1)!}
\end{align}
$$

Combining numerator and denominator results in:

$$
\begin{align}
E(a | a+b=N) &= \bar{a} \frac{(\bar{a} + \bar{b})^{N-1}}{(N-1)!} \frac{N!}{(\bar{a} + \bar{b})^N}\\
&= \bar{a} \frac{N}{\bar{a} + \bar{b}}
\end{align}
$$
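
As a quick numerical sanity check (a sketch added here, not part of the original derivation; the values of $\bar{a}$, $\bar{b}$ and $N$ are arbitrary), the conditional expectation can also be evaluated by direct summation over the unnormalized posterior and compared with $\bar{a} N / (\bar{a} + \bar{b})$:

```python
import numpy as np
from scipy.special import gammaln

def conditional_expectation(a_bar, b_bar, N):
    """E(a | a+b=N) by direct summation over the (unnormalized) posterior."""
    a = np.arange(N + 1)
    # log of a_bar^a / a! * b_bar^(N-a) / (N-a)!, constants dropped
    log_w = a * np.log(a_bar) - gammaln(a + 1) + (N - a) * np.log(b_bar) - gammaln(N - a + 1)
    w = np.exp(log_w - log_w.max())          # shift for numerical stability
    return np.sum(w * a) / np.sum(w)

a_bar, b_bar, N = 3.2, 1.7, 12
print(conditional_expectation(a_bar, b_bar, N))   # direct summation
print(a_bar * N / (a_bar + b_bar))                # closed-form result above
```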

The convergence of the EM algorithm

This section explains why maximizing the likelihood of the complete variables $L_x$ is equivalent to maximizing the likelihood $L$ of the observed variables $Q$.

First, some notation and definitions must be introduced. As in the text, $\Lambda$ denotes the reconstruction, $Q$ the measured sinogram and $X$ the complete variables. We want to maximize the logarithm of the likelihood given the reconstruction:

$$L(\Lambda) = \ln g(Q | \Lambda)$$

The conditional likelihood of the complete variables $f(X | \Lambda)$ can be written as:

$$f(X | \Lambda) = k(X | Q, \Lambda) \; g(Q | \Lambda),$$

where $k(X | Q, \Lambda)$ is the conditional likelihood of the complete variables, given the reconstruction and the measurement. These definitions immediately imply that

$$\ln f(X | \Lambda) = L(\Lambda) + \ln k(X | Q, \Lambda).$$

The objective function we construct during the E-step is defined as

$$h(\Lambda' | \Lambda) = E\left[\ln f(X | \Lambda') \,|\, Q, \Lambda\right],$$

which means that we write the log-likelihood of the complete variables as a function of $\Lambda'$, and that we eliminate the unknown variables $X$ by computing the expectation based on the current reconstruction $\Lambda$. Combining (10) and (11) results in

$$h(\Lambda' | \Lambda) = L(\Lambda') + E\left[\ln k(X | Q, \Lambda') \,|\, Q, \Lambda\right]$$

Finally, we define a generalized EM (GEM) algorithm. This is a procedure which computes a new reconstruction from the current one. The procedure will be denoted as $M$. $M$ is a GEM-algorithm if

$$h(M(\Lambda) | \Lambda) \geq h(\Lambda | \Lambda).$$

This means that we want $M$ to increase $h$. We can be more demanding and require that $M$ maximizes $h$; then we have a regular EM algorithm, such as the MLEM algorithm of section Iterative Reconstruction.
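
To make this tangible, here is a minimal MLEM sketch on a toy system (the system matrix, measurement and iteration count are invented for the illustration and are not the notation of the main text). It shows the Poisson log-likelihood increasing monotonically with every update, which is exactly the property proven below:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy system: 20 detector bins, 5 image pixels (values are made up)
A = rng.uniform(0.0, 1.0, size=(20, 5))         # system matrix
lam_true = np.array([1.0, 4.0, 2.0, 0.5, 3.0])  # "true" image
q = rng.poisson(A @ lam_true).astype(float)     # simulated measured sinogram

def loglik(lam):
    """Poisson log-likelihood of the measurement, constants dropped."""
    qhat = A @ lam
    return np.sum(q * np.log(qhat) - qhat)

lam = np.ones(5)                                # uniform initial image
sens = A.sum(axis=0)                            # sensitivity image
for it in range(10):
    ratio = q / (A @ lam)                       # measured / estimated sinogram
    lam = lam / sens * (A.T @ ratio)            # MLEM update
    print(it, loglik(lam))                      # never decreases
```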

Now, from equation (12) we can compute what happens to $L$ if we apply a GEM-step to increase the value of $h$:

$$L(M(\Lambda)) - L(\Lambda) = h(M(\Lambda) | \Lambda) - h(\Lambda | \Lambda) + E\left[\ln k(X | Q, \Lambda) | Q,\Lambda\right] - E\left[\ln k(X | Q, M(\Lambda)) | Q,\Lambda\right]$$

Because $M$ is a GEM-algorithm, we already know that $h(M(\Lambda) | \Lambda) - h(\Lambda | \Lambda)$ is non-negative. If we can also show that

$$E\left[\ln k(X | Q, \Lambda) | Q,\Lambda\right] - E\left[\ln k(X | Q, M(\Lambda)) | Q,\Lambda\right] \geq 0$$

then we have proven that every GEM-step increases the likelihood $L$.

By definition, we have

$$E\left[\ln k(X | Q, \Lambda') | Q,\Lambda\right] = \int k(X | Q, \Lambda) \ln k(X | Q, \Lambda')\, dX.$$

Therefore, the left hand side of equation (15) can be rewritten as

$$\int k(X | Q, \Lambda) \ln k(X | Q, \Lambda)\, dX - \int k(X | Q, \Lambda) \ln k(X | Q, M(\Lambda))\, dX,$$

which can be written as

$$\int k(X | Q, \Lambda) \ln \frac{k(X | Q, \Lambda)}{k(X | Q, M(\Lambda))}\, dX,$$

with the additional requirement that $\int k(X | Q, \Lambda)\, dX = \int k(X | Q, M(\Lambda))\, dX = 1$. It turns out that (18) is never negative, due to the convexity of $t \ln t$. We will have a look at that now.

Consider the function $\psi(t) = t \ln t$. It is only defined for $t > 0$. Here are its derivatives:

$$
\begin{align}
\psi'(t) &= \frac{d \psi(t)}{dt} = 1 + \ln t\\
\psi''(t) &= \frac{d^2 \psi(t)}{dt^2} = \frac{1}{t} > 0
\end{align}
$$

The second derivative is continuous and strictly positive ($t \ln t$ is a convex function). Consequently, we can always find a value $u$ such that

$$\psi(t) = \psi(1) + (t - 1) \psi'(1) + \frac{(t-1)^2}{2} \psi''(u) \;\; \mbox{with } u \mbox{ between } t \mbox{ and } 1.$$

Because $\psi(1) = 0$ and $\psi'(1) = 1$, this becomes:

$$\psi(t) = (t - 1) + \frac{(t-1)^2}{2} \psi''(u) \;\; \mbox{with } u \mbox{ between } t \mbox{ and } 1.$$

Consider now an integral of the same form as in (18):

$$\int f_1(x) \ln \frac{f_1(x)}{f_2(x)}\, dx \;\; \mbox{with} \;\; \int f_1(x)\, dx = \int f_2(x)\, dx = 1.$$

We can rework it such that we can exploit the convexity of $t \ln t$:

$$
\begin{align}
& \int f_1(x) \ln \frac{f_1(x)}{f_2(x)}\, dx \nonumber \\
&= \int \frac{f_1(x)}{f_2(x)} \ln \left( \frac{f_1(x)}{f_2(x)} \right) f_2(x)\, dx \nonumber \\
&= \int \left[ \frac{f_1(x)}{f_2(x)} - 1 + \frac{1}{2}\left( \frac{f_1(x)}{f_2(x)} - 1\right)^2 \psi''(u(x)) \right] f_2(x)\, dx \;\; \mbox{with} \;\; 0 < u(x) < \infty \nonumber \\
&= \int \left( f_1(x) - f_2(x) \right) dx + \frac{1}{2} \int \left( \frac{f_1(x)}{f_2(x)} - 1\right)^2 \psi''(u(x))\, f_2(x)\, dx \;\; \geq \;\; 0
\end{align}
$$

The first integral in the last line vanishes because $f_1$ and $f_2$ both integrate to 1, and the second integral is non-negative because $\psi'' > 0$.
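
The non-negativity is also easy to observe numerically. Here is a small check (added here, with arbitrary made-up histograms) using the discrete analogue of the integral above:

```python
import numpy as np

def kl(f1, f2):
    """Discrete analogue of the integral of f1 * ln(f1/f2), for normalized weights."""
    f1 = f1 / f1.sum()
    f2 = f2 / f2.sum()
    return np.sum(f1 * np.log(f1 / f2))

rng = np.random.default_rng(1)
f1 = rng.random(50) + 0.01      # strictly positive, otherwise arbitrary weights
f2 = rng.random(50) + 0.01
print(kl(f1, f2))               # > 0 for two different distributions
print(kl(f1, f1))               # 0 (up to rounding) when the distributions are equal
```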

Now we have proved that the GEM-step increases $L$. It still remains to be shown that the algorithm indeed converges towards the maximum likelihood solution. If you want to know really everything about it, you should read the paper by Salvo and Defrise, “A Convergence Proof of MLEM and MLEM-3 With Fixed Background”, IEEE TMI, 38 (2019).

Backprojection of a projection of a point source

2D parallel projection

Consider a point source located in the center of the field of view. The projection is a sinogram $q(s, \theta)$, given by

$$q(s, \theta) = \delta(s),$$

where $\delta(x)$ is the Dirac delta function, which is zero for all $x \neq 0$ and integrates to one. The backprojection $b(x,y)$ is then given by

$$
\begin{align}
b(x,y) &= \int_0^\pi q(x \cos\theta + y\sin\theta, \theta)\, d\theta \nonumber\\
&= \int_0^\pi \delta(x\cos\theta + y\sin\theta)\, d\theta
\end{align}
$$

To proceed, the argument of the delta function must be changed. This can be done with the following expression:

$$\delta(f(x)) = \sum_{n=1}^N \frac{\delta(x - x_n)}{|f'(x_n)|},$$

where $f'$ is the derivative of $f$, and $x_n, n = 1 \ldots N$ are the zeros of $f(x)$, i.e. $f(x_n) = 0$. A simple example is $\delta(ax) = \delta(x) / |a|$. Applying this to (26) yields:

$$b(x,y) = \int_0^\pi \frac{\delta(\theta - \theta_0)}{|-x\sin\theta_0 + y\cos\theta_0|}\, d\theta,$$

where $\theta_0$ is such that $x\cos\theta_0 + y\sin\theta_0 = 0$. This is satisfied if

$$\cos\theta_0 = \pm \frac{y}{\sqrt{x^2+y^2}} \;\;\;\mbox{and}\;\;\; \sin\theta_0 = \mp \frac{x}{\sqrt{x^2+y^2}},$$

which results in

$$b(x,y) = \frac{1}{\sqrt{x^2 + y^2}}.$$
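
A small numerical check (a sketch added here, with a made-up grid, bin width and angle count): backprojecting a central point-source sinogram, with the delta function approximated by a narrow rectangular bin, reproduces the $1/\sqrt{x^2+y^2}$ profile away from the center:

```python
import numpy as np

# Approximate delta(s) by a narrow rectangular bin of half-width ds.
ds = 0.01
thetas = np.linspace(0.0, np.pi, 20000, endpoint=False)
dtheta = np.pi / len(thetas)

def backproject(x, y):
    """Numerically evaluate b(x,y) = int_0^pi delta(x cos t + y sin t) dt."""
    s = x * np.cos(thetas) + y * np.sin(thetas)
    return np.sum(np.abs(s) < ds) / (2.0 * ds) * dtheta

for r in [0.5, 1.0, 2.0, 4.0]:
    print(r, backproject(r, 0.0), 1.0 / r)   # numeric vs analytic 1/sqrt(x^2+y^2)
```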

2D parallel projection with TOF

Consider again the projection of a point source in the center of the field of view. With time-of-flight, the parallel projection along angle $\theta$ can be considered as a Gaussian blurring along lines with angle $\theta$. That gives for the TOF-sinogram $q(x,y,\theta)$:

$$q(x,y,\theta) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{x^2+y^2}{2 \sigma^2}\right) \; \delta(x\cos\theta + y\sin\theta),$$

where $\sigma$ represents the uncertainty of the TOF-measurement. The corresponding backprojection operator is obtained by applying the same blurring to the projections, followed by integration over $\theta$. Convolving a Gaussian with itself yields a Gaussian whose standard deviation is multiplied by $\sqrt{2}$ (see appendix Combining resolution effects: convolution of two Gaussians). One finds

$$
\begin{align}
b(x,y) &= \int_0^\pi \frac{1}{2\sqrt{\pi}\sigma} \exp\left(-\frac{x^2+y^2}{4 \sigma^2}\right) \; \delta(x\cos\theta + y\sin\theta)\, d\theta\\
&= \frac{1}{2\sqrt{\pi}\sigma \sqrt{x^2 + y^2}} \exp\left(-\frac{x^2+y^2}{4 \sigma^2}\right),
\end{align}
$$

which is equivalent to equation (41).
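
The $\sqrt{2}$ step can be verified numerically. The sketch below (with an arbitrary $\sigma$ and grid, chosen only for the test) convolves a normalized Gaussian with itself and estimates the standard deviation of the result:

```python
import numpy as np

sigma = 1.3                                    # arbitrary TOF uncertainty for the test
dx = 0.01
x = np.arange(-20.0, 20.0, dx)                 # grid wide enough to avoid truncation
g = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

gg = np.convolve(g, g, mode="same") * dx       # discrete approximation of (g * g)(x)

# estimate the standard deviation of the self-convolution from its moments
m = np.sum(x * gg) / np.sum(gg)
std = np.sqrt(np.sum((x - m)**2 * gg) / np.sum(gg))
print(std, np.sqrt(2) * sigma)                 # both approximately 1.84
```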