8 votes

The Number of Exponential Summands in a Fixed Interval is Poisson

What is the slickest way to prove that the number of independent Exponential summands fitting in a fixed interval is distributed as a Poisson random variable? I can do it one way, but I would like to know whether there is another approach that earns more style points.

Let $S_1, S_2, \ldots \overset{iid}{\sim} \text{Exponential}(\mu)$. To be clear, each has density $f_S(s) = \mu e^{-\mu s}$. Now, for $t > 0$, define $K_t = \{j : S_1 + \cdots + S_j < t < S_1 + \cdots + S_j + S_{j+1}\}$.

The way I can do it is by directly integrating the joint density of the exponentials. For example, $$P(K_t = 1) = P(S_1 < t < S_1 + S_2) = \int_0^t \int_{t-s_1}^{\infty} \mu e^{-\mu s_1} \mu e^{-\mu s_2}\,ds_2\, ds_1 = (\mu t) e^{-\mu t}.$$ But is there another way?
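As a quick sanity check of this integral, here is a minimal Monte Carlo sketch in R (the values of $\mu$ and $t$ are purely illustrative):

# Monte Carlo check that P(K_t = 1) = P(S_1 < t < S_1 + S_2) = (mu*t)*exp(-mu*t).
mu <- 2; t <- 0.7; reps <- 1e6             # illustrative values
s1 <- rexp(reps, mu); s2 <- rexp(reps, mu)
mean(s1 < t & t < s1 + s2)                 # simulated probability
mu * t * exp(-mu * t)                      # closed form, about 0.345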

Edit: This is a well-known basic result about Poisson processes, and that fact was the motivation for the question. So, ideally, answers should not rely on Poisson-process results or on any other stochastic-process machinery.

11 votes

jldugger Points 7490

I propose taking, as a starting point, the concept of a homogeneous Poisson process. This is a point process on the line (often thought of, and referred to, as a "time" line). Realizations are sets of points; almost surely, any bounded set of real numbers contains only finitely many of them.

The fundamental properties enjoyed by this process, which I will use repeatedly in the analysis, are

  1. (Independence) The outcomes in any two disjoint sets are independent.

  2. (Homogeneity) The expected number of points in any measurable set $\mathcal{A}$ with finite measure $|\mathcal A|$ is directly proportional to $|\mathcal{A}|$. The constant of proportionality, $\lambda$, is nonzero.

Everything flows from these properties, as we shall see.

Waiting times

Let us study the "waiting times" of this process. Given a starting time $s$ and an elapsed duration $t \ge 0$, let $S(s, t)$ be the chance that no points occur between $s$ and $s+t$: that is, within the interval $(s, s+t]$. Consider two adjacent intervals, one from $r$ to $r+s$ and another from $r+s$ to $r+s+t$. By independence, the chance that no point lies in their union is the product of the chance that there is no point in the first and the chance that there is no point in the second:

$$S(r, s+t) = S(r,s) S(r+s,t).\tag{1}$$

By homogeneity, these chances remain the same when we slide the intervals around. That is, for any $s$, $S(r, t) = S(r+s, t)$. In particular, we may always take $r=-s$ to obtain

$$S(r, t) = S(0,t) = S(t)$$

for all $r$, allowing us to drop the explicit dependence on $r$ in the notation. Plugging this into $(1)$ gives

$$S(s+t) = S(s)S(t).\tag{2}$$

Homogeneity makes it obvious $S$ must be continuous (actually, differentiable). It is well known that the only solutions to $(2)$ are exponential. One simple way to see this is to consider that the logarithm of $S$ is linear and, since $S(0)=1$, there consequently must be some number $\kappa$ for which

$$\log(S(t)) = -\kappa t.$$

Since $S$ must decrease as time goes on, $\kappa \gt 0$. Ergo, all solutions are of the form

$$S(t) = e^{-\kappa t}.$$

It is the chance that no points occur within any specified interval of length $t$.
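To spell out the "well known" step (a short derivation sketch, using the differentiability noted above and writing $S'(0)$ for the derivative of $S$ at $0$): differentiating $(2)$ with respect to $t$ and setting $t=0$ gives

$$S'(s) = S(s)\,S'(0),$$

whose solution with $S(0)=1$ is $S(s) = e^{S'(0)s}$; since $S$ decreases, $S'(0) \lt 0$, and setting $\kappa = -S'(0) \gt 0$ recovers the exponential form above.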

Exponential summands

Fix an interval; by virtue of homogeneity, we may assume it starts at $0$ and ends (say) at $b$. Almost surely there are only finitely many points of the process in the interval $(0,b]$, allowing us to order them $0 \lt t_1 \lt t_2 \lt \cdots \lt t_n$. $t_1$ is a realization of a random variable $T_1$ governed by $S$: that is,

$$\Pr(T_1 \le t) = 1 - \Pr(T_1 \gt t) = 1 - S(t) = 1 - e^{-\kappa t}.$$

$T_2$ similarly is an independent random variable, where $T_2 - T_1$ also is governed by $S$, whence

$$\Pr(T_2 \le T_1 + t) = 1 - S(t) = 1 - e^{-\kappa t},$$

and likewise for the remaining $T_i$. This demonstrates that $n$ is exactly as described in the question: it is the largest number of "exponential summands" that fit within the interval $(0, b]$. It is a realization of the random variable

$$N(0,b) = \max\{i\,|\,T_1+T_2+\cdots+T_i \le b\}.$$
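As a concrete illustration of this definition (a minimal sketch with illustrative values of the rate $\kappa$ and the endpoint $b$, not part of the original argument):

# Realize N(0, b) = max{ i : T_1 + ... + T_i <= b } directly from the waiting times.
kappa <- 2.5; b <- 10            # illustrative rate and interval length
gaps <- rexp(100, kappa)         # waiting times; 100 draws comfortably cover (0, b] here
partial <- cumsum(gaps)          # partial sums T_1 + ... + T_i
N <- sum(partial <= b)           # one realization of N(0, b)
N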

Poisson distribution

Let $k\ge 0$ be an integer. What is the chance, $p_k$, that there are exactly $k$ points in the interval $[0,1]$? I am going to deduce the answer from the ergodic property of the Poisson process, which I take to be intuitive: because the process within the unit interval $[0,1]$ is the same as (and independent of) the process within any unit interval $[t-1, t]$, we may deduce properties of the process by varying $t$ for a single realization and studying the point patterns that show up. In particular, $p_k$ must equal the limiting proportion of time that the number of points $N(t)=N(t-1,t)$ in the interval $[t-1,t]$ equals $k$. We may express this formally using the indicator function $I$ which is equal to $1$ when its argument is true and is $0$ otherwise:

$$p_k = \lim_{x\to\infty}\frac{1}{x-1}\int_1^x I(N(t)=k) dt.$$

The integral is the total duration between $1$ and $x$ when interval $[t-1,t]$ contains exactly $k$ points, while the denominator of the fraction of course is the total elapsed time from $1$ to $x$.

Figure showing a graph of $N(t)$

This is a plot of $N(t)$ for one realization of a Poisson process with rate $\lambda=2.5$. White horizontal lines mark the values $k=1,2,3$ on the vertical axis, which extends from $0$ to $4$. The gray vertical lines show the times at which points occur in this realization. The dotted blue vertical lines show the same points shifted one unit to the right. The solid red curve plots $N(t)$ beginning at $t=1$. (It extends infinitely far to the right but not all of it could be drawn!)

$N(1)=4$ because four points (gray lines) appear in the first unit of time, $[0,1]$. The plot of $N(t)$ then rises by one unit every time $t$ another gray line is encountered moving left to right, because this is when the interval $[t-1,t]$ picks up that point. It falls by one unit every time $t$ a blue line is encountered, because this is the same thing as losing the value $t-1$ from the interval $[t-1,t]$.

The proportion of time spent at each height $k$ estimates $p_k$, the chance that any unit interval contains exactly $k$ points.
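A crude discretization of this limit (a sketch only; the code at the end of the answer does the same computation more efficiently with ecdf) estimates $p_k$ as the fraction of grid times $t$ at which $[t-1,t]$ contains exactly $k$ points:

# Estimate p_k by time-averaging the indicator I(N(t) = k) over a fine grid.
lambda <- 2.5; x.max <- 100                                # illustrative values
pts <- cumsum(rexp(ceiling(2 * lambda * x.max), lambda))   # one long realization
tt <- seq(1, x.max, by = 0.01)                             # grid of times t
N.t <- sapply(tt, function(t) sum(pts > t - 1 & pts <= t)) # N(t) = N(t-1, t)
table(N.t) / length(tt)                                    # compare with dpois(0:max(N.t), lambda)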

Suppose the interval $[t-1, t]$ contains $k$ points. As we slide it to the right, let's keep track of the new points it picks up and the old points that drop off. Two simple relations determine everything:

  1. Between $t$ and $t+dt$ (for $dt \ge 0$), the expected number of new points is $\lambda\, dt$. (That is homogeneity.)

  2. However, the expected number of points lost is $k\, dt$, because the $k$ points already present are located uniformly at random within the interval. (This, too, is due to homogeneity; a small numerical check appears after the rate formulas below.)

Almost surely, at most one point is gained or lost at any instant. (If not, there would be a positive lower bound on the probability of two or more points appearing in arbitrarily small intervals $[t, t+dt]$; but since the expected number of points in such intervals is only $\lambda\, dt$, which shrinks to zero with $dt$, this is impossible.) Consequently, only two transitions have nonzero probability of occurring: from $k$ to $k+1$ points and from $k$ to $k-1$ points. Their instantaneous rates are

$$\tau(k\to k+1) =\lim_{dt\to 0^{+}} \frac{\lambda\, dt}{dt} = \lambda$$

and, if $k \ge 1$,

$$\tau(k\to k-1) = \lim_{dt\to 0^{+}} \frac{k\, dt}{dt} = k.$$
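Relation 2, and hence this rate, can be checked with a tiny simulation (a sketch assuming, as stated above, that the $k$ points are uniformly located in the unit interval; the values of $k$ and $dt$ are illustrative):

# With k points placed uniformly in a unit interval, the expected number falling
# within dt of the left endpoint is about k*dt, so the instantaneous loss rate is k.
set.seed(17)
k <- 4; dt <- 0.01                              # illustrative values
lost <- replicate(1e5, sum(runif(k) < dt))      # points dropped in one step of length dt
mean(lost) / dt                                 # close to k = 4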

This might look complicated, since it establishes a system of differential equations for the infinitely many probabilities $p_k$. However, because the process is homogeneous these probabilities are unchanging.

Look first at the case $k=0$. The expected rate at which $p_0$ changes (namely, zero) is the expected rate of transitions $1\to 0$ from states with $k=1$ minus the expected rate of transitions $0\to 1$ to states with $k=1$. Thus, by virtue of the simple relations (1) and (2),

$$(1)p_1 - (\lambda)p_0 = 0.$$

This enables us to find $p_1$ in terms of $p_0$:

$$p_1 = \lambda p_0.\tag{3}$$

Now consider the general situation $k\gt 0$. There are four transitions potentially changing $p_k$, all of which must balance out in expectation: $k\to k+1$ and $k\to k-1$ make it decrease, while $k+1\to k$ and $k-1\to k$ increase it. Thus, again using the simple relations (1) and (2) to compute the instantaneous rates,

$$[(\lambda)p_{k-1} - (k) p_k] + [(k+1)p_{k+1} - (\lambda)p_k] = 0.$$

We may assume inductively that the first term in brackets balances out (something we just showed for the case $k=0$), thereby automatically balancing out the second term in brackets and easily giving the solution

$$p_{k+1} = \frac{\lambda}{k+1}p_k.\tag{4}$$

Formulas $(3)$ and $(4)$ determine all the $p_k$ in terms of $p_0$: the solution is

$$p_k = p_0\frac{\lambda^k}{k!}.\tag{5}$$

(Proof: this formula satisfies the initial condition $(3)$ as well as the recursion $(4)$.)
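Unrolling the recursion numerically also confirms formula $(5)$ (a minimal check with an arbitrary value of $\lambda$ and with $p_0$ provisionally set to $1$):

# Iterate recursion (4) from p_0 = 1 and compare with lambda^k / k!.
lambda <- 2.5                                    # illustrative value
p <- numeric(11); p[1] <- 1                      # p[k+1] holds p_k, starting from p_0 = 1
for (k in 0:9) p[k + 2] <- lambda / (k + 1) * p[k + 1]
all.equal(p, lambda^(0:10) / factorial(0:10))    # TRUE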

The connection between the Exponential and Poisson parameters

There are two ways to find $p_0$. First, we may exploit what we learned previously about exponential waiting times: $p_0$ is the chance that an Exponential variable of parameter $\kappa$ exceeds $1$. This is

$$p_0 = e^{-\kappa}.\tag{6}$$

The second is the fact that the probabilities sum to unity, whence

$$1 = \sum_{k=0}^\infty p_0\frac{\lambda^k}{k!} = p_0\sum_{k=0}^\infty \frac{\lambda^k}{k!} = p_0 e^{\lambda}.$$

Therefore

$$p_0 = e^{-\lambda}\tag{7}$$

is the unique value that works. Consequently $(5)$ is now fully explicit:

$$p_k = e^{-\lambda} \frac{\lambda^k}{k!}.$$

This is the Poisson distribution.

Equating $(6)$ and $(7)$ reveals that

$$\kappa = \lambda.$$

This explicitly relates the parameter of the Exponential waiting times to the parameter of the Poisson distribution.
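A direct Monte Carlo check of this relationship, independent of the ergodic argument (a sketch with illustrative values, not part of the original answer): count the Exponential$(\lambda)$ summands fitting in a unit interval, many times over, and compare with dpois.

# The number of Exponential(lambda) summands fitting in a unit interval is Poisson(lambda).
lambda <- 2.5
counts <- replicate(1e5, sum(cumsum(rexp(50, lambda)) <= 1))   # 50 draws amply cover [0, 1]
rbind(simulated = table(factor(counts, levels = 0:9)) / 1e5,   # observed frequencies, k = 0..9
      poisson   = dpois(0:9, lambda))                          # Poisson(lambda) probabilities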

Simulation for greater understanding

The first figure did not show a sufficiently long time span to estimate the $p_k$ accurately. Consider, though, a portion of another realization of $N(t)$ that is a hundred times longer:

Figure of a longer portion

(The vertical lines are no longer drawn, because they are so numerous.)

Here are the proportions of time spent at each $k$. Beneath them are the corresponding probabilities for the Poisson$(2.5)$ distribution:

         0      1      2      3      4      5      6      7      8     9
y   0.0745 0.2068 0.2637 0.2215 0.1290 0.0660 0.0235 0.0128 0.0016 6e-04
fit 0.0821 0.2052 0.2565 0.2138 0.1336 0.0668 0.0278 0.0099 0.0031 9e-04

The agreement is evident, although still imperfect, because this is only a short initial segment of the realization.


Here is the R code used to produce the figures. Experiment with lambda and n to get a feel for this analysis.

lambda <- 2.5               # Poisson intensity
n <- 1000                   # Number of points to realize
x <- cumsum(rexp(n, lambda))# Accumulate the waiting times

# Compute the proportion of times for each `k` and compare to the Poisson distribution.
f <- ecdf(x)                # The ECDF does the work of computing N(t)
b <- max(x)
u <- seq(1, b, length.out=10*n)
y <- table(round(n*(f(u) - f(u-1)), 4))
y <- y / sum(y)
fit <- dpois(as.numeric(names(y)), lambda)
round(rbind(y, fit), 4)

# Plot N(t)
y.max <- max(as.numeric(names(y)))
curve(ifelse(x >= 1 & x <= b, n*(f(x) - f(x-1)), NA), 0,b, ylim=c(0, y.max*1.01),
      n=max(10001, 10*n), xlab="t", ylab="", col="#00000080",
      yaxp=c(0, y.max, y.max), bty="n", yaxt="n", yaxs="i")
rect(0, 0, b, y.max, col="#f4f4f4", border=NA)
abline(h=0:y.max, col="White")
if (n < 1000) {
  abline(v=x, lty=1, col="Gray")
  abline(v=x[x <= b-1]+1, lty=3, col="Blue")
}
curve(ifelse(x >= 1 & x <= b, n*(f(x) - f(x-1)), NA), add=TRUE,
      n=max(10001, 10*n), lwd=2, col="#802020")
