Question about covariance for sampling without replacement

Question

Asked on August 21, 2017
152 views
1 answer
Solved

Suppose I have the numbers 1, 2, ..., 10 and I sample 5 of them at random without replacement, denoted $X_1, X_2, X_3, X_4, X_5$. What is $Cov(X_i,X_j)$ for $i \ne j$?

So $Cov(X_i,X_j)=E(X_iX_j)-E(X_i)E(X_j)$.

I believe that any $X_i$ considered on its own is $Uniform(10)$, so $E(X_i)=E(X_j)=11/2$.

For $E(X_iX_j)$ I am a bit stuck.

I first considered getting $f(x_i,x_j)=f(x_i|x_j)f(x_j)=\frac{1}{(n-1)n}=1/90$, but this seems incorrect.

Asked on August 21, 2017 by ErikN

Answer 1

9 votes

jldugger, 7490 points

Problems in sampling from finite populations without replacement can usually be solved in terms of the sample inclusion probabilities $\pi(x)$, $\pi(x,y)$, etc.

Let $\pi(x) = \Pr(X_1 = x)$ for any $x$ in the population $\mathcal P$ (having $n=10$ elements) and let $\pi(x,y)=\Pr((X_1,X_2)=(x,y))$ for any $x$ and $y$ in $\mathcal P$. By definition of expectation,

$$E(X_1) = \sum_{x\in\mathcal P} \pi(x)x\tag{1}$$

and

$$E(X_1X_2) = \sum_{(x,y)\in\mathcal{P}^2} \pi(x,y)x y \tag{2}.$$

For this sampling procedure $X_1$ has equal chances of being any of the $n$ elements of $\mathcal P$, whence $$\pi(x)=\frac{1}{n}\tag{3}$$ for all $x$. Because sampling is without replacement, only the pairs $(x,y)$ with $x\ne y$ are possible, but all $n(n-1)$ of those are equally likely. Therefore

$$\pi(x,y) = \left\{\matrix{\frac{1}{n(n-1)} & x\ne y \\ 0 & x=y} \right.\tag{4}$$

That's the general result. For any particular population, you just have to do the arithmetic implied by formulae $(1)$ through $(4)$.
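As a minimal sketch of that arithmetic, the R function below evaluates formulae $(1)$ through $(4)$ directly for an arbitrary numeric population. The name `pair_moments` and the example population `1:10` are illustrative assumptions, not part of the original answer.

```r
# Hypothetical helper: exact moments of the first two draws (X1, X2)
# when sampling without replacement from a finite population `pop`.
pair_moments <- function(pop) {
  n <- length(pop)
  stopifnot(n >= 2)
  # Formulae (1) and (3): each element has probability 1/n of being X1.
  e1 <- sum(pop) / n
  # Formulae (2) and (4): each ordered pair of distinct elements has
  # probability 1/(n(n-1)); pairs that reuse an element have probability 0.
  prods <- outer(pop, pop)   # matrix of all products x * y
  diag(prods) <- 0           # zero out the impossible pairs
  e12 <- sum(prods) / (n * (n - 1))
  c(E1 = e1, E12 = e12, Cov = e12 - e1^2)
}

pair_moments(1:10)
```

For the population $\{1,\ldots,10\}$ the `Cov` entry should equal $-11/12$ up to floating-point error.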

Suppose now that $\mathcal{P} = \{1,2,\ldots, n\}$. Formulae $(1)$ and $(3)$ give

$$E(X_1) = \sum_{i=1}^{n} \frac{1}{n} i = \frac{n+1}{2}$$

while formulae $(2)$ and $(4)$ give

$$\eqalign{E(X_1X_2) &= \sum_{i,j=1;\, i\ne j}^{n} \frac{1}{n(n-1)} i j \\ &= \frac{1}{n(n-1)}\left(\sum_{i=1}^{n}\sum_{j=1}^{n} i j - \sum_{i=1}^{n} i^2\right)\\ &= \frac{1}{n(n-1)}\left(\sum_{i=1}^{n} i \sum_{j=1}^{n} j - \sum_{i=1}^{n} i^2\right)\\ &= \frac{1}{n(n-1)}\left(\left(\frac{n(n+1)}{2}\right)^2 - \frac{n(1+n)(1+2n)}{6}\right) \\ &= \frac{3n^2 + 5n + 2}{12}. }$$
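The closed form can be checked numerically against the defining sum over ordered pairs $i \ne j$. The sketch below does this in R for a range of population sizes; the function names `e12_direct` and `e12_closed` and the range `2:20` are assumptions chosen for illustration.

```r
# Hypothetical check: brute-force E(X1 * X2) for the population {1, ..., n}.
e12_direct <- function(n) {
  idx <- expand.grid(i = 1:n, j = 1:n)   # all ordered pairs (i, j)
  idx <- idx[idx$i != idx$j, ]           # keep only the pairs with i != j
  sum(idx$i * idx$j) / (n * (n - 1))     # each pair has probability 1/(n(n-1))
}

# Closed form from the last line of the derivation.
e12_closed <- function(n) (3 * n^2 + 5 * n + 2) / 12

ns <- 2:20
all.equal(sapply(ns, e12_direct), sapply(ns, e12_closed))
```

If the algebra is right, the final comparison should report that the two vectors agree.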

Because there is no distinction among any of the $X_i$, these results hold for any $i \ne j$, not just $i=1$ and $j=2$. In particular,

$$\operatorname{Cov}(X_i,X_j) = E(X_iX_j) - E(X_i)E(X_j) = \frac{3n^2 + 5n + 2}{12} - \left(\frac{n+1}{2}\right)^2 = -\frac{n+1}{12}.$$
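As an independent sanity check (a different route from the derivation above, and not part of the original answer), imagine continuing the sampling until all $n$ elements of $\{1,\ldots,n\}$ have been drawn. The total $X_1+\cdots+X_n = n(n+1)/2$ is then constant, so its variance is zero. Using the standard variance $(n^2-1)/12$ of a uniform draw from $\{1,\ldots,n\}$ and writing $c$ for the common covariance of any two distinct draws,

$$0 = \operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = n\,\frac{n^2-1}{12} + n(n-1)\,c \quad\Longrightarrow\quad c = -\frac{n^2-1}{12(n-1)} = -\frac{n+1}{12},$$

which agrees with the covariance obtained above.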

When $n=10$, the covariance of $X_i$ and $X_j$ is $-11/12 \approx -0.917$. As a check, here is a simulation of a million such samples (using R):

> cov(t(replicate(1e6, sample.int(10, 5))))

The output is the $5\times 5$ covariance matrix of $(X_1, \ldots, X_5)$. Because this is a simulation the output is random; but because it's a largish simulation, it's reasonably stable from one run to the next. In the first simulation I did, the off-diagonal elements of this covariance matrix ranged from $-0.9277$ to $-0.9080$ with a mean of $-0.9169$: narrowly spread around $-11/12$, as one would expect.
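To summarize the off-diagonal entries the same way, something like the following sketch could be used. The seed, the reduced number of replications, and the variable names are assumptions made here for reproducibility and speed; they are not taken from the original run, so the numbers will differ from those quoted.

```r
set.seed(2017)   # arbitrary seed, assumed only so the run is reproducible
n_sims <- 1e5    # fewer replications than the million-sample run, to keep it quick

# Each column of `draws` is one sample of 5 values from 1:10 without replacement.
draws <- replicate(n_sims, sample.int(10, 5))
cov_mat <- cov(t(draws))                            # 5 x 5 covariance matrix

# Keep the 20 off-diagonal entries (each covariance appears twice by symmetry).
off_diag <- cov_mat[row(cov_mat) != col(cov_mat)]
range(off_diag)
mean(off_diag)
```

The range and mean should sit close to $-11/12$, with a somewhat wider spread than in a million-sample run.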

Answered on August 21, 2017 by jldugger (7490 points)
