OP escribió:
Deje $X_1, X_2, \dots, X_N$ ser yo.yo.d. continuo de variables aleatorias con el apoyo $[0, 1]$.
¿Qué es un razonable límite en la probabilidad de que ciertos pares de variables aleatorias es menor que ϵ aparte? ... interesado en los casos de una distribución uniforme ... solo necesito una probabilidad razonable obligado, no exactamente, la probabilidad de
Para el Uniforme caso:
Propongo la siguiente aproximación:
$$\Pr \left( \exists_{i,j:\ i\ne j}\ |X_i - X_j| < \epsilon \right) \approx 1-(1-\epsilon )^{n (n-1)}$$
Performance
Here is a quick comparison of the proposed APPROXIMATE solution $1-(1-\epsilon )^{n (n-1)}$ posited here, ... compared to the 'actual' probability calculated via Monte Carlo simulations (in each case, 500,000
samples of size $n$):
Case 1: $\epsilon = 0.01$ and $n = 2, 3, 4, 5, 8, 16$
Approx: {0.0199, 0.0585199, 0.113615, 0.182093, 0.430399, 0.910371}
Monte: {0.019478, 0.058606, 0.115204, 0.184624, 0.441432, 0.92556}
Case 2: $\epsilon = 0.03$ and $n = 2, 3, 4, 5, 8, 16$
Approx: {0.0591, 0.167028, 0.306158, 0.456206, 0.818358, 0.999331}
Monte: {0.05871, 0.16875, 0.315202, 0.473064, 0.848328, 0.999938}
Case 3: $\epsilon = 0.05$ and $n = 2, 3, 4, 5, 8, 16$
Approx: {0.0975, 0.264908, 0.45964, 0.641514, 0.943438, 0.999995}
Monte: {0.097572, 0.269868, 0.479214, 0.672386, 0.967794, 1.}
Case 4: $\epsilon = 0.1$ and $n = 2, 3, 4, 5, 8, 16$
Approx: {0.19, 0.468559, 0.71757, 0.878423, 0.997261, 1.}
Monte: {0.19069, 0.487214, 0.759996, 0.92185, 0.999922, 1.}
The performance seems surprisingly good for such a simple approximation. I would be interested to know if any better approximations exist (published or otherwise).
Derivation
We are given $X$ ~ Uniform(0,1) with pdf $f(x)$:
Let $(X_1, X_2, \dots, X_n)$ denote a random sample of size $n$ drawn on $X$, and let $(X_{(1)}, X_{(2)}, \dots, X_{(n)})$ be the order statistics, such that $(X_{(1)} < X_{(2)} < \dots < X_{(n)})$.
The joint pdf of the order statistics $X_{(r)}$ and $X_{(s)}$, for $r < s$, is say $g(x_{(r)},x_{(s)})$:
where OrderStat
is a mathStatica function which I am using to automate the mechanical aspects of the calculations.
We are interested to find a pair of random variables that are so close to each other that they are separated by less than $\epsilon$. The two closest random variables in the sample $(X_1, X_2, \dots, X_n)$ must necessarily be adjoining order statistics, say $X_{(r)}$ and $X_{(r+1)}$. Replacing $s$ with $r + 1$ in the previous result simplifies the joint pdf to:
Then $P((X_{(r+1)} - X_{(r)}) < \epsilon)$ is:
Note that there is no need to specify the probability using absolute values, as we are working with the ordered sample.
Thus far, we have calculated the probability that the distance between the $r$th and $(r+1)$th adjoining order statistics is smaller than $\epsilon$. But we do not have only 1 such chance ... there are $(n-1)$ such combinations of adjacent ordered statistics to choose from, ... which suggests the following approximation, as a sort of geometric-style modification to the previous result:
$$\approx 1-(1-\epsilon )^{n (n-1)}$$
The above accuracy / performance comparison suggests it works surprisingly well.
In the case of $n=2$, no hay ninguna aproximación, y el resultado debe ser teóricamente exacta.