Not only is it possible, it is easy to create any distribution $F$ whatsoever supported on the interval $[-1/(N-2), 1]$, with the only condition that $K \le N-1$. Here is one way. It creates datasets in which all variables have the same correlation with each other.
Let $\rho$ be a random variable with distribution $F$. Define $U \ge 1/(N-1)$ as the unique solution to
$$\rho = \frac{1 + 2 U - (N-1)U^2}{2 - 2(N-2)U + (N-1)(N-2)U^2}.$$
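The equation above is quadratic in $U$, so it can be solved explicitly. As a sketch, write $c = 1 + (N-2)\rho$ (a shorthand introduced here, assumed positive, that is, $\rho \gt -1/(N-2)$). Clearing denominators and collecting terms gives

$$(N-1)\,c\,U^2 - 2c\,U + (2\rho - 1) = 0,$$

whose discriminant simplifies because $c^2 - (N-1)c(2\rho-1) = N(1-\rho)\,c$. Taking the larger root, which is the one satisfying $U \ge 1/(N-1)$,

$$U = \frac{1 + (N-2)\rho + \sqrt{N(1-\rho)\left(1+(N-2)\rho\right)}}{(N-1)\left(1+(N-2)\rho\right)}.$$

At $\rho = 1$ the square root vanishes and $U = 1/(N-1)$. This is the expression coded as the function `f` in the appendix.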
Set $V = (N-1)U-1$ and construct the $K$ vectors, each of length $N$, given by
$$\left\{\eqalign{
X_1 &= (1, V-U, -U, -U, \ldots, -U) \\
X_2 &= (1, -U, V-U, -U, \ldots, -U) \\
&\ldots \\
X_K &= (1, -U, -U, \ldots, -U, V-U, \ldots, -U).
}\right.$$
Each $X_i$ has a $1$ in the first place, $V-U$ in the $(i+1)^\text{st}$ place, and $-U$ everywhere else, so its entries sum to $1 + (V-U) - (N-2)U = 0$.
A computation (which is simple because all the $X_i$ have zero means and the same variance) shows that $\rho$ is the correlation coefficient between each $X_i$ and $X_j$, $i \ne j$. Therefore all the correlation coefficients of these $K$ random vectors of length $N$ equal $\rho$, which has distribution $F$ by construction, QED.
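For readers who want the computation spelled out, here is one way to carry it out. It takes each $X_i$ to have $1$ in the first place, $(N-2)U - 1$ in the $(i+1)^\text{st}$ place, and $-U$ in the remaining $N-2$ places. With $\mathbf{1}$ denoting the vector of ones (notation introduced here),

$$\eqalign{
X_i \cdot \mathbf{1} &= 1 + \big((N-2)U - 1\big) - (N-2)U = 0, \\
X_i \cdot X_i &= 1 + \big((N-2)U - 1\big)^2 + (N-2)U^2 = 2 - 2(N-2)U + (N-1)(N-2)U^2, \\
X_i \cdot X_j &= 1 - 2U\big((N-2)U - 1\big) + (N-3)U^2 = 1 + 2U - (N-1)U^2 \quad (i \ne j).
}$$

Because the means are zero and the sums of squares are equal, the correlation between $X_i$ and $X_j$ is $X_i \cdot X_j / X_i \cdot X_i$, which is exactly the right-hand side of the equation defining $U$, namely $\rho$.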
Appendix: Illustration via simulation
This R code simulates from a given distribution $F$. It displays histograms of the correlation coefficients and tests them for uniformity across equal-probability bins. The comments explain the details.
#
# Specify the situation.
#
N <- 20 # Dataset size
K <- 4 # Number of variables
n.sim <- 1e4 # Simulation size
#
# Predefine some objects.
#
f <- function(rho, n) { # Maps `rho` to `U`
  (1 + (n-2)*rho + sqrt(n * (1-rho)*(1+(n-2)*rho))) / ((n-1) * (1+(n-2)*rho))
}
pattern <- cbind(diag(rep(1, K)), matrix(0, K, N-K))
mask <- lower.tri(outer(1:K, 1:K))
#
# Conduct the simulation.
#
# rF <- runif # The random number generator
# qF <- qunif # The quantile function
# dF <- dunif # The density function
rF <- function(n) rbeta(n, 1, 3)
qF <- function(q) qbeta(q, 1, 3)
dF <- function(x) dbeta(x, 1, 3)
rho <- rF(n.sim) # Draw values of `rho`
#
# Construct the data and compute their correlation coefficients.
# Each row of `sim` will record one particular correlation coefficient.
# Its columns are the iterations.
#
U <- f(rho, N)
sim <- sapply(U, function(u) {
  v <- (N-1)*u - 1
  x <- matrix(rep(c(rep(-u, N-1), 1), K), nrow=K, byrow=TRUE) + v*pattern
  cor(t(x))[mask]
})
#
# Display the distributions of the correlation coefficients.
#
n.plots <- choose(K,2)
n.rows <- floor(sqrt(n.plots))
n.cols <- ceiling(n.plots/n.rows)
par(mfrow=c(n.rows, n.cols))
breaks <- qF(seq(0, 1, by=1/20))
invisible(apply(sim, 1, function(x) {
  H <- hist(x, main="Marginal Histogram", freq=FALSE, breaks=breaks)
  curve(dF(x), add=TRUE, col="Red", lwd=2)
  #
  # Test the uniformity with a chi-squared test.
  # (The breaks are quantiles of F, so each bin has the same expected count.)
  #
  p <- chisq.test(H$counts)$p.value
  mtext(paste0("(Test of uniformity: p = ", signif(p, 3), ")"), cex=0.75)
}))
par(mfrow=c(1,1))
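As a quick sanity check that is separate from the simulation, the following sketch builds the vectors for a single fixed value of the correlation and confirms that every pairwise correlation reproduces it. The helper names `rho_to_U` and `make_vectors` and the test value `0.3` are illustrative assumptions, not part of the code above. The sketch places the `1` in the first coordinate, as in the text, whereas the simulation code places it last; correlations are unaffected by this reordering.

```r
# Hypothetical helper: closed-form solution for U given rho (larger root).
rho_to_U <- function(rho, n) {
  c0 <- 1 + (n - 2) * rho
  (c0 + sqrt(n * (1 - rho) * c0)) / ((n - 1) * c0)
}

# Hypothetical helper: build the K x N matrix whose rows are X_1, ..., X_K.
make_vectors <- function(rho, N, K) {
  stopifnot(K <= N - 1, rho > -1 / (N - 2), rho <= 1)
  u <- rho_to_U(rho, N)
  x <- matrix(-u, nrow = K, ncol = N)          # -U everywhere ...
  x[, 1] <- 1                                  # ... except 1 in the first place
  x[cbind(1:K, 2:(K + 1))] <- (N - 2) * u - 1  # and V - U in place i + 1
  x
}

x <- make_vectors(0.3, N = 20, K = 4)
r <- cor(t(x))                                 # K x K correlation matrix
max_err <- max(abs(r[lower.tri(r)] - 0.3))     # deviation from the target rho
row_means <- rowMeans(x)                       # each should be zero
print(max_err)
print(row_means)
```

Both printed quantities should differ from zero only by floating-point error.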