It sounds as though you have a dataset consisting of measurements $(x_i, i=1,2,\ldots,n)$ and $(y_j, j=1,2,\ldots,m)$ (with $n=30$, $m=12$). Let us postulate that:
All measurements can be considered independent random variables.
There is a fixed quantity ("parameter") $\mu$ such that all the $x_i-\mu$ have a common distribution $F$ (whose expectation is $0$, reflecting a lack of bias in the measurements) and all the $y_j - 1/\mu$ have a common distribution $G$ (also with expectation $0$).
One way to make some progress is to study the error distributions $F$ and $G$. To illustrate how this information can be used, consider a broad application of the model in which the distributions have identical shapes but unknown amounts of dispersion (which we will measure with the variance). Let the variance of $F$ be $\sigma^2$ and the variance of $G$ be $\tau^2$. Often these distributions will be approximately Normal, for example (although many other forms of error can be modeled).
The independence assumptions imply that the likelihood of the observations, $L$, is the product of the probability densities. Let $\phi$ be the density for unit variance. When we assume Normally distributed variation, for example,
$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp({-z^2/2}).$$
Then $\phi_\sigma(x) = \phi(x/\sigma)/\sigma$ is the density of $F$ and $\phi_\tau(y)=\phi(y/\tau)/\tau$ is the density of $G$. Accordingly,
$$L(\mu, \sigma, \tau; (x_i), (y_j)) = \prod_{i=1}^n \phi_\sigma(x_i-\mu) \prod_{j=1}^m \phi_\tau(y_j-1/\mu).$$
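In `R`, for example, this likelihood can be computed directly as a product of Normal densities (a minimal sketch; the name `likelihood` is just illustrative):

likelihood <- function(mu, sigma, tau, x, y) {
  # dnorm(x - mu, 0, sigma) is phi_sigma(x - mu); similarly for the y terms
  prod(dnorm(x - mu, 0, sigma)) * prod(dnorm(y - 1/mu, 0, tau))
}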
We may estimate $\mu$ using the method of Maximum Likelihood: find values of $\mu,\sigma,\tau$ that make this likelihood as large as possible.
To simplify the products, and to comply with the convention that optimization problems are usually cast as minimization problems, let us minimize the negative log likelihood
$$\eqalign{
\Lambda(\mu,\sigma,\tau) &= -\log(L(\cdots)) \\
&= -\sum_{i=1}^n \left(\log \phi\left(\frac{x_i-\mu}{\sigma}\right) - \log \sigma \right) - \sum_{j=1}^m \left(\log \phi\left(\frac{y_j-1/\mu}{\tau}\right) - \log \tau \right) \\
&= n\log\sigma + m\log\tau - \sum_{i=1}^n \log \phi\left(\frac{x_i-\mu}{\sigma}\right) - \sum_{j=1}^m\log \phi\left(\frac{y_j-1/\mu}{\tau}\right).
}$$
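This formula translates directly into `R` (again only a sketch; `log_phi`, the log of the unit-variance error density, is a hypothetical argument defaulting to the standard Normal):

Lambda.full <- function(mu, sigma, tau, x, y,
                        log_phi = function(z) dnorm(z, log=TRUE)) {
  length(x)*log(sigma) + length(y)*log(tau) -
    sum(log_phi((x - mu)/sigma)) - sum(log_phi((y - 1/mu)/tau))
}

The `lambda` function in the code at the end of this answer keeps only the terms that depend on $\mu$ once $\sigma$ and $\tau$ have been fixed, which is all that matters for locating $\hat\mu$.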
To continue the illustration, assume from now on that the error distributions are Normal. We easily find that the minimum must occur when $\sigma^2$ is the variance of the $(x_i)$ and $\tau^2$ is the variance of the $(y_j)$:
$$\hat\sigma^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2; \quad \bar x = \frac{1}{n}\sum_{i=1}^n x_i; \\
\hat\tau^2= \frac{1}{m} \sum_{j=1}^m (y_j - \bar y)^2; \quad \bar y = \frac{1}{m}\sum_{j=1}^m y_j.$$
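(To fill in the computation: differentiating $\Lambda$ with respect to $\sigma$ under the Normal assumption gives
$$\frac{\partial \Lambda}{\partial\sigma} = \frac{n}{\sigma} - \frac{1}{\sigma^3}\sum_{i=1}^n (x_i-\mu)^2 = 0 \quad\Longrightarrow\quad \sigma^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \mu)^2,$$
which, evaluated near the optimum where $\mu \approx \bar x$, is the sample variance displayed above; the same computation applies to $\tau$.)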
It remains to find $\hat\mu$ for which $\Lambda(\hat\mu,\hat\sigma,\hat\tau)$ is minimum. This value could be any real number; there are no boundary values to check. Since $\Lambda$ is a differentiable function of its first argument, the minimum must occur at a zero of its derivative:
$$0 = \frac{\partial}{\partial \mu}\Lambda(\mu,\cdots) = \frac{1}{ \hat\sigma }\sum_{i=1}^n \frac{\phi^\prime\left(\frac{x_i-\mu}{\hat\sigma}\right)}{ \phi\left(\frac{x_i-\mu}{\hat\sigma}\right) } -
\frac{1}{\mu^2 \hat\tau}\sum_{j=1}^m \frac{\phi^\prime\left(\frac{y_j-1/\mu}{\hat\tau}\right)}{ \phi\left(\frac{y_j-1/\mu}{\hat\tau}\right) }. $$
Normal distributions are often chosen in models precisely because the function $\phi^\prime(z)/\phi(z) = -z$ is linear, making such equations easy to solve. In this case the presence of $1/\mu$ complicates things a bit:
$$\frac{n}{\hat \sigma^2}\left(\bar x - \mu\right) = \frac{1}{\hat\sigma}\sum_{i=1}^n \frac{x_i - \mu}{\hat\sigma} =
\frac{1}{\mu^2\hat\tau}\sum_{j=1}^m\frac{y_j - 1/\mu}{\hat\tau} = \frac{m}{\mu^2\hat\tau^2}\left(\bar y - 1/\mu\right).$$
The equation in $\mu$, whose solutions must include the estimate $\hat\mu$, is of fourth degree, rather than linear. Nevertheless it can be solved numerically and typically will produce a global minimum somewhere near $\bar x$ or $1/\bar y$, provided there are enough data and their variances are not too large. (The presence of negative values is not a good sign!)
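Explicitly, multiplying the stationarity condition through by $\mu^3$ and collecting terms gives
$$-\frac{n}{\hat\sigma^2}\mu^4 + \frac{n\bar x}{\hat\sigma^2}\mu^3 - \frac{m\bar y}{\hat\tau^2}\mu + \frac{m}{\hat\tau^2} = 0.$$
One way to solve this numerically in `R` is to pass the coefficients to `polyroot` and keep the real roots (a sketch; it assumes data vectors `x` and `y` as above, and the variable names are illustrative):

sigma2.hat <- mean((x - mean(x))^2)
tau2.hat <- mean((y - mean(y))^2)
n <- length(x); m <- length(y)
cf <- c(m/tau2.hat, -m*mean(y)/tau2.hat, 0, n*mean(x)/sigma2.hat, -n/sigma2.hat)
roots <- polyroot(cf)              # coefficients in increasing powers of mu
Re(roots)[abs(Im(roots)) < 1e-8]   # real candidates; keep the one minimizing Lambda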
(Alternatively, we might hope that the variance of $s$ decreases with $\mu$, as is often the case in measuring positive quantities. In that case we might discover that the $y_j$ are perhaps better modeled using distributions whose variances are $\tau^2/\mu^2$ (for example). This would turn the preceding equation back into one which is linear in $\mu$, making it straightforward to solve. This possibility suggests there is value in studying how the precision of the measurement process producing the $y_j$ might vary with $\mu$. The $x_i$ measurement process deserves a comparable study.)
Simulations suggest that with the conditions described in the question ($\bar x$ near $3$, $n=30$, $m=12$, and some negative values in the $s$ data), using the $s$ data actually does not improve the precision of the estimates. The estimates are improved when the aggregate $s$ measurements are relatively more precise than the aggregate $x$ measurements; that is, when $m/(\mu^4\tau^2) \gg n/\sigma^2$ (equivalently, $\tau \ll \sqrt{m/n}\,\sigma/\mu^2$), assuming $\mu \gt 1$. Here is an example of that good situation, and indeed $\hat\mu$ is closer to $\mu$ than $\bar x$ is:
The vertical solid blue lines are the true mean $\mu=3$. The vertical solid gray lines show the means $\bar x$ and $1/\bar y$. The vertical dashed red lines show the ML estimate $\hat\mu$. The horizontal dashed red line in the Profile Likelihood plot shows an upper $95\%$ confidence limit for $\Lambda$: values of $\mu$ for which the graph of $\Lambda$ lies below this limit form a two-sided $95\%$ confidence interval for $\mu$. In this example that interval just barely includes the true value of $\mu$.
FWIW, applying this procedure to the data (as given in a comment to another answer, interpreting the 12 values of "first var" to be $x$ and the 30 values of "second var" to be $y$) yields $\hat\mu = 1.79$, with a $95\%$ confidence interval of approximately $[0.9, 3.2]$. The data reflect a large amount of measurement error: $\hat\sigma=1.85$ and $\hat\tau=1.40$. Here is a summary of the data and the fit:
Here is the `R` code to compute $\hat\mu$, $\hat\sigma$, and $\hat\tau$, and to carry out such simulations.
#
# Negative log (partial) likelihood.
#
lambda <- function(mu, sigma2, tau2, x, y) {
  (sum((x - mu)^2)/sigma2 + sum((y - 1/mu)^2)/tau2)/2
}
#
# Maximum likelihood estimation.
#
mle <- function(x, y) {
  sigma.hat <- mean((x-mean(x))^2)  # ML estimate of sigma^2 (a variance, despite the name)
  tau.hat <- mean((y-mean(y))^2)    # ML estimate of tau^2
  fit <- optimize(lambda, c(min(1/max(y), min(x)), max(x, 1/min(y))),
                  sigma2=sigma.hat, tau2=tau.hat, x=x, y=y)
  list(mu.hat=fit$minimum, sigma.hat=sigma.hat, tau.hat=tau.hat,
       Lambda=fit$objective)
}
#
# Create sample data.
#
set.seed(17)
n <- 30; m <- 12
mu <- 3
sigma <- 1/2
tau <- 0.5 * (m/n) * sigma / mu^2  # small tau: the y measurements are relatively precise
x <- rnorm(n, mu, sigma)
y <- rnorm(m, 1/mu, tau)
#
# Find the solution.
#
fit <- mle(x, y)
#
# Plot the data and profile log likelihood
#
se <- sd(x) / sqrt(n)
i <- seq(fit$mu.hat-3*se, fit$mu.hat+3*se, length.out=101)
z <- sapply(i, function(j) lambda(j, fit$sigma.hat, fit$tau.hat, x, y))
markup <- function(z) {
  abline(v = mu, col="Blue", lwd=2)
  if(!missing(z)) abline(v = z, col="Gray", lwd=2)
  abline(v = fit$mu.hat, lwd=2, col="Red", lty=3) #$
}
par(mfrow=c(1,3))
hist(x, freq=FALSE); markup(mean(x))
hist(1/y, freq=FALSE); markup(1/mean(y))
plot(i, z, type="l", xlab="mu", ylab="Lambda", main="Profile Likelihood")
abline(v = mu, col="Blue", lwd=2)
abline(h = fit$Lambda + qchisq(0.95, 1)/2, lty=3, lwd=2, col="Red") # 95% profile-likelihood cutoff
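To check the earlier claim about when the $y$ data actually help (a sketch that reuses `mle` and the parameters set above; the choice of $1000$ replications is arbitrary), repeat the experiment and compare the root-mean-square errors of $\bar x$, $1/\bar y$, and $\hat\mu$:

#
# Simulation study (illustrative): RMSE of three estimators of mu.
#
sim <- replicate(1000, {
  x <- rnorm(n, mu, sigma)
  y <- rnorm(m, 1/mu, tau)
  c(xbar=mean(x), inv.ybar=1/mean(y), mu.hat=mle(x, y)$mu.hat)
})
sqrt(rowMeans((sim - mu)^2))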