I observe $y_i=\cos(\theta)+z_i$, $i=1,\ldots,n$, where the $z_i\sim\mathcal{N}(0,\sigma^2)$ are i.i.d. zero-mean Gaussian random variables.
I am interested in estimating $\theta\in[0,\pi]$ with minimum mean squared error (MSE).
From the general scalar case of the CRLB, I know that
$$\tag{1}\mathbb{E}[(\theta-\hat{\theta})^2]\geq\frac{\sigma^2\sin^2\theta}{n}.$$
I am wondering if an estimator exists that comes close to the CRLB in (1). Specifically, I am looking for an estimator whose MSE goes to zero when $\theta\rightarrow0$ or $\theta\rightarrow\pi$. Any ideas?
Update: Unfortunately (1) is incorrect: the $\sin^2\theta$ factor should be in the denominator, not the numerator. Thus, an estimator whose MSE goes to zero as $\theta\rightarrow0$ or $\theta\rightarrow\pi$ does not exist. I also forgot to square the $\sin\theta$ term in the analysis of the natural estimator below. When corrected, the asymptotic MSE in (2) of this estimator matches the CRLB in (1). See the answer below.
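For completeness, here is a sketch of where the corrected bound comes from (a standard Fisher-information calculation, not spelled out in the original post). The log-likelihood of the sample is
$$\ell(\theta)=\text{const}-\frac{1}{2\sigma^2}\sum_{i=1}^n\left(y_i-\cos\theta\right)^2,$$
so
$$\frac{\partial\ell}{\partial\theta}=-\frac{\sin\theta}{\sigma^2}\sum_{i=1}^n\left(y_i-\cos\theta\right),\qquad I(\theta)=\mathbb{E}\left[\left(\frac{\partial\ell}{\partial\theta}\right)^2\right]=\frac{n\sin^2\theta}{\sigma^2},$$
and hence the CRLB for an unbiased estimator is $I(\theta)^{-1}=\sigma^2/(n\sin^2\theta)$, which diverges as $\theta\rightarrow0$ or $\theta\rightarrow\pi$.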
What I tried
The "natural" estimator simply averages the $y_i$'s and takes the inverse cosine (the average must also be truncated to $[-1,1]$ so that the inverse cosine is defined). Its MSE is:
$$\mathbb{E}[(\theta-\hat{\theta})^2]=\mathbb{E}\left[\left(\theta-\cos^{-1}\left(\cos\theta+\frac{1}{n}\sum_{i=1}^nz_i\right)\right)^2\right]$$
The noise term $\frac{1}{n}\sum_{i=1}^nz_i$ gets small as $n$ grows, so a Taylor series expansion of $\cos^{-1}(\cos\theta+x)$ around $x=0$ yields:
$$\tag{2}\mathbb{E}[(\theta-\hat{\theta})^2]\approx \frac{\sigma^2}{n\sin\theta},$$
where the approximation comes from dropping the higher-order terms in the Taylor series. While the same expansion also shows that this estimator is asymptotically unbiased, the MSE in (2), unlike (1), has $\sin\theta$ in the denominator, which means the error blows up as $\theta\rightarrow0$ or $\theta\rightarrow\pi$. This seems to be an inherent problem with the estimator (and not an artifact of the approximation in (2)), as is evident from a numerical experiment:
In the figure, 'numerical' plots the MSE from the numerical experiment for $n=1000$ and $\sigma^2=1$, while 'approx $\cos^{-1}$(average)' and 'CRLB' plot (2) and (1) for the same values of $n$ and $\sigma^2$. I think the truncation of the average to $[-1,1]$ is to blame, but I am not sure how to fix this problem.
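For reference, the expansion used in (2) can be written out explicitly. With $x=\frac{1}{n}\sum_{i=1}^nz_i$ and $\frac{d}{du}\cos^{-1}(u)=-\frac{1}{\sqrt{1-u^2}}$, we have, for $\theta\in(0,\pi)$,
$$\cos^{-1}(\cos\theta+x)\approx\theta-\frac{x}{\sin\theta},$$
so squaring and using $\mathbb{E}[x^2]=\sigma^2/n$ gives $\mathbb{E}[(\theta-\hat{\theta})^2]\approx\sigma^2/(n\sin^2\theta)$, i.e. (2) with the square on $\sin\theta$ restored, as noted in the update.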
For reference, here is the MATLAB code used to generate the 'numerical' curve above:
theta_array=linspace(0,pi,20);
n=1000;        % samples per trial
trials=1000;   % Monte Carlo trials
for ii=1:20
    theta=theta_array(ii);
    z=randn(n,trials);               % i.i.d. N(0,1) noise, i.e. sigma^2=1
    average=cos(theta)+mean(z);      % sample mean of the y_i for each trial
    average(average>1)=1;            % truncate to [-1,1] so acos is defined
    average(average<-1)=-1;
    error=theta-acos(average);
    mean_error(ii)=mean(error);      % empirical bias
    mse(ii)=mean(error.^2);          % empirical MSE
end
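The same experiment can also be sketched in Python/NumPy (my re-implementation, not the original code; the function name and parameters are mine), which makes it easy to check that, away from the endpoints, the estimator's MSE is close to the corrected bound $\sigma^2/(n\sin^2\theta)$:

```python
# Monte Carlo check of the truncated-arccos estimator (a sketch under the
# assumptions of the post: n=1000 samples, sigma^2=1).
import numpy as np

def mc_mse(theta, n=1000, trials=1000, sigma2=1.0, seed=0):
    """Empirical MSE of theta_hat = arccos(clip(mean(y), -1, 1))."""
    rng = np.random.default_rng(seed)
    z = rng.normal(scale=np.sqrt(sigma2), size=(trials, n))
    avg = np.clip(np.cos(theta) + z.mean(axis=1), -1.0, 1.0)  # truncate to [-1, 1]
    return np.mean((theta - np.arccos(avg)) ** 2)

for theta in (0.5, np.pi / 2, 2.5):
    crlb = 1.0 / (1000 * np.sin(theta) ** 2)  # sigma^2/(n sin^2(theta)), sigma^2=1
    print(f"theta={theta:.2f}  mse={mc_mse(theta):.3e}  crlb={crlb:.3e}")
```

Near $\theta=0$ or $\theta=\pi$ the truncation kicks in and the empirical MSE departs from the bound, consistent with the figure above.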