Para convenienc ef notación que utiliza $\pi_i=\pi(c_i)$.
Para $\mu$, usted debe tomar el promedio ponderado de la media:
$$\mu = \sum_{i=1}^{C}\pi_i\mu_i$$
For the covariance matrix:
$$\Sigma=\left(\sum_i^C \pi_i (\Sigma_i+\mu_i\mu_i^T)\right)-\mu\mu^T$$
For the intuitive reason of why this works, think about the mean of all points that are drawn from the GMM, where do you expect the mean to be?
But, in the following I'm writing a rigorous proof for that:
For $\mu$, you should calculate: $E_{x\sim GMM}[x]$
$$E_{x\sim GMM}[x]=\int_{x\in \mathcal{X}} x\sum_{i=1}^C \pi_i \frac{1}{|2\pi \Sigma_i|^\frac{-1}{2}}e^{-\frac{1}{2}(x-\mu_i)^T\Sigma_i^{-1}(x-\mu_i)}dx$$
$$\Rightarrow=\sum_{i=1}^C \pi_i \int_{x\in \mathcal{X}} x \frac{1}{|2\pi \Sigma_i|^\frac{-1}{2}}e^{-\frac{1}{2}(x-\mu_i)^T\Sigma_i^{-1}(x-\mu_i)}dx$$
$$\Rightarrow=\sum_{i=1}^C \pi_i \mu_i$$
For the covariance, you should calculate: $$E_{x\sim GMM}[(x-\mu)(x-\mu)^T]=E_{x\sim GMM}[xx^T]-\mu\mu^T$$
Let's focus on $E_{x\sim GMM}[xx^T]$:
$$E_{x\sim GMM}[xx^T]=\int_{x\in \mathcal{X}} xx^T\sum_{i=1}^C \pi_i \frac{1}{|2\pi \Sigma_i|^\frac{-1}{2}}e^{-\frac{1}{2}(x-\mu_i)^T\Sigma_i^{-1}(x-\mu_i)}dx$$
$$\Rightarrow = \sum_{i=1}^C \pi_i \int_{x\in \mathcal{X}}xx^T\frac{1}{|2\pi \Sigma_i|^\frac{-1}{2}}e^{-\frac{1}{2}(x-\mu_i)^T\Sigma_i^{-1}(x-\mu_i)}dx$$
$$\Rightarrow = \sum_{i=1}^C \pi_i \int_{x\in \mathcal{X}}xx^T\frac{1}{|2\pi \Sigma_i|^\frac{-1}{2}}e^{-\frac{1}{2}(x-\mu_i)^T\Sigma_i^{-1}(x-\mu_i)}dx$$
$$\Rightarrow = \sum_{i=1}^C \pi_i (\Sigma_i+\mu_i\mu_i^T)$$
Therefore the covariance of the GMM is:
$$\Sigma=\left(\sum_{i=1}^C \pi_i (\Sigma_i+\mu_i\mu_i^T)\right)-\mu\mu^T$$
El siguiente código de Matlab verifica los resultados teóricos para un GMM con dos Gaussianas:
n1=1000000;
n2=3000000;
p1=n1/(n1+n2);
p2=n2/(n1+n2);
mu1=[0,0,0];
mu2=[10,10,10];
A=rand(3);
S1=A'*A
A=rand(3);
S2=A'*A
r1 = mvnrnd(mu1,S1,n1);
r2 = mvnrnd(mu2,S2,n2);
S1
S1_hat=cov(r1)
S2
S2_hat=cov(r2)
r=[r1;r2];
mu=mean(r)
mu_hat=p1*mu1+p2*mu2
S=cov(r)
S_hat=p1*(S1+mu1'*mu1)+p2*(S2+mu2'*mu2)-mu_hat'*mu_hat
Aquí está el resultado de ejecutar el código:
mu =
7.5009 7.5007 7.5000
mu_hat =
7.5000 7.5000 7.5000
S =
20.5464 20.4126 19.7789
20.4126 20.4026 19.7273
19.7789 19.7273 19.8504
S_hat =
20.5485 20.4149 19.7801
20.4149 20.4051 19.7284
19.7801 19.7284 19.8508