Conflict between the confidence interval for the mean difference and the confidence interval for Cohen's effect size

Question

Asked on February 18, 2014

I ran an experiment to compare the performance of two algorithms. The experiment used a paired-comparisons design. I am reporting my results this way:

There were no outliers in the paired differences, as assessed by inspection of a boxplot. The assumption of normality was not violated, as assessed by skewness of 0.2276 (SE = 0.4405) and kurtosis of -0.2766 (SE = 0.8583). Performance was higher for algorithm A (M = 0.3876, SD = 0.3138) than for algorithm B (M = 0.2241, SD = 0.3476), a statistically significant mean increase of 0.1635, 95% CI [0.0393, 0.2877], t(27) = 2.7007, p = 0.0118, d = 0.4938, 95% CI for d [-0.0501, 1.0378].

Note that the 95% confidence interval for the mean difference does not include zero, but the 95% confidence interval for Cohen's d does. I was about to conclude that algorithm A performs better than algorithm B, with statistical significance and a medium effect size, but I am confused about how to interpret the confidence interval for Cohen's d.

What can I say about the effect size with these data?

Below are my data and how I computed the values in R.

Thank you for your attention.

a = c(0.40000000, 0.44011976, 0.72727273, 0.50000000, 0.00000000, 0.07692308, 0.00000000, 0.00000000, 0.00000000, 1.00000000, 0.50000000, 0.91666667, 0.19354839, 0.74883721, 0.50000000, 0.50000000, 0.55000000, 0.17142857, 0.50000000, 0.51351351, 0.68000000, 0.85714286, 0.03703704, 0.05454545, 0.54219949, 0.44444444, 0.00000000, 0.00000000)
b = c(0.00000000, 0.54491018, 0.72727273, 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.00000000, 1.00000000, 0.00000000, 0.00000000, 0.00000000, 0.33953488, 0.00000000, 0.00000000, 0.00000000, 0.48571429, 0.00000000, 0.83783784, 0.80000000, 0.57142857, 0.00000000, 0.00000000, 0.06393862, 0.90476190, 0.00000000, 0.00000000)
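# descriptive statistics for each algorithm
mean(a)
sd(a)
mean(b)
sd(b)
# paired t-test and 95% CI for the mean difference
t.test(a, b, paired=TRUE)

# Cohen's d and its 95% CI from the means, SDs, and sample sizes
library(compute.es)
mes(mean(a), mean(b), sd(a), sd(b), length(a), length(b), dig=4)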

Asked on February 18, 2014 by John Albietz

Answer 1

2 Answers

Answer 2

6 votes

Derek Swingley (3851 points)

There are effect size measures analogous to Cohen's d for paired data, sometimes called the "standardized mean change" or "standardized mean gain". This is computed with $$d = \frac{\bar{x}_1 - \bar{x}_2}{SD_D} = \frac{\bar{x}_D}{SD_D},$$ where $\bar{x}_1$ is the mean at time 1 (or under condition 1), $\bar{x}_2$ is the mean at time 2 (or under condition 2), $\bar{x}_D$ is the mean of the change/difference scores, and $SD_D$ is the standard deviation of the change/difference scores.

This is the standardized mean change using "change score standardization". There is also the standardized mean change using "raw score standardization", but the former relates more directly to your use of the dependent samples t-test.
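As a minimal sketch, the formula above can be evaluated directly on the data from the question. The variable names `diffs`, `d`, and `t_paired` are chosen here for illustration. The last line checks the link to the dependent samples t-test: with change score standardization, d equals the paired t statistic divided by $\sqrt{n}$.

```r
# Data from the question (28 paired observations)
a <- c(0.40000000, 0.44011976, 0.72727273, 0.50000000, 0.00000000, 0.07692308, 0.00000000,
       0.00000000, 0.00000000, 1.00000000, 0.50000000, 0.91666667, 0.19354839, 0.74883721,
       0.50000000, 0.50000000, 0.55000000, 0.17142857, 0.50000000, 0.51351351, 0.68000000,
       0.85714286, 0.03703704, 0.05454545, 0.54219949, 0.44444444, 0.00000000, 0.00000000)
b <- c(0.00000000, 0.54491018, 0.72727273, 0.00000000, 0.00000000, 0.00000000, 0.00000000,
       0.00000000, 0.00000000, 1.00000000, 0.00000000, 0.00000000, 0.00000000, 0.33953488,
       0.00000000, 0.00000000, 0.00000000, 0.48571429, 0.00000000, 0.83783784, 0.80000000,
       0.57142857, 0.00000000, 0.00000000, 0.06393862, 0.90476190, 0.00000000, 0.00000000)

diffs <- a - b            # change/difference scores
n     <- length(diffs)    # number of pairs

# Standardized mean change with change score standardization (no bias correction)
d <- mean(diffs) / sd(diffs)
d

# The same quantity obtained from the paired t statistic
t_paired <- unname(t.test(a, b, paired = TRUE)$statistic)
all.equal(d, t_paired / sqrt(n))
```

This value carries no small-sample bias correction, so it will be slightly larger than a bias-corrected estimate.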

You can use the metafor package to compute this (and the corresponding CI):

summary(escalc(measure="SMCC", m1i=mean(a), sd1i=sd(a), m2i=mean(b), sd2i=sd(b), ni=length(a), ri=cor(a,b)))

yields:

      yi     vi    sei     zi  ci.lb  ci.ub
1 0.4961 0.0401 0.2003 2.4769 0.1035 0.8886

So, now the CI doesn't include 0 anymore, which is consistent with the results from the t-test. (Note: the value under yi is the d-value above, but after using a slight bias correction).
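The columns of that output can be roughly reconstructed by hand from the t statistic reported in the question, t(27) = 2.7007 with n = 28 pairs. This is only a sketch under two assumptions that are not stated in the answer: the bias correction is approximated by $J = 1 - 3/(4\,df - 1)$, and the sampling variance is taken as $1/n + d^2/(2n)$. The package may compute the correction slightly differently, so agreement is expected only up to rounding.

```r
# Summary statistics reported in the question
tval <- 2.7007   # paired t statistic, t(27)
n    <- 28       # number of pairs
df   <- n - 1

d_raw <- tval / sqrt(n)            # uncorrected standardized mean change
J     <- 1 - 3 / (4 * df - 1)      # ASSUMED approximate bias correction factor
yi    <- J * d_raw                 # bias-corrected estimate

vi  <- 1 / n + yi^2 / (2 * n)      # ASSUMED large-sample sampling variance
sei <- sqrt(vi)
zi  <- yi / sei

# Normal-approximation 95% CI
ci <- yi + c(-1, 1) * qnorm(0.975) * sei

round(c(yi = yi, vi = vi, sei = sei, zi = zi, ci.lb = ci[1], ci.ub = ci[2]), 4)
```

Because the interval is built from a normal quantile and an approximate variance, it is an asymptotic interval rather than an exact one.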

Some references if you want to read more about this:

Morris, S. B., & DeShon, R. P. (2002). Combining effect size estimates in meta-analysis with repeated measures and independent-groups designs. Psychological Methods, 7, 105–125.

Viechtbauer, W. (2007). Approximate confidence intervals for standardized effect sizes in the two-independent and two-dependent samples design. Journal of Educational and Behavioral Statistics, 32, 39–60.

Update: Getting the exact CI for d.

In rare cases, the t-test (and the CI for the mean difference) can lead to a different conclusion than the CI for d obtained above (i.e., the CI for the mean difference includes 0 while the CI for d does not, or vice versa). This happens because the CI for d is based on an asymptotic approximation using the normal distribution.

One can compute an exact CI for the standardized mean change, but this requires iterative methods (see Viechtbauer, 2007, and the references given therein). The advantage of the exact CI is that it will always agree 100% with the results from the t-test and the CI for the mean difference in its conclusion.

Instead of letting the computer do the iterative work for us (which can be done in a few lines of code), one can also just do this manually by trial and error. For the data given in http://pastebin.com/12J7UghC, the bounds of the exact CI for d can be obtained with:

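# a and b here are the paired vectors from the pastebin link above, not the data in the question
tval <- t.test(a, b, paired=TRUE)$statistic
# each call returns approximately .025 when ncp = (CI bound for d) * sqrt(n)
pt(tval, df=length(a)-1, ncp=-0.00265265 * sqrt(length(a)), lower.tail=TRUE)
pt(tval, df=length(a)-1, ncp=-0.77193310 * sqrt(length(a)), lower.tail=FALSE)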

Essentially, we just need to find the two values of the non-centrality parameter of the t-distribution for which the observed t-value cuts off .025 in the lower and upper tails of the distribution. With a bit of trial and error (starting from the CI bounds obtained earlier), we find that the exact 95% CI for d is $(-0.772, -0.003)$. And now things are consistent again: the t-test rejects (just barely, with $p = .048$), the CI for the mean difference excludes 0 (just barely), and the CI for d excludes 0 (just barely).
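The trial-and-error search can also be handed to a root finder. The sketch below uses base R's `uniroot()` and applies it to the t statistic reported in the question, t(27) = 2.7007 with n = 28 pairs, rather than to the pastebin data, which is not reproduced here. The helper `tail_gap` and the search interval of t ± 6 are illustrative choices, not part of the original answer.

```r
# Summary statistics reported in the question
tval <- 2.7007   # paired t statistic
n    <- 28       # number of pairs
df   <- n - 1

# Lower-tail probability of the observed t under a noncentral t
# distribution with parameter ncp, minus the target tail area.
tail_gap <- function(ncp, target) pt(tval, df = df, ncp = ncp) - target

# Lower CI bound: ncp for which the observed t cuts off .025 in the upper tail
ncp_lower <- uniroot(tail_gap, interval = c(tval - 6, tval + 6),
                     target = 0.975, tol = 1e-10)$root
# Upper CI bound: ncp for which the observed t cuts off .025 in the lower tail
ncp_upper <- uniroot(tail_gap, interval = c(tval - 6, tval + 6),
                     target = 0.025, tol = 1e-10)$root

# Convert the non-centrality parameters to the d scale
ci_exact <- c(lower = ncp_lower, upper = ncp_upper) / sqrt(n)
ci_exact
```

Since the paired t-test in the question rejects at the .05 level, the lower bound found this way is above 0, in agreement with the test. The interval is for the uncorrected d (t divided by $\sqrt{n}$), not for a bias-corrected estimate.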

Answered on February 18, 2014 by Derek Swingley (3851 points)

Answer 3

3 votes

Zizzencs (1358 points)

mes does not take the pairing of the data into account, so it is a different test. Paired tests are usually more powerful, so it is not surprising that the paired test was significant and the unpaired one was not.
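To illustrate, the d and interval reported in the question can be approximately reproduced from the summary statistics alone by treating the two sets of scores as independent groups. This is a sketch under assumptions: the pooled-SD formula, the variance $(n_1 + n_2)/(n_1 n_2) + d^2/(2(n_1 + n_2))$, and the t quantile with $n_1 + n_2 - 2$ degrees of freedom were chosen because they match the reported numbers, not taken from the compute.es source.

```r
# Summary statistics reported in the question
m_a <- 0.3876; sd_a <- 0.3138   # algorithm A
m_b <- 0.2241; sd_b <- 0.3476   # algorithm B
n   <- 28                       # observations per algorithm

# Independent-groups Cohen's d: the correlation between a and b is ignored
sd_pooled <- sqrt((sd_a^2 + sd_b^2) / 2)   # pooled SD for equal group sizes
d_indep   <- (m_a - m_b) / sd_pooled

# ASSUMED sampling variance and t-based 95% CI for independent groups
var_d <- (n + n) / (n * n) + d_indep^2 / (2 * (n + n))
half  <- qt(0.975, df = 2 * n - 2) * sqrt(var_d)
ci_indep <- c(lower = d_indep - half, upper = d_indep + half)

round(c(d = d_indep, ci_indep), 4)
```

The interval is wide enough to include 0 because nothing in the calculation uses the pairing.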

Answered on February 18, 2014 by Zizzencs (1358 points)
