Me gustaría recibir ayuda para encontrar la composición del intercepto en un modelo lineal mixto generalizado.
Para tu información, uso un Macbook Pro de 2013 con un chip intel de doble núcleo a 2,4 GHz, 8 GB de ram, macOS big sur 11.2.2, RStudio versión 1.4.1106 y el paquete base de R 4.04.
Este es el modelo general que he utilizado: price ~ cut + color + carat + (1 | clarity) + (1 | depth)
. Utilicé la prioridad por defecto para el enfoque bayesiano. Utilicé las primeras 300 filas del conjunto de datos ggplot2::diamonds.
Tenga en cuenta que he adoptado enfoques tanto frecuentista (lme4) como bayesiano (brms) para analizar estos resultados.
¿Cómo puedo averiguar qué niveles de los IVs se utilizaron para generar la base del intercepto? ¿Tengo que localizarlo lógicamente mirando summary(mlm_bayes_proper)
? ¿Hay algún código que pueda utilizar para encontrar esto? ¿Está ya presente en alguna parte y me lo he perdido? ¿Hay algún otro método? ¿Difiere cuando se utiliza el enfoque frecuentista frente al bayesiano o los niveles IV del intercepto son los mismos?
El código que he utilizado para los análisis y algunos resultados se encuentran a continuación.
Gracias.
Aquí hay información sobre el conjunto de datos utilizado:
[1] "diamonds300 dataset info.txt"
[1] "# ---- NOTE: gives dataset info"
[1] " \n "
[1] "head(diamonds300)"
carat cut color clarity depth table price x y z
1 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31
2 0.86 Fair E SI2 55.1 69 2757 6.45 6.33 3.52
3 0.84 Fair G SI1 55.1 67 2782 6.39 6.20 3.47
4 0.70 Fair G VVS1 58.8 66 2797 5.81 5.90 3.44
5 0.76 Fair G VS1 59.0 70 2800 5.89 5.80 3.46
6 0.57 Fair E VVS1 58.7 66 2805 5.34 5.43 3.16
[1] " \n "
[1] "str(diamonds300)"
'data.frame': 327 obs. of 10 variables:
$ carat : num 0.23 0.86 0.84 0.7 0.76 0.57 0.74 0.91 0.98 0.71 ...
$ cut : Ord.factor w/ 5 levels "Fair"<"Good"<..: 2 1 1 1 1 1 1 1 1 1 ...
$ color : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 4 4 4 2 3 5 2 1 ...
$ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 5 2 3 7 5 7 4 2 2 4 ...
$ depth : num 56.9 55.1 55.1 58.8 59 58.7 61.1 61.3 53.3 56.9 ...
$ table : num 65 69 67 66 70 66 68 67 67 65 ...
$ price : int 327 2757 2782 2797 2800 2805 2805 2825 2855 2858 ...
$ x : num 4.05 6.45 6.39 5.81 5.89 5.34 5.82 6.24 6.82 5.89 ...
$ y : num 4.07 6.33 6.2 5.9 5.8 5.43 5.75 6.19 6.74 5.84 ...
$ z : num 2.31 3.52 3.47 3.44 3.46 3.16 3.53 3.81 3.61 3.34 ...
NULL
[1] " \n "
[1] "colnames(diamonds300)"
[1] "carat" "cut" "color" "clarity" "depth" "table" "price" "x" "y" "z"
[1] " \n "
[1] "nrow(diamonds300)"
[1] 327
[1] " \n "
[1] "# ---- NOTE: gives unique values of Fixed and Random effects"
[1] "unique(diamonds300$cut)"
[1] Good Fair Very Good
Levels: Fair < Good < Very Good < Premium < Ideal
[1] " \n "
[1] "unique(diamonds300$color)"
[1] E G F H D I J
Levels: D < E < F < G < H < I < J
[1] " \n "
[1] "unique(diamonds300$carat)"
[1] 0.23 0.86 0.84 0.70 0.76 0.57 0.74 0.91 0.98 0.71 0.75 0.72 0.88 0.90 0.99 1.06 0.85 0.73 1.20 0.92 1.00 0.95
[23] 1.01 0.96 1.14 1.02 0.61 1.17 1.18 0.94 0.97 0.31 1.16 1.05 1.45 0.51 1.07 1.13 0.30 0.93 1.24 1.04 1.35 1.65
[45] 1.50 1.21 1.42 1.56 2.01 1.44 1.51 1.57 1.52 1.53 1.76 1.62 1.55 2.00 3.00 2.10 0.29 1.91 1.32 2.29 1.98 2.03
[67] 2.51 2.48 0.45 0.36 0.50 0.37 0.46 0.25 0.35 0.40 0.43 0.24 0.42 0.54 0.62 0.49 0.56 0.89 0.68 0.53 0.52 0.64
[89] 0.67 0.63 0.55 0.77 0.69 0.60 0.81 0.79 0.82 0.80 0.78
[1] " \n "
[1] "unique(diamonds300$clarity)"
[1] VS1 SI2 SI1 VVS1 VS2 I1 VVS2 IF
Levels: I1 < SI2 < SI1 < VS2 < VS1 < VVS2 < VVS1 < IF
[1] " \n "
[1] "unique(diamonds300$depth)"
[1] 56.9 55.1 58.8 59.0 58.7 61.1 61.3 53.3 55.8 56.6 64.1 56.0 58.0 61.6 59.1 61.0 57.7 58.6 62.2 62.5 62.4 57.4
[23] 64.6 57.5 60.9 60.7 58.5 61.5 58.1 60.1 61.8 60.8 57.6 60.0 65.7 65.6 58.9 59.8 59.2 57.0 59.3 56.8 56.4 56.5
[45] 55.2 64.8 62.6 60.2 64.9 59.5 62.0 60.6 55.9 61.7 63.1 61.9 61.4 56.3 60.4 68.5 57.8 59.6 55.6 58.4 67.3 59.9
[67] 62.1 59.7 56.7 58.2 55.3 64.2 58.3 57.3 61.2 62.7 59.4 60.3 57.9 62.8 63.4 51.0 54.2 52.7 62.9 54.3 65.5 57.2
[89] 56.1 62.3 55.0 52.2 63.6 53.4 63.9 68.8 68.2 63.0 65.3 56.2 65.8 55.5 79.0 54.7 64.5 64.3
[1] " \n "
[1] "unique(diamonds300$table)"
[1] 65.0 69.0 67.0 66.0 70.0 68.0 95.0 71.0 73.0 65.4 79.0 76.0
[1] " \n "
Aquí está el código R que utilicé para crear el modelo:
# generalized linear mixed models
## packages used
# ---- NOTE: for loading diamonds dataset
if(!require(ggplot2)){install.packages("ggplot2")}
# ---- NOTE: for interpreting mixed effect models
if(!require(jtools)){install.packages("jtools")}
# ---- NOTE: for bayes modeling
if(!require(rstan)){install.packages("rstan")}
# ---- NOTE: for bayes modeling
if(!require(brms)){install.packages("brms")}
# ---- NOTE: run mixed effects models
if(!require(lme4)){install.packages("lme4")}
# ---- NOTE: run mixed effects models comparisons
if(!require(lsmeans)){install.packages("lsmeans")}
# ---- NOTE: run mixed effects models comparisons
if(!require(emmeans)){install.packages("emmeans")}
# ---- NOTE: data wrangling
if(!require(tidyverse)){install.packages("tidyverse")}
# ---- NOTE: for mixed effect models
if(!require(car)){install.packages("car")}
### dataset
# ---- NOTE: selects only the top 300 rows of the dataset
diamonds300 <- data.frame(top_n(diamonds, 300, table))
# ---- NOTE: gives dataset info
head(diamonds300)
str(diamonds300)
colnames(diamonds300)
nrow(diamonds300)
# ---- NOTE: gives unique values of Fixed and Random effects
unique(diamonds300$cut)
unique(diamonds300$color)
unique(diamonds300$carat)
unique(diamonds300$clarity)
unique(diamonds300$depth)
unique(diamonds300$table)
## DV is price
# ---- NOTE: FIXED EFFECTS MAIN IV - cut
# ---- NOTE: FIXED EFFECTS CONTROLLED VARIABLES - color + carat
# ---- NOTE: RANDOM EFFECTS - (1 | clarity) + (1 | depth)
# ---- NOTE: full variable model - price ~ cut + color + carat + (1 | clarity) + (1 | depth)
### frequentist model
#### full model
# ---- NOTE: creates model
(mlm_freq = lme4::glmer(
price ~ cut + color + carat + (1 | clarity) + (1 | depth),
data = diamonds300,
family = 'poisson'))
# ---- NOTE: gives model summary
summary(mlm_freq)
# ---- NOTE: summary of model, with p values for F-statistic for fixed effects
Anova(mlm_freq)
##### Gives exponentialized table of estimates
summ(mlm_freq, exp = T)
#### prints frequentist results
sink("diamonds300 frequentist.txt")
print("diamonds300 frequentist.txt")
print("
")
print("mlm_freq")
print(mlm_freq)
print("
")
print("summary(mlm_freq) ")
print(summary(mlm_freq) )
print("
")
print("Anova(mlm_freq)")
print(Anova(mlm_freq))
print("
")
print("summ(mlm_freq, exp = T)")
print(summ(mlm_freq, exp = T))
print("
")
sink()
### bayesian approach
#### creates bayes model, with proper fixed and random effects
# ---- NOTE: used default priors
mlm_bayes_proper = brms::brm(
price ~ cut + color + carat + (1 | clarity) + (1 | depth),
data = diamonds300,
family = 'poisson'
)
# ---- NOTE: gives summary of model
summary(mlm_bayes_proper)
# ---- NOTE: gives 95% credible intervals, which can be used as a significance test for levels of fixed effect when compared to intercept, because it gives odds changes (see decimal points with 1 as the +/- point and https://www.rensvandeschoot.com/tutorials/generalised-linear-models-with-brms/)
exp(fixef(mlm_bayes_proper)[,-2])
#### prints bayesian results
sink("diamonds300 bayesian.txt")
print("diamonds300 bauesian.txt")
print("
")
print("mlm_bayes_proper")
print(mlm_bayes_proper)
print("
")
print("summary(mlm_bayes_proper)")
print(summary(mlm_bayes_proper))
print("
")
print("exp(fixef(mlm_bayes_proper)[,-2])")
print(exp(fixef(mlm_bayes_proper)[,-2]))
print("
")
sink()
### dataset
# ---- NOTE: selects only the top 300 rows of the dataset
diamonds300 <- data.frame(top_n(diamonds, 300, table))
# ---- NOTE: gives dataset info
head(diamonds300)
str(diamonds300)
colnames(diamonds300)
nrow(diamonds300)
# ---- NOTE: gives unique values of Fixed and Random effects
unique(diamonds300$cut)
unique(diamonds300$color)
unique(diamonds300$carat)
unique(diamonds300$clarity)
unique(diamonds300$depth)
unique(diamonds300$table)
#### prints dataset info
sink("diamonds300 dataset.txt")
print("diamonds300 dataset info.txt")
print("# ---- NOTE: gives dataset info")
print("
")
print("head(diamonds300)")
print(head(diamonds300))
print("
")
print("str(diamonds300)")
print(str(diamonds300))
print("
")
print("colnames(diamonds300)")
print(colnames(diamonds300))
print("
")
print("nrow(diamonds300)")
print(nrow(diamonds300))
print("
")
print("# ---- NOTE: gives unique values of Fixed and Random effects")
print("unique(diamonds300$cut)")
print(unique(diamonds300$cut))
print("
")
print("unique(diamonds300$color)")
print(unique(diamonds300$color))
print("
")
print("unique(diamonds300$carat)")
print(unique(diamonds300$carat))
print("
")
print("unique(diamonds300$clarity)")
print(unique(diamonds300$clarity))
print("
")
print("unique(diamonds300$depth)")
print(unique(diamonds300$depth))
print("
")
print("unique(diamonds300$table)")
print(unique(diamonds300$table))
print("
")
sink()
```