A general strategy for gaining an initial understanding of Bayesian methods was suggested here, "The connection between Bayesian statistics and generative modeling", and here, "Bayesian meta-analysis of the residual standard deviation using BUGS".
What follows is a sketch of that strategy for this question, where I think it works quite well, at least for me ;-)
# Set of values that may become known (the rating from 1:5)
Knowns<-1:5
# A probability generating model for one such known (all equal probability)
sample(1:5,size=1,prob=c(.2,.2,.2,.2,.2))
# The possible unknowns that need to be _used_ to generate one such possible known
# First an example
PossibleUnknown<-c(.2,.2,.2,.2,.2)
sample(1:5,size=1,prob=PossibleUnknown)
# Note the unknowns here are the probabilities of a rating of 1,2, ... 5 and the known is the first rating given
# Getting a probability distribution for the possible unknowns (a prior distribution)
# Not immediate because it has 5 elements in 4 dimensions (as the probabilities must sum to 1)
# Fortunately Bayesians have one we can start with
library(MCMCpack) # Just used to get the prior
(PossibleUnknown<-rdirichlet(1,rep(1,5) ))
sum(PossibleUnknown)
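# A minimal base-R sketch, in case MCMCpack is not installed: a Dirichlet draw
# can be obtained by normalising independent gamma draws. The helper name
# rdirichlet_gamma is made up for this illustration, it is not part of any package.
rdirichlet_gamma<-function(n,alpha){
  # One column of gamma draws per category, with shape alpha[k] and rate 1
  g<-matrix(rgamma(n*length(alpha),shape=rep(alpha,each=n)),nrow=n)
  # Dividing each row by its own sum puts it on the simplex
  g/rowSums(g)
}
CheckUnknowns<-rdirichlet_gamma(10000,rep(1,5))
range(rowSums(CheckUnknowns)) # every row sums to 1, up to rounding
colMeans(CheckUnknowns) # each should be close to 1/5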
# OK, now using the two-stage conceptualization of Bayes by Don Rubin (1984) it is direct and transparent
# Number of MC samples to generate
reps<-1000000
# Sample from prior of Possible UnknownS
PossibleUnknowns<-rdirichlet(reps,rep(1,5))
# Sample rating from data generating model for each Possible Unknown above
PossibleKnowns<-apply(PossibleUnknowns,1,function(x) sample(Knowns,size=1,prob=x) )
PossibleJoints<-cbind(PossibleUnknowns,PossibleKnowns)
head(PossibleJoints)
# The sample from the Posterior if 1st rating is 5
# (those Possible Unknowns that generated PossibleKnowns = 5)
ConditionalUnknowns<-PossibleUnknowns[PossibleKnowns==5,]
# Plot prior and posterior marginals (separate rating probabilities) to _see_ what's going on
# (ratings 2 to 5 only, so each set fills a 2x2 grid; prior first, then posterior)
par(mfrow=c(2,2))
for(i in 2:5) hist(PossibleUnknowns[,i],main=paste("Prior: rating of",i),xlab="PossibleUnknown\n(probabilities)")
for(i in 2:5) hist(ConditionalUnknowns[,i],main=paste("Posterior: rating of",i),xlab="PossibleUnknown\n(probabilities)")
# Calculate marginal or expected probability for each rating separately
# Prior probabilities
apply(PossibleUnknowns,2,mean)
# Posterior probabilities
apply(ConditionalUnknowns,2,mean)
# A good guess at the prior probabilities??
rep(1,5)/sum(rep(1,5))
# A good guess at the posterior probabilities??
(rep(1,5) + c(0,0,0,0,1))/sum((rep(1,5) + c(0,0,0,0,1)))
# So if the next rating is a 5
(rep(1,5) + c(0,0,0,0,2))/sum((rep(1,5) + c(0,0,0,0,2)))
# Easy to check directly, as the prior for the second rating is the posterior from the 1st rating
PossibleUnknowns<-ConditionalUnknowns
PossibleKnowns<-apply(PossibleUnknowns,1,function(x) sample(Knowns,size=1,prob=x) )
ConditionalUnknowns<-PossibleUnknowns[PossibleKnowns==5,]
# Posterior probabilities
apply(ConditionalUnknowns,2,mean)
# Not bad
(rep(1,5) + c(0,0,0,0,2))/sum((rep(1,5) + c(0,0,0,0,2)))
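# A small sketch of the rule behind these guesses: add the count of each observed
# rating to the matching prior parameter, then normalise. The function name
# GuessProbabilities is made up here, it is not from MCMCpack.
GuessProbabilities<-function(PriorParameters,Ratings){
  # tabulate() counts how often each of 1..5 appears in Ratings
  Updated<-PriorParameters+tabulate(Ratings,nbins=length(PriorParameters))
  Updated/sum(Updated)
}
GuessProbabilities(rep(1,5),integer(0)) # no ratings yet: the prior guess
GuessProbabilities(rep(1,5),5) # first rating is a 5
GuessProbabilities(rep(1,5),c(5,5)) # first two ratings are 5s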
# OK, now read the Wikipedia article on the Dirichlet distribution for the math
# Now the above was not just a warm-up exercise, as for instance if you want to
# use different priors, perhaps an empirical prior based on past ratings of
# films of the same genre - you now know how to do that!
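# For instance, a sketch with an empirical prior. The counts below are invented
# for illustration (say, 20 past ratings of films in the same genre). Treating
# them directly as Dirichlet parameters is one simple choice, not the only one.
# The Dirichlet draws use normalised gamma draws so that only base R is needed.
PastCounts<-c(2,3,5,6,4) # hypothetical counts of ratings 1..5
EmpReps<-100000
EmpGammas<-matrix(rgamma(EmpReps*5,shape=rep(PastCounts,each=EmpReps)),nrow=EmpReps)
EmpUnknowns<-EmpGammas/rowSums(EmpGammas) # sample from the empirical prior
EmpKnowns<-apply(EmpUnknowns,1,function(x) sample(1:5,size=1,prob=x) )
# Posterior sample if the 1st rating is 5, and its marginal probabilities
EmpConditional<-EmpUnknowns[EmpKnowns==5,]
apply(EmpConditional,2,mean)
# Compare with the same kind of guess as before
(PastCounts + c(0,0,0,0,1))/sum(PastCounts + c(0,0,0,0,1))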
# Now the mean rating, if you are interested in that, is the sum over ratings of (probability of rating * rating)
# Prior mean rating
sum( rep(1,5)/sum(rep(1,5)) * Knowns )
# Posterior mean rating given first rating was a 5
sum( (rep(1,5) + c(0,0,0,0,1))/sum((rep(1,5) + c(0,0,0,0,1))) * Knowns )