I have been looking at batch normalization and I have gotten confused.
I have seen this kind of non-convex vs. convex graph while learning about deep learning.
So:
1. Is this a graph of the loss function?
2. How do they plot this, given that we do not know the entire loss function?
3. Does normalizing the input change the shape of the loss function graph?
4. Is batch normalization the same thing as normalizing the input data, only applied between deep learning layers? If so, does it also change the loss function graph? (See the sketch after these questions for what I mean by the two operations.)
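To make question 4 concrete, here is a minimal NumPy sketch of what I understand the two operations to be. It is only an illustration under my own assumptions, not code from any library: the `standardize` helper, `gamma`, and `beta` are hypothetical names I made up for this example.

```python
import numpy as np

def standardize(x, eps=1e-5):
    # Per-feature standardization over the batch dimension:
    # subtract the batch mean and divide by the batch standard deviation.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

# Input normalization: applied once to the raw input features.
X = np.random.randn(32, 10) * 5.0 + 3.0   # a batch of 32 samples, 10 features
X_norm = standardize(X)

# Batch normalization (as I understand it): the same standardization,
# but applied to a hidden layer's activations, plus learnable scale (gamma)
# and shift (beta) so the network can undo the normalization if useful.
W = np.random.randn(10, 16)               # weights of one hidden layer
gamma = np.ones(16)                       # learnable scale, initialized to 1
beta = np.zeros(16)                       # learnable shift, initialized to 0

h = X_norm @ W                            # pre-activations of the hidden layer
h_bn = gamma * standardize(h) + beta      # batch-normalized activations

print(h_bn.mean(axis=0).round(3))         # roughly 0 for every feature
print(h_bn.std(axis=0).round(3))          # roughly 1 for every feature
```

Is that the right mental model, and is it why people say batch normalization "smooths" the loss surface?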