A mixture-of-experts model consists of a set of functions, the 'experts', and a gating function that determines how to select which…
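As a minimal sketch of the structure described above (a set of expert functions plus a gating function), the following toy example combines expert outputs using softmax-normalized gate scores. All names here (`mixture_of_experts`, `top_k`, the two toy experts, and the toy gate) are hypothetical and chosen for illustration. The optional `top_k` argument is an assumption that loosely follows the sparse-routing idea: only the highest-scoring experts are evaluated.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(x, experts, gate, top_k=None):
    """Combine expert outputs, weighted by the gate.

    experts: list of callables mapping x to a float
    gate:    callable mapping x to one unnormalized score per expert
    top_k:   if given, only the k highest-scoring experts are evaluated
             (an illustrative form of sparse routing)
    """
    scores = gate(x)
    indices = list(range(len(experts)))
    if top_k is not None:
        # Keep only the k experts with the largest gate scores.
        indices = sorted(indices, key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the scores of the selected experts so the weights sum to 1.
    weights = softmax([scores[i] for i in indices])
    return sum(w * experts[i](x) for w, i in zip(weights, indices))


# Hypothetical toy setup: two experts and a gate that prefers
# expert 0 for positive inputs and expert 1 for negative inputs.
experts = [lambda x: 2.0 * x, lambda x: x + 10.0]
gate = lambda x: [x, -x]

dense = mixture_of_experts(1.0, experts, gate)            # soft mixture of both experts
sparse = mixture_of_experts(1.0, experts, gate, top_k=1)  # only the top expert is run
```

With `top_k=None` every expert contributes in proportion to its gate weight; with `top_k=1` the gate acts as a hard selector, which is where the connection to discrete choices becomes visible.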
References: Dayan, Hinton, Neal, Zemel (1994) https://www.cs.toronto.edu/~hinton/absps/helmholtz.pdf This paper is one of the first to…
References: Jacobs, Jordan, Nowlan, Hinton. Adaptive Mixtures of Local Experts (1991). Shazeer et al. Outrageously Large Neural Networks…
Closely related to [ discrete latent variable ]s and to [ reinforcement learning ] with discrete actions. If I do a thing and it goes well…