MTA SZTAKI, Lágymányosi u. 11, Rm 306 or 506
Dropout has been the most popular choice of regularization in RNN language models since Zaremba et al. (2014). However, early attempts at applying it to the recurrent connections (as opposed to the connections between layers) were unsuccessful. In this seminar, we will review three recent papers that managed to crack the nut: Gal and Ghahramani (2016), Semeniuta et al. (2016) and Moon et al. (2015). We will also discuss Zoneout, a related technique introduced by Krueger et al. (2017).