TY - JOUR
T1 - A review on the long short-term memory model
AU - Van Houdt, Greg
AU - Mosquera, Carlos
AU - Nápoles, Gonzalo
N1 - Funding Information:
We thank the reviewers for their very thoughtful and thorough reviews of our manuscript. Their input has been invaluable in increasing the quality of our paper. Also, a special thanks to Prof. Jürgen Schmidhuber for taking the time to share his thoughts on the manuscript with us and making suggestions for further improvements.
Publisher Copyright:
© 2020, Springer Nature B.V.
PY - 2020/12
Y1 - 2020/12
AB - Long short-term memory (LSTM) has transformed both machine learning and neurocomputing fields. According to several online sources, this model has improved Google's speech recognition, greatly improved machine translation on Google Translate, and enhanced the answers given by Amazon's Alexa. This neural system is also employed by Facebook, which performed over 4 billion LSTM-based translations per day as of 2017. Interestingly, recurrent neural networks had shown rather modest performance until LSTM was introduced. One reason for the success of this recurrent network lies in its ability to handle the exploding/vanishing gradient problem, which remains a difficult issue to circumvent when training recurrent or very deep neural networks. In this paper, we present a comprehensive review that covers LSTM's formulation and training, relevant applications reported in the literature, and code resources implementing this model for a toy example.
KW - Deep learning
KW - Long short-term memory
KW - Recurrent neural networks
KW - Vanishing/exploding gradient
UR - http://www.scopus.com/inward/record.url?scp=85084971994&partnerID=8YFLogxK
U2 - 10.1007/s10462-020-09838-1
DO - 10.1007/s10462-020-09838-1
M3 - Review article
SN - 0269-2821
VL - 53
SP - 5929
EP - 5955
JO - Artificial Intelligence Review
JF - Artificial Intelligence Review
IS - 8
ER -