A review on the long short-term memory model

Greg Van Houdt*, Carlos Mosquera, Gonzalo Nápoles

*Corresponding author for this work

    Research output: Contribution to journal › Review article › peer-review

    1024 Citations (Scopus)

    Abstract

    Long short-term memory (LSTM) has transformed both machine learning and neurocomputing fields. According to several online sources, this model has improved Google's speech recognition, greatly improved machine translation on Google Translate, and the responses of Amazon's Alexa. This neural system is also employed by Facebook, which reported over 4 billion LSTM-based translations per day as of 2017. Interestingly, recurrent neural networks had shown rather modest performance until LSTM came along. One reason for the success of this recurrent network lies in its ability to handle the exploding/vanishing gradient problem, a difficult issue to circumvent when training recurrent or very deep neural networks. In this paper, we present a comprehensive review that covers LSTM's formulation and training, relevant applications reported in the literature, and code resources implementing this model for a toy example.
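
    The abstract mentions code resources for a toy example. As a minimal, illustrative sketch only (not the authors' code), the snippet below assumes TensorFlow/Keras and a synthetic sine-wave series, and trains a single-layer LSTM for next-step prediction.

    ```python
    # Illustrative sketch: an LSTM trained to predict the next value of a
    # noiseless sine wave, assuming TensorFlow/Keras is available.
    import numpy as np
    from tensorflow import keras

    # Build a toy dataset of sliding windows and the sample that follows each window.
    t = np.linspace(0, 100, 10000)
    series = np.sin(t)
    window = 20
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    X = X[..., np.newaxis]  # shape: (samples, timesteps, features)

    # A single LSTM layer followed by a dense read-out for one-step-ahead regression.
    model = keras.Sequential([
        keras.layers.LSTM(32, input_shape=(window, 1)),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=2, batch_size=64, verbose=0)

    # Predict the value following the last observed window.
    print(model.predict(X[-1:], verbose=0))
    ```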

    Original language: English
    Pages (from-to): 5929-5955
    Number of pages: 27
    Journal: Artificial Intelligence Review
    Volume: 53
    Issue number: 8
    DOIs
    Publication status: Published - Dec 2020

    Keywords

    • Deep learning
    • Long short-term memory
    • Recurrent neural networks
    • Vanishing/exploding gradient
