TY - JOUR
T1 - Long-term Cognitive Network-based architecture for multi-label classification
AU - Nápoles, Gonzalo
AU - Bello, Marilyn
AU - Salgueiro, Yamisleydi
N1 - Funding Information:
The authors would like to sincerely thank Isel Grau from the Vrije Universiteit Brussel, Belgium, who pointed out the advantages of using the squared hinge function instead of the mean squared error. This paper was partially supported by the Program CONICYT FONDECYT de Postdoctorado, Chile, through project 3200284.
Publisher Copyright:
© 2021 The Author(s)
PY - 2021/8
Y1 - 2021/8
N2 - This paper presents a neural system for multi-label classification problems that may involve sparse features. The architecture comprises three sequential blocks with well-defined functions. The first block is a multilayered feed-forward structure that extracts hidden features, thus reducing the problem dimensionality; it is especially useful for sparse problems. The second block is a Long-term Cognitive Network-based model that operates on the features extracted by the first block. The activation rule of this recurrent neural network is modified to prevent the input signal from vanishing during the recurrent inference process: the modified rule combines the neurons’ state in the previous abstract layer (iteration) with their initial state. Moreover, we add a bias component that shifts the transfer functions as needed to obtain good approximations. Finally, the third block is an output layer that maps the second block’s outputs to the label space. To train this network, we propose a backpropagation learning algorithm that uses a squared hinge loss function to maximize the margins between labels. The results show that our model outperforms state-of-the-art algorithms on most datasets.
KW - Long-term cognitive networks
KW - Recurrent neural networks
KW - Backpropagation
KW - Multi-label classification
U2 - 10.1016/j.neunet.2021.03.001
DO - 10.1016/j.neunet.2021.03.001
M3 - Article
SN - 0893-6080
VL - 140
SP - 39
EP - 48
JO - Neural Networks: The official journal of the International Neural Network Society, European Neural Network Society, Japanese Neural Network Society
JF - Neural Networks: The official journal of the International Neural Network Society, European Neural Network Society, Japanese Neural Network Society
ER -