Combinatorial algorithms for the seriation problem

Matteo Seminaroti

Research output: ThesisDoctoral ThesisScientific

348 Downloads (Pure)

Abstract

In this thesis we study the seriation problem, a combinatorial problem arising in data analysis, which asks to sequence a set of objects in such a way that similar objects are ordered close to each other. We focus on the combinatorial structure and properties of Robinsonian matrices, a special class of structured matrices which best achieve the seriation goal. Our contribution is both theoretical and practical, with a particular emphasis on algorithms. In Chapter 2 we introduce basic concepts about graphs, permutations and proximity matrices used throughout the thesis. In Chapter 3 we present Robinsonian matrices, discussing their characterizations and recognition algorithms existing in the literature. In Chapter 4 we discuss Lexicographic Breadth-First search (Lex-BFS), a special graph traversal algorithm used in multisweep algorithms for the recognition of several classes of graphs. In Chapter 5 we introduce a new Lex-BFS based algorithm to recognize Robinsonian matrices, which is derived from a new characterization of Robinsonian matrices in terms of straight enumerations of unit interval graphs. In Chapter 6 we introduce the novel Similarity-First Search algorithm (SFS), a weighted version of Lex-BFS which we use in a multisweep algorithm for the recognition of Robinsonian matrices. In Chapter 7 we model the seriation problem as an instance of Quadratic Assignment Problem (QAP) and we show that if the data has a Robinsonian structure, then one can find an optimal solution for QAP using a Robinsonian recognition algorithm. In Chapter 8 we discuss how to solve the seriation problem when the data does not have a Robinsonian structure, by finding a Robinsonian approximation of the original data. Finally, in Chapter 9 we discuss some experiments which we have carried out in order to compare the performance of the algorithms introduced in the thesis.
Original languageEnglish
QualificationDoctor of Philosophy
Awarding Institution
  • Tilburg University
Supervisors/Advisors
  • Laurent, Monique, Promotor
  • Sotirov, Renata, Promotor
Award date2 Dec 2016
Place of PublicationTilburg
Publisher
Print ISBNs9789056684969
Publication statusPublished - 2016

Fingerprint

Seriation
Combinatorial Algorithms
Breadth-first Search
Quadratic Assignment Problem
Recognition Algorithm
Proximity Graphs
Permutation Graphs
Structured Matrices
Interval Graphs
Graph Algorithms
Combinatorial Problems
Straight
Enumeration
Search Algorithm
Data analysis
Optimal Solution
Unit
Approximation
Graph in graph theory

Cite this

Seminaroti, M. (2016). Combinatorial algorithms for the seriation problem. Tilburg: CentER, Center for Economic Research.
Seminaroti, Matteo. / Combinatorial algorithms for the seriation problem. Tilburg : CentER, Center for Economic Research, 2016. 219 p.
@phdthesis{06ce98d85e2f48d885475d3530533b5b,
title = "Combinatorial algorithms for the seriation problem",
abstract = "In this thesis we study the seriation problem, a combinatorial problem arising in data analysis, which asks to sequence a set of objects in such a way that similar objects are ordered close to each other. We focus on the combinatorial structure and properties of Robinsonian matrices, a special class of structured matrices which best achieve the seriation goal. Our contribution is both theoretical and practical, with a particular emphasis on algorithms. In Chapter 2 we introduce basic concepts about graphs, permutations and proximity matrices used throughout the thesis. In Chapter 3 we present Robinsonian matrices, discussing their characterizations and recognition algorithms existing in the literature. In Chapter 4 we discuss Lexicographic Breadth-First search (Lex-BFS), a special graph traversal algorithm used in multisweep algorithms for the recognition of several classes of graphs. In Chapter 5 we introduce a new Lex-BFS based algorithm to recognize Robinsonian matrices, which is derived from a new characterization of Robinsonian matrices in terms of straight enumerations of unit interval graphs. In Chapter 6 we introduce the novel Similarity-First Search algorithm (SFS), a weighted version of Lex-BFS which we use in a multisweep algorithm for the recognition of Robinsonian matrices. In Chapter 7 we model the seriation problem as an instance of Quadratic Assignment Problem (QAP) and we show that if the data has a Robinsonian structure, then one can find an optimal solution for QAP using a Robinsonian recognition algorithm. In Chapter 8 we discuss how to solve the seriation problem when the data does not have a Robinsonian structure, by finding a Robinsonian approximation of the original data. Finally, in Chapter 9 we discuss some experiments which we have carried out in order to compare the performance of the algorithms introduced in the thesis.",
author = "Matteo Seminaroti",
year = "2016",
language = "English",
isbn = "9789056684969",
series = "CentER Dissertation Series",
publisher = "CentER, Center for Economic Research",
school = "Tilburg University",

}

Seminaroti, M 2016, 'Combinatorial algorithms for the seriation problem', Doctor of Philosophy, Tilburg University, Tilburg.

Combinatorial algorithms for the seriation problem. / Seminaroti, Matteo.

Tilburg : CentER, Center for Economic Research, 2016. 219 p.

Research output: ThesisDoctoral ThesisScientific

TY - THES

T1 - Combinatorial algorithms for the seriation problem

AU - Seminaroti, Matteo

PY - 2016

Y1 - 2016

N2 - In this thesis we study the seriation problem, a combinatorial problem arising in data analysis, which asks to sequence a set of objects in such a way that similar objects are ordered close to each other. We focus on the combinatorial structure and properties of Robinsonian matrices, a special class of structured matrices which best achieve the seriation goal. Our contribution is both theoretical and practical, with a particular emphasis on algorithms. In Chapter 2 we introduce basic concepts about graphs, permutations and proximity matrices used throughout the thesis. In Chapter 3 we present Robinsonian matrices, discussing their characterizations and recognition algorithms existing in the literature. In Chapter 4 we discuss Lexicographic Breadth-First search (Lex-BFS), a special graph traversal algorithm used in multisweep algorithms for the recognition of several classes of graphs. In Chapter 5 we introduce a new Lex-BFS based algorithm to recognize Robinsonian matrices, which is derived from a new characterization of Robinsonian matrices in terms of straight enumerations of unit interval graphs. In Chapter 6 we introduce the novel Similarity-First Search algorithm (SFS), a weighted version of Lex-BFS which we use in a multisweep algorithm for the recognition of Robinsonian matrices. In Chapter 7 we model the seriation problem as an instance of Quadratic Assignment Problem (QAP) and we show that if the data has a Robinsonian structure, then one can find an optimal solution for QAP using a Robinsonian recognition algorithm. In Chapter 8 we discuss how to solve the seriation problem when the data does not have a Robinsonian structure, by finding a Robinsonian approximation of the original data. Finally, in Chapter 9 we discuss some experiments which we have carried out in order to compare the performance of the algorithms introduced in the thesis.

AB - In this thesis we study the seriation problem, a combinatorial problem arising in data analysis, which asks to sequence a set of objects in such a way that similar objects are ordered close to each other. We focus on the combinatorial structure and properties of Robinsonian matrices, a special class of structured matrices which best achieve the seriation goal. Our contribution is both theoretical and practical, with a particular emphasis on algorithms. In Chapter 2 we introduce basic concepts about graphs, permutations and proximity matrices used throughout the thesis. In Chapter 3 we present Robinsonian matrices, discussing their characterizations and recognition algorithms existing in the literature. In Chapter 4 we discuss Lexicographic Breadth-First search (Lex-BFS), a special graph traversal algorithm used in multisweep algorithms for the recognition of several classes of graphs. In Chapter 5 we introduce a new Lex-BFS based algorithm to recognize Robinsonian matrices, which is derived from a new characterization of Robinsonian matrices in terms of straight enumerations of unit interval graphs. In Chapter 6 we introduce the novel Similarity-First Search algorithm (SFS), a weighted version of Lex-BFS which we use in a multisweep algorithm for the recognition of Robinsonian matrices. In Chapter 7 we model the seriation problem as an instance of Quadratic Assignment Problem (QAP) and we show that if the data has a Robinsonian structure, then one can find an optimal solution for QAP using a Robinsonian recognition algorithm. In Chapter 8 we discuss how to solve the seriation problem when the data does not have a Robinsonian structure, by finding a Robinsonian approximation of the original data. Finally, in Chapter 9 we discuss some experiments which we have carried out in order to compare the performance of the algorithms introduced in the thesis.

M3 - Doctoral Thesis

SN - 9789056684969

T3 - CentER Dissertation Series

PB - CentER, Center for Economic Research

CY - Tilburg

ER -

Seminaroti M. Combinatorial algorithms for the seriation problem. Tilburg: CentER, Center for Economic Research, 2016. 219 p. (CentER Dissertation Series).