TY - GEN
T1 - CRATOR a CRAwler for TOR
T2 - 29th European Symposium on Research in Computer Security, ESORICS 2024
AU - De Pascale, Daniel
AU - Cascavilla, Giuseppe
AU - Tamburri, Damian A.
AU - Van Den Heuvel, Willem Jan
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - Dark web crawling is a complex process that requires specific methodologies and techniques to navigate the Tor network and extract data from hidden services. This study proposes a dark web crawler designed to extract pages efficiently while handling security protocols such as CAPTCHAs. Our approach combines seed URL lists, link analysis, and scanning to discover new content. We also incorporate user-agent rotation and proxy usage to maintain anonymity and avoid detection. We evaluate the crawler's effectiveness using metrics such as coverage, performance, and robustness. Our results demonstrate that the crawler effectively extracts pages and handles security protocols while preserving anonymity and avoiding detection. The proposed crawler can be used for several applications, including threat intelligence, cybersecurity, and online investigations.
KW - crawler
KW - Dark Web
KW - Law Enforcement Agency
KW - Open Source Intelligence
KW - TOR
UR - http://www.scopus.com/inward/record.url?scp=85204537472&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-70890-9_8
DO - 10.1007/978-3-031-70890-9_8
M3 - Conference contribution
AN - SCOPUS:85204537472
SN - 9783031708893
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 144
EP - 161
BT - Computer Security – ESORICS 2024
A2 - Garcia-Alfaro, Joaquin
A2 - Kozik, Rafał
A2 - Choraś, Michał
A2 - Katsikas, Sokratis
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 16 September 2024 through 20 September 2024
ER -