Large scale disambiguation of scientific references in patent databases

Kangran Zhao, Emiel Caron, Stanislaw Guner

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution - Scientific - peer-review

Abstract

The PATSTAT database stores information on patent applications and publications. One of its tables stores the scientific references cited by patents, which makes it a potentially powerful resource for investigating the relation between science, technology and innovation. We aim to provide a reliable way to conduct research on such databases. To this end, we employ automated data cleaning and extract bibliographic information. Furthermore, a scoring system is used, and a clustering algorithm groups duplicates of scientific references into clusters.
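The steps named in the abstract (cleaning, extraction of bibliographic fields, scoring, clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular expressions, the scoring weights, the 0.8 threshold, and the function names `clean`, `extract`, `score` and `cluster` are all assumptions chosen for illustration, and the clustering shown is plain single-linkage via union-find.

```python
import re
from itertools import combinations


def clean(ref: str) -> str:
    """Lowercase the string and replace every non-alphanumeric run with a space."""
    return re.sub(r"[^a-z0-9]+", " ", ref.lower()).strip()


def extract(ref: str) -> dict:
    """Pull a few bibliographic fields out of a raw reference string (assumed patterns)."""
    year = re.search(r"\b(19|20)\d{2}\b", ref)
    pages = re.search(r"\b(\d+)\s*-\s*(\d+)\b", ref)
    return {
        "year": year.group(0) if year else None,
        "first_page": pages.group(1) if pages else None,
        "tokens": set(clean(ref).split()),
    }


def score(a: dict, b: dict) -> float:
    """Hypothetical rule: token Jaccard similarity plus bonuses for matching fields."""
    union = a["tokens"] | b["tokens"]
    s = len(a["tokens"] & b["tokens"]) / len(union) if union else 0.0
    if a["year"] and a["year"] == b["year"]:
        s += 0.25
    if a["first_page"] and a["first_page"] == b["first_page"]:
        s += 0.25
    return s


def cluster(refs, threshold=0.8):
    """Link every pair scoring at or above the threshold; return index clusters."""
    records = [extract(r) for r in refs]
    parent = list(range(len(refs)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(refs)), 2):
        if score(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(refs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


refs = [
    "Smith J. et al., Nature, vol. 400, pp. 123-130, 1999",
    "SMITH J ET AL: NATURE vol 400, 1999, pages 123-130",
    "Jones A., Science, vol. 12, pp. 5-9, 2005",
]
print(cluster(refs))
```

The example references are invented. Comparing all pairs is quadratic in the number of references, so at the scale of a full patent database some form of candidate pre-selection (for example, only comparing references that share a year) would presumably be needed.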
Original language: English
Title of host publication: Proceedings of 21st International Conference on Science and Technology Indicators (STI 2016)
Subtitle of host publication: Peripheries, frontiers and beyond
Editors: Ismael Rafols, Jordi Molas-Gallart, Elena Castro-Martinez, Richard Woolley
Place of Publication: València (Spain)
Publisher: Editorial Universitat Politècnica de València
Pages: 1404-1410
Number of pages: 6
ISBN (Print): 9788490485194
Publication status: Published - 14 Sep 2016
Event: International Conference on Science and Technology Indicators - Valencia, Spain
Duration: 14 Sep 2016 to 16 Sep 2016
Conference number: 21
http://www.sti2016.org

Conference

Conference: International Conference on Science and Technology Indicators
Abbreviated title: STI2016
Country: Spain
City: Valencia
Period: 14/09/16 to 16/09/16

Fingerprint

Clustering algorithms
Cleaning
Innovation

Cite this

Zhao, K., Caron, E., & Guner, S. (2016). Large scale disambiguation of scientific references in patent databases. In I. Rafols, J. Molas-Gallart, E. Castro-Martinez, & R. Woolley (Eds.), Proceedings of 21st International Conference on Science and Technology Indicators (STI 2016): Peripheries, frontiers and beyond (pp. 1404-1410). València (Spain): Editorial Universitat Politècnica de València.

