Fields of Gold: Web Scraping and APIs for Impactful Marketing Insights

Johannes Boegershausen, Hannes Datta, Abhishek Borah, Andrew Stephen

Research output: Working paper / Other research output



Marketing scholars increasingly use web scraping and Application Programming Interfaces (APIs) to collect data from the internet. Yet despite the widespread adoption of these methods across methodological traditions and substantive topics, reflection on the challenges of collecting such data is lacking. How can researchers ensure that the datasets generated via web scraping and APIs are valid? Existing resources focus narrowly on the technical details of extracting web data and do not cover the broad range of validity concerns arising from researchers’ design decisions during extraction. This article proposes a novel methodological framework that outlines how to maximize validity when selecting, designing, and collecting web data. Importantly, the framework highlights how addressing validity concerns requires the joint consideration of idiosyncratic technical and legal challenges. The authors also demonstrate the impact of web-data-based marketing research, document how web data is collected and from which sources, and offer a taxonomy of how web data has advanced marketing thought. The article closes with novel research directions to identify, explore, and exploit new fields of gold filled with web data.
Original language: English
Place of publication: Tilburg
Publication status: Published - Feb 2022


  • web scraping
  • application programming interfaces (APIs)
  • field data
  • research credibility
  • reproducibility
  • replicability
  • generalizability
  • ecological value
  • research methods
  • workflow
  • open science
  • open data

