Predicting Housing Market Trends Using Twitter Data

Marlon Velthorst, Çiçek Güven

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

In this study, we attempt to predict Dutch housing market trends using text mining and machine learning, as an application of data science methods in finance. Our main goal is to predict the short-term upward or downward trend of the average house price in the Dutch market using text data collected from Twitter. Twitter is widely used and has proven to be a helpful source of data. However, Twitter, text mining (tokenization, bag-of-words, n-grams, weighted term frequencies) and machine learning (classification algorithms) have not yet been combined to predict short-term housing market trends. In this study, tweets containing predefined search words, chosen using domain knowledge, are collected, and the corresponding text is grouped by month into documents. Words and word sequences are then transformed into numerical values. These values serve as attributes to predict whether the housing market moves up or down; that is, we approach this as a binary (binomial) classification problem relating the text data of one month to the (up or down) trend of the following month. Our main results reveal a correlation between the (weighted) frequency of words and short-term housing trends; in other words, we were able to make accurate short-term trend predictions by combining multiple machine learning and text mining techniques.
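The feature-construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the particular TF-IDF formula, the rule that labels a month "up" when the following month's average price is higher, and the toy tweets and prices are all assumptions made for illustration. The sketch builds word n-grams per monthly document, weights them by term frequency and inverse document frequency, and pairs each month's features with the trend of the following month.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, max_n=2):
    """Return all word n-grams of length 1..max_n as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf(documents, max_n=2):
    """Map each document to a {term: weight} dict (one common TF-IDF variant)."""
    counts = [Counter(ngrams(tokenize(doc), max_n)) for doc in documents]
    doc_freq = Counter(term for c in counts for term in c)
    n_docs = len(documents)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({term: (freq / total) * math.log(n_docs / doc_freq[term])
                        for term, freq in c.items()})
    return vectors


def next_month_pairs(monthly_docs, monthly_prices, max_n=2):
    """Pair the features of month t with the price trend of month t+1.

    Assumption: the trend is "up" when the average price of the following
    month is higher than that of the current month, otherwise "down".
    """
    months = sorted(monthly_docs)  # ISO "YYYY-MM" keys sort chronologically
    vectors = tfidf([monthly_docs[m] for m in months], max_n)
    pairs = []
    for i in range(len(months) - 1):
        current, following = months[i], months[i + 1]
        label = "up" if monthly_prices[following] > monthly_prices[current] else "down"
        pairs.append((current, vectors[i], label))
    return pairs


# Toy data, invented for illustration only (not taken from the study).
monthly_docs = {
    "2018-01": "huizenprijzen stijgen weer",
    "2018-02": "hypotheek rente daalt",
    "2018-03": "huizenprijzen dalen licht",
}
monthly_prices = {"2018-01": 100.0, "2018-02": 102.0, "2018-03": 101.0}

pairs = next_month_pairs(monthly_docs, monthly_prices)
for month, features, label in pairs:
    print(month, label, sorted(features))
```

Each resulting pair holds a month, its weighted n-gram features, and the label for the following month; the last month has no successor and therefore yields no training example. Any standard classifier could then be trained on these pairs.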
Original language: English
Title of host publication: Proceedings of 2019 6th Swiss Conference on Data Science (SDS)
Publisher: IEEE
DOI: https://doi.org/10.1109/SDS.2019.00010
Publication status: Published - 8 Aug 2019
Externally published: Yes

Fingerprint

Learning systems
Finance

Cite this

Velthorst, M., & Güven, Ç. (2019). Predicting Housing Market Trends Using Twitter Data. In Proceedings of 2019 6th Swiss Conference on Data Science (SDS). IEEE. https://doi.org/10.1109/SDS.2019.00010