Automatic identification of writers’ intentions: Comparing different methods for predicting relationship goals in online dating profile texts

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review

Abstract

Psychologically motivated, lexicon-based text analysis methods such as LIWC (Pennebaker et al., 2015) have been criticized by computational linguists for their lack of adaptability, but they have not often been systematically compared with either human evaluations or machine learning approaches. The goal of the current study was to assess the effectiveness and predictive ability of LIWC on a relationship goal classification task. In this paper, we compared the outcomes of (1) LIWC, (2) machine learning, and (3) a human baseline. A newly collected corpus of online dating profile texts (a genre not explored before in the ACL anthology) was used, accompanied by the profile writers’ self-selected relationship goal (long-term versus date). These three approaches were tested by comparing their performance on identifying both the intended relationship goal and content-related text labels. Results show that LIWC and machine learning models correlate with human evaluations in terms of content-related labels. LIWC’s content-related labels corresponded more strongly to humans than those of the classifier. Moreover, all approaches were similarly accurate in predicting the relationship goal.
Original language: English
Title of host publication: Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)
Publisher: Association for Computational Linguistics
Pages: 94–100
Number of pages: 7
Publication status: Published - Nov 2019
Event: The 5th Workshop on Noisy User-generated Text (W-NUT @ EMNLP) - Hong Kong, Hong Kong
Duration: 4 Nov 2019 → …

Workshop

Workshop: The 5th Workshop on Noisy User-generated Text (W-NUT @ EMNLP)
Country: Hong Kong
City: Hong Kong
Period: 4/11/19 → …

Fingerprint

Learning systems
Labels
Classifiers

Cite this

van der Lee, C., van der Zanden, T., Krahmer, E., Mos, M., & Schouten, A. (2019). Automatic identification of writers’ intentions: Comparing different methods for predicting relationship goals in online dating profile texts. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019) (pp. 94–100). [D19-5512] Association for Computational Linguistics.
@inproceedings{3fb58c5391134e409122479f80f1363d,
title = "Automatic identification of writers’ intentions: Comparing different methods for predicting relationship goals in online dating profile texts",
abstract = "Psychologically motivated, lexicon-based text analysis methods such as LIWC (Pennebaker et al., 2015) have been criticized by computational linguists for their lack of adaptability, but they have not often been systematically compared with either human evaluations or machine learning approaches. The goal of the current study was to assess the effectiveness and predictive ability of LIWC on a relationship goal classification task. In this paper, we compared the outcomes of (1) LIWC, (2) machine learning, and (3) a human baseline. A newly collected corpus of online dating profile texts (a genre not explored before in the ACL anthology) was used, accompanied by the profile writers’ self-selected relationship goal (long-term versus date). These three approaches were tested by comparing their performance on identifying both the intended relationship goal and content-related text labels. Results show that LIWC and machine learning models correlate with human evaluations in terms of content-related labels. LIWC’s content-related labels corresponded more strongly to humans than those of the classifier. Moreover, all approaches were similarly accurate in predicting the relationship goal.",
author = "{van der Lee}, Chris and {van der Zanden}, Tess and Emiel Krahmer and Maria Mos and Alexander Schouten",
year = "2019",
month = "11",
language = "English",
pages = "94--100",
booktitle = "Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)",
publisher = "Association for Computational Linguistics",
}

van der Lee, C, van der Zanden, T, Krahmer, E, Mos, M & Schouten, A 2019, Automatic identification of writers’ intentions: Comparing different methods for predicting relationship goals in online dating profile texts. in Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), D19-5512, Association for Computational Linguistics, pp. 94–100, The 5th Workshop on Noisy User-generated Text (W-NUT @ EMNLP), Hong Kong, Hong Kong, 4/11/19.

Automatic identification of writers’ intentions: Comparing different methods for predicting relationship goals in online dating profile texts. / van der Lee, Chris; van der Zanden, Tess; Krahmer, Emiel; Mos, Maria; Schouten, Alexander.

Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). Association for Computational Linguistics, 2019. p. 94–100 D19-5512.

TY - GEN

T1 - Automatic identification of writers’ intentions

T2 - Comparing different methods for predicting relationship goals in online dating profile texts

AU - van der Lee, Chris

AU - van der Zanden, Tess

AU - Krahmer, Emiel

AU - Mos, Maria

AU - Schouten, Alexander

PY - 2019/11

Y1 - 2019/11

AB - Psychologically motivated, lexicon-based text analysis methods such as LIWC (Pennebaker et al., 2015) have been criticized by computational linguists for their lack of adaptability, but they have not often been systematically compared with either human evaluations or machine learning approaches. The goal of the current study was to assess the effectiveness and predictive ability of LIWC on a relationship goal classification task. In this paper, we compared the outcomes of (1) LIWC, (2) machine learning, and (3) a human baseline. A newly collected corpus of online dating profile texts (a genre not explored before in the ACL anthology) was used, accompanied by the profile writers’ self-selected relationship goal (long-term versus date). These three approaches were tested by comparing their performance on identifying both the intended relationship goal and content-related text labels. Results show that LIWC and machine learning models correlate with human evaluations in terms of content-related labels. LIWC’s content-related labels corresponded more strongly to humans than those of the classifier. Moreover, all approaches were similarly accurate in predicting the relationship goal.

M3 - Conference contribution

SP - 94

EP - 100

BT - Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

PB - Association for Computational Linguistics

ER -

van der Lee C, van der Zanden T, Krahmer E, Mos M, Schouten A. Automatic identification of writers’ intentions: Comparing different methods for predicting relationship goals in online dating profile texts. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). Association for Computational Linguistics. 2019. p. 94–100. D19-5512