LLMs Can Infer Personality from Digital Footprints
As Large Language Models (LLMs) demonstrate increasingly human-like abilities in various natural language processing (NLP) tasks that are bound to become integral to personalized technologies, understanding their capabilities and inherent biases is crucial. Our study investigates the potential of LLMs like ChatGPT to infer psychological dispositions of individuals from their digital footprints. Specifically, we assess the ability of GPT-3.5 and GPT-4 to derive the Big Five personality traits from users' Facebook status updates in a zero-shot learning scenario. Our results show an average correlation of r = .29 (range = [.22, .33]) between LLM-inferred and self-reported trait scores. Furthermore, our findings suggest biases in personality inferences with regard to gender and age: inferred scores demonstrated smaller errors for women and younger individuals on several traits, suggesting a potential systematic bias stemming from the underlying training data or differences in online self-expression.
Collaborators: Sandra Matz
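The headline figure in the abstract above is an average of per-trait correlations between LLM-inferred and self-reported scores. The following is a minimal sketch of that kind of computation, not the study's actual analysis code. The function names, the dictionary layout, and the toy scores are all hypothetical and chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def trait_correlations(inferred, self_reported):
    """Return per-trait r values and their unweighted mean.

    Both arguments map a trait name to a list of scores, one per user,
    in the same user order (an assumed, illustrative data layout).
    """
    per_trait = {t: pearson_r(inferred[t], self_reported[t]) for t in inferred}
    mean_r = sum(per_trait.values()) / len(per_trait)
    return per_trait, mean_r

# Made-up toy scores for two traits and four users (not real data).
inferred = {
    "openness": [3.0, 4.0, 2.5, 3.5],
    "extraversion": [2.0, 3.0, 4.0, 3.5],
}
self_reported = {
    "openness": [3.2, 3.9, 2.0, 4.1],
    "extraversion": [2.5, 2.0, 4.5, 3.0],
}
per_trait, mean_r = trait_correlations(inferred, self_reported)
```

The same per-user errors could then be split by gender or age group to look for the kind of systematic differences the abstract reports.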
Context-Aware Prediction of User Engagement
The success of online social platforms hinges on their ability to predict and understand user behavior at scale. Here, we present data suggesting that context-aware modeling approaches may offer a holistic yet lightweight and potentially privacy-preserving representation of user engagement on online social platforms. Leveraging deep LSTM neural networks to analyze more than 100 million Snapchat sessions from almost 80,000 users, we demonstrate that patterns of active and passive use are predictable from past behavior (R² = 0.345) and that the integration of context information substantially improves predictive performance compared to the behavioral baseline model (R² = 0.522). Features related to smartphone connectivity status, location, temporal context, and weather were found to capture non-redundant variance in user engagement relative to features derived from histories of in-app behaviors. Further, we show that a large proportion of variance can be accounted for with minimal behavioral histories if momentary context information is considered (R² = 0.44). These results indicate the potential of context-aware approaches for making models more efficient and privacy-preserving by reducing the need for long data histories. Finally, we employ model explainability techniques to glean preliminary insights into the underlying behavioral mechanisms. Our findings are consistent with the notion of context-contingent, habit-driven patterns of active and passive use, underscoring the value of contextualized representations of user behavior for predicting user engagement on social platforms.
Collaborators: Yozen Liu, Francesco Barbieri, Raiyan Abdul Baten, Sandra Matz, Maarten Bos
Generalizable Error Modeling
Human data annotation is critical in shaping the quality of machine learning (ML) and artificial intelligence (AI) systems. One significant challenge in this context is posed by annotation errors, as their effects can degrade the performance of ML models. This paper presents a predictive error model trained to detect potential errors in search relevance annotation tasks for three industry-scale ML applications (music streaming, video streaming, and mobile apps) and assesses its potential to enhance the quality and efficiency of the data annotation process. Drawing on real-world data from an extensive search relevance annotation program, we illustrate that errors can be predicted with moderate model performance (AUC = 0.65-0.75) and that model performance generalizes well across applications (i.e., a global, task-agnostic model performs on par with task-specific models). We present model explainability analyses to identify which types of features are the main drivers of predictive performance. Additionally, we demonstrate the usefulness of the model in the context of auditing, where prioritizing tasks with high predicted error probabilities considerably increases the number of corrected annotation errors (e.g., 40% efficiency gains for the music streaming application). These results underscore that automated error detection models can yield considerable improvements in the efficiency and quality of data annotation processes. Thus, our findings reveal critical insights into effective error management in the data annotation process, thereby contributing to the broader field of human-in-the-loop ML.
Collaborators: James Rae, Alireza Hashemi
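The auditing use case described above, where tasks with high predicted error probabilities are reviewed first, can be illustrated with a small sketch. It is not the paper's implementation. The function names, the fixed audit budget, and the toy probabilities and error flags are hypothetical. It compares the errors caught by a prioritized audit against the expected yield of a uniformly random audit of the same size.

```python
def errors_found_prioritized(error_probs, is_error, budget):
    """Count true errors among the `budget` tasks with the highest predicted error probability."""
    ranked = sorted(range(len(error_probs)), key=lambda i: error_probs[i], reverse=True)
    return sum(is_error[i] for i in ranked[:budget])

def expected_errors_random(is_error, budget):
    """Expected number of true errors when `budget` tasks are audited uniformly at random."""
    return budget * sum(is_error) / len(is_error)

# Made-up predicted error probabilities and ground-truth error flags (1 = erroneous annotation).
probs = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.05, 0.6]
flags = [1, 0, 1, 0, 0, 0, 0, 1]

found = errors_found_prioritized(probs, flags, budget=3)
baseline = expected_errors_random(flags, budget=3)
```

The ratio of `found` to `baseline` gives one simple way to express an efficiency gain for a given audit budget. The metric used in the paper may be defined differently.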
Model Share AI
Machine learning (ML) has the potential to revolutionize a wide range of research areas and industries, but many ML projects never progress past the proof-of-concept stage. To address this issue, we introduce Model Share AI (AIMS), an easy-to-use MLOps platform designed to streamline collaborative model development, model provenance tracking, and model deployment, as well as a host of other functions aiming to maximize the real-world impact of ML research. AIMS features collaborative project spaces and a standardized model evaluation process that ranks model submissions based on their performance on unseen evaluation data, enabling collaborative model development and crowd-sourcing. Model performance and various model metadata are automatically captured to facilitate provenance tracking and allow users to learn from and build on previous submissions. Additionally, AIMS allows users to deploy ML models built in Scikit-Learn, TensorFlow Keras, PyTorch, and ONNX into live REST APIs and automatically generated web apps with minimal code. The ability to deploy models with minimal effort and to make them accessible to non-technical end-users through web apps has the potential to make ML research more applicable to real-world challenges.
Collaborators: Michael Parrott
Message Response Behaviors in Context
Instant messaging plays a significant role in people's social and professional lives, but little is known about the factors shaping message response behaviors. In the present study, we investigate the determinants of message response behaviors from a predictive-explanatory perspective. Using a large and diverse sample of Snapchat users, we first show that message response times are highly predictable (AUC = 0.97). Second, we employ ablation techniques to examine the contributions of several important groups of predictors: message attributes (such as message length, time sent, and location of sender, but not message content), user attributes, network communication patterns, and dyad-level communication patterns, as well as spatial and temporal context. The results indicate that dyad-specific communication patterns in conjunction with spatial and temporal context account for the largest share of explained target variance. Our findings are consistent with the idea of dyad-specific, context-contingent messaging habits, and a state-based view of message response behaviors. Our work has implications for the development of new systems and UX design. For example, our models could facilitate context-aware delivery timing, message rankings, and availability status displays.
Collaborators: Ron Dotsch, Yozen Liu, Sandra Matz, Maarten Bos
Organizational Fit and Job Tenure
Job tenure is an important organizational outcome. It is usually in the interest of both employees and employers that individuals stay in the organization for an extended time in order to be productive and contribute in meaningful ways. A potentially important factor influencing job tenure is organizational fit, the alignment of employees’ values and beliefs with those of the organization they work for. We analyze the effects of organizational fit on job tenure using a very large corpus of user data scraped from LinkedIn. The project will help to better understand how organizational fit affects job tenure and which variables moderate this effect. For example: Is organizational fit particularly important in certain companies or industries, and which employees benefit the most or least from organizational fit?
Collaborators: Daniel Stein, Seung-Jae Bang, Huang Ke-Wei, Sandra Matz
Predicting the Spread of COVID-19
Social behaviors and compliance behaviors play a critical role in the transmission of COVID-19. Consequently, regional variation in personality traits that capture individual differences in these behaviors may offer new insight into the spread of COVID-19. We combine self-reported personality data (3.5M people), COVID-19 prevalence and death rates, and behavioral mobility observations (29M people) to show that regional personality differences in the US and Germany predict COVID-19-related outcomes and behaviors incremental to a conservative set of socio-demographic, economic, and pandemic-related control variables. Earlier onsets of COVID-19 and steeper initial growth rates were related to higher levels of Openness and lower levels of Neuroticism. We also show that (i) regional personality is associated with objective indicators of social distancing, (ii) the effects of regional personality can change over time (Openness), and that (iii) the effects of regional personality do not always converge with those observed at the individual level (Agreeableness and Conscientiousness).
Collaborators: Friedrich Götz, Tobias Ebert, Sandrine Müller, Jason Rentfrow, Sam Gosling, Martin Obschonka, Jeff Potter, Sandra Matz
Investigating the Relationships between Mobility and Well-Being
People interact with their physical environments every day by visiting different places and moving between them. Such mobility behaviours likely influence and are influenced by people's subjective well‐being. However, past research examining the links between mobility behaviours and well‐being has been inconclusive. Here, we provide a comprehensive investigation of these relationships by examining individual differences in two types of mobility behaviours (movement patterns and places visited) and their relationship to six indicators of subjective well‐being (depression, loneliness, anxiety, stress, affect, and energy) at two different temporal levels of analysis (two‐week tendencies and daily level). Using data from a large smartphone‐based longitudinal study (N = 1765), we show that (i) movement patterns assessed via GPS data (distance travelled, entropy, and irregularity) and (ii) places visited assessed via experience sampling reports (home, work, and social places) are associated with subjective well‐being at the between and within person levels. Our findings suggest that distance travelled is related to anxiety, affect, and stress, irregularity is related to depression and loneliness, and spending time in social places is negatively associated with loneliness. We discuss the implications of our work and highlight directions for future research on the generalizability to other populations as well as the characteristics of places.
Collaborators: Sandrine Müller, Weichen Wang, Sandra Matz, Gabriella Harari
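The abstract above lists entropy among the GPS-derived movement patterns without defining it. One common operationalization, assumed here purely for illustration, is the Shannon entropy of how a person's observations are distributed across distinct places. The sketch below uses a hypothetical function name and made-up place labels, and it may differ from the feature definition used in the study.

```python
import math
from collections import Counter

def location_entropy(visits):
    """Shannon entropy (in nats) of the distribution of observations across places.

    `visits` is a sequence of place labels, one per observation (assumed layout).
    Low values mean time is concentrated in few places; higher values mean
    it is spread more evenly across many places.
    """
    counts = Counter(visits)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Made-up observation sequences for two hypothetical participants.
homebody = ["home", "home", "home", "home", "work", "home"]
explorer = ["home", "work", "cafe", "gym", "park", "work"]

entropy_homebody = location_entropy(homebody)
entropy_explorer = location_entropy(explorer)
```

In this toy example the second sequence yields the higher entropy because its observations are spread over more places.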
Using Sensory Substitution to Improve Decision Making
Our brains are able to juggle vast amounts of information unconsciously, but we often fail to make optimal decisions when faced with conscious decisions that require us, for example, to memorize and manipulate numbers or estimate probabilities. Here, we explore whether we can overcome some of these cognitive limitations in everyday decision making by enabling people to feel data in a more holistic way and rely on intuitive rather than deliberative cognitive processes. Specifically, we propose that sensory substitution - a concept used in neuroscience to describe the process of encoding a particular type of sensory information in a way that makes this information accessible to a different sensory modality - can improve the quality of decision making. To test this idea, we are (1) developing a tactile interface that enables us to convey complex information through vibrations, and (2) running a series of rigorous experiments to demonstrate that sensory substitution can not only expand our sensory horizon, but also improve decision making in the lab and in the real world.
Collaborators: Moran Cerf, Sandra Matz
Measuring Spatial Reasoning Ability with Minecraft
Video games are a promising tool for the psychometric assessment of cognitive abilities. They can present novel task types and answer formats, they can record process data, and they can be highly motivating for test takers. This paper introduces the first game-based intelligence assessment implemented in Minecraft, an exceptionally popular video game with 176 million copies sold. We found that abilities measured with Minecraft and conventional tests were highly correlated at the latent level (r = .72). Furthermore, we found that behavioral log data collected from the game environment were highly predictive of performance in the Minecraft test and, to a lesser extent, also predicted scores in conventional tests. We identify a number of behavioral features associated with spatial reasoning ability, demonstrating the utility of analyzing granular behavioral data in addition to traditional response formats. Overall, our findings indicate that Minecraft is a suitable platform for game-based intelligence assessment and encourage future work aiming to explore game-based problem-solving tasks that would not be feasible on paper or in conventional computer-based tests.
Collaborators: Andrew Kyngdon, David Stillwell
The Big Data Toolkit for Psychologists
I have summed up my approach to data analysis for a chapter of the APA handbook, The Psychology of Technology. It serves as a practical introduction for psychologists who want to use large data sets and data sets from nontraditional data sources in their research. First, the chapter discusses the concept of Big Data and reviews some of the theoretical challenges and opportunities that arise with the availability of ever larger amounts of data. Second, it discusses practical implications and best practices with respect to data collection, data storage, data processing, and data modeling for psychological research in the age of Big Data.
Collaborators: Sam Gosling, Zachariah Marrero