The CS3 lab for Computational Survey and Social Science is an interdisciplinary group of researchers from fields such as social and computer science, bringing together expertise in survey methodology, UX research, machine learning, NLP, and generative AI. The CS3 lab is led by Prof. Dr. Jan Karem Höhne and situated in the Research Infrastructure and Methods Department at the German Centre for Higher Education Research and Science Studies (DZHW). Together, the members of the CS3 lab explore new avenues for extending the methodological and analytical toolkit for substantive social science research.
In our research, we use online surveys as a comprehensive tool for collecting various digital data about people’s attitudes, traits, and behaviors. This includes trace data from mobile apps, search queries, and website visits, which we use, for example, to draw conclusions about people’s living conditions, such as pregnancy and parenthood. This is accompanied by research on smartphone sensors, such as accelerometer data for inferring motion conditions and activity levels. Similarly, we introduce qualitative research impulses to quantitative data collection by gathering voice answers to open narrative questions, recorded through the built-in microphone of smartphones. In doing so, we go beyond pure text-as-data methods by extracting tonal cues to infer affective states in situ. Finally, we work on social media recruitment strategies and investigate the potential of synthesizing social data through LLMs.
Most recently, we started to work on fusing embodied interviewing agents with online surveys. Specifically, we envision combining elements of interviewer-based and online surveys in a multi-modal agent that is sensitive to various input methods, such as text and voice, potentially increasing survey participation, respondent satisfaction, and data quality. The ultimate goal is to create conversational interviewing agents that autonomously conduct interviews and facilitate human-like interactions in online surveys.
Our methodological and data collection efforts are accompanied by powerful data analysis techniques. For example, we use machine learning algorithms in unsupervised settings to analyze human answer behavior. We extract features from textual and non-textual information. This includes downstream analyses of text, such as topic modeling, sentiment analysis, and entailment, using specialized embeddings, linguistic features, and transformers. We also apply pre-trained classification models, such as Support Vector Machines, with modular structures that facilitate the fusion of, for example, voice and image data.
In all research efforts, we subscribe to open science. We release data collection tools, analysis code, and models through public repositories, such as Harvard Dataverse and GitHub. We also release data for extended replications and quality assurance, following the principle of the European Research Council (ERC): As open as possible, as closed as necessary.
In addition, the CS3 lab holds a regular CS3 meeting, inviting international expert researchers to present their current work in the field of computational survey and social science. The CS3 meeting takes place online during the summer (April to July) and winter (October to January) terms and is broadcast to the scientific community. In doing so, the CS3 lab aims to provide a forum for scientific exchange and the discussion of novel research directions in the social sciences and beyond.
Upcoming and recent CS3 meetings:
Anke Radinger, Ulrike Efu Nkong, & Dorothée Behr – GESIS Leibniz Institute for the Social Sciences (January 16, 2025, from 3:15 to 4:00 PM)
Comparing professional translators and social scientists when producing questionnaire translations from scratch vs. based on machine translation output
Laura Boeschoten – Utrecht University (December 12, 2024, from 3:15 to 4:00 PM)
Digital trace data collection through data donation
Andreas Jungherr, Adrian Rauchfleisch, & Alexander Wuttke – Bamberg University, National Taiwan University, & Ludwig Maximilians University (November 14, 2024, from 4:15 to 5:00 PM)
Deceptive uses of Artificial Intelligence in elections strengthen support for AI ban
Saijal Shahania, Joshua Claassen, Jan Karem Höhne, & David Broneske – German Centre for Higher Education Research and Science Studies (October 24, 2024, from 3:15 to 4:00 PM)
The bot that lived: SurveyBot’s role in automated web survey pretesting
Oriol Bosch – Oxford University (July 12, 2024, from 12:15 to 1:00 PM)
Tell me what you read, and I will tell you who you are: a novel method for measuring ideology using web browsing data
Carolina Haensch, Leah von der Heyde, & Alexander Wenz – LMU Munich and University of Mannheim (June 26, 2024, from 12:15 to 1:00 PM)
Can Large Language Models predict how people vote? Evidence from Germany
Timo Lenzner – GESIS Leibniz Institute for the Social Sciences (June 12, 2024, from 12:15 to 1:00 PM)
Integrating ChatGPT into cognitive pretesting procedures