
The CS3 lab for Computational Survey and Social Science is an interdisciplinary group of researchers from fields such as social and computer science, bringing together expertise in survey methodology, UX research, machine learning, natural language processing (NLP), and generative AI. The CS3 lab is led by Prof. Dr. Jan Karem Höhne and situated in the Research Infrastructure and Methods Department at the German Centre for Higher Education Research and Science Studies (DZHW). Together, the members of the CS3 lab continually explore new avenues for extending the methodological and analytical toolkit for empirical social research and science studies.

In our research, we use online surveys as a comprehensive tool for collecting various digital data about people’s attitudes, traits, and behaviors. This includes trace data from mobile apps, search queries, and website visits, which we use, for example, to draw conclusions about people’s media consumption and living conditions. This is accompanied by research on smartphone sensors, such as accelerometer data for inferring physical activity levels. Similarly, we introduce qualitative research impulses to quantitative data collection by gathering voice answers to open narrative questions that are recorded through the built-in microphone of smartphones. In doing so, we go beyond pure text-as-data methods by extracting tonal cues to infer affective states in situ. Finally, we study social media recruitment strategies, including bot prevention and detection measures for protecting data integrity, and explore the potential of synthesizing social data through Large Language Models.

Most recently, we started working on fusing embodied interviewing agents with online surveys. Specifically, we envision combining elements of interviewer-based and online surveys through a multimodal agent that is responsive to various input methods, such as text and voice, potentially increasing survey participation, respondent satisfaction, and data quality. The ultimate goal is to create conversational interviewing agents that autonomously conduct interviews, facilitating human-like interactions in online surveys.

Our methodological and data collection efforts are accompanied by powerful data analysis techniques. For example, we use machine learning algorithms in supervised and unsupervised settings to analyze human answer behavior. We extract features from textual and non-textual information. This includes downstream analyses of text, such as topic modeling, sentiment analysis, and entailment, using specialized embeddings, linguistic features, and transformers. We also apply pre-trained classification models, such as Support Vector Machines, with modular structures that facilitate the fusion of, for example, voice and image data.

In all research efforts, we are committed to open science. We release data collection tools, analysis code, and models through open-source repositories, such as Harvard Dataverse and GitHub. We also release data for extended replications and quality assurance, following the Horizon Europe principle: as open as possible, as closed as necessary.

In addition, the CS3 lab holds a regular CS3 meeting, inviting international expert researchers to present their current work in the field of computational survey and social science. The CS3 meeting takes place online during the summer (April to July) and winter (October to January) terms and is broadcast by DZHW. In doing so, the CS3 lab aims to provide a forum for scientific exchange and the discussion of novel research directions in empirical social research and beyond.

Upcoming and recent CS3 meetings:
Joshua Claassen – DZHW, Leibniz University Hannover (July 14, 2026, from 3:15 to 4:00 PM)
Going beyond self-reports: A computational look at processing, enriching, and analyzing digital traces

Angelo Moretti – Utrecht University (June 29, 2026, from 3:15 to 4:00 PM)
The role of new forms of data in small area estimation of social indicators: Methodological challenges and applications

Johanna Hölzl – University of Mannheim (May 19, 2026, from 3:15 to 4:00 PM)
Still problematic? Generalizability, validity, and reliability of API-based research

Jan Karem Höhne – DZHW, Leibniz University Hannover (April 22, 2026, from 3:15 to 4:00 PM)
Processing and analyzing audio data: From audio files to text-as-data methods

Melanie Revilla – University Pompeu Fabra (January 28, 2026, from 3:15 to 4:00 PM)
Beyond clicks and text: Integrating new types of data into web surveys

Zaza Zindel – DeZIM, Bielefeld University (November 25, 2025, from 3:15 to 4:00 PM)
Social media as a tool for survey recruitment: Opportunities and challenges

Georg Ahnert – University of Mannheim (October 29, 2025, from 3:15 to 4:00 PM)
Analytic flexibility in silicon samples: Generating survey responses with Large Language Models

Joshua Claassen – DZHW, Leibniz University Hannover (July 08, 2025, from 3:15 to 4:00 PM)
Web surveys under attack: Novel strategies for detecting LLM-driven bots

Johannes Breuer – GESIS & CAIS (June 18, 2025, from 3:15 to 4:00 PM)
Ethical questions in research with digital trace data

Indira Sen – University of Mannheim (May 22, 2025, from 3:15 to 4:00 PM)
At the intersection of NLP and survey methodology: Potentials, challenges, and provocations

Haomiao Jin – University of Surrey (April 15, 2025, from 3:15 to 4:00 PM)
Exploring an AI-powered survey interviewing agent for individuals who are blind or visually impaired

Anke Radinger, Ulrike Efu Nkong, & Dorothée Behr – GESIS Leibniz Institute for the Social Sciences (January 16, 2025, from 3:15 to 4:00 PM)
Comparing professional translators and social scientists when producing questionnaire translations from scratch vs. based on machine translation output

Laura Boeschoten – Utrecht University (December 12, 2024, from 3:15 to 4:00 PM)
Digital trace data collection through data donation

Andreas Jungherr, Adrian Rauchfleisch, & Alexander Wuttke – Bamberg University, National Taiwan University, & Ludwig Maximilians University (November 14, 2024, from 4:15 to 5:00 PM)
Deceptive uses of Artificial Intelligence in elections strengthen support for AI ban

Saijal Shahania, Joshua Claassen, Jan Karem Höhne, & David Broneske – German Centre for Higher Education Research and Science Studies (October 24, 2024, from 3:15 to 4:00 PM)
The bot that lived: SurveyBot’s role in automated web survey pretesting

Oriol Bosch – Oxford University (July 12, 2024, from 12:15 to 1:00 PM)
Tell me what you read, and I will tell you who you are: A novel method for measuring ideology using web browsing data

Carolina Haensch, Leah von der Heyde, & Alexander Wenz – LMU Munich and University of Mannheim (June 26, 2024, from 12:15 to 1:00 PM)
Can Large Language Models predict how people vote? Evidence from Germany

Timo Lenzner – GESIS Leibniz Institute for the Social Sciences (June 12, 2024, from 12:15 to 1:00 PM)
Integrating ChatGPT into cognitive pretesting procedures