Thom Volker

Statistician • Data Scientist • Sociologist

PhD Candidate in Methods and Statistics

Utrecht University

Biography

I am a PhD candidate at the Methods and Statistics department of Utrecht University, researching different techniques for creating privacy-preserving synthetic data sets, under the supervision of Dr. Erik-Jan van Kesteren, Dr. Peter-Paul de Wolf and Prof. Dr. Stef van Buuren. I aim to work at the intersection of social-scientific research and cutting-edge statistical techniques, to get the most out of expensively collected research data.

In the past, I worked on several evidence synthesis projects, aiming to aggregate evidence over heterogeneous studies that do not allow for meta-analysis. Together with Irene Klugkist, I outlined and evaluated the methodology, and with Vincent Buskens and Werner Raub I applied it to a set of heterogeneous sociological studies. Additionally, I worked on several projects on a broad range of topics (multiple imputation of missing data, unsupervised text analysis and hypothesis evaluation using information criteria).

Besides research, I teach graduate and post-graduate level courses in data science techniques and multiple imputation of missing data.

Interests

  • Causal Inference
  • Bayesian Statistics
  • Multiple Imputation
  • Evidence Synthesis

Education

  • MSc in Methods and Statistics for the Behavioural, Biomedical and Social Sciences, 2022

    Utrecht University

  • MSc in Sociology and Social Research, 2022

    Utrecht University

  • BA in Liberal Arts & Sciences, 2019

    Utrecht University

Teaching statistics

Over the years, I have taught courses on topics ranging from structural equation modeling and missing data methods to social network analysis.

Research

My research concentrates on developing and advancing methodology to create and evaluate privacy-preserving synthetic data, with the aim of overcoming the disclosure risks related to disseminating research data.

Consultation

Over the past years, I have consulted on multiple projects with applied researchers (ranging from sociologists and educational scientists to medical scientists). Drop me a line if you are interested.

Experience

PhD Candidate

Utrecht University

Jul 2022 – Present
My PhD project focuses on the creation of privacy-preserving synthetic data, and aims to advance the methodology for generating artificial data sets that serve as non-disclosive alternatives to collected research data.

Student-Assistant

Utrecht University

Apr 2018 – Jun 2022 Utrecht
As a student-assistant, I contributed to teaching and course development, research and the organization of the Methodology and Statistics Utrecht Summer School courses. I developed materials for and taught in multiple Bachelor’s, Master’s and post-graduate level courses on a wide range of topics, such as data science techniques, missing data, structural equation modeling, social network analysis and standard statistical techniques. I also co-supervised Bachelor’s and Master’s theses. Additionally, I assisted in research related to multiple imputation, model selection, hypothesis evaluation and automated text analysis.

Education

Methodology and Statistics of the Behavioural, Biomedical and Social Sciences

Research Master’s programme

  • Thesis: Combining support for hypotheses over heterogeneous studies with Bayesian Evidence Synthesis: A simulation study
  • Cum laude (GPA 8.9/10)

Sociology and Social Research

Research Master’s programme

  • Thesis: The future is made today: Concerns for reputation foster trust and cooperation
  • Cum laude (GPA 8.7/10)

Liberal Arts & Sciences

Bachelor’s programme with a major in Pedagogical Sciences and a minor in Sociology & Social Research.

Projects

Bayesian Evidence Synthesis

Bayesian Evidence Synthesis is a method that uses Bayes factors to integrate the results of multiple studies with varying, seemingly incompatible designs, to enhance the aggregation of scientific evidence.

Multiple Imputation of Synthetic Data

Synthetic data allows research data to be shared openly without disclosing identifying information about the participants, and could be as informative as the actually observed data.

Contact