Research Methodology · Experimental Design · AI Evaluation

Daniel Mullins, PhD

I make sure your data actually says what you think it says.
PhD, Sociology, Texas A&M
Quantitative methods focus
Teaches graduate methods & statistics
Stata · R · Qualtrics · Prolific
Preregistration / OSF

Most organizations running experiments on people — A/B tests, user studies, surveys, model evaluations — are making expensive decisions on designs that don't hold up. Underpowered samples. Broken randomization. Metrics that measure the wrong thing. Results that are noise dressed as signal.

I'm a methodologist who teaches this at the graduate level and writes the code to execute it. I review your design, tell you whether the result is real, and show you how to fix what isn't. The same rigor that goes into peer-reviewed experimental research — applied to the decisions your team is making this quarter.

I also work at the intersection most teams can't staff: measurement for AI systems. Model evaluations that rely on human raters are human-subjects experiments. They need someone who understands both the statistics of valid measurement and the machine learning context. That combination is rare. It's what I do.

01. Services

Experimental & causal-inference review

A/B tests, pricing experiments, user studies. Power analysis, randomization integrity, construct validity, multiple-comparison correction. I tell you if the result is real.

AI / LLM evaluation methodology

Designing valid model evaluations, eliminating bias in human rating protocols, and knowing whether "Model A beat Model B" survives scrutiny. Measurement rigor for ML teams.

Survey & instrument design

Qualtrics builds with proper scale construction, randomization, and skip logic. Prolific sampling that gives you clean, usable data the first time.

Statistical analysis & execution

You have data and need answers. I run the models in Stata or R and write up results you can defend: mediation, moderation, multilevel, and longitudinal models.

Academic & admissions consulting

Dissertation and thesis methodology, R1 graduate-school applications, and the academic job market — research statements, teaching statements, writing samples.

02. Rates

Methodology & causal-inference consulting (industry · product · research teams): $200 / hr
AI / LLM evaluation engagements (project-scoped or monthly retainer): from $6,000 / mo
Statistical analysis & survey design (fixed-scope project): $400–$1,500
Academic methods & dissertation consulting (graduate students & researchers): $125 / hr
Graduate-school application review (per application package): $300–$450

Discovery calls are free. Project rates are quoted up front — no hourly surprises. Retainers available for ongoing work.

Tell me what you're trying to measure.