Experience

Boston University

Postdoctoral Researcher, Bioinformatics
July 2014-present
  • Integrating large-scale biological datasets from a variety of experimental sources and formats.
  • Building mathematical models of biological processes, such as protein folding and alternative splicing, and evaluating them through comparisons with data.
  • Developed a Bayesian network model of a high-throughput experimental method for detecting protein interactions; the model estimated experimental parameters such as false-positive and false-negative rates.
  • Identified relationships between the structure of proteins and their evolutionary history, using robust statistical testing.

CERN

Graduate Research Student
2010-May 2014
  • Analyzed a petabyte-scale dataset of particle collisions, using thousands of CPUs in parallel.
  • Improved sophisticated models of particle collisions and quantified the uncertainty on their parameters through comparisons with experimental data.
  • Developed statistical software in Python to conduct hypothesis tests between different particle physics models and calculate confidence intervals on model parameters, taking into account multiple sources of measurement uncertainty.
  • Wrote high-performance C++ code that runs as part of the detector software.
  • Trained and optimized a neural network to classify particle collisions.

Education

University College London

PhD, Experimental Particle Physics
2010-May 2014

Awarded the UCL High Energy Physics Group Prize for outstanding research.

University of Warwick

Master's in Physics (MPhys), First Class Honors
2006-2010

Skills

Python and C++ development: Over 4 years of experience writing software to analyze data.

Statistics: Strong experience with advanced statistics, including Bayesian methods, hypothesis testing and machine learning. Responsible for the statistical methodology of a CERN publication.

Analysis of petabyte-scale datasets: Using large computing clusters.

Publication-standard data visualization: Produced figures published in reputable scientific journals.

Machine learning: Trained and optimized a neural network to classify different types of particle collisions.

Presentations: Presented in settings ranging from weekly meetings to 3 national conferences and 1 international conference, to audiences of up to 200 people.

Software: UNIX environments, shell scripting, R, SQL, Git, LaTeX.