Dr Neil Caithness
PhD

Senior Researcher in Data Science and Machine Learning

neil.caithness@oerc.ox.ac.uk
+44 (0)1865 610608

 

Research & Development

I have been the lead data scientist on many research and development projects, including the following selected projects (most recent first):

  • DIET – In collaboration with British Gas (energy provider), EDMI (smart meter manufacturer) and MDS (data service provider) and funded by Innovate UK, I was the lead data scientist on a project that aimed to detect energy theft from smart meter data using advanced machine learning techniques. The proprietary method for anomaly detection that I developed in this project has been filed through Oxford University Innovation as British patent application 1713896.7. (Tools employed: Matlab, R, RStudio, Git, Bitbucket, MongoDB, Mongo Atlas)
  • Strategic Blue KTP – I was the mentor for the KTP associate in data science at Strategic Blue, teaching data science techniques in R and RStudio for deployment on shinyapps.io.
  • CLOUDWATCH – Funded by the European Commission. For this project I performed a detailed analysis of the relationships among the EC's portfolio of funded cloud computing enterprises, using advanced unsupervised machine learning methods and the National Institute of Standards and Technology's defining characteristics of cloud computing. The results of the analysis were published in the Journal of Cloud Computing. (Tools employed: Matlab)
  • INFORM – In collaboration with the Global Canopy Program and the European Forest Institute I developed a program to trace the impact of commercial supply chains on Amazonian deforestation. A paper detailing the methods has been submitted to PLOS ONE. (Tools employed: Matlab and many diverse data discovery and manipulation tools.)
  • LEFT – In collaboration with University of Oxford's Zoology Department I implemented the Local Ecological Footprint Tool used to assess the impact of mining explorations and prospecting. This work was sponsored by Statoil and resulted in several publications. This was the prototype development for an advanced global ecological impact assessment tool that is still used by environmental impact assessment professionals.
  • VIBRANT – In collaboration with the Natural History Museum, London, I led the data science and cloud computing initiatives to build a 'virtual laboratory' for taxonomic and biodiversity researchers. I was the principal developer and the research and development team manager.

Computing

I am a data scientist: I research problems and implement solutions, usually in R or Matlab. I have special expertise in unsupervised, high-dimensional machine learning (discovery) and in supervised learning and validation (classification, regression), as well as in standard and traditional multivariate statistics and visualization techniques. I have developed novel techniques for anomaly detection and systems failure analysis.

I have been involved in software development ever since becoming a research scientist and have experience and expertise in many different languages and platforms. A notable selection includes the following: Pascal, FORTRAN, ADA, C, C++, SPSS, SAS, Matlab, Java, SQL, NoSQL, PostgreSQL, MongoDB and, as my current preference, R, RStudio, Shiny and Git.

Publications

See https://scholar.google.com/citations?hl=en&user=T0b1MsoAAAAJ for a complete list of publications.
(Journal articles 20, Citations 775, h-index 12, i10-index 17, updated February 2018)

Patents

Detection of Anomalous Systems, filed on 30 August 2017 by Oxford University Innovation as British patent application 1713896.7. 
