Matthew Wilcoxson, Research Software Engineer, talks about semantically linked data, data mining and AI.
When did you start at the Centre and what was your first role here?
I started at the Centre in August 2015 as a Research Software Engineer.
What is your background?
I studied Computer Science with Artificial Intelligence for my first degree, but for my first few jobs I worked as a Software Engineer at education-focused software houses, where I mainly built C++-based educational applications.
I started at the University of Oxford in Bodleian Digital Library Systems and Services (BDLSS), working on projects to bring some of the Bodleian's holdings online, including Islamic manuscripts and Hebrew genizot.
I also recently finished a degree in Astrophysics.
Summarise the research you are doing / your research interests in a few sentences.
Currently, as part of the Fusing Audio and Semantic Technologies (FAST) project, I'm investigating the possibility of auto-generating websites from audio-based semantically linked data – data that both follows strict naming standards and links to other datasets.
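The "strict naming and linking" idea can be sketched with a few illustrative triples. This is a minimal, hypothetical example – the specific URIs and the tiny query helper are invented for illustration, and real projects would use an RDF store rather than a Python list – but it shows how shared vocabularies (here, Dublin Core predicates) and external identifiers let independent datasets interlink:

```python
# Semantically linked data expresses facts as subject-predicate-object
# triples whose names are full URIs drawn from shared vocabularies.
# The example subject and object URIs below are hypothetical.
triples = [
    # A letter, dated using the standard Dublin Core "created" predicate.
    ("http://example.org/letter/42",
     "http://purl.org/dc/terms/created",
     "1642"),
    # Its author is named by an external-dataset URI (hypothetical id),
    # not a local label, so other datasets can link to the same person.
    ("http://example.org/letter/42",
     "http://purl.org/dc/terms/creator",
     "http://viaf.org/viaf/00000000"),
]

def objects_of(subject, predicate, data):
    """Return every object asserted for a subject/predicate pair."""
    return [o for s, p, o in data if s == subject and p == predicate]

# Because predicates are standard URIs, a generic tool (or an
# auto-generated website) can query any such dataset the same way.
creators = objects_of("http://example.org/letter/42",
                      "http://purl.org/dc/terms/creator",
                      triples)
print(creators)
```

Because every name is a resolvable URI rather than a private column heading, generic software can traverse and display such data without being written specially for each dataset – which is what makes auto-generating exploration websites plausible.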
I also develop the Cultures of Knowledge catalogue website. This contains more than 100,000 communications (also known as "letters"!) from the 16th to 18th centuries. The project has a particularly rich dataset, which is useful for practising my visualization techniques.
Why is this important (to the scientific community / the world at large)?
The ability to quickly create websites to explore semantically linked datasets should encourage more data to be created in a semantic way.
These types of datasets are more easily adopted in research and more easily explored by the general public, leading to better understanding across all disciplines.
The Cultures of Knowledge website contains a record of our own history. Its repository of fascinating letters – collected from around the world – details our past, handwritten by the people who lived through it. It is a remarkable source of historical commentary.
What would you like to do next, funding permitting?
I'm interested in encouraging the dissemination and processing of datasets by the general public – I would like to make datasets more easily accessible.
Are you involved in any wider collaborations? Why are these important?
Both my projects involve collaborations with other universities and institutions across the UK and Europe. Our colleagues around Europe, and those further afield, are essential for the e-Research Centre – and the wider University – to remain one of the best research institutions around.
What do you think the most important issues/challenges in your field will be in the next decade and how is the Centre placed to address them?
Data mining (the mass collection and analysis of large datasets) and associated techniques, such as AI's deep learning, are fast raising issues of privacy. These issues cross international borders and involve many areas of research. We can only develop solutions to them by working with international partners across multiple fields, and this is exactly what institutions like Oxford's e-Research Centre do best. In my opinion that's what makes it a great place to work. (My office colleagues are great too...)