GPU Centre Of Excellence Award

Publications

Mike Giles / Istvan Reguly / Gihan Mudalige

M.B. Giles. 'Approximation of the inverse Poisson cumulative distribution function'. To appear in  ACM TOMS , 2015. ( online )

I.Z. Reguly, G.R. Mudalige, C. Bertolli, M.B. Giles, A. Betts, P.H.J. Kelly, and D. Radford. 'Acceleration of a full-scale industrial CFD application with OP2'. IEEE Transactions on Parallel and Distributed Systems, 2015. ( online )

I.Z. Reguly, E. Laszlo, G.R. Mudalige, M.B. Giles. 'Vectorizing unstructured mesh computations for many-core architectures'. To appear in Concurrency and Computation: Practice and Experience, 2015. ( online )

I. Reguly, M.B. Giles. 'Finite element algorithms and data structures on graphical processing units'.  International Journal of Parallel Programming , 43(2):203-239, 2015. ( online )

M.B. Giles, E. Laszlo, I. Reguly, J. Appleyard, J. Demouth, "GPU implementation of finite difference solvers", Proceedings of the Seventh Workshop on High Performance Computational Finance (WHPCF'14). Held in conjunction with IEEE/ACM Supercomputing 2014 (SC'14). ( online )

G.R. Mudalige, I.Z. Reguly, M.B. Giles, A.C. Mallinson, W.P. Gaudin, J.A. Herdman, "Performance Analysis of a High-level Abstractions-based Hydrocode on Future Computing Systems". To appear in Proceedings of the 5th international workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS '14). Held in conjunction with IEEE/ACM Supercomputing 2014 (SC'14). ( online )

I.Z. Reguly, G.R. Mudalige, M.B. Giles, D. Curran and S. McIntosh-Smith, "The OPS Domain Specific Abstraction for Multi-Block Structured Grid Computations". To appear in Proceedings of the 4th international workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC '14). Held in conjunction with IEEE/ACM Supercomputing 2014 (SC'14). ( online )

I. Z. Reguly, E. László, G. R. Mudalige, and M. B. Giles. “Vectorizing Unstructured Mesh Computations for Many-core Architectures”. In: Proceedings of the 2014 International Workshop on Programming Models and Applications for Multicores and Manycores. PMAM ’14. Orlando, Florida, USA: ACM, 2014. doi: 10.1145/2560683.2560686. ( online slides )

I. Z. Reguly, G. R. Mudalige, C. Bertolli, M. B. Giles, A. Betts, P. H. J. Kelly, and D. Radford. “Acceleration of a Full-scale Industrial CFD Application with OP2”, Poster at GTC ( online ) and talk at the UK Many-Core Conference 2013 ( slides )  

E. Laszlo, M. B. Giles, "Efficient Solution of Multiple Scalar and Block-Tridiagonal Equations", GTC 2014, GPU Technology Conference, San Jose, CA, 26th March 2014 ( slides )

I. Reguly, M. Giles, "Finite Element Algorithms and Data Structures on Graphical Processing Units", International Journal of Parallel Programming, Springer, 2013. ( online )

G.R. Mudalige, M.B. Giles, J. Thiyagalingam, I. Reguly, C. Bertolli, P.H.J. Kelly and A.E. Trefethen,  Design and Initial Performance of a High-level Unstructured Mesh Framework on Heterogeneous Parallel Systems.  Parallel Comput.   (2013) ( online )

C. Bertolli, A. Betts, N. Loriant, G.R. Mudalige, D. Radford, D.A. Ham, M.B. Giles, and P.H.J. Kelly. 'Compiler optimizations for industrial unstructured mesh CFD applications on GPUs', Languages and Compilers for Parallel Computing, pp.112-126, Springer, 2013. ( PDF )  

M.B. Giles, G.R. Mudalige, B. Spencer, C. Bertolli, I. Reguly,  Designing OP2 for GPU architectures , Journal of Parallel and Distributed Computing, Volume 73, Issue 11, November 2013, Pages 1451-1460, ISSN 0743-7315. ( PDF )

G.R. Mudalige, I. Reguly, M.B. Giles, C. Bertolli and P.H.J. Kelly.  OP2: An Active Library Framework for Solving Unstructured Mesh-based Applications on Multi-Core and Many-Core Architectures.  In Proceedings of Innovative Parallel Computing (InPar), 2012,  pp.1-12, 13-14 May 2012. ( PDF )

G.R. Mudalige, M.B. Giles, C. Bertolli, and P.H.J. Kelly.  Predictive Modeling and Analysis of OP2 on Distributed Memory GPU Clusters . SIGMETRICS Perform. Eval. Rev. 40, 2 :61-67 (2012). ( PDF )

G.R. Mudalige, M.B. Giles, C. Bertolli, and P.H.J. Kelly. 2011.  Predictive Modeling and analysis of OP2 on distributed memory GPU clusters . In Proceedings of the second international workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS '11). ACM, New York, NY, USA, 3-4. Held in conjunction with IEEE/ACM Supercomputing 2011 (SC'11), Seattle, WA, USA
 
M.B. Giles,   G.R. Mudalige, Z. Sharif, G. Markall, P.H.J. Kelly.  Performance Analysis and Optimization of the OP2 Framework on Many-core Architectures  (2012) Computer Journal, 55 (2), pp. 168-180.  ISSN 0010-4620 ( PDF ).
 
I. Reguly, M.B. Giles. 'Efficient sparse matrix-vector multiplication on cache-based GPUs', in Innovative Parallel Computing (InPar), 2012, IEEE, 2012. ( PDF )
 
G. Klingbeil, R. Erban, M. Giles, P.K. Maini. 'Fat vs. thin threading approach on GPUs: application to stochastic simulation of chemical reactions'. IEEE Transactions on Parallel and Distributed Systems, 23(2):280-287, 2012. ( PDF )
 
M.B. Giles. 'Approximating the erfinv function'. pp.109-116 in GPU Computing Gems, Jade Edition, Morgan Kaufmann, 2011. ( PDF )
 
T. Bradley, J. du Toit, M.B. Giles, R. Tong, P. Woodhams. 'Parallelisation techniques for random number generators'. pp.231-246 in GPU Computing Gems, Emerald Edition, Morgan Kaufmann, 2011. ( PDF )
 

Paul Dellar / Emma Warneford

P.J. Dellar (2013) Lattice Boltzmann magnetohydrodynamics with  current-dependent resistivity, J. Comput. Phys. 237 115-131 ( online )
 
P. J. Dellar (2014) Lattice Boltzmann algorithms without cubic defects in Galilean invariance on standard lattices, J. Comput. Phys. 259 270–283 (online )
 
E. S. Warneford & P. J. Dellar (2014) Thermal shallow water models of geostrophic turbulence in Jovian atmospheres, Phys. Fluids 26 016603 ( online )
 
P. J. Dellar (2014) Lattice Boltzmann formulation for linear viscoelastic fluids using an abstract second stress, to appear in SIAM J. Sci. Comput. ( online)
 
E. S. Warneford & P. J. Dellar (2014) Super- and sub-rotating equatorial jets: Newtonian cooling versus Rayleigh friction, (arXiv)
 
E. S. Warneford (2014) "The thermal shallow water equations, their quasi-geostrophic limit, and equatorial super-rotation in Jovian atmospheres", DPhil thesis ( online )
 
P. J. Dellar (2014) Short course on "Discrete Kinetic Theory for Viscoelastic Liquids: Similarities and Differences" at the 11th International Conference for Mesoscopic Methods in Engineering and Science (ICMMES 2014) in New York, (online )
 
 

Jared Tanner / Jeffrey Blanchard

J.D. Blanchard, J. Tanner and K. Wei, 'CGIHT: Conjugate Gradient Iterative Hard Thresholding for compressed sensing and matrix completion'. Information and Inference, in press, 2015

GAGA - A large scale software package for compressed sensing.

J.D. Blanchard and J. Tanner, 'Performance comparison of greedy algorithms in compressed sensing', Numerical Linear Algebra with Applications, 22(2):254-282, 2015 ( PDF )

J.D. Blanchard, J. Tanner, and K. Wei. 'Conjugate Gradient Iterative Hard Thresholding: observed noise stability for compressed sensing', IEEE Trans. on Signal Processing, 63(2):528-537, 2015

J.D. Blanchard and J. Tanner 'GPU accelerated greedy algorithms for compressed sensing'. Mathematical Programming Computation, 5(3):267-304, 2013. ( PDF )

 

Nando de Freitas / Phil Blunsom

D Kotzias, M Denil, N de Freitas, P Smyth. 'From group to individual labels using deep features'. Proceedings of the 21st ACM SIGKDD Conference, 2015.

N Kalchbrenner, E Grefenstette, P Blunsom. 'A Convolutional Neural Network for Modelling Sentences'. ACL 2014.

KM Hermann, P Blunsom. 'Multilingual Models for Compositional Distributed Semantics'. ACL 2014.

 

Andrew Zisserman / Andrea Vedaldi

K. Chatfield, R. Arandjelović, O. M. Parkhi, A. Zisserman. 'On-the-fly learning for visual search of large-scale image and video datasets'. International Journal of Multimedia Information Retrieval, 2015

M. Cimpoi, S. Maji, A. Vedaldi. 'Deep Filter Banks for Texture Recognition and Segmentation'. IEEE Conference on Computer Vision and Pattern Recognition, 2015

E. J. Crowley, O. M. Parkhi, A. Zisserman. 'Face Painting: querying art with photos'. British Machine Vision Conference, 2015
 
M. Jaderberg, K. Simonyan, A. Vedaldi, A. Zisserman. 'Deep Structured Output Learning for Unconstrained Text Recognition'. International Conference on Learning Representations, 2015

N. Jammalamadaka, A. Zisserman, C. V. Jawahar. 'Human Pose Search using Deep Poselets'. International Conference on Automatic Face and Gesture Recognition, 2015

K. Lenc, A. Vedaldi. 'Understanding image representations by measuring their equivariance and equivalence'. IEEE Conference on Computer Vision and Pattern Recognition, 2015
 
A. Mahendran, A. Vedaldi. 'Understanding Deep Image Representations by Inverting Them'. IEEE Conference on Computer Vision and Pattern Recognition, 2015
 
O. M. Parkhi, A. Vedaldi, A. Zisserman. 'Deep Face Recognition'. British Machine Vision Conference, 2015
 
T. Pfister, J. Charles, A. Zisserman. 'Flowing ConvNets for Human Pose Estimation in Videos'. IEEE International Conference on Computer Vision, 2015
 
K. Simonyan, A. Zisserman. 'Very Deep Convolutional Networks for Large-Scale Image Recognition'. International Conference on Learning Representations, 2015
 
 

Steve Roberts 

T. Gunter, M.A. Osborne, R. Garnett, P. Hennig, & S.J. Roberts. Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature. Advances in Neural Information Processing Systems (NIPS) 2014.
 
A Karastergiou, J Chennamangalam, W Armour, C Williams, B Mort, F Dulwich, S Salvini, A Magro, S Roberts, M Serylak, A Doo, AV Bilous, RP Breton, H Falcke, J-M Grießmeier, JWT Hessels, EF Keane, VI Kondratiev, M Kramer, J van Leeuwen, A Noutsos, S Osłowski, C Sobey, BW Stappers, P Weltevrede. 'Limits on fast radio bursts at 145MHz with ARTEMIS, a real-time software backend.'  Monthly Notices of the Royal Astronomical Society, 452(2):1254-1262, 2015 (online )

W. Armour, A. Karastergiou, M. Giles, C. Williams, A. Magro, K. Zagkouris, S. Roberts, S. Salvini, F. Dulwich and B. Mort.  A GPU-based survey for millisecond radio transients using ARTEMIS. ASP Conference Series, 461, 33 (ADASS XXI, 2012).

 

FMRIB (Steve Smith)

FMRIB ( www.fmrib.ox.ac.uk ) research focuses on the use of Magnetic Resonance Imaging (MRI) for neuroscience research. FMRIB's Software Library (FSL) is a comprehensive library of analysis tools for structural, functional and diffusion MRI data.

We have developed a CUDA version of our popular diffusion MRI processing toolbox, bedpostx (Bayesian Estimation of Diffusion Parameters Obtained using Sampling Techniques), which will be released through FSL in the next 1-2 weeks.

Other FSL tools are being developed in CUDA but have not yet been released. These include:

- Eddy Current Distortion Correction tool

- Rubix, a Bayesian hierarchical data fusion framework for robust processing of diffusion MRI data. It combines data acquired at different spatial resolutions for Bayesian inference of crossing fibres in diffusion MRI

- Probtrackx, probabilistic tractography

SN Sotiropoulos, S Jbabdi, J Xu, JL Andersson, S Moeller, EJ Auerbach, MF Glasser, M Hernandez, G Sapiro, M Jenkinson, DA Feinberg, E Yacoub, C Lenglet, DC Van Essen, K Ugurbil, TE Behrens. 'Advances in diffusion MRI acquisition and processing in the Human Connectome Project', Neuroimage, 80:125-143, 2013.

M. Hernandez, G. D. Guerrero, J. M. Cecilia, J. M. Garcia, A. Inuggi, S. Jbabdi, T. E. Behrens and S. N. Sotiropoulos, 'Accelerating fibre orientation estimation from diffusion weighted magnetic resonance imaging using GPUs', PLoS One, 8(4), e61892, 2013 ( online )

M. Hernandez, G. D. Guerrero, J. M. Cecilia, J. M. Garcia, A. Inuggi and S. N. Sotiropoulos, 'Accelerating fibre orientation estimation from diffusion weighted magnetic resonance imaging using GPUs', International Conference on Parallel, Distributed and Network-based Processing (PDP), Germany, February 2012. ( online )

 

Wes Armour

A Karastergiou, J Chennamangalam, W Armour, C Williams, B Mort, F Dulwich, S Salvini, A Magro, S Roberts, M Serylak, A Doo, AV Bilous, RP Breton, H Falcke, J-M Grießmeier, JWT Hessels, EF Keane, VI Kondratiev, M Kramer, J van Leeuwen, A Noutsos, S Osłowski, C Sobey, BW Stappers, P Weltevrede. 'Limits on fast radio bursts at 145MHz with ARTEMIS, a real-time software backend.'  Monthly Notices of the Royal Astronomical Society, 452(2):1254-1262, 2015 ( online )

J. Chennamangalam, A. Karastergiou, D. MacMahon, W. Armour, J. Cobb, D. Lorimer, K. Rajwade, A. Siemion, D. Werthimer and C. Williams. 'ALFABURST: A realtime fast radio burst monitor for the Arecibo telescope'. Proceedings of the Fourteenth Marcel Grossmann Meeting on General Relativity, 2015. ( online)

J Chennamangalam, A Karastergiou, W Armour, C Williams, M Giles. 'ARTEMIS: A real-time data processing pipeline for the detection of fast transients'. Radio Science Conference (URSI AT-RASC), IEEE, 2015. ( online )

M. Serylak, A. Karastergiou, C. Williams, W. Armour, M. Giles and LOFAR Pulsar Working Group. Observations of transients and pulsars with LOFAR international stations and the ARTEMIS system. Proceedings of IAUS 291 (Neutron Stars and Pulsars: Challenges and Opportunities after 80 years, 2012).

M. Serylak, A. Karastergiou, C. Williams, W. Armour and LOFAR Pulsar Working Group. Observations of transients and pulsars with LOFAR international stations. ASP Conference Series, 466 (Electromagnetic Radiation from Pulsars and Magnetars, 2012).

W. Armour, A. Karastergiou, M. Giles, C. Williams, A. Magro, K. Zagkouris, S. Roberts, S. Salvini, F. Dulwich and B. Mort.  A GPU-based survey for millisecond radio transients using ARTEMIS.  ASP Conference Series, 461, 33 (ADASS XXI, 2012).

 

Ian Reid

V.A. Prisacariu, I. Reid.  PWP3D: Real-Time Segmentation and Tracking of 3D Objects.  Int Journal of Computer Vision , 98(3):335-354, 2012.

V.A. Prisacariu, I. Reid. Robust 3D hand tracking for human computer interaction. IEEE conference on Automatic Face & Gesture Recognition (FG 2011), 2011.

V.A. Prisacariu, I. Reid. 3D hand tracking for human computer interaction. Image and Vision Computing , 30(3):236-250, 2012.

V.A. Prisacariu, A.V. Segal, I. Reid. Simultaneous Monocular 2D Segmentation, 3D Pose Recovery and 3D Reconstruction. 11th Asian Conference on Computer Vision, LNCS 7724, 2012.

A. Dame, V.A. Prisacariu, C.Y. Ren, I. Reid.  Dense reconstruction using 3D object shape priors.  17th Int Conf on Computer Vision and Pattern Recognition, 2013.

 

Jonathan Doye / Ard Louis

C. Matek, P. Sulc, F. Randisi, J.P.K. Doye, A.A. Louis. 'Coarse-grained modelling of supercoiled RNA'. Journal of Chemical Physics, 143(24), 2015.  (online)

L. Rovigatti, P. Šulc, I.Z. Reguly, F. Romano. "A comparison between parallelization approaches in molecular dynamics simulations on GPUs", Journal of Computational Chemistry, 36:1-8, 2015. (arxiv)

J.P.K. Doye, T.E. Ouldridge, A.A. Louis, F. Romano, P. Sulc, C. Matek, B.E.K. Snodin, L. Rovigatti, J.S. Schreck, R.M. Harrison, and W.P. Smith, 'Coarse-graining DNA for simulations of DNA nanotechnology', Phys. Chem. Chem. Phys. 15, 20395-20414 (2013)

R. Matthews, A.A. Louis, C.N. Likos.  "Effect of bending rigidity on the knotting of a polymer under tension".  ACS macro letters , 1(11):1352-1356, (2012).

C. Matek, T.E. Ouldridge, J.P.K. Doye and A.A. Louis, "Plectoneme tip bubbles: Coupled denaturation and writhing in supercoiled DNA", submitted (arXiv:1404.2869)

 

Chris Holmes

A. Lee, C. Yau, M.B. Giles, A. Doucet, C.C. Holmes. 'On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods'. Journal of Computational and Graphical Statistics , 19(4): 769-789, 2010.

A. Lee, F. Caron, A. Doucet, C. Holmes. 'Bayesian sparsity-path-analysis of genetic association signal using generalized t priors'. Statistical Applications in Genetics and Molecular Biology , 11(2), 2012.

M.A. Suchard, C. Holmes, M. West. 'Some of the what? why? how? who? and where? of graphics processing unit computing for Bayesian analysis'.  Bulletin of the Int Society for Bayesian Analysis , 17(1), 2010.

 

Oege de Moor / Luke Cartey

Luke Cartey, Rune Lyngsø and Oege de Moor, 'Synthesising graphics card programs from DSLs', in Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation. Pages 121–132. New York, NY, USA. 2012. ACM.