Stream HPC

PDFs of Monday 5 September

Live from le Centre Pompidou in Paris: Monday PDF-day. I had never been inside the building before, but it houses a large public library where people queue to get in. No end to the knowledge economy in Paris. A great place to read some interesting articles on the subjects I like.

CUDA-accelerated genetic feedforward-ANN training for data mining (Catalin Patulea, Robert Peace and James Green). Since I have some background in neural networks, I really liked this article.

Self-proclaimed State-of-the-art in Heterogeneous Computing (Andre R. Brodtkorb, Christopher Dyken, Trond R. Hagen, Jon M. Hjelmervik, and Olaf O. Storaasli). It is from 2010, but was only recently put online. I think it is a must-read on Cell, GPU and FPGA architectures, even though (as others have also remarked) Cell is not so state-of-the-art any more.

OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems (John E. Stone, David Gohara, and Guochun Shi). A basic and clear introduction to my favourite parallel programming language.

Research proposal: Heterogeneity and Reconfigurability as Key Enablers for Energy Efficient Computing. About increasing energy efficiency with GPUs and FPGAs.

Design and Performance of the OP2 Library for Unstructured Mesh Applications. CoreGRID presentation/workshop on OP2, an open-source parallel library for unstructured grid computations.

Design Exploration of Quadrature Methods in Option Pricing (Anson H. T. Tse, David Thomas, and Wayne Luk). Accelerating specific option pricing with CUDA. Conclusion: the FPGA uses the fewest watts per FLOPS, CUDA is the fastest, and the CPU is the big loser in this comparison. It must be mentioned that GPUs are easier to program than FPGAs.

Technologies for the future HPC systems. Presentation on how HPC company Bull sees the (near) future.

Accelerating Protein Sequence Search in a Heterogeneous Computing System (Shucai Xiao, Heshan Lin, and Wu-chun Feng). Accelerating the Basic Local Alignment Search Tool (BLAST) on GPUs.

PTask: Operating System Abstractions To Manage GPUs as Compute Devices (Christopher J. Rossbach, Jon Currey, Mark Silberstein, Baishakhi Ray, and Emmett Witchel). Microsoft Research work on how to abstract GPUs as compute devices. Implemented on Windows 7 and Linux, but the code is not available.

PhD thesis by Celina Berg: Building a Foundation for the Future of Software Practices within the Multi-Core Domain. It is about a Rupture model described in Ch. 2.2.2 (PDF page 59). [205 pages in total]

Workload Balancing on Heterogeneous Systems: A Case Study of Sparse Grid Interpolation (Alin Murarasu, Josef Weidendorfer, and Arndt Bode). In my opinion a very important subject, as this can help automate much-needed “hardware-fitting”.
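To give an idea of what such balancing means in its simplest form, here is a minimal sketch of a static split. It is not the method from the paper: it assumes the throughput of each device has been measured beforehand, and the function name `split_work` and the example rates are made up for illustration.

```python
def split_work(total_items, throughputs):
    """Split total_items across devices in proportion to their throughput.

    throughputs maps a device name to items processed per second
    (hypothetical numbers, e.g. from a small calibration run).
    """
    total_rate = sum(throughputs.values())
    if total_rate <= 0:
        raise ValueError("at least one device needs a positive throughput")

    # Exact proportional share per device (usually not a whole number).
    exact = {dev: total_items * rate / total_rate
             for dev, rate in throughputs.items()}

    # Start from the rounded-down shares ...
    shares = {dev: int(value) for dev, value in exact.items()}

    # ... and give the leftover items to the largest remainders.
    leftover = total_items - sum(shares.values())
    by_remainder = sorted(exact, key=lambda dev: exact[dev] - shares[dev],
                          reverse=True)
    for dev in by_remainder[:leftover]:
        shares[dev] += 1
    return shares


# Assumed rates: the GPU is three times as fast as the CPU.
print(split_work(1000, {"cpu": 100.0, "gpu": 300.0}))
# prints {'cpu': 250, 'gpu': 750}
```

A static split like this only works when the cost per item is uniform; for irregular workloads a dynamic scheme that hands out chunks at run time would be the more likely choice.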

Fraunhofer: Efficient AMG on Heterogeneous Systems (Jiri Kraus and Malte Förster). AMG stands for algebraic multigrid. The paper includes OpenCL and CUDA benchmarks on NVIDIA hardware.

Enabling Traceability in MDE to Improve Performance of GPU Applications (Antonio Wendell de O. Rodrigues, Vincent Aranega, Anne Etien, Frédéric Guyomarc’h, Jean-Luc Dekeyser). Ongoing work on OpenCL code generation from UML models (MDE stands for Model-Driven Engineering). [34-page PDF]

GPU-Accelerated DNA Distance Matrix Computation (Zhi Ying, Xinhua Lin, Simon Chong-Wee See and Minglu Li). Computing the distances between DNA sequences on the GPU with OpenCL.
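For readers who have not seen a distance matrix before, a small CPU-only sketch follows. It assumes aligned sequences of equal length and uses the simple p-distance (the fraction of positions that differ); the paper may well use a different metric, and the names `p_distance` and `distance_matrix` are mine.

```python
def p_distance(a, b):
    """Fraction of positions at which two aligned sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def distance_matrix(seqs):
    """Symmetric matrix with the p-distance of every pair of sequences."""
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is independent of all the others.
            d = p_distance(seqs[i], seqs[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


for row in distance_matrix(["ACGT", "ACGA", "TCGA"]):
    print(row)
# prints:
# [0.0, 0.25, 0.5]
# [0.25, 0.0, 0.25]
# [0.5, 0.25, 0.0]
```

Because every pair can be computed independently, the number of pairs grows quadratically with the number of sequences while the work stays trivially parallel, which is why this kind of problem maps well to a GPU.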

And while browsing around for PDFs I found the following interesting links:

  • Say bye to Von Neumann. Or how IBM’s Cognitive Computer Works.
  • Workshop on HPC and Free Software. 5-7 October 2011, Ourense, Spain. Info via
  • Basic CUDA course, 10 October, Delft, Netherlands, €200,-.
  • Par4All: automatic parallelizing and optimizing compiler for C and Fortran sequential programs.
  • LAMA: Library for Accelerated Math Applications for C/C++.