Last week AMD released ports of Caffe, Torch and (work-in-progress) MXNet, so these frameworks now work on AMD GPUs. With the Radeon Instinct MI6, MI8 and MI25 (25 TFLOPS half precision) to be released soon, having software run on these high-end GPUs is of course a necessity.
The ports were announced in December. The MI25 is claimed to be about 1.45x faster than the Titan XP. Now that the three frameworks have been released, current GPUs can be benchmarked and compared.
The expected performance/price ratio makes this very interesting, especially for large installations. Another slide listed the frameworks to be ported: Caffe, TensorFlow, Torch7, MXNet, CNTK, Chainer and Theano.
This leaves HIP ports of TensorFlow, CNTK, Chainer and Theano still to be released.
HIP is a subset of CUDA that works on modern AMD GCN GPUs, and AMD engineers are actively extending it as you read this. Specifying a subset makes it possible to split the focus between adding features and improving performance. You can read more about HIP in our blog post from last year on HIP and its potential. The fact that three frameworks were released simultaneously, while the team also worked on many other projects, shows that it is powerful indeed.
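To give an idea of what porting CUDA code to HIP involves: AMD's hipify tools translate CUDA API calls to their HIP equivalents, to a large extent by systematic renaming. The sketch below is a deliberately crude pure-Python approximation of that idea; the function name `hipify` and the single regex are ours for illustration, and the real tools do much more (kernel-launch syntax, library calls, header includes):

```python
import re

# Illustrative only: many cuda* API names map to hip* names one-to-one,
# e.g. cudaMalloc -> hipMalloc, cudaMemcpyHostToDevice -> hipMemcpyHostToDevice.
_CUDA_CALL = re.compile(r"\bcuda([A-Z]\w*)\b")

def hipify(source: str) -> str:
    """Rename cuda* API identifiers to their hip* counterparts."""
    return _CUDA_CALL.sub(lambda m: "hip" + m.group(1), source)
```

For example, `hipify("cudaMalloc(&ptr, size);")` yields `"hipMalloc(&ptr, size);"`. The point is that, because HIP mirrors the CUDA API so closely, much of a port is mechanical, which is why the frameworks could be ported relatively quickly.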
StreamHPC is a proud service partner for AMD's ROCm stack, which includes HIP. We can port your software to AMD hardware and make it run at maximum performance; thanks to years of experience with GPU coding, our code improvements have been proven to speed up both NVIDIA and AMD implementations.
Our blog post about ROCm 1.5 contains all the information on the driver stack, including how to install it.
Current hardware support is:
If you need your code to be benchmarked on AMD GPUs (daily), get in touch to learn more about our services.
Caffe was developed at the Berkeley Vision and Learning Center (BVLC). It is used for image analysis with convolutional neural networks (CNNs) and for regional analysis within images (Regions with Convolutional Neural Networks, or R-CNNs).
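The core operation Caffe spends most of its GPU time on is the convolution. As a reminder of what that computes, here is a minimal pure-Python sketch of a single-channel 2D convolution with valid padding and no strides (the name `conv2d` and the nested-list layout are ours for illustration; real frameworks use heavily optimized GPU kernels):

```python
def conv2d(image, kernel):
    """Slide a small kernel over an image; each output value is the
    sum of elementwise products over the overlapping window."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out
```

A CNN stacks many such layers (with learned kernels, multiple channels and nonlinearities in between), which is why these workloads map so well onto GPUs.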
AMD demonstrated the Caffe port at SC16, with a focus on how little time it took to port it with HIP. The original plan was to release it with ROCm 1.5, as the required features first had to be performance-optimized within ROCm.
Torch was originally developed at NYU, and is based upon the scripting language Lua, which was designed to be portable, fast, extensible, and easy to use in development. Lua was also designed to have an easy-to-use syntax, which is reflected by Torch’s syntactic ease of use. Torch features a large number of community-contributed packages, giving Torch a versatile range of support and functionality.
MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity. At its core, MXNet contains a dynamic dependency scheduler and a graph optimizer that automatically parallelize both symbolic and imperative operations on the fly, while optimizing for both execution and memory efficiency. It also adds a collection of blueprints and guidelines for building deep learning systems.
NB: this port is said to be work-in-progress.
AMD has been releasing several libraries that work with hcc or HIP:
We can expect more software that depends on these libraries to be ported to AMD GPUs. Which do you think is next? Put your best bet in the comments.
It is important that the code gets integrated upstream, so that the three frameworks officially work on both NVIDIA and AMD GPUs. This can only happen when the maintainers know about these ports and understand that there is demand. If you find the ports useful, open an issue on GitHub to show there is a need.