Colfax International has put together an entire course for learning parallel programming for the Xeon processor and Xeon Phi coprocessor. Jeff Cogswell reviews the most excellent book portion.
This week I decided to read and review “Parallel Programming and Optimization with Intel Xeon Phi Coprocessors” from Colfax International, a standalone book supplementing the company’s training courses. A manufacturer of custom workstations, servers, clusters and storage solutions based on Intel’s Xeon processors and Xeon Phi coprocessors, Colfax speaks with some experience and authority.
As readers of this blog well know, Xeon Phi coprocessors require some advanced parallel programming techniques. Colfax took it upon itself to create a complete course for its clients and for anyone else who wants to program the Xeon Phi coprocessor. As someone involved in the publishing industry for 20 years now as a side job from my programming work, I can say they’ve done a fine job.
Best of Both Programming Worlds
In the world of programming, there are two distinct branches. One focuses on hardware and building low-level software that requires developers to know all the ins and outs of the underlying platform. The other is application and higher-level software programming. Here, you’re using high-level languages and needn’t worry much about hardware.
The world of parallel programming, however, has blurred the two. Today if you write a scalable application that needs to run on the latest, greatest hardware, you really need to take time to learn the underlying hardware. I’m speaking here, of course, about parallel programming and multi-core programming and, in the Xeon Phi world, writing for a coprocessor. You really can’t just set a couple switches and let the compiler make your application “go parallel.” No, you must adjust your programming techniques. And the way you learn to do so is through blogs like this, as well as books and training materials.
This new parallel programming book is a good start. It’s a bit more hardware-intensive than some programmers might be used to. If you’re an application programmer new to such hardware concepts, don’t be put off; the book does a great job introducing basics and getting you up to speed. The authors take the approach of introducing a few topics, showing how they fit into the bigger picture, then going into more detail in the next chapter. This tack works out especially well for a training situation, which is the book’s primary purpose, after all.
Structured for … Education?!?
The book contains only a handful of chapters (four), which take up only about half the volume. The remainder is study exercises and samples. (Having written books for the big publishers, I must commend Colfax for willingness to diverge from strict industry formats optimized for profit, not education. And as a classroom instructor, I really appreciate books like this that are squarely focused on learning.)
A 35-page Introduction covers the architecture of the Xeon processor and the Xeon Phi coprocessor, along with the MIC architecture as a whole. Then come sections on installing the software tools and managing the coprocessor. (Remember, the coprocessor works as an independent computer inside the same system, running a version of Linux; it then appears as a computer on a network.)
The next three chapters are all about programming: starting with programming models, then parallel programming techniques, and finally optimization. Chapter 2 discusses how to build programs for the coprocessor, and then how to deploy them to it. For example, one technique discussed is how to offload only parts of your code to the coprocessor simply by using a pragma. The code following the pragma then gets pushed over to the coprocessor and runs there. The book also discusses issues with shared memory and a technique called the MYO (virtual shared) memory model. (The MYO acronym stands for “Mine, Yours, Ours.”)
Chapter 3 details some of the techniques we’ve covered in this blog: SIMD operations and vectorization; data alignment; Cilk Plus; and so on. And finally, Chapter 4 covers optimizations, and tackles this by showing both the wrong way and the right way to do optimizations. (For example, one section is called “Data Parallelism: Vectorization the Right and Wrong Way.”)
The exercises are practical, straightforward and an excellent extension to the concise text.
I’ll be blunt: This is an excellent book. A couple of weeks ago I reviewed “Intel Xeon Phi Coprocessor High-Performance Programming” by James Jeffers and James Reinders. This book is a fine complement to that work. They do cover some of the same material, but from different perspectives, one within Intel, one outside. If you’re serious about parallel programming, I say pick up both.
The book ($49 for the PDF, $69 plus shipping and handling for print) supplements Colfax’s in-class or self-paced training, which includes dedicated access to a computing system with Intel Xeon Phi coprocessors.