When you’re writing C++ code with Intel Parallel Studio XE, you’ll find a lot of constructs, macro definitions, and pragmas that you can use to generate parallel and vectorized code. In this blog post, I want to go through some of the more common options so that you’re familiar with them. For example, if you’ve read my column before, you’ve seen #pragma novector and #pragma simd.
First, the keywords. There are really just three new ones, and all three support Cilk Plus programming:
- _Cilk_spawn
- _Cilk_for
- _Cilk_sync
However, the header files provide macros that give you slightly shorter, more C-looking terms that are in all lowercase and don’t have the preceding underscores. The following are actually macros defined in cilk.h that map to the three keywords I just showed you:
- cilk_spawn expands to _Cilk_spawn
- cilk_for expands to _Cilk_for
- cilk_sync expands to _Cilk_sync
(Take a look at the cilk.h file yourself; you might be surprised how small it is.) Cilk also has a single pragma called grainsize. It’s not used very often, and you can read about it in the Intel Cilk User Guide at http://software.intel.com/file/28680.
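To make the three macros concrete, here’s a minimal sketch of my own. The function names fib and squares are made up for illustration. I’m assuming the compiler defines the __cilk macro when Cilk Plus is enabled; when it doesn’t, the sketch defines the macros away so the same code runs serially.

```cpp
#include <cassert>
#include <vector>

#ifdef __cilk
#include <cilk/cilk.h>  // provides cilk_spawn, cilk_sync, cilk_for
#else
// Assumption: no Cilk Plus support here, so fall back to plain serial code.
#define cilk_spawn
#define cilk_sync
#define cilk_for for
#endif

// Recursive Fibonacci: the spawned call may run in parallel with the next one.
long fib(int n) {
    assert(n >= 0);
    if (n < 2) return n;
    long a = cilk_spawn fib(n - 1);  // may execute concurrently with fib(n - 2)
    long b = fib(n - 2);
    cilk_sync;                       // wait for the spawned call before reading a
    return a + b;
}

// Fills a vector with squares; the iterations are independent of each other,
// which is what makes cilk_for a safe replacement for the plain for loop.
std::vector<long> squares(int count) {
    assert(count >= 0);
    std::vector<long> out(count);
    cilk_for (int i = 0; i < count; ++i) {
        out[i] = static_cast<long>(i) * i;
    }
    return out;
}
```

Calling fib(10) or squares(4) works the same way with or without Cilk Plus; only the scheduling changes.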
For vectorization, there are several pragmas. The pragma list is actually quite long, so I’m going to cover the most commonly used ones:
- pragma inline, pragma noinline, pragma forceinline: These let you tell the compiler whether or not to inline the function calls in the statement that follows the pragma. As with inlining in general, the compiler still gets the final say, so they’re really just suggestions. The inline and forceinline pragmas take an optional argument, recursive, which extends the request to the calls made further down the call chain. For example:
    for (int i = 0; i < 100; i++) {
        dosomething(i);
        #pragma forceinline recursive
        for (int j = 0; j < 100; j++) {
            somethingElse(i, j);
        }
    }
- pragma vector and pragma novector: These instruct the compiler to attempt vectorization of the loop that follows, or not to attempt it. I’ve covered these many times in previous posts.
- pragma nofusion: If you have two loops, one after the other, and they have the same size, same index, same everything, the compiler might try to fuse them together into an optimized single loop. But if you don’t want this to happen, you can use the nofusion pragma.
- pragma optimize: Sometimes (such as in debugging) you might have a function you don’t want optimized. Or, conversely, you might have optimization turned off but want to optimize a function. This only applies to functions, and you put it just before the function header. Here’s how you turn optimization off and back on, respectively:
    #pragma optimize("", off)
    #pragma optimize("", on)
But what is the first argument, you ask? It’s not used, and the documentation provides no further clues other than to say that it is ignored.
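Here’s a small sketch that pulls the loop and function pragmas from the list above into one place. The function names are hypothetical, and the always keyword on pragma vector comes from my reading of Intel’s pragma reference rather than from anything above, so treat it as an assumption. Compilers that don’t know these pragmas simply ignore them (possibly with a warning), so the code behaves the same either way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Optimization is switched off for functions defined from here...
#pragma optimize("", off)
long checksum(const std::vector<int>& v) {  // hypothetical helper to keep debuggable
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}
// ...and switched back on from here.
#pragma optimize("", on)

void scale_then_shift(std::vector<float>& a, std::vector<float>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();

    // Two loops with identical bounds: ask the compiler not to fuse this one
    // with its neighbor.
    #pragma nofusion
    for (std::size_t i = 0; i < n; ++i) a[i] *= 2.0f;

    // Ask the compiler to vectorize this loop.
    #pragma vector always
    for (std::size_t i = 0; i < n; ++i) b[i] += 1.0f;
}

void prefix_sum(std::vector<int>& v) {
    // Each iteration reads the previous result, so request no vectorization.
    #pragma novector
    for (std::size_t i = 1; i < v.size(); ++i) v[i] += v[i - 1];
}
```

None of the pragmas change what the code computes; they only steer how the compiler generates it, which is why it’s worth checking the optimization report to see whether a request was honored.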
If you’ve fussed with the different compiler settings, you’ve probably seen how you can have different optimization levels. You can also control these through pragmas. You have to throw in the word “intel”:
    #pragma intel optimization_level 1
Here, 1 is the optimization level. The choices are 0, 1, 2, and 3, and they match the compiler options -O0 (the character before the zero is the letter O), -O1, -O2, and -O3.
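As a quick sketch, this is where the pragma would sit in a source file. I’m assuming, from my reading of Intel’s documentation, that the intel form affects only the function that immediately follows it; sum_to and sum_of_squares are made-up names.

```cpp
#include <cassert>

// Request -O1 for the next function, regardless of the command-line setting.
#pragma intel optimization_level 1
int sum_to(int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; ++i) total += i;
    return total;
}

// No pragma here, so this function gets whatever level the command line asked for.
int sum_of_squares(int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; ++i) total += i * i;
    return total;
}
```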
Next, if you’ve been reading my column, you know that you can target different processors. Believe it or not, you can add some granularity to individual functions so that a function might target a particular processor. (Think about that for a moment: If you have a function that only supports a certain architecture, and the code is running on a machine that doesn’t support that architecture, then you’ll want to first check the architecture before calling the function.) The pragma is a long one; here’s what it looks like to target AVX architecture, for example:
    #pragma intel optimization_parameter target_arch=AVX
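Here’s a sketch of the check-before-you-call idea. Everything except the pragma itself is my own assumption: the function names are made up, and the CPU check uses _may_i_use_cpu_feature under the Intel compiler and __builtin_cpu_supports under GCC or Clang on x86. On anything else the sketch plays it safe and takes the generic path.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__INTEL_COMPILER)
#include <immintrin.h>  // _may_i_use_cpu_feature, _FEATURE_AVX
#endif

// Returns true only when we can confirm that the running CPU supports AVX.
bool cpu_has_avx() {
#if defined(__INTEL_COMPILER)
    return _may_i_use_cpu_feature(_FEATURE_AVX) != 0;
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;  // unknown platform: stay on the generic path
#endif
}

// Only this function is compiled for AVX.
#pragma intel optimization_parameter target_arch=AVX
float sum_avx(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Same logic, compiled for the default target.
float sum_generic(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Check the architecture first, then call the version that is safe to run.
float sum_dispatch(const std::vector<float>& v) {
    return cpu_has_avx() ? sum_avx(v.data(), v.size())
                         : sum_generic(v.data(), v.size());
}
```

Callers only ever see sum_dispatch, so the architecture check lives in one place instead of being scattered around the code.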
But what about our old friend #pragma simd? This one actually has a lot of options, and I’ll take that one up next time, and show how exactly it works together with the vector pragma. And let’s not forget the declspecs. Again, that’s a big topic and we’ll cover that soon as well.
Have you tried any of these and had much luck? Share your experiences in the comments section and we’ll talk more about them.
________________________________________________________________
Jeff Cogswell is a Geeknet contributing editor, and is the author of several tech books including C++ All-In-One Desk Reference For Dummies, C++ Cookbook, and Designing Highly Useable Software. A software engineer for over 20 years, Jeff has written extensively on many different development topics. An expert in C++ and JavaScript, he has experience starting from low-level C development on Linux, up through modern web development in JavaScript and jQuery, PHP, and ASP.NET MVC.







