Hand-coded threads are harder to debug than high-level abstractions such as OpenMP. They are harder to deal with than processes, because threads can step on each other. Intel Threading Building Blocks extend C++ with template libraries: STL for parallelism.
“I heard threads were evil.” That is how a developer approached me recently; he was thinking about using parallelism to optimize his application for multi-core platforms.
Threads are one of the fundamental building blocks of parallel programming; managing them directly, however, causes a lot of problems.
On balance, I agree that threads are evil. I say, “Avoid hand-coded threading like the plague.” Friends don’t let friends hand-code their threading.
Hand-coded threads are harder to debug than high-level abstractions such as OpenMP. They are harder to deal with than processes, because threads can step on each other (since they share data and memory). They result in an explosion of thread-management code. They are seldom as scalable as they could be, which means you need to come back and re-tune them over and over for new machines (because the effort, expertise, testing and machines needed to get scalability right are a rarity). Finally, they are hard to maintain, especially when the “new guy” gets to pick up the code you wrote (and we all know the next programmer is never as good).
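To make the “explosion of thread-management code” concrete, here is a minimal sketch of summing an array with hand-managed threads. It assumes standard C++11 `std::thread` and `std::mutex`; the helper name `hand_threaded_sum` and its parameters are made up for illustration and are not from any library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: sums `data` using `num_threads` hand-managed threads.
long long hand_threaded_sum(const std::vector<int>& data, unsigned num_threads) {
    assert(num_threads > 0);
    long long total = 0;      // shared by every thread
    std::mutex total_mutex;   // without this lock, updates to `total` would race
    std::vector<std::thread> workers;

    // Manual partitioning: the slice size is fixed up front by the caller's
    // thread count, which is exactly what needs re-tuning on a new machine.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &total, &total_mutex, begin, end] {
            long long local = 0;
            for (std::size_t i = begin; i < end; ++i) local += data[i];
            std::lock_guard<std::mutex> lock(total_mutex);
            total += local;   // the only line that touches shared state
        });
    }

    // Every thread must be joined; destroying a joinable std::thread
    // terminates the program.
    for (auto& w : workers) w.join();
    return total;
}
```

The actual work is a single line, `local += data[i]`. Everything else is partitioning, locking and joining, and each of those is a place to get it wrong.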
Of course, hand coding is more flexible. You can control every little bit, making you feel like you are not giving up anything. That sounds like the arguments we used to have about assembly language programming vs. higher-level languages.
We got over assembly language programming, even those of us who enjoyed it, and we’ll get over hand-coding of threads. And like everything else in today’s world, we’ll get over it faster this time.
OpenMP Limits C++ Developers
Last month, I advocated trying OpenMP. Alexa and others pointed out that it cannot do everything. It is tied too tightly to loops, so it’s very rigidly tied to control flow. It also does nothing for us with data structures, and getting your data organized right for parallelism can be a chore. Both of these have a lot to do with the heritage of OpenMP and its connection to Fortran. So it is no surprise that most of the complaints come from C/C++ programmers.
So, listening to C++ developers, we came up with the Intel Threading Building Blocks to extend C++ with template libraries. I think of it as “STL” for parallelism: the same mechanism, the same basic concepts, and the same argument that you get abstraction and performance together through generic programming.
I’ve talked to some professors who are looking at using it in teaching computer science because it lets them focus on the parallelism instead of the thread management. I wish I had this when I took Parallel Computer Algorithms in graduate school. I’m happy to talk with professors interested in learning more; I send them copies of Intel Threading Building Blocks and I’m eager to see the emerging curriculum ideas possible when programming without direct thread management.
Proven Solution: C++ Template Libraries
You should take a look at Intel Threading Building Blocks. I think it is a BIG idea, and a really great implementation. Everyone can try it for free, and if you want to do more, drop me a note with your ideas.
Here is why I think it is so interesting: Intel Threading Building Blocks let you specify tasks instead of threads. This way, you are not forced to work out an efficient mapping of logical tasks onto threads yourself. The Intel Threading Building Blocks run-time library automatically maps tasks onto threads in a way that makes efficient use of processor resources.
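The difference between specifying tasks and specifying threads is easier to see in code. The sketch below is not the Threading Building Blocks interface; it uses `std::async` from the standard C++ library as a stand-in, and the helper name `task_sum` and the `grain` parameter are my own assumptions for illustration. The point is only the shape: the code says what work can be split, and never says how many threads there are.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical helper, NOT a Threading Building Blocks API.
// Recursively splits [begin, end) into tasks until a piece holds at most
// `grain` elements, then sums that piece serially.
long long task_sum(const std::vector<int>& data,
                   std::size_t begin, std::size_t end, std::size_t grain) {
    assert(grain > 0 && begin <= end && end <= data.size());
    if (end - begin <= grain) {
        return std::accumulate(data.begin() + begin, data.begin() + end, 0LL);
    }
    const std::size_t mid = begin + (end - begin) / 2;

    // Describe the left half as a task. With the default launch policy the
    // implementation decides whether to run it on another thread or to
    // defer it until get() is called.
    auto left = std::async([&data, begin, mid, grain] {
        return task_sum(data, begin, mid, grain);
    });
    const long long right = task_sum(data, mid, end, grain);  // do the right half here
    return left.get() + right;
}
```

A call such as `task_sum(values, 0, values.size(), 1000)` contains no thread count, no lock and no join. A real task scheduler goes further than this stand-in by balancing the tasks across a pool of worker threads, which is the part you want a library to own.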
It supports many types of threading under the covers, where it is highly optimized and can change to best suit the hardware. It delivers performance: it is focused on parallelizing computationally intensive work and on delivering higher-level, simpler solutions.
It works with other threading packages: using it in a program alongside OpenMP or hand-coded threads will work. It emphasizes scalable, data-parallel programming. This avoids breaking up programs functionally, which typically does not scale well when the number of functional blocks is fixed.
It relies on generic programming, so you get the best algorithms with the fewest constraints. The C++ Standard Template Library (STL) is a good example of generic programming in which the interfaces are specified by requirements on types.
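Here is a small sketch of what “interfaces specified by requirements on types” means in practice. It uses only the standard library; `fold_sum` and `fold_sum_examples` are hypothetical names chosen for illustration (the algorithm is essentially what `std::accumulate` already does).

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical generic algorithm. It names no container and no element type;
// it only requires that
//   - Iter can be compared with !=, incremented with ++, and dereferenced, and
//   - `init + *first` is valid and assignable back to T.
template <typename Iter, typename T>
T fold_sum(Iter first, Iter last, T init) {
    for (; first != last; ++first) init = init + *first;
    return init;
}

// Usage: one template serves three unrelated container/element combinations.
inline void fold_sum_examples() {
    const std::vector<int> ints{1, 2, 3};
    const std::list<double> reals{0.5, 1.5};
    const std::vector<std::string> words{"ab", "cd"};
    assert(fold_sum(ints.begin(), ints.end(), 0) == 6);
    assert(fold_sum(reals.begin(), reals.end(), 0.0) == 2.0);
    assert(fold_sum(words.begin(), words.end(), std::string()) == "abcd");
}
```

Because the compiler generates a specialized version for each combination of types, the abstraction does not have to cost you performance. That is the argument carried over from STL.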
Take a look, and let me know what you think. While I’m very excited about Intel Threading Building Blocks, I’m hoping to see more proposals which offer the high-level abstraction to avoid hand-coded thread management while offering more functionality.
We can go a long way with programming parallelism in C++ with this template library for parallelism. And for the moment, I’ve sidestepped the controversy over using threading. Threads are best when carefully programmed, debugged, and hidden inside something like Intel’s Threading Building Blocks.
Try this template library out. Avoid threading. I’ll come back to “threads are evil” another day, with ideas on what to do if you just cannot resist the temptation.