Featured

Powerful C, C++ Extension Eases Parallelism

For decades, software developers have depended on Moore’s Law to make applications run faster and do more, but conventional, general-purpose CPUs hit a wall in the mid-2000s. Increasing clock rates created more problems than benefits. However, Intel and other CPU designers came up with a solution: use the additional …

Posted in Build

Most Powerful Supercomputer’s Big Weakness

China’s Tianhe-2 may be the world’s fastest supercomputer, but some researchers say its use is limited by its high cost and dearth of software, according to a recent article in the South China Morning Post. The Tianhe-2, which reaches 33.86 petaflops, was ranked No. 1 in the Top500 …

Posted in Design

How to Keep Thread-Safe When Queuing Your Data

Queue data structures are common and useful, but not always thread-safe. Jeff Cogswell explains how a queue works and why a typical implementation won’t cut it for multithreaded programming. He then introduces the Threading Building Blocks queue structure, which scales nicely for high-performance, multithreaded programming. A common and useful data …

Posted in Tune

Most Recent Posts

Stop Threads from Clashing Over Variables in OpenMP

OpenMP lets you mark blocks of code that will be duplicated across threads. These can be loops or just simple blocks. To keep threads from clashing over data, you can give each thread its own copy of a variable. Jeff Cogswell shows you how. Last time we explored a bit of …

Posted in Build

Next-Gen Nuclear Agency Supercomputer to Launch

The National Nuclear Security Administration (NNSA) has awarded Cray a $174 million contract to build a new supercomputer that is expected to become one of the fastest in the world. The new supercomputer — named “Trinity,” the code-name for the first nuclear weapon test in 1945 — will be housed …

Posted in Design

Taking OpenMP Out for a Spin

OpenMP provides a way to write parallel code using pragmas embedded in your C++ code. Jeff Cogswell tries out a simple pragma that results in spawning multiple, identical parallel threads. In my last blog, I briefly introduced OpenMP, which is a technology whereby you can write parallel code in ways …

Posted in Build

Determine Processor SIMD Features at Runtime

The Intel compiler can generate code that behaves differently for different processors. Sometimes you might want to manually check the processor features. Or you might just want to know how the generated code does it. In this video, Jeff Cogswell shows you how to use the CPUID assembly instruction to …

Posted in Build

Timing Matters in Threading Building Blocks

When you want to time how long a set of parallel tasks takes to complete, you want to use the actual time, not the CPU time. And you want the time-measuring mechanism to be thread-safe. Jeff Cogswell shows you how to use the timing classes in Threading Building Blocks to …

Posted in Build

Configuring Microsoft Visual Studio for OpenMP

In this video, Jeff Cogswell shows you how to configure a project in Microsoft’s Visual Studio using Parallel Studio and OpenMP. He then takes you through a quick OpenMP program, demonstrating the pragmas

Posted in Build

Exploring Microsoft’s C# with Parallel Studio

When most people think of programming with Parallel Studio, they think of C++. But there’s actually a good bit of support for other languages, including Microsoft’s C#. Jeff Cogswell explores what’s available for C# programmers in Parallel Studio. When I first started working with Intel Parallel Studio a few years …

Posted in Verify

OpenMP: Parallel Programming Alternative

Although we’ve spent a lot of time here at Go Parallel on Cilk Plus, there’s another technology you can use with Parallel Studio called OpenMP. Jeff Cogswell gives you an overview. Here at Go Parallel, we’ve spent a lot of time talking about Cilk Plus, which is a set of …

Posted in Build

Get More Out of OpenCL Cross-Device Portability

Open Computing Language (OpenCL) is a royalty-free standard for cross-platform parallel programming that supports heterogeneous platforms with various types of devices including general-purpose CPUs, graphics processors, and coprocessors. OpenCL attracts developers because, in addition to its broad support by hardware vendors, the language brings the promise of cross-device portability. For …

Posted in Build

Use New MPI-3 Standard to Master Performance Challenges

MPI is a widely used programming interface for distributed-memory systems, and its latest standard, MPI-3, adds major new features such as non-blocking and neighbor collective operations, extensions to the Remote Memory Access (RMA) interface, large count support, and new tool interfaces. Each of these new features may contribute to performance …

Posted in Build