Most Recent Tune Posts



IPC Boot Camp: How to Use MPI in Your Programs

Intel’s MPI technology has replaced other interprocess communication techniques with a state-of-the-art API. Rick Leinecker walks you through several code examples that show you how to get up and running very quickly. See everything you need to know to start using MPI in your programs, providing a robust interprocess communication …

Posted in Tune

Destroy Data Races with Intel Inspector XE

Multithreading can speed up your programs but introduces a new kind of bug: a data race, where two or more threads access the same variable at the same time and at least one of them writes to it. David Bolton shows how to catch data races using Intel Inspector XE 2015 and demonstrates a real bug, then …

Posted in Tune

Data Normalization with SIMD Vectorization

Data sets often represent collected values that reflect real-world situations. For instance, census data might contain the ages of all residents within a certain township. Another example is when schools aggregate grade averages among classrooms. Many times, though, data collections have strong components with large magnitudes that tend to overwhelm …

Posted in Tune

Bring New Life to Legacy C++ Code with Parallel Studio

In the software engineering class I teach, we discuss the need for and importance of refactoring code. Over time, it is inevitable that code must be reworked to meet new needs. Sometimes customers complain about performance, while at other times software engineers want cleaner and more maintainable code. Using Intel Parallel …

Posted in Tune

Fast Matrix Multiply Fortran Program Using OpenMP

David Bolton demonstrates how to speed up an intensive Fortran program, making it three times as fast by using OpenMP. First, he runs an unoptimized version that takes about 18 seconds to do a matrix multiplication of two 650 x 650 arrays. Then he runs it optimized in just six …

Posted in Tune

Parallel Power to the Programmer: Coding Course Leads the Way

As systems with multiple CPUs, each carrying multiple cores, become increasingly popular for a large variety of workloads, organizations unsurprisingly want to take full advantage of each CPU and coprocessor that is part of the execution environment. Those who include Xeon Phi coprocessors as part of their infrastructure will be …

Posted in Tune

Go Lock-Free to Keep Your Code Up to Speed

To get the best performance in parallel programming, you want to avoid locks and critical sections, which can slow down your code. In this post, Jeff Cogswell investigates lock-free programming and explains briefly how it works. Then he’ll show you where to learn more from a master with …

Posted in Tune

4 Steps to Tune Up MPI Apps for Boosted Performance

Scientific and engineering programmers want to get every bit of performance possible from clustered systems. The growing popularity of MPI applications calls for new tools to analyze and improve overall system performance. This white paper discusses a methodical four-step approach to profiling and analyzing MPI performance using Intel Trace Analyzer …

Posted in Tune

Multicore vs. Vectorized: Programming Techniques Compared

Parallel programming includes two separate technologies: multicore and vectorized programming. But what is the difference, and how can the two work together? Jeff Cogswell tackles this question. Here at Go Parallel, we’ve talked about two primary ways you can use parallel programming: multicore and vectorized. I’ve received a few emails …

Posted in Verify

Calculating Pi with Monte Carlo and MKL

The Math Kernel Library (MKL) provides a great way to calculate huge arrays of random numbers. Creating a Monte Carlo simulation is then easy once you have these random numbers. Jeff Cogswell shows how you can use both MKL and a Monte Carlo algorithm to estimate pi, thus learning the mechanics …

Posted in Build
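
For readers who want a feel for the Monte Carlo idea in the pi post before reading it, here is a minimal sketch. It is not the code from the post and it does not use MKL: it assumes Python's standard `random` module as a stand-in for the random number source, and the function name `estimate_pi` and its parameters are hypothetical. The idea is to scatter random points over the unit square and count how many land inside the quarter circle of radius 1; that fraction approaches pi/4.

```python
import random


def estimate_pi(num_samples: int, seed: int = 0) -> float:
    """Estimate pi by sampling points uniformly in the unit square.

    Hypothetical helper for illustration only; the standard-library
    generator stands in for the MKL random number routines.
    """
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")

    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    inside = 0
    for _ in range(num_samples):
        x = rng.random()
        y = rng.random()
        # The point falls inside the quarter circle when x^2 + y^2 <= 1.
        if x * x + y * y <= 1.0:
            inside += 1

    # Quarter-circle area over unit-square area is pi/4.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    print(estimate_pi(1_000_000))
```

The estimate converges slowly (the error shrinks roughly with the square root of the sample count), which is one reason a fast source of large random arrays matters for this kind of simulation.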