Most Recent Build Posts



Optimizing for Intel MIC Part 2: Vectorizing for Added Performance Gains

There’s more than one way to get the most out of the latest Intel multicore and many-core architectures embodied in Xeon and Xeon Phi, and this series shows several ways to do just that. In “Optimization Techniques for the Intel MIC Architecture. Part 2 of 3: Strip-Mining for Vectorization,” …

Parallel Algorithm Boot Camp

I have spent a good bit of time writing about parallelization using Parallel Studio. My blog posts have covered OpenMP, Cilk, and Threading Building Blocks. These are important because they provide an opportunity to parallelize code that is normally sequential, but it is now time to look into algorithms …

Get Best Memory-Consumption Scalability of Intel MPI Library

High-performance computing applications tend to use most of the available memory on a node, making it difficult to estimate the memory consumption of MPI libraries. But there are ways to estimate memory consumption and ways to fine-tune the Intel MPI Library settings to reduce the memory footprint. For example, users …

Multicore Optimization Realized: Tuning for Intel MICs

As the number of cores available to programmers has grown, so have the opportunities to exceed results expected by Moore’s Law. Servers routinely offer dozens of Xeon cores and the ability to run hundreds of simultaneous threads on Xeon Phi coprocessors. The question is how best to parallelize and optimize …

Reproducing Results With Intel MPI Library

With high-performance computing, floating-point operations in numerical codes can introduce differences that increase with each iteration. The Intel MPI Library offers algorithms to gather conditionally reproducible results, even when the MPI rank distribution environment differs from run to run. Learn more about how you can achieve conditionally reproducible outcomes without …

How to Use New All-MPI Alternative to MPI/OpenMP

The Message Passing Interface (MPI) provides a standard way to exchange messages in distributed-memory parallel programming. Today, hybrid parallel programming on many-core systems typically combines MPI with OpenMP. But there’s a new all-MPI alternative that can improve performance. The latest MPI standard, MPI-3, introduces new shared-memory capabilities that provide …

Speeding Up C++ Map Generator with TBB

Nearly 30 years ago, I devised a map generator for a game. Originally coded in Turbo Pascal, it was followed by a Z80 assembler version; the version I’m trying to speed up is written in C++. Incidentally, there was also a C version, which is still available …

MPI Primer: What You Need to Know

Although microprocessors became multi-core only relatively recently, mainframes, minicomputers, and workstations got there much earlier. If you wanted to write parallel programs for the different architectures used in science and research, it was a bit messy. Twenty-five years ago, there was no …

Parallel Programming Best Practices Revealed

Intel engineers James Reinders and Jim Jeffers have written a new how-to book that teaches developers to optimize parallel performance on Intel’s multicore and many-core processors. The book, titled “High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches,” provides real-world examples and source code on how to leverage parallelism on …

Intel Software Dev Products Win Top HPC Honors

HPCwire magazine has recognized two Intel clustering technologies as the best in the industry. Intel MPI Library 5.0, which focuses on making applications perform better on Intel architecture-based clusters, won a readers’ choice award for best HPC cluster solution or technology. Intel Parallel Studio XE Cluster Edition — High Performance …
