Most Recent Tune Posts



Roofline Analysis in Intel Advisor 2017

What is Roofline Analysis? First proposed by UC Berkeley professors in 2009, Roofline is designed to analyze performance on multicore and many-core systems, and it can uncover the bottlenecks that limit overall performance. In this video, Intel Software expert Alex Shinsel takes us for a spin with Advisor’s new …

Read Full Post | Posted in Tune

Meet The New Intel Storage Performance Snapshot Tool

Helping to locate and eliminate performance bottlenecks. How will changing your storage configuration impact overall system performance? Check out Intel’s new performance tool in a live demo by Data Center Solutions Architect Ken Letourneau.


Colfax Hands-On Webinar Series: Deep Dive into Performance Optimization

Free 20-hour webinar series includes parallel programming, performance optimization, and remote access to advanced servers. Intel partner Colfax Research is offering free 20-hour, hands-on, in-depth training on parallel programming and performance optimization in computational applications on Intel architecture. The first run in 2017 begins January 16, 2017. Broadcasts start at …


Data Races: What They Are, How to Fix Them

I have talked a lot about parallelizing loops with OpenMP. It is an easy way to improve performance in your applications, especially if you can apply the technique to loops that run often or loops with many iterations. In many cases, OpenMP provides optimized performance with no downside …


Threading in Python: Beating Moore’s Law

Herb Sutter of C++ fame wrote in 2005 that the end was in sight for single-core CPUs keeping up with Moore’s Law. The way forward was multiple cores and concurrency, i.e., doing multiple things at the same time. If you have multiple systems or even multiple …


What is the Effect of Simultaneous OpenMP Loops?

OpenMP simplifies code parallelization, but can you overdo this valuable tool? In this blog, Slashdot Media Contributing Editor Rick Leinecker writes some gnarly code to see whether it causes a performance hit. I have spent a lot of time here at Go Parallel talking about OpenMP loops. …


Breaking Down OpenMP Loops

OpenMP can bring amazing performance boosts to your applications. This presentation breaks down OpenMP loops that have no dependencies. It also shows how easy it is to parallelize with OpenMP by using compiler directives.
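As a rough illustration of what a dependency-free loop parallelized with a compiler directive can look like, here is a minimal sketch in C. It is not taken from the presentation: the function name `saxpy` and its parameters are hypothetical, chosen only for this example. Each iteration reads and writes only element `i`, so the iterations can run in any order.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] for every element.
   Each iteration touches only index i, so there are no
   loop-carried dependencies between iterations. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* The directive asks OpenMP to split the iterations across threads.
       Build with an OpenMP flag (for example -fopenmp with GCC or Clang);
       without it the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Because the directive is the only change to the serial loop, the same source still builds and gives the same result when OpenMP is disabled.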


2D Fourier Transforms using Intel’s Math Kernel Library

Many tasks can benefit from 2D Fourier transforms, and in this video, Slashdot Media Contributing Editor Rick Leinecker demonstrates how the Intel MKL makes the task easy.


Resizing Images with Intel Performance Primitives (IPP)

There are dozens of algorithms for image scaling, from clunky to elegant. In this blog, Slashdot Media Contributing Editor Rick Leinecker compares naïve image scaling to functions available with Intel Performance Primitives (IPP). I have been keenly interested in image processing for many years. I was senior software engineer at …


Improving Data Compression: a Parallel Algorithm for “Shannon Entropy”

A great deal of my personal research is in the area of data compression. I have been doing this type of research for about 20 years. A closely related topic is data entropy. Data entropy is similar to the thermodynamic entropy that many people think of. The higher the data entropy, …
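For reference, the standard definition of Shannon entropy is given below. This is the textbook formula, not the parallel algorithm from the post; the symbols $p_i$ (the probability of symbol $i$) and $n$ (the number of distinct symbols) are notation introduced here for illustration.

```latex
H = -\sum_{i=1}^{n} p_i \log_2 p_i \quad \text{(bits per symbol)}

% Two equally likely symbols:
H = -\left(\tfrac{1}{2}\log_2\tfrac{1}{2} + \tfrac{1}{2}\log_2\tfrac{1}{2}\right) = 1

% A single symbol that always occurs (p_1 = 1):
H = -\left(1 \cdot \log_2 1\right) = 0
```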
