Most Recent Tune Posts

Getting the Most From Your Cache: Data Locality

In this video, Slashdot Media Contributing Editor Rick Leinecker demonstrates how to organize a program's data to take advantage of your cache. The video shows three techniques for improving cache utilization in code, all built on the same main concept: data locality.
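The teaser does not name the three techniques, so the following is only a minimal sketch of the general idea of data locality, not necessarily one of the methods shown in the video. It assumes a matrix stored in row-major order; the function names `sum_row_major` and `sum_column_major` are hypothetical. Both functions return the same result, but the first visits memory in the order it is laid out.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a rows x cols matrix stored in row-major order.
 * Cache-friendly: the inner loop walks consecutive addresses,
 * so every element of a fetched cache line gets used. */
double sum_row_major(const double *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    double total = 0.0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Same sum, but the inner loop jumps `cols` elements per step.
 * On large matrices each access tends to land on a different
 * cache line, which usually makes this version slower. */
double sum_column_major(const double *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    double total = 0.0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```

How large the difference is depends on the matrix size, the cache hierarchy, and the compiler, so it is worth timing both versions on your own hardware.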

Read Full Post Posted in Tune | Tagged | Leave a comment

Getting Started with Numba for Python

Love Python but hate the slow, interpreted code? In this video, Slashdot Media Contributing Editor David Bolton shows you how to get going with the Numba compiler.

Using Numba to Accelerate Python Execution

Python is an excellent language, but because it is interpreted rather than compiled ahead of time, it runs slower than compiled languages. There are various ways to speed it up, including Cython, PyPy (an alternative interpreter with a just-in-time compiler) and Numba. Numba speeds up your Python applications by just-in-time compiling Python code using the LLVM compiler to produce optimized machine code …

Code Modernization: Weighing Pros and Cons Of OpenMP

I am an OpenMP evangelist. I use it, and I love it. This semester I spent one week in my advanced architecture class showing how it contributes to the continuance of Moore’s Law. I have also spent a lot of time here at Go Parallel talking about OpenMP, and showing …

Code Modernization – The Importance of Cache Awareness

Thirty years ago, in a simpler 8-bit world, I spent a few years developing games in 6502 and Z80 assembler. Life was simpler then, with processors running at 1 MHz and instructions being 2-3 words long. These instructions typically took 2-3 clock cycles to execute but could be …

The Tenets of Code Modernization

Code modernization starts with taking advantage of the resources that are available to an application. The easiest way to modernize is to parallelize sections of your code, since multiple processors or cores can greatly reduce execution time. This presentation gets you started.

Intel Releases BigDL Deep Learning Framework

Accelerated Big Data code development, performance for Apache Spark

Intel has unveiled BigDL, an open-source deep learning library for Apache Spark. It allows users to write their deep learning applications as standard Spark programs, which can run on top of existing Spark or Hadoop clusters, the company says. The BigDL …

Thread Synchronization and OpenMP: Mechanisms Ease Code Integrity

Spawning multiple threads can pose numerous problems, such as ensuring that sensitive sections of code are accessed by only one thread at a time. This video shows how to use OpenMP mechanisms such as critical, atomic, and barrier to keep things working correctly.
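As a rough illustration of the three mechanisms named above, here is a minimal sketch in C. It is not taken from the video; the function names (`count_even_atomic`, `max_critical`, `double_then_reverse`) are hypothetical. It assumes a compiler with OpenMP enabled (for example `-fopenmp` with GCC); without that flag the pragmas are ignored and the code runs serially with the same results.

```c
#include <assert.h>

/* atomic: protects a single update of a shared variable. */
long count_even_atomic(const int *data, int n)
{
    long count = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (data[i] % 2 == 0) {
            #pragma omp atomic
            count++;
        }
    }
    return count;
}

/* critical: only one thread at a time runs the enclosed block,
 * so the compare-and-update of `best` cannot interleave. */
int max_critical(const int *data, int n)
{
    assert(n > 0);
    int best = data[0];
    #pragma omp parallel for
    for (int i = 1; i < n; i++) {
        #pragma omp critical
        {
            if (data[i] > best)
                best = data[i];
        }
    }
    return best;
}

/* barrier: every thread waits until all of tmp[] is written
 * before any thread starts reading it in the second loop. */
void double_then_reverse(const int *src, int *tmp, int *dst, int n)
{
    #pragma omp parallel
    {
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            tmp[i] = src[i] * 2;

        #pragma omp barrier

        #pragma omp for
        for (int i = 0; i < n; i++)
            dst[i] = tmp[n - 1 - i];
    }
}
```

The `nowait` clause removes the implicit barrier at the end of the first loop, which is what makes the explicit `barrier` necessary here. For simple counters and maxima like these, a `reduction` clause is often the faster choice; the versions above are written to show the synchronization constructs themselves.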

Roofline Analysis in Intel Advisor 2017

What is Roofline Analysis? First proposed by UC Berkeley professors in 2009, Roofline is designed to analyze performance in multi-core and many-core systems, and can uncover the bottlenecks that are limiting overall performance. In this video, Intel Software expert Alex Shinsel takes us for a spin with Advisor’s new …
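The core of the roofline model is a single bound: attainable performance is the smaller of the machine's peak compute rate and its peak memory bandwidth multiplied by the kernel's arithmetic intensity (floating-point operations per byte moved). The sketch below computes that bound. It illustrates the model only and says nothing about how Intel Advisor measures or plots it; the function and parameter names are hypothetical.

```c
#include <assert.h>

/* Roofline bound in GFLOP/s.
 *   peak_gflops       : peak compute rate of the machine (GFLOP/s)
 *   peak_gbytes_per_s : peak memory bandwidth (GB/s)
 *   flops_per_byte    : arithmetic intensity of the kernel */
double roofline_gflops(double peak_gflops,
                       double peak_gbytes_per_s,
                       double flops_per_byte)
{
    assert(flops_per_byte >= 0.0);
    /* Ceiling imposed by memory traffic alone. */
    double memory_bound = peak_gbytes_per_s * flops_per_byte;
    /* The kernel cannot exceed either ceiling. */
    return memory_bound < peak_gflops ? memory_bound : peak_gflops;
}
```

With assumed figures of 100 GFLOP/s peak compute and 50 GB/s bandwidth, a kernel doing 0.5 FLOP per byte is capped at 25 GFLOP/s by memory, while a kernel doing 4 FLOP per byte is capped at 100 GFLOP/s by compute.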

Meet The New Intel Storage Performance Snapshot Tool

Helping to Locate, Eliminate Performance Bottlenecks

How will changing your storage configuration impact overall system performance? Check out a live demo of Intel’s new performance tool by Data Center Solutions Architect Ken Letourneau.
