Artificial neural networks (ANNs) are used today to learn solutions to parallel processing problems that have proved impossible to solve with conventional algorithms. From cloud-based, voice-driven apps like Apple’s Siri to real-time knowledge-mining apps like IBM’s Watson to gaming apps like Electronic Arts’ SimCity, ANNs are powering voice-recognition, pattern-classification and function-optimization algorithms well suited to acceleration with Intel Hyper-Threading Technology.
“Artificial neural networks and hyper-threading technologies are ideally suited for each other,” says Chuck Desylva, a support engineer for Intel performance primitives. “By functionally decomposing ANN workloads, dividing them among logical processors and employing additional optimization methods, you can achieve significant performance gains.”
Desylva recently tested several widely available open-source ANN algorithms on a Pentium 4 Extreme Edition to demonstrate the speed-ups its two hardware threads can achieve. For the forthcoming massively parallel Xeon Phi, Desylva predicts even greater acceleration of ANN algorithms, since Xeon Phi supports four threads on each of its 50-plus cores.
“I think that Xeon Phi will be a perfect fit for ANNs,” says Desylva.
Biological neurons (upper left) are emulated by artificial neural network (ANN) mapping concepts that sum inputs (upper right) then supply an output (bottom) filtered by an activation function. Source: Intel
Artificial neural networks emulate how the brain’s billions of neurons and trillions of synaptic connections divide and conquer tough combinatorial problems involving detection of features, perception of objects and the cognitive functions of association, generalization and attention. By implementing multiple layers of virtual parallel processors, each simulating a layer of interconnected neurons like those found in the cerebral cortex, ANNs can learn solutions to programming problems impossible to execute in real time with conventional algorithms.
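The neuron model in the figure above can be sketched in a few lines: each artificial neuron sums its weighted inputs plus a bias, then filters the sum through an activation function. The sigmoid used here is a common choice for illustration; the article does not say which activation Desylva’s test code used.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sigmoid activation: squashes the weighted sum into the range (0, 1).
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// One artificial neuron: weighted sum of inputs plus a bias,
// passed through the activation function.
double neuron_output(const std::vector<double>& inputs,
                     const std::vector<double>& weights,
                     double bias) {
    double sum = bias;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        sum += inputs[i] * weights[i];
    return sigmoid(sum);
}
```

A layer of such neurons, each reading the same inputs through its own weights, corresponds to one of the interconnected layers described above.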
For instance, ANNs enable voice-recognition systems to instantaneously match your voice against millions of stored samples, in contrast with standard algorithms that would have to serially compare your voice to each sample and then calculate the best match, a task too computationally intensive for real-time execution.
To evaluate how to accelerate ANNs, Desylva adapted several popular algorithms for hyper-threading, such as the back-propagation-of-error (BPE) learning algorithm, which sends corrective feedback to earlier layers in a multi-layer neural network until the desired real-time response is achieved.
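The corrective-feedback idea behind BPE can be sketched for a toy network with one hidden layer and a single sigmoid output. This is an illustrative minimum, not Desylva’s test code: the structure `TinyNet`, the squared-error measure and the layer sizes are all assumptions for the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical two-layer network: hidden weights [neuron][input]
// feeding a single output neuron with weights w_out [hidden].
struct TinyNet {
    std::vector<std::vector<double>> w_hidden;
    std::vector<double> w_out;
};

static double sig(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One BPE step: forward pass, then propagate the output error back
// through both weight layers. Returns the squared error before the update.
double bpe_step(TinyNet& net, const std::vector<double>& in,
                double target, double lr) {
    const std::size_t H = net.w_out.size();
    std::vector<double> h(H);
    for (std::size_t j = 0; j < H; ++j) {          // hidden layer forward
        double s = 0.0;
        for (std::size_t i = 0; i < in.size(); ++i)
            s += net.w_hidden[j][i] * in[i];
        h[j] = sig(s);
    }
    double o_in = 0.0;                              // output layer forward
    for (std::size_t j = 0; j < H; ++j) o_in += net.w_out[j] * h[j];
    const double o = sig(o_in);

    const double err = target - o;
    const double delta_o = err * o * (1.0 - o);     // output-layer delta
    for (std::size_t j = 0; j < H; ++j) {
        // Hidden delta uses the pre-update output weight (corrective feedback
        // flowing backward to the earlier layer).
        const double delta_h = delta_o * net.w_out[j] * h[j] * (1.0 - h[j]);
        net.w_out[j] += lr * delta_o * h[j];
        for (std::size_t i = 0; i < in.size(); ++i)
            net.w_hidden[j][i] += lr * delta_h * in[i];
    }
    return err * err;
}
```

Repeating `bpe_step` drives the error down, which is the “until the desired response is achieved” loop described above.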
Testing of these neural-learning algorithms was performed on a virtual network of 10 million neurons. Performance boosts of over 10 percent were achieved immediately by using Streaming SIMD Extensions 2 (SSE2) and thread-safe versions of the Microsoft Standard Template Library (STL). OpenMP pragmas were then used to direct the compiler to thread the code, yielding a 20 percent overall performance increase over the original source. VTune was then run, showing a 3- to 4-times speedup in the commands OpenMP uses to synchronize threads.
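The article does not reproduce Desylva’s source, but the OpenMP technique it describes looks like this: a single pragma tells the compiler to split a per-neuron loop across the available logical processors. Compiled without OpenMP support, the pragma is simply ignored and the loop runs serially, producing the same result either way.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Apply the activation function to every neuron's pre-computed weighted
// sum. Each iteration is independent, so OpenMP can hand chunks of the
// loop to different logical processors.
std::vector<double> activate_layer(const std::vector<double>& sums) {
    std::vector<double> out(sums.size());
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(sums.size()); ++i)
        out[i] = 1.0 / (1.0 + std::exp(-sums[i]));  // sigmoid per neuron
    return out;
}
```

The signed loop index and the absence of loop-carried dependencies are what make the loop eligible for OpenMP’s worksharing construct.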
Next, the same OpenMP-based optimization technique was applied to the update function, which calculates the output of each neural-network layer before passing it to the next, doubling the average performance of several different ANN applets.
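An update function in the spirit described above might look like the following sketch: each layer’s outputs are computed in parallel across neurons, then fed forward as the next layer’s inputs. The weight layout and sigmoid activation are assumptions for illustration, not details from Intel’s tests.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One weight matrix per layer: rows are neurons, columns are input weights.
using Matrix = std::vector<std::vector<double>>;

// Feed the signal through every layer in turn. Within a layer, each
// neuron's output is independent, so the per-neuron loop parallelizes
// cleanly under OpenMP; the layer-to-layer loop stays sequential because
// each layer depends on the previous one's output.
std::vector<double> update(const std::vector<Matrix>& layers,
                           std::vector<double> signal) {
    for (const Matrix& w : layers) {
        std::vector<double> next(w.size());
        #pragma omp parallel for
        for (long n = 0; n < static_cast<long>(w.size()); ++n) {
            double s = 0.0;
            for (std::size_t i = 0; i < signal.size(); ++i)
                s += w[n][i] * signal[i];
            next[n] = 1.0 / (1.0 + std::exp(-s));  // sigmoid activation
        }
        signal = std::move(next);
    }
    return signal;
}
```

Only the inner loop carries the pragma: parallelizing across neurons within a layer preserves the layer ordering the network depends on.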
Finally, dissecting the BPE learning algorithm itself yielded as much as a 3.6-times speedup over the original, unmodified source.
Colin Johnson is a Geeknet contributing editor and veteran electronics journalist, writing for publications from McGraw-Hill’s Electronics to UBM’s EETimes. Colin has written thousands of technology articles covered by a diverse range of major media outlets, from the ultra-liberal National Public Radio (NPR) to the ultra-conservative Rush Limbaugh Show. A graduate of the University of Michigan’s Computer, Control and Information Engineering (CICE) program, he took on his master’s project to “solve” the parallel processing problem 20 years ago, when engineers thought it would only take a few years. Since then, he has written extensively about the challenges of parallel processors, including emulating those in the human brain in his John Wiley & Sons book Cognizers: Neural Networks and Machines That Think.