Tutorial: Pointer Checker – Catch Out-of-Bounds Memory Accesses

Kittur Ganesh, Technical Consulting Engineer for Intel Software Development Products, explains how Pointer Checker, a feature of Intel Parallel Studio XE, automates detection of out-of-bounds memory accesses and protects against them, facilitating debugging, preventing data corruption, and, when enabled in testing runtimes, exposing issues that might cause exploitable exceptions before release. This tutorial was summarized by John Jainschigg, Geeknet Contributing Editor, from an Intel video webinar (51:39).

Pointers are a conspicuous and powerful feature of C and C++, but one that smart programmers always treat with caution. Every beginning C coder spends some time scratching his/her head about “character pointer pointer argv.” Every intro book and course talks about pointer hygiene: about methodical initialization, memory allocation and value-assignment, type compatibility, deallocation and removal; about the dangers of using computed offsets, indexing into buffers, and so on. And they talk about the scary results: not just segfaults and GP faults, but about silently corrupted data and resulting hair-pulling, hard-to-find-and-reproduce bugs.

These days, much attention is also given to application and system security: how digital vandals may use inputs to influence pointers, triggering faults and causing data corruption, or may combine fault-triggering with other techniques to destabilize and exploit a process and gain control of the system hosting it.

Intel Parallel Studio XE 2013 offers a Pointer Checker feature that detects and limits the side-effects of pointer problems. It can be used in debugging and enabled as a runtime component of applications during testing, offering insight that can be used to improve application robustness and security.

The Pointer Checker feature works on IA-32 and Intel 64 architectures, under Linux or Windows, running on any Intel Pentium 4 or later compatible processor. The feature is off by default and is enabled with one compile-time switch that lets you define whether both read and write operations, or just write operations, will be checked. Its work is further directed via a user API that controls what happens when violations are detected. Pointer Checker is added by the compiler mostly as a runtime library that gets linked in automatically, and it works without changing structure layout or the application binary interface (ABI). So you can use Pointer Checker to perform checking on single files, groups of files, or a whole application, and do so by recompiling only the pointer-checked items, because pointer-checked and non-pointer-checked application components can coexist. This is an important convenience, particularly when applying Pointer Checker to improve the quality of legacy code.

When Pointer Checker is enabled, the compiler establishes bounds for each explicit and implicit pointer (e.g., & operator references, array references, etc.) and then copies, stores, loads and passes these bounds when pointers are used, also generating checks when pointers are used indirectly to reference memory. The runtime library wrappers, meanwhile, create bounds dynamically when memory is allocated; and these bounds follow pointers in use: through transformations via casting, through memory, array, and pointer-value copies and derivations, etc., enabling all usage to be checked.
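The idea of bounds that travel with a pointer can be pictured with a small conceptual model in plain C. This is only a sketch for illustration, not how the Intel compiler represents bounds internally; the `bounded_ptr` type and the `bp_*` helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a pointer that remembers the object it came from. */
typedef struct {
    char     *base;    /* start of the object the pointer was derived from */
    size_t    size;    /* size of that object in bytes */
    ptrdiff_t offset;  /* current position of the pointer, relative to base */
} bounded_ptr;

/* Bounds are established when the pointer is first created. */
static bounded_ptr bp_make(char *base, size_t size) {
    bounded_ptr p = { base, size, 0 };
    return p;
}

/* Pointer arithmetic moves the pointer; the bounds are copied unchanged. */
static bounded_ptr bp_add(bounded_ptr p, ptrdiff_t delta) {
    p.offset += delta;
    return p;
}

/* The check generated before an access of nbytes through the pointer. */
static bool bp_in_bounds(bounded_ptr p, size_t nbytes) {
    return p.offset >= 0
        && nbytes <= p.size
        && (size_t)p.offset <= p.size - nbytes;
}

/* A checked one-byte store: refuses to write outside the object. */
static bool bp_store(bounded_ptr p, char value) {
    if (!bp_in_bounds(p, 1)) {
        return false;          /* a real checker would report a violation here */
    }
    p.base[p.offset] = value;
    return true;
}
```

With a four-byte buffer, `bp_store(bp_add(p, 3), 'x')` succeeds, while offsets 4 and -1 are rejected, because the bounds recorded by `bp_make` follow every derived copy of the pointer.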

Working with Pointer Checker can be simple and iterative: include the chkp.h header, then request reporting by conditionally compiling an appropriate call to the Pointer Checker API, such as:

#ifdef REPORT
__chkp_report_control(__CHKP_REPORT_TRACE_LOG, 0);
#endif

… and compile and execute with the appropriate switches, for example, by entering:

% icc main.c -DREPORT -check-pointers=write -rdynamic -g; ./a.out

… at the console. The resulting output identifies bounds violations, shows the traceback to individual lines of code, and shows the program output, if any.

Pointer Checker keeps bounds attached to pointers as they are copied, cast, and passed to functions, so it catches mistakes like this one:

char *my_chptr = "abc";    // pointer to a null-terminated string: three characters plus the terminator
char *another_chptr;       // declare another char pointer
another_chptr = (char *) malloc(strlen(my_chptr));    // allocates three bytes, because strlen() does not count the terminator
memset(another_chptr, '@', sizeof(my_chptr));    // the coder presumably means 'fill my buffer with @ characters', but sizeof(my_chptr) is the size of the pointer (8 bytes = 64 bits in the current model), so this writes eight @ characters into a three-byte buffer, which is bad

So when you get eight ‘@’s as output from printf(), you don’t have to scratch your head too long to see why.
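For comparison, a corrected version of the same idea might look like the following sketch. The function name `make_filled_copy` is hypothetical; the point is that the buffer is sized from `strlen()` plus one byte for the terminator, and the fill length comes from the string length rather than from `sizeof` applied to a pointer.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated, null-terminated string with the same length
 * as src, with every character set to fill. The caller frees the result.
 * Returns NULL if allocation fails.
 */
char *make_filled_copy(const char *src, char fill) {
    size_t len = strlen(src);       /* characters, excluding the terminator */
    char *buf = malloc(len + 1);    /* one extra byte for the terminator */
    if (buf == NULL) {
        return NULL;
    }
    memset(buf, fill, len);         /* length of the data, not sizeof a pointer */
    buf[len] = '\0';
    return buf;
}
```

Calling `make_filled_copy("abc", '@')` yields a four-byte allocation holding three `@` characters and the terminator, so nothing is written past the end of the buffer.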

Deeper Debugging

Variations in compiler switch options let you tell Pointer Checker to check for dangling pointer references, and to check the bounds of un-dimensioned arrays, as when a region of memory is referred to using array syntax.

Dangling-pointer checking detects pointers into freed storage, whether on the stack or the heap. For heap memory it manages this, when enabled, by wrapping the C runtime function free() and the C++ delete operator. When memory is freed, Pointer Checker's deallocation wrappers set the lower bound of each associated pointer to 2 and its upper bound to 0. These are illegal values, and they can be read back through the API to determine whether a particular violation was caused by a dangling pointer.
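The reason a lower bound of 2 and an upper bound of 0 works as a marker can be shown with a small model. This sketch uses plain integers in place of addresses, and the `bounds_rec` type and function names are hypothetical; it is not the Pointer Checker API. Because the lower bound exceeds the upper bound, no address can satisfy the check, and the specific pair of values identifies the cause.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds record: both ends are inclusive addresses. */
typedef struct {
    uintptr_t lower;
    uintptr_t upper;
} bounds_rec;

/* What a deallocation wrapper would do to the bounds of a freed pointer. */
static void mark_freed(bounds_rec *b) {
    b->lower = 2;
    b->upper = 0;
}

/* Distinguishes a dangling pointer from an ordinary bounds violation. */
static bool looks_dangling(const bounds_rec *b) {
    return b->lower == 2 && b->upper == 0;
}

/* True if nbytes starting at addr fall inside [lower, upper]. */
static bool access_ok(const bounds_rec *b, uintptr_t addr, uintptr_t nbytes) {
    if (nbytes == 0 || b->lower > b->upper) {
        return false;               /* empty range: nothing is accessible */
    }
    return addr >= b->lower
        && addr <= b->upper
        && nbytes - 1 <= b->upper - addr;
}
```

A violation handler built on this model could call `looks_dangling()` first and report "use after free" rather than a generic out-of-bounds message.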

Deeper analysis can be built using four simple intrinsic functions which let you access and use the lower and upper bound information associated with any pointer, remove it (letting the pointer be used to access memory unrestricted by bounds-checking), and set it manually. This is useful for building pointer-checking logic around custom memory allocators, such as might be used to arrange structured data in vector-register-sized chunks for optimal consumption by SIMD instructions.
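To make the custom-allocator case concrete, here is a minimal sketch of an arena that carves one large block into chunks and records narrow bounds for each chunk. It does not call the Intel intrinsics; the `arena` and `chunk` types, the function names, and the 32-byte chunk spacing are all assumptions chosen for illustration. In a pointer-checked build, the place where `lower` and `upper` are recorded is where bounds would be set manually.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE  256
#define CHUNK_ALIGN 32   /* assumed spacing, e.g. one vector register */

typedef struct {
    unsigned char storage[ARENA_SIZE];
    size_t used;                     /* bytes handed out so far */
} arena;

typedef struct {
    unsigned char *ptr;    /* start of the chunk, NULL on failure */
    unsigned char *lower;  /* first valid byte */
    unsigned char *upper;  /* last valid byte */
} chunk;

/* Hands out nbytes at the next multiple of CHUNK_ALIGN from the arena start. */
static chunk arena_alloc(arena *a, size_t nbytes) {
    chunk c = { NULL, NULL, NULL };
    size_t start = (a->used + CHUNK_ALIGN - 1) / CHUNK_ALIGN * CHUNK_ALIGN;
    if (nbytes == 0 || start > ARENA_SIZE || nbytes > ARENA_SIZE - start) {
        return c;                    /* request does not fit */
    }
    c.ptr   = a->storage + start;
    c.lower = c.ptr;                 /* bounds cover this chunk only, */
    c.upper = c.ptr + nbytes - 1;    /* not the whole arena */
    a->used = start + nbytes;
    return c;
}

/* True if nbytes starting at p lie entirely inside the chunk. */
static bool chunk_allows(const chunk *c, const unsigned char *p, size_t nbytes) {
    uintptr_t lo = (uintptr_t)c->lower;
    uintptr_t hi = (uintptr_t)c->upper;
    uintptr_t addr = (uintptr_t)p;
    if (c->ptr == NULL || nbytes == 0) {
        return false;
    }
    return addr >= lo && addr <= hi && nbytes - 1 <= hi - addr;
}
```

Without the per-chunk bounds, every pointer into the arena would inherit the bounds of the whole 256-byte block, and an overrun from one chunk into its neighbor would go unnoticed.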

These intrinsics are also important aids when using Pointer Checker on some modules in a program, but not others. A potential problem arises when enabled and non-enabled modules are combined, because a function in a non-enabled module that returns a pointer will not also return its bounds. Pointer Checker mitigates this, in most cases, by checking the stored pointer against a copy stored with the bounds, local to the non-enabled function; but this can fail in some cases, as when an enabled function passes a pointer (with known bounds) to a function which then reallocates the memory associated with it. In these cases, you can use the intrinsic functions to reset pointer bounds appropriately.

Pointer Checker can also be used to check runtime library functions that use pointers and move memory around. The feature provides a replacement library with wrapped versions of all standard C/C++ RTL functions. The wrappers add code to ensure that pointers created by these functions have enforceable bounds, and that bounds are created and updated as necessary when, for example, these functions copy or move memory. Wrapped functions appear in the replacement libraries with the prefix __chkp_ (e.g., __chkp_strcpy, the wrapper for strcpy()).
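The general shape of such a wrapper can be sketched in portable C. This is not the source of __chkp_strcpy; the function name `sketch_checked_strcpy` is hypothetical, and the explicit `dest_size` parameter stands in for the bounds that the real checker would already know for the destination pointer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copies src into dest only if the whole string, including its terminator,
 * fits within dest_size bytes. Returns dest on success. Returns NULL and
 * copies nothing if the copy would run past the end of dest.
 */
char *sketch_checked_strcpy(char *dest, size_t dest_size, const char *src) {
    size_t needed = strlen(src) + 1;    /* characters plus terminator */
    if (needed > dest_size) {
        return NULL;                    /* a real wrapper would report a violation */
    }
    memcpy(dest, src, needed);
    return dest;
}
```

The check happens before any byte is written, which is what lets a wrapper of this kind stop corruption rather than merely observe it afterwards.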

The Pointer Checker library function __chkp_report_control() gives you broad control over how Pointer Checker operates. Arguments to the function let you tell Pointer Checker to log bounds errors and continue, execute breakpoint interrupts for the debugger, call user-defined functions, or various combinations.
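The difference between these reporting modes can be illustrated with a small dispatcher. Everything in this sketch is hypothetical, including the `sketch_` names and the three modes shown; it models the idea of a single control call that selects how later violations are handled, and it is not the Pointer Checker implementation.

```c
#include <stdio.h>

/* Hypothetical reporting modes, loosely modeled on the options described. */
typedef enum {
    SKETCH_REPORT_NONE,      /* count violations silently */
    SKETCH_REPORT_LOG,       /* log the violation and continue */
    SKETCH_REPORT_CALLBACK   /* hand the violation to a user function */
} sketch_report_mode;

typedef void (*sketch_callback)(const char *what);

static sketch_report_mode sketch_mode = SKETCH_REPORT_LOG;
static sketch_callback sketch_user_callback = NULL;
static unsigned sketch_violations = 0;
static unsigned sketch_callback_hits = 0;

/* Selects how violations reported after this call are handled. */
void sketch_report_control(sketch_report_mode mode, sketch_callback cb) {
    sketch_mode = mode;
    sketch_user_callback = cb;
}

/* Called by checking code whenever an out-of-bounds access is detected. */
void sketch_report_violation(const char *what) {
    ++sketch_violations;
    switch (sketch_mode) {
    case SKETCH_REPORT_LOG:
        fprintf(stderr, "bounds violation: %s\n", what);
        break;
    case SKETCH_REPORT_CALLBACK:
        if (sketch_user_callback != NULL) {
            sketch_user_callback(what);
        }
        break;
    case SKETCH_REPORT_NONE:
        break;
    }
}

/* Example user callback: counts how often it was invoked. */
void sketch_counting_callback(const char *what) {
    (void)what;
    ++sketch_callback_hits;
}

unsigned sketch_violation_count(void) {
    return sketch_violations;
}
```

Logging and continuing suits a first pass over a large code base, while a callback lets a test harness fail the run or collect statistics as soon as a violation occurs.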

Tips for Using Pointer Checker

Pointer Checker is best used in Debug configuration, which makes symbols visible for improved clarity of tracebacks. Linux users should compile using the -rdynamic option, which tells the linker to add all symbols, not just the ones that are used, to the dynamic symbol table, so that backtraces can be generated from them.

Another important technique is to work, where possible, to ensure that out-of-bounds errors detected by Pointer Checker occur near where bad pointers are generated. One way to help ensure this is to compile without optimization, because optimization can eliminate memory accesses and tends to produce ultra-efficient code that doesn’t map to source lines as well as unoptimized code does.

Using the range of Pointer Checker options systematically is also important. It makes sense to check for write errors first and then, having eliminated these, for read errors, which will by now be reduced in number. Examining the trace logs for repeated references to single pointers and sets of bounds can help you identify and quickly eliminate errors occurring within loops.

It’s important to release your application with Pointer Checker disabled, since otherwise it will incur the overhead of moving bounds around with pointers and performing copious checks, increasing both execution time and application size. Arguably, however, in certain cases it may be justifiable to release code compiled with Pointer Checker enabled in order to provide in-service protection against pointer-related security exploits. This protection must be weighed against the cost in application size and performance.

Want more detail? Read a related article from Intel’s Parallel Universe magazine

 

Posted by John Jainschigg, Geeknet Contributing Editor