[Llvm-bgq-discuss] address sanitizer

Hal Finkel hfinkel at anl.gov
Tue Jul 9 22:27:03 CDT 2013


----- Original Message -----
> 
> 
> 
> Hi Hal,
> 
> Could you say a bit more on the need for 3 shadow regions? Is it
> because the address space is discontiguous, or are you referring to
> a different limitation? If discontiguous, how does it work under
> normal Linux that has discontiguous ranges for executable, dynamic
> libraries, heap, and stack?

This works under normal Linux because there you can ask for a large chunk of mapped (but not committed) memory at some (essentially) arbitrary portion of the address space. As you touch those pages, they're committed; in practice, you don't touch most of them. As far as I can tell, under CNK, you can only mmap pages (create a memory region) that are part of an existing virtual-memory range (text, data, heap, shmem, etc.), and these are always 'committed'.

For CNK, consider a (fairly generic) c1 process layout:
        0x000001000000-0x000001058000   /gpfs/vesta-fs0/projects/llvm/hfinkel/asan/./stack-oob-frames
        0x000001100000-0x000001105000   /gpfs/vesta-fs0/projects/llvm/hfinkel/asan/./stack-oob-frames
        0x001c07000000-0x001c07102000   [heap0]
        0x001c07102000-0x001c07202000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07202000-0x001c07203000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07202000-0x001c07213000   [heap1]
        0x001c07213000-0x001c07216000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07216000-0x001c0721b000   [heap2]
        0x001c0721b000-0x001c0731b000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c0731b000-0x001c0731c000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c0731b000-0x001c0732b000   [heap3]
        0x001c0732b000-0x001c0732d000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c0732d000-0x001c0733e000   [heap4]
        0x001c0733e000-0x001c0743e000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c0743e000-0x001c0743f000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c0743e000-0x001c0744e000   [heap5]
        0x001c0744e000-0x001c07450000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07450000-0x001c07451000   [heap6]
        0x001c07451000-0x001c07651000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07651000-0x001c0766b000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07651000-0x001c07684000   [heap7]
        0x001c07684000-0x001c07784000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07784000-0x001c07786000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07784000-0x001c07795000   [heap8]
        0x001c07795000-0x001c0779f000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c0779f000-0x001c0789f000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c0779f000-0x001c0789f000   [heap9]
        0x001c0789f000-0x001c078a1000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c078a1000-0x001c078a2000   [heap10]
        0x001c078a2000-0x001c07aa2000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07aa2000-0x001c07aa6000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07aa2000-0x001c07ab5000   [heap11]
        0x001c07ab5000-0x001c07ad0000   /bgsys/drivers/V1R2M0/ppc64/gnu-linux/powerpc64-bgq-linux/lib/li
        0x001c07ad0000-0x002000000000   [heap_and_stack]
        0x003003000000-0x003003100000   /lib64/ld64.so.1
        0x003245000000-0x003245010000   /dev/shm/unique-comm-agent-shmem-file
        0x007000000000-0x007040000000   [process_window0]
        0x007040000000-0x007080000000   [process_window1]
        0x007080000000-0x0070c0000000   [process_window2]
        0x0070c0000000-0x007100000000   [process_window3]
        0x007100000000-0x007140000000   [process_window4]
        0x007140000000-0x007180000000   [process_window5]
        0x007180000000-0x0071c0000000   [process_window6]
        0x0071c0000000-0x007200000000   [process_window7]
        0x007200000000-0x007240000000   [process_window8]
        0x007240000000-0x007280000000   [process_window9]
        0x007280000000-0x0072c0000000   [process_window10]
        0x0072c0000000-0x007300000000   [process_window11]
        0x007300000000-0x007340000000   [process_window12]
        0x007340000000-0x007380000000   [process_window13]
        0x007380000000-0x0073c0000000   [process_window14]
        0x0073c0000000-0x007400000000   [process_window15]
        0x03fdc0000000-0x03fe00000000   [mmio]
        0x4064880000000-0x40648c0000000 [l2atomic0]
        0x40648c0000000-0x40648c1000000 [l2atomic25]
        0x40648c1000000-0x40648c2000000 [l2atomic26]
        0x40648c2000000-0x40648c3000000 [l2atomic27]
        0x40648c3000000-0x40648c4000000 [l2atomic28]
        0x40648c4000000-0x40648c5000000 [l2atomic29]
        0x40648c5000000-0x40648c6000000 [l2atomic30]
        0x40648c6000000-0x40648c7000000 [l2atomic31]
        0x40648c7000000-0x40648c8000000 [l2atomic32]
        0x40648c8000000-0x40648c9000000 [l2atomic17]
        0x40648c9000000-0x40648ca000000 [l2atomic18]
        0x40648ca000000-0x40648cb000000 [l2atomic19]
        0x40648cb000000-0x40648cc000000 [l2atomic20]
        0x40648cc000000-0x40648cd000000 [l2atomic21]
        0x40648cd000000-0x40648ce000000 [l2atomic22]
        0x40648ce000000-0x40648cf000000 [l2atomic23]
        0x40648cf000000-0x40648d0000000 [l2atomic24]
        0x40648d0000000-0x40648d1000000 [l2atomic9]
        0x40648d1000000-0x40648d2000000 [l2atomic10]
        0x40648d2000000-0x40648d3000000 [l2atomic11]
        0x40648d3000000-0x40648d4000000 [l2atomic12]
        0x40648d4000000-0x40648d5000000 [l2atomic13]
        0x40648d5000000-0x40648d6000000 [l2atomic14]
        0x40648d6000000-0x40648d7000000 [l2atomic15]
        0x40648d7000000-0x40648d8000000 [l2atomic16]
        0x40648d8000000-0x40648d9000000 [l2atomic1]
        0x40648d9000000-0x40648da000000 [l2atomic2]
        0x40648da000000-0x40648db000000 [l2atomic3]
        0x40648db000000-0x40648dc000000 [l2atomic4]
        0x40648dc000000-0x40648dd000000 [l2atomic5]
        0x40648dd000000-0x40648de000000 [l2atomic6]
        0x40648de000000-0x40648df000000 [l2atomic7]
        0x40648df000000-0x40648e0000000 [l2atomic8]
        0x5000001500000-0x5000001600000 [l1p]
        0x5000001600000-0x5000001700000 [l2]

There are three disjoint regions from which most normal processes will want to read/write data. The first is the segments from the executable, in this case:
        0x000001000000-0x000001058000   /gpfs/vesta-fs0/projects/llvm/hfinkel/asan/./stack-oob-frames
        0x000001100000-0x000001105000   /gpfs/vesta-fs0/projects/llvm/hfinkel/asan/./stack-oob-frames
the second is the heap and stack:
        0x001c07000000-0x001c07102000   [heap0]
        ...
        0x001c07ad0000-0x002000000000   [heap_and_stack]
and the third is the shared/persistent memory region:
        0x003245000000-0x003245010000   /dev/shm/unique-comm-agent-shmem-file

Obviously, most programs also access other regions (like the l2atomic regions), but that happens inside inline assembly or other libraries. For address sanitizer, only the loads and stores instrumented by the compiler matter.

Using the most aggressive setting that I can (a 32:1 ratio, because we have 32-byte-aligned stacks), I could not map one area large enough to cover the entire address space of interest. That would be (0x003245010000-0x000001000000)/32 ~ 200GB/32 ~ 6.3GB. Taking 6GB of space out of 16GB total just for the shadow region (which is not the only overhead) would, I think, greatly limit the usefulness of the system. Even worse, that size would not decrease when running with more processes per node. As a result, I'm currently allocating one shadow region, which is logically divided into three parts: one part for the low address space (to cover the executable image segments), one part for the stack and heap, and one part for the shared/persistent memory region. The stack/heap part is the largest, and so the total shadow region takes only 16GB/32 = 512MB, which seems reasonable. When not in c1 mode, the size of the shadow region shrinks accordingly.

> 
> I am not sure how the address sanitizer works internally, but would
> it be possible to grab the shadow memory from the front, by being
> the first caller of 'brk'?

No, because the dynamic loader seems to always grab stuff at the beginning of the heap area. Regardless, I'm currently mapping the region in the middle of the stack/heap area. Since it needs to be in there somewhere, that seems like the most out-of-the-way place.

> 
> Is the dynamic library requirement a statement of the llvm runtime
> build (e.g., missing "libanalyzer.a" library)? Or is the analyzer
> runtime effectively an "ELF interpreter"? Or some other requirement?

The address sanitizer runtime requires dynamic linking; I think it uses that to hook functions, and maybe for other reasons as well. I did experiment with removing the restriction and building it as a static library, but it would die in calls to dlsym, etc. (which is why I think it has something to do with function hooking). I can find out more if we'd like.

Thanks,
Hal

> 
> Thanks!
> Tom
> 
> Tom Gooding
> Senior Engineer / Blue Gene Kernels
> 507-253-0747 (internal: 553-0747)
> 
> 
> 
> 
> 
> 
> From: Hal Finkel <hfinkel at anl.gov>
> To: llvm-bgq-discuss at lists.alcf.anl.gov
> Date: 07/06/2013 04:49 PM
> Subject: [Llvm-bgq-discuss] address sanitizer
> Sent by: llvm-bgq-discuss-bounces at lists.alcf.anl.gov
> 
> Hello everyone,
> 
> One of my motivations for working on LLVM/Clang for the BG/Q was to
> enable use of the sanitizer debugging projects on the BG/Q. Over the
> last couple of weeks, I've taken initial steps toward that goal. For
> those of you who don't know, address sanitizer is a tool for
> detecting memory allocation and use errors: use-after-free,
> double-free, stack and heap overruns, etc. Because it works using
> instrumentation, address sanitizer generally has much lower
> overhead than tools like valgrind (which rely on processor
> emulation). This should make it feasible to use address sanitizer to
> debug memory misuse errors on the Q, including those which only show
> up at scale.
> 
> To use this feature, pass -fsanitize=address to the compiler (when
> compiling and linking). I highly recommend using at least -O1 (if
> not -O3) and -g as well.
> 
> Address sanitizer requires dynamic linking. When you provide
> -fsanitize=address, the wrapper script will automatically switch
> into non-static-linking mode. I've made a number of improvements to
> the wrapper scripts (both bgclang and the mpi scripts), and a small
> fix to Clang that applies to dynamic linking in C++11 mode, to
> better support dynamic linking. In short, hopefully this will now
> all *just work*.
> 
> Because of a limitation of the LLVM PowerPC backend (it cannot do
> dynamic stack realignment yet), the ability of the current build to
> detect stack overruns is limited. I'll make the necessary
> improvements in the PowerPC backend soon, and then the ability to
> detect stack overruns will be the same as on other platforms.
> 
> Some details on overheads: address sanitizer introduces runtime and
> memory overheads in several different ways. First, the runtime
> allocates a 'shadow' memory region which it uses to record state
> information on allocated memory regions. As I have it configured,
> this uses 1 byte of 'shadow' memory for every 32 bytes. The 'normal'
> upstream address sanitizer uses a simple mapping between addresses
> and 'shadow' bytes. Unfortunately, due to limitations imposed by CNK
> on virtual memory use and mapping, I've had to divide this shadow
> region into three distinct pieces (one for segments from the
> executable image, one for the heap/stack, and one for things in
> /dev/shm). Selecting between these regions introduces an extra
> penalty from the instrumentation. Nevertheless, the additional
> overhead does not seem too bad. Also, because of CNK restrictions,
> this shadow area needs to be allocated somewhere in the heap/stack
> region. I'm currently placing it in the middle, so if your
> application currently uses more than 8GB of stack in c1 mode, for
> example, then this configuration won't work for you. 'Red zones' are
> also allocated around every heap allocation and stack variable,
> further increasing the memory overhead. I've tried this on HACC, and
> the runtime slowdown on various stages was between 3x and 50x. If
> your code spends 95% of its time in dgemm, however, you'll probably
> not notice anything ;)
> 
> For those maintaining their own installs:
> As of yet, I've not bumped the version number of the install (I'll do
> that the next time I rebase). Nevertheless, there are obviously new
> parts of the patchset, build scripts, etc. You'll find a new archive
> (-v2) on the trac page https://trac.alcf.anl.gov/projects/llvm-bgq
> -- please note: do not checkout compiler-rt into the llvm/projects
> subdirectory as you would for a normal build (and as specified on
> the clang web page). The compiler-rt library needs to be
> cross-compiled using the bgclang-wrapped compiler. Just checkout
> compiler-rt into its own top-level directory, create an empty build
> directory for it, and use the build script in the archive (after
> adjusting paths as appropriate).
> 
> Happy bug hunting! (and please let me know if you encounter any
> problems).
> 
> -Hal
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> llvm-bgq-discuss mailing list
> llvm-bgq-discuss at lists.alcf.anl.gov
> https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
> 
> 
> 
> 
> _______________________________________________
> llvm-bgq-discuss mailing list
> llvm-bgq-discuss at lists.alcf.anl.gov
> https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

