[Llvm-bgq-discuss] clang on BGQ performance

Biddiscombe, John A. biddisco at cscs.ch
Tue Mar 25 12:24:48 CDT 2014


Hal,

Interesting. (One clarification: my own code doesn't use OpenMP at all, so my own slow application must just be suffering from poor thread scheduling/placement/contention.)

JB

> -----Original Message-----
> From: Hal Finkel [mailto:hfinkel at anl.gov]
> Sent: 25 March 2014 18:03
> To: Biddiscombe, John A.
> Cc: llvm-bgq-discuss at lists.alcf.anl.gov
> Subject: Re: [Llvm-bgq-discuss] clang on BGQ performance
> 
> John,
> 
> Thanks for looking into this (and providing a useful benchmark)! You'll find
> this interesting:
> 
> bgclang -O3 -fopenmp with 1 thread:
> 
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:             635.7     0.251708     0.251708     0.251709
> Scale:            519.7     0.307855     0.307855     0.307856
> Add:              802.0     0.299267     0.299266     0.299267
> Triad:            753.4     0.318716     0.318566     0.318735
> 
> gcc 4.7.2 -O3 -fopenmp with 1 thread:
> 
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            2067.4     0.077393     0.077392     0.077395
> Scale:           1329.4     0.120353     0.120353     0.120354
> Add:             1943.5     0.123490     0.123489     0.123490
> Triad:           1872.4     0.128179     0.128178     0.128179
> 
> gcc without OpenMP is actually slightly worse, go figure ;)
> 
> bgclang -O3 (no -fopenmp) with 1 thread:
> 
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           15660.2     0.010296     0.010217     0.010870
> Scale:           5523.7     0.028967     0.028966     0.028967
> Add:             6283.2     0.038198     0.038197     0.038198
> Triad:           6331.9     0.037906     0.037903     0.037920
> 
> bgxlc_r -O3 -qsmp=omp with 1 thread:
> 
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            3762.0     0.042535     0.042531     0.042538
> Scale:           5083.5     0.031481     0.031474     0.031494
> Add:             7394.2     0.032487     0.032458     0.032510
> Triad:           7397.6     0.032481     0.032443     0.032499
> 
> bgxlc_r -O3 (no -qsmp=omp) with 1 thread:
> 
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            3574.1     0.044768     0.044767     0.044769
> Scale:           3301.2     0.048468     0.048467     0.048469
> Add:             4233.2     0.056696     0.056694     0.056699
> Triad:           4350.1     0.055173     0.055171     0.055177
> 
> All of these were built with TUNED defined (just because that puts the kernels
> into separate functions). It seems that the OpenMP outlining in Clang/LLVM is seriously
> interfering with the ability of the vectorizer and instruction scheduler to do
> useful work. I assume that most of this is because of pointer aliasing
> information being lost in the OpenMP transformation. We'll need to work on
> this! (I'm actually in the middle of working on a new pointer aliasing
> framework for LLVM, and I'll be able to use that to solve a lot of these
> issues).
> 
>  -Hal
> 
> ----- Original Message -----
> > From: "John A. Biddiscombe" <biddisco at cscs.ch>
> > To: "Hal Finkel" <hfinkel at anl.gov>
> > Cc: llvm-bgq-discuss at lists.alcf.anl.gov
> > Sent: Tuesday, March 25, 2014 11:44:50 AM
> > Subject: RE: [Llvm-bgq-discuss] clang on BGQ performance
> >
> > > Can you please provide details on exactly what you did? What compile
> > > flags did you use, did you define TUNED?
> >
> > I edited the Makefile to skip the Fortran and set the bgclang vars:
> >
> > bbpbgas040:~/bgas/clang/build/stream$ cat Makefile
> >
> > CC = bgclang
> > CFLAGS = -O3 -fopenmp -L/gpfs/bbp.cscs.ch/home/biddisco/apps/clang/bgclang/omp/lib/
> >
> > all:  stream_c.exe
> >
> > stream_c.exe: stream.c
> >         $(CC) $(CFLAGS) stream.c -o stream_c.exe
> >
> > clean:
> >         rm -f stream_c.exe *.o
> >
> >
> > Then just a make. I didn't set any other vars (like TUNED, etc.).
> >
> >
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

