[Llvm-bgq-discuss] clang on BGQ performance

Hal Finkel hfinkel at anl.gov
Tue Mar 25 11:34:00 CDT 2014


----- Original Message -----
> From: "John A. Biddiscombe" <biddisco at cscs.ch>
> To: llvm-bgq-discuss at lists.alcf.anl.gov
> Sent: Tuesday, March 25, 2014 8:57:56 AM
> Subject: [Llvm-bgq-discuss] clang on BGQ performance
> 
> Dear people
> 
> 
> 
> I’ve been getting terrible performance from my application, which is
> intended to run on I/O nodes, so I’ve been poking around to try to
> find out what might be wrong.
> 
> 
> 
> Today I compiled the simple STREAM memory bandwidth test from
> http://www.cs.virginia.edu/stream/FTP/Code/

Can you please provide details on exactly what you did? What compile flags did you use, and did you define TUNED?

 -Hal

> 
> I’ve run it with up to 60 OpenMP threads (because, for reasons I
> don’t understand, the I/O node only shows 15*4 threads).
> 
> 
> 
> The results for bgclang seem to echo what I’ve been finding with my
> code. I have not fully tested my code with gcc, as I only got it
> installed recently.
> 
> 
> 
> Any advice on what I might try to improve the bgclang numbers? In
> some cases gcc looks 2x better.
> 
> 
> 
> Note that my program doesn’t use OpenMP, so I don’t directly care
> much about this particular example, but the trend mirrors what I’m
> seeing with HPX threads.
> 
> 
> 
> thanks
> 
> 
> 
> JB
> 
> 
> 
> Using bgclang version 20140309:
> 
> 
> 
> export OMP_NUM_THREADS=1
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:             659.5     0.242635     0.242601     0.242724
> Scale:            536.2     0.298403     0.298376     0.298535
> Add:              828.5     0.289701     0.289669     0.289839
> Triad:            711.8     0.337206     0.337151     0.337325
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=2
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            1318.8     0.121335     0.121322     0.121360
> Scale:           1072.5     0.149223     0.149185     0.149375
> Add:             1657.2     0.144868     0.144823     0.145036
> Triad:           1423.8     0.168611     0.168565     0.168755
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=4
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            2636.4     0.060729     0.060688     0.060919
> Scale:           2236.9     0.071580     0.071529     0.071774
> Add:             3311.2     0.072555     0.072482     0.072750
> Triad:           2845.6     0.084426     0.084341     0.084540
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=8
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            5265.6     0.030446     0.030386     0.030614
> Scale:           4468.1     0.035848     0.035809     0.036030
> Add:             6611.9     0.036341     0.036298     0.036526
> Triad:           5684.9     0.042258     0.042217     0.042420
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=16
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            9390.8     0.018977     0.017038     0.025704
> Scale:           7688.2     0.021786     0.020811     0.029255
> Add:            11985.7     0.020990     0.020024     0.028394
> Triad:          10875.0     0.023131     0.022069     0.031470
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=32
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           15556.4     0.011463     0.010285     0.012906
> Scale:          13361.1     0.013228     0.011975     0.014883
> Add:            20438.0     0.012872     0.011743     0.014259
> Triad:          18047.8     0.014270     0.013298     0.016016
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=60
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           11472.0     0.016570     0.013947     0.022287
> Scale:          10145.1     0.019031     0.015771     0.028346
> Add:            15317.9     0.018322     0.015668     0.025756
> Triad:          14106.8     0.018959     0.017013     0.025986
> -------------------------------------------------------------
> 
> 
> 
> Using GCC 4.8.2:
> 
> export OMP_NUM_THREADS=1
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            3534.4     0.045289     0.045270     0.045306
> Scale:           1318.8     0.121390     0.121325     0.121632
> Add:             1899.0     0.126403     0.126384     0.126428
> Triad:           1910.3     0.125667     0.125637     0.125724
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=2
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            7053.2     0.022716     0.022685     0.022744
> Scale:           2613.9     0.061247     0.061211     0.061278
> Add:             3794.3     0.063271     0.063252     0.063292
> Triad:           3794.4     0.063288     0.063251     0.063449
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=4
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           13999.4     0.011470     0.011429     0.011494
> Scale:           5218.5     0.030683     0.030660     0.030729
> Add:             7585.3     0.031647     0.031640     0.031681
> Triad:           7583.4     0.031663     0.031648     0.031690
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=8
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           25910.8     0.006205     0.006175     0.006233
> Scale:          10432.9     0.015373     0.015336     0.015484
> Add:            15130.5     0.015922     0.015862     0.016092
> Triad:          15116.2     0.015971     0.015877     0.016139
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=16
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           28433.5     0.005643     0.005627     0.005665
> Scale:          20547.1     0.007831     0.007787     0.007860
> Add:            27006.3     0.008922     0.008887     0.008948
> Triad:          27758.5     0.008658     0.008646     0.008672
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=32
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           28368.6     0.005673     0.005640     0.005742
> Scale:          26302.8     0.006115     0.006083     0.006175
> Add:            27164.4     0.008878     0.008835     0.008960
> Triad:          27691.3     0.008702     0.008667     0.008744
> -------------------------------------------------------------
> 
> export OMP_NUM_THREADS=60
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           25715.2     0.008484     0.006222     0.012176
> Scale:          22472.2     0.012979     0.007120     0.021724
> Add:            25319.6     0.014178     0.009479     0.023234
> Triad:          25591.9     0.013839     0.009378     0.023146
> -------------------------------------------------------------
> 
> 
> --
> 
> John Biddiscombe, email:biddisco @.at.@ cscs.ch
> 
> http://www.cscs.ch/
> 
> CSCS, Swiss National Supercomputing Centre | Tel: +41 (91) 610.82.07
> 
> Via Trevano 131, 6900 Lugano, Switzerland | Fax: +41 (91) 610.82.82
> 
> 
> _______________________________________________
> llvm-bgq-discuss mailing list
> llvm-bgq-discuss at lists.alcf.anl.gov
> https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
> 
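
For anyone comparing the two sets of numbers quoted above, here is a minimal Python sketch that computes the GCC-to-bgclang ratio of the "Best Rate MB/s" figures for each kernel and thread count. The rates are copied from the quoted STREAM output; the names used (BGCLANG, GCC, rate_ratios, format_ratio_table) are illustrative assumptions and are not part of STREAM or of either compiler.

```python
# Best Rate (MB/s) values copied from the quoted STREAM output.
# Tuple order matches KERNELS: Copy, Scale, Add, Triad.
# All names here are hypothetical helpers, not part of STREAM.
KERNELS = ("Copy", "Scale", "Add", "Triad")

BGCLANG = {  # bgclang version 20140309, keyed by OMP_NUM_THREADS
    1: (659.5, 536.2, 828.5, 711.8),
    2: (1318.8, 1072.5, 1657.2, 1423.8),
    4: (2636.4, 2236.9, 3311.2, 2845.6),
    8: (5265.6, 4468.1, 6611.9, 5684.9),
    16: (9390.8, 7688.2, 11985.7, 10875.0),
    32: (15556.4, 13361.1, 20438.0, 18047.8),
    60: (11472.0, 10145.1, 15317.9, 14106.8),
}

GCC = {  # GCC 4.8.2, keyed by OMP_NUM_THREADS
    1: (3534.4, 1318.8, 1899.0, 1910.3),
    2: (7053.2, 2613.9, 3794.3, 3794.4),
    4: (13999.4, 5218.5, 7585.3, 7583.4),
    8: (25910.8, 10432.9, 15130.5, 15116.2),
    16: (28433.5, 20547.1, 27006.3, 27758.5),
    32: (28368.6, 26302.8, 27164.4, 27691.3),
    60: (25715.2, 22472.2, 25319.6, 25591.9),
}


def rate_ratios(numerator, denominator):
    """Return {threads: per-kernel ratios} of numerator over denominator."""
    return {
        threads: tuple(
            n / d for n, d in zip(numerator[threads], denominator[threads])
        )
        for threads in sorted(numerator)
    }


def format_ratio_table(ratios):
    """Render the ratios as a fixed-width text table, one row per thread count."""
    header = f"{'Threads':>7}" + "".join(f"{name:>8}" for name in KERNELS)
    rows = [
        f"{threads:>7}" + "".join(f"{value:>8.2f}" for value in values)
        for threads, values in ratios.items()
    ]
    return "\n".join([header, *rows])


if __name__ == "__main__":
    # Values above 1.0 mean the GCC build reported the higher rate.
    print(format_ratio_table(rate_ratios(GCC, BGCLANG)))
```

On these numbers the gap is largest for Copy at low thread counts (a bit over 5x) and narrows to roughly 1.3x to 2x at 32 threads.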

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

