[Llvm-bgq-discuss] clang on BGQ performance

Thomas Heller thom.heller at gmail.com
Tue Mar 25 09:29:28 CDT 2014


On 03/25/2014 02:57 PM, Biddiscombe, John A. wrote:
> Dear people
>
> I’ve been getting terrible performance from my application, which is
> intended to run on IO nodes, so I’ve been poking around to try to find
> out what might be wrong.
>
> Today I compiled the STREAM memory bandwidth test from
> http://www.cs.virginia.edu/stream/FTP/Code/
>
> I’ve run it with up to 60 OpenMP threads (because, for reasons I don’t
> understand, the IO node only shows 15*4 threads).
>
> The results for bgclang seem to echo what I’ve been finding with my
> code. I have not tested my code fully with gcc, as I only just got that
> installed.
>
> Any advice on what I might try to improve the bgclang numbers? In some
> cases gcc looks 2x better.
>
> Note that my program doesn’t use OpenMP, so I don’t directly care much
> about this particular example, but the trend mirrors what I’m seeing
> with HPX threads.

So the problems you are experiencing are not related to HPX after all?

>
> thanks
>
> JB
>
> using bgclang version 20140309
>
> export OMP_NUM_THREADS=1
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:             659.5     0.242635     0.242601     0.242724
> Scale:            536.2     0.298403     0.298376     0.298535
> Add:              828.5     0.289701     0.289669     0.289839
> Triad:            711.8     0.337206     0.337151     0.337325
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=2
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            1318.8     0.121335     0.121322     0.121360
> Scale:           1072.5     0.149223     0.149185     0.149375
> Add:             1657.2     0.144868     0.144823     0.145036
> Triad:           1423.8     0.168611     0.168565     0.168755
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=4
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            2636.4     0.060729     0.060688     0.060919
> Scale:           2236.9     0.071580     0.071529     0.071774
> Add:             3311.2     0.072555     0.072482     0.072750
> Triad:           2845.6     0.084426     0.084341     0.084540
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=8
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            5265.6     0.030446     0.030386     0.030614
> Scale:           4468.1     0.035848     0.035809     0.036030
> Add:             6611.9     0.036341     0.036298     0.036526
> Triad:           5684.9     0.042258     0.042217     0.042420
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=16
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            9390.8     0.018977     0.017038     0.025704
> Scale:           7688.2     0.021786     0.020811     0.029255
> Add:            11985.7     0.020990     0.020024     0.028394
> Triad:          10875.0     0.023131     0.022069     0.031470
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=32
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           15556.4     0.011463     0.010285     0.012906
> Scale:          13361.1     0.013228     0.011975     0.014883
> Add:            20438.0     0.012872     0.011743     0.014259
> Triad:          18047.8     0.014270     0.013298     0.016016
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=60
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           11472.0     0.016570     0.013947     0.022287
> Scale:          10145.1     0.019031     0.015771     0.028346
> Add:            15317.9     0.018322     0.015668     0.025756
> Triad:          14106.8     0.018959     0.017013     0.025986
> -------------------------------------------------------------
>
> using GCC 4.8.2
>
> export OMP_NUM_THREADS=1
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            3534.4     0.045289     0.045270     0.045306
> Scale:           1318.8     0.121390     0.121325     0.121632
> Add:             1899.0     0.126403     0.126384     0.126428
> Triad:           1910.3     0.125667     0.125637     0.125724
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=2
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:            7053.2     0.022716     0.022685     0.022744
> Scale:           2613.9     0.061247     0.061211     0.061278
> Add:             3794.3     0.063271     0.063252     0.063292
> Triad:           3794.4     0.063288     0.063251     0.063449
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=4
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           13999.4     0.011470     0.011429     0.011494
> Scale:           5218.5     0.030683     0.030660     0.030729
> Add:             7585.3     0.031647     0.031640     0.031681
> Triad:           7583.4     0.031663     0.031648     0.031690
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=8
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           25910.8     0.006205     0.006175     0.006233
> Scale:          10432.9     0.015373     0.015336     0.015484
> Add:            15130.5     0.015922     0.015862     0.016092
> Triad:          15116.2     0.015971     0.015877     0.016139
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=16
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           28433.5     0.005643     0.005627     0.005665
> Scale:          20547.1     0.007831     0.007787     0.007860
> Add:            27006.3     0.008922     0.008887     0.008948
> Triad:          27758.5     0.008658     0.008646     0.008672
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=32
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           28368.6     0.005673     0.005640     0.005742
> Scale:          26302.8     0.006115     0.006083     0.006175
> Add:            27164.4     0.008878     0.008835     0.008960
> Triad:          27691.3     0.008702     0.008667     0.008744
> -------------------------------------------------------------
>
> export OMP_NUM_THREADS=60
> -------------------------------------------------------------
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           25715.2     0.008484     0.006222     0.012176
> Scale:          22472.2     0.012979     0.007120     0.021724
> Add:            25319.6     0.014178     0.009479     0.023234
> Triad:          25591.9     0.013839     0.009378     0.023146
> -------------------------------------------------------------
>
> --
> John Biddiscombe,                        email:biddisco @.at.@ cscs.ch
> http://www.cscs.ch/
> CSCS, Swiss National Supercomputing Centre  | Tel:  +41 (91) 610.82.07
> Via Trevano 131, 6900 Lugano, Switzerland   | Fax:  +41 (91) 610.82.82
>
>
>
> _______________________________________________
> llvm-bgq-discuss mailing list
> llvm-bgq-discuss at lists.alcf.anl.gov
> https://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss
>
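
For anyone not familiar with the benchmark, the Triad rows quoted above time a loop of roughly the following shape, and the "Best Rate" column is the bytes moved divided by the minimum time. This is only a minimal sketch in C, not the actual stream.c source: the function names `triad` and `stream_rate_mb_s` are made up for illustration, and the 10,000,000-element array size mentioned in the comments is inferred from the quoted rates and times, not stated in the original mail.

```c
#include <stddef.h>

/* Triad kernel: a[i] = b[i] + scalar * c[i].
 * When compiled with OpenMP enabled, the iterations are divided among
 * OMP_NUM_THREADS threads; otherwise the pragma is ignored and the
 * loop runs serially. */
void triad(double *a, const double *b, const double *c,
           double scalar, size_t n)
{
    long i;
    #pragma omp parallel for
    for (i = 0; i < (long)n; ++i)
        a[i] = b[i] + scalar * c[i];
}

/* Bandwidth in MB/s (1 MB = 10^6 bytes) for a kernel that reads or
 * writes `arrays_touched` arrays of `n` doubles in `seconds`.
 * Copy and Scale touch 2 arrays; Add and Triad touch 3.
 * Assumption: with n = 10,000,000 this reproduces the quoted figures,
 * e.g. 2 arrays in 0.242601 s gives about 659.5 MB/s. */
double stream_rate_mb_s(size_t n, int arrays_touched, double seconds)
{
    double bytes = (double)arrays_touched * (double)sizeof(double) * (double)n;
    return 1.0e-6 * bytes / seconds;
}
```

Since the loop body is this simple, the code each compiler generates for it (and how each OpenMP runtime places its threads) is probably the first thing worth comparing, though that is a guess and not something the numbers above prove.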



-- 
Thomas Heller
Friedrich-Alexander-Universität Erlangen-Nürnberg
Department Informatik - Lehrstuhl Rechnerarchitektur
Martensstr. 3
91058 Erlangen
Tel.: 09131/85-27018
Fax:  09131/85-27912
Email: thomas.heller at cs.fau.de

