From a73f0924e470a013a41e38377e1995ac30542588 Mon Sep 17 00:00:00 2001
From: lilia
Date: Sun, 2 Feb 2014 02:09:18 +0100
Subject: [PATCH] 01-02-2014

---
 GMRES_Journal.tex |   8 ++++++++
 weak.pdf          | Bin 0 -> 7783 bytes
 2 files changed, 8 insertions(+)
 create mode 100644 weak.pdf

diff --git a/GMRES_Journal.tex b/GMRES_Journal.tex
index 2b077f1..198f8b5 100644
--- a/GMRES_Journal.tex
+++ b/GMRES_Journal.tex
@@ -935,6 +935,14 @@ torso3 & 57.469 s & 16.828 s & {\bf 3.415} & 926.588 s
 \end{center}
 \end{table}
+\begin{figure}
+\centering
+ \includegraphics[width=120mm,keepaspectratio]{weak}
+\caption{Weak scaling of the parallel GMRES algorithm on a GPU cluster.}
+\label{fig:09}
+\end{figure}
+
+\textcolor{red}{\bf Figure~\ref{fig:09} presents the weak scaling of four versions of the parallel GMRES algorithm on a GPU cluster. We fixed the size of each sub-matrix to 5 million rows per GPU computing node. We used matrices with five bands generated from the symmetric matrix thermal2. The figure shows that the parallel GMRES algorithm, whether in its naive version or using only the compression format for vectors or only the hypergraph partitioning, does not scale on a GPU cluster because of the large amount of communication between GPUs. In contrast, the version using both optimization techniques scales fairly well: its communication cost remains relatively constant regardless of the number of computing nodes in the cluster.}
 \textcolor{red}{\bf Finally, the parallel solution of a linear system is easy to optimize when the associated matrix is regular. Unfortunately, this is not the case in many real-world applications. When the matrix has an irregular structure, the amount of communication differs from one processor to another. Another important parameter is the matrix bandwidth, which has a huge influence on the amount of communication. In this work, we generated different kinds of matrices in order to analyze different difficulties.
 For the most difficult situation, a bandwidth as large as possible that involves communication between all processors, we proposed two heuristics. Unfortunately, no fast method optimizes the communication in every situation. For systems of nonlinear equations, there are different algorithms, but most of them consist in linearizing the system of equations. In this case, a linear system needs to be solved. The main benefit is that the matrix is the same at each step of the nonlinear solve, so the partitioning, which is a time-consuming step, is performed only once. }
diff --git a/weak.pdf b/weak.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..5be3b00544ef198d02171624e52ee3b9196c74ad
GIT binary patch
literal 7783
zcmb_Bc|26z`&5=jsU(&4W*ez-XCGwAHYi*6wHh-QW1BHEb}CD@7E#uSC=@E9MTroK z5|UDqR6-?6d0(a9ok83C{r!I5Ki)f^nK|b<&vu^WEYDT4G9;rgYIuZF%iH7v1OY$; z40msYmKJCR`FnD_03r;r2FY|^4#WaLvM+@L8A4PB4bss;usJM<;)e*yDR(qyKP4`! z9dr1QDSdWwW~eXA%!Bo&sv_Jw%;epnAq!|rP*>IW8;bl#@`Ki#FOre77nj?rwKC+? zSJPIfrB1USzIbQd*elg^xBF8mQ{QyLWlyN~8K*51foDx7K2OmKRy(^?m^CYJX?)%> zBJUE#pRaLWb%5Dk67r$w@%8{*WVZO`SY3w$(}URfHAX8lswBn!Zl&6@bK`dg!pHAilII9 zV>cz1@+}gm$0#hzdmAj6*Dl?s{7B~JCMqrY#^Fm1#wtz8CjRluvXgVgkELAvBC0a2 zEgq3{J(YRgH0HTU?EOMQ%E*n+T@#cb#7L72r|MG_A!xl{5OHfoHUI5*C0B2^Tos=r z$7W8R&t3j5AvIjxacDaIdB{l1H&Bvw?5$Tz>0cpb2!
zBGY5zkjmoWZI40*!tGNnZ}z7v4uuq#JobHW?^`2%k zb7*4?qy?s(NDFXQBnYMLZOcACtK%B^S48+;q|Shv(8T*&-GigE@BjYr={>ZxsCs+4 zmO^c-Lz{GYd6DiOr#-q_Ni!>6Oj*`x*;U;>0oh9&&E4hMH1niR^pMS6U**)|$g^2C zmCQ{Fn#zmTI?9<|IcQru#h&=&v4iM&!7eiiyX=~x!rN_AI}3c`kC0B2M{_J+rAS=1 zmRPr}O)FQRCf}LaDQ=q^+{0K(>S!Ka>t46@s0Ir_T^F2O5 zc33Zby!qjdm#T(tuK|VitI|;HV<$!0Q%w0?ci-X2uW&-Shc@FS3{3={Ool~eEnj*u zKJg)HZ;hd2#MViXhQ865prL5DJMyOrYE2R=DZV9N$BLlXwpSv>Frs%cboYuod}(VO8XVaeEP50`T@##Q^(JP7 zug_Hreq!le=nhVQO*4Jv|50F(kBlrP`C(sL`VPBrGl?_;p8tHfw}}3yd--<`i*1)X zAT@eS;kmi-zFSs9ad)nyUDatYF+UW&uDj--cfVj6q#ES&tw!#{n#O&r&WDW2Pc5;% z-b3kMGo3_nrCQhxeB0f_){Uj_5$^tG)#H+ET2yoS?{$R%<#7|Jj@rpIqm^sw4!;)O zKOS12XrH7=Bbgd7)J_~I}MJ$Ii$QaiI7Mw)_;(c zt*A60x@732tVYbM2kIh%*F$IYdJOYXQR^-t-WMtgA6{;-H9VvIeeH6o4v*!x^!P$K zyFVKVcRNIh$={AyZKK*#tS7ZCC6KZ&l{&o98Eu%+^tu3yj#t|ybf2~@Afb-soJe&pNvq{*s7j%AC*h8 zJNc@d`-Q8tZ>-9%$M1MRCoqyZHo5oBs~?(Qdg$NZ{zRs2FXxV%&rr~YX9I3|4|gDw ztjhvzHO>`69BboT|L@|=xJdhJvm3|oGaPnG|JnYLQbDhrjSaNE;UqCe+|&$?_SXC; z(&l4m!}WNrxD~HYp{^zp-%8!xk0blo#zrJQt$d&5)Qpc38ueH%#%fr+eHi;%LVNdO z?^2SxSNMew@A$@o{D}0%Y`-U!&l*Rgl3biFzH0as<{YK38 zHoptNE$+_2-7mQR7y=H9n1|ti?kcvSObE21ctW5#M59ym86kiR8Xm?G(Eyf!bwj{| zEXbb&;JEt^0ltZS8Z#A8vt_% zg?n&4p2Qg!=EJ-0u9KK7|c&4FlR6SsCk*H<>`bK@f*drSQyU2(hUw zI+Md-0UBHuJah0yU~|%0Y>t5!g$3ZSpc!Spjlp6-2Re=8#dbkpNnAm{esDM(!2SLL zLl6i=VBw4AwSNEo758VGNaW6#^oIlz77L&eb7OP3Jjk!|=Gurk89ce@pKY$j3%X#r zym%jVSmSvpPg^)Dxe%D{2R_{9O5(!cFE{-B;j<9k-xD^5zn(vv{-bXb=+5EBIyXFE zZ!W|f*Sx<3&L7t2buB1<(0_K5KAbLawxF;Wb$~>~0wg@fje8{XWW&BefO>2yH*;Vp z9P|{X2}Jkwn(uNQ%pK#BB0y6(R_Rnde@|ZsHX~@mf&A>@jKL#7b2^(1&xYq&?yM1@ z<2)MzP6Or&hXI^{DxeW)3|fFzpdCm7-9aiy10j$n4)g#$;DiM|K`)RF`hor+17!F^ zAQNPA^V%2k;LNpHT;)I($OhT;5RlEJP$7^5a)KFPAQ%J&gCSrj7zVKz8+kT?&z}$u zz;HeFYxHyb^VXa@q5gV6g(vo_4-6Jgv{(`Wr=#=Z>}qGf)Yhg~A`DyvF10I?M>F867=b&fW8muDwnAIzRF5hyE&R(CLTKDA`o z?nT7E>DAx-I@)RizU_^9$GB$y(B;dsobYEFGdZ;#LsmqpifBAl>ZFy##s`nby2`gW zJWOe{kZH6Sk}t!K6EY@?c1#vpBJ(F)@z099N3V=8LDN>WDC^&r63=Mdct%|;^|(WJ 
zUT)UubPZ%P7@W1GcdBFc3SC{@$9-M8vnl&k5tTSQEV5F?T7~_QC9Yasw>j2vQ`+M( zTGpMe>t&0a!pL#*jbW&bZf70=E~(}1_uAS}sT%QDTDAK7wO^;icXYkYsopP88OyPY zGE+O?K$B4I{`ft}WXA(DJO7=Id-SfiR*q^e^P#abo8s;Nl@&x+EmKOfsC zW-Y(XEI>ubdo#5Ct~}Ekj~OspuP&OSXfd{~10&Uvv%dRw^`Q6Fe&Pw@X(B1q;=1}y zg>h0}-uvg4FRx9o|JsGaDrcVY;7|*j$;p0~t1G7ByIY$N`*liGj9pujs~xrY!sT6? z@qv}jDHmD_WcCTpGW>Ea`=lP;R+_H+R(6n);LT1IHs@E-KoT-Blehf>SB&*FYWVDv zSiIJv!qX_^((p3rB!fe^dI66W-xnPKx@L4k&T%aB-#EDdCt{J!0$1)PkMnO3_J3n$ zUAkIki!suxI*!=`NPMnxPxInCJ5tp2I$E@8-_S~Cy_|mJ?q%uQ_7_=+*KJV99$Eg< zAj|dBw|Dy=$tJth+;>xbJCQUNlZc-1$lKxV8DtRls5G8cyZaG7ul$1T1<8sd`-3S~ zO%}Z_MFtvU=Zg0mq$fFOl*N@fZb$nGSqdw*9g2=ul&xTXIJqV{-d@XMV#k+>*CK&E zEtc=}D8;hgpQ4pldkx0r?c7|IdDJOIc;G|K zS9eLv>@6#{Y(89awTfIhnSEzp+E>*e_q7V;@|>)?AX;5U;+k>dtoctrsSuoNk2Jdrgxh z*5`7F!#Bt_69MI?j@CiLulvUplt0GAOj*}kl8>K0V7Er({g=iYvqt@=dL9*+He{4N zT)S(qO|0R%LO1yI=!J+~2XF&3dm1cHb%$B8%|99_KT6uvRe4KNH0St|GiB+dN64ZFnn|<{kv`!0HG9e`YB*MCdaKUi~sNX z-r-SB!iayXdAh{p4*SlcR=-CAiE?D(Qj_M=bucDpfD;s^Jx+0khG zmclnoqr3cqTlB67zw+PI=CB$EeeU~WBV8z7>~gNRS5zila)rzjhw#k5a6<=})=N%a z7|6cJ*FxoQ`shj*z9|%dfB4}<_jA58K=YaFtHO?%wN;h53Pnlk`FttZbA`Ol?t%Bc zRke2pMcQiQ?l&}_Uh8b@b{JSMD7(ZUzXNm>tCRR=Z5&b`IAXqk?r!co#n>CIa>)O=O&vx zds>onrpx&6+zmSV-7=*+R!-{qX|I+Z)c4Kj+I-V{1y-k;1ve7a*9k3a+b#I$OHku+ zq0=I7CIx)D3|0+YDwBG~7f|}1?%Cb^R#(F3dlAVbB$fAohy2g60uk4zH@4R*zL3J!pCCm^V@`v-~{E3_s?zIPFN2oVU}+cwSXkcao-kV03MD zl9QxfF8>W_;n7th;ZDI~CM$QTvq+IfZm|kMAL^HhemaBjF-t#T>GLE+rn>OFH&~4e zjS#O1G*3Beb+w2iV4eRqW|XRAd?~W#8~!8FZRF`{k<}@p+Uis11<8-X2Pad4GS&AV zl0O_^TIF>`N%%lTr2CPKnaa+MT}4ur&4&9DXC$KWIrsKpZ(qqRL=AkO5idRSxW-gL z!zl3Zq}9iOAO+Q;Wb<=v)hr&GtpBx72Xi)3`tlx~-yN{xHhUt;NKwA`z!x#=z(jp?X2 zZLk-6F82+(|JrG<`(W#vvPj=tRM>2A?>OhFc45Mi=MH;)$8+u`RQW%qfAvi4b#3T6 z;BKRM=hCO6gS0mXV^2OfRQjsDa3IaDcnB|Q@D07z`|j-m6Y)alyj02ZAQ`)levyvC z=g`KMA5?`*2fviwNH41sbCHV5ySHb>ktL=XCv@cm9^|w~8IaYa z-8UJgBMLP%4xewzd0`&ny2fto&v}krf}p=2TmEx_G$4X8n+-75RdHU7fzR zEqBU*`clAzQW?Kgq%r*!%E9DI{fkR`>kNZ;P@9Vfmvsusb?)9?R_bZ?K)>$I;o{~e 
z!OQF|ZtNLeo1DKDSt55>GOqun-e{XBut)Cmw35pG?9$b5$J`gI1`;b^N!|F?IJyvz=fK9e62T#&YeX)ty zkay_v>WH;XiXqcYPv4B!^^;GnKaqKqkaMoZ^^v|Ln<&v1Al@J=9R93v2WTSXT3ox^ z193g3UMk7~w`3hf@Z$_YIJBKGxVckg$5^<9{yY#n znwu@0hB0^&_Qz>hOE&~Qi2Fbsz#a0W`)eb|uhk&|I!zntKrly}GYudwx=9!dvI(=W zrH1)XHE2j(9fX!H1-|-2s1OPh<-scyvjO)F;MJKe-0%C<ikl_5x{{@9$kNoGk!)~0PycUe3X%63SJwgrP3X6^n1-?O&A(T5{3t%(> ztR_|yL($ zH0lpEnSm@{ULw$_Amj^iHDbfQ#lU*fsG1%OmLG*1#!Mz$4xn)0=my~!MCxe4gB&`? z7t-OSnVv6auHj4Prj4dA#otpK8G@og9+W^|4pJu=qWGvmO!l1led#)L4IUl%lNa2g zawCGlnqy}J1uXE@@umCHDSy!O#<}!AsNgnjzElM(`%f+lk-`m^|Bcyi4E{sB@Idg& z0*4OxW8Kn00RPhS?=kwhSi@U@HZqvI6#rv0_@zR!u(YWBrgXthnp#+yf8^1RMeI z0OnyBbv#@;`^RF#o{AXmHC^zvIC+ zf{USl!0?!V>Y_pT2OdU4{U0za2@e+%f9FRasQ;o1hed%)W-RzcE&$rl!{F*T9ssQw z@G8nHj02#lzXt>0)zjd5pdsL*Wr!nb5C|j#f&q@KM>a%bjPPU}38zjpBI%>ahIpO- bAH$7XA!lBpd;aP*O6oBqRO@)n7hE literal 0 HcmV?d00001 -- 2.39.5