From f69f75dd5354fe34ad3b1360af061a8f1aebb9aa Mon Sep 17 00:00:00 2001
From: Raphael Couturier
Date: Thu, 20 Sep 2012 20:29:18 +0200
Subject: [PATCH] ajout

---
 BookGPU/Chapters/chapter1/biblio.bib               |   35 +
 BookGPU/Chapters/chapter1/ch1.tex                  |  275 ++
 .../low_latency_vs_high_throughput.pdf             |  Bin 0 -> 20474 bytes
 .../low_latency_vs_high_throughput.svg             | 1164 +++++++
 .../chapter1/figures/memory_hierarchy.pdf          |  Bin 0 -> 15256 bytes
 .../chapter1/figures/memory_hierarchy.svg          | 2926 +++++++++++++++++
 .../chapter1/figures/nb_cores_CPU_GPU.pdf          |  Bin 0 -> 21529 bytes
 .../chapter1/figures/nb_cores_CPU_GPU.svg          |  649 ++++
 .../Chapters/chapter1/figures/scalability.pdf      |  Bin 0 -> 19107 bytes
 .../Chapters/chapter1/figures/scalability.svg      | 1061 ++++++
 BookGPU/Chapters/chapter2/ch2.tex                  |   45 +
 aa                                                 |    0
 plan.txt                                           |  112 +
 13 files changed, 6267 insertions(+)
 create mode 100644 BookGPU/Chapters/chapter1/biblio.bib
 create mode 100755 BookGPU/Chapters/chapter1/ch1.tex
 create mode 100644 BookGPU/Chapters/chapter1/figures/low_latency_vs_high_throughput.pdf
 create mode 100644 BookGPU/Chapters/chapter1/figures/low_latency_vs_high_throughput.svg
 create mode 100644 BookGPU/Chapters/chapter1/figures/memory_hierarchy.pdf
 create mode 100644 BookGPU/Chapters/chapter1/figures/memory_hierarchy.svg
 create mode 100644 BookGPU/Chapters/chapter1/figures/nb_cores_CPU_GPU.pdf
 create mode 100644 BookGPU/Chapters/chapter1/figures/nb_cores_CPU_GPU.svg
 create mode 100644 BookGPU/Chapters/chapter1/figures/scalability.pdf
 create mode 100644 BookGPU/Chapters/chapter1/figures/scalability.svg
 create mode 100755 BookGPU/Chapters/chapter2/ch2.tex
 delete mode 100644 aa
 create mode 100644 plan.txt

diff --git a/BookGPU/Chapters/chapter1/biblio.bib b/BookGPU/Chapters/chapter1/biblio.bib
new file mode 100644
index 0000000..f3b0ccc
--- /dev/null
+++ b/BookGPU/Chapters/chapter1/biblio.bib
@@ -0,0 +1,35 @@
+@misc{ch1:cuda,
+  author = {{NVIDIA Corporation}},
+  keywords = {CUDA},
+  note = {Version 4.0},
+  title = {{NVIDIA CUDA C} Programming Guide},
+  year = 2011
+}
+
+@Article{ch1:Buck:2004:BGS,
+  author = "I. Buck and T. Foley and D. Horn and J.
+    Sugerman and K. Fatahalian and M. Houston and P.
+    Hanrahan",
+  title = "{Brook} for {GPUs}: stream computing on graphics
+    hardware",
+  journal = "ACM Transactions on Graphics",
+  volume = "23",
+  number = "3",
+  pages = "777--786",
+  month = aug,
+  year = "2004",
+}
+
+
+@article{ch1:CMR:12,
+  author = "B. Cloutier and B. K. Muite and P. Rigge",
+  title = "A comparison of CPU and GPU performance for Fourier pseudospectral simulations of the Navier-Stokes, Cubic Nonlinear Schrödinger and Sine Gordon Equations",
+  journal = "Computational Physics (physics.comp-ph)",
+  year = "2012",
+  archivePrefix = "arXiv",
+  eprint = "1206.3215",
+  primaryClass = "physics.comp-ph",
+}
+
+
+
diff --git a/BookGPU/Chapters/chapter1/ch1.tex b/BookGPU/Chapters/chapter1/ch1.tex
new file mode 100755
index 0000000..9a3f4eb
--- /dev/null
+++ b/BookGPU/Chapters/chapter1/ch1.tex
@@ -0,0 +1,275 @@
+\chapterauthor{Raphaël Couturier}{Femto-ST Institute, University of Franche-Comte}
+
+
+\chapter{Presentation of the GPU architecture and of the CUDA environment}
+\label{chapter1}
+
+\section{Introduction}\label{ch1:intro}
+
+This chapter introduces the Graphics Processing Unit (GPU) architecture and all
+the concepts needed to understand how GPUs work and can be used to speed up the
+execution of some algorithms. First of all, this chapter gives a brief history of
+the development of graphics cards up to the point where they could be used for general
+purpose computation.
+
+
+
+\section{Brief history of video cards}
+
+Video cards, or graphics cards, were introduced in personal computers to
+produce high quality graphics faster than a classical Central Processing Unit
+(CPU) and to relieve the CPU of this task. In general, display tasks are very
+repetitive and very specific. Hence, some manufacturers have produced more and
+more sophisticated video cards, providing 2D acceleration, then 3D acceleration,
+then some lighting transforms. Video cards have their own memory to perform their
+computations. For at least two decades, every personal computer has had a video
+card, which is simple for desktop computers or which provides many accelerations
+for game and/or graphics oriented computers. In the latter case, graphics cards
+may be more expensive than the CPU.
+
+After 2000, video cards allowed arithmetic operations to be applied simultaneously
+to a sequence of pixels, later also called stream processing. In this case,
+the information of the pixels (color, location and other information) is combined
+in order to produce a pixel color that can be displayed on a
+screen. Simultaneous computations are provided by shaders which calculate
+rendering effects on graphics hardware with a high degree of flexibility. These
+shaders handle the stream data with pipelines.
+
+
+Some researchers tried to apply those operations to other data, representing
+something different from pixels, and consequently this resulted in the first
+uses of video cards for performing general purpose computation. The programming
+model was not easy to use at all and was very dependent on the hardware
+constraints. More precisely, it consisted in using either DirectX or OpenGL
+functions providing an interface to some classical operations for video
+processing (memory transfers, texture manipulation, ...). Floating point
+operations were most of the time unimaginable. Obviously, when something went
+wrong, programmers had no way (and no tools) to detect it.
+
+\section{GPGPU}
+
+In order to benefit from the computing power of more recent video cards, CUDA
+was first proposed in 2007 by NVIDIA. It unifies the programming model for some
+of their most powerful video cards. CUDA~\cite{ch1:cuda} has quickly been
+considered by the scientific community as a great advance for general purpose
+graphics processing unit (GPGPU) computing. Of course, other programming models
+have been proposed. The other well-known alternative is OpenCL, which aims at
+proposing an alternative to CUDA and which is multi-platform and portable. This
+is a great advantage since it is even possible to execute OpenCL programs on
+traditional CPUs. The main drawback is that it is less tightly coupled with the
+hardware and consequently sometimes provides less efficient programs. Moreover,
+CUDA benefits from more mature compilation and optimization procedures. Other
+less known environments have been proposed, but most of them have been discontinued;
+for example, we can cite FireStream by ATI, which is no longer maintained and has been
+replaced by OpenCL, and BrookGPU by Stanford University~\cite{ch1:Buck:2004:BGS}.
+Another environment based on pragmas (insertion of pragma directives inside the
+code to help the compiler generate efficient code) is called OpenACC. For a
+comparison with OpenCL, interested readers may refer to~\cite{ch1:CMR:12}.
+
+
+
+\section{Architecture of current GPUs}
+
+The architecture of current GPUs is constantly evolving. Nevertheless some trends
+remain true through this evolution. The processing units composing a GPU are far
+simpler than a traditional CPU, but it is much easier to integrate many
+computing units inside a GPU card than many cores inside a CPU. This is due to
+the fact that the cores of a GPU are simpler than the cores of a CPU. In 2012, the most
+powerful GPUs have more than 500 cores and the most powerful CPUs have 8
+cores. Figure~\ref{ch1:fig:comparison_cpu_gpu} shows the number of cores inside
+a CPU and inside a GPU. In fact, in a current NVIDIA GPU, there are
+multiprocessors which have 32 cores (for example on Fermi cards). The core clock
+of a CPU is generally around 3GHz and that of a GPU is about 1.5GHz. Although the
+core clock of GPU cores is slower, the number of cores inside a GPU provides
+more computational power. This measure is commonly represented by the number of
+floating point operations per second. Nowadays the most powerful GPUs provide more
+than 1TFlops, i.e. $10^{12}$ floating point operations per second. Nevertheless
+GPUs are very efficient at performing some operations but not all kinds of
+operations. They are very efficient at executing repetitive work in which only the
+data change. It is important to keep in mind that multiprocessors inside a GPU
+have 32 cores. Later we will see that these 32 cores need to do the same work to
+get maximum performance.
+
+\begin{figure}[b!]
+\centerline{\includegraphics[]{Chapters/chapter1/figures/nb_cores_CPU_GPU.pdf}}
+\caption[Comparison of number of cores in a CPU and in a GPU]{Comparison of number of cores in a CPU and in a GPU.}
+\label{ch1:fig:comparison_cpu_gpu}
+\end{figure}
+
+On the most powerful GPU cards, called Fermi, multiprocessors are called streaming
+multiprocessors (SM). Each SM contains 32 cores and is able to perform 32
+floating point or integer operations on 32-bit numbers per clock or 16 floating
+point operations on 64-bit numbers per clock. SMs have their own registers, execution
+pipelines and caches. On the Fermi architecture, there are 64KB of shared memory + L1
+cache and 32,768 32-bit registers per SM. More precisely, the programmer can
+decide what amount of shared memory and L1 cache an SM can use. The constraint is
+that the sum of both amounts is less than or equal to 64KB.
+
+Threads are used to benefit from the large number of cores of a GPU. Those
+threads are different from traditional threads for CPU. In
+chapter~\ref{chapter2}, some examples of GPU programming will explain the
+details of the GPU threads. However, threads are gathered into groups of 32
+threads, called ``warps''. Those warps are important when designing an algorithm
+for GPU.
+
+
+Another big difference between a CPU and a GPU is memory latency. In a CPU,
+everything is optimized to obtain a low latency architecture. This is possible
+through the use of cache memories. Moreover, CPUs nowadays perform many
+performance optimizations such as speculative execution which, roughly speaking,
+consists in executing a small part of code in advance even if this work later
+turns out to be useless. In contrast, GPUs do not have low latency memory. In
+comparison, GPUs have very small cache memories. Nevertheless the architecture of GPUs is optimized for throughput computation and it takes the memory latency into account.
+
+
+
+\begin{figure}[b!]
+\centerline{\includegraphics[scale=0.7]{Chapters/chapter1/figures/low_latency_vs_high_throughput.pdf}}
+\caption[Comparison of low latency of CPU and high throughput of GPU]{Comparison of low latency of CPU and high throughput of GPU.}
+\label{ch1:fig:latency_throughput}
+\end{figure}
+
+Figure~\ref{ch1:fig:latency_throughput} illustrates the main difference in
+memory latency between a CPU and a GPU. In a CPU, tasks ``ti'' are executed one
+by one with a short memory latency to get the data to process. After some tasks,
+there is a context switch that allows the CPU to run concurrent applications
+and/or multi-threaded applications. Memory latencies are longer in a GPU, so
+the principle to obtain a high throughput is to have many tasks to
+compute. Later we will see that those tasks are called threads with CUDA. With
+this principle, as soon as a task is finished the next one is ready to be
+executed, while the wait for the data of the previous task is overlapped by
+the computation of other tasks.
+
+
+
+\section{Kinds of parallelism}
+
+Many kinds of parallelism are available according to the type of hardware.
+Roughly speaking, there are three classes of parallelism: instruction-level
+parallelism, data parallelism and task parallelism.
+
+Instruction-level parallelism consists in re-ordering some instructions in order
+to execute some of them in parallel without changing the result of the code.
+In modern CPUs, instruction pipelines allow the processor to execute instructions
+faster. With a pipeline a processor can execute multiple instructions
+simultaneously due to the fact that the output of a task is the input of the
+next one.
+
+Data parallelism consists in executing the same program with different data on
+different computing units. Of course, no dependency should exist between the
+data. For example, it is easy to parallelize loops without dependency using the
+data parallelism paradigm. This paradigm is linked with the Single Instruction
+Multiple Data (SIMD) architecture. This is the kind of parallelism provided by
+GPUs.
+
+Task parallelism is the common parallelism achieved on clusters, grids and
+high performance architectures where different tasks are executed by different
+computing units.
+
+\section{CUDA Multithreading}
+
+The data parallelism of CUDA is more precisely based on the Single Instruction
+Multiple Thread (SIMT) model. This is due to the fact that the programmer accesses
+the cores through threads. In the CUDA model, all cores
+execute the same set of instructions but with different data. This model has
+similarities with the vector programming model proposed for vector machines through
+the 1970s into the 90s, notably the various Cray platforms. On the CUDA
+architecture, the performance is driven by the use of a huge number of threads
+(from thousands up to millions). The particularity of the model is that there
+is no context switching as in CPUs and each thread has its own registers. In practice, threads are executed by SMs and are gathered into groups of 32 threads. Those groups are called ``warps''. Each SM switches between ``active warps'', while warps waiting for data become temporarily inactive (as shown in Figure~\ref{ch1:fig:latency_throughput}).
+
+The key to scalability in the CUDA model is the use of a huge number of threads.
+In practice threads are not only gathered in warps but also in thread blocks. A
+thread block is executed by only one SM and it cannot migrate. The typical size of
+a thread block is a power of two (for example: 64, 128, 256 or 512).
+
+
+
+In this case, without changing anything inside a CUDA code, it is possible to
+run it on a small CUDA device or on the most powerful Tesla CUDA cards.
+Blocks are executed in any order depending on the number of SMs available. So
+the programmer must design the code with this issue in mind. This
+independence between thread blocks provides the scalability of CUDA codes.
+
+\begin{figure}[b!]
+\centerline{\includegraphics[scale=0.65]{Chapters/chapter1/figures/scalability.pdf}}
+\caption[Scalability of GPU]{Scalability of GPU.}
+\label{ch1:fig:scalability}
+\end{figure}
+
+
+A kernel is a function which contains a block of instructions that are executed by
+the threads of a GPU. When the problem considered is a two-dimensional or three-dimensional
+problem, it is possible to group thread blocks into a grid. In practice, the
+number of thread blocks and the size of the thread blocks are given as parameters to
+each kernel. Figure~\ref{ch1:fig:scalability} illustrates an example of a
+kernel composed of 8 thread blocks. Then this kernel is executed on a small
+device containing only 2 SMs. So in this case, blocks are executed 2 by 2 in
+any order. If the kernel is executed on a larger CUDA device containing 4 SMs,
+blocks are executed 4 by 4 simultaneously. The execution should be
+approximately twice as fast in the latter case. Of course, that depends on other
+parameters that will be described later.
+
+Thread blocks provide a way to cooperate in the sense that threads of the same
+block cooperatively load and store blocks of memory they all
+use. Synchronizations of threads in the same block are possible (but not between
+threads of different blocks). Threads of the same block can also share results in
+order to compute a single result. In chapter~\ref{chapter2}, some examples will
+illustrate that.
+
+
+\section{Memory hierarchy}
+
+The memory hierarchy of GPUs is different from that of CPUs. In practice,
+there are registers, local memory, shared memory, cache memory and global memory.
+
+As previously mentioned, each thread can access its own registers. It is
+important to keep in mind that the number of registers per block is limited. On
+recent cards, this number is limited to 64Kb per SM. Access to registers is
+very fast, so when possible it is a good idea to use them.
+
+Likewise, each thread can access local memory, which in practice is much slower than
+registers. In practice, local memory is automatically used by the compiler when
+all the registers are occupied. So the best idea is to optimize the use of
+registers even if this implies reducing the number of threads per block.
+
+Shared memory allows cooperation between threads of the same block. This kind
+of memory is fast but it needs to be manipulated manually and its size is
+limited. It is accessible during the execution of a kernel. So the principle is
+to fill the shared memory at the start of the kernel with global data that are
+used very frequently, then threads can access it for their computation. They
+can obviously change the content of this shared memory either by computation
+or by loading other data, and they can store its content in the global memory. So
+shared memory can be seen as a manually managed cache memory. This obviously
+requires an effort from the programmer.
+
+On recent cards, the programmer may decide what amount of cache memory and
+shared memory is attributed to a kernel. The cache memory is an L1 cache which is
+directly managed by the GPU. Sometimes, this cache gives very efficient
+results and sometimes the use of shared memory is a better solution.
+
+\begin{figure}[b!]
+\centerline{\includegraphics[scale=0.60]{Chapters/chapter1/figures/memory_hierarchy.pdf}}
+\caption[Memory hierarchy of a GPU]{Memory hierarchy of a GPU.}
+\label{ch1:fig:memory_hierarchy}
+\end{figure}
+
+
+Figure~\ref{ch1:fig:memory_hierarchy} illustrates the memory hierarchy of a
+GPU. Threads are represented at the top of the figure. They can access their
+own registers and their local memory. Threads of the same block can access
+the shared memory of this block. The cache memory is not represented here but it
+is local to an SM. Then each block can access the global memory of the
+GPU.
+
+
+%%http://people.maths.ox.ac.uk/gilesm/pp10/lec2_2x2.pdf
+%%https://people.maths.ox.ac.uk/erban/papers/paperCUDA.pdf
+%%http://forum.wttsnxt.com/my_forum/viewtopic.php?f=5&t=9519
+%%http://www.cs.nyu.edu/manycores/cuda_many_cores.pdf
+%%http://www.cc.gatech.edu/~vetter/keeneland/tutorial-2011-04-14/02-cuda-overview.pdf
+%%http://people.maths.ox.ac.uk/~gilesm/cuda/
+
+
+\putbib[Chapters/chapter1/biblio]
+
diff --git a/BookGPU/Chapters/chapter1/figures/low_latency_vs_high_throughput.pdf b/BookGPU/Chapters/chapter1/figures/low_latency_vs_high_throughput.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..6ecca71b49452ba73251850cca419dfc7846bff7
GIT binary patch
literal 20474
diff --git a/BookGPU/Chapters/chapter1/figures/low_latency_vs_high_throughput.svg b/BookGPU/Chapters/chapter1/figures/low_latency_vs_high_throughput.svg
new file mode 100644
index 0000000..8daddbd
--- /dev/null
+++ b/BookGPU/Chapters/chapter1/figures/low_latency_vs_high_throughput.svg
@@ -0,0 +1,1164 @@
+[SVG figure. Text labels: Processing part; Waiting for data; Context switching; CPU: optimized for low latency; GPU: optimized for high throughput; Data ready to be processed; t1, t2, t3, t4, ti]
diff --git a/BookGPU/Chapters/chapter1/figures/memory_hierarchy.pdf b/BookGPU/Chapters/chapter1/figures/memory_hierarchy.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..88448ae82c7db8a2895562298e3da76a20816e3c
GIT binary patch
literal 15256
diff --git a/BookGPU/Chapters/chapter1/figures/memory_hierarchy.svg b/BookGPU/Chapters/chapter1/figures/memory_hierarchy.svg
new file mode 100644
--- /dev/null
+++ b/BookGPU/Chapters/chapter1/figures/memory_hierarchy.svg
@@ -0,0 +1,2926 @@
+[SVG figure. Text labels: three groups, each with eight "Registers" boxes, eight "Local var" boxes and one "Shared memory" box; one "Global memory" box]
diff --git a/BookGPU/Chapters/chapter1/figures/nb_cores_CPU_GPU.pdf b/BookGPU/Chapters/chapter1/figures/nb_cores_CPU_GPU.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..977117ce84681b6cd9ddfa6c4c96822343f835a5
GIT binary patch
literal 21529
diff --git a/BookGPU/Chapters/chapter1/figures/nb_cores_CPU_GPU.svg b/BookGPU/Chapters/chapter1/figures/nb_cores_CPU_GPU.svg
new file mode 100644
--- /dev/null
+++ b/BookGPU/Chapters/chapter1/figures/nb_cores_CPU_GPU.svg
@@ -0,0 +1,649 @@
+[SVG figure. Text labels: CPU with Core 1 to Core 8; GPU with Multiprocessor 1, 2, 3, 4, 15 and 16, each labeled "32 cores"]
diff --git a/BookGPU/Chapters/chapter1/figures/scalability.pdf b/BookGPU/Chapters/chapter1/figures/scalability.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..28398a07d1f5b4926d8e2c446d0ff4bae878c318
GIT binary patch
literal 19107
zzE^h^ZEqF?0{%kY!ww&n^G}a!z&qwod7h`Y0xlJh?<#pmY!5AZM18PNZ$;L`~S;4|RU85#V` z|1s>phX2m#MD1*y|5tBQ8hRRfe3t)V=zpF5J^x1;>;F}TPp9l*Z-P%JZ(wHfUy9`o z98La>!|*?HbTTH!76yWL?)X~&pnn6>Gtx3~{7cOE%uGzQtn~j9E56Qu$qN0m;GeJ) z{(nM+|Beg99PM1}|Cd$&t?j?i|8Yy{pAC)%wodl{(QM@LKd{h0T-e0b!pKBXOz=P8 z|8Y~%#L3RZ(a6LJ|36cv_`kej`&a+}1SstP*O33e0OdcE`rnZZ13m)>J;Q%F@!tR= zJ_8d6BisK%nDAMcnEqce&2^WFe702^Yj{`JVtbdCJ@m~D^Uci-T3{D45MW>z*TD8R zP9S*J&L_|5wpcHo64P&KDMh*2wpGogrl;DXii}8U^(e9BjX6}ZqqDt{p^>RbX(5su zU}n1p7pHi7JK#H@BJB!*h!{M8KTiMX*zDxw>?ok;9xx`*aP_wKHvi}gfQua&zCY|8 zIt2gv$o#^{Fm5rroAZNfLu<3+=O&>30TjIhKsy(EN7jZ$mnPseU~VjJD}d7NFA(vi z{^%400$T%{O9N{g2qO@RaE5?*p{uKZUxGFuEx%fR6!={ByB!cXw3SnSP*zI>4#Xz* z-qFF?ks+)zlfyk&JI4>ac6<<6(;p^&O$jAM6$JrB1)O}!iqcw2s#;nKK>3uER0TEQ z^>QtZE+B0jnk{Y_ZGQkZc~dk})Be8v>Fwz)0DO|Od0=7a$lZ#_t4M4z!SX_Q|a4&fvzk$xqtPKDGnEZYLiOvr7 ztncGd1IRf|JAs@5zeT~lFR6zbkBfFB2 zBoTe{I@*V;dn%fA@ueS`XkK;jx*?VW&2{80IuzZir7 zR9skOT#tTOnh3FwbK-#y-*`nsqhOYJ85+Nu6r2bV5gkQAIc)vd=J#nKg+*B9UH}`vVbDNt zd^f;*eKo+3zosdUZB0td4dC}bz+|>22SAfQzxvjOb~dNqew*6RZ*G5@BV%b_NL_!m z=q^oxGN}~6rRMY-eohNos*+pK=d{;1rl+9v4Gh6EPk!gj+0%h_uBvhTCu3_@{BGfq zv8Z-`7Xnq6r&a*ZE}-^IKeZr1YWn?(e!7(W%mxT;4X+R0URlN_pXw0QlVz%LwtjlM z)PEa(;<~PO8v`NDQF{P9tEZ}~t?`BCX8m|Ezu*}qz?sCHto~V{Xb&u{tu5V+s4gIX zrL7I1$2w=53ETs~Pbc3U^@SeQ$Iie0r3I$Y4#dNsHqa-> zPyK6o(kBN%ZTO3}CXhJXN5TsbwZ%uN3J^8Or;DGn!N)0YeDRygAGppJir?l3WDHJX z@@)r+(tFKE2@ti-2PAIt3lAp#DhGuh{;NnL-~WpejAMLu-}cluCM-QQ554zy7%Y>7 z)D_M)o(vD%KQ3$N*H8BG_dLY<iB7A=ks$6|OPMSS(@k+qY9; zO!JZswQa9F0CxjEm|h>>!XG^xdy#e`c}inri_B-e+_v<#I0CM8YEPuSB&sJa)@6w} z!SSQ~{J^!tEzQ+k!AlH-B(h0Vjj*=V6^b+exPdk)ppN`&YN|qI;~GgTvfRueoWi-d zpLCMut~Se5Kz>vLp~&`!g4Xq=kYiR+jQ1WIc$L*LU4(VY7mOwQ%*={nd~=~V;#vsu z@m4eI)vY-WQ$|r$Yz<~;=m|2O{4ew{E#cwLPQ6J0=L~h}E?gH;XeWiw%=mi{7?*F% zirB%aIf`KzEd(@>u}%0^9Ug5uF4v_AZPjpk%*e43*uZ`h)}-m=P%4pk5y_X@{*O-XyLiySYvU^~?=d#>Z5Fv5#}OOF)-x>uB%EXhM}<0m;B<=T>zTr>23~`287SF^s zCL6Aae~X#$P%RI|a8fZNB&RLm>_ks!^PnX*92j|^8~Bfdip?%I%r^P9l}w}>++rG> 
zn42~URg+nAr{p&DCUWiJD)qWbVL2ZF9@Aa3p7W_FY^v^TS;Dhy4B^Jx`1`??9fKCq zLl3kO^{Cb%`}w_vdx(9U@i?umh_4HW_zqI zALQ6jN8n&7RyFZs6;DW5HVe8Niu=r5hEtKJrGsRAno2I@YfJ8c6K@;%dlTcjj|< zUYb<#cGYx|Z+`NNl#XXcj;ROnBSNRSyB_b`vHP~gS60Ky@j6|0DkJNQ$=|~ZSNo?q zQvq;nfEZht=PL1;d@N59Idl*JCa9`I3>iLlwG?E|BP?!Mf5KNqGGJISQjY}NIn1XA z%#mtLXEqOMClPKC`iIO7O25+_ro<9G^`W#@xJ3n{A|_=>`7mef3gaVbaG)m{Is$-ja+G1DC= z|J2*sEG1t5Nn7@fqO1}W(5j$}AyG^-LrahvF}KJb49f?aZW|fj)SrCKmh@9oQ8|kK z9o^XWKd{pbAA)V#je%%iVO}L@7=ZV0rr^n@#|y)Zb==L#Izl&nd!R?%6F(If*mqww zPB7navN`gZ(Ttv%Z`q}hWad-SOi;x7>_$$(zntye?D87RS*m=tVLw!9W^A`L{oz8T zM@Y+GIp6a7`@oHmL1!EJ+nAhJ^-?G;3_GWfBvu0&SI33L3{fA@_oNoV@>DJ#tt@yl z5+l2}OYQR3EtJGRdffa3IjyyT^HT+3=2IEws zd(VL8eeNtE07_C_Zr0_bLfUAp12QRRZutW7akSQ|#YQ~gROWr6pAv}WeKz#EaAqa7 zZdpX^eD(5~zUoE%s24RC=A$OfQFGWdtU^m84aUkt+(p+3G03EQ&r6vN0ri>-sQ^NJ zba4pWqm_f7bz%~fZ_(-Li>gcGJrAnHRz5C-6_*bKud87G zixtx})?Urnt~BdN59vwL{KV7Wyws+k^1`onh|-9lk?1ciGK;KoXWLyV&4<;}C>;9v zAa>-VB{Vn0(Hen4D}ub_f0gFTC<7J@=(kZ%RYtc9Sswf9kSpUJbgP>GVBa7;q-=Snq}2ou|>%)YFO>>txj%|Cuebj9d+skyVal(O@5_%m z%qULW<=4ZQ1(m2VX#rSHMFly|`6&H;+c0PS1zQA6LD|TIgo_!J4mr-zHlWGRzhF=?+z^OyXn3c&+S29KC-Y}Qe z`(LjXlGWT}>jBC@ZL^!?rYAJo%AA?xGvmQasry)9;2qoy&QCeSZjsh-?yV;arBcHv zg?`r2i#N*10LS8=t86KGZKvKe2YYkZLr#gE*_ZPAg}`+(YN6!=dbt!D?mdGT_IH=H zCl`@Ld!@*A^e*Y&Rw2XsdMSs8@TX79oCqOf9h= zOI7U!6VxmafTENODU{n#hZ7(>Gr&zKF3#4nHH^#Y8G$|0!0XlVwb)@YLeaxfHOlot zq;GY&?d&oE9ZXU!|8tqlGi9<>@XIav?GaD4s^lC4Z$CV0?6mQa+h*%J#{?3jm7Nu^ zrxs3~Cxk9J=7xETlXcr!@--;xpHQW)6sJ2VMlM8qJWt^(5=LH`#8mW zNc5gG_s|X{A(AYD#PhBR*qz@yX_y!;o3Z3pu97U2OtqmKvc_;-KwKh1O^5Yct{k|0 zhuZ51sz^e70)~Y4ET_kFQVdFtMNPM-*iZDWB;VoZmM=cfwNC9Vt5iEb(fKIU8QL00 zPj|`8?^nH_+FzvwK|ta=y)vUuEPo|kv*EZpS_;nOQl%i;%p#H?gY?TAF~mXb$uME3 zl0>nXGV42G@OnPYn*jm~{w`<{8=~=3QMy@O$p=U9xQb4q7R96M*V3#+MI8X$a*U*gKA;I#w#4b1p3nkaL(anlwD zh5@|r$>fr&#&>}-qZvGo;=}s^w8;2Es;G1tI@ExlGqn-+W5nXjaun1+o1He}hP2J& zTbbYf@^4=r3rkb7(Zc9)bv$Qv;XWxTPv%M8ZCT&8)$K#5&n9Di->K=e*!gOtJviDf zG#!4etgbfcL_a``-S`q@31b!NqJ>kB*6jcZ7wTA259IaP@%eXH0=j&!X_fIfLo>Dw 
zfbFu)bPxWEy{IQ|2NBl)EM_ak=382oSM#hUyT;UtJYz1UQ>tFP28U^ODjzjxTC^H9 zNhAnYpKnPK+9HznMdN*Y22EhEb!~QiDGzYOw?WucEARt?j_HHDH?+kN2ywbDqTLs0 z_C8a(*pq!#>49;Sch757TiT`MUY{o4zutmB&(}1bT}qE3XhBJGr|&O{YX<$y36+Fi zkMp8}dc`bhfSu{5vagFX^(k_J@fUU-9|Cd^U?oJ*uObv3Mai`7!s>RQcJ0W7E6Pp6^UV5H4U z6ZcUFKrJu+D#W^Dst8{x6j+i$8zFY(;&8B!1FxRDN`EJkVVAYZ@<_#IY(_Top1bl2ypGY<`+!FURYDS4kXMj`Nr_(nl)kJT~U>e>nV0)lA+Kv_S#>Wqt5Jg_| zFAH7maJ{MXxgtq%?&H&&00Fw@K8f8<*xWM?tYgMVW@YT=Srzah#4M~V_g~JtoaY{l zE;)mA41|mTeX@tZIL!c%ns=Ki;`S1Z$FSCUu^QA`tYz?JcLegDLVH+Ctl>Jn1{xip zDDepks;1fKmhxb3?e4`#3D@O)g052xLn2T@}FOR%0LB+M0ng3Gbhbkb+ zPC(aiJbOA*a67)wLAnM3goU$zaRSAFV(Gf3$qfDoavMG4J}V65pAoPW8q8nW8OpYC zykxg3yiLz{895W(P+?MIbMZ)XGm)vo;8B}N?xcr|TWT}%eX~~CWE9)SSrB7GzrBJ( z-!;Nx5bO@;5$)ClEXXC&sCTUz$3~1ZcSgU$0UiTJPOs-KVy>(vSgCu%-9SyIXS3oE zVtD}JioiCQ*8I#*`H)2iea#K+yANcr(;Vagoj7Jv>D$dX@je*E`+$%bGaG>p8#?ox z*tG{IAmk~=2uDroGGrMoq~Pa#sbx0wAhLcY8~jxnd|N9p?W%*W5UMA}6Vg?qjL0O3--f(*!RTQzUnTiSPGFDTfz4q~`hZ$is@tbk1I9|f8u;5xG;VvTGFbwLhdM;N zDLB%pIbE{;XG*==kLy%Jo3gtid$-D{oYzS9;5N}?rObv-nu6D_h158~sELJ8_#0f4lLvvT>RXj{fR&habN94%p9j_9x?j zA}&Gn0hL`|8Ua^zFD-^SlT-~$#1cBdX@XO509Yau{En^xzVeVex)b z3SFg*f@FJW8D;W1ZKDTMun_4L$vFJ@x)?i=YC+=6&PcBub=6cB9cZM-$#HmVlP!*|oA zW<<=BcF&X!=7J`s$C$qD6@@4YEzFH4wSy&*b%sRc7MO45@A(5OVIyDZn8RI{W$!UH z?en|n)2N;)$;11V{56xD-yumw(8RInD8gdhH=ULmLrjH;BUkwXPIIL^WJ4csn(S^2 z5wdXYG594CUG?m2g0K;_g$N;?s}V~rg#NrWCNYtGIFCKh6`{LMv?$=|rclwUSGywh z+I`_k+EQ3|M5OLMS86O2 zGpdshO64#9T1Dix8}mYyZ<_64+HZQbSjR;Sghx~sVG-D;ac_Wk!;B*=TQk_1mTC~} zluk#+mI#vFuaXVD^uuGlA2!;JYC7DqU8Keu)OzIfxu4xeI;X)w`$(kLob7Zby?`$B1ciXdobJ zE+@>gjdq{TLUf<6^js&qpm2u_F1aZP=|y7tMhah?k4umM!2Aw@Y~gw$kw&J4Hw&D! zkP!*BiSk5Xq%*cov$&Xvl)1)ifFyO~6afU9VU3cSUx8^&11biKpaSL>_k7d0WDx4n z@Mbj-RxOfIeww_79(q**YHiwbI~7&T9@byD+(&R$n~m^sQbgTexX;h&aN2GVR<&p! 
zC+Me^IVTbp<$qn}V;3cP1Y2N-lv+G~W}&fQ-sWiCcMX4Znw5?Ud;I$z8K} zq}6P_a+lPsVG1;s_+vjrLc^MPq!Q4jc9Zl&W#8Z2qvqZ`{XgH(`G4<;o=18{(TEK}=~n~SW+ zsvR^1yYg_Vf*~}oWwe?^Lz4hnUf*DDNWbJxT0&P17C3h9%L7xP9uA$Zbb*8%YD;X7 zne{7ll-R4(3U@sLCVZy4AbT-n%%I4A2hqH9{CN4i+Qx< z@B!L_SXnqBAlFLMSt2mS?Zz%Ou0B%<4(*g0gIIio@y0UFTye;cO(O$xgtS({AazAv zEZuE{tHPwbKmVpahlw~XXKxvCR$WHISp+6o1gDUY;p)PnD{e+C_|C*(z|>JE^J<@% z!@tQaM-E6imEtedtvQwj4?~q6#h=&NITojF!5r3W!xz*nM7uFEwG1$Y%ukvw&B{Qq zlnbEyRgyNKrpg;9b@64ZefQ~6KRz~^w7+Zqb*(<`4Zzwe%;|?a{c+7N;VkGqb8}Yz zvblCe2$=l=)0l9ev>vF@_g~!r4x>37YmlQVAwdT(T26Ne11UnLqo^*4GI{0hxU&%) zd`At8nGGH?9^Q{V;%?SiT%=~v=wP94R)C@(sIAU0OSNOaMyeGXq>C^}Qm_BP*&;Oo zopAPI*FEmEUWMpPOP`m+YD`c88fr4S4frxUc32)rDMU8k>wrZT&1__A3AM!sBP=r?!G)}@J%&(eye(RdVHNilqw2(^Z)X zH}&d58`qv3f07@EtL>Dq4s(Y=s>u8aqb>t&Y4H`PE{p0ThgKs1G0^T>BlG)TTJ)L1{3dIz&b@*t_#fJBi41m zsI|iO;XA0dIOIGvQ5YE)IGGC^JG$PxMNKdSDp^bK^ka_A@)Isn@7xP05~o%^$;_t) z%WwWk9$LOJRHdM!Gz^SVh0IP7H}WVFhX*Bke@wuf29{KgjilV~58x+|1n45#`a`aN zF}~G`rOi?XLJT7@Y?x|RDV`NXa&oB8oPpXDBdB3|)0Qw%`v6SrAkztsR_LeId3~XD^6FYuRbFaf?nUF`Jy9Nv^FoRWAuP)AbN< z^8Tk!KtU3nRv_W^H=Mte3uF)_po6OIV1XZ@ag;jOERg0rn@vTVF{-72T#Zng(hTdR zN3&*qa7CRww|8)fdios~2VY>}z`OZc*s^pzFUI-b!b83l^W17NfH?X6XJM@0HxN{n zXvrRY!|kzFeA3@e$SH}XXx+*SDECy$m5LsQ%) zR>QO}M~gAzLR{Kq3sZ51}36nI{FW zd&c5}&#GY8y8Zn-VuL2DLUbo6bgJi zZ$u?;ly!vGe!hH^wh=mrF{ivxf4lgKuY20rGox9O@o7P4TJps%I9{M_k@}co=*Ka# z)WOs(*z@~nspQu|c_!h=d%YboOn zQUlp^po;Sa%>ZY(@rh(3d$3G1C)A@VcjVBFWP6ZKF+^CL(wP;9s_5`UK@}F;TxY_| z)^(xgA}k8?N05*5inWY$6Z{=?k_vyZJ75A9kv%|}acHuSjWDizeCpvhB-Wqr>p|IW zgWt$};#mD`DI($(S`sRHeaNS!9wS$7{Z6i7alYtH=T2Rx*`cQV7egCZ()IJg3<4rk zSnW2Vi23uC#$*UFnbD9|o9m7XrZjs0Jc@w3=dE~{o_qsZGqI|4WAV;%SWVqqH-a3Q zyUn72n`r@vnwGvskey+Ig_|%0LDylcmNqU*eeYmJt3tAW*rUQKKlAh#AeE2+$K&%s zRSo);tDIAPD$w-{MRRU6s3EVHZ-6D6GflC=(6KP2v87t}x)51p<*{=!;S&qUX5K>4 zyvLKkpH=@vH|#6RmeJY#G4OBZ-Z((ad_f;B)_oLY?uN1gWvkPcTyb+jee$Eh2&m)W z%Vm_B#bbe4p+Wev5HX8E+aRwv;~&nuO|g65@Gopb()L##oZv&UK#O_mg5j#~059J- zocf`VMn1WsfZy|2iXx;Ik2#X)NO`bG<6dupm)TSzf|{C)pXl3UgzO(Nc$E-boi~Gh 
z0L6IfT=2N*#$5^6+GtT@nY4bxJ^o_+D)DEMt^D-w-&v5(hiR}QwqI3s+49Lc?IOYX zf_aY%aJ`fCYy2$>t-#>_sl5KUuR7(VQDTmG$cY&MYgSETGD5&Q@f z=v99<70*$=Dx1EyboFdZ8G0ih6b)3U!Q9cXMkJ4lf{0?Q^@ zj)1(i*CoQo&dgSg=kD@IAI>7)+C_a6=L70@tBF)zdrB(zR>BcA)vqKE6fXQ{xM5up zDVd;!?!nPejOte+^#S)4Xy5NAGUQfO8n}8%p`qwfqXwNy!dvE~%QA_Vhq{G4uMJts=sNwjSe2R? z%wyI%=H0Vb6*f(~gu$&+tyA}B=(_EAVe-;q8pRoiL)fo@#W}X%ThCa6Gq+oBB%-fO zh3bvg{aEm~Qv>jg#Jrom3#4Wz+62ikcgJ!d(>HB7VCB1#r^x=;tjw5BapaD{q0cNi zXju!oMV^E%N%_lFt|nafp3~uMAIx5}l*)i!ci085T>0Z{juCGmnY0qY;GgnrRKvDR zFDL|D^yw&OP0c(pQDF@l6jz5^6DV6WDJ#3EyR6nlgNWf)Cgv^-&6y2ZSN1G9+^R!6 zy=B!Qph&g23?NsnFf{~=?zc67kaN{#NsPR#iWj)ytoNEEYsu16%(+rwaK&ThU=N*? zGyK@l^}`7d~wj?I~jh zFv+1vwQ(lop)&MiL84+(~TUfKbqupx1+`=WFqjf+~}h2D%XdP8m(dP8I= z(Mj#Zht^W6NxtwthaU3RmrCx9RSidNA7MU25Yve%vyg8A-&qyCDook9i^)6fm%1GI z3rAnmF;bzgPI{FRxhab*?iVw1tCsJ2Lh`1pjY}BZu`9bqwHI$Kyo!4LzJCvOhyf^O zpxwKf3GHn&vu&+tmF}VqESCqQxzRuK>t}Fcupr5?Z8hN|xR8t^W?$nv;y&wigNc@> zVu)4V)ahyElt!Sck+o-HYqzUyE$W8?nayC*t=9)N$H3NSH>0jyhQ0y0RIkmtcQ!UM zrsn-`7DYy`x&p2EI0OfWHdccMo`rw<+2aIX@hoIuF=wOsBXfdB$(04>gT2bKtIxH=s$Xr*siFQp(qn zK7bR6?N`_?Nd6sFNm-*&P@vy+>-FvG4zIY!J=nPaJsK8t$*b^CrJw5)gV`azM4fl( ztTit!Ul|-EkZ`+UG!L~+!1+oL@aT(XoN3-@E|iT8{2;iVft&B zDTh?^>_TK$13vT}J0>B~9{+$VDli!CB=Xn+fvpHICdBZHK&IFe) zHtSkfZJ4pic_98LzMFr=X=Oq+Tt3qgvJaG>r{#W;zFW+uF9=*aWb^J24OdJ(OmWnO zmq2f9Nij@pu|Z+47cy<(#G)Uezx~nk(8RW^Z+l#Jwns+KIMo`L@VrvjqpO~77!wKYOlCPCrnW4d=Zf_Wp~-W6#VNWFmXPqJD%ne0Xy^rY(knu4)gPV5bp&b+wSW7k^z{ zzraAk1}rwdK;W%ZnrUlG)s>c>lnhsl4=qZl)HSjh*olG!13HM%}sS z+QoUXnq95XUV`5sIg9&q3%24O^uz>Nuv;D^2J&C^_~b!6l4|-ritCyNw7Hya$4ot3 zuVF$&!Fwf>w3y7(C=6B*M(Na@3<^7~Wdds^>8eB~RPhzjjIGz*K!e-_n*9CMGdnrk7?quaV%2_2J+>JHoj?9O~5}kp{&)h=8YoC^m;43 z*P|DFHgQ3&=2enUJ+Xg>0OGc`9#~q6KldKtWcOe%mKHE&%(Dl+5m->LnRw1gye=X* zK~2YId)_c_E-}a^vEgm@6fy`)g!fa|w!wAigoQUWno{Gw?+-MooGK0JfbG8>lg)nq zpd^b^MXS;>L@E09E(9^Ml|8;*PjRWH(IP2>J%JNN-&YLGc)n-)cD>`iKsrTwlS>;I z@>QI~1=mZJO&r<4vx?ioft)+{{h3XK#7$G=tPGv2W2cTrMrS@84tLq{J*qpRapS;( 
z^DKOgN{iEKpj%`i7D^-d`Iv+99B1`K^OD{0?Z15?h|q&!es7U9w|*B{M@*SWtMV&4 z@(+NF;_POU9YROF55nmvs1g2+mZhcNEpJ*s>oBKlDU{VT z=0i^TOfc4R5@(i+(;>ZXh|zjB4mcf)fXjg)rF=@D-E!YDCgj2-x-(Ps-K1{kC-V+( zHUU1r(qV3X98Iud6-7Z<&A2bO9%4r=8zb=^X3Xb1bowMORyO#<|!z2 zo{Js@i$1Hp7&q6Z%AIgdmtSDR+mVXcl$_{ZI34!| zT?r9`MERw~`Gj-fN<7}1pabRwY2{VCcvUs=*-SXedj}CRj#(Yqd?yHej`<$zE!qdC z92gFE^j@u+4wt${;b=PG&MsW`dhGOq7inxiFU0PFso18z;6xiOJ!uZ{Yk~6#g<40= zTrgX)XqPpcA?heGI6)w=Xn3fBPqpj7M)HhdLrV#E(SkP^m)r$pg-uR-d;-K0w$eDB zKR_4!+uc|dRU9C@gv?`-mc|8hN8!mn_%=q|_m-N= zcB&rNj<5hidy>A_Fxw#A*cnhF?ep|Bb3~M3?9(+Q3`p04ZZ3dBO}AD#OUxD)oO?)T z(}g#eV-wXgzJh74LjX|i!7s0x5Wwa{qjl3=!>xB zWz@C*hg&);mS@)v>EgzjkIzbpv3SR5Uw;I|QD?VePlwn}lod}Wl@zO+gXRM7A<77A z?*}8JrC@bf*iiKPg8?j+CHVug6;Jg5x`4<&nXNgHw7qiH!S`gPysgb?nIrUf8r=T# zQ5{@!r#&}G_qLkB=5(|;&=3fqQXs~Z{=eWwcKbg zbbt~)qx%w>!+J9~I)$_u*lc2_)jq~63~3x@h}q@w{>g8Wxcsi9x)98@JD@AC>j`I+p{UAy`nTI*f<^u5OWMkB8Ic4TK+s zO?H3?4pEi{x+yNPEq(XmDpl&#=J^@*H&0%FlDLP!;Q&i@Mm%_=Ao|%5beoZE3EE*; z+Y%_ko~1TFfkIT5n8%0ZJtmee_yqDqD;cs?DtqtwMSG!#C zdb^~-CjgH1mk0f3h<4Rx=H+I&wLM;2l{~kf+*^tZ!54-AK9bKEQP~@Y%{l(y*Uwj@ z=XeO-D#7)FHV{vv!4 zk$~r`0!&Js^Wc&Dz-Dr`WM)3*sG}*?7mt64?rNoGJCv+XXDi!I99!Q$44?GU-+emR z@ES$+=-rdu(hxWvdl<+YslFkSp?CR#??*k*K?Ke1)2YTM7CkI8;zbEq%gM!qT2$Qs z=1q+C?<_=}dNocx2Xto3zp|?$j`7hKTx#Ppcln6h0WV9D}T~0?X zeF9^+1SYS2@$He^kO)S;qErmKAmG0BhmZcKn9x35Oo5q|*TaD6xtaYsKr9VQp(V)4 zChQGMk$y8sN^Ou$#A2ss1!73eO5^G>{~Ihj7N6E`QF`~Dd9Sz9O%LwX8YQGr+g0=} zw3eWd8#XU0jgU3IbGPn~H6f4fj;p0scErQRN&lqYw%Z{iVP5)J)cJ%wlI$Ciz)>x_ z+Nmht4-n3b3`c2oqLn(Xq55mg({v+~?F%KIx*cpuYLh+QKG$Yn)I3K|9idng8oUO& zatNskEv-QUh`EE*E2Zht^|`9ZS0?Pw$S$2jgr0%&eTUdEL7%xlb^|8X6UjG~Df{s) z>Z+;xTdTV=e*v_%vRy_Mn5nnEX{$s&h&IG*@}@7Jo&|mwkHou88WJnDYZ6)Swes%0 zU?rqf;X{1r5z|qITML#5-yJ0Y%L*A<`46M2oHw^#k(JP~X*W6E(VKD0;_UW2&Wah| z5-W-{ZZ8kDRQ|-NO)nR{I{h6C0l%7?JvFQIyGr(dS*~46Uz}VlT*-Tbd-JoJvOpf2 zlD(QT6rrq-olHQr!zq4h#uujq(?}b0yxx>v4_*@QEu!fVeFx>!)k+8v3;jLT~Y77tSGkcQ(cy*k+8(xpf4$YnfSromS(P;7V3PE<5qG+QysxrDZ{$mE 
z1m&UDqV8WNg3tv7GP?o|Y5y_#@Pvx%n!I`bVm5WA%sXh&mYTFuJ1w7*s~lm%J^_Qpq` zJM9xXl~D5w)JlE@2t8+OPKDHJf(_V%lPZgMS>h}CMT2Fh>&oOP3pY>$O8-OZS?hf^ z;@jcmYu`8!Xi`%B{kQGK!<&)Sc>8J=MdxIO=sP}N?*D1z+=H6Bwm7V9MGXRqis%J7 z7B!#(Imt_qX<_+%ERuHfV`$NJu{hPuf5iOthM*~C-eO>of7}i`9i}V^>0Y^Daol? zT4&NEzNmA14-Rj*dniStYCM$3kQ3cniflVCoNHM5ZI>yF@V`3){rLl~#3cvYotXc& zTiViW)MTxE%o1}S-SG)ANm7or=uS14U%zV_za;d`J*B1lE3Ng%7cD$D7zbIBSe6u# zn!L(_uqUv>y0-PVOxWe>c*kKJnZNY@;*rvfA^xsY>}{x_BLK}r2Fer|F7 zf;47V=DnQOjx>7J;N`$kd0m_8>P41((L7a*OlZ!&`}; zk{jk`pElYU{I)5fdr#N(eKiGbokp(L1eYi-Yf7CPYs-3mK5V9Ox|?K5=-EmsS)?`d z9kn_r>W~a~wcI;bee@MMZtTLUytEE)?UOBYvYzN}X~jYA&%7JU(UFs+mUUzM zu~qgJot00q6-DrY>V`K-5;)3zpTm{k}fBc2jTf<|BnM2mF`4Mt-$l zwkYmpFZaeLH38hRsMObX%CiJ3bzf&XwfsO(L7!!`ujYw-XH}~NTODHuR6~u-J9`sm+}3Y+vIFNoyQN4kJ=6RW~J;cGBMlZ;IG;#-qGs6dBhe| zd?^2IIT~v@V%hAkeHeZF$_JmN1Yh5z$t+)dth7*ge_n@%T?Q5JJ7%8FB?g-Q^#45F zgQnZ2N#7N%k#Q6r%f(6MDJ@)Bq&#i@iz+HPyT7B4U0F%x5m;k*E!~VMUcm`bMq408dtNfIJyx}{;odz)lkbv37X~>=L93ZNLAcEi6P4fxJA_e;F~k;u8=N)1F!!P*wb&HIB)$ zX94@y=ATm~7e&?uKHJKEv-za!uVy(8C%1Ne8J*c_j=!8koWL1G(-ZyZklm?z^(g%V6G)nnDT zX*)Xxz~>%9%)rQd3fNL0BV*_mvH3CxdFSe3V9*_48Vos*DKrR%85D>?qm0AnJ{Vgh z4hzP`ZF(D8-+3=od;EM$a zIS;}i^Qq7|4%pA$s($>ycn+qA>+uvA;BpVPjE4{PS9=ypWf-1>@f|SP<4rPEKjQQ( zf(*R72BE?TOdtb{KADa}G!g=lQ8ED584!ZdASz7P+o@FW8+>RW5MjU&1^1z$dfPZ^9gzFt;8-E9_5g-qs9o&ZLu_%=~Ydn;OPVvK}Y;cdHdKsxe;Xxjr qD_n>nKr9CJpHz + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + image/svg+xml + + + + + + + + + + + Block 1 + + Block 2 + Kernel + + Block 3 + + Block 4 + + Block 5 + + Block 6 + + Block 7 + + Block 8 + + + + Block 1 + + Block 2 + + Block 3 + + Block 4 + + Block 5 + + Block 6 + + Block 7 + + Block 8 + + SM 1 + + SM 2 + + Device + + + SM 1 + + + + SM 2 + + + Device + + + SM 1 + + + + SM 2 + + + + Block 1 + + Block 2 + Block 3 + + Block 4 + + Block 5 + + Block 6 + Block 7 + + Block 8 + + + + + Time + + + diff --git a/BookGPU/Chapters/chapter2/ch2.tex 
b/BookGPU/Chapters/chapter2/ch2.tex
new file mode 100755
index 0000000..b06e9be
--- /dev/null
+++ b/BookGPU/Chapters/chapter2/ch2.tex
@@ -0,0 +1,45 @@
+\chapterauthor{Author Name1}{Affiliation text1}
+\chapterauthor{Author Name2}{Affiliation text2}
+
+
+\chapter{Introduction to CUDA}
+\label{chapter2}
+
+\section{Introduction}\label{intro}
+In this chapter we give some simple examples of CUDA programming. The goal is
+not to provide an exhaustive presentation of all the functionalities of CUDA but
+rather to give some basic elements. Of course, readers who do not know CUDA are
+invited to read other books that specialize in CUDA programming.
+
+
+\section{First example}
+
+This first example is intended to show how to build a very simple program with
+CUDA. The goal of this example is to compute the sum of two arrays and to
+put the result into a third array. A CUDA program consists of C code
+that calls CUDA kernels, which are executed on a GPU.
+
+
+As GPUs have their own memory, the first step consists of allocating memory on
+the GPU, which is done with a call to \texttt{cudaMalloc}. The first parameter
+of this function is the address of a pointer that receives the location of the
+allocated memory on the device (i.e. the GPU). In this example, the prefix
+\texttt{d\_} is added to each variable allocated on the GPU to indicate where it
+resides. The second parameter is the size of the allocation, expressed in bytes.
+
+In this example, we want to compare the execution time of the addition of two
+arrays on the CPU and on the GPU. So for both these operations, a timer is created to
+measure the time. CUDA makes it quite easy to manipulate timers. The first
+step is to create the timer, then to start it, and at the end to stop it. For
+each of these operations a dedicated function is used.
+
+In order to compute the same sum with a GPU, the first step consists of
+transferring the data from the CPU (considered as the host with CUDA) to the GPU
+(considered as the device with CUDA). A call to \texttt{cudaMemcpy} copies the
+content of an array allocated on the host to the device when the fourth
+parameter is set to \texttt{cudaMemcpyHostToDevice}. The first parameter of the
+function is the destination array, the second is the source array and the third
+is the size of the data to copy (expressed in bytes).
+
+\putbib[biblio]
+
diff --git a/aa b/aa
deleted file mode 100644
index e69de29..0000000
diff --git a/plan.txt b/plan.txt
new file mode 100644
index 0000000..fa82f37
--- /dev/null
+++ b/plan.txt
@@ -0,0 +1,112 @@
+Chapter 1: Presentation of the GPU architecture
+Author: Raphaël Couturier: University of Franche-Comte, France
+This chapter will introduce the GPU architecture and the classical model proposed by CUDA. All the background necessary for the remainder of the book will first be presented here.
+
+
+Chapter 2: Simple examples with CUDA
+Author: Raphaël Couturier: University of Franche-Comte, France
+
+
+**** Part : image processing
+
+Chapter 3: Fast kernels for image and signal processing
+Authors: Gilles Perrot, Raphaël Couturier and Stéphane Domas: University of Franche-Comte, France, Nicolas Bertaux, University of Aix-Marseille, France
+In this chapter, we will introduce and present many kernels that can drastically enhance signal and image processing algorithms. Although these kernels seem to be very common, they have not yet been well described in the literature.
+
+
+Chapter 4: Region Based Algorithm for Large Images Segmentation on GPU
+Authors: Gilles Perrot, Raphaël Couturier and Stéphane Domas: University of Franche-Comte, France, Nicolas Bertaux, University of Aix-Marseille, France
+In this chapter, we will present an algorithm for region-based active contour techniques (snakes), as they seem to achieve a high level of robustness and fit a large range of applications.
+
+
+
+
+
+**** Part : Software Development
+
+
+Chapter 5: On the development of high-performance software library for emerging architectures: design and analysis
+Authors: Allan P. Engsig-Karup, Bernd Dammann, Jeppe R. Frisvad and Stefan Lemvig: Technical University of Denmark, Denmark
+This chapter will present performance-portable tuning techniques via modern parallel programming. Then it will focus on efficient and scalable iterative methods for the solution of high-order numerical methods and on strategies for efficient implementations on desktop architectures.
+
+
+Chapter 6: Pertinence and development methodologies for GPU and cluster of GPU
+Authors: Sylvain Contassot-Vivier: University of Nancy, France, Stéphane Vialle, Supélec, Metz, France
+This chapter outlines the main boundaries of the field of applicability of GPU acceleration, as well as development methodologies for obtaining efficient codes in classical scientific applications.
+
+
+
+Chapter 7: Fast GPU-accelerated desktop application
+Authors: Allan P. Engsig-Karup, Bernd Dammann, Jeppe R. Frisvad and Stefan Lemvig: Technical University of Denmark, Denmark
+This chapter will present discussions, analysis and highlights of a new massively parallel engineering tool for nonlinear free surface flows intended for both engineering analysis and interactive real-time computing, e.g. for applications in coastal and offshore engineering and first-of-its-kind physics-based ship simulation.
+
+
+
+
+
+
+**** Part : Optimization
+
+Chapter 8: GPU-accelerated Tree-based Exact Optimization Methods
+Authors: Imen Chakroun, Nouredine Melab and El-Ghazali Talbi: INRIA Lille, France
+This chapter will present the latest techniques and algorithms for implementing tree-based exact optimization methods on GPUs.
+
+Chapter 9: Parallel Meta-heuristics for Solving Challenging Problems on GPU Accelerators
+Authors: Thé Van Luong, Nouredine Melab and El-Ghazali Talbi: INRIA Lille, France
+This chapter will describe parallel metaheuristics for solving complex problems in science and industry. This work is based on local search metaheuristics.
+
+Chapter 10: Linear programming on a GPU: a case study based on the simplex method and the branch-cut-and-bound algorithm
+Authors: Paul Albuquerque: HES-SO, Geneva, Switzerland, Xavier Meyer and Bastien Chopard: University of Geneva, Switzerland
+This chapter will address the main issues related to programming the simplex method on a GPU. Then it will present how to integrate this GPU-based simplex method in a branch-cut-and-bound framework that runs across the CPU and the GPU.
+
+Chapter 11: Performing large scale robust regression on GPUs
+Authors: Gleb Beliakov and Gang Li: Deakin University, Melbourne, Australia
+In this chapter we will report on the use of GPUs for large-scale robust data analysis. Identification of outliers in large multivariate data sets is difficult, because outliers shift regression models in their direction so much that they become undetectable by their residuals.
+
+
+
+***** Part : Numerical applications
+
+Chapter 12: Sparse linear system solvers with the GMRES method on GPU clusters
+Authors: Lilia Ziane Khodja, Raphaël Couturier and Jacques Bahi: University of Franche-Comte, France
+In this chapter, the adaptation of the GMRES method will be presented, and several techniques (compression, partitioning …) that increase the scalability of this algorithm on GPU clusters will be described.
+
+
+Chapter 13: Parallel solution of the Obstacle problem on GPU clusters
+Authors: Lilia Ziane Khodja, Raphaël Couturier and Jacques Bahi: University of Franche-Comte, France, Ming Chau and Pierre Spiteri: University of Toulouse, France
+This chapter is devoted to the implementation of the obstacle problem on GPU clusters. This problem is a nonlinear PDE occurring in financial mathematics (option pricing) and constrained structure mechanics. Synchronous and asynchronous implementations will be analyzed.
+
+Chapter 14: Complex fluid lattice Boltzmann on GPU clusters
+Authors: Kevin Stratford and Alan Gray: University of Edinburgh, United Kingdom
+This chapter will present a complex fluid lattice Boltzmann application designed to scale and perform well on large-scale GPU clusters.
+
+Chapter 15: Deployment on GPU of an atomic physics program
+Authors: Pierre Fortin, Rachid Habel, Fabienne Jézéquel and Jean-Luc Lamotte: University of Paris 6, France Stan Scott
+This chapter will describe the deployment on GPUs of PROP, a program of the 2DRMP suite which models electron collisions with H-like atoms and ions.
+
+Chapter 16: GPU-based envelope-following simulation techniques for power converters design
+Authors: Sheldon Tan + students: University of California, Riverside, USA
+This chapter will introduce a new envelope-following parallel transient analysis method for general switching power converters. This method exploits the parallelism in the envelope-following method and parallelizes the Newton update solving part, which is the most computationally expensive, on GPU platforms to boost the simulation performance.
+
+Chapter 17: Domain decomposition method on GPU architecture
+Authors: Frédéric Magoules: Ecole centrale, Paris, France
+This chapter will present how GPU architectures can improve the performance of domain decomposition methods.
+
+
+
+
+
+
+
+
+
+**** Part Other
+
+Chapter 18: Pseudo Random Number Generator on GPU
+Authors: Raphaël Couturier and Christophe Guyeux: University of Franche-Comte, France
+This chapter will present some pseudorandom number generators, which are essential in many applications. We have proposed a generator with proven chaotic properties, which is not the case for other generators. Our generator passes all the statistical test batteries.
+
+
+Chapter 19: Solving large sparse linear systems for integer factorization on GPUs
+Authors: Bertil Schmidt and Hao Yu Dang: University of Mainz, Germany
+This chapter will present the number field sieve (NFS), which is the current state-of-the-art integer factorization method. It will focus on how GPUs can be used to accelerate this highly time-consuming operation.
-- 
2.39.5