-two arrays. With CUDA, a kernel starts with the keyword \texttt{\_\_global\_\_}
-which indicates that this kernel can be call from the C code. The first
-instruction in this kernel is used to computed the \texttt{tid} which
-representes the thread index. This thread index is computed according to the
-values of the block index (it is a variable of CUDA
-called \texttt{blockIdx\index{CUDA~keywords!blockIdx}}). Blocks of threads can
+two arrays. With CUDA, a kernel starts with the keyword \texttt{\_\_global\_\_}\index{CUDA~keywords!\_\_global\_\_}
+which indicates that this kernel can be called from the C code. The first
+instruction in this kernel is used to compute the variable \texttt{tid} which
+represents the thread index. This thread index\index{thread index} is computed
+according to the values of the block index (a CUDA variable
+called \texttt{blockIdx}\index{CUDA~keywords!blockIdx}). Blocks of threads can