+% THIS SUBSECTION MUST BE IMPROVED
+
+\subsubsection{Intersection Core Matrix (\textit{ICM})}
+
+To extract core genes, we iteratively select the pair of genomes that
+share the largest number of genes. For this purpose, an
+\textit{Intersection Core Matrix} (ICM) is built. The ICM is a
+two-dimensional symmetric matrix in which each row and each column
+corresponds to one genome. Each element of the matrix stores the
+\textit{Intersection Score} (IS): the cardinality of the set of core
+genes obtained by intersecting one genome with another. At each
+iteration, the two genomes with the maximum score are selected.
+Formally, if the local database contains $n$ genomes, the ICM is an
+$n \times n$ matrix whose elements satisfy:
+\begin{equation}
+\mathit{score}_{ij} = \vert g_i \cap g_j \vert
+\label{Eq1}
+\end{equation}
+\noindent where $1 \leq i, j \leq n$ and $g_i$, $g_j$ are genomes.
+Whether a new core is generated depends on the value of the
+intersection score $\mathit{score}_{ij}$:
+
+% TO BE CONTINUED
+
+\[
+\text{new core} =
+\begin{cases}
+\text{ignored} & \text{if $\mathit{score}_{ij} = 0$;} \\
+\text{new core id} & \text{if $\mathit{score}_{ij} > 0$.}
+\end{cases}
+\]
+
+If $\mathit{score}_{ij} = 0$, the two genomes are in a \textit{disjoint
+relation}, \emph{i.e.}, they share no genes. In this case the system
+ignores the genome that would reduce the core to the empty set.
+Otherwise, the system removes the two genomes from the ICM and adds a
+new core genome, identified by a \textit{coreID}, to the ICM for the
+next iteration. Each iteration thus reduces the size of the ICM, and
+the process repeats until all genomes have been treated, \emph{i.e.},
+until the ICM contains no more genomes. The ICM can be very large
+because of the amount of data it stores, which makes computing the
+intersection scores costly in both time and memory. Since the matrix
+is symmetric, it is sufficient to compute only the scores of the upper
+triangle. The number of set intersections is then
+$\frac{n(n-1)}{2}$ instead of $n^2$, which roughly halves the work
+although the complexity remains $O(n^2)$.
+Algorithm~\ref{Alg1:ICM} illustrates the construction of the ICM and
+the extraction of the core genes, where \textit{GenomeList} represents
+the database in which all genome data are stored. At each iteration,
+it computes the largest core together with its two parent genomes.
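+
+As a purely illustrative example (the three toy genomes below are
+hypothetical and are not taken from the data set), let
+$g_1 = \{a,b,c,d\}$, $g_2 = \{b,c,d,e\}$, and $g_3 = \{c,d,f\}$.
+Applying Equation~(\ref{Eq1}) to every pair gives
+\[
+\mathit{ICM} =
+\begin{pmatrix}
+4 & 3 & 2 \\
+3 & 4 & 2 \\
+2 & 2 & 3
+\end{pmatrix},
+\]
+where only the $\frac{3 \cdot 2}{2} = 3$ entries above the diagonal
+need to be computed. The largest off-diagonal score is
+$\vert g_1 \cap g_2 \vert = 3$, so $g_1$ and $g_2$ would be replaced by
+the core $c_{12} = \{b,c,d\}$. In the next iteration the ICM would
+contain only $c_{12}$ and $g_3$, and
+$\vert c_{12} \cap g_3 \vert = \vert \{c,d\} \vert = 2$ would yield the
+core $\{c,d\}$.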
+
+% ALGORITHM HAS BEEN REWRITTEN
+
+\begin{algorithm}[H]
+\caption{Extract Maximum Intersection Score}
+\label{Alg1:ICM}
+\begin{algorithmic}
+\REQUIRE $L \leftarrow \text{genome vectors}$
+\ENSURE $B1 \leftarrow \text{maximum core vector}$
+\FOR{$i \leftarrow 0 : \mathit{len}(L)-1$}
+ \STATE $\mathit{score} \leftarrow 0$
+ \STATE $\mathit{core1} \leftarrow \mathit{set}(\mathit{GenomeList}[L[i]])$
+ \STATE $g1 \leftarrow L[i]$
+ \FOR{$j \leftarrow i+1 : \mathit{len}(L)$}
+ \STATE $\mathit{core2} \leftarrow \mathit{set}(\mathit{GenomeList}[L[j]])$
+ \STATE $\mathit{Core} \leftarrow \mathit{core1} \cap \mathit{core2}$
+ \IF{$\mathit{len}(\mathit{Core}) > \mathit{score}$}
+ \STATE $\mathit{score} \leftarrow \mathit{len}(\mathit{Core})$
+ \STATE $g2 \leftarrow L[j]$
+ \ENDIF
+ \ENDFOR
+ \IF{$\mathit{score} > 0$}
+ \STATE $B1[\mathit{score}] \leftarrow (g1,g2)$
+ \ENDIF
+\ENDFOR
+\RETURN $\max(B1)$
+\end{algorithmic}
+\end{algorithm}
+
+\subsection{Features visualization}
+
+The goal is to visualize the results by building an evolutionary
+tree. Every generated core carries important information in this
+tree, because it provides information about the ancestors of two or
+more genomes. Each node in the tree represents either one chloroplast
+genome or one predicted core and is labeled \textit{(Genes
+count:Family name\_Scientific names\_Accession number)}, while each
+edge is labeled with the number of genes lost from a leaf genome or
+an intermediate core. These numbers are informative about evolution:
+they show how many genes were lost between two species, whether or
+not they belong to the same family. By the principle of
+classification, a small number of lost genes indicates that the
+species are close to each other and belong to the same family, while
+a large number indicates an evolutionary relationship between species
+from different families. To depict the links between species clearly,
+we built a phylogenetic tree showing the relationships based on the
+distances among gene sequences. Many tools are available to obtain
+such a tree, for example PHYML~\cite{guindon2005phyml},
+RAxML~\cite{stamatakis2008raxml,stamatakis2005raxml}, BioNJ, and
+TNT~\cite{goloboff2008tnt}. In this work, we chose
+RAxML~\cite{stamatakis2008raxml,stamatakis2005raxml} because it is
+fast, accurate, and can build large trees when dealing with a large
+number of genomic sequences.
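+
+As a small illustration of the edge labels described above, assume
+that the number of lost genes on an edge is the difference between
+the gene count of a node and that of its parent core (this reading,
+and the sets below, are assumptions made for the example). Take a
+hypothetical genome $g = \{w,x,y,z\}$ whose parent core is
+$C = \{x,y,z\}$. Since a core is obtained by intersection,
+$C \subseteq g$ and
+\[
+\vert g \setminus C \vert = \vert g \vert - \vert C \vert = 4 - 3 = 1,
+\]
+so the edge between $g$ and $C$ would be labeled with one lost gene.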
+
+The procedure used to build the phylogenetic tree is as follows:
+\begin{enumerate}
+\item For each gene in a core, extract its sequence and store it in the database.
+\item Use a multiple sequence alignment tool (****to be write after see christophe****)
+to align these sequences with each other.
+\item Submit the resulting aligned sequences to the RAxML program to compute the distances and finally draw the phylogenetic tree.
+\end{enumerate}
+
+\begin{figure}[H]
+ \centering \includegraphics[width=0.8\textwidth]{Whole_system}
+ \caption{Overview of the pipeline}\label{wholesystem}
+\end{figure}