From: Michel Salomon
Date: Mon, 2 Dec 2013 13:02:43 +0000 (+0100)
Subject: I've finished my modifications in section 3
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/chloroplast13.git/commitdiff_plain/def3c748d0d7f8073ad9a8602822d8f7e0d36a9c

I've finished my modifications in section 3
---

diff --git a/annotated.tex b/annotated.tex
index c9e00bd..e9e3049 100644
--- a/annotated.tex
+++ b/annotated.tex
@@ -240,36 +240,25 @@ score_{ij}=\vert g_i \cap g_j\vert
 \label{Eq1}
 \end{equation}
 \noindent where $1 \leq i \leq n$, $1 \leq j \leq n$, and $g_i, g_j$ are
-genomes. The generation of a new core gene depends obviously on the
-value of intersection scores $score_{ij}$:
-
-% TO BE CONTINUED
-
-$$
-\text{new Core} =
-\begin{cases}
-\text{Ignored} & \text{if $\textit{score}=0$;} \\
-\text{new Core id} & \text{if $\textit{Score}>0$.}
-\end{cases}
-$$
-
-if $\textit{Score}=0$ then we have \textit{disjoint
-relation} \emph{i.e.}, no common genes between two genomes. In this
-case the system ignores the genome that annul the core gene
-size. Otherwise, The system removes these two genomes from ICM and add
-new core genome with a \textit{coreID} of them to ICM for the
-calculation in next iteration. This process reduces the size of ICM
-and repeats until all genomes are treated \emph{i.e.} ICM has no more
-genomes. We observe that ICM is very large because of the amount of
-data that it stores. This results to be time and memory consuming for
-calculating the intersection scores. To increase the speed of
-calculations, it is sufficient to only calculate the upper triangle
-scores. The time complexity for this process after enhancement is thus
-$O(\frac{n.(n-1)}{2})$. Algorithm \ref{Alg1:ICM} illustrates the
-construction of the ICM matrix and the extraction of the core genes
-where \textit{GenomeList}, represents the database where all genomes
-data are stored. At each iteration, it computes the maximum core genes
-with its two genomes parents.
+genomes. The generation of a new core gene depends obviously on the
+value of the intersection scores $score_{ij}$. More precisely, the
+idea is to consider a pair of genomes such that their score is the
+largest element in ICM. These two genomes are then removed from the matrix
+and the resulting new core genome is added for the next iteration.
+The ICM is then updated to take into account the new core gene: new IS
+values are computed for it. This process is repeated until no new core
+gene can be obtained.
+
+We can observe that the ICM is very large due to the amount of
+data. As a consequence, the computation of the intersection scores is
+both time and memory consuming. However, since ICM is a symmetric
+matrix we can reduce the computation overhead by considering only its
+upper triangular part. The time complexity for this process after
+enhancement is thus $O(\frac{n.(n-1)}{2})$. Algorithm~\ref{Alg1:ICM}
+illustrates the construction of the ICM matrix and the extraction of
+the core genes, where \textit{GenomeList} represents the database
+storing all genome data. At each iteration, it computes the maximum
+core genes with its two parent genomes.
 % ALGORITHM HAS BEEN REWRITTEN