X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/chloroplast13.git/blobdiff_plain/9164a34ac78d5cc31c7b460d9a3c264c855910d8..f8689d1f199221556dc52e80c5237c45addb0218:/annotated.tex?ds=sidebyside

diff --git a/annotated.tex b/annotated.tex
index e188973..762d50c 100644
--- a/annotated.tex
+++ b/annotated.tex
@@ -12,9 +12,8 @@ A local database attached with each pipe stage is used to store all the informat
 \subsection{Genomes Samples}
 In this research, we retrieve genomes of Chloroplasts from NCBI. Ninety nine genome of them are considered to work with. These genomes lies in the eleven type of chloroplast families. The distribution of genomes is illustrated in detail in Table \ref{Tab2}.
-
-\input{population_Table}
-
+
+\input{population_Table}
 \subsection{Genome Annotation Techniques}
 Genome annotation is the second stage in the model pipeline. Many techniques were developed to annotate chloroplast genomes but the problem is that they vary in the number and type of predicted genes (\emph{i.e.} the ability to predict genes and \textit{for example: Transfer RNA (tRNA)} and \textit{Ribosomal RNA (rRNA)} genes). Two annotation techniques from NCBI and Dogma are considered to analyse chloroplast genomes to examine the accuracy of predicted coding genes.
@@ -77,12 +76,15 @@ The second pre-processing method states: we can predict the best annotated genom
 \subsubsection{Intersection Core Matrix (\textit{ICM})}
-The idea behind extracting core genes is to iteratively collect the maximum number of common genes between two genomes. To do so, the system builds an \textit{Intersection Core Matrix (ICM)}. ICM is a two dimensional symmetric matrix where each row and each column represents one genome. Each position in ICM stores the \textit{Intersection Scores(IS)}. IS is the cardinality number of a core genes which comes from intersecting one genome with other ones. Maximum cardinality results to select two genomes with their maximum core.
Mathematically speaking, if we have an $n \times n$ matrix where $n \text{is the number of genomes in local database}$, then lets consider:\\
+The idea behind extracting core genes is to iteratively collect the maximum number of common genes between two genomes. To do so, the system builds an \textit{Intersection Core Matrix (ICM)}. ICM is a two dimensional symmetric matrix where each row and each column represents one genome. Each position in ICM stores the \textit{Intersection Scores(IS)}. IS is the cardinality number of a core genes which comes from intersecting one genome with other ones. Maximum cardinality results to select two genomes with their maximum core. Mathematically speaking, if we have an $n \times n$ matrix where $n$
+is the number of genomes in local database, then lets consider:\\
+
 \begin{equation}
 Score=\max_{i0$.}
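
Reviewer's illustration (not part of the patch): the ICM paragraph in this hunk describes a symmetric matrix of pairwise intersection cardinalities and the selection of the pair of genomes with the largest score. A minimal Python sketch of that idea follows. It assumes each genome is available as a set of gene names, takes the maximum over distinct pairs only, and leaves the diagonal at zero. The function name `build_icm` and the toy genomes `g1`, `g2`, `g3` are hypothetical and do not come from the project.

```python
from itertools import combinations


def build_icm(genomes):
    """Build an intersection core matrix for a dict {genome_name: set_of_genes}.

    Returns (icm, best_pair, best_score), where icm[i][j] is the number of
    genes shared by genomes i and j. The diagonal is left at 0 (an assumption:
    a genome is not intersected with itself).
    """
    names = list(genomes)
    n = len(names)
    icm = [[0] * n for _ in range(n)]
    best_pair, best_score = None, 0

    # Visit each unordered pair once and mirror the score, so the matrix
    # stays symmetric.
    for i, j in combinations(range(n), 2):
        score = len(genomes[names[i]] & genomes[names[j]])
        icm[i][j] = icm[j][i] = score
        if score > best_score:
            best_pair, best_score = (names[i], names[j]), score

    return icm, best_pair, best_score


# Toy input: hypothetical genomes, each reduced to a small set of gene names.
genomes = {
    "g1": {"rbcL", "psbA", "atpB"},
    "g2": {"rbcL", "psbA", "ndhF"},
    "g3": {"rbcL", "matK"},
}
icm, best_pair, best_score = build_icm(genomes)
print(icm)
print(best_pair, best_score)
```

In the iterative scheme the text describes, the selected pair would presumably be replaced by its intersection (the new core) and the matrix recomputed; that step is not shown here because the hunk is truncated before it is specified.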