update section two

diff --git a/annotated.tex b/annotated.tex

index e1889730f1cbbd38d2464dba9abf353905155088..762d50c60baa5a2db681c9f03b93d2467c74390c 100644
--- a/annotated.tex
+++ b/annotated.tex
@@ -12,9 +12,8 @@ A local database attached with each pipe stage is used to store all the informat
  
  \subsection{Genomes Samples}
  In this research, we retrieve chloroplast genomes from NCBI. Ninety-nine of these genomes are considered in this work. They belong to eleven chloroplast families. The distribution of genomes is illustrated in detail in Table \ref{Tab2}.
-
-\input{population_Table}       
-
+       
+\input{population_Table}
  \subsection{Genome Annotation Techniques}
  Genome annotation is the second stage in the model pipeline. Many techniques have been developed to annotate chloroplast genomes, but they vary in the number and type of predicted genes (\emph{e.g.}, in their ability to predict \textit{Transfer RNA (tRNA)} and \textit{Ribosomal RNA (rRNA)} genes). Two annotation techniques, from NCBI and Dogma, are considered to analyse chloroplast genomes and to examine the accuracy of predicted coding genes.
  
@@ -77,12 +76,15 @@ The second pre-processing method states: we can predict the best annotated genom
  
  \subsubsection{Intersection Core Matrix (\textit{ICM})}
  
-The idea behind extracting core genes is to iteratively collect the maximum number of common genes between two genomes. To do so, the system builds an \textit{Intersection Core Matrix (ICM)}. ICM is a two dimensional symmetric matrix where each row and each column represents one genome. Each position in ICM stores the \textit{Intersection Scores(IS)}. IS is the cardinality number of a core genes which comes from intersecting one genome with other ones. Maximum cardinality results to select two genomes with their maximum core. Mathematically speaking, if we have an $n \times n$ matrix where $n \text{is the number of genomes in local database}$, then lets consider:\\
+The idea behind extracting core genes is to iteratively collect the maximum number of common genes between two genomes. To do so, the system builds an \textit{Intersection Core Matrix (ICM)}. The ICM is a two-dimensional symmetric matrix in which each row and each column represents one genome. Each position in the ICM stores an \textit{Intersection Score (IS)}, the cardinality of the set of core genes obtained by intersecting one genome with another. The maximum cardinality selects the two genomes with the largest core. Mathematically speaking, if we have an $n \times n$ matrix where $n$
+is the number of genomes in the local database, then let us consider:\\
+
  \begin{equation}
  Score=\max_{i<j}\vert x_i \cap x_j\vert
  \label{Eq1}
  \end{equation}
-where $x_i, x_j$ are elements in the matrix. The generation of a new core genes is depending on the cardinality value of intersection scores, we call it \textit{Score}:
+
+\noindent where $x_i, x_j$ are elements in the matrix. The generation of a new core depends on the maximum cardinality of the intersection scores, which we call \textit{Score}:
  $$\text{New Core} = \begin{cases} 
  \text{Ignored} & \text{if $\textit{Score}=0$;} \\
  \text{new Core id} & \text{if $\textit{Score}>0$.}