update step III from the algorithm.

[chloroplast13.git] / annotated.tex
diff --git a/annotated.tex b/annotated.tex

index 86259a0aa26054462d5b41968d58f923d8c00f7a..c2d70c9ea448f567226963a00c5efa8169bfc37d 100644 (file)
--- a/annotated.tex
+++ b/annotated.tex
@@ -74,11 +74,12 @@ The Algorithm of construction the matrix and extracting maximum core genes where
  \end{algorithmic}
  \end{algorithm}
  
-In this algorithm, \textit{GenomeList} represents the database.
+In this algorithm, \textit{GenomeList} represents the database.\\
  
  
-\textbf{Step III: Draw the Tree}\\
+\textbf{Step III: Drawing the Tree}\\
+The main objective here is to use the results to visualize an evolutionary tree. We use a directed graph drawn with the Dot package\cite{gansner2002drawing} from the Graphviz library. The system produces this tree automatically from the information available in the database. The core genes and the genes they contain are important information, because they also represent the ancestral information shared by two genomes. In this tree, each node represents a genome or a set of core genes, labelled as \textit{(Genes count:Family name,Scientific name,Accession number)}. Each edge represents the number of genes lost in a genome-core or core-core intersection. The number of lost genes can be an important factor in evolution: it indicates how many genes have been lost by species in the same or in different families. By the principle of classification, a small number of lost genes among species suggests that these species are close to each other and belong to the same family, while a large number of lost genes suggests that the species are unlikely to belong to the same family. To make the picture clearer, the system also generates a phylogenetic tree, which is an evolutionary tree. This tree is built from the distances among gene sequences. Many tools can build such a tree, for example PHYML\cite{guindon2005phyml}, RAxML\cite{stamatakis2008raxml}\cite{stamatakis2005raxml}, BioNJ, and TNT\cite{goloboff2008tnt}. We consider using RAxML\cite{stamatakis2008raxml}\cite{stamatakis2005raxml} to generate this tree.
  
  
-The main drawback from this method is that we can not depending only on genes names because of three causes: first, the genome may have not totally named, so we will have some lost sequences. Second, may we have two genes sharing the same name, while their sequences are different. Third, we need to annotate 99 genomes.
+The main drawback of this method is that we cannot depend only on gene names, for three reasons. First, a genome may not be fully annotated with gene names (this can be found in early versions of NCBI genomes), so some sequences will be lost. Second, two genes may share the same name while their sequences are different. Third, we need to annotate all the genomes.
  
  \subsubsection{Extracting Core genome from NCBI gene contents}
  
  