X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/ancetre.git/blobdiff_plain/fdfa2a23bd7d4c2aee5b98c5e8aa9eb5d173a136..refs/heads/master:/presentation.tex

diff --git a/presentation.tex b/presentation.tex
index 2b31802..00c1138 100644
--- a/presentation.tex
+++ b/presentation.tex
@@ -1,5 +1,24 @@
 Given a bacteria, various complete genomes can be found on the Internet.
-
 For each genome, the complete records in fasta file are downloaded from the
-NCBI nucleotide website.
-
+NCBI nucleotide website. Then, GenemarkS is queried to find open reading
+frames, which we will improperly call genes in the remainder of this document.
+Another approach could be to directly download the coding sequence files from
+the NCBI; however, our experiments show that the annotated files are sometimes
+really problematic. Furthermore, almost thirty gene prediction software tools (GPS) exist,
+and they can potentially be used with various parameters, leading to numerous
+different annotated genomes. For our part, we have chosen the three most famous
+GPS, namely Glimmer, GeneMark, and Rast (see Table~\ref{GPS}).
+\begin{table}
+\centering
+\begin{tabular}{|l|c|}
+\hline
+Gene prediction software & Good ORFs \\
+\hline
+Glimmer & 2558 \\
+GeneMark & 2768 \\
+Rast & 2560 \\
+\hline
+\end{tabular}
+\caption{Gene prediction scores of the best GPS on H37Rv}
+\label{GPS}
+\end{table}