From: bassam al-kindy
Date: Thu, 13 Mar 2014 13:23:42 +0000 (+0100)
Subject: adding the directory of paper2
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/chloroplast13.git/commitdiff_plain/refs/heads/master?hp=45f51751b6853ef1ef0687f53f9bd8ef4aca3fda

adding the directory of paper2
---

diff --git a/Paper2/Features.tex b/Paper2/Features.tex
new file mode 100644
index 0000000..6f934cf
--- /dev/null
+++ b/Paper2/Features.tex
@@ -0,0 +1,53 @@
+
+The last stage of the proposed pipeline is naturally to take advantage
+of the produced core and pan genomes for biological studies. As
+this key stage is not directly related to the methodology for core
+and pan genomes discovery, we will only outline a few tasks that
+can be carried out on the produced data.
+
+%\begin{figure}
+%\centering
+%\includegraphics[scale=0.215]{tree}
+%\caption{Part of a core genomes evolutionary tree (NCBI gene names)}
+%\label{coreTree}
+%\end{figure}
+
+The obtained results may be visualized by building a core genomes evolutionary tree.
+% All core genes generated represent an important information in the tree,
+% because they provide ancestor information of two or more
+% genomes.
+Each node in this tree represents a chloroplast genome or
+a predicted core. %, as depicted in Figure~\ref{coreTree}. In this figure, nodes labels are of the form \textit{(Genes number:Family name\_Scientific name\_Accession number)}, while an edge is labeled with the number of gene loss when compared to its parents (a leaf genome or an intermediate core genome). Such numbers can answer questions like: how many genes are different between two species? Which functionality has been lost between an ancestor and its children? For complete core trees based either on NCBI names or on DOGMA ones, see supplementary data.
+
+A second application of such data is obviously to build accurate phylogenetic
+trees, using tools like
+PHYML~\cite{guindon2005phyml} or
+RAxML~\cite{stamatakis2008raxml,stamatakis2005raxml}.
+Given a set of species, the last common core genome in the core tree
+contains all the genes shared by these species. These genes may be
+multiply aligned to serve as input to the phylogenetic tools mentioned above.
+An example of such a phylogenetic tree on core 58 (NCBI cores tree, see
+supplementary data) is provided in Appendix~\ref{philoTree}. Note that, in
+order to constitute a relevant outgroup, we have simply blasted each gene
+of this core against a chosen \emph{Cyanobacteria}.
+
+
+%, BioNJ, and
+%TNT\cite{goloboff2008tnt}}.
+% In this work, we chose to use
+% RAxML\cite{stamatakis2008raxml,stamatakis2005raxml} because it is
+% fast, accurate, and can build large trees when dealing with a large
+% number of genomic sequences.
+%
+% The procedure used to built a phylogenetic tree is as follows:
+% \begin{enumerate}
+% \item For each gene in a core gene, extract its sequence and store it in the database.
+% \item Use multiple alignment tools such as (****to be write after see christophe****)
+% to align these sequences with each others.
+% \item Use an outer-group genome from cyanobacteria to calculate distances.
+% \item Submit the resulting aligned sequences to RAxML program to compute
+% the distances and finally draw the phylogenetic tree.
+% \end{enumerate}
+%
+
+
diff --git a/Paper2/Mixed.tex b/Paper2/Mixed.tex
new file mode 100644
index 0000000..0ffe487
--- /dev/null
+++ b/Paper2/Mixed.tex
@@ -0,0 +1,279 @@
+
+%
+% \subsubsection{Processing annotated genomes from NCBI}
+%
+% The objective here is to generate sets of genes from each genome so that
+% genes are organized without any duplication. The input is a list of
+% chloroplasts annotated genomes downloaded from the NCBI website. All genomes,
+% which consist in collections of
+% protein coding sequences~\cite{parra2007cegma,RDOGMA},
+% are stored as \textit{fasta} files.
+% To be able to build the set of
+% core genes, we need to preprocess these genomes
+% using \textit{BioPython} package \cite{chapman2000biopython}.
+% %
+% First of all, we start by converting each genome from \textit{fasta} file to
+% GenVision \cite{geneVision} format from DNASTAR. Each genome is thus
+% converted in a list of genes, containing both gene names and occurrences.
+% Gene
+% name duplication can be accumulated during the treatment of a genome.
+% %
+% These duplication come from gene fragments (\emph{e.g.} gene
+% fragments treated with NCBI) and from chloroplast DNA sequences. To
+% ensure that all the duplication are removed, each list of gene is
+% translated into a set of genes.
+% % Note that NCBI genome annotation
+% produces genes except \textit{Ribosomal (rRNA)} genes.
+%
+
+%\subsubsection{Processing genomes annotated by DOGMA}
+
+% Protein coding genes are identified in an input genome
+%using sequence similarity of genes in DOGMA database. In addition in
+%comparison with NCBI annotation tool,
+% It can detect
+% both \textit{transfer RNAs (tRNA)} and \textit{ribosomal RNAs (rRNA)}.
+%
+% The DOGMA annotation process is divided into two tasks. First, we
+% manually annotate chloroplast genomes using DOGMA web tool. The output
+% of this step is supposed to be a collection of coding genes files for
+% each genome, organized in GeneVision file. The second task is to solve
+% the gene duplication problem and therefore we have used two
+% methods.
+
+\begin{figure}[H]
+\centering
+\includegraphics[scale=0.3]{gensim}
+\caption{Part of the implementation of the second method, comparing the common genes from NCBI and DOGMA.}
+\label{Meth2:gensim}
+\end{figure}
+
+\color{red}The second approach in this paper is an enhancement of the \textit{Intersection Core Matrix} (ICM) proposed in \cite{Alkindy2014}, which uses gene names to find the core genome.
Because that method depends on the spelling of gene names, a simple homogenization of the names provided by NCBI misses core genes whose names are formatted slightly differently. \color{black} To enlarge the core genome and bring it as close as possible to the true biological one, we propose to integrate a similarity distance on gene names. Each similarity is computed between a name from DOGMA, which serves as the reference here, and a name from NCBI, as shown in Figure~\ref{Meth2:gensim}.
+
+The proposed distance is the Levenshtein edit distance, which is close to the Needleman-Wunsch score except that gap opening and gap extension penalties are equal. Sequences whose NCBI names are close according to this edit distance are then given the same name.
+
+The risk of doing so is to merge genes that are different but whose names are similar (for instance, ND4 and ND4L are two different mitochondrial genes with similar names). The solution is thus to also compare, in a second stage, the DNA sequences themselves (with a Needleman-Wunsch global alignment), and to simply ignore the gene if this similarity is below a given threshold.
+
+The second approach is thus obtained; it retains the fundamental idea of the annotation-based approaches of the previous work. Note that it is simply a deeper processing of the naming stage of the second approach in \cite{Alkindy2014}, the other stages being identical.
+
+The DNA similarity computation raises another problem in the case of DOGMA:
+contrary to what happens with gene features in NCBI, genes predicted by
+DOGMA may be fragmented into several parts. Such genes can be spotted
+in the GeneVision file produced by DOGMA, as each fragment appears in this file
+under the same gene name. A gene whose name occurs at least twice
+in the file is thus either a duplicated gene or a fragmented one.
+Fragmented genes must therefore be defragmented before the DNA similarity computation stage (note that such a defragmentation has already been performed on the NCBI website). As the orientation of each fragment is given in the GeneVision output, this defragmentation consists in concatenating the fragments in every possible order and keeping only the permutation with the best similarity score to other sequences having the same gene name, provided this score is larger than the given threshold.
+
+To put it in a nutshell, the lists of gene names of the genomes are first updated in this approach, following the process detailed in
+Algorithm~\ref{Alg3:thirdM}, while Algorithm~\ref{Alg3:genechk} outlines
+the \emph{geneChk} subroutine. These updated genomes are then passed to
+the algorithm of \cite{Alkindy2014}, which produces the desired core genomes; see
+Figure~\ref{wholesystem} for the updated pipeline.
+
+%The first method, based on gene name, translates each genome
+% %into a set of genes without duplicates.
+% The second method avoid gene
+% duplication through a defragment process. In each iteration, this
+% process starts by taking a gene from gene list, searches for gene
+% duplication, if a duplication is found, it looks on the orientation of
+% the fragment sequence. If it is positive it appends directly the
+% sequence to gene files. Otherwise reverse complement operations are
+% applied on the sequence, which is then also append to gene files.
+% Finally, a check for missing start and stop codons is performed. At
+% the end of the annotation process, all the genomes are fully
+% annotated, their genes are defragmented, and gene counts are
+% available.
+% Remark that there is no gene duplication with gene annotations
+% from DOGMA after applying gene de-fragmentation process. In
+% fact, genome annotation with DOGMA can be the key difference when extracting core genes.
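The name-matching stage described above can be illustrated by a minimal Python sketch. It computes the Levenshtein distance between an NCBI gene name and each DOGMA reference name, and keeps the closest reference when it is near enough. The function names and the \texttt{max\_distance} cut-off are illustrative assumptions, not taken from the authors' implementation, and the DNA-level check that guards against false positives is deliberately left out.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))            # distances from "" to b[:j]
    for i, ca in enumerate(a, start=1):
        cur = [i]                             # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,       # delete ca
                           cur[j - 1] + 1,    # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def closest_reference_name(ncbi_name, dogma_names, max_distance=1):
    """Return the DOGMA name closest to ncbi_name, or None if none is close enough.

    max_distance is a hypothetical cut-off chosen for illustration only.
    """
    query = ncbi_name.upper()
    scored = [(levenshtein(query, ref.upper()), ref) for ref in dogma_names]
    if not scored:
        return None
    distance, best = min(scored)              # smallest distance, ties by name
    return best if distance <= max_distance else None
```

Under these assumptions, \texttt{closest\_reference\_name("ND6", ["NAD6", "ATP6"])} maps the spelling ND6 onto NAD6. ND4 and ND4L are also only one edit apart, which is precisely why the sequence comparison of the second stage is needed.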
+% +% +% +% +% \subsubsection{Preprocessing} +% +% In order to extract core genomes in a suitable manner, the genomic +% data are preprocessed with two methods: on the one hand a method based +% on gene name and count, and on the other hand a method based on a +% sequence quality control test. +% +% In the first method, we extract a list of genes from each chloroplast +% genome. Then we store this list of genes in the database under genome +% nam and genes counts can be extracted by a specific length command. +% The \textit{Intersection Core Matrix}, described in next subsection, +% is then computed to extract the core genes. The problem with this +% method can be stated as follows: how can we ensure that the gene which +% is predicted in core genes is the same gene in leaf genomes? The +% answer to this problem is that if the sequences of any gene in a +% genome annotated from DOGMA and NCBI are similar with respect to a +% given threshold, the method is operational when the sequences are not similar. The problem of attribution of a sequence to a gene in the core genome come to light. + +% The second method is based on the underlying idea that it is possible to predict the the best annotated genome by merging the annotated genomes from NCBI +% and DOGMA according to a quality test on genes names and sequences. To +% obtain all quality genes of each genome, we consider the following +% hypothesis: any gene will appear in the predicted genome if and only +% if the annotated genes in NCBI and DOGMA pass a specific threshold +% of \textit{quality control test}. In fact, the Needle-man Wunch +% algorithm is applied to compare both sequences with respect to a +% threshold. If the alignment score is above the threshold, then the +% gene will be retained in the predicted genome, otherwise the gene is +% ignored. 
+% Once the prediction of all genomes is done,
+% the \textit{Intersection Core Matrix} is computed on these new genomes
+% to extract core genes, as explained in Algorithm \ref{Alg3:thirdM}.
+%
+
+\begin{algorithm}[H]
+\caption{Maximum similarity score between two sequences (geneChk)}
+\label{Alg3:genechk}
+\begin{algorithmic}
+\REQUIRE $g1,g2 \leftarrow \text{NCBI gene sequence, DOGMA gene sequence}$
+\ENSURE $\text{Maximum similarity score}$
+\STATE $score1 \leftarrow needle(g1,g2)$
+\STATE $score2 \leftarrow needle(g1,Reverse(Complement(g2)))$
+\RETURN $max(score1,score2)$
+\end{algorithmic}
+\end{algorithm}
+
+
+\begin{algorithm}[H]
+\tiny
+\caption{Extract new genome based on gene quality test}\label{Alg3:thirdM}
+\begin{algorithmic}
+\REQUIRE {$Gname \leftarrow \text{Genome Name}, Threshold \leftarrow 60, RNGenes \leftarrow \text{[ ]}, RDGenes \leftarrow \text{[ ]},PNGenes \leftarrow \text{[ ]}, PDGenes \leftarrow \text{[ ]}$}
+\ENSURE $geneList \leftarrow \text{Quality genes}$
+\FOR{$\text{gene in NCBI genes of Gname}$}
+    \IF {$\text{gene in RNGenes}$}
+        \STATE $dir(NCBI\_Genes) \leftarrow savePermutation(gene)$
+        \STATE $PNGenes \leftarrow gene$
+    \ELSE
+        \STATE $RNGenes \leftarrow gene$
+    \ENDIF
+\ENDFOR
+\FOR{$\text{gene in Dogma genes of Gname}$}
+    \IF {$\text{gene in RDGenes}$}
+        \STATE $dir(Dogma\_Genes) \leftarrow savePermutation(gene)$
+        \STATE $PDGenes \leftarrow gene$
+    \ELSE
+        \STATE $RDGenes \leftarrow gene$
+    \ENDIF
+\ENDFOR
+\STATE $geneList=\text{empty list}$
+\STATE $common=set(dir(NCBI\_Genes)) \cap set(dir(Dogma\_Genes))$
+\FOR{$\text{gene in common}$}
+    \STATE $\text{scores} \leftarrow \text{[ ]}$
+    \IF {$\text{gene NOT in PNGenes AND gene NOT in PDGenes}$}
+        \STATE \dots
+        \STATE $scores \leftarrow geneChk(g1,g2)$
+    \ELSIF {$\text{gene in PNGenes AND gene NOT in PDGenes}$}
+        \STATE $PGene \leftarrow loadPermutations('N',gene)$
+        \STATE \dots
+        \FOR {$\text{X in PGene}$}
+            \STATE \dots
+            \STATE $scores \leftarrow geneChk(g1,g2)$
+        \ENDFOR
+    \ELSIF {$\text{gene in PDGenes} AND \text{gene NOT in PNGenes}$}
+        \STATE $PGene \leftarrow loadPermutations('D',gene)$
+        \STATE \dots
+        \FOR {$\text{X in PGene}$}
+            \STATE \dots
+            \STATE $scores \leftarrow geneChk(g1,g2)$
+        \ENDFOR
+    \ELSIF {$\text{gene in PDGenes} AND \text{gene in PNGenes}$}
+        \FOR {$\text{X in loadPermutations('N',gene)}$}
+            \FOR {$\text{Y in loadPermutations('D',gene)}$}
+                \STATE \dots
+                \STATE $scores \leftarrow geneChk(g1,g2)$
+            \ENDFOR
+        \ENDFOR
+    \ENDIF
+    \STATE $score \leftarrow max(scores)$
+    \IF {$score > Threshold$}
+        \STATE $geneList \leftarrow gene$
+    \ENDIF
+\ENDFOR
+\RETURN $geneList$
+\end{algorithmic}
+\end{algorithm}
+
+%\begin{algorithm}[H]
+%\tiny
+%\caption{Extract new genome based on gene quality test}\label{Alg3:thirdM}
+%\begin{algorithmic}
+%\REQUIRE {$Gname \leftarrow \text{Genome Name}, Threshold \leftarrow 60, RNGenes \leftarrow \text{[ ]}, RDGenes \leftarrow \text{[ ]},PNGenes \leftarrow \text{[ ]}, PDGenes \leftarrow \text{[ ]}$}
+%\ENSURE $geneList \leftarrow \text{Quality genes}$
+%\FOR{$\text{gene in NCBI genes of Gname}$}
+% \IF {$\text{gene in RNGenes}$}
+% \STATE $dir(NCBI\_Genes) \leftarrow savePermutation(gene)$
+% \STATE $PNGenes \leftarrow gene$
+% \ELSE
+% \STATE $RNGenes \leftarrow gene$
+% \ENDIF
+%\ENDFOR
+%\FOR{$\text{gene in Dogma genes of Gname}$}
+% \IF {$\text{gene in RDGenes}$}
+% \STATE $dir(Dogma\_Genes) \leftarrow savePermutation(gene)$
+% \STATE $PDGenes \leftarrow gene$
+% \ELSE
+% \STATE $RDGenes \leftarrow gene$
+% \ENDIF
+%\ENDFOR
+%\STATE $geneList=\text{empty list}$
+%\STATE $common=set(dir(NCBI\_Genes)) \cap set(dir(Dogma\_Genes))$
+%\FOR{$\text{gene in common}$}
+% \STATE $\text{scores} \leftarrow \text{[ ]}$
+% \IF {$\text{gene NOT in PNGenes AND gene NOT in PDGenes}$}
+% \STATE $g1 \leftarrow open(NCBI\_Genes(gene)).read()$
+% \STATE $g2 \leftarrow open(Dogma\_Genes(gene)).read()$
+% \STATE $score \leftarrow geneChk(g1,g2)$
+% \IF {$score > Threshold$}
+% \STATE $geneList \leftarrow gene$
+%
+% \ENDIF
+% \ELSIF {$\text{gene in PNGenes AND NOT gene in PDGenes}$}
+% \STATE $PGene \leftarrow loadPermutations('N',gene)$
+% \STATE $g2 \leftarrow open(Dogma\_Genes(gene)).read()$
+% \FOR {$\text{X in PGene}$}
+% \STATE $g1 \leftarrow open(NCBI\_Genes(X)).read()$
+% \STATE $scores \leftarrow geneChk(g1,g2)$
+% \ENDFOR
+% \STATE $score \leftarrow max(scores)$
+% \IF {$score > Threshold$}
+% \STATE $geneList \leftarrow gene$
+% \ENDIF
+% \ELSIF {$\text{gene in PDGenes} AND \text{gene NOT in PNGenes}$}
+% \STATE $PGene \leftarrow loadPermutations('D',gene)$
+% \STATE $g1 \leftarrow open(NCBI\_Genes(gene)).read()$
+% \FOR {$\text{X in PGene}$}
+% \STATE $g2 \leftarrow open(Dogma\_Genes(X)).read()$
+% \STATE $scores \leftarrow geneChk(g1,g2)$
+% \STATE $score \leftarrow max(scores)$
+% \ENDFOR
+% \IF {$score > Threshold$}
+% \STATE $geneList \leftarrow gene$
+% \ENDIF
+% \ELSIF {$\text{gene in PDGenes} AND \text{gene in PNGenes}$}
+% \FOR {$\text{X in loadPermutations('N',gene)}$}
+% \FOR {$\text{Y in loadPermutations('D',gene)}$}
+% \STATE $g1 \leftarrow open(NCBI\_Genes(X)).read()$
+% \STATE $g2 \leftarrow open(Dogma\_Genes(Y)).read()$
+% \STATE $scores \leftarrow geneChk(g1,g2)$
+% \ENDFOR
+% \STATE $score \leftarrow max(scores)$
+% \IF {$score > Threshold$}
+% \STATE $geneList \leftarrow gene$
+% \ENDIF
+% \ENDFOR
+% \ENDIF
+%\ENDFOR
+%\RETURN $geneList$
+%\end{algorithmic}
+%\end{algorithm}
+
+
+
+
+%
+% \textbf{geneChk} is a subroutine used to find the best similarity score between
+% two gene sequences after applying operations like \textit{reverse}, {\it complement},
+% and {\it reverse complement}.
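The \emph{geneChk} subroutine and the permutation-based defragmentation can be sketched in Python as follows. This is only an illustration under stated assumptions: the score is taken to be the percent identity of a global alignment, the alignment is a small pure-Python Needleman-Wunsch with a linear gap cost and arbitrary scoring values, and the names \texttt{needle\_identity} and \texttt{best\_defragmentation} are hypothetical. The program actually used for the \texttt{needle} call and its parameters are not specified here, and fragments are assumed to be already oriented.

```python
from itertools import permutations


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def needle_identity(a: str, b: str, match=1, mismatch=-1, gap=-1) -> float:
    """Percent identity of one optimal Needleman-Wunsch global alignment.

    The linear gap cost and the scoring values are assumptions for illustration.
    """
    # Each cell holds (score, identical columns, total columns).
    prev = [(j * gap, 0, j) for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [(i * gap, 0, i)]
        for j in range(1, len(b) + 1):
            same = a[i - 1] == b[j - 1]
            s, m, n = prev[j - 1]
            diag = (s + (match if same else mismatch), m + int(same), n + 1)
            s, m, n = prev[j]
            up = (s + gap, m, n + 1)          # a[i-1] aligned with a gap in b
            s, m, n = cur[j - 1]
            left = (s + gap, m, n + 1)        # b[j-1] aligned with a gap in a
            cur.append(max(diag, up, left))   # best score first, ties by identity
        prev = cur
    _, identical, columns = prev[-1]
    return 100.0 * identical / columns if columns else 0.0


def gene_chk(g1: str, g2: str) -> float:
    """Best score of g1 against g2 or against the reverse complement of g2."""
    return max(needle_identity(g1, g2),
               needle_identity(g1, reverse_complement(g2)))


def best_defragmentation(fragments, reference, threshold=60.0):
    """Try every fragment order; keep the best one if it beats the threshold."""
    score, sequence = max(
        (gene_chk(reference, "".join(order)), "".join(order))
        for order in permutations(fragments)
    )
    return sequence if score > threshold else None
```

The number of orders grows factorially with the number of fragments, so such an exhaustive search is only reasonable when a gene is split into a handful of parts.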
+
+
+
diff --git a/Paper2/Whole_system.png b/Paper2/Whole_system.png
new file mode 100644
index 0000000..61a0f65
Binary files /dev/null and b/Paper2/Whole_system.png differ
diff --git a/Paper2/abstract.tex b/Paper2/abstract.tex
new file mode 100644
index 0000000..7bb192c
--- /dev/null
+++ b/Paper2/abstract.tex
@@ -0,0 +1,51 @@
+\begin{abstract}
+\color{red}Investigating the evolution of genomes has become a hard task, owing to the number of available evolutionary techniques and to the number of genomes, which rises every day. The important questions here are: how can we cluster large numbers of chloroplast species, and which common genes play a role in the evolution of these species? Clustering a collection of species aims to find the common genes that share the same functional properties. In other words, clustering helps us to find the core and pan genomes among species that share common properties, such as gene name, gene sequence, family, etc. According to other studies, finding such core and/or pan genomes is not an easy task: it involves a large amount of computation and requires a rigorous methodology. \color{black}
+%Due to the recent evolution of sequencing techniques, the number of
+%available genomes is rising steadily, raising the problem to determine
+%what to do with such large sets of DNA data. An interesting question
+%is to understand what are the common functionality of a collection
+%of species or, conversely, to determine what is specific to a given
+%species when compared to other ones belonging in the same genus, family, etc.
+%Investigating such a problem means to find both core and pan genomes
+%of a collection of species, that is, genes in common to all the species
+%vs. genes present at least once in the set of genomes. However, to obtain
+%trustworthy core and pan genomes is not an easy task, leading to a large
+%amount of computation, and requiring a rigorous methodology.
+%Surprisingly,
+%as far as we know, this methodology in finding core and pan genomes has not really been
+%investigated in detail. This research work tries to fill this gap
+%by focusing only on chloroplastic genomes, whose reasonable sizes allow a deep study.
+%% DNA analysis techniques have received a lot of attention these last
+%% years, because they play an important role in understanding genomes
+%% evolution over time, and in phylogenetic and genetic analyses.
+%% However systematic approaches to determine
+%% %Various
+%%models of genomes evolution are based on the analysis of DNA
+%%sequences, SNPs, mutations, and so on.
+To achieve this goal, a collection of 99 chloroplasts is
+considered in this article. Two methodologies will be
+investigated, respectively based on sequence similarities and
+%genes names taken
+on annotation tools.
+The obtained results will finally be evaluated in terms of performance and
+biological relevance.
+% Various genes prediction methods will be
+% firstly compared, some of them being specific to chloroplastic genomes.
+% Then clustering methods will be proposed and evaluated, in order
+% to group these coding sequences by orthologous genes.
+% % We have recently investigated
+% the use of core (\emph{i.e.}, common genes) and pan genomes to infer
+% evolutionary information on a collection of 99~chloroplasts. In
+% particular, we have regarded methods to build a genes content
+% evolutionary tree using distances to core genome. However, the
+% production of reliable core and pan genomes is not an easy task, due
+% to error annotations. The aim of this methodology article is to
+% % investigate various ways to .
+% We will first compare different approaches to
+% construct such a tree using fully annotated genomes provided by NCBI and
+% DOGMA, followed by a gene quality control among the common genes.
+% Then
+% we will explain how, by comparing sequences from DOGMA with NCBI
+% contents, we achieved to identify the genes that play a key role in
+% the dynamics of genomes evolution.
+
+\textbf{Keywords:} Core genome, Methodology, Pan genome, Genes prediction, Coding sequences clustering, Chloroplasts, Gene quality test.
+\end{abstract}
diff --git a/Paper2/annotated.tex b/Paper2/annotated.tex
new file mode 100644
index 0000000..d865ed6
--- /dev/null
+++ b/Paper2/annotated.tex
@@ -0,0 +1,186 @@
+
+
+%\subsubsection{Using genes names provided by annotation tools}
+%
+%Instead of using the sequences predicted by annotation tools, we can
+%try to use the names associated to these sequences, when available.
+%The basic idea is thus to annotate all the sequences using a given
+%software, and to consider as core gene each sequence whose name can
+%be found in all the genomes.
+%Two annotation techniques will be used in the remainder of this article,
+%namely DOGMA and NCBI.
+%
+%
+%It is true that the NCBI annotations are of varying
+%qualities, and sometimes such annotations are totally erroneous. As stated before, it is due to the
+%large variety of annotation tools that can been used during each
+%sequence submission process. However, we also considered it in this
+%article, as this database contains human-curated annotations. To say this
+%another way, DOGMA automatic annotations are good in average, while
+%NCBI contains very good human-based annotations together with very badly
+%annotated genomes.
+%Let us finally remark that DOGMA also predict the locations of
+%\textit{ribosomal RNA (rRNA)}, while they are not provided in
+%gene features from NCBI. Thus core genomes constructed on NCBI
+%data will not contain rRNA.
+
+We now investigate core and pan genomes design
+using each of the two tools separately, which will constitute the second
+approach detailed in this article.
From now on we will consider annotated +genomes: either ``genes features'' downloaded from the NCBI, or the +result of DOGMA. + +%\subsubsection{Names processing} +% +%As DOGMA is a deterministic annotation tool, when a given gene +%is detected twice in two genomes, the same name will be attached +%to the two coding sequences: DOGMA spells exactly in the same manner +%the two gene names. So each genome is replaced by a list of gene +%names, and finding the core genes common to two genomes simply +%consists in intersecting the two lists of genes. The sole problem +%we have detected using DOGMA on our 97 chloroplasts is the case +%of the RPS12 gene: some genomes contain RPS12\_3end +%or RPS12\_5end in the DOGMA result. We have manually +%considered that all these representatives belong to the same gene, +%namely to RPS12. +% +%Dealing with NCBI names is more complicated, as various annotation +%tools have been used together with human annotations, and because there is +%no spelling rule for gene names. For instance, NAD6 mitochondrial gene is +%sometimes written as ND6, while we can find RPOC1, RPOC1A, and RPOC1B in +%our chloroplasts. So if we simply consider NCBI data without +%treatment, intersecting two genomes provided as list of gene names often +%lead to duplication of misspelled genes. Automatic names homogenization is thus required +%on NCBI annotations, the question being where to draw the line +%on correcting errors in the spelling of genes ? In this second approach, +%we propose to automate only obvious modifications like putting all names +%in capital letters and removing useless symbols as ``\_'', ``('', and ``)''. +%Remark that such simple renaming process cannot tackle with the situations of NAD6 or +%RPOC1 evoked above. To go further in automatic corrections requires +%to use edit distances like the Levenshtein one, however such an use will +%raise false positives (different genes with close names will be homogenized). 
+%To solve this problem, a compromise that reduces the number of false positives, by considering the similarity between DNA sequences of genes having similar names, will be detailed in the third approach. +% +%At this stage, we now consider that each genome is mapped to a list of gene +%names, where names have been homogenized in the NCBI case. + + + +%\subsubsection{Core genes extraction} +% +%% The goal of this stage is to extract maximum core genes from sets of +%% genes. To find core genes, the following methodology is applied. +%% +% +%%\subsubsection{Intersection Core Matrix (\textit{ICM})} +% +%To extract core genes, we iteratively collect the maximum number of +%common genes between genomes, therefore during this stage +%an \textit{Intersection Core Matrix} (ICM) is built. ICM is a two +%dimensional symmetric matrix where each row and each column correspond +%to one genome. Hence, an element of the matrix stores +%the \textit{Intersection Score} (IS): the cardinality of the core +%genes set obtained by intersecting the two genomes. +%%Maximum cardinality results in selecting the two genomes having +%%the maximum score. +%Mathematically speaking, if we have $n$ genomes in +%local database, the ICM is an $n \times n$ matrix whose elements +%satisfy: +%\begin{equation} +%score_{ij}=\vert g_i \cap g_j\vert +%\label{Eq1} +%\end{equation} +%\noindent where $1 \leq i \leq n$, $1 \leq j \leq n$, and $g_i, g_j$ are +%genomes. The generation of a new core genome depends obviously on the +%value of the intersection scores $score_{ij}$. More precisely, the +%idea is to consider a pair of genomes such that their score is the +%largest element in the ICM. These two genomes are then removed from the matrix +%and the resulting new core genome is added for the next iteration. +%The ICM is then updated to take into account the new core genome: new IS +%values are computed for it. This process is repeated until no new core +%genome can be obtained. 
+% +%We can observe that the ICM is relatively large due to the amount of +%species. As a consequence, the computation of the intersection scores is +%both time and memory consuming. However, since ICM is obviously a symmetric +%matrix we can reduce the computation overhead by considering only its +%triangular upper part. The time complexity for this process %after +%%enhancement +%is thus $O(\frac{n.(n-1)}{2})$. Algorithm~\ref{Alg1:ICM} +%illustrates the construction of the ICM matrix and the extraction of +%the core genomes, where \textit{GenomeList} represents the database +%storing all genomes data. At each iteration, this algorithm computes the maximum +%core genome with its two parents (genomes). +% +%% ALGORITHM HAS BEEN REWRITTEN +% +%\begin{algorithm}[H] +%\caption{Extract Maximum Intersection Score} +%\label{Alg1:ICM} +%\begin{algorithmic} +%\REQUIRE $L \leftarrow \text{genomes sets}$ +%\ENSURE $B1 \leftarrow \text{Max Core set}$ +%\FOR{$i \leftarrow 1:len(L)-1$} +% \STATE $score \leftarrow 0$ +% \STATE $core1 \leftarrow set(GenomeList[L[i]])$ +% \STATE $g1 \leftarrow L[i]$ +% \FOR{$j \leftarrow i+1:len(L)$} +% \STATE $core2 \leftarrow set(GenomeList[L[j]])$ +% \STATE $core \leftarrow core1 \cap core2$ +% \IF{$len(core) > score$} +% \STATE $score \leftarrow len(core)$ +% \STATE $g2 \leftarrow L[j]$ +% \ENDIF +% \ENDFOR +% \STATE $B1[score] \leftarrow (g1,g2)$ +%\ENDFOR +%\RETURN $max(B1)$ +%\end{algorithmic} +%\end{algorithm} +% +%For complete core trees based either on NCBI names or on DOGMA ones, (see \url{http://members.femto-st.fr/christophe-guyeux/}). +%%\color{red} The second approach is dependent on gene names spelling. When realizing simple homogenization of names provided by NCBI, we miss core genes which have slightly different name formats. So that, good annotation tool is highly required. 
\color{black} +% +%%\subsection{Features visualization} +%%The last stage of the proposed pipeline is naturally to take advantage +%%of the produced core and pan genomes for biological studies. As +%%this key stage is not directly related to the methodology for core +%%and pan genomes discovery, we will only outline a few tasks that +%%can be operated on the produced data. +%% +%%\begin{figure} +%%\centering +%%\includegraphics[scale=0.215]{tree} +%%\caption{Part of a core genomes evolutionary tree (NCBI gene names)} +%%\label{coreTree} +%%\end{figure} +%% +%%Obtained results may be visualized by building a core genomes evolutionary tree. +%%% All core genes generated represent an important information in the tree, +%%% because they provide ancestor information of two or more +%%% genomes. +%%Each node in this tree represents a chloroplast genome or +%%a predicted core, as depicted in Figure~\ref{coreTree}. In this +%%figure, nodes labels are of the form +%%\textit{(Genes number:Family name\_Scientific name\_Accession number)}, +%%while an edge is labeled with the number of +%%gene loss when compared to its parents (a leaf genome or an intermediate +%%core genome). Such numbers can answer questions like: +%%how many genes are different between two species? Which functionality has +%%been lost between an ancestor and its children ? For complete core trees +%%based either on NCBI names or on DOGMA ones, see supplementary data. +%% +%% +%% +%%A second application of such data is obviously to build accurate phylogenetic +%%trees, using tools like +%%PHYML\cite{guindon2005phyml} or +%%RAxML{\cite{stamatakis2008raxml,stamatakis2005raxml}. +%%Consider a set of species, the last common core genome in the core tree +%%contains all the genes shared in common by these species. These genes may be +%%multi aligned to serve as input of the phylogenetic tools mentioned above. 
+%%An example of such a phylogenetic tree on core 58 (NCBI cores tree, see +%%supplementary data) is provided in Appendix~\ref{philoTree}. Remark that, in +%%order to constitute a relevant outgroup, we have simply blasted each gene +%%of this core on a chosen \emph{Cyanobacteria}. +%% diff --git a/Paper2/appendix.tex b/Paper2/appendix.tex new file mode 100644 index 0000000..9674f1d --- /dev/null +++ b/Paper2/appendix.tex @@ -0,0 +1,118 @@ + + +\subsection{Phylogenetic tree based on NCBI core 58} +\label{philoTree} + +\begin{center} +\includegraphics[scale=0.4]{phylo11} +\end{center} + + +\subsection{Example} + +\begin{table}[h] +\begin{tabular}{cccccccc} +\hline + & & & \multicolumn{3}{c}{Nb. of Genes} & & \\ +Genome & Th & GAM & NCBI & DOGMA & Common & NCBI (\%) & DOGMA (\%) \\ \hline +NC\_000925.1 & 60 & osneedle & 209 & 171 & 125 & 59.81 & 73.1 \\ +NC\_000927.1 & 60 & osneedle & 147 & 130 & 89 & 60.54 & 68.46 \\ +NC\_001319.1 & 60 & osneedle & 89 & 126 & 74 & 83.15 & 58.73 \\ +NC\_001568.1 & 60 & osneedle & 21 & 51 & 20 & 95.24 & 39.22 \\ +NC\_001603.2 & 60 & osneedle & 67 & 59 & 33 & 49.25 & 55.93 \\ +NC\_001666.2 & 60 & osneedle & 105 & 118 & 74 & 70.48 & 62.71 \\ +NC\_001713.1 & 60 & osneedle & 138 & 155 & 119 & 86.23 & 76.77 \\ +NC\_001840.1 & 60 & osneedle & 197 & 106 & 72 & 36.55 & 67.92 \\ +NC\_002186.1 & 60 & osneedle & 105 & 141 & 96 & 91.43 & 68.09 \\ +NC\_003386.1 & 60 & osneedle & 95 & 124 & 77 & 81.05 & 62.1 \\ +NC\_004543.1 & 60 & osneedle & 88 & 115 & 75 & 85.23 & 65.22 \\ +NC\_005086.1 & 60 & osneedle & 79 & 119 & 74 & 93.67 & 62.18 \\ +NC\_005087.1 & 60 & osneedle & 85 & 122 & 78 & 91.76 & 63.93 \\ +NC\_005353.1 & 60 & osneedle & 66 & 73 & 46 & 69.7 & 63.01 \\ +NC\_006050.1 & 60 & osneedle & 79 & 120 & 77 & 97.47 & 64.17 \\ +NC\_006137.1 & 60 & osneedle & 203 & 227 & 188 & 92.61 & 82.82 \\ +NC\_006290.1 & 60 & osneedle & 79 & 120 & 77 & 97.47 & 64.17 \\ +NC\_006861.1 & 60 & osneedle & 86 & 124 & 84 & 97.67 & 67.74 \\ +NC\_007288.1 & 60 & 
osneedle & 119 & 91 & 61 & 51.26 & 67.03 \\ +NC\_007578.1 & 60 & osneedle & 78 & 118 & 76 & 97.44 & 64.41 \\ +NC\_007898.3 & 60 & osneedle & 80 & 120 & 78 & 97.5 & 65 \\ +NC\_007957.1 & 60 & osneedle & 79 & 120 & 77 & 97.47 & 64.17 \\ +NC\_007977.1 & 60 & osneedle & 79 & 118 & 77 & 97.47 & 65.25 \\ +NC\_008097.1 & 60 & osneedle & 104 & 109 & 69 & 66.35 & 63.3 \\ +NC\_008099.1 & 60 & osneedle & 83 & 89 & 57 & 68.67 & 64.04 \\ +NC\_008114.1 & 60 & osneedle & 103 & 81 & 52 & 50.49 & 64.2 \\ +NC\_008289.1 & 60 & osneedle & 60 & 68 & 42 & 70 & 61.76 \\ +NC\_008325.1 & 60 & osneedle & 79 & 119 & 77 & 97.47 & 64.71 \\ +NC\_008336.1 & 60 & osneedle & 79 & 121 & 77 & 97.47 & 63.64 \\ +NC\_008359.1 & 60 & osneedle & 78 & 119 & 76 & 97.44 & 63.87 \\ +NC\_008372.1 & 60 & osneedle & 79 & 73 & 45 & 56.96 & 61.64 \\ +NC\_008407.1 & 60 & osneedle & 78 & 116 & 73 & 93.59 & 62.93 \\ +NC\_008456.1 & 60 & osneedle & 79 & 121 & 77 & 97.47 & 63.64 \\ +NC\_008457.1 & 60 & osneedle & 79 & 120 & 77 & 97.47 & 64.17 \\ +NC\_008535.1 & 60 & osneedle & 79 & 121 & 77 & 97.47 & 63.64 \\ +NC\_008588.1 & 60 & osneedle & 131 & 160 & 129 & 98.47 & 80.62 \\ +NC\_008796.1 & 60 & osneedle & 78 & 119 & 75 & 96.15 & 63.03 \\ +NC\_008822.1 & 60 & osneedle & 113 & 131 & 91 & 80.53 & 69.47 \\ +NC\_008829.1 & 60 & osneedle & 85 & 119 & 79 & 92.94 & 66.39 \\ +NC\_009143.1 & 60 & osneedle & 91 & 117 & 73 & 80.22 & 62.39 \\ +NC\_009598.1 & 60 & osneedle & 79 & 121 & 77 & 97.47 & 63.64 \\ +NC\_009599.1 & 60 & osneedle & 79 & 119 & 77 & 97.47 & 64.71 \\ +NC\_009600.1 & 60 & osneedle & 79 & 121 & 77 & 97.47 & 63.64 \\ +NC\_009601.1 & 60 & osneedle & 78 & 119 & 76 & 97.44 & 63.87 \\ +NC\_009618.1 & 60 & osneedle & 118 & 125 & 80 & 67.8 & 64 \\ +NC\_009765.1 & 60 & osneedle & 59 & 86 & 52 & 88.14 & 60.47 \\ +NC\_009808.1 & 60 & osneedle & 78 & 118 & 75 & 96.15 & 63.56 \\ +NC\_010361.1 & 60 & osneedle & 78 & 120 & 76 & 97.44 & 63.33 \\ +NC\_010433.1 & 60 & osneedle & 78 & 118 & 76 & 97.44 & 64.41 \\ +NC\_010442.1 & 
60 & osneedle & 74 & 112 & 66 & 89.19 & 58.93 \\ +NC\_010772.1 & 60 & osneedle & 143 & 112 & 80 & 55.94 & 71.43 \\ +NC\_011031.1 & 60 & osneedle & 83 & 78 & 46 & 55.42 & 58.97 \\ +NC\_011600.1 & 60 & osneedle & 139 & 108 & 74 & 53.24 & 68.52 \\ +NC\_011942.1 & 60 & osneedle & 63 & 88 & 46 & 73.02 & 52.27 \\ +NC\_012097.1 & 60 & osneedle & 68 & 71 & 44 & 64.71 & 61.97 \\ +NC\_012099.1 & 60 & osneedle & 87 & 95 & 64 & 73.56 & 67.37 \\ +NC\_012568.1 & 60 & osneedle & 27 & 29 & 0 & 0 & 0 \\ +NC\_012903.1 & 60 & osneedle & 110 & 98 & 64 & 58.18 & 65.31 \\ +NC\_013707.2 & 60 & osneedle & 78 & 121 & 75 & 96.15 & 61.98 \\ +NC\_013823.1 & 60 & osneedle & 79 & 121 & 77 & 97.47 & 63.64 \\ +NC\_013991.2 & 60 & osneedle & 80 & 121 & 78 & 97.5 & 64.46 \\ +NC\_014267.1 & 60 & osneedle & 139 & 167 & 136 & 97.84 & 81.44 \\ +NC\_014287.1 & 60 & osneedle & 127 & 156 & 125 & 98.43 & 80.13 \\ +NC\_014346.1 & 60 & osneedle & 74 & 73 & 43 & 58.11 & 58.9 \\ +NC\_014348.1 & 60 & osneedle & 84 & 117 & 77 & 91.67 & 65.81 \\ +NC\_014570.1 & 60 & osneedle & 69 & 120 & 67 & 97.1 & 55.83 \\ +NC\_014674.1 & 60 & osneedle & 77 & 120 & 75 & 97.4 & 62.5 \\ +NC\_014675.1 & 60 & osneedle & 82 & 115 & 74 & 90.24 & 64.35 \\ +NC\_014676.2 & 60 & osneedle & 76 & 119 & 75 & 98.68 & 63.03 \\ +NC\_014699.1 & 60 & osneedle & 84 & 115 & 73 & 86.9 & 63.48 \\ +NC\_014808.1 & 60 & osneedle & 142 & 152 & 124 & 87.32 & 81.58 \\ +NC\_015403.1 & 60 & osneedle & 134 & 154 & 122 & 91.04 & 79.22 \\ +NC\_015645.1 & 60 & osneedle & 77 & 74 & 45 & 58.44 & 60.81 \\ +NC\_015830.1 & 60 & osneedle & 77 & 118 & 75 & 97.4 & 63.56 \\ +NC\_015899.1 & 60 & osneedle & 77 & 120 & 76 & 98.7 & 63.33 \\ +NC\_016058.1 & 60 & osneedle & 71 & 115 & 70 & 98.59 & 60.87 \\ +NC\_016063.1 & 60 & osneedle & 82 & 120 & 76 & 92.68 & 63.33 \\ +NC\_016065.1 & 60 & osneedle & 83 & 117 & 75 & 90.36 & 64.1 \\ +NC\_016068.1 & 60 & osneedle & 103 & 120 & 77 & 74.76 & 64.17 \\ +NC\_016069.1 & 60 & osneedle & 69 & 116 & 69 & 100 & 59.48 \\ +NC\_016433.2 & 
60 & osneedle & 80 & 120 & 78 & 97.5 & 65 \\ +NC\_016468.1 & 60 & osneedle & 78 & 120 & 77 & 98.72 & 64.17 \\ +NC\_016670.1 & 60 & osneedle & 77 & 118 & 75 & 97.4 & 63.56 \\ +NC\_016727.1 & 60 & osneedle & 77 & 119 & 75 & 97.4 & 63.03 \\ +NC\_016731.1 & 60 & osneedle & 128 & 149 & 116 & 90.62 & 77.85 \\ +NC\_016732.1 & 60 & osneedle & 79 & 71 & 45 & 56.96 & 63.38 \\ +NC\_016733.1 & 60 & osneedle & 79 & 82 & 52 & 65.82 & 63.41 \\ +NC\_016734.1 & 60 & osneedle & 79 & 117 & 77 & 97.47 & 65.81 \\ +NC\_016735.1 & 60 & osneedle & 139 & 96 & 65 & 46.76 & 67.71 \\ +NC\_016736.1 & 60 & osneedle & 78 & 120 & 76 & 97.44 & 63.33 \\ +NC\_016753.1 & 60 & osneedle & 79 & 121 & 77 & 97.47 & 63.64 \\ +NC\_016986.1 & 60 & osneedle & 82 & 123 & 80 & 97.56 & 65.04 \\ +NC\_017006.1 & 60 & osneedle & 87 & 119 & 73 & 83.91 & 61.34 \\ +NC\_017609.1 & 60 & osneedle & 67 & 115 & 65 & 97.01 & 56.52 \\ +NC\_018357.1 & 60 & osneedle & 78 & 121 & 76 & 97.44 & 62.81 \\ +NC\_018523.1 & 60 & osneedle & 139 & 107 & 74 & 53.24 & 69.16 \\ +NC\_019601.1 & 60 & osneedle & 78 & 120 & 76 & 97.44 & 63.33 \\ +NC\_020014.1 & 60 & osneedle & 118 & 86 & 51 & 43.22 & 59.3 \\ +NC\_020018.1 & 60 & osneedle & 62 & 67 & 42 & 67.74 & 62.69 \\ \hline +\end{tabular} +\end{table} diff --git a/Paper2/biblio.bib b/Paper2/biblio.bib new file mode 100644 index 0000000..8130739 --- /dev/null +++ b/Paper2/biblio.bib @@ -0,0 +1,353 @@ +@article{Alkindy2014, +author = {Alkindy B. 
\emph{et al}}, +title = {Find Core-Genes for Chloroplasts}, +volume = {}, +number = {}, +pages = {}, +year = {2014}, +doi = {}, +URL = {}, +journal = {} +} + +@article{Sayers01012011, +author = {Sayers \emph{et al}}, +title = {Database resources of the National Center for Biotechnology Information}, +volume = {39}, +number = {suppl 1}, +pages = {D38-D51}, +year = {2011}, +doi = {10.1093/nar/gkq1172}, +URL = {http://nar.oxfordjournals.org/content/39/suppl_1/D38.abstract}, +eprint = {http://nar.oxfordjournals.org/content/39/suppl_1/D38.full.pdf+html}, +journal = {Nucleic Acids Research} +} + +@misc{acgs13:onp, +inhal = {no}, +domainehal = {INFO:INFO_DC, INFO:INFO_CR, INFO:INFO_MO}, +equipe = {and}, +classement = {COM}, +author = {Alkindy, Bassam and Couchot, Jean-Fran\c{c}ois and Guyeux, Christophe and Salomon, Michel}, +title = {Finding the core-genes of Chloroplast Species}, +howpublished = {Journ\'ees SeqBio 2013, Montpellier}, +month = nov, +year = 2013, + +} + +@article{Rice2000, + added-at = {2011-12-21T01:05:11.000+0100}, + author = {Rice, P. and Longden, I. and Bleasby, A.}, + biburl = {http://www.bibsonomy.org/bibtex/28f45367da937b4116be3db30538af20f/fairybasslet}, + interhash = {5d90f52b0e7101e8ee9c1cfb5eb3237b}, + intrahash = {8f45367da937b4116be3db30538af20f}, + journal = {Trends Genet}, + keywords = {imported}, + number = 6, + pages = {276-7}, + timestamp = {2011-12-21T01:05:11.000+0100}, + title = {EMBOSS: the European Molecular Biology Open Software Suite}, + volume = 16, + year = 2000 +} + + + +@Article{RDogma, +AUTHOR = {Stacia K. Wyman, Robert K. Jansen and Jeffrey L. 
Boore}, +TITLE = {Automatic annotation of organellar genomes +with DOGMA}, +JOURNAL = {Bioinformatics}, +PUBLISHER = {Oxford Univ Press}, +VOLUME = {20}, +YEAR = {2004}, +NUMBER = {17}, +PAGES = {3252-3255}, +URL={http://www.biosci.utexas.edu/ib/faculty/jansen/pubs/Wyman%20et%20al.%202004.pdf}, +} + +@article{SMMR+13, +title={Genomic analysis of smooth tubercle bacilli provides insights into ancestry and pathoadaptation of Mycobacterium tuberculosis}, +url={http://www.nature.com/ng/journal/v45/n2/full/ng.2517.html}, +DOI={10.1038/ng.2517}, +volume={45}, +number={2}, +journal={Nature Genetics}, +author={Philip Supply and Michael Marceau and Sophie Mangenot and David Roche and Carine Rouanet and Varun Khanna and Laleh Majlessi and Alexis Criscuolo and Julien Tap and Alexandre Pawlik}, +year={2013}, + pages={172–179}} + +@article{CGOT10, +title={Yeast Ancestral Genome Reconstructions: The Possibilities of Computational Methods II}, +author={Cedric Chauve and Haris Gavranovic and Aida Ouangraoua and Eric Tannier}, +journal={Journal of Computational Biology}, +month=sep, +year={2010}, +volume=17, +number=9, +pages={1097--1112}, +DOI={10.1089/cmb.2010.0092} +} + +@article{Eisen2007, + author = {Eisen, Jonathan A}, + journal = {PLoS Biol}, + publisher = {Public Library of Science}, + title = {Environmental Shotgun Sequencing: Its Potential and Challenges for Studying the Hidden World of Microbes}, + year = {2007}, + month = {03}, + volume = {5}, + url = {http://dx.doi.org/10.1371%2Fjournal.pbio.0050082}, + pages = {e82}, + abstract = { +

Environmental shotgun sequencing promises to reveal novel and fundamental insights into the hidden world of microbes, but the complexity of analysis required to realize this potential poses unique interdisciplinary challenges.

+ }, + number = {3}, + doi = {10.1371/journal.pbio.0050082} +} + +@article{de2002comparative, + title={Comparative analysis of chloroplast genomes: functional annotation, genome-based phylogeny, and deduced evolutionary patterns}, + author={De Las Rivas, Javier and Lozano, Juan Jose and Ortiz, Angel R}, + journal={Genome research}, + volume={12}, + number={4}, + pages={567--583}, + year={2002}, + publisher={Cold Spring Harbor Lab} +} + +@article{liu2012cpgavas, + title={CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences}, + author={Zhang \emph{et al}}, + %Liu, Chang and Shi, Linchun and Zhu, Yingjie and Chen, Haimei and Zhang, Jianhui %and Lin, Xiaohan and Guan, Xiaojun}, + journal={BMC genomics}, + volume={13}, + number={1}, + pages={715}, + year={2012}, + publisher={BioMed Central Ltd} +} + +@article{parra2007cegma, + title={CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes}, + author={Parra, Genis and Bradnam, Keith and Korf, Ian}, + journal={Bioinformatics}, + volume={23}, + number={9}, + pages={1061--1067}, + year={2007}, + publisher={Oxford Univ Press} +} + +@article{parra2000geneid, + title={Geneid in drosophila}, + author={Parra, Gen{\'\i}s and Blanco, Enrique and Guig{\'o}, Roderic}, + journal={Genome research}, + volume={10}, + number={4}, + pages={511--515}, + year={2000}, + publisher={Cold Spring Harbor Lab} +} + +@article{birney2004genewise, + title={GeneWise and genomewise}, + author={Birney, Ewan and Clamp, Michele and Durbin, Richard}, + journal={Genome research}, + volume={14}, + number={5}, + pages={988--995}, + year={2004}, + publisher={Cold Spring Harbor Lab} +} + +@article{apweiler1985swiss, + title={SWISS-PROT AND ITS COMPUTER-ANNOTATED SUPPLEMENT TREMBL: HOW TO PRODUCE HIGH QUALITY AUTOMATIC ANNOTATION}, + author={Apweiler, Rolf and O’Donovan, Claire and Martin, Maria Jesus and Fleischmann, Wolfgang and Hermjakob, 
Henning and Moeller, Steffen and Contrino, Sergio and Junker, Vivien}, + journal={EUR. J. BIOCHEM}, + volume={147}, + pages={9--15}, + year={1985}, + url={http://www.ebi.ac.uk/ena/} +} + +@article{sugawara2008ddbj, + title={DDBJ with new system and face}, + author={Sugawara, Hideaki and Ogasawara, Osamu and Okubo, Kousaku and Gojobori, Takashi and Tateno, Yoshio}, + journal={Nucleic acids research}, + volume={36}, + number={suppl 1}, + pages={D22--D24}, + year={2008}, + publisher={Oxford Univ Press} +} + +@article{chapman2000biopython, + title={Biopython: Python tools for computational biology}, + author={Chapman, Brad and Chang, Jeffrey}, + journal={ACM SIGBIO Newsletter}, + volume={20}, + number={2}, + pages={15--19}, + year={2000}, + publisher={ACM} +} + +@incollection{FI09, +year={2009}, +isbn={978-3-642-04743-5}, +booktitle={Comparative Genomics}, +volume={5817}, +series={Lecture Notes in Computer Science}, +editor={Ciccarelli, FrancescaD. and Miklós, István}, +doi={10.1007/978-3-642-04744-2_1}, +title={Yeast Ancestral Genome Reconstructions: The Possibilities of Computational Methods}, +url={http://dx.doi.org/10.1007/978-3-642-04744-2_1}, +publisher={Springer Berlin Heidelberg}, +author={Tannier, Eric}, +pages={1-12} +} + +@article{gansner2002drawing, + title={Drawing graphs with dot}, + author={Gansner, Emden and Koutsofios, Eleftherios and North, Stephen}, + journal={Retrieved June}, + volume={13}, + pages={2005}, + year={2002} +} + +@article{10.1371/journal.pone.0052841, + author ={Blouin \emph{et al}}, + %{Blouin, Yann AND Hauck, Yolande AND Soler,Charles AND Fabre, Michel AND Vong, Rithy ANDDehan, Céline AND Cazajous, Géraldine ANDMassoure, Pierre-Laurent AND Kraemer, PhilippeANDJenkins, Akinbowale AND Garnotel, EricAND Pourcel, Christine AND Vergnaud, Gilles} + journal = {PLoS ONE}, + publisher = {Public Library of Science}, + title = {Significance of the Identification in the Horn of Africa of an Exceptionally Deep Branching \textit{Mycobacterium 
tuberculosis} Clade}, + year = {2012}, + month = {12}, + volume = {7}, + url = {http://dx.doi.org/10.1371%2Fjournal.pone.0052841}, + pages = {e52841}, + number = {12}, + doi = {10.1371/journal.pone.0052841} +} + + + +@article{zafar2002coregenes, + title={CoreGenes: A computational tool for identifying and cataloging}, + author={Zafar, Nikhat and Mazumder, Raja and Seto, Donald}, + journal={BMC bioinformatics}, + volume={3}, + number={1}, + pages={12}, + year={2002}, + publisher={BioMed Central Ltd} +} + +@Article{17623808, +AUTHOR = {Gomez-Valero, Laura and Rocha, Eduardo P C and Latorre, Amparo and Silva, Francisco J}, +TITLE = {Reconstructing the ancestor of Mycobacterium leprae: the dynamics of gene loss and genome reduction.}, +JOURNAL = {Genome Res}, +VOLUME = {17}, +YEAR = {2007}, +NUMBER = {8}, +PAGES = {1178-85}, +URL = {http://www.biomedsearch.com/nih/Reconstructing-ancestor-Mycobacterium-leprae-dynamics/17623808.html}, +PubMedID = {17623808}, +ISSN = {1088-9051} +} + +@article{guindon2005phyml, + title={PHYML Online—a web server for fast maximum likelihood-based phylogenetic inference}, + author={Guindon, Stephane and Lethiec, Franck and Duroux, Patrice and Gascuel, Olivier}, + journal={Nucleic acids research}, + volume={33}, + number={suppl 2}, + pages={W557--W559}, + year={2005}, + publisher={Oxford Univ Press} +} + +@article{goloboff2008tnt, + title={TNT, a free program for phylogenetic analysis}, + author={Goloboff, Pablo A and Farris, James S and Nixon, Kevin C}, + journal={Cladistics}, + volume={24}, + number={5}, + pages={774--786}, + year={2008}, + publisher={Wiley Online Library} +} + +@article{stamatakis2008raxml, + title={The RAxML 7.0. 4 Manual}, + author={Stamatakis, Alexandros}, + journal={Department of Computer Science. 
Ludwig-Maximilians-Universit{\"a}t M{\"u}nchen}, + year={2008} +} +@article{stamatakis2005raxml, + title={RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees}, + author={Stamatakis, Alexandros and Ludwig, Thomas and Meier, Harald}, + journal={Bioinformatics}, + volume={21}, + number={4}, + pages={456--463}, + year={2005}, + publisher={Oxford Univ Press} +} + + +@article{Bakke2009, + author = {Bakke \emph{et al}}, + journal = {PLoS ONE}, + publisher = {Public Library of Science}, + title = {Evaluation of Three Automated Genome Annotations for \textit{Halorhabdus utahensis}}, + year = {2009}, + month = {07}, + volume = {4}, + url = {http://dx.doi.org/10.1371%2Fjournal.pone.0006291}, + pages = {e6291}, + number = {7}, + doi = {10.1371/journal.pone.0006291} +} + +@article{altschul1990basic, + title={Basic local alignment search tool}, + author={Altschul, Stephen F and Gish, Warren and Miller, Webb and Myers, Eugene W and Lipman, David J}, + journal={Journal of molecular biology}, + volume={215}, + number={3}, + pages={403--410}, + year={1990}, + publisher={Elsevier} +} + +@article{geneVision, + title={DNASTAR- GenVision Software for Genomic Visualizations}, + author={DNASTAR}, + url = {http://www.dnastar.com/products/genvision.php} +} + +@article{mcfadden2001primary, + title={Primary and secondary endosymbiosis and the origin of plastids}, + author={McFadden, Geoffrey Ian}, + journal={Journal of Phycology}, + volume={37}, + number={6}, + pages={951--959}, + year={2001}, + publisher={Wiley Online Library} +} + +@article{li2013complete, + title={Complete Chloroplast Genome Sequence of Holoparasite Cistanche deserticola (Orobanchaceae) Reveals Gene Loss and Horizontal Gene Transfer from Its Host Haloxylon ammodendron (Chenopodiaceae)}, + author={Li \emph{et al}}, + journal={PloS one}, + volume={8}, + number={3}, + pages={e58747}, + year={2013}, + publisher={Public Library of Science} +} diff --git a/Paper2/classEquiv.tex 
b/Paper2/classEquiv.tex new file mode 100644 index 0000000..e9bf829 --- /dev/null +++ b/Paper2/classEquiv.tex @@ -0,0 +1,216 @@ +\color{red} +We still need a sound methodology for predicting core genes that reflect the natural biological relationships among species. Proposing new methods and comparing them with previous ones indicates which method yields functionally meaningful genes. In this section, we briefly recall the similarity-based method (see \cite{Alkindy2014} for further details). +This method considers annotated genomes from NCBI and DOGMA and uses a distance-based similarity measure on the coding sequences of genes. Such an approach requires annotated genomes, like the ones provided by the NCBI website. + +\subsubsection{Theoretical presentation} + +We start this brief review of the first method with the following preliminary definition~\cite{acgs13:onp,Alkindy2014}.\color{black} + +\begin{definition} +\label{def1} +Let $A=\{A,T,C,G\}$ be the nucleotide alphabet, and $A^\ast$ be the +set of finite words on $A$ (\emph{i.e.}, of DNA sequences). Let +$d:A^{\ast}\times A^{\ast}\rightarrow[0,1]$ be a function called a +similarity measure on +$A^{\ast}$. Consider a given value $T\in[0,1]$ called a threshold. For +all $x,y\in A^{\ast}$, we will say that $x\sim_{d,T}y$ if +$d(x,y)\leqslant T$. +\end{definition} + +%\noindent $\sim_{d,T}$ is obviously an equivalence relation and when $d=1-\Delta$, where $\Delta$ is the similarity scoring function embedded into the emboss package , we will simply denote $\sim_{d,0.1}$ by $\sim$. + +Let a \emph{similarity} threshold $T$ and a \emph{similarity measure} $d$ be given. +The method begins by building an undirected graph +between all the DNA~sequences $g$ of the set of genomes as follows: +there is an edge between $g_{i}$ and $g_{j}$ +if $g_i \sim_{d,T} g_j$ holds. +This graph is hereafter called the ``similarity'' graph. 
+We thus say that two coding sequences +$g_i$, $g_j$ are +equivalent with respect to the relation $\mathcal{R}$ if both $g_i$ and +$g_j$ belong to the same +connected component (CC) of this similarity graph, \textit{i.e.}, if there is a path between $g_i$ +and $g_j$ in the graph. + +It is not hard to see that this relation is an +equivalence relation whereas $\sim_{d,T}$ is not, since the latter need not be transitive. +Any class for this relation is called a ``gene'' +in this article, and its representatives +(DNA~sequences) are the ``alleles'' of this gene, this abuse of language +being adopted to fix ideas. Thus this first +method produces for each genome $G$, which is a set +$\left\{g_{1}^G,\ldots,g_{m_G}^G\right\}$ of $m_{G}$ DNA coding +sequences, the projection of each sequence according to $\pi$, where +$\pi$ maps each sequence to its gene (class) according to $\mathcal{R}$. In +other words, a genome $G$ is mapped to +$\left\{\pi(g_{1}^G),\ldots,\pi(g_{m_G}^G)\right\}$. Note that a +projected genome has no duplicated gene since it is a set. + +Consequently, the core genome (resp., the pan genome) of two genomes +$G_{1}$ and $G_{2}$ is defined as the intersection (resp., as the +union) of their projected genomes. We finally consider the intersection +of all the projected genomes, which is the set of all the genes +$\dot{x}$ such that each genome has at least one allele in +$\dot{x}$. This set constitutes the core genome of the whole set of species +under consideration. The pan genome is computed similarly as the union of all +the projected genomes. + +% \begin{figure} +% \begin{center} +% \includegraphics[scale=0.5]{stats.png} +% \end{center} +% \caption{Size of core and pan genomes w.r.t. the similarity threshold, first approach with NCBI annotations and Needleman-Wunsch similarity measure.}\label{Fig:sim:core:pan} +% \end{figure} +% + + +% \begin{table}[h!] 
+% \centering +% \begin{tabular}{ccccccc} +% \hline +% & \multicolumn{4}{c}{Method 1} & \multicolumn{2}{c}{Method 3} \\ \hline +% & \multicolumn{2}{c}{NCBI} & \multicolumn{2}{c}{DOGMA} & \multicolumn{2}{c}{NCBI and DOGMA} \\ \hline +% Threshold & core & pan & core & pan & core & pan \\ \hline +% 50 & 1 & 163 & 1 & 118 & \textbf{5} & \textbf{245} \\ +% 51 & 1 & 291 & 1 & 194 & - & - \\ +% 52 & 1 & 412 & 1 & 258 & - & - \\ +% 53 & 1 & 508 & 1 & 321 & - & - \\ +% 54 & 4 & 617 & 2 & 372 & - & - \\ +% 55 & \textbf{5} & \textbf{692} & 2 & 409 & - & - \\ +% 56 & \textbf{5} & \textbf{761} & \textbf{3} & \textbf{445} & - & - \\ +% 57 & 4 & 832 & \textbf{3} & \textbf{459} & - & - \\ +% 58 & 4 & 905 & 2 & 477 & - & - \\ +% 59 & 4 & 976 & 2 & 497 & - & - \\ +% 60 & 2 & 1032 & 2 & 519 & 4 & 242 \\ +% 61 & 2 & 1113 & 2 & 553 & - & - \\ +% 62 & 2 & 1186 & 2 & 580 & - & - \\ +% 63 & 2 & 1264 & 2 & 607 & - & - \\ +% 64 & 2 & 1352 & 2 & 644 & - & - \\ +% 65 & 1 & 1454 & 2 & 685 & - & - \\ +% 66 & 1 & 1544 & 1 & 756 & - & - \\ +% 67 & 0 & 1652 & 1 & 838 & - & - \\ +% 68 & 0 & 1775 & 1 & 912 & - & - \\ +% 69 & 0 & 1886 & 1 & 1007 & - & - \\ +% 70 & 0 & 2000 & 1 & 1116 & 3 & 242 \\ +% 80 & 0 & 3541 & 0 & 2730 & 1 & 242 \\ +% 90 & 0 & 5703 & 0 & 5181 & 0 & 241 \\ \hline +% \end{tabular} +% \caption{Size of core and pan genomes w.r.t. the similarity threshold, first and third approaches.} +% \label{Fig:sim:core:pan} +% \end{table} +% +\begin{table}[h!] 
+\centering +\begin{tabular}{ccccccc} +\hline + & \multicolumn{4}{c}{Method 1} & \multicolumn{2}{c}{Method 2} \\ +\hline + & \multicolumn{2}{c}{NCBI} & \multicolumn{2}{c}{DOGMA} & \multicolumn{2}{c}{NCBI and DOGMA} \\ \hline +Threshold & core & pan & core & pan & core & pan \\ +\hline +50 & 1 & 163 & 1 & 118 & \textbf{5} & \textbf{245} \\ +51 & 1 & 291 & 1 & 194 & - & - \\ +52 & 1 & 412 & 1 & 258 & - & - \\ +53 & 1 & 508 & 1 & 321 & - & - \\ +54 & 4 & 617 & 2 & 372 & - & - \\ +55 & \textbf{5} & \textbf{692} & 2 & 409 & - & - \\ +56 & \textbf{5} & \textbf{761} & \textbf{3} & \textbf{445} & - & - \\ +57 & 4 & 832 & \textbf{3} & \textbf{459} & - & - \\ +58 & 4 & 905 & 2 & 477 & - & - \\ +59 & 4 & 976 & 2 & 497 & - & - \\ +60 & 2 & 1032 & 2 & 519 & 4 & 242 \\ +61 & 2 & 1113 & 2 & 553 & - & - \\ +62 & 2 & 1186 & 2 & 580 & - & - \\ +63 & 2 & 1264 & 2 & 607 & - & - \\ +64 & 2 & 1352 & 2 & 644 & - & - \\ +65 & 1 & 1454 & 2 & 685 & - & - \\ +66 & 1 & 1544 & 1 & 756 & - & - \\ +67 & 0 & 1652 & 1 & 838 & - & - \\ +68 & 0 & 1775 & 1 & 912 & - & - \\ +69 & 0 & 1886 & 1 & 1007 & - & - \\ +70 & 0 & 2000 & 1 & 1116 & 3 & 242 \\ +80 & 0 & 3541 & 0 & 2730 & 1 & 242 \\ +90 & 0 & 5703 & 0 & 5181 & 0 & 241 \\ +\hline +\end{tabular} +\caption{Size of core and pan genomes w.r.t. the similarity threshold, first and second +approaches.} +\label{Fig:sim:core:pan} +\end{table} + +\subsubsection{Case study}% using NCBI annotations} + +Let us now consider the 99 chloroplastic genomes introduced earlier. +In this case study, we use either the coding sequences +downloaded from the NCBI website or the sequences predicted by DOGMA. +DOGMA, which stands for \textit{Dual Organellar GenoMe Annotator}, has +already been mentioned in this article. This is a + tool developed in 2004 at the University of Texas for annotating plant +chloroplast and animal mitochondrial genomes. 
This tool +translates a genome in all six reading frames and then +queries its own amino acid sequence database using +Blast (blastx~\cite{altschul1990basic}) with various ad hoc +parameters. The choice of DOGMA is natural, as this annotation tool is +well regarded and specific to chloroplasts. + +Each genome thus consists +of a list of coding sequences. In this illustrative study, +we have evaluated the similarity between two sequences by +using a global alignment. More precisely, the measure $d$ +introduced above is the similarity score provided after +a Needleman-Wunsch global alignment, as obtained by running +the \emph{needle} command from the \emph{emboss} package +released by EMBL~\cite{Rice2000}. Parameters of the \emph{needle} +command are the default ones: 10.0 for gap open penalty and 0.5 +for gap extension. + +The numbers of genes in the core genome and in the pan genome, +according to this first method with the data and measure described above, +were computed using the supercomputer facilities of the M\'esocentre +de calcul de Franche-Comt\'e. The results are +reported in Table~\ref{Fig:sim:core:pan} with respect to various +threshold values on Needleman-Wunsch similarity scores. +Note that when the threshold is large, +we obtain more connected components, but with small sizes (a large number +of genes, each with only a few alleles). In other words, +%the number +%of alleles of one gene is naturally small if the threshold is large. +%Similarly, +when the threshold is large, the +pan genome is large too. %However due to the construction method of the +%core genome, this set of genes has few elements in such a situation. +Whatever the chosen annotation tool, this first approach produces +core genomes that are too small, for any chosen similarity threshold, compared +to what is usually expected by biologists. +% regarding these +%chloroplasts and their commonly accepted evolutionary scenario. 
+For NCBI, this is most likely due to incorrectly determined start and stop +codons in some annotated genomes, a consequence of the wide variety of annotation +tools used when genomes are submitted to the NCBI server, some of them +being old or deficient: such truncated +genes will not produce a large similarity score with their orthologous genes +present in other genomes. The case of DOGMA is more +difficult to explain as, according to our experiments and to the +state of the art, this gene prediction tool normally produces good +results on average. The most plausible explanation of this under-performance +is that a few genomes are very specific and far from the remaining ones in terms +of gene content, which leads to a small number of genes in the global core +genome. However, this first approach cannot help us to determine +which genomes must be removed from our data set. To do so, we +need to introduce a second approach based on gene names: from the +problematic gene names, we will be able to trace back to the +problematic genomes. + +% +% This first illustration emphasizes the importance to deal with +% well-predicted coding sequences: gene prediction must be +% achieved with the same tool on each genome, and this tool must +% be well chosen among the state of the art ones. As we will use +% good annotation tools, a natural idea is to take advantage to +% the whole produced annotations (not only the predicted coding +% sequence, but its name and location too for instance). The implementation +% of this idea is the principle of the approaches for finding core +% and pan genomes detailed below. +% % We are then left with the following questions: how can +% we improve the confidence put in the produced core? That is, how to obtain +% such a core genome without considering coding sequences provided by the NCBI? 
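+
+As a purely illustrative sketch of the construction recalled in the
+theoretical presentation above, consider the following toy example. The
+sequences $g_1,\ldots,g_4$, the genomes $G_1$, $G_2$, $G_3$, the distances,
+and the threshold $T=0.1$ are hypothetical: they are not taken from our data
+set and only serve to fix ideas. Assume that
+\[
+d(g_1,g_2)=0.05\leqslant T,\qquad
+d(g_2,g_3)=0.08\leqslant T,\qquad
+d(g_1,g_3)=0.2>T,
+\]
+and that $d(g_4,g_i)>T$ for $i\in\{1,2,3\}$.
+Then $g_1\sim_{d,T}g_2$ and $g_2\sim_{d,T}g_3$ hold, but
+$g_1\sim_{d,T}g_3$ does not: $\sim_{d,T}$ is not transitive. In the
+similarity graph, however, the path from $g_1$ to $g_3$ through $g_2$ puts
+these three sequences in the same connected component, so that they are
+equivalent with respect to $\mathcal{R}$ and are the alleles of one gene
+$\dot{x}=\{g_1,g_2,g_3\}$, while $g_4$ alone forms a second gene
+$\dot{y}=\{g_4\}$. With $G_1=\{g_1,g_4\}$, $G_2=\{g_2\}$, and $G_3=\{g_3\}$,
+the projected genomes are
+\[
+\pi(G_1)=\{\dot{x},\dot{y}\},\qquad
+\pi(G_2)=\{\dot{x}\},\qquad
+\pi(G_3)=\{\dot{x}\},
+\]
+so that, in this toy setting, the core genome is
+$\pi(G_1)\cap\pi(G_2)\cap\pi(G_3)=\{\dot{x}\}$ and the pan genome is
+$\pi(G_1)\cup\pi(G_2)\cup\pi(G_3)=\{\dot{x},\dot{y}\}$.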
diff --git a/Paper2/conclusion.tex b/Paper2/conclusion.tex new file mode 100644 index 0000000..856cacd --- /dev/null +++ b/Paper2/conclusion.tex @@ -0,0 +1,42 @@ +In this research work, we studied two %three +methodologies for extracting core genes from a large set of chloroplast genomes, and we developed +Python programs to evaluate them in practice. +%Extracted core genomes +%depend on both gene names and sequences. +% Furthermore, that extract these core genes with the three methodologies. + +We first tried to extract core genomes by comparing +(through global alignment) DNA sequences downloaded from the NCBI database. +However, this method failed to produce biologically +relevant core genomes, whatever the chosen similarity threshold, probably +due to annotation errors. We then used the DOGMA annotation tool +to improve the gene prediction process. The second method consisted in extracting +gene names either from NCBI gene features or from DOGMA results. A first +``intersection core matrix (ICM)'' was built, in which each coefficient +stores the cardinality of the intersection of the two genomes indexing +its row and column. New ICMs are +then constructed by selecting the maximum intersection score (IS) in this matrix, +removing the two genomes having this score, and adding the corresponding +core genome to the next ICM. %Finally, in the third method, a genes quality test has been added before the ICMs computation, to ensure that the genes obtained in the NCBI annotation files are the same %(\emph{i.e.}, gene name and sequence) than the ones produced by DOGMA. +% A genes quality test has then been introduced to construct new ICMs +% on genomes +% only constituted by the genes that successfully passed +% a specific similarity threshold of 65\% on their sequences. +% % , ICM +% % then will take place to extract the core genes. 
+% + +Core trees were finally generated for each method, to investigate +the distribution of chloroplasts and core genomes. The tree from the second +method, based on DOGMA, revealed the best distribution of + chloroplasts with respect to their evolutionary history. In particular, it appears to +us that each endosymbiosis event is well branched in the DOGMA core tree. + +In future work, we intend to deepen the evaluation of the methodology by considering +new gene prediction tools and various similarity measures on both +gene names and sequences. Additionally, we will investigate new clustering +methods for the first approach, to improve the quality of the results of this promising way of +obtaining core genes. Finally, the results produced with DOGMA will be +further investigated from a biological standpoint: the gene content of each core +will be studied and the phylogenetic relations between all these species +will be examined. diff --git a/Paper2/core.png b/Paper2/core.png new file mode 100644 index 0000000..e59bed7 Binary files /dev/null and b/Paper2/core.png differ diff --git a/Paper2/coregenome.png b/Paper2/coregenome.png new file mode 100644 index 0000000..c3b39de Binary files /dev/null and b/Paper2/coregenome.png differ diff --git a/Paper2/cover_dogma.png b/Paper2/cover_dogma.png new file mode 100644 index 0000000..99c699c Binary files /dev/null and b/Paper2/cover_dogma.png differ diff --git a/Paper2/cover_ncbi.png b/Paper2/cover_ncbi.png new file mode 100644 index 0000000..66e4d57 Binary files /dev/null and b/Paper2/cover_ncbi.png differ diff --git a/Paper2/discussion.tex b/Paper2/discussion.tex new file mode 100644 index 0000000..2ea0941 --- /dev/null +++ b/Paper2/discussion.tex @@ -0,0 +1,210 @@ + + + + +%\subsection{Implementation} +%\label{sec:implem} +%All the algorithms detailed in this article have +%been implemented using Python~2.7 on a personal computer (Ubuntu~12.04 with 6~GiB memory, quad-core Intel~i5 with an operating frequency of +%2.5~GHz). 
%All programs can be downloaded at \begin{color}{red} \url{http://......} \end{color}. +%%genes from large amount of chloroplast genomes. +% +%%\begin{center} +%\begin{table}[H] +%\centering +%\caption{Type of annotations and execution time}\label{Etime} +%{%\scriptsize +%%\begin{tabular}{p{2.3cm}p{0.5cm}p{0.25cm}p{0.5cm}p{0.25cm}p{0.5cm}p{0.25cm}}%p{0.5cm}p{0.25cm}p{0.5cm}p{0.2cm}} +%\begin{tabular}{ccccccc} +%\hline\hline +% Method & \multicolumn{2}{c}{Annotation} & \multicolumn{2}{c}{Features} & \multicolumn{2}{c}{Exec. time (min.)} \\%& \multicolumn{2}{c}{Core genes} & \multicolumn{2}{c}{Bad genomes} \\ +%~ & N & D & Name & Seq & N & D \\%& N & D & N & D \\ +%\hline +%First approach & $\surd$ & - & - & $\surd$ & 1.7 & -\\% & ? & - & 0 & -\\[0.5ex] +%Second approach & $\surd$ & $\surd$ & $\surd$ & - & 4.98 & 1.52\\% & 28 & 10 & 1 & 0\\[0.5ex] +%Third approach & $\surd$ & $\surd$ & $\surd$ & $\surd$ & \multicolumn{2}{c}{$\simeq$3 days + 1.29} \\%& \multicolumn{2}{c}{4} & \multicolumn{2}{c}{1}\\[1ex] +%\hline +%\end{tabular} +%} +%\end{table} +%%\end{center} +% +%%\vspace{-1cm} +% +%Table~\ref{Etime} presents the annotation type, +%execution time, and the number of core genes for each proposed method. The following +%notations have been used: \textbf{N} denotes NCBI and \textbf{D} means DOGMA, +%while \textbf{Seq} stands for sequence. The two first {\it Annotation} columns +%represent the algorithm used to annotate chloroplast genomes. The next two {\it +%Features} columns mean the kind of gene feature used to extract core +%genes: gene name, gene sequence, or both of them. +% +%It can be seen that +%almost all methods need low execution time to extract core genes +%from the large set of chloroplast genomes. Only the third method requires +%more than one day of computation (about 3-4 days) for sequence comparisons. However, +%once the quality genomes are well constructed, it only takes 1.29~minutes to +%extract the core genes. 
Such low execution times allow us to use these +%methods to extract all core genomes on a personal computer. +%The lowest execution time (1.52~minutes) +%is obtained with the second method using DOGMA annotations. +% +% +%The second important computational factor is the amount of memory necessary for each +%methodology. Table~\ref{mem} shows the memory usage of each +%method. In this table, the values are presented in megabyte +%unit, while \textit{gV} means geneVision~file~format. We can notice that +%the quantity of required memory is relatively low for all methods, +%and is available on any personal computer. The different values also +%show that the gene features method based on DOGMA annotations has the +%most reasonable memory usage, except when extracting core +%sequences. The third method gives the lowest values if we already have +%the ``quality genomes'', otherwise it will consume far more +%memory. Remark that the amount of memory used by the third method also +%depends on the size of each genome. +% +% +%\begin{table}[H] +%\centering +%\caption{Memory usages for each methodology (in MB)}\label{mem} +%\tabcolsep=0.11cm +%{\scriptsize +%\begin{tabular}{p{2.5cm}@{\hskip 0.1mm}p{1.5cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}} +%\hline\hline +%Method& & Load Gen. & Conv. gV & Read gV & ICM & Core tree & Core Seq. \\ +%\hline +%Gene prediction & NCBI & 108 & - & - & - & - & -\\ +%\multirow{2}{*}{Gene Features} & NCBI & 15.4 & 18.9 & 17.5 & 18 & 18 & 28.1\\ +% & DOGMA& 15.3 & 15.3 & 16.8 & 17.8 & 17.9 & 31.2\\ +%Gene Quality & ~ & 15.3 & $\le$3G & 16.1 & 17 & 17.1 & 24.4\\ +%\hline +%\end{tabular} +%} +%\end{table} +% +% +%\subsection{Results comparison} +% +%Method 2 has indicated to us that two genomes must be removed from the +%set of chloroplasts, namely \textit{Epifagus virginiana} (NC\_001568.1) +%and \textit{Cuscuta gronovii} (NC\_009765.1). 
The reason to +%be of this update is that (1) these chloroplastic genomes are non functional ones, +%and (2) considering them leads to a too small final core genome. +%Additionally, we have been forced to remove \textit{NC\_012568.1 Micromonas pusilla} +%from the NCBI study, as its wrong annotations lead to an empty +%final core genome. +% +%The number +%of {\it Core genes} in Table~\ref{Etime} represents the amount of genes in the last core +%genome (the core genes shared by all the chloroplasts). +%%The main goal is to find the maximum core genes that simulate +%%biological background of chloroplasts. +%With NCBI we obtained 28 genes for +%96 genomes, while DOGMA approach produces 10 genes for the whole 97 genomes. +% However we will see that the distribution of genomes +%in the NCBI core tree is less relevant, biologically speaking, than the one obtained +%by using DOGMA naming process (see Section~\ref{sec:discuss}). +% +%%\begin{sidewaystable} +%\begin{table} +%\centering +% \begin{tabular}{llllllllll} +% \hline +% Method & \multicolumn{2}{l}{Connected Components} & \multicolumn{6}{l}{ICM-Genes' names} & ICM-Quality test \\ \hline +% Annotation & NCBI & DOGMA & \multicolumn{3}{l}{NCBI} & \multicolumn{3}{l}{DOGMA} & NCBI and DOGMA \\ +% Nb. of genomes & 99 & 99 & 99* & 97 & 96 & 99 & 97 & 96 & 99 \\ +% Core genome & 5 & 3 & 9 & 0 & 28 & 2 & 10 & 28 & 5 \\ +% Pan genome & 761 & 445 & 766 & 764 & 737 & 297 & 297 & 297 & 245 \\ \hline +% \end{tabular} +%\end{table} +% +%% Please remember to add \use{multirow} to your document preamble in order to suppor multirow cells +%\begin{table}[h] +%\begin{tabular}{ccccl} +%\hline +%Nb. 
of & Methods & Type of & Size of & \multicolumn{1}{c}{Names of core genes} \\ +% genomes&&annotation&core genome&\\ \hline +%\multirow{3}{*}{97} & \multirow{2}{*}{Method 2} & NCBI & 0 & - \\ +%\multicolumn{1}{l}{} & \multicolumn{1}{l}{} & DOGMA & 10 & ATPI, PSAA, PSAB, PSBA, \\ +%&&&&PSBE, PSBF, PSBL, RPL2,\\ +%&&&& TRNC-GCA, TRNH-GUG \\ +%\multicolumn{1}{l}{} & Method 3 & Both & 5 & ATPI, ATPA, ATPH, PSBJ, PSBE \\ +%\multirow{3}{*}{96} & \multirow{2}{*}{Method 2} & NCBI & 28 & ATPA, ATPB, ATPE, ATPH,\\ +%&&&& ATPI, PETG, PSAA, PSAB,\\ +%&&&& PSAC, PSBA, PSBD, PSBE,\\ +%&&&& PSBF, PSBH, PSBJ, PSBK,\\ +%&&&& PSBN, RBCL, RPL14, RPL16,\\ +%&&&& RPL20, RPL36, RPS18, RPS3,\\ +%&&&& RPS4, RPS7, RPS8, RPS11 \\ +%\multicolumn{1}{l}{} & & DOGMA & 28 & ATPA, ATPB, ATPI, ATPH, PETB, PETG, PSAA, PSAB, PSAC, PSBA, PSBD, PSBE, PSBF, PSBC, PSBJ, PSBI, PSBL, PSBT, RBCL, RPL2, RRN16, TRND-GUC, TRNFM-CAU, TRNH-GUG, TRNI-GAU, TRNN-GUU, TRNQ-UUG, TRNC-GCA \\ +%\multicolumn{1}{l}{} & \multicolumn{1}{l}{Method 3} & Both & 5 & ATPI, ATPA, ATPH, PSBJ, PSBE \\ \hline +%\end{tabular} +%\end{table} +%%\end{sidewaystable} + + + +\subsection{Biological evaluation}\label{sec:discuss} +It is well known that the first endosymbiosis in plants resulted in a great diversification of +lineages comprising \textit{Red Algae}, \textit{Green Algae}, and \textit{Land Plants} (terrestrial). +Several secondary endosymbioses then occurred: two involving a \textit{Red Alga} +and other heterotrophic eucaryotes, giving birth to both the \textit{Brown Algae} +and \textit{Dinoflagellates} lineages; another involving a \textit{Green Alga} and +a heterotrophic eucaryote, giving birth to \textit{Euglens}~\cite{mcfadden2001primary}.
+ +The interesting point with the produced core trees (especially the one +obtained with DOGMA, see \url{http://members.femto-st.fr/christophe-guyeux/en/chloroplasts}) is +that organisms resulting from the first endosymbiosis are distributed in +each of the lineages found in the chloroplast genome structure +evolution. More precisely, all \textit{Red Algae} chloroplasts are grouped together in one lineage, while +\textit{Green Algae} and \textit{Land Plants} chloroplasts are all in a second lineage. +Furthermore, organisms resulting from the secondary endosymbioses are well localized in +the tree: the chloroplasts of both \textit{Brown Algae} and \textit{Dinoflagellates} +representatives are found exclusively in the lineage also comprising the +\textit{Red Algae} chloroplasts from which they evolved, while the \textit{Euglens} +chloroplasts are related to the \textit{Green Algae} chloroplasts from which they +evolved. +This makes sense in terms of biology, history of lineages, and +theories of chloroplast origins (and so of photosynthetic ability) in +different Eucaryotic lineages~\cite{mcfadden2001primary}. + +Interestingly, the only organisms under consideration that possess a +chloroplast (and so a chloroplastic genome) but have lost the +photosynthetic ability (being parasitic plants) are found at the base of +the tree, and not together with their phylogenetically related species. +This suggests that functional chloroplast genes are evolutionarily constrained +when used in the photosynthetic process, but rapidly lose their efficiency +when not used, as recently observed for a species of Angiosperms~\cite{li2013complete}. +These species are \textit{Cuscuta gronovii}, an Angiosperm (flowering plant) +at the base of the DOGMA Angiosperm-Conifers branch, and +\textit{Epifagus virginiana}, also an Angiosperm, at the very base of this tree.
+ +Another interesting result is that \textit{Land Plants} that +represent a single sublineage originating from the large and diverse +lineage of \textit{Green Algae} in Eucaryotes history are present in two different +branches of the DOGMA tree, both associated with \textit{Green Algae}: one branch +comprising the basal grade of \textit{Land Plants} (mosses and ferns) and the second one +containing the most internal lineages of \textit{Land Plants} (Conifers and flowering plants). +But independently of their split in two distinct branches of the DOGMA +tree, the \textit{Land Plants} always show a higher number of functional genes in +their chloroplasts than the \textit{Green Algae} from which they emerged, probably meaning that the +terrestrial way of life necessitates more functional genes for an +optimal photosynthesis than the marine one. However, a more detailed +analysis of selected genes is necessary to better understand the reasons why +such a distribution has been obtained. +Remark finally that all these biologically interesting results are apparent +only in the core tree based on DOGMA, while they are not so obvious in the NCBI one. + + +%\begin{figure} +%\centering +%\includegraphics[scale=0.37]{core} +%\caption{Core} +%\end{figure} +% +% +%\begin{figure} +%\centering +%\includegraphics[scale=0.37]{pan} +%\caption{Pan} +%\end{figure} diff --git a/Paper2/general.tex b/Paper2/general.tex new file mode 100644 index 0000000..646352c --- /dev/null +++ b/Paper2/general.tex @@ -0,0 +1,62 @@ + + +\begin{figure}[h] + \centering + \includegraphics[width=0.75\textwidth]{Whole_system} +\caption{A general overview of the annotation-based approach}\label{Fig1} +\end{figure} + +%Figure~\ref{Fig1} presents a general overview of the entire proposed pipeline +%for core and pan genomes production and exploitation, which consists of three stages: \textit{Genomes annotation}, \textit{Core extraction}, and \textit{Features Visualization}. 
+% To understand the whole core extraction process, we +% describe briefly each stage below. More details will be given in the +% coming subsections. +\color{red}In previous work~\cite{Alkindy2014}, we proposed a pipeline for the extraction of core genomes. In this work, the pipeline is considered together with a quality test method for extracting core genes (see Figure~\ref{Fig1} for an overview). As a starting point, the annotation stage uses a DNA sequence database % chosen among the many international databases storing %nucleotide sequences, +such as NCBI's GenBank~\cite{Sayers01012011}, the European \textit{EMBL} database~\cite{apweiler1985swiss}, or the Japanese \textit{DDBJ} one~\cite{sugawara2008ddbj}. +\color{black} +Furthermore, it is possible to obtain annotated genomes (DNA coding sequences with gene +names and locations) by interacting with these databases, either by directly downloading +annotated genomes delivered by these websites, or by launching an +annotation tool on complete downloaded genomes. +Obviously, this annotation stage must be of high quality if we want +to obtain acceptable core and pan genomes. +% These last years the cost of sequencing genomes has been greatly +% reduced, and thus more and more genomes are sequenced. Therefore +% automatic annotation tools are required to deal with this continuously +% increasing amount of genomics data. %Moreover, a reliable and accurate +% %genome annotation process is needed in order to provide strong +%indicators for the study of life\cite{Eisen2007}. +%Various cost-effective annotation tools~\cite{Bakke2009} producing genomic annotations at many levels of detail have been designed recently, some reputed ones being: % NCBI~\cite{Sayers01012011}, DOGMA~\cite{RDOGMA}, cpBase~\cite{de2002comparative}, CpGAVAS~\cite{liu2012cpgavas}, and CEGMA~\cite{parra2007cegma}.
Such tools usually use one out of the three following methods for finding gene locations in large DNA sequences: \textit{alignment-based}, \textit{composition based}, or a combination of both~\cite{parra2007cegma}. The alignment-based method is used when trying to predict a protein coding sequence by aligning a genomic DNA sequence with a cDNA sequence coding an already known homologous protein~\cite{parra2007cegma}. This approach is used for instance in GeneWise~\cite{birney2004genewise}. The alternative method, the composition-based one (also known as \textit{ab initio}) is based on probabilistic models of genes structure~\cite{parra2000geneid}. % to find genes according to the gene value probability +%(GeneID). + +Using such annotated genomes, we will detail two general approaches for extracting the core genome, which is the third stage of the pipeline: the first one uses similarities computed on predicted coding sequences, while the second one uses all the information provided during the annotation stage. + +\color{red}Instead of considering only gene sequences taken from NCBI or DOGMA, a quality test process takes place, working with both gene names and sequences to produce quality genes. However, we will show that such a simple idea is not so easy to realize, and that it is not sufficient to only consider the gene names provided by such tools, even though this gave good results in previous work~\cite{Alkindy2014}. \color{black} +% +% +Annotation, which is the first stage, is an important task for extracting gene features. Indeed, to extract good gene features, a good annotation tool is obviously required. +Moreover, such annotations can be used in various manners (based on gene names, gene sequences, protein sequences, etc.) to extract the core and pan genomes. +We will subsequently propose methods that use gene names and sequences for extracting core genes and producing a chloroplast evolutionary tree.
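
As a purely illustrative sketch (not the implementation used in this work), the name-based idea can be pictured with set operations. It assumes that each annotated genome has already been reduced to the set of its gene names; the core genome is then the intersection of these sets and the pan genome is their union. The function names, genome identifiers, and gene lists below are hypothetical toy data.

```python
from functools import reduce


def core_genome(genomes):
    """Return the genes present in every genome (intersection of gene-name sets)."""
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return set()
    return reduce(set.intersection, gene_sets)


def pan_genome(genomes):
    """Return the genes present in at least one genome (union of gene-name sets)."""
    return set().union(*(set(genes) for genes in genomes.values()))


# Hypothetical toy input: genome identifier -> annotated gene names.
toy_genomes = {
    "genome_A": ["atpA", "psbA", "rbcL", "rpl2"],
    "genome_B": ["atpA", "psbA", "rbcL"],
    "genome_C": ["atpA", "psbA", "rpl2", "ndhF"],
}

core = core_genome(toy_genomes)
pan = pan_genome(toy_genomes)
print("core:", sorted(core))
print("pan:", sorted(pan))
```

A single genome with few or misnamed genes shrinks the intersection for the whole set, which is one reason why annotation quality matters so much for name-based approaches.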
+ +%\input{population_Table} +The final stage of our pipeline, only briefly mentioned in this article, is to take advantage +of the information produced during the core and pan genomes search. +This features visualization stage encompasses phylogenetic tree construction (see \cite{Alkindy2014} for more details) +using core genes, gene content evolution illustrated by core trees, functionality +investigations, and so on. +% +% allows to visualize genomes and/or gene evolution in chloroplast. Therefore we use representations like tables, phylogenetic trees, graphs, etc. to organize and show genomes relationships, and thus achieve the goal of representing gene +% evolution. In addition, comparing these representations with ones issued from another annotation tool dedicated to large population of chloroplast genomes give us biological perspectives to the nature of chloroplasts evolution. %Notice that a local database linked with each pipe stage is used to store all the information produced during the process. + +For illustration purposes, we have considered % GenBank-NCBI~\cite{Sayers01012011} as sequence +%database: +99~genomes of chloroplasts downloaded from the GenBank database~\cite{Sayers01012011}. These genomes +belong to eleven types of chloroplast families (see \cite{Alkindy2014} for more details).%as described in Table~\ref{Tab2}. +Furthermore, two kinds of annotations will be considered in this document, namely the +ones provided by NCBI on the one hand, and the ones provided by DOGMA on the other hand. +%The +%database in our method must be taken from any confident data source +%that stores annotated and/or unannotated chloroplast genomes. +% As stated in the previous section, we have +% considered GenBank-NCBI~\cite{Sayers01012011} as sequence +% database.
diff --git a/Paper2/generalView.png b/Paper2/generalView.png new file mode 100644 index 0000000..52ca165 Binary files /dev/null and b/Paper2/generalView.png differ diff --git a/Paper2/gensim.png b/Paper2/gensim.png new file mode 100644 index 0000000..9c512b4 Binary files /dev/null and b/Paper2/gensim.png differ diff --git a/Paper2/implementation.tex b/Paper2/implementation.tex new file mode 100644 index 0000000..a8a9123 --- /dev/null +++ b/Paper2/implementation.tex @@ -0,0 +1,119 @@ +\color{red} +All different algorithms have been implemented using Python on a personal computer running Ubuntu~12.04 with 6~GiB memory and +a quad-core Intel core~i5~processor with an operating frequency of +2.5~GHz. %All the programs can be downloaded at \url{http://......} . +%genes from large amount of chloroplast genomes. + +\begin{center} +\begin{table}[H] +\caption{Type of annotation, execution time, and core genes.}\label{Etime} +{\scriptsize +\begin{tabular}{p{2cm}p{0.5cm}p{0.25cm}p{0.5cm}p{0.25cm}p{0.5cm}p{0.25cm}p{0.5cm}p{0.25cm}p{0.5cm}p{0.2cm}} +\hline\hline + Method & \multicolumn{2}{c}{Annotation} & \multicolumn{2}{c}{Features} & \multicolumn{2}{c}{Exec. time (min.)} & \multicolumn{2}{c}{Core genes} & \multicolumn{2}{c}{Bad genomes} \\ +~ & N & D & Name & Seq & N & D & N & D & N & D \\ +\hline +Gene prediction & $\surd$ & $\surd$ & - & $\surd$ & 1.7 & - & ? & - & 0 & -\\[0.5ex] +%Gene Features & $\surd$ & $\surd$ & $\surd$ & - & 4.98 & 1.52 & 28 & 10 & 1 & 0\\[0.5ex] +Gene Quality & $\surd$ & $\surd$ & $\surd$ & $\surd$ & \multicolumn{2}{c}{$\simeq$3 days + 1.29} & \multicolumn{2}{c}{4} & \multicolumn{2}{c}{1}\\[1ex] +\hline +\end{tabular} +} +\end{table} +\end{center} + +\vspace{-1cm} + +Table~\ref{Etime} presents for each method the annotation type, +execution time, and the number of core genes. We use the following +notations: \textbf{N} denotes NCBI, while \textbf{D} means DOGMA, +and \textbf{Seq} is for sequence. 
The first two {\it Annotation} columns +indicate the algorithm used to annotate the chloroplast genomes. The next two {\it +Features} columns indicate the kind of gene feature used to extract core +genes: gene name, gene sequence, or both of them. It can be seen that +almost all methods need a low {\it Execution time}, expressed in minutes, to extract core genes +from the large set of chloroplast genomes. Only the gene quality method requires +several days of computation (about 3-4 days) for sequence comparisons. However, +once the quality genomes are well constructed, it only takes 1.29~minutes to +extract the core genes. These low execution times allow us to use these +methods to extract core genes on a personal computer rather than on mainframes +or parallel computers. The lowest execution time, 1.52~minutes, +is obtained with the second method using DOGMA annotations. The number +of {\it Core genes} represents the number of genes in the last core +genome. The main goal is to find the largest set of core genes that reflects the +biological background of chloroplasts. With NCBI we have 28 genes for +96 genomes, instead of 10 genes for 97 genomes with +DOGMA. Unfortunately, the distribution of genomes in the core tree obtained with NCBI +does not reflect a good biological perspective, whereas with +DOGMA the distribution of genomes is biologically relevant. A few genomes may destroy the core genome because +they share too few genes with the other genomes. More precisely, \textit{NC\_012568.1 Micromonas pusilla} is the only genome that destroys the core genome with NCBI +annotations, for both the gene features and gene quality methods. + +The second important factor is the amount of memory necessary for each +methodology. Table~\ref{mem} shows the memory usage of each method. +In this table, the values are given in megabytes +and \textit{gV} means geneVision~file~format.
We can notice that +the amount of memory used is relatively low for all methods +and is available on any personal computer. The different values also +show that the gene features method based on DOGMA annotations has the +most reasonable memory usage, except when extracting core +sequences. The third method gives the lowest values if we already have +the quality genomes; otherwise it will consume far more +memory. Moreover, the amount of memory used by the third method also +depends on the size of each genome. + + +\begin{table}[H] +\centering +\caption{Memory usage (in MB) for each methodology}\label{mem} +\tabcolsep=0.11cm +{\scriptsize +\begin{tabular}{p{2.5cm}@{\hskip 0.1mm}p{1.5cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}@{\hskip 0.1mm}p{1cm}} +\hline\hline +Method& & Load Gen. & Conv. gV & Read gV & ICM & Core tree & Core Seq. \\ +\hline +Gene prediction & NCBI & 108 & - & - & - & - & -\\ +%\multirow{2}{*}{Gene Features} & NCBI & 15.4 & 18.9 & 17.5 & 18 & 18 & 28.1\\ + %& DOGMA& 15.3 & 15.3 & 16.8 & 17.8 & 17.9 & 31.2\\ +Gene Quality & ~ & 15.3 & $\le$3G & 16.1 & 17 & 17.1 & 24.4\\ +\hline +\end{tabular} +} +\end{table} +\color{black} +%\begin{figure}[H] +% \centering \includegraphics[width=0.75\textwidth]{Whole_system} \caption{Overview +% of the pipeline, third approach}\label{wholesystem} +%\end{figure} + +\begin{figure}[!ht] + \subfloat[Sizes of core genome\label{subfig-1:core}]{% + \includegraphics[width=0.5\textwidth]{coregenome} + } + \hfill + \subfloat[Sizes of pan genome\label{subfig-2:pan}]{% + \includegraphics[width=0.5\textwidth]{pangenome} + } + \caption{Sizes of core and pan genomes for the first and second methods.} + \label{fig:sizes of core and pan} + \end{figure} + + +\begin{figure}[!ht] + \subfloat[genes coverage of NCBI genomes\label{Cover:NCBI}]{% + \includegraphics[width=0.5\textwidth]{cover_ncbi} + } + \hfill + \subfloat[genes coverage of DOGMA
genomes\label{cover:dogma}]{% + \includegraphics[width=0.5\textwidth]{cover_dogma} + } + \caption{Gene comparison coverage from NCBI and DOGMA, second method} + \label{fig:gene coverage} + \end{figure} + +\color{red} +Figure~\ref{fig:sizes of core and pan} shows the sizes of the core and pan genomes produced by the two methods. Figure~\ref{subfig-1:core} gives the predicted core genes; note that a larger core does not necessarily mean better genes. We are looking for genes that are biologically meaningful. The core genes produced by the first method, especially with DOGMA, can reflect this biological meaning; the reason why will be explained later in the discussion section. In Figure~\ref{subfig-2:pan}, we can see that the size of the pan genome obtained with the second method remains steady across the different thresholds, while with the first method the number of pan genes increases as the threshold increases. + +Furthermore, we computed the correlation coefficient for the second method, and the results show that the correlation was $0.97$ for the DOGMA annotations, while it was $0.69$ for the NCBI ones.
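
The text does not state which formula or which pair of series was used for this correlation. As a minimal sketch, assuming the usual Pearson coefficient between two equally long numerical series, the computation could look as follows. The function name and the toy values are hypothetical and are not the data of this study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and of squared deviations around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy series, for illustration only.
series_a = [1, 2, 3, 4, 5]
series_b = [2, 4, 6, 8, 10]
r_up = pearson(series_a, series_b)
r_down = pearson(series_a, list(reversed(series_b)))
print(r_up, r_down)
```

A value close to $1$ indicates that the two series vary together almost linearly, which is how the reported $0.97$ and $0.69$ would be read under this assumption.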
+ +\color{black} diff --git a/Paper2/intro.aux b/Paper2/intro.aux new file mode 100644 index 0000000..f23e546 --- /dev/null +++ b/Paper2/intro.aux @@ -0,0 +1 @@ +\relax diff --git a/Paper2/intro.fdb_latexmk b/Paper2/intro.fdb_latexmk new file mode 100644 index 0000000..7a18840 --- /dev/null +++ b/Paper2/intro.fdb_latexmk @@ -0,0 +1,9 @@ +# Fdb version 3 +["pdflatex"] 1390291152 "intro.tex" "intro.pdf" "intro" 1390291152 + "/usr/share/texmf-dist/web2c/texmf.cnf" 1378714620 31669 df537245012f3e5f05bdc55644b7a3df "" + "/var/lib/texmf/web2c/pdftex/pdflatex.fmt" 1382036467 7168684 91d4905864f47193f0e2e49e82fb1d98 "" + "intro.aux" 1390290974 8 a94a2480d3289e625eea47cd1b285758 "" + "intro.tex" 1390291147 3515 75738ce782bdb48a3942fbd055488a90 "" + (generated) + "intro.log" + "intro.pdf" diff --git a/Paper2/intro.tex b/Paper2/intro.tex new file mode 100644 index 0000000..23b3439 --- /dev/null +++ b/Paper2/intro.tex @@ -0,0 +1,67 @@ +\color{red}Identifying core genes is important for understanding the shared functionality of a given set of species. +%Identifying core genes may be of importance to understand shared functionality and specificity of a given set of species, or to construct their phylogeny using curated sequences. +In previous work~\cite{Alkindy2014}, we introduced two methods for discovering core and pan genes: one based on sequence similarity and one based on alignment. In this work, to +determine both the core and pan genomes of a large set of DNA sequences, we compare the sequence-similarity clustering algorithm proposed in that previous work with a new method that improves the alignment-based approach by adding a sequence quality control test.
More precisely, we focus on +the following questions, using a collection of 99~chloroplasts as an illustrative example: how +can we identify the best core genome (that is, an artificially designed set of +coding sequences as close as possible to the real biological one), and +how can we deduce scenarios regarding their gene losses? + +The existence of chloroplasts is explained by the fact that \color{black}chloroplasts found in Eucaryotes have +an endosymbiotic origin, meaning +that they come from the incorporation of a photosynthetic bacterium (Cyanobacteria) within a eucaryotic cell. They are fundamental key elements in +the history of living organisms, as they are the organelles responsible for +photosynthesis. The latter is the main way to produce organic matter +from mineral matter using solar energy. Consequently, photosynthetic +organisms are at the base of most ecosystem trophic chains. Indeed, +photosynthesis in eucaryotes has allowed a great speciation in the lineage, +leading to a great biodiversity. From an ecological point of view, +photosynthetic organisms are at the origin of the presence of dioxygen +in the atmosphere (allowing extant life) and are the main source of mid +to long term carbon storage, which is fundamental regarding current +climate changes. However, the evolutionary history of chloroplasts is not fully +understood, at least at a large scale, and their phylogeny needs +to be further investigated. + +A key idea in phylogenetic classification is that a given DNA mutation shared +by at least two taxa is more likely to have been inherited from a common +ancestor than to have occurred independently. Thus shared changes in genomes +make it possible to infer relationships between species. In the case of chloroplasts, +an important category of genome changes is the loss of functional genes, +either because they become ineffective or due to a transfer to the nucleus.
+Thereby +%we hypothesize that +a small number of gene losses among species indicates +that these species are close to each other and belong to a similar lineage, +while a large loss indicates %that we have an evolutionary relationship +%between species from +%much more +distant lineages. +%Phylogenetic relationships are mainly built by comparison of sets of coding and non-coding sequences. + Phylogenies of photosynthetic plants are important to assess the origin + of chloroplasts and the modes of gene loss among lineages. + These phylogenies are usually built using a few chloroplastic genes, +some of which are not conserved in all the taxa. + %As phylogenetic relationships inferred from data matrices complete for each species included and with the same evolution history are better assumptions, +%we argue that +This is why selecting core genes may be of interest for a new investigation +of the phylogeny of photosynthetic plants. +%To depict the links between species clearly, we here intend to built a phylogenetic tree showing the relationships based on the distances among gene sequences of a core genome. +However, the circumscription of the core chloroplast genomes for a given set of photosynthetic organisms needs bioinformatics investigations using sequence annotation and comparison tools, and various choices +%of tools +are available. + +\color{red}Our intention in this research work, regarding the methodology for determining core and pan genomes, is to investigate the impact of these choices on the results. A general presentation of the approaches detailed in this document is provided in the next section. Then we will study in Section~\ref{sec:simil} the use of annotated genomes from the NCBI website~\cite{Sayers01012011} with a coding sequence clustering method based on Needleman-Wunsch similarity scores~\cite{Rice2000}. %We will show that such an approach based on sequences similarity cannot lead to satisfactory results, biologically speaking.
+%We will thus investigate name-sequence-based approaches in Section~\ref{sec:annot}, by using successively the gene names provided by NCBI and DOGMA~\cite{RDOGMA} annotations, where DOGMA is a recent annotation tool specific to chloroplasts. +The second method, which uses both gene name and sequence comparisons, will be proposed in Section~\ref{sec:mixed}. \color{black} +%Ways to take advantage of the produced core genomes are introduced in Section~\ref{sec:features}, + Information regarding computation time and memory usage is provided in Section~\ref{sec:implem}. +Finally, a discussion based on biological aspects regarding the evolutionary history of the considered genomes +will finalize our investigations, leading to our methodology proposal for core and pan genomes +discovery of chloroplasts. %(Section~\ref{sec:discuss}). +This research work ends with a conclusion section, in which our investigations are summarized and intended future work is outlined. + + +% Other possible scientific questions to consider for introduction improvement: +% Which bioinformatics tools are necessary for genes comparison in selected complete chloroplast genomes? Which bioinformatics tools are necessary to build a phylogeny of numerous genes and species, etc? +% diff --git a/Paper2/llncs.cls b/Paper2/llncs.cls new file mode 100644 index 0000000..6e1806d --- /dev/null +++ b/Paper2/llncs.cls @@ -0,0 +1,1208 @@ +% LLNCS DOCUMENT CLASS -- version 2.18 (27-Sep-2013) +% Springer Verlag LaTeX2e support for Lecture Notes in Computer Science +% +%% +%% \CharacterTable +%% {Upper-case \A\B\C\D\E\F\G\H\I\J\K\L\M\N\O\P\Q\R\S\T\U\V\W\X\Y\Z +%% Lower-case \a\b\c\d\e\f\g\h\i\j\k\l\m\n\o\p\q\r\s\t\u\v\w\x\y\z +%% Digits \0\1\2\3\4\5\6\7\8\9 +%% Exclamation \! Double quote \" Hash (number) \# +%% Dollar \$ Percent \% Ampersand \& +%% Acute accent \' Left paren \( Right paren \) +%% Asterisk \* Plus \+ Comma \, +%% Minus \- Point \.
Solidus \/ +%% Colon \: Semicolon \; Less than \< +%% Equals \= Greater than \> Question mark \? +%% Commercial at \@ Left bracket \[ Backslash \\ +%% Right bracket \] Circumflex \^ Underscore \_ +%% Grave accent \` Left brace \{ Vertical bar \| +%% Right brace \} Tilde \~} +%% +\NeedsTeXFormat{LaTeX2e}[1995/12/01] +\ProvidesClass{llncs}[2013/09/27 v2.18 +^^J LaTeX document class for Lecture Notes in Computer Science] +% Options +\let\if@envcntreset\iffalse +\DeclareOption{envcountreset}{\let\if@envcntreset\iftrue} +\DeclareOption{citeauthoryear}{\let\citeauthoryear=Y} +\DeclareOption{oribibl}{\let\oribibl=Y} +\let\if@custvec\iftrue +\DeclareOption{orivec}{\let\if@custvec\iffalse} +\let\if@envcntsame\iffalse +\DeclareOption{envcountsame}{\let\if@envcntsame\iftrue} +\let\if@envcntsect\iffalse +\DeclareOption{envcountsect}{\let\if@envcntsect\iftrue} +\let\if@runhead\iffalse +\DeclareOption{runningheads}{\let\if@runhead\iftrue} + +\let\if@openright\iftrue +\let\if@openbib\iffalse +\DeclareOption{openbib}{\let\if@openbib\iftrue} + +% languages +\let\switcht@@therlang\relax +\def\ds@deutsch{\def\switcht@@therlang{\switcht@deutsch}} +\def\ds@francais{\def\switcht@@therlang{\switcht@francais}} + +\DeclareOption*{\PassOptionsToClass{\CurrentOption}{article}} + +\ProcessOptions + +\LoadClass[twoside]{article} +\RequirePackage{multicol} % needed for the list of participants, index +\RequirePackage{aliascnt} + +\setlength{\textwidth}{12.2cm} +\setlength{\textheight}{19.3cm} +\renewcommand\@pnumwidth{2em} +\renewcommand\@tocrmarg{3.5em} +% +\def\@dottedtocline#1#2#3#4#5{% + \ifnum #1>\c@tocdepth \else + \vskip \z@ \@plus.2\p@ + {\leftskip #2\relax \rightskip \@tocrmarg \advance\rightskip by 0pt plus 2cm + \parfillskip -\rightskip \pretolerance=10000 + \parindent #2\relax\@afterindenttrue + \interlinepenalty\@M + \leavevmode + \@tempdima #3\relax + \advance\leftskip \@tempdima \null\nobreak\hskip -\leftskip + {#4}\nobreak + \leaders\hbox{$\m@th + \mkern \@dotsep 
mu\hbox{.}\mkern \@dotsep + mu$}\hfill + \nobreak + \hb@xt@\@pnumwidth{\hfil\normalfont \normalcolor #5}% + \par}% + \fi} +% +\def\switcht@albion{% +\def\abstractname{Abstract.} +\def\ackname{Acknowledgement.} +\def\andname{and} +\def\lastandname{\unskip, and} +\def\appendixname{Appendix} +\def\chaptername{Chapter} +\def\claimname{Claim} +\def\conjecturename{Conjecture} +\def\contentsname{Table of Contents} +\def\corollaryname{Corollary} +\def\definitionname{Definition} +\def\examplename{Example} +\def\exercisename{Exercise} +\def\figurename{Fig.} +\def\keywordname{{\bf Keywords:}} +\def\indexname{Index} +\def\lemmaname{Lemma} +\def\contriblistname{List of Contributors} +\def\listfigurename{List of Figures} +\def\listtablename{List of Tables} +\def\mailname{{\it Correspondence to\/}:} +\def\noteaddname{Note added in proof} +\def\notename{Note} +\def\partname{Part} +\def\problemname{Problem} +\def\proofname{Proof} +\def\propertyname{Property} +\def\propositionname{Proposition} +\def\questionname{Question} +\def\remarkname{Remark} +\def\seename{see} +\def\solutionname{Solution} +\def\subclassname{{\it Subject Classifications\/}:} +\def\tablename{Table} +\def\theoremname{Theorem}} +\switcht@albion +% Names of theorem like environments are already defined +% but must be translated if another language is chosen +% +% French section +\def\switcht@francais{%\typeout{On parle francais.}% + \def\abstractname{R\'esum\'e.}% + \def\ackname{Remerciements.}% + \def\andname{et}% + \def\lastandname{ et}% + \def\appendixname{Appendice} + \def\chaptername{Chapitre}% + \def\claimname{Pr\'etention}% + \def\conjecturename{Hypoth\`ese}% + \def\contentsname{Table des mati\`eres}% + \def\corollaryname{Corollaire}% + \def\definitionname{D\'efinition}% + \def\examplename{Exemple}% + \def\exercisename{Exercice}% + \def\figurename{Fig.}% + \def\keywordname{{\bf Mots-cl\'e:}} + \def\indexname{Index} + \def\lemmaname{Lemme}% + \def\contriblistname{Liste des contributeurs} + 
\def\listfigurename{Liste des figures}% + \def\listtablename{Liste des tables}% + \def\mailname{{\it Correspondence to\/}:} + \def\noteaddname{Note ajout\'ee \`a l'\'epreuve}% + \def\notename{Remarque}% + \def\partname{Partie}% + \def\problemname{Probl\`eme}% + \def\proofname{Preuve}% + \def\propertyname{Caract\'eristique}% +%\def\propositionname{Proposition}% + \def\questionname{Question}% + \def\remarkname{Remarque}% + \def\seename{voir} + \def\solutionname{Solution}% + \def\subclassname{{\it Subject Classifications\/}:} + \def\tablename{Tableau}% + \def\theoremname{Th\'eor\`eme}% +} +% +% German section +\def\switcht@deutsch{%\typeout{Man spricht deutsch.}% + \def\abstractname{Zusammenfassung.}% + \def\ackname{Danksagung.}% + \def\andname{und}% + \def\lastandname{ und}% + \def\appendixname{Anhang}% + \def\chaptername{Kapitel}% + \def\claimname{Behauptung}% + \def\conjecturename{Hypothese}% + \def\contentsname{Inhaltsverzeichnis}% + \def\corollaryname{Korollar}% +%\def\definitionname{Definition}% + \def\examplename{Beispiel}% + \def\exercisename{\"Ubung}% + \def\figurename{Abb.}% + \def\keywordname{{\bf Schl\"usselw\"orter:}} + \def\indexname{Index} +%\def\lemmaname{Lemma}% + \def\contriblistname{Mitarbeiter} + \def\listfigurename{Abbildungsverzeichnis}% + \def\listtablename{Tabellenverzeichnis}% + \def\mailname{{\it Correspondence to\/}:} + \def\noteaddname{Nachtrag}% + \def\notename{Anmerkung}% + \def\partname{Teil}% +%\def\problemname{Problem}% + \def\proofname{Beweis}% + \def\propertyname{Eigenschaft}% +%\def\propositionname{Proposition}% + \def\questionname{Frage}% + \def\remarkname{Anmerkung}% + \def\seename{siehe} + \def\solutionname{L\"osung}% + \def\subclassname{{\it Subject Classifications\/}:} + \def\tablename{Tabelle}% +%\def\theoremname{Theorem}% +} + +% Ragged bottom for the actual page +\def\thisbottomragged{\def\@textbottom{\vskip\z@ plus.0001fil +\global\let\@textbottom\relax}} + +\renewcommand\small{% + \@setfontsize\small\@ixpt{11}% + 
\abovedisplayskip 8.5\p@ \@plus3\p@ \@minus4\p@ + \abovedisplayshortskip \z@ \@plus2\p@ + \belowdisplayshortskip 4\p@ \@plus2\p@ \@minus2\p@ + \def\@listi{\leftmargin\leftmargini + \parsep 0\p@ \@plus1\p@ \@minus\p@ + \topsep 8\p@ \@plus2\p@ \@minus4\p@ + \itemsep0\p@}% + \belowdisplayskip \abovedisplayskip +} + +\frenchspacing +\widowpenalty=10000 +\clubpenalty=10000 + +\setlength\oddsidemargin {63\p@} +\setlength\evensidemargin {63\p@} +\setlength\marginparwidth {90\p@} + +\setlength\headsep {16\p@} + +\setlength\footnotesep{7.7\p@} +\setlength\textfloatsep{8mm\@plus 2\p@ \@minus 4\p@} +\setlength\intextsep {8mm\@plus 2\p@ \@minus 2\p@} + +\setcounter{secnumdepth}{2} + +\newcounter {chapter} +\renewcommand\thechapter {\@arabic\c@chapter} + +\newif\if@mainmatter \@mainmattertrue +\newcommand\frontmatter{\cleardoublepage + \@mainmatterfalse\pagenumbering{Roman}} +\newcommand\mainmatter{\cleardoublepage + \@mainmattertrue\pagenumbering{arabic}} +\newcommand\backmatter{\if@openright\cleardoublepage\else\clearpage\fi + \@mainmatterfalse} + +\renewcommand\part{\cleardoublepage + \thispagestyle{empty}% + \if@twocolumn + \onecolumn + \@tempswatrue + \else + \@tempswafalse + \fi + \null\vfil + \secdef\@part\@spart} + +\def\@part[#1]#2{% + \ifnum \c@secnumdepth >-2\relax + \refstepcounter{part}% + \addcontentsline{toc}{part}{\thepart\hspace{1em}#1}% + \else + \addcontentsline{toc}{part}{#1}% + \fi + \markboth{}{}% + {\centering + \interlinepenalty \@M + \normalfont + \ifnum \c@secnumdepth >-2\relax + \huge\bfseries \partname~\thepart + \par + \vskip 20\p@ + \fi + \Huge \bfseries #2\par}% + \@endpart} +\def\@spart#1{% + {\centering + \interlinepenalty \@M + \normalfont + \Huge \bfseries #1\par}% + \@endpart} +\def\@endpart{\vfil\newpage + \if@twoside + \null + \thispagestyle{empty}% + \newpage + \fi + \if@tempswa + \twocolumn + \fi} + +\newcommand\chapter{\clearpage + \thispagestyle{empty}% + \global\@topnum\z@ + \@afterindentfalse + \secdef\@chapter\@schapter} 
+\def\@chapter[#1]#2{\ifnum \c@secnumdepth >\m@ne + \if@mainmatter + \refstepcounter{chapter}% + \typeout{\@chapapp\space\thechapter.}% + \addcontentsline{toc}{chapter}% + {\protect\numberline{\thechapter}#1}% + \else + \addcontentsline{toc}{chapter}{#1}% + \fi + \else + \addcontentsline{toc}{chapter}{#1}% + \fi + \chaptermark{#1}% + \addtocontents{lof}{\protect\addvspace{10\p@}}% + \addtocontents{lot}{\protect\addvspace{10\p@}}% + \if@twocolumn + \@topnewpage[\@makechapterhead{#2}]% + \else + \@makechapterhead{#2}% + \@afterheading + \fi} +\def\@makechapterhead#1{% +% \vspace*{50\p@}% + {\centering + \ifnum \c@secnumdepth >\m@ne + \if@mainmatter + \large\bfseries \@chapapp{} \thechapter + \par\nobreak + \vskip 20\p@ + \fi + \fi + \interlinepenalty\@M + \Large \bfseries #1\par\nobreak + \vskip 40\p@ + }} +\def\@schapter#1{\if@twocolumn + \@topnewpage[\@makeschapterhead{#1}]% + \else + \@makeschapterhead{#1}% + \@afterheading + \fi} +\def\@makeschapterhead#1{% +% \vspace*{50\p@}% + {\centering + \normalfont + \interlinepenalty\@M + \Large \bfseries #1\par\nobreak + \vskip 40\p@ + }} + +\renewcommand\section{\@startsection{section}{1}{\z@}% + {-18\p@ \@plus -4\p@ \@minus -4\p@}% + {12\p@ \@plus 4\p@ \@minus 4\p@}% + {\normalfont\large\bfseries\boldmath + \rightskip=\z@ \@plus 8em\pretolerance=10000 }} +\renewcommand\subsection{\@startsection{subsection}{2}{\z@}% + {-18\p@ \@plus -4\p@ \@minus -4\p@}% + {8\p@ \@plus 4\p@ \@minus 4\p@}% + {\normalfont\normalsize\bfseries\boldmath + \rightskip=\z@ \@plus 8em\pretolerance=10000 }} +\renewcommand\subsubsection{\@startsection{subsubsection}{3}{\z@}% + {-18\p@ \@plus -4\p@ \@minus -4\p@}% + {-0.5em \@plus -0.22em \@minus -0.1em}% + {\normalfont\normalsize\bfseries\boldmath}} +\renewcommand\paragraph{\@startsection{paragraph}{4}{\z@}% + {-12\p@ \@plus -4\p@ \@minus -4\p@}% + {-0.5em \@plus -0.22em \@minus -0.1em}% + {\normalfont\normalsize\itshape}} +\renewcommand\subparagraph[1]{\typeout{LLNCS warning: You should not use + 
\string\subparagraph\space with this class}\vskip0.5cm +You should not use \verb|\subparagraph| with this class.\vskip0.5cm} + +\DeclareMathSymbol{\Gamma}{\mathalpha}{letters}{"00} +\DeclareMathSymbol{\Delta}{\mathalpha}{letters}{"01} +\DeclareMathSymbol{\Theta}{\mathalpha}{letters}{"02} +\DeclareMathSymbol{\Lambda}{\mathalpha}{letters}{"03} +\DeclareMathSymbol{\Xi}{\mathalpha}{letters}{"04} +\DeclareMathSymbol{\Pi}{\mathalpha}{letters}{"05} +\DeclareMathSymbol{\Sigma}{\mathalpha}{letters}{"06} +\DeclareMathSymbol{\Upsilon}{\mathalpha}{letters}{"07} +\DeclareMathSymbol{\Phi}{\mathalpha}{letters}{"08} +\DeclareMathSymbol{\Psi}{\mathalpha}{letters}{"09} +\DeclareMathSymbol{\Omega}{\mathalpha}{letters}{"0A} + +\let\footnotesize\small + +\if@custvec +\def\vec#1{\mathchoice{\mbox{\boldmath$\displaystyle#1$}} +{\mbox{\boldmath$\textstyle#1$}} +{\mbox{\boldmath$\scriptstyle#1$}} +{\mbox{\boldmath$\scriptscriptstyle#1$}}} +\fi + +\def\squareforqed{\hbox{\rlap{$\sqcap$}$\sqcup$}} +\def\qed{\ifmmode\squareforqed\else{\unskip\nobreak\hfil +\penalty50\hskip1em\null\nobreak\hfil\squareforqed +\parfillskip=0pt\finalhyphendemerits=0\endgraf}\fi} + +\def\getsto{\mathrel{\mathchoice {\vcenter{\offinterlineskip +\halign{\hfil +$\displaystyle##$\hfil\cr\gets\cr\to\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\textstyle##$\hfil\cr\gets +\cr\to\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\scriptstyle##$\hfil\cr\gets +\cr\to\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\scriptscriptstyle##$\hfil\cr +\gets\cr\to\cr}}}}} +\def\lid{\mathrel{\mathchoice {\vcenter{\offinterlineskip\halign{\hfil +$\displaystyle##$\hfil\cr<\cr\noalign{\vskip1.2pt}=\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\textstyle##$\hfil\cr<\cr +\noalign{\vskip1.2pt}=\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\scriptstyle##$\hfil\cr<\cr +\noalign{\vskip1pt}=\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\scriptscriptstyle##$\hfil\cr +<\cr +\noalign{\vskip0.9pt}=\cr}}}}} 
+\def\gid{\mathrel{\mathchoice {\vcenter{\offinterlineskip\halign{\hfil +$\displaystyle##$\hfil\cr>\cr\noalign{\vskip1.2pt}=\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\textstyle##$\hfil\cr>\cr +\noalign{\vskip1.2pt}=\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\scriptstyle##$\hfil\cr>\cr +\noalign{\vskip1pt}=\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\scriptscriptstyle##$\hfil\cr +>\cr +\noalign{\vskip0.9pt}=\cr}}}}} +\def\grole{\mathrel{\mathchoice {\vcenter{\offinterlineskip +\halign{\hfil +$\displaystyle##$\hfil\cr>\cr\noalign{\vskip-1pt}<\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\textstyle##$\hfil\cr +>\cr\noalign{\vskip-1pt}<\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\scriptstyle##$\hfil\cr +>\cr\noalign{\vskip-0.8pt}<\cr}}} +{\vcenter{\offinterlineskip\halign{\hfil$\scriptscriptstyle##$\hfil\cr +>\cr\noalign{\vskip-0.3pt}<\cr}}}}} +\def\bbbr{{\rm I\!R}} % real numbers +\def\bbbm{{\rm I\!M}} +\def\bbbn{{\rm I\!N}} % natural numbers +\def\bbbf{{\rm I\!F}} +\def\bbbh{{\rm I\!H}} +\def\bbbk{{\rm I\!K}} +\def\bbbp{{\rm I\!P}} +\def\bbbone{{\mathchoice {\rm 1\mskip-4mu l} {\rm 1\mskip-4mu l} +{\rm 1\mskip-4.5mu l} {\rm 1\mskip-5mu l}}} +\def\bbbc{{\mathchoice {\setbox0=\hbox{$\displaystyle\rm C$}\hbox{\hbox +to0pt{\kern0.4\wd0\vrule height0.9\ht0\hss}\box0}} +{\setbox0=\hbox{$\textstyle\rm C$}\hbox{\hbox +to0pt{\kern0.4\wd0\vrule height0.9\ht0\hss}\box0}} +{\setbox0=\hbox{$\scriptstyle\rm C$}\hbox{\hbox +to0pt{\kern0.4\wd0\vrule height0.9\ht0\hss}\box0}} +{\setbox0=\hbox{$\scriptscriptstyle\rm C$}\hbox{\hbox +to0pt{\kern0.4\wd0\vrule height0.9\ht0\hss}\box0}}}} +\def\bbbq{{\mathchoice {\setbox0=\hbox{$\displaystyle\rm +Q$}\hbox{\raise +0.15\ht0\hbox to0pt{\kern0.4\wd0\vrule height0.8\ht0\hss}\box0}} +{\setbox0=\hbox{$\textstyle\rm Q$}\hbox{\raise +0.15\ht0\hbox to0pt{\kern0.4\wd0\vrule height0.8\ht0\hss}\box0}} +{\setbox0=\hbox{$\scriptstyle\rm Q$}\hbox{\raise +0.15\ht0\hbox to0pt{\kern0.4\wd0\vrule height0.7\ht0\hss}\box0}} 
+{\setbox0=\hbox{$\scriptscriptstyle\rm Q$}\hbox{\raise +0.15\ht0\hbox to0pt{\kern0.4\wd0\vrule height0.7\ht0\hss}\box0}}}} +\def\bbbt{{\mathchoice {\setbox0=\hbox{$\displaystyle\rm +T$}\hbox{\hbox to0pt{\kern0.3\wd0\vrule height0.9\ht0\hss}\box0}} +{\setbox0=\hbox{$\textstyle\rm T$}\hbox{\hbox +to0pt{\kern0.3\wd0\vrule height0.9\ht0\hss}\box0}} +{\setbox0=\hbox{$\scriptstyle\rm T$}\hbox{\hbox +to0pt{\kern0.3\wd0\vrule height0.9\ht0\hss}\box0}} +{\setbox0=\hbox{$\scriptscriptstyle\rm T$}\hbox{\hbox +to0pt{\kern0.3\wd0\vrule height0.9\ht0\hss}\box0}}}} +\def\bbbs{{\mathchoice +{\setbox0=\hbox{$\displaystyle \rm S$}\hbox{\raise0.5\ht0\hbox +to0pt{\kern0.35\wd0\vrule height0.45\ht0\hss}\hbox +to0pt{\kern0.55\wd0\vrule height0.5\ht0\hss}\box0}} +{\setbox0=\hbox{$\textstyle \rm S$}\hbox{\raise0.5\ht0\hbox +to0pt{\kern0.35\wd0\vrule height0.45\ht0\hss}\hbox +to0pt{\kern0.55\wd0\vrule height0.5\ht0\hss}\box0}} +{\setbox0=\hbox{$\scriptstyle \rm S$}\hbox{\raise0.5\ht0\hbox +to0pt{\kern0.35\wd0\vrule height0.45\ht0\hss}\raise0.05\ht0\hbox +to0pt{\kern0.5\wd0\vrule height0.45\ht0\hss}\box0}} +{\setbox0=\hbox{$\scriptscriptstyle\rm S$}\hbox{\raise0.5\ht0\hbox +to0pt{\kern0.4\wd0\vrule height0.45\ht0\hss}\raise0.05\ht0\hbox +to0pt{\kern0.55\wd0\vrule height0.45\ht0\hss}\box0}}}} +\def\bbbz{{\mathchoice {\hbox{$\mathsf\textstyle Z\kern-0.4em Z$}} +{\hbox{$\mathsf\textstyle Z\kern-0.4em Z$}} +{\hbox{$\mathsf\scriptstyle Z\kern-0.3em Z$}} +{\hbox{$\mathsf\scriptscriptstyle Z\kern-0.2em Z$}}}} + +\let\ts\, + +\setlength\leftmargini {17\p@} +\setlength\leftmargin {\leftmargini} +\setlength\leftmarginii {\leftmargini} +\setlength\leftmarginiii {\leftmargini} +\setlength\leftmarginiv {\leftmargini} +\setlength \labelsep {.5em} +\setlength \labelwidth{\leftmargini} +\addtolength\labelwidth{-\labelsep} + +\def\@listI{\leftmargin\leftmargini + \parsep 0\p@ \@plus1\p@ \@minus\p@ + \topsep 8\p@ \@plus2\p@ \@minus4\p@ + \itemsep0\p@} +\let\@listi\@listI +\@listi +\def\@listii 
{\leftmargin\leftmarginii + \labelwidth\leftmarginii + \advance\labelwidth-\labelsep + \topsep 0\p@ \@plus2\p@ \@minus\p@} +\def\@listiii{\leftmargin\leftmarginiii + \labelwidth\leftmarginiii + \advance\labelwidth-\labelsep + \topsep 0\p@ \@plus\p@\@minus\p@ + \parsep \z@ + \partopsep \p@ \@plus\z@ \@minus\p@} + +\renewcommand\labelitemi{\normalfont\bfseries --} +\renewcommand\labelitemii{$\m@th\bullet$} + +\setlength\arraycolsep{1.4\p@} +\setlength\tabcolsep{1.4\p@} + +\def\tableofcontents{\chapter*{\contentsname\@mkboth{{\contentsname}}% + {{\contentsname}}} + \def\authcount##1{\setcounter{auco}{##1}\setcounter{@auth}{1}} + \def\lastand{\ifnum\value{auco}=2\relax + \unskip{} \andname\ + \else + \unskip \lastandname\ + \fi}% + \def\and{\stepcounter{@auth}\relax + \ifnum\value{@auth}=\value{auco}% + \lastand + \else + \unskip, + \fi}% + \@starttoc{toc}\if@restonecol\twocolumn\fi} + +\def\l@part#1#2{\addpenalty{\@secpenalty}% + \addvspace{2em plus\p@}% % space above part line + \begingroup + \parindent \z@ + \rightskip \z@ plus 5em + \hrule\vskip5pt + \large % same size as for a contribution heading + \bfseries\boldmath % set line in boldface + \leavevmode % TeX command to enter horizontal mode. 
+ #1\par + \vskip5pt + \hrule + \vskip1pt + \nobreak % Never break after part entry + \endgroup} + +\def\@dotsep{2} + +\let\phantomsection=\relax + +\def\hyperhrefextend{\ifx\hyper@anchor\@undefined\else +{}\fi} + +\def\addnumcontentsmark#1#2#3{% +\addtocontents{#1}{\protect\contentsline{#2}{\protect\numberline + {\thechapter}#3}{\thepage}\hyperhrefextend}}% +\def\addcontentsmark#1#2#3{% +\addtocontents{#1}{\protect\contentsline{#2}{#3}{\thepage}\hyperhrefextend}}% +\def\addcontentsmarkwop#1#2#3{% +\addtocontents{#1}{\protect\contentsline{#2}{#3}{0}\hyperhrefextend}}% + +\def\@adcmk[#1]{\ifcase #1 \or +\def\@gtempa{\addnumcontentsmark}% + \or \def\@gtempa{\addcontentsmark}% + \or \def\@gtempa{\addcontentsmarkwop}% + \fi\@gtempa{toc}{chapter}% +} +\def\addtocmark{% +\phantomsection +\@ifnextchar[{\@adcmk}{\@adcmk[3]}% +} + +\def\l@chapter#1#2{\addpenalty{-\@highpenalty} + \vskip 1.0em plus 1pt \@tempdima 1.5em \begingroup + \parindent \z@ \rightskip \@tocrmarg + \advance\rightskip by 0pt plus 2cm + \parfillskip -\rightskip \pretolerance=10000 + \leavevmode \advance\leftskip\@tempdima \hskip -\leftskip + {\large\bfseries\boldmath#1}\ifx0#2\hfil\null + \else + \nobreak + \leaders\hbox{$\m@th \mkern \@dotsep mu.\mkern + \@dotsep mu$}\hfill + \nobreak\hbox to\@pnumwidth{\hss #2}% + \fi\par + \penalty\@highpenalty \endgroup} + +\def\l@title#1#2{\addpenalty{-\@highpenalty} + \addvspace{8pt plus 1pt} + \@tempdima \z@ + \begingroup + \parindent \z@ \rightskip \@tocrmarg + \advance\rightskip by 0pt plus 2cm + \parfillskip -\rightskip \pretolerance=10000 + \leavevmode \advance\leftskip\@tempdima \hskip -\leftskip + #1\nobreak + \leaders\hbox{$\m@th \mkern \@dotsep mu.\mkern + \@dotsep mu$}\hfill + \nobreak\hbox to\@pnumwidth{\hss #2}\par + \penalty\@highpenalty \endgroup} + +\def\l@author#1#2{\addpenalty{\@highpenalty} + \@tempdima=15\p@ %\z@ + \begingroup + \parindent \z@ \rightskip \@tocrmarg + \advance\rightskip by 0pt plus 2cm + \pretolerance=10000 + \leavevmode 
\advance\leftskip\@tempdima %\hskip -\leftskip + \textit{#1}\par + \penalty\@highpenalty \endgroup} + +\setcounter{tocdepth}{0} +\newdimen\tocchpnum +\newdimen\tocsecnum +\newdimen\tocsectotal +\newdimen\tocsubsecnum +\newdimen\tocsubsectotal +\newdimen\tocsubsubsecnum +\newdimen\tocsubsubsectotal +\newdimen\tocparanum +\newdimen\tocparatotal +\newdimen\tocsubparanum +\tocchpnum=\z@ % no chapter numbers +\tocsecnum=15\p@ % section 88. plus 2.222pt +\tocsubsecnum=23\p@ % subsection 88.8 plus 2.222pt +\tocsubsubsecnum=27\p@ % subsubsection 88.8.8 plus 1.444pt +\tocparanum=35\p@ % paragraph 88.8.8.8 plus 1.666pt +\tocsubparanum=43\p@ % subparagraph 88.8.8.8.8 plus 1.888pt +\def\calctocindent{% +\tocsectotal=\tocchpnum +\advance\tocsectotal by\tocsecnum +\tocsubsectotal=\tocsectotal +\advance\tocsubsectotal by\tocsubsecnum +\tocsubsubsectotal=\tocsubsectotal +\advance\tocsubsubsectotal by\tocsubsubsecnum +\tocparatotal=\tocsubsubsectotal +\advance\tocparatotal by\tocparanum} +\calctocindent + +\def\l@section{\@dottedtocline{1}{\tocchpnum}{\tocsecnum}} +\def\l@subsection{\@dottedtocline{2}{\tocsectotal}{\tocsubsecnum}} +\def\l@subsubsection{\@dottedtocline{3}{\tocsubsectotal}{\tocsubsubsecnum}} +\def\l@paragraph{\@dottedtocline{4}{\tocsubsubsectotal}{\tocparanum}} +\def\l@subparagraph{\@dottedtocline{5}{\tocparatotal}{\tocsubparanum}} + +\def\listoffigures{\@restonecolfalse\if@twocolumn\@restonecoltrue\onecolumn + \fi\section*{\listfigurename\@mkboth{{\listfigurename}}{{\listfigurename}}} + \@starttoc{lof}\if@restonecol\twocolumn\fi} +\def\l@figure{\@dottedtocline{1}{0em}{1.5em}} + +\def\listoftables{\@restonecolfalse\if@twocolumn\@restonecoltrue\onecolumn + \fi\section*{\listtablename\@mkboth{{\listtablename}}{{\listtablename}}} + \@starttoc{lot}\if@restonecol\twocolumn\fi} +\let\l@table\l@figure + +\renewcommand\listoffigures{% + \section*{\listfigurename + \@mkboth{\listfigurename}{\listfigurename}}% + \@starttoc{lof}% + } + +\renewcommand\listoftables{% + 
\section*{\listtablename + \@mkboth{\listtablename}{\listtablename}}% + \@starttoc{lot}% + } + +\ifx\oribibl\undefined +\ifx\citeauthoryear\undefined +\renewenvironment{thebibliography}[1] + {\section*{\refname} + \def\@biblabel##1{##1.} + \small + \list{\@biblabel{\@arabic\c@enumiv}}% + {\settowidth\labelwidth{\@biblabel{#1}}% + \leftmargin\labelwidth + \advance\leftmargin\labelsep + \if@openbib + \advance\leftmargin\bibindent + \itemindent -\bibindent + \listparindent \itemindent + \parsep \z@ + \fi + \usecounter{enumiv}% + \let\p@enumiv\@empty + \renewcommand\theenumiv{\@arabic\c@enumiv}}% + \if@openbib + \renewcommand\newblock{\par}% + \else + \renewcommand\newblock{\hskip .11em \@plus.33em \@minus.07em}% + \fi + \sloppy\clubpenalty4000\widowpenalty4000% + \sfcode`\.=\@m} + {\def\@noitemerr + {\@latex@warning{Empty `thebibliography' environment}}% + \endlist} +\def\@lbibitem[#1]#2{\item[{[#1]}\hfill]\if@filesw + {\let\protect\noexpand\immediate + \write\@auxout{\string\bibcite{#2}{#1}}}\fi\ignorespaces} +\newcount\@tempcntc +\def\@citex[#1]#2{\if@filesw\immediate\write\@auxout{\string\citation{#2}}\fi + \@tempcnta\z@\@tempcntb\m@ne\def\@citea{}\@cite{\@for\@citeb:=#2\do + {\@ifundefined + {b@\@citeb}{\@citeo\@tempcntb\m@ne\@citea\def\@citea{,}{\bfseries + ?}\@warning + {Citation `\@citeb' on page \thepage \space undefined}}% + {\setbox\z@\hbox{\global\@tempcntc0\csname b@\@citeb\endcsname\relax}% + \ifnum\@tempcntc=\z@ \@citeo\@tempcntb\m@ne + \@citea\def\@citea{,}\hbox{\csname b@\@citeb\endcsname}% + \else + \advance\@tempcntb\@ne + \ifnum\@tempcntb=\@tempcntc + \else\advance\@tempcntb\m@ne\@citeo + \@tempcnta\@tempcntc\@tempcntb\@tempcntc\fi\fi}}\@citeo}{#1}} +\def\@citeo{\ifnum\@tempcnta>\@tempcntb\else + \@citea\def\@citea{,\,\hskip\z@skip}% + \ifnum\@tempcnta=\@tempcntb\the\@tempcnta\else + {\advance\@tempcnta\@ne\ifnum\@tempcnta=\@tempcntb \else + \def\@citea{--}\fi + \advance\@tempcnta\m@ne\the\@tempcnta\@citea\the\@tempcntb}\fi\fi} +\else 
+\renewenvironment{thebibliography}[1] + {\section*{\refname} + \small + \list{}% + {\settowidth\labelwidth{}% + \leftmargin\parindent + \itemindent=-\parindent + \labelsep=\z@ + \if@openbib + \advance\leftmargin\bibindent + \itemindent -\bibindent + \listparindent \itemindent + \parsep \z@ + \fi + \usecounter{enumiv}% + \let\p@enumiv\@empty + \renewcommand\theenumiv{}}% + \if@openbib + \renewcommand\newblock{\par}% + \else + \renewcommand\newblock{\hskip .11em \@plus.33em \@minus.07em}% + \fi + \sloppy\clubpenalty4000\widowpenalty4000% + \sfcode`\.=\@m} + {\def\@noitemerr + {\@latex@warning{Empty `thebibliography' environment}}% + \endlist} + \def\@cite#1{#1}% + \def\@lbibitem[#1]#2{\item[]\if@filesw + {\def\protect##1{\string ##1\space}\immediate + \write\@auxout{\string\bibcite{#2}{#1}}}\fi\ignorespaces} + \fi +\else +\@cons\@openbib@code{\noexpand\small} +\fi + +\def\idxquad{\hskip 10\p@}% space that divides entry from number + +\def\@idxitem{\par\hangindent 10\p@} + +\def\subitem{\par\setbox0=\hbox{--\enspace}% second order + \noindent\hangindent\wd0\box0}% index entry + +\def\subsubitem{\par\setbox0=\hbox{--\,--\enspace}% third + \noindent\hangindent\wd0\box0}% order index entry + +\def\indexspace{\par \vskip 10\p@ plus5\p@ minus3\p@\relax} + +\renewenvironment{theindex} + {\@mkboth{\indexname}{\indexname}% + \thispagestyle{empty}\parindent\z@ + \parskip\z@ \@plus .3\p@\relax + \let\item\par + \def\,{\relax\ifmmode\mskip\thinmuskip + \else\hskip0.2em\ignorespaces\fi}% + \normalfont\small + \begin{multicols}{2}[\@makeschapterhead{\indexname}]% + } + {\end{multicols}} + +\renewcommand\footnoterule{% + \kern-3\p@ + \hrule\@width 2truecm + \kern2.6\p@} + \newdimen\fnindent + \fnindent1em +\long\def\@makefntext#1{% + \parindent \fnindent% + \leftskip \fnindent% + \noindent + \llap{\hb@xt@1em{\hss\@makefnmark\ }}\ignorespaces#1} + +\long\def\@makecaption#1#2{% + \small + \vskip\abovecaptionskip + \sbox\@tempboxa{{\bfseries #1.} #2}% + \ifdim \wd\@tempboxa >\hsize + 
{\bfseries #1.} #2\par + \else + \global \@minipagefalse + \hb@xt@\hsize{\hfil\box\@tempboxa\hfil}% + \fi + \vskip\belowcaptionskip} + +\def\fps@figure{htbp} +\def\fnum@figure{\figurename\thinspace\thefigure} +\def \@floatboxreset {% + \reset@font + \small + \@setnobreak + \@setminipage +} +\def\fps@table{htbp} +\def\fnum@table{\tablename~\thetable} +\renewenvironment{table} + {\setlength\abovecaptionskip{0\p@}% + \setlength\belowcaptionskip{10\p@}% + \@float{table}} + {\end@float} +\renewenvironment{table*} + {\setlength\abovecaptionskip{0\p@}% + \setlength\belowcaptionskip{10\p@}% + \@dblfloat{table}} + {\end@dblfloat} + +\long\def\@caption#1[#2]#3{\par\addcontentsline{\csname + ext@#1\endcsname}{#1}{\protect\numberline{\csname + the#1\endcsname}{\ignorespaces #2}}\begingroup + \@parboxrestore + \@makecaption{\csname fnum@#1\endcsname}{\ignorespaces #3}\par + \endgroup} + +% LaTeX does not provide a command to enter the authors institute +% addresses. The \institute command is defined here. 
+ +\newcounter{@inst} +\newcounter{@auth} +\newcounter{auco} +\newdimen\instindent +\newbox\authrun +\newtoks\authorrunning +\newtoks\tocauthor +\newbox\titrun +\newtoks\titlerunning +\newtoks\toctitle + +\def\clearheadinfo{\gdef\@author{No Author Given}% + \gdef\@title{No Title Given}% + \gdef\@subtitle{}% + \gdef\@institute{No Institute Given}% + \gdef\@thanks{}% + \global\titlerunning={}\global\authorrunning={}% + \global\toctitle={}\global\tocauthor={}} + +\def\institute#1{\gdef\@institute{#1}} + +\def\institutename{\par + \begingroup + \parskip=\z@ + \parindent=\z@ + \setcounter{@inst}{1}% + \def\and{\par\stepcounter{@inst}% + \noindent$^{\the@inst}$\enspace\ignorespaces}% + \setbox0=\vbox{\def\thanks##1{}\@institute}% + \ifnum\c@@inst=1\relax + \gdef\fnnstart{0}% + \else + \xdef\fnnstart{\c@@inst}% + \setcounter{@inst}{1}% + \noindent$^{\the@inst}$\enspace + \fi + \ignorespaces + \@institute\par + \endgroup} + +\def\@fnsymbol#1{\ensuremath{\ifcase#1\or\star\or{\star\star}\or + {\star\star\star}\or \dagger\or \ddagger\or + \mathchar "278\or \mathchar "27B\or \|\or **\or \dagger\dagger + \or \ddagger\ddagger \else\@ctrerr\fi}} + +\def\inst#1{\unskip$^{#1}$} +\def\fnmsep{\unskip$^,$} +\def\email#1{{\tt#1}} +\AtBeginDocument{\@ifundefined{url}{\def\url#1{#1}}{}% +\@ifpackageloaded{babel}{% +\@ifundefined{extrasenglish}{}{\addto\extrasenglish{\switcht@albion}}% +\@ifundefined{extrasfrenchb}{}{\addto\extrasfrenchb{\switcht@francais}}% +\@ifundefined{extrasgerman}{}{\addto\extrasgerman{\switcht@deutsch}}% +\@ifundefined{extrasngerman}{}{\addto\extrasngerman{\switcht@deutsch}}% +}{\switcht@@therlang}% +\providecommand{\keywords}[1]{\par\addvspace\baselineskip +\noindent\keywordname\enspace\ignorespaces#1}% +} +\def\homedir{\~{ }} + +\def\subtitle#1{\gdef\@subtitle{#1}} +\clearheadinfo +% +%%% to avoid hyperref warnings +\providecommand*{\toclevel@author}{999} +%%% to make title-entry parent of section-entries +\providecommand*{\toclevel@title}{0} +% 
+\renewcommand\maketitle{\newpage +\phantomsection + \refstepcounter{chapter}% + \stepcounter{section}% + \setcounter{section}{0}% + \setcounter{subsection}{0}% + \setcounter{figure}{0} + \setcounter{table}{0} + \setcounter{equation}{0} + \setcounter{footnote}{0}% + \begingroup + \parindent=\z@ + \renewcommand\thefootnote{\@fnsymbol\c@footnote}% + \if@twocolumn + \ifnum \col@number=\@ne + \@maketitle + \else + \twocolumn[\@maketitle]% + \fi + \else + \newpage + \global\@topnum\z@ % Prevents figures from going at top of page. + \@maketitle + \fi + \thispagestyle{empty}\@thanks +% + \def\\{\unskip\ \ignorespaces}\def\inst##1{\unskip{}}% + \def\thanks##1{\unskip{}}\def\fnmsep{\unskip}% + \instindent=\hsize + \advance\instindent by-\headlineindent + \if!\the\toctitle!\addcontentsline{toc}{title}{\@title}\else + \addcontentsline{toc}{title}{\the\toctitle}\fi + \if@runhead + \if!\the\titlerunning!\else + \edef\@title{\the\titlerunning}% + \fi + \global\setbox\titrun=\hbox{\small\rm\unboldmath\ignorespaces\@title}% + \ifdim\wd\titrun>\instindent + \typeout{Title too long for running head. Please supply}% + \typeout{a shorter form with \string\titlerunning\space prior to + \string\maketitle}% + \global\setbox\titrun=\hbox{\small\rm + Title Suppressed Due to Excessive Length}% + \fi + \xdef\@title{\copy\titrun}% + \fi +% + \if!\the\tocauthor!\relax + {\def\and{\noexpand\protect\noexpand\and}% + \protected@xdef\toc@uthor{\@author}}% + \else + \def\\{\noexpand\protect\noexpand\newline}% + \protected@xdef\scratch{\the\tocauthor}% + \protected@xdef\toc@uthor{\scratch}% + \fi + \addtocontents{toc}{\noexpand\protect\noexpand\authcount{\the\c@auco}}% + \addcontentsline{toc}{author}{\toc@uthor}% + \if@runhead + \if!\the\authorrunning! 
+ \value{@inst}=\value{@auth}% + \setcounter{@auth}{1}% + \else + \edef\@author{\the\authorrunning}% + \fi + \global\setbox\authrun=\hbox{\small\unboldmath\@author\unskip}% + \ifdim\wd\authrun>\instindent + \typeout{Names of authors too long for running head. Please supply}% + \typeout{a shorter form with \string\authorrunning\space prior to + \string\maketitle}% + \global\setbox\authrun=\hbox{\small\rm + Authors Suppressed Due to Excessive Length}% + \fi + \xdef\@author{\copy\authrun}% + \markboth{\@author}{\@title}% + \fi + \endgroup + \setcounter{footnote}{\fnnstart}% + \clearheadinfo} +% +\def\@maketitle{\newpage + \markboth{}{}% + \def\lastand{\ifnum\value{@inst}=2\relax + \unskip{} \andname\ + \else + \unskip \lastandname\ + \fi}% + \def\and{\stepcounter{@auth}\relax + \ifnum\value{@auth}=\value{@inst}% + \lastand + \else + \unskip, + \fi}% + \begin{center}% + \let\newline\\ + {\Large \bfseries\boldmath + \pretolerance=10000 + \@title \par}\vskip .8cm +\if!\@subtitle!\else {\large \bfseries\boldmath + \vskip -.65cm + \pretolerance=10000 + \@subtitle \par}\vskip .8cm\fi + \setbox0=\vbox{\setcounter{@auth}{1}\def\and{\stepcounter{@auth}}% + \def\thanks##1{}\@author}% + \global\value{@inst}=\value{@auth}% + \global\value{auco}=\value{@auth}% + \setcounter{@auth}{1}% +{\lineskip .5em +\noindent\ignorespaces +\@author\vskip.35cm} + {\small\institutename} + \end{center}% + } + +% definition of the "\spnewtheorem" command. +% +% Usage: +% +% \spnewtheorem{env_nam}{caption}[within]{cap_font}{body_font} +% or \spnewtheorem{env_nam}[numbered_like]{caption}{cap_font}{body_font} +% or \spnewtheorem*{env_nam}{caption}{cap_font}{body_font} +% +% New are "cap_font" and "body_font". They stand for +% the font definitions of the caption and the text itself. +% +% "\spnewtheorem*" gives a theorem without number. +% +% A defined spnewtheorem environment is used as described +% by Lamport. 
+% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +\def\@thmcountersep{} +\def\@thmcounterend{.} + +\def\spnewtheorem{\@ifstar{\@sthm}{\@Sthm}} + +% definition of \spnewtheorem with number + +\def\@spnthm#1#2{% + \@ifnextchar[{\@spxnthm{#1}{#2}}{\@spynthm{#1}{#2}}} +\def\@Sthm#1{\@ifnextchar[{\@spothm{#1}}{\@spnthm{#1}}} + +\def\@spxnthm#1#2[#3]#4#5{\expandafter\@ifdefinable\csname #1\endcsname + {\@definecounter{#1}\@addtoreset{#1}{#3}% + \expandafter\xdef\csname the#1\endcsname{\expandafter\noexpand + \csname the#3\endcsname \noexpand\@thmcountersep \@thmcounter{#1}}% + \expandafter\xdef\csname #1name\endcsname{#2}% + \global\@namedef{#1}{\@spthm{#1}{\csname #1name\endcsname}{#4}{#5}}% + \global\@namedef{end#1}{\@endtheorem}}} + +\def\@spynthm#1#2#3#4{\expandafter\@ifdefinable\csname #1\endcsname + {\@definecounter{#1}% + \expandafter\xdef\csname the#1\endcsname{\@thmcounter{#1}}% + \expandafter\xdef\csname #1name\endcsname{#2}% + \global\@namedef{#1}{\@spthm{#1}{\csname #1name\endcsname}{#3}{#4}}% + \global\@namedef{end#1}{\@endtheorem}}} + +\def\@spothm#1[#2]#3#4#5{% + \@ifundefined{c@#2}{\@latexerr{No theorem environment `#2' defined}\@eha}% + {\expandafter\@ifdefinable\csname #1\endcsname + {\newaliascnt{#1}{#2}% + \expandafter\xdef\csname #1name\endcsname{#3}% + \global\@namedef{#1}{\@spthm{#1}{\csname #1name\endcsname}{#4}{#5}}% + \global\@namedef{end#1}{\@endtheorem}}}} + +\def\@spthm#1#2#3#4{\topsep 7\p@ \@plus2\p@ \@minus4\p@ +\refstepcounter{#1}% +\@ifnextchar[{\@spythm{#1}{#2}{#3}{#4}}{\@spxthm{#1}{#2}{#3}{#4}}} + +\def\@spxthm#1#2#3#4{\@spbegintheorem{#2}{\csname the#1\endcsname}{#3}{#4}% + \ignorespaces} + +\def\@spythm#1#2#3#4[#5]{\@spopargbegintheorem{#2}{\csname + the#1\endcsname}{#5}{#3}{#4}\ignorespaces} + +\def\@spbegintheorem#1#2#3#4{\trivlist + \item[\hskip\labelsep{#3#1\ #2\@thmcounterend}]#4} + +\def\@spopargbegintheorem#1#2#3#4#5{\trivlist + \item[\hskip\labelsep{#4#1\ #2}]{#4(#3)\@thmcounterend\ }#5} + +% definition of 
\spnewtheorem* without number + +\def\@sthm#1#2{\@Ynthm{#1}{#2}} + +\def\@Ynthm#1#2#3#4{\expandafter\@ifdefinable\csname #1\endcsname + {\global\@namedef{#1}{\@Thm{\csname #1name\endcsname}{#3}{#4}}% + \expandafter\xdef\csname #1name\endcsname{#2}% + \global\@namedef{end#1}{\@endtheorem}}} + +\def\@Thm#1#2#3{\topsep 7\p@ \@plus2\p@ \@minus4\p@ +\@ifnextchar[{\@Ythm{#1}{#2}{#3}}{\@Xthm{#1}{#2}{#3}}} + +\def\@Xthm#1#2#3{\@Begintheorem{#1}{#2}{#3}\ignorespaces} + +\def\@Ythm#1#2#3[#4]{\@Opargbegintheorem{#1} + {#4}{#2}{#3}\ignorespaces} + +\def\@Begintheorem#1#2#3{#3\trivlist + \item[\hskip\labelsep{#2#1\@thmcounterend}]} + +\def\@Opargbegintheorem#1#2#3#4{#4\trivlist + \item[\hskip\labelsep{#3#1}]{#3(#2)\@thmcounterend\ }} + +\if@envcntsect + \def\@thmcountersep{.} + \spnewtheorem{theorem}{Theorem}[section]{\bfseries}{\itshape} +\else + \spnewtheorem{theorem}{Theorem}{\bfseries}{\itshape} + \if@envcntreset + \@addtoreset{theorem}{section} + \else + \@addtoreset{theorem}{chapter} + \fi +\fi + +% definition of various theorem environments +\spnewtheorem*{claim}{Claim}{\itshape}{\rmfamily} +\spnewtheorem*{proof}{Proof}{\itshape}{\rmfamily} +\if@envcntsame % all environments like theorem 
+ \def\spn@wtheorem#1#2#3#4{\@spothm{#1}[theorem]{#2}{#3}{#4}} +\else % all environments with their own counter + \if@envcntsect % numbered within section + \def\spn@wtheorem#1#2#3#4{\@spxnthm{#1}{#2}[section]{#3}{#4}} + \else % not numbered within section + \if@envcntreset + \def\spn@wtheorem#1#2#3#4{\@spynthm{#1}{#2}{#3}{#4} + \@addtoreset{#1}{section}} + \else + \def\spn@wtheorem#1#2#3#4{\@spynthm{#1}{#2}{#3}{#4} + \@addtoreset{#1}{chapter}}% + \fi + \fi +\fi +\spn@wtheorem{case}{Case}{\itshape}{\rmfamily} +\spn@wtheorem{conjecture}{Conjecture}{\itshape}{\rmfamily} +\spn@wtheorem{corollary}{Corollary}{\bfseries}{\itshape} +\spn@wtheorem{definition}{Definition}{\bfseries}{\itshape} +\spn@wtheorem{example}{Example}{\itshape}{\rmfamily} +\spn@wtheorem{exercise}{Exercise}{\itshape}{\rmfamily} +\spn@wtheorem{lemma}{Lemma}{\bfseries}{\itshape} +\spn@wtheorem{note}{Note}{\itshape}{\rmfamily} +\spn@wtheorem{problem}{Problem}{\itshape}{\rmfamily} +\spn@wtheorem{property}{Property}{\itshape}{\rmfamily} +\spn@wtheorem{proposition}{Proposition}{\bfseries}{\itshape} +\spn@wtheorem{question}{Question}{\itshape}{\rmfamily} +\spn@wtheorem{solution}{Solution}{\itshape}{\rmfamily} +\spn@wtheorem{remark}{Remark}{\itshape}{\rmfamily} + +\def\@takefromreset#1#2{% + \def\@tempa{#1}% + \let\@tempd\@elt + \def\@elt##1{% + \def\@tempb{##1}% + \ifx\@tempa\@tempb\else + \@addtoreset{##1}{#2}% + \fi}% + \expandafter\expandafter\let\expandafter\@tempc\csname cl@#2\endcsname + \expandafter\def\csname cl@#2\endcsname{}% + \@tempc + \let\@elt\@tempd} + +\def\theopargself{\def\@spopargbegintheorem##1##2##3##4##5{\trivlist + \item[\hskip\labelsep{##4##1\ ##2}]{##4##3\@thmcounterend\ }##5} + \def\@Opargbegintheorem##1##2##3##4{##4\trivlist + \item[\hskip\labelsep{##3##1}]{##3##2\@thmcounterend\ }} + } + +\renewenvironment{abstract}{% + \list{}{\advance\topsep by0.35cm\relax\small + \leftmargin=1cm + \labelwidth=\z@ + \listparindent=\z@ + \itemindent\listparindent + 
\rightmargin\leftmargin}\item[\hskip\labelsep + \bfseries\abstractname]} + {\endlist} + +\newdimen\headlineindent % dimension for space between +\headlineindent=1.166cm % number and text of headings. + +\def\ps@headings{\let\@mkboth\@gobbletwo + \let\@oddfoot\@empty\let\@evenfoot\@empty + \def\@evenhead{\normalfont\small\rlap{\thepage}\hspace{\headlineindent}% + \leftmark\hfil} + \def\@oddhead{\normalfont\small\hfil\rightmark\hspace{\headlineindent}% + \llap{\thepage}} + \def\chaptermark##1{}% + \def\sectionmark##1{}% + \def\subsectionmark##1{}} + +\def\ps@titlepage{\let\@mkboth\@gobbletwo + \let\@oddfoot\@empty\let\@evenfoot\@empty + \def\@evenhead{\normalfont\small\rlap{\thepage}\hspace{\headlineindent}% + \hfil} + \def\@oddhead{\normalfont\small\hfil\hspace{\headlineindent}% + \llap{\thepage}} + \def\chaptermark##1{}% + \def\sectionmark##1{}% + \def\subsectionmark##1{}} + +\if@runhead\ps@headings\else +\ps@empty\fi + +\setlength\arraycolsep{1.4\p@} +\setlength\tabcolsep{1.4\p@} + +\endinput +%end of file llncs.cls diff --git a/Paper2/main.aux b/Paper2/main.aux new file mode 100644 index 0000000..5cf4e76 --- /dev/null +++ b/Paper2/main.aux @@ -0,0 +1,95 @@ +\relax +\citation{Alkindy2014} +\@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{1}} +\newlabel{sec:intro}{{1}{1}} +\citation{Sayers01012011} +\citation{Rice2000} +\citation{Alkindy2014} +\citation{Sayers01012011} +\citation{apweiler1985swiss} +\citation{sugawara2008ddbj} +\citation{Alkindy2014} +\@writefile{toc}{\contentsline {section}{\numberline {2}An Overview}{3}} +\newlabel{sec:general}{{2}{3}} +\@writefile{lof}{\contentsline {figure}{\numberline {1}{\ignorespaces A general overview of the annotation-based approach\relax }}{3}} +\providecommand*\caption@xref[2]{\@setref\relax\@undefined{#1}} +\newlabel{Fig1}{{1}{3}} +\citation{Alkindy2014} +\citation{Sayers01012011} +\citation{Alkindy2014} +\citation{Alkindy2014} +\citation{acgs13:onp,Alkindy2014} +\@writefile{toc}{\contentsline 
{section}{\numberline {3}Core genes extraction}{4}} +\@writefile{toc}{\contentsline {subsection}{\numberline {3.1}Similarity-based approach}{4}} +\newlabel{sec:simil}{{3.1}{4}} +\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.1.1}Theoretical presentation}{4}} +\@writefile{thm}{\contentsline {definition}{{Definition}{1}{}}{4}} +\newlabel{def1}{{1}{4}} +\citation{altschul1990basic} +\@writefile{lot}{\contentsline {table}{\numberline {1}{\ignorespaces Size of core and pan genomes w.r.t. the similarity threshold, first and second approache.\relax }}{5}} +\newlabel{Fig:sim:core:pan}{{1}{5}} +\citation{Rice2000} +\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.1.2}Case study}{6}} +\citation{Alkindy2014} +\citation{Alkindy2014} +\@writefile{toc}{\contentsline {subsection}{\numberline {3.2}Annotation based approach}{7}} +\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.2.1}Quality-test approach}{7}} +\newlabel{sec:mixed}{{3.2.1}{7}} +\@writefile{lof}{\contentsline {figure}{\numberline {2}{\ignorespaces Part of the implementation of the second method, compaire the common genes from NCBI and DOGMA.\relax }}{7}} +\newlabel{Meth2:gensim}{{2}{7}} +\citation{Alkindy2014} +\@writefile{loa}{\contentsline {algorithm}{\numberline {1}{\ignorespaces Maximum similarity score between two sequences(geneChk)\relax }}{8}} +\newlabel{Alg3:genechk}{{1}{8}} +\@writefile{loa}{\contentsline {algorithm}{\numberline {2}{\ignorespaces Extract new genome based on gene quality test\relax }}{9}} +\newlabel{Alg3:thirdM}{{2}{9}} +\@writefile{toc}{\contentsline {section}{\numberline {4}Implementation}{9}} +\newlabel{sec:implem}{{4}{9}} +\@writefile{lot}{\contentsline {table}{\numberline {2}{\ignorespaces Type of annotation, execution time, and core genes.\relax }}{9}} +\newlabel{Etime}{{2}{9}} +\@writefile{lot}{\contentsline {table}{\numberline {3}{\ignorespaces Memory usages in (MB) for each methodology\relax }}{10}} +\newlabel{mem}{{3}{10}} 
+\citation{mcfadden2001primary} +\citation{mcfadden2001primary} +\newlabel{subfig-1:core}{{3a}{11}} +\newlabel{sub@subfig-1:core}{{(a)}{a}} +\newlabel{subfig-2:pan}{{3b}{11}} +\newlabel{sub@subfig-2:pan}{{(b)}{b}} +\@writefile{lof}{\contentsline {figure}{\numberline {3}{\ignorespaces Sizes of Core and Pan genomes for first and second method.\relax }}{11}} +\@writefile{lof}{\contentsline {subfigure}{\numberline{(a)}{\ignorespaces {Sizes of core genome}}}{11}} +\@writefile{lof}{\contentsline {subfigure}{\numberline{(b)}{\ignorespaces {Sizes of pan genome}}}{11}} +\newlabel{fig:sizes of core and pan}{{3}{11}} +\newlabel{Cover:NCBI}{{4a}{11}} +\newlabel{sub@Cover:NCBI}{{(a)}{a}} +\newlabel{cover:dogma}{{4b}{11}} +\newlabel{sub@cover:dogma}{{(b)}{b}} +\@writefile{lof}{\contentsline {figure}{\numberline {4}{\ignorespaces Gene comparisons cover from NCBI and DOGMA, second method\relax }}{11}} +\@writefile{lof}{\contentsline {subfigure}{\numberline{(a)}{\ignorespaces {genes coverage of NCBI genomes}}}{11}} +\@writefile{lof}{\contentsline {subfigure}{\numberline{(b)}{\ignorespaces {genes coverage of DOGMA genomes}}}{11}} +\newlabel{fig:sizes of core and pan}{{4}{11}} +\@writefile{toc}{\contentsline {section}{\numberline {5}Discussion}{11}} +\@writefile{toc}{\contentsline {subsection}{\numberline {5.1}Biological evaluation}{11}} +\newlabel{sec:discuss}{{5.1}{11}} +\citation{li2013complete} +\@writefile{toc}{\contentsline {section}{\numberline {6}Conclusion}{12}} +\newlabel{sec:concl}{{6}{12}} +\bibstyle{plain} +\bibdata{biblio} +\bibcite{acgs13:onp}{1} +\bibcite{altschul1990basic}{2} +\bibcite{apweiler1985swiss}{3} +\bibcite{birney2004genewise}{4} +\bibcite{de2002comparative}{5} +\bibcite{Alkindy2014}{6} +\bibcite{Bakke2009}{7} +\bibcite{li2013complete}{8} +\bibcite{Sayers01012011}{9} +\bibcite{liu2012cpgavas}{10} +\bibcite{guindon2005phyml}{11} +\bibcite{mcfadden2001primary}{12} +\bibcite{parra2000geneid}{13} +\bibcite{parra2007cegma}{14} +\bibcite{Rice2000}{15} 
+\bibcite{RDOGMA}{16} +\bibcite{stamatakis2008raxml}{17} +\bibcite{stamatakis2005raxml}{18} +\bibcite{sugawara2008ddbj}{19} diff --git a/Paper2/main.bbl b/Paper2/main.bbl new file mode 100644 index 0000000..7b1f670 --- /dev/null +++ b/Paper2/main.bbl @@ -0,0 +1,113 @@ +\begin{thebibliography}{10} + +\bibitem{acgs13:onp} +Bassam Alkindy, Jean-Fran\c{c}ois Couchot, Christophe Guyeux, and Michel + Salomon. +\newblock Finding the core-genes of chloroplast species. +\newblock Journ\'ees SeqBio 2013, Montpellier, November 2013. + +\bibitem{altschul1990basic} +Stephen~F Altschul, Warren Gish, Webb Miller, Eugene~W Myers, and David~J + Lipman. +\newblock Basic local alignment search tool. +\newblock {\em Journal of molecular biology}, 215(3):403--410, 1990. + +\bibitem{apweiler1985swiss} +Rolf Apweiler, Claire O’Donovan, Maria~Jesus Martin, Wolfgang Fleischmann, + Henning Hermjakob, Steffen Moeller, Sergio Contrino, and Vivien Junker. +\newblock Swiss-prot and its computer-annotated supplement trembl: How to + produce high quality automatic annotation. +\newblock {\em EUR. J. BIOCHEM}, 147:9--15, 1985. + +\bibitem{birney2004genewise} +Ewan Birney, Michele Clamp, and Richard Durbin. +\newblock Genewise and genomewise. +\newblock {\em Genome research}, 14(5):988--995, 2004. + +\bibitem{de2002comparative} +Javier De~Las~Rivas, Juan~Jose Lozano, and Angel~R Ortiz. +\newblock Comparative analysis of chloroplast genomes: functional annotation, + genome-based phylogeny, and deduced evolutionary patterns. +\newblock {\em Genome research}, 12(4):567--583, 2002. + +\bibitem{Alkindy2014} +Alkindy~B. \emph{et al}. +\newblock Find core-genes for chloroplasts. +\newblock 2014. + +\bibitem{Bakke2009} +Bakke \emph{et al}. +\newblock Evaluation of three automated genome annotations for + \textit{Halorhabdus utahensis}. +\newblock {\em PLoS ONE}, 4(7):e6291, 07 2009. + +\bibitem{li2013complete} +Li~\emph{et al}. 
+\newblock Complete chloroplast genome sequence of holoparasite cistanche + deserticola (orobanchaceae) reveals gene loss and horizontal gene transfer + from its host haloxylon ammodendron (chenopodiaceae). +\newblock {\em PloS one}, 8(3):e58747, 2013. + +\bibitem{Sayers01012011} +Sayers \emph{et al}. +\newblock Database resources of the national center for biotechnology + information. +\newblock {\em Nucleic Acids Research}, 39(suppl 1):D38--D51, 2011. + +\bibitem{liu2012cpgavas} +Zhang \emph{et al}. +\newblock Cpgavas, an integrated web server for the annotation, visualization, + analysis, and genbank submission of completely sequenced chloroplast genome + sequences. + +\bibitem{guindon2005phyml} +Stephane Guindon, Franck Lethiec, Patrice Duroux, and Olivier Gascuel. +\newblock Phyml online—a web server for fast maximum likelihood-based + phylogenetic inference. +\newblock {\em Nucleic acids research}, 33(suppl 2):W557--W559, 2005. + +\bibitem{mcfadden2001primary} +Geoffrey~Ian McFadden. +\newblock Primary and secondary endosymbiosis and the origin of plastids. +\newblock {\em Journal of Phycology}, 37(6):951--959, 2001. + +\bibitem{parra2000geneid} +Gen{\'\i}s Parra, Enrique Blanco, and Roderic Guig{\'o}. +\newblock Geneid in drosophila. +\newblock {\em Genome research}, 10(4):511--515, 2000. + +\bibitem{parra2007cegma} +Genis Parra, Keith Bradnam, and Ian Korf. +\newblock Cegma: a pipeline to accurately annotate core genes in eukaryotic + genomes. +\newblock {\em Bioinformatics}, 23(9):1061--1067, 2007. + +\bibitem{Rice2000} +P.~Rice, I.~Longden, and A.~Bleasby. +\newblock Emboss: the european molecular biology open software suite. +\newblock {\em Trends Genet}, 16(6):276--7, 2000. + +\bibitem{RDOGMA} +Robert K.~Jansen Stacia K.~Wyman and Jeffrey~L. Boore. +\newblock Automatic annotation of organellar genomes with dogma. +\newblock {\em BIOINFORMATICS, oxford Press}, 20(172004):3252--3255, 2004. + +\bibitem{stamatakis2008raxml} +Alexandros Stamatakis. 
+\newblock The raxml 7.0. 4 manual. +\newblock {\em Department of Computer Science. + Ludwig-Maximilians-Universit{\"a}t M{\"u}nchen}, 2008. + +\bibitem{stamatakis2005raxml} +Alexandros Stamatakis, Thomas Ludwig, and Harald Meier. +\newblock Raxml-iii: a fast program for maximum likelihood-based inference of + large phylogenetic trees. +\newblock {\em Bioinformatics}, 21(4):456--463, 2005. + +\bibitem{sugawara2008ddbj} +Hideaki Sugawara, Osamu Ogasawara, Kousaku Okubo, Takashi Gojobori, and Yoshio + Tateno. +\newblock Ddbj with new system and face. +\newblock {\em Nucleic acids research}, 36(suppl 1):D22--D24, 2008. + +\end{thebibliography} diff --git a/Paper2/main.blg b/Paper2/main.blg new file mode 100644 index 0000000..315c5e6 --- /dev/null +++ b/Paper2/main.blg @@ -0,0 +1,59 @@ +This is BibTeX, Version 0.99c (TeX Live 2009/Debian) +The top-level auxiliary file: main.aux +The style file: plain.bst +Database file #1: biblio.bib +You're missing a field name---line 123 of file biblio.bib + : + : %Liu, Chang and Shi, Linchun and Zhu, Yingjie and Chen, Haimei and Zhang, Jianhui %and Lin, Xiaohan and Guan, Xiaojun}, +(Error may have been on previous line) +I'm skipping whatever remains of this entry +You're missing a field name---line 223 of file biblio.bib + : + : %{Blouin, Yann AND Hauck, Yolande AND Soler,Charles AND Fabre, Michel AND Vong, Rithy ANDDehan, Céline AND Cazajous, Géraldine ANDMassoure, Pierre-Laurent AND Kraemer, PhilippeANDJenkins, Akinbowale AND Garnotel, EricAND Pourcel, Christine AND Vergnaud, Gilles} +(Error may have been on previous line) +I'm skipping whatever remains of this entry +Warning--empty journal in Alkindy2014 +Warning--empty journal in liu2012cpgavas +Warning--empty year in liu2012cpgavas +You've used 19 entries, + 2118 wiz_defined-function locations, + 627 strings with 7317 characters, +and the built_in function-call counts, 6332 in all, are: += -- 617 +> -- 315 +< -- 0 ++ -- 126 +- -- 106 +* -- 506 +:= -- 1091 +add.period$ -- 56 
+call.type$ -- 19 +change.case$ -- 110 +chr.to.int$ -- 0 +cite$ -- 22 +duplicate$ -- 202 +empty$ -- 476 +format.name$ -- 106 +if$ -- 1266 +int.to.chr$ -- 0 +int.to.str$ -- 19 +missing$ -- 18 +newline$ -- 97 +num.names$ -- 38 +pop$ -- 90 +preamble$ -- 1 +purify$ -- 91 +quote$ -- 0 +skip$ -- 132 +stack$ -- 0 +substring$ -- 445 +swap$ -- 17 +text.length$ -- 0 +text.prefix$ -- 0 +top$ -- 0 +type$ -- 76 +warning$ -- 3 +while$ -- 63 +width$ -- 21 +write$ -- 203 +(There were 2 error messages) diff --git a/Paper2/main.log b/Paper2/main.log new file mode 100644 index 0000000..566d7de --- /dev/null +++ b/Paper2/main.log @@ -0,0 +1,1026 @@ +This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009/Debian) (format=pdflatex 2012.11.30) 20 FEB 2014 10:24 +entering extended mode + %&-line parsing enabled. +**main.tex +(./main.tex +LaTeX2e <2009/09/24> +Babel and hyphenation patterns for english, usenglishmax, dumylang, noh +yphenation, farsi, arabic, croatian, bulgarian, ukrainian, russian, czech, slov +ak, danish, dutch, finnish, french, basque, ngerman, german, german-x-2009-06-1 +9, ngerman-x-2009-06-19, ibycus, monogreek, greek, ancientgreek, hungarian, san +skrit, italian, latin, latvian, lithuanian, mongolian2a, mongolian, bokmal, nyn +orsk, romanian, irish, coptic, serbian, turkish, welsh, esperanto, uppersorbian +, estonian, indonesian, interlingua, icelandic, kurmanji, slovenian, polish, po +rtuguese, spanish, galician, catalan, swedish, ukenglish, pinyin, loaded. 
+(/usr/share/texmf-texlive/tex/latex/base/article.cls +Document Class: article 2007/10/19 v1.4h Standard LaTeX document class +(/usr/share/texmf-texlive/tex/latex/base/size10.clo +File: size10.clo 2007/10/19 v1.4h Standard LaTeX file (size option) +) +\c@part=\count79 +\c@section=\count80 +\c@subsection=\count81 +\c@subsubsection=\count82 +\c@paragraph=\count83 +\c@subparagraph=\count84 +\c@figure=\count85 +\c@table=\count86 +\abovecaptionskip=\skip41 +\belowcaptionskip=\skip42 +\bibindent=\dimen102 +) +(/usr/share/texmf-texlive/tex/latex/subfig/subfig.sty +Package: subfig 2005/06/28 ver: 1.3 subfig package + +(/usr/share/texmf-texlive/tex/latex/graphics/keyval.sty +Package: keyval 1999/03/16 v1.13 key=value parser (DPC) +\KV@toks@=\toks14 +) +(/usr/share/texmf-texlive/tex/latex/caption/caption.sty +Package: caption 2009/10/09 v3.1k Customizing captions (AR) + +(/usr/share/texmf-texlive/tex/latex/caption/caption3.sty +Package: caption3 2009/10/09 v3.1k caption3 kernel (AR) +\captionmargin=\dimen103 +\captionmargin@=\dimen104 +\captionwidth=\dimen105 +\caption@indent=\dimen106 +\caption@parindent=\dimen107 +\caption@hangindent=\dimen108 +) +\c@ContinuedFloat=\count87 +) +\c@KVtest=\count88 +\sf@farskip=\skip43 +\sf@captopadj=\dimen109 +\sf@capskip=\skip44 +\sf@nearskip=\skip45 +\c@subfigure=\count89 +\c@subfigure@save=\count90 +\c@lofdepth=\count91 +\c@subtable=\count92 +\c@subtable@save=\count93 +\c@lotdepth=\count94 +\sf@top=\skip46 +\sf@bottom=\skip47 +) +(/usr/share/texmf-texlive/tex/latex/graphics/color.sty +Package: color 2005/11/14 v1.0j Standard LaTeX Color (DPC) + +(/etc/texmf/tex/latex/config/color.cfg +File: color.cfg 2007/01/18 v1.5 color configuration of teTeX/TeXLive +) +Package color Info: Driver file: pdftex.def on input line 130. 
+ +(/usr/share/texmf-texlive/tex/latex/pdftex-def/pdftex.def +File: pdftex.def 2010/03/12 v0.04p Graphics/color for pdfTeX +\Gread@gobject=\count95 +)) +(/usr/share/texmf-texlive/tex/latex/graphics/graphicx.sty +Package: graphicx 1999/02/16 v1.0f Enhanced LaTeX Graphics (DPC,SPQR) + +(/usr/share/texmf-texlive/tex/latex/graphics/graphics.sty +Package: graphics 2009/02/05 v1.0o Standard LaTeX Graphics (DPC,SPQR) + +(/usr/share/texmf-texlive/tex/latex/graphics/trig.sty +Package: trig 1999/03/16 v1.09 sin cos tan (DPC) +) +(/etc/texmf/tex/latex/config/graphics.cfg +File: graphics.cfg 2009/08/28 v1.8 graphics configuration of TeX Live +) +Package graphics Info: Driver file: pdftex.def on input line 91. +) +\Gin@req@height=\dimen110 +\Gin@req@width=\dimen111 +) +(/usr/share/texmf-texlive/tex/latex/ltxmisc/url.sty +\Urlmuskip=\muskip10 +Package: url 2006/04/12 ver 3.3 Verb mode for urls, etc. +) +(/usr/share/texmf-texlive/tex/latex/cite/cite.sty +LaTeX Info: Redefining \cite on input line 285. +LaTeX Info: Redefining \nocite on input line 356. 
+Package: cite 2009/08/29 v 5.2 +) +(/usr/share/texmf-texlive/tex/latex/algorithms/algorithm.sty +Package: algorithm 2009/08/24 v0.1 Document Style `algorithm' - floating enviro +nment + +(/usr/share/texmf-texlive/tex/latex/float/float.sty +Package: float 2001/11/08 v1.3d Float enhancements (AL) +\c@float@type=\count96 +\float@exts=\toks15 +\float@box=\box26 +\@float@everytoks=\toks16 +\@floatcapt=\box27 +) +(/usr/share/texmf-texlive/tex/latex/base/ifthen.sty +Package: ifthen 2001/05/26 v1.1c Standard LaTeX ifthen package (DPC) +) +\@float@every@algorithm=\toks17 +\c@algorithm=\count97 +) +(/usr/share/texmf-texlive/tex/latex/algorithms/algorithmic.sty +Package: algorithmic 2009/08/24 v0.1 Document Style `algorithmic' +\c@ALC@unique=\count98 +\c@ALC@line=\count99 +\c@ALC@rem=\count100 +\c@ALC@depth=\count101 +\ALC@tlm=\skip48 +\algorithmicindent=\skip49 +) +(/usr/share/texmf-texlive/tex/latex/oberdiek/pdflscape.sty +Package: pdflscape 2008/08/11 v0.10 Landscape pages in PDF (HO) + +(/usr/share/texmf-texlive/tex/latex/graphics/lscape.sty +Package: lscape 2000/10/22 v3.01 Landscape Pages (DPC) +) +(/usr/share/texmf-texlive/tex/generic/oberdiek/ifpdf.sty +Package: ifpdf 2009/04/10 v2.0 Provides the ifpdf switch (HO) +Package ifpdf Info: pdfTeX in pdf mode detected. +) +Package pdflscape Info: Auto-detected driver: pdftex on input line 75. + +(/usr/share/texmf-texlive/tex/generic/ifxetex/ifxetex.sty +Package: ifxetex 2009/01/23 v0.5 Provides ifxetex conditional +)) +(/usr/share/texmf-texlive/tex/latex/preprint/authblk.sty +Package: authblk 2001/02/27 1.3 (PWD) +\affilsep=\skip50 +\@affilsep=\skip51 +\c@Maxaffil=\count102 +\c@authors=\count103 +\c@affil=\count104 +) +(/usr/share/texmf-texlive/tex/latex/base/fontenc.sty +Package: fontenc 2005/09/27 v1.99g Standard LaTeX package + +(/usr/share/texmf-texlive/tex/latex/base/t1enc.def +File: t1enc.def 2005/09/27 v1.99g Standard LaTeX file +LaTeX Font Info: Redeclaring font encoding T1 on input line 43. 
+)) +(/usr/share/texmf-texlive/tex/latex/multirow/multirow.sty +\bigstrutjot=\dimen112 +) +(/usr/share/texmf-texlive/tex/latex/tools/longtable.sty +Package: longtable 2004/02/01 v4.11 Multi-page Table package (DPC) +\LTleft=\skip52 +\LTright=\skip53 +\LTpre=\skip54 +\LTpost=\skip55 +\LTchunksize=\count105 +\LTcapwidth=\dimen113 +\LT@head=\box28 +\LT@firsthead=\box29 +\LT@foot=\box30 +\LT@lastfoot=\box31 +\LT@cols=\count106 +\LT@rows=\count107 +\c@LT@tables=\count108 +\c@LT@chunks=\count109 +\LT@p@ftn=\toks18 +) +(/usr/share/texmf-texlive/tex/latex/amsmath/amsmath.sty +Package: amsmath 2000/07/18 v2.13 AMS math features +\@mathmargin=\skip56 + +For additional information on amsmath, use the `?' option. +(/usr/share/texmf-texlive/tex/latex/amsmath/amstext.sty +Package: amstext 2000/06/29 v2.01 + +(/usr/share/texmf-texlive/tex/latex/amsmath/amsgen.sty +File: amsgen.sty 1999/11/30 v2.0 +\@emptytoks=\toks19 +\ex@=\dimen114 +)) +(/usr/share/texmf-texlive/tex/latex/amsmath/amsbsy.sty +Package: amsbsy 1999/11/29 v1.2d +\pmbraise@=\dimen115 +) +(/usr/share/texmf-texlive/tex/latex/amsmath/amsopn.sty +Package: amsopn 1999/12/14 v2.01 operator names +) +\inf@bad=\count110 +LaTeX Info: Redefining \frac on input line 211. +\uproot@=\count111 +\leftroot@=\count112 +LaTeX Info: Redefining \overline on input line 307. +\classnum@=\count113 +\DOTSCASE@=\count114 +LaTeX Info: Redefining \ldots on input line 379. +LaTeX Info: Redefining \dots on input line 382. +LaTeX Info: Redefining \cdots on input line 467. +\Mathstrutbox@=\box32 +\strutbox@=\box33 +\big@size=\dimen116 +LaTeX Font Info: Redeclaring font encoding OML on input line 567. +LaTeX Font Info: Redeclaring font encoding OMS on input line 568. 
+\macc@depth=\count115 +\c@MaxMatrixCols=\count116 +\dotsspace@=\muskip11 +\c@parentequation=\count117 +\dspbrk@lvl=\count118 +\tag@help=\toks20 +\row@=\count119 +\column@=\count120 +\maxfields@=\count121 +\andhelp@=\toks21 +\eqnshift@=\dimen117 +\alignsep@=\dimen118 +\tagshift@=\dimen119 +\tagwidth@=\dimen120 +\totwidth@=\dimen121 +\lineht@=\dimen122 +\@envbody=\toks22 +\multlinegap=\skip57 +\multlinetaggap=\skip58 +\mathdisplay@stack=\toks23 +LaTeX Info: Redefining \[ on input line 2666. +LaTeX Info: Redefining \] on input line 2667. +) +(/usr/share/texmf-texlive/tex/latex/mh/mathtools.sty +Package: mathtools 2008/08/01 v1.06 mathematical typesetting tools (MH) + +(/usr/share/texmf-texlive/tex/latex/tools/calc.sty +Package: calc 2007/08/22 v4.3 Infix arithmetic (KKT,FJ) +\calc@Acount=\count122 +\calc@Bcount=\count123 +\calc@Adimen=\dimen123 +\calc@Bdimen=\dimen124 +\calc@Askip=\skip59 +\calc@Bskip=\skip60 +LaTeX Info: Redefining \setlength on input line 76. +LaTeX Info: Redefining \addtolength on input line 77. +\calc@Ccount=\count124 +\calc@Cskip=\skip61 +) +(/usr/share/texmf-texlive/tex/latex/mh/mhsetup.sty +Package: mhsetup 2007/12/03 v1.2 programming setup (MH) +) +\g_MT_multlinerow_int=\count125 +\l_MT_multwidth_dim=\dimen125 +) +(/usr/share/texmf-texlive/tex/latex/amsfonts/amssymb.sty +Package: amssymb 2009/06/22 v3.00 + +(/usr/share/texmf-texlive/tex/latex/amsfonts/amsfonts.sty +Package: amsfonts 2009/06/22 v3.00 Basic AMSFonts support +\symAMSa=\mathgroup4 +\symAMSb=\mathgroup5 +LaTeX Font Info: Overwriting math alphabet `\mathfrak' in version `bold' +(Font) U/euf/m/n --> U/euf/b/n on input line 96. 
+)) +(/usr/share/texmf-texlive/tex/latex/ntheorem/ntheorem.sty +Style `ntheorem', Version 1.25 <2005/07/07> +Package: ntheorem 2005/07/07 1.25 +\theorem@style=\toks24 +\theorem@@style=\toks25 +\theorembodyfont=\toks26 +\theoremnumbering=\toks27 +\theorempreskipamount=\skip62 +\theorempostskipamount=\skip63 +\theoremindent=\dimen126 +\theorem@indent=\dimen127 +\theoremheaderfont=\toks28 +\theoremseparator=\toks29 +\theoremprework=\toks30 +\theorempostwork=\toks31 +\theoremsymbol=\toks32 +\qedsymbol=\toks33 +\theoremkeyword=\toks34 +\qedsymbol=\toks35 +\thm@topsepadd=\skip64 +Package ntheorem Info: Standard config file ntheorem.std used on input line 103 +8. +(/usr/share/texmf-texlive/tex/latex/ntheorem/ntheorem.std +(/usr/share/texmf-texlive/tex/latex/base/latexsym.sty +Package: latexsym 1998/08/17 v2.2e Standard LaTeX package (lasy symbols) +\symlasy=\mathgroup6 +LaTeX Font Info: Overwriting symbol font `lasy' in version `bold' +(Font) U/lasy/m/n --> U/lasy/b/n on input line 47. +) +\c@Theorem=\count126 +\c@theorem=\count127 +\c@Satz=\count128 +\c@satz=\count129 +\c@Proposition=\count130 +\c@proposition=\count131 +\c@Lemma=\count132 +\c@lemma=\count133 +\c@Korollar=\count134 +\c@korollar=\count135 +\c@Corollary=\count136 +\c@corollary=\count137 +\c@Example=\count138 +\c@example=\count139 +\c@Beispiel=\count140 +\c@beispiel=\count141 +\c@Bemerkung=\count142 +\c@bemerkung=\count143 +\c@Anmerkung=\count144 +\c@anmerkung=\count145 +\c@Remark=\count146 +\c@remark=\count147 +\c@Definition=\count148 +\c@definition=\count149 +\c@Proof=\count150 +\c@proof=\count151 +\c@Beweis=\count152 +\c@beweis=\count153 +)) +(/usr/share/texmf-texlive/tex/latex/stmaryrd/stmaryrd.sty +Package: stmaryrd 1994/03/03 St Mary's Road symbol package +\symstmry=\mathgroup7 +LaTeX Font Info: Overwriting symbol font `stmry' in version `bold' +(Font) U/stmry/m/n --> U/stmry/b/n on input line 89. 
+) +(/usr/share/texmf-texlive/tex/latex/base/inputenc.sty +Package: inputenc 2008/03/30 v1.1d Input encoding file +\inpenc@prehook=\toks36 +\inpenc@posthook=\toks37 + +(/usr/share/texmf-texlive/tex/latex/base/utf8.def +File: utf8.def 2008/04/05 v1.1m UTF-8 support for inputenc +Now handling font encoding OML ... +... no UTF-8 mapping file for font encoding OML +Now handling font encoding T1 ... +... processing UTF-8 mapping file for font encoding T1 + +(/usr/share/texmf-texlive/tex/latex/base/t1enc.dfu +File: t1enc.dfu 2008/04/05 v1.1m UTF-8 support for inputenc + defining Unicode char U+00A1 (decimal 161) + defining Unicode char U+00A3 (decimal 163) + defining Unicode char U+00AB (decimal 171) + defining Unicode char U+00BB (decimal 187) + defining Unicode char U+00BF (decimal 191) + defining Unicode char U+00C0 (decimal 192) + defining Unicode char U+00C1 (decimal 193) + defining Unicode char U+00C2 (decimal 194) + defining Unicode char U+00C3 (decimal 195) + defining Unicode char U+00C4 (decimal 196) + defining Unicode char U+00C5 (decimal 197) + defining Unicode char U+00C6 (decimal 198) + defining Unicode char U+00C7 (decimal 199) + defining Unicode char U+00C8 (decimal 200) + defining Unicode char U+00C9 (decimal 201) + defining Unicode char U+00CA (decimal 202) + defining Unicode char U+00CB (decimal 203) + defining Unicode char U+00CC (decimal 204) + defining Unicode char U+00CD (decimal 205) + defining Unicode char U+00CE (decimal 206) + defining Unicode char U+00CF (decimal 207) + defining Unicode char U+00D0 (decimal 208) + defining Unicode char U+00D1 (decimal 209) + defining Unicode char U+00D2 (decimal 210) + defining Unicode char U+00D3 (decimal 211) + defining Unicode char U+00D4 (decimal 212) + defining Unicode char U+00D5 (decimal 213) + defining Unicode char U+00D6 (decimal 214) + defining Unicode char U+00D8 (decimal 216) + defining Unicode char U+00D9 (decimal 217) + defining Unicode char U+00DA (decimal 218) + defining Unicode char U+00DB 
(decimal 219) + defining Unicode char U+00DC (decimal 220) + defining Unicode char U+00DD (decimal 221) + defining Unicode char U+00DE (decimal 222) + defining Unicode char U+00DF (decimal 223) + defining Unicode char U+00E0 (decimal 224) + defining Unicode char U+00E1 (decimal 225) + defining Unicode char U+00E2 (decimal 226) + defining Unicode char U+00E3 (decimal 227) + defining Unicode char U+00E4 (decimal 228) + defining Unicode char U+00E5 (decimal 229) + defining Unicode char U+00E6 (decimal 230) + defining Unicode char U+00E7 (decimal 231) + defining Unicode char U+00E8 (decimal 232) + defining Unicode char U+00E9 (decimal 233) + defining Unicode char U+00EA (decimal 234) + defining Unicode char U+00EB (decimal 235) + defining Unicode char U+00EC (decimal 236) + defining Unicode char U+00ED (decimal 237) + defining Unicode char U+00EE (decimal 238) + defining Unicode char U+00EF (decimal 239) + defining Unicode char U+00F0 (decimal 240) + defining Unicode char U+00F1 (decimal 241) + defining Unicode char U+00F2 (decimal 242) + defining Unicode char U+00F3 (decimal 243) + defining Unicode char U+00F4 (decimal 244) + defining Unicode char U+00F5 (decimal 245) + defining Unicode char U+00F6 (decimal 246) + defining Unicode char U+00F8 (decimal 248) + defining Unicode char U+00F9 (decimal 249) + defining Unicode char U+00FA (decimal 250) + defining Unicode char U+00FB (decimal 251) + defining Unicode char U+00FC (decimal 252) + defining Unicode char U+00FD (decimal 253) + defining Unicode char U+00FE (decimal 254) + defining Unicode char U+00FF (decimal 255) + defining Unicode char U+0102 (decimal 258) + defining Unicode char U+0103 (decimal 259) + defining Unicode char U+0104 (decimal 260) + defining Unicode char U+0105 (decimal 261) + defining Unicode char U+0106 (decimal 262) + defining Unicode char U+0107 (decimal 263) + defining Unicode char U+010C (decimal 268) + defining Unicode char U+010D (decimal 269) + defining Unicode char U+010E (decimal 270) + 
defining Unicode char U+010F (decimal 271) + defining Unicode char U+0110 (decimal 272) + defining Unicode char U+0111 (decimal 273) + defining Unicode char U+0118 (decimal 280) + defining Unicode char U+0119 (decimal 281) + defining Unicode char U+011A (decimal 282) + defining Unicode char U+011B (decimal 283) + defining Unicode char U+011E (decimal 286) + defining Unicode char U+011F (decimal 287) + defining Unicode char U+0130 (decimal 304) + defining Unicode char U+0131 (decimal 305) + defining Unicode char U+0132 (decimal 306) + defining Unicode char U+0133 (decimal 307) + defining Unicode char U+0139 (decimal 313) + defining Unicode char U+013A (decimal 314) + defining Unicode char U+013D (decimal 317) + defining Unicode char U+013E (decimal 318) + defining Unicode char U+0141 (decimal 321) + defining Unicode char U+0142 (decimal 322) + defining Unicode char U+0143 (decimal 323) + defining Unicode char U+0144 (decimal 324) + defining Unicode char U+0147 (decimal 327) + defining Unicode char U+0148 (decimal 328) + defining Unicode char U+014A (decimal 330) + defining Unicode char U+014B (decimal 331) + defining Unicode char U+0150 (decimal 336) + defining Unicode char U+0151 (decimal 337) + defining Unicode char U+0152 (decimal 338) + defining Unicode char U+0153 (decimal 339) + defining Unicode char U+0154 (decimal 340) + defining Unicode char U+0155 (decimal 341) + defining Unicode char U+0158 (decimal 344) + defining Unicode char U+0159 (decimal 345) + defining Unicode char U+015A (decimal 346) + defining Unicode char U+015B (decimal 347) + defining Unicode char U+015E (decimal 350) + defining Unicode char U+015F (decimal 351) + defining Unicode char U+0160 (decimal 352) + defining Unicode char U+0161 (decimal 353) + defining Unicode char U+0162 (decimal 354) + defining Unicode char U+0163 (decimal 355) + defining Unicode char U+0164 (decimal 356) + defining Unicode char U+0165 (decimal 357) + defining Unicode char U+016E (decimal 366) + defining Unicode 
char U+016F (decimal 367) + defining Unicode char U+0170 (decimal 368) + defining Unicode char U+0171 (decimal 369) + defining Unicode char U+0178 (decimal 376) + defining Unicode char U+0179 (decimal 377) + defining Unicode char U+017A (decimal 378) + defining Unicode char U+017B (decimal 379) + defining Unicode char U+017C (decimal 380) + defining Unicode char U+017D (decimal 381) + defining Unicode char U+017E (decimal 382) + defining Unicode char U+200C (decimal 8204) + defining Unicode char U+2013 (decimal 8211) + defining Unicode char U+2014 (decimal 8212) + defining Unicode char U+2018 (decimal 8216) + defining Unicode char U+2019 (decimal 8217) + defining Unicode char U+201A (decimal 8218) + defining Unicode char U+201C (decimal 8220) + defining Unicode char U+201D (decimal 8221) + defining Unicode char U+201E (decimal 8222) + defining Unicode char U+2030 (decimal 8240) + defining Unicode char U+2031 (decimal 8241) + defining Unicode char U+2039 (decimal 8249) + defining Unicode char U+203A (decimal 8250) + defining Unicode char U+2423 (decimal 9251) +) +Now handling font encoding OT1 ... +... 
processing UTF-8 mapping file for font encoding OT1 + +(/usr/share/texmf-texlive/tex/latex/base/ot1enc.dfu +File: ot1enc.dfu 2008/04/05 v1.1m UTF-8 support for inputenc + defining Unicode char U+00A1 (decimal 161) + defining Unicode char U+00A3 (decimal 163) + defining Unicode char U+00B8 (decimal 184) + defining Unicode char U+00BF (decimal 191) + defining Unicode char U+00C5 (decimal 197) + defining Unicode char U+00C6 (decimal 198) + defining Unicode char U+00D8 (decimal 216) + defining Unicode char U+00DF (decimal 223) + defining Unicode char U+00E6 (decimal 230) + defining Unicode char U+00EC (decimal 236) + defining Unicode char U+00ED (decimal 237) + defining Unicode char U+00EE (decimal 238) + defining Unicode char U+00EF (decimal 239) + defining Unicode char U+00F8 (decimal 248) + defining Unicode char U+0131 (decimal 305) + defining Unicode char U+0141 (decimal 321) + defining Unicode char U+0142 (decimal 322) + defining Unicode char U+0152 (decimal 338) + defining Unicode char U+0153 (decimal 339) + defining Unicode char U+2013 (decimal 8211) + defining Unicode char U+2014 (decimal 8212) + defining Unicode char U+2018 (decimal 8216) + defining Unicode char U+2019 (decimal 8217) + defining Unicode char U+201C (decimal 8220) + defining Unicode char U+201D (decimal 8221) +) +Now handling font encoding OMS ... +... processing UTF-8 mapping file for font encoding OMS + +(/usr/share/texmf-texlive/tex/latex/base/omsenc.dfu +File: omsenc.dfu 2008/04/05 v1.1m UTF-8 support for inputenc + defining Unicode char U+00A7 (decimal 167) + defining Unicode char U+00B6 (decimal 182) + defining Unicode char U+00B7 (decimal 183) + defining Unicode char U+2020 (decimal 8224) + defining Unicode char U+2021 (decimal 8225) + defining Unicode char U+2022 (decimal 8226) +) +Now handling font encoding OMX ... +... no UTF-8 mapping file for font encoding OMX +Now handling font encoding U ... +... 
no UTF-8 mapping file for font encoding U + defining Unicode char U+00A9 (decimal 169) + defining Unicode char U+00AA (decimal 170) + defining Unicode char U+00AE (decimal 174) + defining Unicode char U+00BA (decimal 186) + defining Unicode char U+02C6 (decimal 710) + defining Unicode char U+02DC (decimal 732) + defining Unicode char U+200C (decimal 8204) + defining Unicode char U+2026 (decimal 8230) + defining Unicode char U+2122 (decimal 8482) + defining Unicode char U+2423 (decimal 9251) +)) +(/usr/share/texmf/tex/latex/pgf/frontendlayer/tikz.sty +(/usr/share/texmf/tex/latex/pgf/basiclayer/pgf.sty +(/usr/share/texmf/tex/latex/pgf/utilities/pgfrcs.sty +(/usr/share/texmf/tex/generic/pgf/utilities/pgfutil-common.tex +\pgfutil@everybye=\toks38 +) +(/usr/share/texmf/tex/generic/pgf/utilities/pgfutil-latex.def +\pgfutil@abb=\box34 + +(/usr/share/texmf-texlive/tex/latex/ms/everyshi.sty +Package: everyshi 2001/05/15 v3.00 EveryShipout Package (MS) +)) +(/usr/share/texmf/tex/generic/pgf/utilities/pgfrcs.code.tex +Package: pgfrcs 2010/10/25 v2.10 (rcs-revision 1.24) +)) +Package: pgf 2008/01/15 v2.10 (rcs-revision 1.12) + +(/usr/share/texmf/tex/latex/pgf/basiclayer/pgfcore.sty +(/usr/share/texmf/tex/latex/pgf/systemlayer/pgfsys.sty +(/usr/share/texmf/tex/generic/pgf/systemlayer/pgfsys.code.tex +Package: pgfsys 2010/06/30 v2.10 (rcs-revision 1.37) + +(/usr/share/texmf/tex/generic/pgf/utilities/pgfkeys.code.tex +\pgfkeys@pathtoks=\toks39 +\pgfkeys@temptoks=\toks40 + +(/usr/share/texmf/tex/generic/pgf/utilities/pgfkeysfiltered.code.tex +\pgfkeys@tmptoks=\toks41 +)) +\pgf@x=\dimen128 +\pgf@y=\dimen129 +\pgf@xa=\dimen130 +\pgf@ya=\dimen131 +\pgf@xb=\dimen132 +\pgf@yb=\dimen133 +\pgf@xc=\dimen134 +\pgf@yc=\dimen135 +\w@pgf@writea=\write3 +\r@pgf@reada=\read1 +\c@pgf@counta=\count154 +\c@pgf@countb=\count155 +\c@pgf@countc=\count156 +\c@pgf@countd=\count157 + +(/usr/share/texmf/tex/generic/pgf/systemlayer/pgf.cfg +File: pgf.cfg 2008/05/14 (rcs-revision 1.7) +) +Package pgfsys 
Info: Driver file for pgf: pgfsys-pdftex.def on input line 900. + +(/usr/share/texmf/tex/generic/pgf/systemlayer/pgfsys-pdftex.def +File: pgfsys-pdftex.def 2009/05/22 (rcs-revision 1.26) + +(/usr/share/texmf/tex/generic/pgf/systemlayer/pgfsys-common-pdf.def +File: pgfsys-common-pdf.def 2008/05/19 (rcs-revision 1.10) +))) +(/usr/share/texmf/tex/generic/pgf/systemlayer/pgfsyssoftpath.code.tex +File: pgfsyssoftpath.code.tex 2008/07/18 (rcs-revision 1.7) +\pgfsyssoftpath@smallbuffer@items=\count158 +\pgfsyssoftpath@bigbuffer@items=\count159 +) +(/usr/share/texmf/tex/generic/pgf/systemlayer/pgfsysprotocol.code.tex +File: pgfsysprotocol.code.tex 2006/10/16 (rcs-revision 1.4) +)) +(/usr/share/texmf/tex/latex/xcolor/xcolor.sty +Package: xcolor 2007/01/21 v2.11 LaTeX color extensions (UK) + +(/etc/texmf/tex/latex/config/color.cfg +File: color.cfg 2007/01/18 v1.5 color configuration of teTeX/TeXLive +) +Package xcolor Info: Driver file: pdftex.def on input line 225. +LaTeX Info: Redefining \color on input line 702. +Package xcolor Info: Model `cmy' substituted by `cmy0' on input line 1337. +Package xcolor Info: Model `hsb' substituted by `rgb' on input line 1341. +Package xcolor Info: Model `RGB' extended on input line 1353. +Package xcolor Info: Model `HTML' substituted by `rgb' on input line 1355. +Package xcolor Info: Model `Hsb' substituted by `hsb' on input line 1356. +Package xcolor Info: Model `tHsb' substituted by `hsb' on input line 1357. +Package xcolor Info: Model `HSB' substituted by `hsb' on input line 1358. +Package xcolor Info: Model `Gray' substituted by `gray' on input line 1359. +Package xcolor Info: Model `wave' substituted by `hsb' on input line 1360. 
+) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcore.code.tex +Package: pgfcore 2010/04/11 v2.10 (rcs-revision 1.7) + +(/usr/share/texmf/tex/generic/pgf/math/pgfmath.code.tex +(/usr/share/texmf/tex/generic/pgf/math/pgfmathcalc.code.tex +(/usr/share/texmf/tex/generic/pgf/math/pgfmathutil.code.tex) +(/usr/share/texmf/tex/generic/pgf/math/pgfmathparser.code.tex +\pgfmath@dimen=\dimen136 +\pgfmath@count=\count160 +\pgfmath@box=\box35 +\pgfmath@toks=\toks42 +\pgfmath@stack@operand=\toks43 +\pgfmath@stack@operation=\toks44 +) +(/usr/share/texmf/tex/generic/pgf/math/pgfmathfunctions.code.tex +(/usr/share/texmf/tex/generic/pgf/math/pgfmathfunctions.basic.code.tex) +(/usr/share/texmf/tex/generic/pgf/math/pgfmathfunctions.trigonometric.code.tex) +(/usr/share/texmf/tex/generic/pgf/math/pgfmathfunctions.random.code.tex) +(/usr/share/texmf/tex/generic/pgf/math/pgfmathfunctions.comparison.code.tex) +(/usr/share/texmf/tex/generic/pgf/math/pgfmathfunctions.base.code.tex) +(/usr/share/texmf/tex/generic/pgf/math/pgfmathfunctions.round.code.tex) +(/usr/share/texmf/tex/generic/pgf/math/pgfmathfunctions.misc.code.tex))) +(/usr/share/texmf/tex/generic/pgf/math/pgfmathfloat.code.tex +\c@pgfmathroundto@lastzeros=\count161 +)) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcorepoints.code.tex +File: pgfcorepoints.code.tex 2010/04/09 (rcs-revision 1.20) +\pgf@picminx=\dimen137 +\pgf@picmaxx=\dimen138 +\pgf@picminy=\dimen139 +\pgf@picmaxy=\dimen140 +\pgf@pathminx=\dimen141 +\pgf@pathmaxx=\dimen142 +\pgf@pathminy=\dimen143 +\pgf@pathmaxy=\dimen144 +\pgf@xx=\dimen145 +\pgf@xy=\dimen146 +\pgf@yx=\dimen147 +\pgf@yy=\dimen148 +\pgf@zx=\dimen149 +\pgf@zy=\dimen150 +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcorepathconstruct.code.tex +File: pgfcorepathconstruct.code.tex 2010/08/03 (rcs-revision 1.24) +\pgf@path@lastx=\dimen151 +\pgf@path@lasty=\dimen152 +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcorepathusage.code.tex +File: pgfcorepathusage.code.tex 2008/04/22 (rcs-revision 
1.12) +\pgf@shorten@end@additional=\dimen153 +\pgf@shorten@start@additional=\dimen154 +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcorescopes.code.tex +File: pgfcorescopes.code.tex 2010/09/08 (rcs-revision 1.34) +\pgfpic=\box36 +\pgf@hbox=\box37 +\pgf@layerbox@main=\box38 +\pgf@picture@serial@count=\count162 +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcoregraphicstate.code.tex +File: pgfcoregraphicstate.code.tex 2008/04/22 (rcs-revision 1.9) +\pgflinewidth=\dimen155 +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcoretransformations.code.tex +File: pgfcoretransformations.code.tex 2009/06/10 (rcs-revision 1.11) +\pgf@pt@x=\dimen156 +\pgf@pt@y=\dimen157 +\pgf@pt@temp=\dimen158 +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcorequick.code.tex +File: pgfcorequick.code.tex 2008/10/09 (rcs-revision 1.3) +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcoreobjects.code.tex +File: pgfcoreobjects.code.tex 2006/10/11 (rcs-revision 1.2) +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcorepathprocessing.code.tex +File: pgfcorepathprocessing.code.tex 2008/10/09 (rcs-revision 1.8) +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcorearrows.code.tex +File: pgfcorearrows.code.tex 2008/04/23 (rcs-revision 1.11) +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcoreshade.code.tex +File: pgfcoreshade.code.tex 2008/11/23 (rcs-revision 1.13) +\pgf@max=\dimen159 +\pgf@sys@shading@range@num=\count163 +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcoreimage.code.tex +File: pgfcoreimage.code.tex 2010/03/25 (rcs-revision 1.16) + +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcoreexternal.code.tex +File: pgfcoreexternal.code.tex 2010/09/01 (rcs-revision 1.17) +\pgfexternal@startupbox=\box39 +)) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcorelayers.code.tex +File: pgfcorelayers.code.tex 2010/08/27 (rcs-revision 1.2) +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcoretransparency.code.tex +File: pgfcoretransparency.code.tex 2008/01/17 
(rcs-revision 1.2) +) +(/usr/share/texmf/tex/generic/pgf/basiclayer/pgfcorepatterns.code.tex +File: pgfcorepatterns.code.tex 2009/07/02 (rcs-revision 1.3) +))) +(/usr/share/texmf/tex/generic/pgf/modules/pgfmoduleshapes.code.tex +File: pgfmoduleshapes.code.tex 2010/09/09 (rcs-revision 1.13) +\pgfnodeparttextbox=\box40 +) +(/usr/share/texmf/tex/generic/pgf/modules/pgfmoduleplot.code.tex +File: pgfmoduleplot.code.tex 2010/10/22 (rcs-revision 1.8) +) +(/usr/share/texmf/tex/latex/pgf/compatibility/pgfcomp-version-0-65.sty +Package: pgfcomp-version-0-65 2007/07/03 v2.10 (rcs-revision 1.7) +\pgf@nodesepstart=\dimen160 +\pgf@nodesepend=\dimen161 +) +(/usr/share/texmf/tex/latex/pgf/compatibility/pgfcomp-version-1-18.sty +Package: pgfcomp-version-1-18 2007/07/23 v2.10 (rcs-revision 1.1) +)) +(/usr/share/texmf/tex/latex/pgf/utilities/pgffor.sty +(/usr/share/texmf/tex/latex/pgf/utilities/pgfkeys.sty +(/usr/share/texmf/tex/generic/pgf/utilities/pgfkeys.code.tex)) +(/usr/share/texmf/tex/generic/pgf/utilities/pgffor.code.tex +Package: pgffor 2010/03/23 v2.10 (rcs-revision 1.18) +\pgffor@iter=\dimen162 +\pgffor@skip=\dimen163 +\pgffor@stack=\toks45 +\pgffor@toks=\toks46 +)) +(/usr/share/texmf/tex/generic/pgf/frontendlayer/tikz/tikz.code.tex +Package: tikz 2010/10/13 v2.10 (rcs-revision 1.76) + +(/usr/share/texmf/tex/generic/pgf/libraries/pgflibraryplothandlers.code.tex +File: pgflibraryplothandlers.code.tex 2010/05/31 v2.10 (rcs-revision 1.15) +\pgf@plot@mark@count=\count164 +\pgfplotmarksize=\dimen164 +) +\tikz@lastx=\dimen165 +\tikz@lasty=\dimen166 +\tikz@lastxsaved=\dimen167 +\tikz@lastysaved=\dimen168 +\tikzleveldistance=\dimen169 +\tikzsiblingdistance=\dimen170 +\tikz@figbox=\box41 +\tikz@tempbox=\box42 +\tikztreelevel=\count165 +\tikznumberofchildren=\count166 +\tikznumberofcurrentchild=\count167 +\tikz@fig@count=\count168 + +(/usr/share/texmf/tex/generic/pgf/modules/pgfmodulematrix.code.tex +File: pgfmodulematrix.code.tex 2010/08/24 (rcs-revision 1.4) 
+\pgfmatrixcurrentrow=\count169 +\pgfmatrixcurrentcolumn=\count170 +\pgf@matrix@numberofcolumns=\count171 +) +\tikz@expandcount=\count172 + +(/usr/share/texmf/tex/generic/pgf/frontendlayer/tikz/libraries/tikzlibrarytopat +hs.code.tex +File: tikzlibrarytopaths.code.tex 2008/06/17 v2.10 (rcs-revision 1.2) +))) (/usr/share/texmf-texlive/tex/latex/rotating/rotating.sty +Package: rotating 2009/03/28 v2.16a rotated objects in LaTeX +\c@r@tfl@t=\count173 +\rotFPtop=\skip65 +\rotFPbot=\skip66 +\rot@float@box=\box43 +\rot@mess@toks=\toks47 +) +(/usr/share/texmf-texlive/tex/latex/blindtext/blindtext.sty +Package: blindtext 2009/06/14 V1.9b blindtext-Package + +(/usr/share/texmf-texlive/tex/latex/tools/xspace.sty +Package: xspace 2006/05/08 v1.12 Space after command names (DPC,MH) +) +\c@blindtext=\count174 +\c@Blindtext=\count175 +\blind@countxx=\count176 +\blindtext@numBlindtext=\count177 +\blind@countyy=\count178 +\c@blindlist=\count179 +\c@blindlistlevel=\count180 +\c@blindlist@level=\count181 +\blind@listitem=\count182 +\c@blind@listcount=\count183 +\c@blind@levelcount=\count184 +\blind@mathformula=\count185 +\blind@Mathformula=\count186 +) (./main.aux + +LaTeX Warning: Label `fig:sizes of core and pan' multiply defined. + +) +\openout1 = `main.aux'. + +LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 39. +LaTeX Font Info: ... okay on input line 39. +LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 39. +LaTeX Font Info: ... okay on input line 39. +LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 39. +LaTeX Font Info: ... okay on input line 39. +LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 39. +LaTeX Font Info: ... okay on input line 39. +LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 39. +LaTeX Font Info: ... okay on input line 39. +LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 39. +LaTeX Font Info: ... okay on input line 39. 
+Package caption Info: Begin \AtBeginDocument code. +Package caption3 Info: subfig package 1.2 or 1.3 is loaded. +LaTeX Info: Redefining \subref on input line 39. +Package caption Info: float package is loaded. +Package caption Info: longtable package is loaded. + (/usr/share/texmf-texlive/tex/latex/caption/ltcaption.sty +Package: ltcaption 2008/03/28 v1.2 longtable captions (AR) +) +Package caption Info: rotating package is loaded. +Package caption Info: End \AtBeginDocument code. + +(/usr/share/texmf-texlive/tex/context/base/supp-pdf.mkii +[Loading MPS to PDF converter (version 2006.09.02).] +\scratchcounter=\count187 +\scratchdimen=\dimen171 +\scratchbox=\box44 +\nofMPsegments=\count188 +\nofMParguments=\count189 +\everyMPshowfont=\toks48 +\MPscratchCnt=\count190 +\MPscratchDim=\dimen172 +\MPnumerator=\count191 +\everyMPtoPDFconversion=\toks49 +) ABD: EveryShipout initializing macros +LaTeX Font Info: Try loading font information for U+msa on input line 41. + +(/usr/share/texmf-texlive/tex/latex/amsfonts/umsa.fd +File: umsa.fd 2009/06/22 v3.00 AMS symbols A +) +LaTeX Font Info: Try loading font information for U+msb on input line 41. + +(/usr/share/texmf-texlive/tex/latex/amsfonts/umsb.fd +File: umsb.fd 2009/06/22 v3.00 AMS symbols B +) +LaTeX Font Info: Try loading font information for U+lasy on input line 41. + +(/usr/share/texmf-texlive/tex/latex/base/ulasy.fd +File: ulasy.fd 1998/08/17 v2.2e LaTeX symbol font definitions +) +LaTeX Font Info: Try loading font information for U+stmry on input line 41. 
+ +(/usr/share/texmf-texlive/tex/latex/stmaryrd/Ustmry.fd) (./abstract.tex) +(./intro.tex [1 + +{/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map}] [2]) +(./general.tex +File: Whole_system.png Graphic file (type png) + + [3 <./Whole_system.png (PNG copy)>]) (./classEquiv.tex +[4] [5]) (./Mixed.tex +File: gensim.png Graphic file (type png) + + [6] +Overfull \hbox (5.64026pt too wide) in paragraph at lines 52--53 +[]\T1/cmr/m/n/10 The pro-posed dis-tance is the Lev-en-shtein one, which is clo +se to the Needleman- + [] + +[7 <./gensim.png (PNG copy)>] + +LaTeX Warning: Reference `wholesystem' on page 8 undefined on input line 70. + +[8]) (./implementation.tex +Overfull \hbox (5.78264pt too wide) in paragraph at lines 14--14 +[]\T1/cmr/m/n/7 Name| + [] + + +Overfull \hbox (5.05746pt too wide) in paragraph at lines 14--14 +[]\T1/cmr/m/n/7 Seq| + [] + + +Overfull \hbox (2.98024pt too wide) in paragraph at lines 11--22 +[][] + [] + + +Underfull \hbox (badness 10000) in paragraph at lines 11--22 + + [] + +[9] +Underfull \hbox (badness 10000) in paragraph at lines 73--73 +[]\T1/cmr/m/n/7 Load + [] + + +Underfull \hbox (badness 10000) in paragraph at lines 73--73 +[]\T1/cmr/m/n/7 Conv. + [] + + +Underfull \hbox (badness 10000) in paragraph at lines 73--73 +[]\T1/cmr/m/n/7 Read + [] + + +Underfull \hbox (badness 10000) in paragraph at lines 73--73 +[]\T1/cmr/m/n/7 Core + [] + + +Underfull \hbox (badness 10000) in paragraph at lines 73--74 +[]\T1/cmr/m/n/7 Core + [] + + +File: coregenome.png Graphic file (type png) + + +File: pangenome.png Graphic file (type png) + +Overfull \hbox (6.66504pt too wide) in paragraph at lines 92--97 +[][] [] + [] + + +File: cover_ncbi.png Graphic file (type png) + + +File: cover_dogma.png Graphic file (type png) + +Overfull \hbox (6.66504pt too wide) in paragraph at lines 105--110 +[][] [] + [] + +[10]) (./discussion.tex +LaTeX Font Info: Try loading font information for T1+cmtt on input line 156. 
+ + (/usr/share/texmf-texlive/tex/latex/base/t1cmtt.fd +File: t1cmtt.fd 1999/05/25 v2.5h Standard LaTeX font definitions +) +Overfull \hbox (3.26242pt too wide) in paragraph at lines 155--170 +\T1/cmr/m/n/10 tained with DOGMA, see $\T1/cmtt/m/n/10 http : / / members . fem +to-[]st . fr / christophe-[]guyeux / + [] + +[11 <./coregenome.png (PNG copy)> <./pangenome.png (PNG copy)> <./cover_ncbi.pn +g (PNG copy)> <./cover_dogma.png (PNG copy)>]) (./conclusion.tex [12]) +(./main.bbl [13]) +\tf@thm=\write4 +\openout4 = `main.thm'. + + [14] (./main.aux) + +LaTeX Warning: There were undefined references. + + +LaTeX Warning: There were multiply-defined labels. + + ) +Here is how much of TeX's memory you used: + 13868 strings out of 493848 + 240951 string characters out of 1152822 + 297521 words of memory out of 3000000 + 16773 multiletter control sequences out of 15000+50000 + 22851 words of font info for 67 fonts, out of 3000000 for 9000 + 717 hyphenation exceptions out of 8191 + 56i,12n,55p,1127b,750s stack positions out of 5000i,500n,10000p,200000b,50000s +{/usr/share/texmf/fonts/enc/dvips/cm-super/cm-super-t1.enc} +Output written on main.pdf (14 pages, 909089 bytes). +PDF statistics: + 158 PDF objects out of 1000 (max. 8388607) + 0 named destinations out of 1000 (max. 500000) + 43 words of extra memory for PDF output out of 10000 (max. 
10000000) + diff --git a/Paper2/main.pdf b/Paper2/main.pdf new file mode 100644 index 0000000..2f5b398 Binary files /dev/null and b/Paper2/main.pdf differ diff --git a/Paper2/main.synctex.gz b/Paper2/main.synctex.gz new file mode 100644 index 0000000..b492ff1 Binary files /dev/null and b/Paper2/main.synctex.gz differ diff --git a/Paper2/main.tex b/Paper2/main.tex new file mode 100644 index 0000000..35ccd3d --- /dev/null +++ b/Paper2/main.tex @@ -0,0 +1,92 @@ +\documentclass{article} +\usepackage{subfig} +\usepackage{color} +\usepackage{graphicx} +\usepackage{url} +\usepackage{cite} +\usepackage{algorithm} +\usepackage{algorithmic} +\usepackage{pdflscape} +\usepackage{authblk} +\usepackage[T1]{fontenc} +\usepackage{multirow,longtable} +\usepackage{amsmath,mathtools} +\usepackage{amssymb} +\usepackage[standard]{ntheorem} +\usepackage{stmaryrd} +\usepackage[utf8]{inputenc} +\usepackage{tikz} +\usepackage{rotating} +\usepackage{blindtext} + +% correct bad hyphenation here +\hyphenation{op-tical net-works semi-conduc-tor} + + +\title{Finding the Core-Genes of Chloroplasts based on Gene Similarity Approaches} +\author[1,3]{Bassam Al Kindy} %\footnote{email: bassam.al-kindy@univ-fcomte.fr} +\author[1,3]{Christophe Guyeux} +\author[1,3]{Jean-Fran\c{c}ois Couchot} +%\author[2,3]{Arnaud Mouly} +\author[1,3]{Michel Salomon} +\author[1,3]{Jacques M. Bahi} +\affil[1]{FEMTO-ST Institute, UMR 6174 CNRS, DISC Computer Science Department\\ Universit\'{e} de Franche-Comt\'{e}, France} +%\affil[2]{Chrono-Environnement Lab., UMR 6249 CNRS} +%{\small \it Authors in alphabetic order} +%\affil[3]{Universit\'{e} de Franche-Comt\'{e}, France} +\renewcommand\Authands{ and } + +\begin{document} + +\maketitle + +%IEEEtran, journal, \LaTeX, paper, template. 
+\input{abstract} + +\section{Introduction}\label{sec:intro} +\input{intro.tex} + + + +\section{An Overview}\label{sec:general} +\input{general} + +\section{Core genes extraction} +\subsection{Similarity-based approach}\label{sec:simil} +% Main author : jfc +\input{classEquiv} + +%\subsection{Annotation-based approach}\label{sec:annot} +% Main author : bassam +%\input{annotated} +\subsection{Annotation-based approach} +\subsubsection{Quality-test approach}\label{sec:mixed} +\input{Mixed} + +%\section{Features visualization}\label{sec:features} +%\input{Features} + +\section{Implementation}\label{sec:implem} +\input{implementation} +%\section{Second Stage: to Find Closed Genomes} +%\input{closedgenomes} + +\section{Discussion} +\input{discussion.tex} +\section{Conclusion}\label{sec:concl} +\input{conclusion} + +\bigskip + +\textit{Computations have been performed on the supercomputer facilities of the M\'esocentre +de calcul de Franche-Comt\'e.} + + +\bibliographystyle{plain} + +\bibliography{biblio} + +%\section{Appendix} +%\input{appendix} + +\end{document} diff --git a/Paper2/main.thm b/Paper2/main.thm new file mode 100644 index 0000000..5d6d4e6 --- /dev/null +++ b/Paper2/main.thm @@ -0,0 +1 @@ +\contentsline {definition}{{Definition}{1}{}}{4} diff --git a/Paper2/pan.png b/Paper2/pan.png new file mode 100644 index 0000000..6975e58 Binary files /dev/null and b/Paper2/pan.png differ diff --git a/Paper2/pangenome.png b/Paper2/pangenome.png new file mode 100644 index 0000000..33652fe Binary files /dev/null and b/Paper2/pangenome.png differ diff --git a/Paper2/phylo1.png b/Paper2/phylo1.png new file mode 100644 index 0000000..adc3cd7 Binary files /dev/null and b/Paper2/phylo1.png differ diff --git a/Paper2/phylo11.png b/Paper2/phylo11.png new file mode 100644 index 0000000..f075a03 Binary files /dev/null and b/Paper2/phylo11.png differ diff --git a/Paper2/population_Table.tex b/Paper2/population_Table.tex new file mode 100644 index 0000000..6c33354 --- /dev/null +++ 
b/Paper2/population_Table.tex @@ -0,0 +1,163 @@ + \begin{table} + \tiny + \caption[NCBI Genomes Families]{List of chloroplast genomes of photosynthetic eukaryote lineages from NCBI\label{Tab2}} + \begin{minipage}{0.50\textwidth} + \setlength{\tabcolsep}{4pt} + \begin{tabular}{|p{0.1cm}|p{0.2cm}|p{1.35cm}|p{2.65cm}|} + \hline + {{F.}}&{{\#}} & {{Acc. No}} & {{Scientific Name}} \\ + \hline + %Entering First line + \parbox[t]{1mm}{\multirow{11}{*}{\rotatebox[origin=c]{90}{Brown Algae}}} &\multirow{11}{*}{11} & NC\_001713.1 & \textit{Odontella sinensis} \\ + & & NC\_008588.1 & \textit{Phaeodactylum tricornutum} \\ + & & NC\_010772.1 & \textit{Heterosigma akashiwo} \\ + & & NC\_011600.1 & \textit{Vaucheria litorea} \\ + & & NC\_012903.1 & \textit{Aureoumbra lagunensis} \\ + & & NC\_014808.1 & \textit{Thalassiosira oceanica} \\ + & & NC\_015403.1 & \textit{Fistulifera sp} \\ + & & NC\_016731.1 & \textit{Synedra acus} \\ + & & NC\_016735.1 & \textit{Fucus vesiculosus} \\ + & & NC\_018523.1 & \textit{Saccharina japonica} \\ + & & NC\_020014.1 & \textit{Nannochloropsis gaditana} \\ + \hline + % Entering second group + \parbox[t]{1mm}{\multirow{3}{*}{\rotatebox[origin=c]{90}{F1}}} & + \multirow{3}{*}{3} & NC\_000925.1 & \textit{Porphyra purpurea} \\ + & & NC\_001840.1 & \textit{Cyanidium caldarium} \\ + & & NC\_006137.1 & \textit{Gracilaria tenuistipitata} \\ + \hline + % Entering third group + \parbox[t]{1mm}{\multirow{17}{*}{\rotatebox[origin=c]{90}{Green Algae}}} & + \multirow{17}{*}{17} & NC\_000927.1 & \textit{Nephroselmis olivacea} \\ + & & NC\_002186.1 & \textit{Mesostigma viride} \\ + & & NC\_005353.1 & \textit{Chlamydomonas reinhardtii} \\ + & & NC\_008097.1 & \textit{Chara vulgaris} \\ + & & NC\_008099.1 & \textit{Oltmannsiellopsis viridis} \\ + & & NC\_008114.1 & \textit{Pseudendoclonium akinetum} \\ + & & NC\_008289.1 & \textit{Ostreococcus tauri} \\ + & & NC\_008372.1 & \textit{Stigeoclonium helveticum} \\ + & & NC\_008822.1 & \textit{Chlorokybus atmophyticus} \\ + 
& & NC\_011031.1 & \textit{Oedogonium cardiacum} \\ + & & NC\_012097.1 & \textit{Pycnococcus provasolii} \\ + & & NC\_012099.1 & \textit{Pyramimonas parkeae} \\ + & & NC\_012568.1 & \textit{Micromonas pusilla} \\ + & & NC\_014346.1 & \textit{Floydiella terrestris} \\ + & & NC\_015645.1 & \textit{Schizomeris leibleinii} \\ + & & NC\_016732.1 & \textit{Dunaliella salina} \\ + & & NC\_016733.1 & \textit{Pedinomonas minor} \\ % + \hline + % Entering fourth group + \parbox[t]{1mm}{\multirow{3}{*}{\rotatebox[origin=c]{90}{F2}}} & + \multirow{3}{*}{3} & NC\_001319.1 & \textit{Marchantia polymorpha} \\ + & & NC\_004543.1 & \textit{Anthoceros formosae} \\ + & & NC\_005087.1 & \textit{Physcomitrella patens} \\ % + \hline + % Entering fifth group + \parbox[t]{1mm}{\multirow{2}{*}{\rotatebox[origin=c]{90}{F3}}} & + \multirow{2}{*}{2} & NC\_014267.1 & \textit{Kryptoperidinium foliaceum} \\ + & + & NC\_014287.1 & \textit{Durinskia baltica} \\ + \hline + + % Entering sixth group + \parbox[t]{1mm}{\multirow{2}{*}{\rotatebox[origin=c]{90}{F4}}} & + \multirow{2}{*}{2} & NC\_001603.2 & \textit{Euglena gracilis} \\ + & & NC\_020018.1 & \textit{Monomorphina aenigmatica} \\ + \hline + % Entering seventh group + \parbox[t]{1mm}{\multirow{5}{*}{\rotatebox[origin=c]{90}{Ferns}}} & \multirow{5}{*}{5} + & NC\_003386.1 & \textit{Psilotum nudum} \\ + & & NC\_008829.1 & \textit{Angiopteris evecta} \\ + & & NC\_014348.1 & \textit{Pteridium aquilinum} \\ + & & NC\_014699.1 & \textit{Equisetum arvense} \\ + & & NC\_017006.1 & \textit{Mankyua chejuensis} \\ + \hline + % Entering tenth group + & & & \\ + \parbox[t]{1mm}{\multirow{1}{*}{\rotatebox[origin=c]{90}{F5}}} + & 1 & NC\_007288.1 & \textit{Emiliania huxleyi}\\ + & & & \\ + \hline + % Entering eleventh group + \parbox[t]{1mm}{\multirow{2}{*}{\rotatebox[origin=c]{90}{F6}}} + & \multirow{2}{*}{2} & NC\_014675.1 & \textit{Isoetes flaccida} \\ + & & NC\_006861.1 & \textit{Huperzia lucidula} \\ + \hline + \end{tabular} + \end{minipage} + 
\begin{minipage}{0.60\textwidth} + \setlength{\tabcolsep}{4pt} + \begin{tabular}{|p{0.2cm}|p{0.2cm}|p{1.45cm}|p{2.4cm}|} + \hline + {{F.}}&{{\#}} & {{Acc. No}} & {{Scientific Name}} \\ + \hline + + % Entering eighth group + \parbox[t]{1mm}{\multirow{45}{*}{\rotatebox[origin=c]{90}{Angiosperms}}} + & + \multirow{45}{*}{45} & NC\_007898.3 & \textit{Solanum lycopersicum} \\ + & & NC\_001568.1 & \textit{Epifagus virginiana} \\ + & & NC\_001666.2 & \textit{Zea mays} \\ + & & NC\_005086.1 & \textit{Amborella trichopoda} \\ + & & NC\_006050.1 & \textit{Nymphaea alba} \\ + & & NC\_006290.1 & \textit{Panax ginseng} \\ + & & NC\_007578.1 & \textit{Lactuca sativa} \\ + & & NC\_007957.1 & \textit{Vitis vinifera} \\ + & & NC\_007977.1 & \textit{Helianthus annuus} \\ + & & NC\_008325.1 & \textit{Daucus carota} \\ + & & NC\_008336.1 & \textit{Nandina domestica} \\ + & & NC\_008359.1 & \textit{Morus indica} \\ + & & NC\_008407.1 & \textit{Jasminum nudiflorum} \\ + & & NC\_008456.1 & \textit{Drimys granadensis} \\ + & & NC\_008457.1 & \textit{Piper cenocladum} \\ + & & NC\_009601.1 & \textit{Dioscorea elephantipes} \\ + & & NC\_009765.1 & \textit{Cuscuta gronovii} \\ + & & NC\_009808.1 & \textit{Ipomoea purpurea} \\ + & & NC\_010361.1 & \textit{Oenothera biennis} \\ + & & NC\_010433.1 & \textit{Manihot esculenta} \\ + & & NC\_010442.1 & \textit{Trachelium caeruleum} \\ + & & NC\_013707.2 & \textit{Olea europaea} \\ + & & NC\_013823.1 & \textit{Typha latifolia} \\ + & & NC\_014570.1 & \textit{Eucalyptus} \\ + & & NC\_014674.1 & \textit{Castanea mollissima} \\ + & & NC\_014676.2 & \textit{Theobroma cacao} \\ + & & NC\_015830.1 & \textit{Bambusa emeiensis} \\ + & & NC\_015899.1 & \textit{Wolffia australiana} \\ + & & NC\_016433.2 & \textit{Sesamum indicum} \\ + & & NC\_016468.1 & \textit{Boea hygrometrica} \\ + & & NC\_016670.1 & \textit{Gossypium darwinii} \\ + & & NC\_016727.1 & \textit{Silene vulgaris} \\ + & & NC\_016734.1 & \textit{Brassica napus} \\ + & & NC\_016736.1 & 
\textit{Ricinus communis} \\ + & & NC\_016753.1 & \textit{Colocasia esculenta} \\ + & & NC\_017609.1 & \textit{Phalaenopsis equestris} \\ + & & NC\_018357.1 & \textit{Magnolia denudata} \\ + & & NC\_019601.1 & \textit{Fragaria chiloensis} \\ + & & NC\_008796.1 & \textit{Ranunculus macranthus} \\ + & & NC\_013991.2 & \textit{Phoenix dactylifera} \\ + & & NC\_016068.1 & \textit{Nicotiana undulata} \\ + \hline + % Entering ninth group + \parbox[t]{1mm}{\multirow{7}{*}{\rotatebox[origin=c]{90}{Gymnosperms}}} + & + \multirow{7}{*}{7}& NC\_009618.1 & \textit{Cycas taitungensis} \\ + & & NC\_011942.1 & \textit{Gnetum parvifolium} \\ + & & NC\_016058.1 & \textit{Larix decidua} \\ + & & NC\_016063.1 & \textit{Cephalotaxus wilsoniana} \\ + & & NC\_016065.1 & \textit{Taiwania cryptomerioides} \\ + & & NC\_016069.1 & \textit{Picea morrisonicola} \\ + & & NC\_016986.1 & \textit{Ginkgo biloba} \\ + \hline + \end{tabular} + \end{minipage} + + \scriptsize + \noindent where lineages F1, F2, F3, F4, F5, and F6 are + \textit{Red Algae}, + \textit{Bryophytes}, + \textit{Dinoflagellates}, + \textit{Euglena}, + \textit{Haptophytes}, and \textit{Lycophytes}, respectively. + \normalsize + \end{table} + \ No newline at end of file diff --git a/Paper2/stats.png b/Paper2/stats.png new file mode 100644 index 0000000..f43cbce Binary files /dev/null and b/Paper2/stats.png differ diff --git a/Paper2/tree.pdf b/Paper2/tree.pdf new file mode 100644 index 0000000..7aee786 Binary files /dev/null and b/Paper2/tree.pdf differ