From: Jean-François Couchot
Date: Thu, 21 Mar 2013 07:25:46 +0000 (+0100)
Subject: t
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/ancetre.git/commitdiff_plain/ebcab8d2a6e4951ed9a20959ae92787dc5d3af65?ds=inline;hp=--cc

t
---
ebcab8d2a6e4951ed9a20959ae92787dc5d3af65
diff --git a/closedgenomes.tex b/closedgenomes.tex
new file mode 100644
index 0000000..d468f10
--- /dev/null
+++ b/closedgenomes.tex
@@ -0,0 +1,49 @@
+The approach further relies on the ability to measure how far each
+genome is from the others. To achieve this, we combine XXX metrics, which are
+detailed in this part.
+
+\subsection{Core SNP based metric}
+By definition of the core genome, for each element $\dot{x}$
+of this set, there is a gene $x \in \dot{x}$ in each genome.
+Let us consider a class
+$\dot{x}= \{y | x \sim y\}$.
+
+\JFC{We should be consistent: two close genomes should everywhere have
+either a high metric value or a very low one}
+
+%1/ On SNPs of the strict core genome
+All the $y$ are then aligned
+with a global alignment tool, and the SNPs can be extracted.
+For each genome, one can then compute the vector of Boolean values
+recording at index $i$ whether SNP $i$ is present in one of its genes
+(positive value) or not (null value).
+The Hamming distance between two such vectors defines the distance
+between two genomes.
+This metric is referred to as $m_S$ in what follows.
+
+% the more differences there are, the higher the number
+
+
+%2/ On SNPs of the strict core genome, each gene having the same weight
+The $m_S$ metric does not give every gene the same influence on the
+metric value: a gene with many SNPs has a larger influence on
+the metric computation than a gene with fewer ones.
+The metric referred to as $m_{|S|}$ in what follows gives the same weight to each gene,
+regardless of the number of SNPs it contains.
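+
+As a sketch of these two definitions, under notation that is assumed here
+and not fixed elsewhere in the text, let $n$ be the number of core SNPs and
+$v_g \in \{0,1\}^n$ the Boolean vector of genome $g$. The Hamming distance
+between two genomes $g$ and $g'$ then reads
+$$
+m_S(g,g') = \sum_{i=1}^{n} \left| v_g[i] - v_{g'}[i] \right|.
+$$
+One possible way to give each gene the same weight (an assumption, since the
+exact normalization is not specified) is to partition the SNP indices into
+blocks $S_1,\dots,S_k$, one per core gene with at least one SNP, and to
+average the per-gene proportions of differing SNPs:
+$$
+m_{|S|}(g,g') = \frac{1}{k}\sum_{j=1}^{k} \frac{1}{|S_j|}
+\sum_{i \in S_j} \left| v_g[i] - v_{g'}[i] \right|.
+$$
+For instance, with $S_1=\{1,2,3,4\}$, $S_2=\{5\}$, $v_g=(1,1,0,0,1)$ and
+$v_{g'}=(0,0,0,0,0)$, we get $m_S(g,g')=3$ whereas
+$m_{|S|}(g,g')=\frac{1}{2}\left(\frac{2}{4}+\frac{1}{1}\right)=\frac{3}{4}$:
+the single SNP of the second gene weighs as much as the four SNPs of the first.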
+
+% the more differences there are, the higher the number
+
+
+%3/ On gene content (symmetric difference)
+The third metric considers the symmetric difference $\Delta$
+between the two sets $G_1$ and $G_2$ of genes.
+$$
+G_1\Delta G_2 =
+(G_1\cup G_2)\setminus (G_1\cap G_2) = (G_1\setminus G_2)\cup(G_2\setminus G_1)
+$$
+\end{document}
+
+% 4/ Using EPFL method
+% 5/ On size of the biggest synteny block
+% 6/ On average size of synteny blocks
+% 7/ On number of synteny blocks.