+Moreover, when several variables are very similar in the sense of statistical implication, it may be appropriate to replace them with a single variable, a leader that represents the equivalence class of similar variables for the purposes of implicative analysis.
+We therefore propose, by analogy with the definition of quasi-implication, to define a notion of quasi-equivalence between variables, in order to build classes from which a leader will be extracted.
+We will illustrate this with an example.
+Then, we will consider the possibility of using a genetic algorithm to optimize the choice of the representative for each quasi-equivalence class.
+
+\subsection{Definition of quasi-equivalence}
+
+Two binary variables $a$ and $b$ are equivalent in the sense of SIA when the two quasi-implications $a \Rightarrow b$ and $b \Rightarrow a$ are simultaneously satisfied at a given threshold.
+We have developed two criteria to assess the quality of a quasi-implication: one is the statistical surprise based on the likelihood of the link of~\cite{Lerman}, the other is the entropic form of quasi-inclusion~\cite{Grash2}, which is presented in this chapter (§7).
+
+According to the first criterion, we could say that two variables $a$ and $b$ are quasi-equivalent when the intensity of implication $\varphi(a,b)$ of $a\Rightarrow b$ differs little from that of $b \Rightarrow a$. However, for large populations (several thousand individuals), this criterion is no longer sufficiently discriminating to validate inclusion.
+
+According to the second criterion, an entropic measure of the imbalance between the counts $n_{a \wedge b}$ (individuals who satisfy both $a$ and $b$) and $n_{a \wedge \overline{b}}$ (individuals who satisfy $a$ and $\neg b$, the counter-examples to the implication $a\Rightarrow b$) indicates the quality of the implication $a\Rightarrow b$; likewise, the imbalance between $n_{a \wedge b}$ and $n_{\overline{a} \wedge b}$ indicates the quality of the converse implication $b\Rightarrow a$.
+
+
+Here we will use a method comparable to that used in Chapter 3 to define the entropic implication index.
+
+Denoting by $n_a$ and $n_b$ the numbers of individuals satisfying $a$ and $b$ respectively, the imbalance of the rule $a\Rightarrow b$ is measured by a conditional entropy $K(b \mid a=1)$, and that of $b\Rightarrow a$ by $K(a \mid b=1)$, with:
+
+
+\begin{eqnarray*}
+ K(b\mid a=1) = - \left( 1- \frac{n_{a\wedge b}}{n_a}\right) \log_2 \left( 1- \frac{n_{a\wedge b}}{n_a}\right) - \frac{n_{a\wedge b}}{n_a}\log_2 \frac{n_{a\wedge b}}{n_a} & \quad \mbox{if} \quad \frac{n_{a \wedge b}}{n_a} > 0.5\\
+ K(b\mid a=1) = 1 & \quad \mbox{if} \quad \frac{n_{a \wedge b}}{n_a} \leq 0.5\\
+ K(a\mid b=1) = - \left( 1- \frac{n_{a\wedge b}}{n_b}\right) \log_2 \left( 1- \frac{n_{a\wedge b}}{n_b}\right) - \frac{n_{a\wedge b}}{n_b}\log_2 \frac{n_{a\wedge b}}{n_b} & \quad \mbox{if} \quad \frac{n_{a \wedge b}}{n_b} > 0.5\\
+ K(a\mid b=1) = 1 & \quad \mbox{if} \quad \frac{n_{a \wedge b}}{n_b} \leq 0.5
+\end{eqnarray*}
+
+These two entropies must be low enough for it to be possible to bet on $b$ (resp. $a$) with good confidence when $a$ (resp. $b$) is realized. Therefore their respective complements to 1 must be simultaneously large.
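+
+As a purely illustrative numerical check (the counts below are hypothetical and not taken from a data set), suppose $n_a = 100$, $n_b = 105$ and $n_{a \wedge b} = 95$. Then $\frac{n_{a\wedge b}}{n_a} = 0.95$ and $\frac{n_{a\wedge b}}{n_b} \approx 0.905$, both above $0.5$, so that
+$$K(b \mid a=1) = -0.05\,\log_2 0.05 - 0.95\,\log_2 0.95 \approx 0.286, \qquad K(a \mid b=1) \approx 0.454.$$
+Both entropies are well below $1$. If instead only $40$ of the $100$ individuals satisfying $a$ also satisfied $b$, the ratio $0.4 \leq 0.5$ would give $K(b \mid a=1) = 1$, the maximal uncertainty.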
+
+\begin{figure}[htbp]
+ \centering
+\includegraphics[scale=0.5]{chap2fig8.png}
+\caption{Illustration of the functions $K$ and $1-K^2$ on $[0; 1]$.}
+
+\label{chap2fig7}
+\end{figure}
+
+
+\definition A first entropic index of equivalence is given by:
+$$e(a,b) = \left (\left[ 1 - K^2(b \mid a = 1)\right ]\left[ 1 - K^2(a \mid b = 1) \right]\right)^{\frac{1}{4}}$$
+
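+As a hedged numerical illustration with hypothetical values, take $K(b \mid a=1) \approx 0.286$ and $K(a \mid b=1) \approx 0.454$ (obtained, for instance, from $n_a=100$, $n_b=105$ and $n_{a\wedge b}=95$). Then
+$$e(a,b) \approx \left[ (1-0.286^2)(1-0.454^2) \right]^{\frac{1}{4}} \approx \left( 0.918 \times 0.794 \right)^{\frac{1}{4}} \approx 0.924.$$
+Conversely, as soon as one of the two entropies equals $1$, the corresponding factor vanishes and $e(a,b)=0$.
+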
+When this index takes values in the neighbourhood of $1$, it reflects a double implication of good quality.
+In addition, in order to take better account of $a \wedge b$ (the examples), we integrate this parameter through a similarity index $s(a,b)$ of the variables, for example in the sense of I.C. Lerman~\cite{Lermana}.
+The quasi-equivalence index is then constructed by combining these two concepts.
+
+\definition A second entropic equivalence index is given by the formula
+
+$$\sigma(a,b)= \left [ e(a,b) \cdot s(a,b)\right ]^{\frac{1}{2}}$$
+
+On this basis, we state the quasi-equivalence criterion that we use.
+
+\definition The pair of variables $\{a,b\}$ is said to be quasi-equivalent at the chosen quality level $\beta$ if $\sigma(a,b) \geq \beta$.
+For example, a value $\beta=0.95$ could be taken to indicate a good quasi-equivalence between $a$ and $b$.
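+
+To fix ideas with assumed values (both hypothetical), if $e(a,b) \approx 0.924$ and the similarity index is $s(a,b) = 0.98$, then
+$$\sigma(a,b) \approx (0.924 \times 0.98)^{\frac{1}{2}} \approx 0.952 \geq 0.95,$$
+so the pair $\{a,b\}$ would be retained at $\beta = 0.95$. With $s(a,b) = 0.90$ instead, $\sigma(a,b) \approx 0.912 < 0.95$ and the pair would be rejected.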
+
+\subsection{Algorithm for constructing quasi-equivalence classes}
+
+Let $V = \{a,b,c,...\}$ be a set of $v$ variables equipped with a valued relation $R$ induced by the quasi-equivalence measure on all pairs of $V$.
+We assume that the pairs of variables are sorted in decreasing order of quasi-equivalence.
+If the quality threshold for quasi-equivalence has been set at $\beta$, only the leading pairs $\{a,b\}$ satisfying the inequality $\sigma(a,b)\ge \beta$ are retained.
+In general, only a subset $V'$, of cardinality $v'$, of the variables of $V$ will satisfy this inequality.
+If this set $V'$ is empty or too small, the user can lower the threshold value.
+Since the relation is symmetric, there are at most $\frac{v'(v'-1)}{2}$ pairs to study.
+As for $V-V'$, it contains only non-reducible variables.
+
+We propose to use the following greedy algorithm:
+\begin{enumerate}
+\item A first potential class $C_1^0= \{e,f\}$ is formed such that $\sigma(e,f)$ is the largest of the $\beta$-equivalence values.
+ If possible, this class is extended to a new class $C_1$ by taking from $V'$ all the elements $x$ such that every pair of variables within the extended class has a quasi-equivalence greater than or equal to $\beta$;
+
+\item We continue with:
+ \begin{enumerate}
+ \item If $o$ and $k$, forming the pair $(o,k)$ immediately below $(e,f)$ according to the index $\sigma$, both belong to $C_1$, then we move to the pair immediately below $(o,k)$ and proceed as in 1.;
+ \item If neither $o$ nor $k$ belongs to $C_1$, proceed as in 1. from the pair they constitute, which forms the basis of a new class;
+ \item If exactly one of $o$ and $k$ does not belong to $C_1$, that variable can either form a singleton class or belong to a future class. It is then handled as above.
+ \end{enumerate}
+ \end{enumerate}
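+
+The following walk-through is a sketch of one possible reading of these steps, on hypothetical values. Take $V=\{a,b,c,d\}$, $\beta = 0.95$ and
+$$\sigma(a,b)=0.98,\quad \sigma(c,d)=0.97,\quad \sigma(a,c)=0.96,\quad \sigma(b,c)=0.95,\quad \sigma(a,d)=0.90,\quad \sigma(b,d)=0.88.$$
+The retained pairs, in decreasing order, are $\{a,b\}$, $\{c,d\}$, $\{a,c\}$ and $\{b,c\}$, so that $V'=V$. Step 1 gives $C_1^0=\{a,b\}$; $c$ is added since $\sigma(a,c)$ and $\sigma(b,c)$ are both at least $\beta$, whereas $d$ is rejected because $\sigma(a,d) < \beta$, hence $C_1=\{a,b,c\}$. In step 2, the next pair $\{c,d\}$ has only $d$ outside $C_1$, and the remaining pairs $\{a,c\}$ and $\{b,c\}$ lie entirely inside $C_1$. No pair is left to seed a new class, so $d$ forms the singleton $C_2=\{d\}$ and the resulting partition is $\{\{a,b,c\},\{d\}\}$. Note that $c$ joins $C_1$ even though $\sigma(c,d) > \sigma(a,c)$: the greedy order favours the class seeded by the strongest pair.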
+
+After a finite number of iterations, a partition of $V$ into $r$ classes of $\sigma$-equivalence is obtained: $\{C_1, C_2,..., C_r\}$.
+The quality of the reduction may be assessed by a raw or proportional index of $\beta^{\frac{p}{k}}$.
+However, we prefer the criterion defined below, which has the advantage of integrating the choice of representative.
+
+In addition, $r$ variables representing the $r$ classes of $\sigma$-equivalence could be selected on the basis of the following elementary criterion: the quality of the connection of each such variable with the other variables of its class.
+However, this criterion does not optimize the reduction, since the choice of representative remains relatively arbitrary and may merely reflect the triviality of the variable.
+