index of this selected association is multivariate~\cite{Bernard}.
Moreover, to our knowledge, most of the interesting developments in
the literature focus, on the one hand, on proposals for a partial
implication index for binary data~\cite{Lermana} or \cite{Lallich};
on the other hand, this notion is not extended to other types of
variables, nor to extraction and representation according to a rule
graph or a hierarchy of meta-rules, structures aiming to give access
to the meaning of a whole that is not reduced to the sum of its
parts~\cite{Seve}\footnote{This is what the philosopher L. Sève
  emphasizes: "... in the non-additive, non-linear passage of the
  parts to the whole, there are properties that are in no way
  precontained in the parts and which cannot therefore be explained by
  them".}, i.e. operating as a complex non-linear system.
For example, it is well known, through usage, that the meaning of a
sentence does not completely depend on the meaning of each of the
words in it (see the previous chapter, point 4).

Let us return to what we believe is fertile in the approach we are
developing.
It would seem that, in the literature, the notion of implication index
is also not extended to the search for the subjects and categories of
subjects responsible for the associations.
Nor is this responsibility quantified so as to lead to a reciprocal
structuring of the set of subjects, conditioned by their relationships
to the variables.
We propose these extensions here after recalling the founding
paradigm.


\section{Implication intensity in the binary case}

\subsection{Fundamental and founding situation}

A set of objects or subjects $E$ is crossed with variables
(characters, criteria, successes, ...) which are interrogated as
follows: "to what extent can we consider that instantiating
variable\footnote{Throughout the book, the word "variable" refers
  either to an isolated variable in the premise (example: "to be
  blonde") or to a conjunction of isolated variables (example: "to be
  blonde and to be under 30 years old and to live in Paris").} $a$
implies instantiating variable $b$?
In other words, do the subjects tend to be $b$ if we know that they are
$a$?".
In situations in the natural, human or life sciences, where theorems
(if $a$ then $b$) in the deductive sense of the term cannot be
established because of the exceptions that taint them, it is important
for the researcher and the practitioner to "mine" their data in order
to identify sufficiently reliable rules (kinds of "partial theorems",
inductions) so as to conjecture\footnote{"The exception confirms the
  rule", as the popular saying goes, in the sense that there would be
  no exceptions if there were no rule.} a possible causal relationship
or a genesis, to describe and structure a population, and to assume a
certain stability for descriptive and, if possible, predictive
purposes.
But this excavation requires the development of methods to guide it
and to free it from trial and error and empiricism.


\subsection{Mathematization}

To do this, following the example of I.~C. Lerman's similarity
measurement method~\cite{Lerman,Lermanb} and the classic approach of
non-parametric tests (e.g.
Fisher, Wilcoxon, etc.), we
define~\cite{Grasb,Grasf} the confirmatory quality measure of the
implicative relationship $a \Rightarrow b$ from the implausibility of
observing in the data the number of cases that invalidate it,
i.e. for which $a$ is verified without $b$ being verified. This
amounts to comparing the observed number of counter-examples with the
number expected theoretically if only chance were at
work\footnote{"...[in agreement with
  Jung] if the frequency of coincidences does not significantly
  exceed the probability that they can be calculated by attributing
  them solely by chance to the exclusion of hidden causal
  relationships, we certainly have no reason to suppose the existence
  of such relationships.", H. Atlan~\cite{Atlana}}.
But when analyzing data, it is this gap that we take into account, and
not the statement of the rejection or acceptance of a null hypothesis.
This measure is relative to the number of data verifying $a$ and not
$b$, the circumstance in which the implication is precisely put in
default.
It quantifies the expert's "astonishment" at the unlikely small number
of counter-examples in view of the supposed independence between the
variables and the numbers involved.

Let us be clear. A finite set $V$ of $v$ variables is given: $a$, $b$,
$c$, ...
In the classical paradigmatic situation initially retained, these
variables describe performances (success--failure) on the items of a
questionnaire.
With a finite set $E$ of $n$ subjects $x$ are associated, by abuse of
notation, functions of the type $x \rightarrow a(x)$, where $a(x) = 1$
(or $a(x) = true$) if $x$ satisfies or possesses the character $a$,
and $0$ (or $a(x) = false$) otherwise.
In artificial intelligence, we say that $x$ is an example or an
instance for $a$ if $a(x) = 1$, and a counter-example if not.


The rule $a \Rightarrow b$ is logically true if, for any $x$ in the
sample, $b(x)$ is null only if $a(x)$ is also null; in other words, if
the set $A$ of the $x$ for which $a(x)=1$ is contained in the set $B$
of the $x$ for which $b(x)=1$.
However, this strict inclusion is only exceptionally observed in the
experiments encountered in practice.
In the case of a knowledge questionnaire, we could indeed observe a
few rare students succeeding at an item $a$ and failing at item $b$,
without contesting the tendency to succeed at item $b$ when item $a$
has been passed.
With regard to the cardinals of $E$ (of size $n$), but also of $A$ (or
$n_a$) and $B$ (or $n_b$), it is therefore the "weight" of the
counter-examples (or $n_{a \wedge \overline{b}}$) that must be taken
into account in order to decide statistically whether or not to keep
the quasi-implication or quasi-rule $a \Rightarrow b$. Thus, it is
from the dialectic of examples and counter-examples that the rule
appears as the overcoming of contradiction.

\subsection{Formalization}

To formalize this quasi-rule, we consider any two parts $X$ and $Y$ of
$E$, chosen randomly and independently (absence of an a priori link
between these two parts) and with the same respective cardinals as $A$
and $B$. Let $\overline{Y}$ and $\overline{B}$ be the respective
complements of $Y$ and $B$ in $E$, both of cardinal
$n_{\overline{b}}= n-n_b$.

We will then say:

\definition $a \Rightarrow b$ is acceptable at confidence level
$1-\alpha$ if and only if
$$Pr[Card(X\cap \overline{Y})\leq card(A\cap \overline{B})]\leq \alpha$$

\begin{figure}[htbp]
  \centering
\includegraphics[scale=0.34]{chap2fig1.png}
  \caption{The dark grey parts correspond to the counter-examples of the
    implication $a \Rightarrow b$}
\label{chap2fig1}
\end{figure}
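To make this definition concrete, here is a minimal Monte Carlo sketch
(in Python) that estimates the probability appearing in the definition
by repeatedly drawing random parts $X$ and $Y$ with the same cardinals
as $A$ and $B$; the counts used ($n=100$, $n_a=20$, $n_b=60$, $4$
counter-examples) and the function name are purely illustrative
assumptions of ours.

\begin{verbatim}
import random

def estimate_admissibility_probability(n, n_a, n_b, n_counter, trials=20000):
    """Monte Carlo estimate of Pr[Card(X & not-Y) <= card(A & not-B)],
    drawing X and Y independently with the same cardinals as A and B."""
    population = range(n)
    hits = 0
    for _ in range(trials):
        X = set(random.sample(population, n_a))
        Y = set(random.sample(population, n_b))
        if len(X - Y) <= n_counter:
            hits += 1
    return hits / trials

# hypothetical counts: n = 100 subjects, n_a = 20, n_b = 60, 4 counter-examples
print(estimate_admissibility_probability(100, 20, 60, 4))
\end{verbatim}

The rule $a \Rightarrow b$ is then accepted at level $1-\alpha$
whenever the estimated probability does not exceed $\alpha$.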
It is established \cite{Lermanb} that, for a certain drawing process,
the random variable $Card(X\cap \overline{Y})$ follows the Poisson law
of parameter $\frac{n_a n_{\overline{b}}}{n}$.
We obtain the same result by proceeding differently, as follows.

Denote by $X$ (resp. $Y$) the random subset of binary transactions in
which $a$ (resp. $b$) appears, independently, with frequency
$\frac{n_a}{n}$ (resp. $\frac{n_b}{n}$).
To specify how the transactions carrying the variables $a$ and $b$,
i.e. the sets $A$ and $B$, are extracted, the following semantically
permissible assumptions are made regarding the observation of the
event $[a=1~ and~ b=0]$; $(A\cap
\overline{B})$\footnote{We then denote by $\overline{v}$ the negation
  of the variable $v$ (or $not~ v$) and by $\overline{P}$ the
  complement of the part $P$ of $E$.} is the subset of transactions
that are counter-examples of the implication $a \Rightarrow b$.

Assumptions:
\begin{itemize}
\item h1: the waiting times between events $[a~ and~ not~ b]$ are independent
  random variables;
\item h2: the law of the number of events occurring in the time
  interval $[t,~ t+T[$ depends only on $T$;
\item h3: two such events cannot occur simultaneously.
\end{itemize}

It is then demonstrated (for example in~\cite{Saporta}) that the
number of events occurring during a period of fixed duration $n$
follows a Poisson law of parameter $c\, n$, where $c$ is called the
rate of the occurrence process per unit of time.


However, for each transaction assumed to be random, the event $[a=1]$
has as probability the frequency $\frac{n_a}{n}$ and the event $[b=0]$
has as probability the frequency $\frac{n_{\overline{b}}}{n}$;
therefore, under the hypothesis of absence of an a priori link between
$a$ and $b$ (independence), the joint event $[a=1~ and~ b=0]$ has a
probability estimated by the frequency
$\frac{n_a}{n} \cdot \frac{n_{\overline{b}}}{n}$.

We can then estimate the rate $c$ of this event by $\frac{n_a}{n} \cdot \frac{n_{\overline{b}}}{n}$.

Thus, for a duration of time $n$, the occurrences of the event $[a~ and~ not~b]$ follow a Poisson law of parameter:
$$\lambda = \frac{n_a\, n_{\overline{b}}}{n}$$

As a result, $Pr[Card(X\cap \overline{Y})= s]= e^{-\lambda}\frac{\lambda^s}{s!}$

Consequently, the probability that chance alone, under the assumption
of the absence of an a priori link between $a$ and $b$, would lead to
no more counter-examples than those observed is:

$$Pr[Card(X\cap \overline{Y})\leq card(A\cap \overline{B})] =
\sum^{card(A\cap \overline{B})}_{s=0} e^{-\lambda}\frac{\lambda^s}{s!} $$

But other legitimate drawing processes lead to a binomial law, or
even a hypergeometric law (itself not semantically adapted to the
situation because of its symmetry). Under suitable convergence
conditions, these two laws finally reduce to the Poisson law above
(see the Annex to this chapter).
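As a minimal numerical sketch of this computation (in Python, with
purely hypothetical counts), the probability above can be evaluated
directly from its definition:

\begin{verbatim}
from math import exp, factorial

def poisson_tail(n, n_a, n_b, n_counter):
    """Pr[Card(X & not-Y) <= n_counter] under the Poisson model
    with parameter lambda = n_a * (n - n_b) / n."""
    lam = n_a * (n - n_b) / n
    return sum(exp(-lam) * lam**s / factorial(s) for s in range(n_counter + 1))

# hypothetical counts: lambda = 20 * 40 / 100 = 8, result is about 0.0996
print(poisson_tail(100, 20, 60, 4))
\end{verbatim}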
If $n_{\overline{b}}\neq 0$, we reduce and center this Poisson variable
into the variable:

$$Q(a,\overline{b})= \frac{Card(X \cap \overline{Y}) - \frac{n_a n_{\overline{b}}}{n}}{\sqrt{\frac{n_a n_{\overline{b}}}{n}}} $$

In the experimental realization, the observed value of
$Q(a,\overline{b})$ is $q(a,\overline{b})$.
It estimates the gap between the contingency $card(A\cap
\overline{B})$ and the value it would have taken if there had been
independence between $a$ and $b$.

\definition $$q(a,\overline{b}) = \frac{n_{a \wedge \overline{b}}- \frac{n_a n_{\overline{b}}}{n}}{\sqrt{\frac{n_a n_{\overline{b}}}{n}}}$$
is called the implication index, the number used as an indicator of
the non-implication of $a$ to $b$.
In cases where the approximation is properly legitimized (for example
$\frac{n_a n_{\overline{b}}}{n}\geq 4$), the variable
$Q(a,\overline{b})$ approximately follows the standard (centred and
reduced) normal distribution. The intensity of implication, measuring
the quality of $a\Rightarrow b$, for $n_a\leq n_b$ and $n_b \neq n$, is
then defined from the index $q(a,\overline{b})$ by:

\definition
The implication intensity that measures the inductive quality of $a$
over $b$ is:
$$\varphi(a,b)=1-Pr[Q(a,\overline{b})\leq q(a,\overline{b})] =
\frac{1}{\sqrt{2 \pi}} \int^{\infty}_{ q(a,\overline{b})}
e^{-\frac{t^2}{2}} dt,~ if~ n_b \neq n$$
$$\varphi(a,b)=0,~ otherwise$$
As a result, the definition of statistical implication becomes:
\definition
Implication $a\Rightarrow b$ is admissible at confidence level
$1-\alpha $ if and only if:
$$\varphi(a,b)\geq 1-\alpha$$


It should be recalled that this modeling of quasi-implication measures
the astonishment of observing so few counter-examples in view of the
number of instances of the implication.
It is a measure of the inductive and informative quality of the
implication. Therefore, if the rule is trivial, as in the case where
$B$ is very large or coincides with $E$, this astonishment becomes
small.
We also demonstrate~\cite{Grasf} that this triviality results in a
very low or even zero intensity of implication: if, $n_a$ being fixed
and $A$ being included in $B$, $n_b$ tends towards $n$ ($B$ "grows"
towards $E$), then $\varphi(a,b)$ tends towards $0$. We therefore
define, by "continuity": $\varphi(a,b) = 0$ if $n_b = n$. Similarly, if
$A\subset B$, $\varphi(a,b)$ may be less than $1$ in the case where
the inductive confidence, measured by statistical surprise, is
insufficient.
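As an illustration, with purely hypothetical counts $n=100$, $n_a=20$,
$n_b=60$ (hence $n_{\overline{b}}=40$) and $n_{a \wedge
  \overline{b}}=4$, we have $\frac{n_a n_{\overline{b}}}{n}=8\geq 4$,
so the normal approximation is legitimate and
$$q(a,\overline{b}) = \frac{4 - 8}{\sqrt{8}} \approx -1.41, \qquad
\varphi(a,b) = 1-Pr[Q(a,\overline{b})\leq -1.41] \approx 0.92,$$
so that the rule $a \Rightarrow b$ would be admissible at the
confidence level $0.90$ but not at the level $0.95$.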
{\bf \remark Total correlation, partial correlation}


We take here the notion of correlation in a more general sense than
the one used in the domain that develops the linear correlation
coefficient (a measure of linear link) or the correlation ratio (a
measure of functional link).
In our perspective, there is a total (or partial) correlation between
two variables $a$ and $b$ when the respective events they determine
occur (or almost occur) at the same time, as well as their opposites.
However, we know from numerical counter-examples that correlation and
implication do not reduce to each other: there can be correlation
without implication and vice versa (see~\cite{Grasf} and below).
If we compare the implication coefficient and the linear correlation
coefficient algebraically, it is clear that the two concepts do not
coincide and therefore do not provide the same
information\footnote{"More serious is the logical error of inferring,
  from an observed correlation, the existence of a causality", writes
  Albert Jacquard in~\cite{Jacquard}, p.~159.}.

The quasi-implication, with its non-symmetric index
$q(a,\overline{b})$, does not coincide with the correlation
coefficient $\rho(a, b)$, which is symmetric and reflects the
relationship between the variables $a$ and $b$. Indeed, we
show~\cite{Grasf} that if $q(a,\overline{b}) \neq 0$
then
$$\frac{\rho(a,b)}{q(a,\overline{b})} = -\sqrt{\frac{n}{n_b
    n_{\overline{a}}}}$$
Even if correlation, considered here as linear correlation, and
implication generally point in the same direction, the orientation of
the relationship between the two variables is not apparent in the
correlation because it is symmetric, which is not the choice made in
SIA.
From a statistical relationship given by the correlation, two opposing
empirical propositions can be deduced.

The following dual numerical situation clearly illustrates this:


\begin{table}[htp]
\center
\begin{tabular}{|l|c|c|c|}\hline
\diagbox[width=4em]{$a_1$}{$b_1$}&
 1 & 0 & margin\\ \hline
 1 & 96 & 4& 100 \\ \hline
 0 & 50 & 50& 100 \\ \hline
 margin & 146 & 54& 200 \\ \hline
\end{tabular} ~ ~ ~ ~ ~ ~ ~ \begin{tabular}{|l|c|c|c|}\hline
\diagbox[width=4em]{$a_2$}{$b_2$}&
 1 & 0 & margin\\ \hline
 1 & 94 & 6& 100 \\ \hline
 0 & 52 & 48& 100 \\ \hline
 margin & 146 & 54& 200 \\ \hline
\end{tabular}

\caption{Numerical example of the difference between implication and
  correlation}
\label{chap2tab1}
\end{table}

From Table~\ref{chap2tab1}, the following correlations and implication
indices can be computed:\\
Correlation $\rho(a_1,b_1)=0.468$, Implication
$q(a_1,\overline{b_1})=-4.082$\\
Correlation $\rho(a_2,b_2)=0.473$, Implication $q(a_2,\overline{b_2})=-4.041$


Thus, we observe that, on the one hand, $a_1$ and $b_1$ are less
correlated than $a_2$ and $b_2$ while, on the other hand, the
implication intensity of $a_1$ over $b_1$ is higher than that of $a_2$
over $b_2$, since $q(a_1,\overline{b_1}) < q(a_2,\overline{b_2})$.

On this subject, Alain Ehrenberg writes in~\cite{Ehrenberg}: "The
finding of a correlation does not remove the ambiguity between 'when I
do $X$, my brain is in state $Y$' and 'if I do $X$, it is because my
brain is in state $Y$', that is, between something that happens in my
brain when I do an action."

\remark Remember that we consider not only conjunctions of variables
of the type "$a$ and $b$" but also disjunctions such as "($a$ and $b$)
or $c$...", in order to model phenomena that are concepts, as is done
in machine learning or in artificial intelligence.
The associated calculations remain compatible with the logic of
propositions linked by connectors.

\remark Unlike the Loevinger index~\cite{Loevinger} and the
conditional probability $Pr[B/A]$ and all its derivatives, which equal
$1$ as soon as $A \subset B$, the implication intensity varies,
non-linearly, with the expansion of the sets $E$, $A$ and $B$, and
weakens with triviality (see Definition 2.3).
Moreover, it is resistant to noise, especially around $0$, which can
only make the relationship we want to model and establish
statistically more credible.
Finally, as we have seen, the inclusion of $A$ in $B$ does not ensure
maximum intensity: the inductive quality may not be strong even though
$Pr[B/A]$ is equal to $1$~\cite{Grasm,Guillet}.
In Section~5, we study more closely the problem of the sensitivity
and stability of the implication index under small variations of the
parameters involved, through the study of its differential.
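To summarize the binary case, here is a minimal computational sketch
(in Python) of the implication intensity under the normal
approximation discussed above; the function name and the example
counts are illustrative assumptions of ours, not part of any existing
library.

\begin{verbatim}
from math import erf, sqrt

def implication_intensity(n, n_a, n_b, n_counter):
    """Implication intensity phi(a,b) under the normal approximation,
    reasonable roughly when n_a*(n - n_b)/n >= 4; returns 0 when n_b = n."""
    if n_b == n or n_a == 0:
        return 0.0
    lam = n_a * (n - n_b) / n          # expected number of counter-examples
    q = (n_counter - lam) / sqrt(lam)  # implication index q(a, not-b)
    return 0.5 * (1.0 - erf(q / sqrt(2)))  # Pr[N(0,1) > q]

# hypothetical counts as in the worked example above: phi is about 0.92
print(implication_intensity(100, 20, 60, 4))
\end{verbatim}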
\section{Case of modal and frequency variables}
\subsection{Founding situation}

Marc Bailleul's research (1991-1994) focuses in particular on the
representation that mathematics teachers have of their own teaching.
In order to highlight it, meaningful words are proposed to them, which
they must rank.
Their choices are no longer binary: the words chosen by any teacher
are ordered, from the least to the most representative.
Bailleul's questions then take the form: "if I choose this word with
this importance, then I choose this other word with at least equal
importance".
It was therefore necessary to extend the notion of statistical
implication to variables other than binary ones.
This is the case for modal variables, associated with phenomena where
the values $a(x)$ are numbers in the interval $[0, 1]$ describing
degrees of belonging or satisfaction, as in fuzzy logic, expressed for
example by the linguistic modifiers "maybe", "a little", "sometimes",
etc.
This problem is also found in situations where the frequency of a
variable reflects a preorder on the values assigned by the subjects to
the variables presented to them.
These are frequency variables, associated with phenomena where
the values of $a(x)$ are arbitrary positive real values.
This is the case when one considers a student's percentage of success
in a battery of tests in different areas.

\subsection{Formalization}

J.B. Lagrange~\cite{Lagrange} has demonstrated that, in the modal
case,
\begin{itemize}
  \item if $a(x)$ and $\overline{b}(x)$ are the values taken at $x$ by
    the modal variables $a$ and $\overline{b}$, with $\overline{b}(x)=1-b(x)$,
  \item if $s^2_a$ and $s_{\overline{b}}^2$ are the empirical variances of the variables $a$ and $\overline{b}$,
\end{itemize}
then the implication index, which he calls the propensity index, becomes:

\definition
$$q(a,\overline{b}) = \frac{\sum_{x\in E} a(x)\overline{b}(x) -
  \frac{n_a n_{\overline{b}}}{n}}
{\sqrt{\frac{(n^2s_a^2+n_a^2)(n^2 s_{\overline{b}}^2 + n_{\overline{b}}^2)}{n^3}}}$$
is the index of propensity of modal variables.

J.B. Lagrange also proves that this index coincides with the index
defined previously in the binary case if the number of modalities of $a$
and $b$ is exactly 2, because in this case:\\
$n^2s_a^2+n_a^2=n\, n_a$,~ ~ $n^2 s_{\overline{b}}^2 + n_{\overline{b}}^2=n\,
n_{\overline{b}}$~ ~ and ~ ~ $\sum_{x\in E} a(x)\overline{b}(x)=n_{a \wedge
  \overline{b}}$.

This solution provided in the modal case is also applicable to the
case of frequency variables, or even of positive numerical variables,
provided that the values observed on the variables, such as $a$ and $b$,
have been normalized, the normalization in $[0, 1]$ being made from the
maximum of the values taken respectively by $a$ and $b$ on the set $E$.
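A minimal computational sketch of this propensity index (in Python) is
given below; it assumes, by analogy with the binary case, that $n_a$
and $n_{\overline{b}}$ denote the sums $\sum_{x\in E} a(x)$ and
$\sum_{x\in E} \overline{b}(x)$, that the values have already been
normalized to $[0, 1]$, and the function name is ours.

\begin{verbatim}
from math import sqrt

def propensity_index(a_values, b_values):
    """Propensity (implication) index for modal variables with values in [0,1],
    following the formula above; a sketch under the stated assumptions."""
    n = len(a_values)
    nb_bar = [1 - b for b in b_values]                              # values of not-b
    n_a = sum(a_values)
    n_b_bar = sum(nb_bar)
    s2_a = sum(x * x for x in a_values) / n - (n_a / n) ** 2        # empirical variance of a
    s2_b_bar = sum(x * x for x in nb_bar) / n - (n_b_bar / n) ** 2  # empirical variance of not-b
    num = sum(x * y for x, y in zip(a_values, nb_bar)) - n_a * n_b_bar / n
    den = sqrt((n**2 * s2_a + n_a**2) * (n**2 * s2_b_bar + n_b_bar**2) / n**3)
    return num / den
\end{verbatim}

On binary (0/1) data this function gives back the implication index of
the binary case, in accordance with the identities recalled above.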
\remark In~\cite{Regniera}, we consider rank variables that reflect a
total order between choices presented to a population of judges: each
judge must rank, in order of preference, the objects or proposals
submitted to them.
An index measures the quality of a statement of the type: "if object
$a$ is ranked by the judges then, generally, object $b$ is ranked higher
by the same judges".
The proximity to the previous problem leads to an index that is
relatively close to Lagrange's index, but better adapted to the
rank-variable situation.


\section{Cases of variables-on-intervals and interval-variables}
\subsection{Variables-on-intervals}
\subsubsection{Founding situation}

For example, the following rule is sought to be extracted from a
biometric data set, together with an estimate of its quality: "if an
individual weighs between $65$ and $70~kg$ then in general he is
between $1.70$ and $1.76~m$ tall".
A similar situation arises in the search for relationships between
intervals of student performance in two different subjects.
The more general situation is then expressed as follows: two real
variables $a$ and $b$ take a certain number of values over two finite
intervals $[a_1,~ a_2]$ and $[b_1,~ b_2]$. Let $A$ (resp. $B$) be the set
of values of $a$ (resp. $b$) observed over $[a_1,~ a_2]$ (resp. $[b_1,~
  b_2]$).
For example, here, $a$ represents the weights of a set of $n$ subjects
and $b$ the heights of these same subjects.

Two problems arise:
\begin{enumerate}
\item Can adjacent sub-intervals of $[a_1,~ a_2]$ (resp. $[b_1,~ b_2]$)
  be defined so that the finest partition obtained best respects the
  distribution of the values observed in $[a_1,~ a_2]$ (resp. $[b_1,~ b_2]$)?
\item Can we find the respective partitions of $[a_1,~ a_2]$ and $[b_1,~
    b_2]$ made up of unions of the previous adjacent sub-intervals,
  partitions that maximize the average intensity of implication of the
  sub-intervals of one over the sub-intervals of the other belonging to
  these partitions?
\end{enumerate}

We answer these two questions as part of our problem by choosing the
criteria to optimize in order to satisfy the optimality expected in
each case.
For the first question, many solutions have been provided in other
settings (for example, by~\cite{Lahaniera}).

\subsubsection{First problem}

We consider the interval $[a_1,~ a_2]$, assuming it has a trivial
initial partition into sub-intervals of the same length, but not
necessarily with the same frequency distribution observed on these
sub-intervals.
Note $P_0 = \{A_{01},~ A_{02},~ ...,~ A_{0p}\}$ this partition into $p$
sub-intervals.
We try to obtain a partition of $[a_1,~ a_2]$ into $p$ sub-intervals
$\{A_{q1},~ A_{q2},~ ...,~ A_{qp}\}$ in such a way that within each
sub-interval there is good statistical homogeneity (low intra-class
inertia) and that these sub-intervals have good mutual heterogeneity
(high inter-class inertia).
We know that if one of the criteria is satisfied, the other is
necessarily satisfied as well (Koenig-Huyghens theorem).
This is done by adopting a method directly inspired by the dynamic
clouds method developed by Edwin Diday~\cite{Diday} (see also
\cite{Lebart}) and adapted to the current situation. This results in
the targeted optimal partition.

\subsubsection{Second problem}

It is now assumed that the intervals $[a_1,~ a_2]$ and $[b_1,~ b_2]$ are
provided with optimal partitions $P$ and $Q$, respectively, in the
sense of the dynamic clouds.
Let $p$ and $q$ be the respective numbers of sub-intervals composing
$P$ and $Q$.
From these two partitions, it is possible to generate $2^{p-1}$ and
$2^{q-1}$ partitions obtained by iterated unions of adjacent
sub-intervals of $P$ and $Q$ \footnote{It is enough to consider the tree structure of which $A_1$ is the root, then to join it or not to $A_2$, which itself will or will not be joined to $A_3$, etc. There are therefore $2^{p-1}$ branches in this tree structure.} respectively.
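The enumeration of these coarsened partitions can be sketched as
follows (in Python); the representation of sub-intervals as (lower
bound, upper bound) pairs and the function name are our own
illustrative choices.

\begin{verbatim}
from itertools import product

def adjacent_groupings(intervals):
    """Enumerate the 2^(p-1) partitions obtained by keeping or removing each
    internal boundary, i.e. by unions of adjacent sub-intervals.
    `intervals` is an ordered list of adjacent (low, high) pairs."""
    p = len(intervals)
    for cuts in product([False, True], repeat=p - 1):
        groups, current = [], [intervals[0]]
        for keep_cut, interval in zip(cuts, intervals[1:]):
            if keep_cut:
                # keep the boundary: close the current merged sub-interval
                groups.append((current[0][0], current[-1][1]))
                current = [interval]
            else:
                # drop the boundary: merge with the previous sub-interval(s)
                current.append(interval)
        groups.append((current[0][0], current[-1][1]))
        yield groups

# example: the 2^2 = 4 groupings of three adjacent sub-intervals of [0, 3]
for grouping in adjacent_groupings([(0, 1), (1, 2), (2, 3)]):
    print(grouping)
\end{verbatim}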
We calculate the respective intensities of implication of each
sub-interval, whether or not merged with another, of the first
partition, on each sub-interval, whether or not merged with another,
of the second, and then the values of the intensities of the
reciprocal implications.
There are therefore in total $2 \cdot 2^{p-1} \cdot 2^{q-1}$ families of
implication intensities, each of which requires the calculation of the
implications of all the elements of a partition of $[a_1,~ a_2]$ on all
the elements of one of the partitions of $[b_1,~ b_2]$, and vice versa.
The optimality criterion chosen is the geometric mean of the
intensities of implication, the mean associated with each pair of
partitions of elements, merged or not, defined inductively.
We note the two maxima obtained (direct implication and its
reciprocal) and we retain the two associated partitions, declaring
that the implication of the variable-on-interval $a$ on the
variable-on-interval $b$ is optimal when the interval $[a_1,~ a_2]$
admits the partition corresponding to the first maximum, and that the
optimal reciprocal implication is obtained for the partition of
$[b_1,~ b_2]$ corresponding to the second maximum.

\section{Interval-variables}
\subsection{Founding situation}
Data are available from a population of $n$ individuals (each of whom
may itself be a set of individuals, e.g. a class of students)
according to variables (e.g. grades over a year in French, math,
physics, ..., but also: weight, height, chest size, ...).
The values taken by these variables for each individual are intervals
of positive real values.
For example, individual $x$ gives the value $[12,~ 15.50]$ to the math
score variable.
E. Diday would speak in this case of $p$ symbolic interval variables
defined on the population.


We try to define an implication from intervals, relative to a variable
$a$, which are themselves observed intervals, towards other similarly
defined intervals relative to another variable $b$.
This will make it possible to measure the implicative, and therefore
non-symmetric, association of certain interval(s) of the variable $a$
with certain interval(s) of the variable $b$, as well as the
reciprocal association, from which the best one will be chosen for each
pair of sub-intervals involved, as described in §4.1.

For example, it will be said that the sub-interval $[2,~ 5.5]$ of
mathematics scores generally implies the sub-interval $[4.25,~ 7.5]$
of physics scores, both of which belong to an optimal partition, in
terms of explained variance, of the respective value ranges $[1,~
  18]$ and $[3,~ 20]$ taken in the population.
Similarly, we will say that $[14.25,~ 17.80]$ in physics most often
implies $[16.40,~ 18]$ in mathematics.


\subsection{Algorithm}

Following the problematic of E. Diday and his collaborators, if the
values taken by the variables $a$ and $b$ according to the subjects
are of a symbolic nature, in this case intervals of $\mathbb{R}^+$, it
is possible to extend the above algorithms~\cite{Grasi}.
For example, variable $a$ has weight intervals associated with it and
variable $b$ has height intervals associated with it, due to
inaccurate measurements.
By taking the union of the intervals $I_x$ and $J_x$ given by the
subjects $x$ of $E$ for each of the variables $a$ and $b$
respectively, we obtain two intervals $I$ and $J$ covering all
possible values of $a$ and $b$.
On each of them, a partition into a certain number of sub-intervals
can be defined, respecting, as above, a certain optimality criterion.
For this purpose, the intersections of intervals such as $I_x$ and
$J_x$ with the cells of these partitions are provided with a
distribution taking into account the measures of the common parts.
This distribution may be uniform or of another discrete or continuous
type.
In this way, we are back to searching for rules between two sets of
variables-on-intervals that take, as previously in §4.1, their values
in $[0,~ 1]$, from which we can search for optimal implications.
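A minimal sketch of this last step (in Python) distributes a subject's
interval over the cells of a partition proportionally to the length of
the common parts, i.e. the uniform choice mentioned above; the
function name and the example values are illustrative assumptions of
ours.

\begin{verbatim}
def overlap_distribution(subject_interval, partition):
    """Distribute the interval of a subject over the cells of a partition,
    proportionally to the length of the common part (uniform distribution)."""
    lo, hi = subject_interval
    length = hi - lo
    weights = []
    for a, b in partition:
        common = max(0.0, min(hi, b) - max(lo, a))
        weights.append(common / length if length > 0 else 0.0)
    return weights

# example: the interval [12, 15.5] over the partition {[10, 14], [14, 18]}
print(overlap_distribution((12, 15.5), [(10, 14), (14, 18)]))
\end{verbatim}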
\remark Whatever the type of variable considered, there is often a
problem of overabundance of variables and therefore a difficulty of
representation.
For this reason, we have defined an equivalence relation on the set of
variables that allows us to substitute a so-called leader variable for
each equivalence class~\cite{Grask}.

\section{Variations of the implication index q according to the 4 occurrences}

In this section, we examine the sensitivity of the implication index
to disturbances of its parameters.

\subsection{Stability of the implication index}
To study the stability of the implication index $q$ is to examine its
small variations in the vicinity of the $4$ observed integer values
($n$, $n_a$, $n_b$, $n_{a \wedge \overline{b}}$).
To do this, it is possible to perform different simulations by
crossing these 4 integer variables on which $q$ depends~\cite{Grasx}.
But let us instead consider these variables as variables with real
values, and $q$ as a continuously differentiable function of these
variables, which are themselves constrained by the inequalities: $0\leq
n_a \leq n_b$, $n_{a \wedge \overline{b}} \leq \inf\{n_a,~ n_b\}$ and
$\sup\{n_a,~ n_b\} \leq n$.
The function $q$ then defines a scalar and vector field on
$\mathbb{R}^4$, regarded as an affine and vector space on itself.
Under the likely hypothesis of a non-chaotic process of data
collection, it is then sufficient to examine the differential of
$q$ with respect to these variables and to keep its restriction to the
integer values of the parameters of the relationship $a \Rightarrow b$.
The differential of $q$, in the sense of Fréchet's
topology\footnote{Fréchet's topology allows $\mathbb{N}$ sections,
  i.e. subsets of naturals of the form $\{n,~ n+1,~ n+2,~ ...\}$, to be
  used as a filter base, while the usual topology on $\mathbb{R}$
  allows real intervals for filters.
  Thus continuity and derivability are perfectly defined and
  operational concepts in Fréchet's topology, in the same way
  as they are with the usual topology.}, is expressed as follows by
the scalar product:

$$dq = \frac{\partial q}{\partial n}dn + \frac{\partial q}{\partial
  n_a}dn_a + \frac{\partial q}{\partial n_b}dn_b + \frac{\partial
  q}{\partial n_{a \wedge \overline{b}}}dn_{a \wedge \overline{b}} = grad~q \cdot dM\footnote{By a mechanistic metaphor, we will say that $dq$ is the elementary work of $q$ for a displacement $dM$ (see chapter 14 of this book).}$$

where $M$ is the point with coordinates $(n,~ n_a,~ n_b,~ n_{a \wedge
  \overline{b}})$ of the scalar and vector field $C$, $dM$ is the
vector whose components are the differential increments of these
occurrence variables, and $grad~ q$ is the vector whose components are
the partial derivatives of $q$ with respect to these occurrence
variables.

The differential of the function $q$ therefore appears as the scalar
product of its gradient and the elementary displacement $dM$ on the
surface representing the variations of the function $q(n,~ n_a,~ n_b,~ n_{a \wedge
  \overline{b}})$. Thus, the gradient of $q$ represents its own
variations according to those of its components, the 4 cardinals of
the sets $E$, $A$, $B$ and $A\cap \overline{B}$. It
indicates the direction and sense of growth or decrease of $q$ in
the space of dimension 4. Remember that it is carried by the normal to
the level surface $q~ =~$ constant.

If we want to study how $q$ varies according to $ n_{\overline{b}}$,
we just have to replace $n_b$ by $n-n_b$ and therefore change the sign
of the partial derivative with respect to $n_b$. In fact, the
interest of this differential lies in estimating the increase
(positive or negative) of $q$, which we denote $\Delta q$, in relation to
the respective variations $\Delta n$, $\Delta n_a$, $\Delta n_b$ and
$\Delta n_{a \wedge
  \overline{b}}$. So we have:


$$\Delta q= \frac{\partial q}{\partial n} \Delta n + \frac{\partial
  q}{\partial n_a} \Delta n_a + \frac{\partial
  q}{\partial n_b} \Delta n_b + \frac{\partial
  q}{\partial n_{a \wedge
    \overline{b}}} \Delta n_{a \wedge
  \overline{b}} +o(\Delta q)$$

where $o(\Delta q)$ is a first-order infinitesimal.
Let us examine the partial derivatives with respect to $n_b$ and to $n_{a \wedge
  \overline{b}}$, the number of counter-examples. We get:

$$ \frac{\partial
  q}{\partial n_b} = \frac{1}{2} n_{a \wedge
  \overline{b}} \left(\frac{n_a}{n}\right)^{-\frac{1}{2}} (n-n_b)^{-\frac{3}{2}}
+ \frac{1}{2} \left(\frac{n_a}{n}\right)^{\frac{1}{2}} (n-n_b)^{-\frac{1}{2}} > 0 $$


$$ \frac{\partial
  q}{\partial n_{a \wedge
    \overline{b}}} = \frac{1}{\sqrt{\frac{n_a n_{\overline{b}}}{n}}}
= \frac{1}{\sqrt{\frac{n_a (n-n_b)}{n}}} > 0 $$

Thus, if the increments $\Delta n_b$ and $\Delta n_{a \wedge
  \overline{b}}$ are positive, the increment of $q(a,\overline{b})$ is
also positive. This is interpreted as follows: if the number of
instances of $b$ and the number of counter-examples of the implication
increase, then, for $n$ and $n_a$ constant, the intensity of
implication decreases. In other words, this intensity of implication
is larger at the observed values $n_b$ and $ n_{a \wedge
  \overline{b}}$ than at the values $n_b+\Delta n_b$ and $n_{a \wedge
  \overline{b}}+ \Delta n_{a \wedge
  \overline{b}}$.
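These partial derivatives can be checked numerically; the short sketch
below (in Python, with purely hypothetical counts) compares a
finite-difference estimate of $\frac{\partial q}{\partial n_{a \wedge
    \overline{b}}}$ with the closed form above.

\begin{verbatim}
from math import sqrt

def q_index(n, n_a, n_b, n_counter):
    """Implication index q(a, not-b), with the arguments treated as reals."""
    lam = n_a * (n - n_b) / n
    return (n_counter - lam) / sqrt(lam)

# hypothetical counts
n, n_a, n_b, n_ab_bar = 100, 20, 60, 4

# finite-difference estimate of the partial derivative with respect to the
# number of counter-examples, compared with 1 / sqrt(n_a * (n - n_b) / n)
h = 1e-6
numeric = (q_index(n, n_a, n_b, n_ab_bar + h) - q_index(n, n_a, n_b, n_ab_bar)) / h
analytic = 1.0 / sqrt(n_a * (n - n_b) / n)
print(numeric, analytic)   # both close to 0.3536
\end{verbatim}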