1 %%%%%%%%%%%%%%%%%%%%% chapter.tex %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
5 % Use this file as a template for your own input.
7 %%%%%%%%%%%%%%%%%%%%%%%% Springer-Verlag %%%%%%%%%%%%%%%%%%%%%%%%%%
8 %\motto{Use the template \emph{chapter.tex} to style the various elements of your chapter content.}
9 \chapter{From the founding situations of the SIA to its formalization}
\label{intro}
Starting from situations in mathematics didactics, the statistical
implicative analysis (SIA) method has developed as problems have been
encountered and addressed.
20 Its main objective is to structure data crossing subjects and
21 variables, to extract inductive rules between variables and, based on
22 the contingency of these rules, to explain and therefore forecast in
23 various fields: psychology, sociology, biology, etc.
It is for this purpose that the concepts of intensity of implication,
class cohesion, implication-inclusion, significance of hierarchical
levels, contribution of supplementary variables, etc., were developed.
27 Similarly, the processing of binary variables (e.g., descriptors) is
28 gradually being supplemented by the processing of modal, frequency
29 and, recently, interval and fuzzy variables.
Human operative knowledge is mainly composed of two components: facts,
and rules between facts or between rules themselves.
It is learning that, through culture and personal experience, allows
each individual to gradually develop these forms of knowledge, despite
the regressions, reappraisals and ruptures brought about by decisive
new information.
We know, however, that these ruptures contribute dialectically to the
consolidation of knowledge.
Rules are formed inductively in a relatively stable way as soon as the
number of their successes, in terms of explanatory or anticipatory
quality, reaches a certain level of confidence beyond which they are
likely to be applied.
On the other hand, if this (subjective) level is not reached, the
individual's cognitive economy will lead him to resist, at first,
abandoning or criticizing the rule.
Indeed, it is costly to replace the initial rule with another when a
small number of refutations appears, since the rule has already been
reinforced by a large number of confirmations.
52 An increase in this number of negative instances, depending on the
53 robustness of the level of confidence in the rule, may lead to its
54 readjustment or even abandonment.
Laurent Fleury~\cite{Fleury}, in his thesis, rightly cites the example
(which Régis repeats) of the highly admissible rule: "all ...".
This very robust rule will not be abandoned on the observation of one
or a few counter-examples, especially since it would not fail to be
quickly reinforced afterwards.
Thus, contrary to what is legitimate in mathematics, where no rule
(theorem) suffers an exception and determinism is total, rules in the
human sciences, and more generally in the so-called "soft" sciences,
are acceptable and therefore operative as long as the number of
counter-examples remains "bearable" in view of the frequency of the
situations where they prove positive and effective.
The problem in data analysis is then to establish a relatively
consensual numerical criterion to define the notion of a level of
confidence that can be adjusted to the user's level of requirement.
The fact that it is based on statistics is not surprising.
74 That it has a property of non-linear resistance to noise (weakness of
75 the first counter-example(s)) may also seem natural, in line with the
76 "economic" meaning mentioned above.
That it collapses if counter-examples are repeated must also guide our
choice in modeling the desired criterion.
79 This text presents the epistemological choice we have made.
80 As such it is therefore refutable, but the number of situations and
81 applications where it has proved relevant and fruitful leads us to
82 reproduce its genesis here.
84 \section{Introduction}
86 Different theoretical approaches have been adopted to model the
87 extraction and representation of imprecise (or partial) inference
88 rules between binary variables (or attributes or characters)
89 describing a population of individuals (or subjects or objects).
But the initial situations and the nature of the data do not change
the problem: it is a question of discovering non-symmetrical inductive
rules to model relationships of the type "if $a$ then almost $b$".
94 This is, for example, the option of Bayesian networks~\cite{Amarger}
95 or Galois lattices~\cite{Simon}.
More often than not, however, since correlation and the ${\chi}^2$
test are unsuitable because of their symmetric nature, conditional
probability~\cite{Loevinger,Agrawal,Grasn} remains the driving force
behind the definition of the association, even when the selected
association index is multivariate~\cite{Bernard}.
Moreover, to our knowledge, the various interesting developments most
often focus, on the one hand, on proposals for a partial implication
index for binary data~\cite{Lermana} or~\cite{Lallich}; on the other
hand, this notion is not extended to other types of variables, nor to
extraction and representation in the form of a rule graph or a
hierarchy of meta-rules, structures aiming at access to the meaning of
a whole that is not reduced to the sum of its
parts~\cite{Seve}\footnote{This is what the philosopher L. Sève
  emphasizes: "... in the non-additive, non-linear passage of the
  parts to the whole, there are properties that are in no way
  precontained in the parts and which cannot therefore be explained by
  them."}, i.e. operating as a complex non-linear system.
116 For example, it is well known, through usage, that the meaning of a
117 sentence does not completely depend on the meaning of each of the
118 words in it (see the previous chapter, point 4).
Let us return to what we believe is fertile in the approach we are
developing.
122 It would seem that, in the literature, the notion of implication index
123 is also not extended to the search for subjects and categories of
124 subjects responsible for associations.
Nor is this responsibility quantified so as to lead to a reciprocal
structuring of all the subjects, conditioned by their relationships to
the variables.
We propose these extensions here, after recalling the founding
situation and its first formalizations.
132 \section{Implication intensity in the binary case}
134 \subsection{Fundamental and founding situation}
A set $E$ of objects or subjects is crossed with variables
(characters, criteria, successes, ...) and interrogated as follows:
"to what extent can we consider that instantiating variable\footnote{Throughout the book, the word "variable" refers either to an isolated variable in a premise (example: "to be blonde") or to a conjunction of isolated variables (example: "to be blonde and to be under 30 years old and to live in Paris").} $a$
implies instantiating variable $b$?
In other words, do the subjects tend to be $b$ if we know that they
are $a$?"
In situations in the natural, human or life sciences, where theorems
(if $a$ then $b$) in the deductive sense of the term cannot be
established because of the exceptions that taint them, it is important
for the researcher and the practitioner to "mine" the data in order to
identify sufficiently reliable rules (kinds of "partial theorems",
inductions) so as to conjecture\footnote{"The exception proves the rule", as the popular saying goes, in the sense that there would be no exceptions if there were no rule.} a possible causal relationship or a
genesis, to describe and structure a population, and to assume a
certain stability for descriptive and, if possible, predictive
purposes.
But this mining requires the development of methods to guide it and to
free it from trial and error and empiricism.
155 \subsection{Mathematization}
To do this, following the example of I.C. Lerman's similarity
measurement method~\cite{Lerman,Lermanb} and the classic approach of
non-parametric tests (e.g. Fisher, Wilcoxon, etc.), we
define~\cite{Grasb,Grasf} the confirmatory quality measure of the
implicative relationship $a \Rightarrow b$ from the implausibility of
the occurrence, in the data, of the number of cases that invalidate
it, i.e. for which $a$ is verified without $b$ being verified. This
amounts to comparing the number of counter-examples actually observed
with the number theoretically expected if only chance
occurred\footnote{"...[in agreement with Jung] if the frequency of
  coincidences does not significantly exceed the probability that they
  can be calculated by attributing them solely to chance, to the
  exclusion of hidden causal relationships, we certainly have no
  reason to suppose the existence of such relationships.",
  H. Atlan~\cite{Atlana}}.
But when analyzing data, it is this gap that we take into account, not
the statement of rejection or acceptance of a null hypothesis.
This measure is relative to the numbers of data verifying $a$ and not
verifying $b$, respectively, the very circumstance in which the
implication is put at fault.
176 It quantifies the expert's "astonishment" at the unlikely small number
177 of counter-examples in view of the supposed independence between the
178 variables and the numbers involved.
Let us make this precise. A finite set $V$ of $v$ variables is given:
$a$, $b$, $c$, ...
In the classical paradigmatic situation initially retained, the data
concern performance (success or failure) on the items of a
questionnaire.
With a finite set $E$ of $n$ subjects $x$ are associated, by abuse of
notation, functions of the type $x \rightarrow a(x)$, where $a(x) = 1$
(or $a(x) = true$) if $x$ satisfies or possesses the character $a$,
and $a(x) = 0$ (or $a(x) = false$) otherwise.
188 In artificial intelligence, we will say that $x$ is an example or an
189 instance for $a$ if $a(x) = 1$ and a counter-example if not.
The rule $a \Rightarrow b$ is logically true if, for any $x$ in the
sample, $b(x)$ is null only if $a(x)$ is also null; in other words, if
the set $A$ of the $x$ for which $a(x)=1$ is contained in the set $B$
of the $x$ for which $b(x)=1$.
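
To fix ideas, here is a minimal sketch in Python of this set-theoretic
reading, with invented toy data; the rule $a \Rightarrow b$ is
logically true exactly when the set of counter-examples is empty.

\begin{verbatim}
# Minimal sketch (invented toy data): the binary variables a and b
# are identified with the sets A and B of subjects satisfying them.
E = {"x1", "x2", "x3", "x4", "x5", "x6"}
A = {"x1", "x2", "x3", "x4"}             # subjects with a(x) = 1
B = {"x1", "x2", "x3", "x5"}             # subjects with b(x) = 1

counter_examples = A - B                 # A inter not-B: a(x) = 1, b(x) = 0

# a => b is logically true iff A is contained in B,
# i.e. iff there is no counter-example.
print("A subset of B:", A <= B)                       # False
print("counter-examples:", sorted(counter_examples))  # ['x4']
\end{verbatim}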
However, this strict inclusion is only exceptionally observed in the
experiments encountered in practice.
In the case of a knowledge questionnaire, we may indeed observe a few
rare students passing item $a$ and not passing item $b$, without this
contesting the tendency to pass item $b$ when item $a$ has been
passed.
With regard to the cardinals of $E$ (of size $n$), but also of $A$ (of
cardinal $n_a$) and $B$ (of cardinal $n_b$), it is therefore the
"weight" of the counter-examples (the elements of $A\cap
\overline{B}$) that must be taken into account in order to decide
statistically whether or not to keep the quasi-implication or
quasi-rule $a \Rightarrow b$. Thus, it is from the dialectic of
examples and counter-examples that the rule emerges, as the overcoming
of this opposition.
210 \subsection{Formalization}
212 To formalize this quasi-rule, we consider any two parts $X$ and $Y$ of
213 $E$, chosen randomly and independently (absence of a priori link
214 between these two parts) and of the same respective cardinals as $A$
and $B$. Let $\overline{Y}$ and $\overline{B}$ be the respective complements of $Y$ and $B$ in $E$, both of cardinal $n_{\overline{b}}= n-n_b$.
\textbf{Definition 1:} $a \Rightarrow b$ is acceptable at confidence level
219 $1-\alpha$ if and only if
220 $$Pr[Card(X\cap \overline{Y})\leq card(A\cap \overline{B})]\leq \alpha$$
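
For instance, with $\alpha = 0.05$, the quasi-rule is accepted when
random parts of the same sizes as $A$ and $B$ would yield as few
counter-examples as those observed in at most 5\% of cases.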
224 \includegraphics[scale=0.34]{chap2fig1.png}
225 \caption{The dark grey parts correspond to the counter-examples of the
226 implication $a \Rightarrow b$}
It is established~\cite{Lermanb} that, for a certain drawing process,
the random variable $Card(X\cap \overline{Y})$ follows the Poisson
distribution of parameter $\frac{n_a n_{\overline{b}}}{n}$.
We obtain this same result by proceeding differently, as follows.
Denote by $X$ (resp. $Y$) the random subset of binary transactions in
which $a$ (resp. $b$) would appear, independently, with frequency
$\frac{n_a}{n}$ (resp. $\frac{n_b}{n}$).
To specify how the transactions recorded in variables $a$ and $b$
(sets $A$ and $B$ respectively) are drawn, the following semantically
permissible assumptions are made regarding the observation of the
event $[a=1~ and~ b=0]$, whose set of occurrences $(A\cap
\overline{B})$\footnote{We henceforth denote by $\overline{v}$ the
  negation of the variable $v$ (or $not~ v$) and by $\overline{P}$ the
  complement of the part $P$ of $E$.} is the subset of transactions
that are counter-examples to the implication $a \Rightarrow b$:
\item h1: the waiting times of the event $[a~ and~ not~ b]$ are
  independent random variables;
252 \item h2: the law of the number of events occurring in the time
253 interval $[t,~ t+T[$ depends only on T;
\item h3: two such events cannot occur simultaneously.
It is then demonstrated (for example in~\cite{Saporta}) that the
number of events occurring during a period of fixed duration $n$
follows a Poisson distribution of parameter $c.n$, where $c$ is called
the rate of the occurrence process per unit of time.
Now, for each transaction assumed to be random, the event $[a=1]$ has
as probability the frequency $\frac{n_a}{n}$ and the event $[b=0]$ has
as probability the frequency $\frac{n_{\overline{b}}}{n}$; therefore,
under the hypothesis of the absence of an a priori link between $a$
and $b$ (independence), the joint event $[a=1~ and~ b=0]$ has a
probability estimated by the frequency
$\frac{n_a}{n}.\frac{n_{\overline{b}}}{n}$.

We can then estimate the rate $c$ of this event by $\frac{n_a}{n}.\frac{n_{\overline{b}}}{n}$.
Thus, for a duration of time $n$, the occurrences of the event $[a~ and~ not~ b]$ follow a Poisson distribution of parameter:
272 $$\lambda = \frac{n_a.n_{\overline{b}}}{n}$$
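
For instance, with invented toy values $n=100$, $n_a=40$ and $n_b=60$
(hence $n_{\overline{b}}=40$), chance alone would produce on average
$\lambda = \frac{40 \times 40}{100} = 16$ counter-examples.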
274 As a result, $Pr[Card(X\cap \overline{Y})= s]= e^{-\lambda}\frac{\lambda^s}{s!}$
Consequently, under the assumption of the absence of an a priori link
between $a$ and $b$, the probability that chance alone would lead to
at most as many counter-examples as those observed is:
280 $$Pr[Card(X\cap \overline{Y})\leq card(A\cap \overline{B})] =
281 \sum^{card(A\cap \overline{B})}_{s=0} e^{-\lambda}\frac{\lambda^s}{s!} $$
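
Numerically, this cumulative probability is just the distribution
function of a Poisson variable. Here is a minimal sketch in Python
(using SciPy; the values of $n$, $n_a$, $n_b$ and of the observed
number of counter-examples are invented for illustration):

\begin{verbatim}
from scipy.stats import poisson

# Invented toy values: 100 subjects, 40 satisfy a, 60 satisfy b.
n, n_a, n_b = 100, 40, 60
n_bbar = n - n_b                  # cardinal of not-b
lam = n_a * n_bbar / n            # lambda = n_a * n_bbar / n = 16.0

k = 8                             # observed card(A inter not-B), invented
p = poisson.cdf(k, lam)           # Pr[Card(X inter not-Y) <= k]

alpha = 0.05
print(f"lambda = {lam}, Pr = {p:.4f}")               # Pr ~ 0.0220
print("acceptable at level 1 - alpha:", p <= alpha)  # True
\end{verbatim}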
But other legitimate drawing processes lead to a binomial
distribution, or even a hypergeometric distribution (itself not
semantically suited to the situation because of its symmetry). Under
suitable convergence conditions, these two distributions finally
reduce to the Poisson distribution above (see the Annex to this
chapter).
If $n_{\overline{b}}\neq 0$, we center and reduce this Poisson variable:
$$Q(a,\overline{b})= \frac{Card(X \cap \overline{Y}) - \frac{n_a.n_{\overline{b}}}{n}}{\sqrt{\frac{n_a.n_{\overline{b}}}{n}}} $$
294 In the experimental realization, the observed value of
295 $Q(a,\overline{b})$ is $q(a,\overline{b})$.
It estimates the gap between the contingency $card(A\cap
\overline{B})$ and the value $\frac{n_a.n_{\overline{b}}}{n}$ that it
would have taken if $a$ and $b$ had been independent.
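
With the same invented toy values as above ($\lambda = 16$, $8$
observed counter-examples), the observed index is computed as follows:

\begin{verbatim}
from math import sqrt

lam, k = 16.0, 8                 # invented toy values from the sketch above
q = (k - lam) / sqrt(lam)        # centred and reduced counter-example count
print(f"q(a, not-b) = {q:.2f}")  # -2.00
\end{verbatim}

A markedly negative value of $q$ thus signals far fewer
counter-examples than independence would predict, in favour of the
quasi-rule $a \Rightarrow b$.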