From: Raphaël Couturier
Date: Tue, 27 Aug 2019 13:44:14 +0000 (+0200)
Subject: update
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/book_chic.git/commitdiff_plain/5cbdb1c4043f8d808549175ef543cf67ab8ca12a

update
---

diff --git a/chapter2.tex b/chapter2.tex
index b50c9dd..3ffdbe5 100644
--- a/chapter2.tex
+++ b/chapter2.tex
@@ -313,8 +313,8 @@ $a\Rightarrow b$, for $n_a\leq n_b$ and $n_b \neq n$, is then defined from
 the index $q(a,\overline{b})$ by:
 \definition
-The implication intensity that measures the inductive quality of a
-over b is:
+The implication intensity that measures the inductive quality of $a$
+over $b$ is:
 $$\varphi(a,b)=1-Pr[Q(a,\overline{b})\leq q(a,\overline{b})] =
 \frac{1}{\sqrt{2 \pi}} \int^{\infty}_{ q(a,\overline{b})} e^{-\frac{t^2}{2}} dt,~ if~ n_b \neq n$$
@@ -1181,3 +1181,46 @@ $b$ & & & & & \\ \hline
 \label{chap2fig5}
 \end{figure}
+
+One of the difficulties related to the graphical representation is that the graph is not planar.
+The algorithm that constructs it must take this into account and, in particular, must "straighten" the paths of the graph so that it remains readable for the expert who will analyze it.
+
+The number of arcs in the graph can be reduced (or increased) if we raise (or lower) the acceptance threshold of the rules, that is, the level of confidence required of the selected rules.
+Correspondingly, arcs appear or disappear as the threshold varies.
+Let us recall that this graph is necessarily acyclic and that it is not a lattice since, for example, the variable $a$ does not imply the variable ($a$ or $\neg a$) whose support is $E$.
+A fortiori, it cannot be a Galois lattice.
+Options of the CHIC software, which automates data processing with SIA, make it possible to delete variables at will, to move their images in the graph in order to decrease the arcs, or to focus on certain variables, each taken as the vertex of a kind of "cone" whose two "nappes" are made up respectively of the "parent" variables and the "child" variables of this vertex variable.
+We refer to the ends of the arcs as "nodes". A node in a given graph consists of a single variable or a conjunction of variables.
+The passage from a node $S_1$ to a node $S_2$ is called a "transition" and is represented by an arc in the graph.
+The upper nappe of the cone whose vertex is the variable $a$, called the nodal variable, is made up of the "parents" of $a$, that is, in the "causal" sense, the causes of $a$; the lower nappe, on the other hand, is made up of the "children" of $a$ and therefore, still in the causal sense, the consequences or effects of $a$.
+The expert in the field under study should pay particular attention to these configurations, which are rich in information.
+See, for example, \cite{Lahanierc} and the two implicative cones below (Figures~\ref{chap2fig6} and~\ref{chap2fig7}).
+
+\begin{figure}[htbp]
+ \centering
+\includegraphics[scale=0.75]{chap2fig6.png}
+\caption{Implicative cone.}
+
+\label{chap2fig6}
+\end{figure}
+
+\begin{figure}[htbp]
+ \centering
+\includegraphics[scale=0.75]{chap2fig7.png}
+\caption{Implicative cone centered on a variable.}
+
+\label{chap2fig7}
+\end{figure}
+
+
+\section{Reduction in the number of variables}
+\subsection{Motivation}
+
+
+As soon as the number of variables becomes excessive, most of the available techniques become impractical.
+In particular, when an implicative analysis is carried out by calculating association rules~\cite{Agrawal}, the number of rules discovered undergoes a combinatorial explosion with the number of variables and quickly becomes unmanageable for a decision-maker, as soon as conjunctions of variables are requested.
+In this context, a preliminary reduction in the number of variables is necessary.
+
+Thus, \cite{Ritschard} proposed an efficient heuristic to reduce both the number of rows and the number of columns of a table, using an association measure as a quasi-optimal criterion for controlling the heuristic.
+However, to our knowledge, other research studies do not take into account, in their reduction criteria, the type of situation that motivates the grouping of rows or columns, whether the analyst's aim is the search for similarity, dissimilarity, implication, etc., between variables.
+
diff --git a/figures/chap2fig6.png b/figures/chap2fig6.png
new file mode 100644
index 0000000..2454899
Binary files /dev/null and b/figures/chap2fig6.png differ
diff --git a/figures/chap2fig7.png b/figures/chap2fig7.png
new file mode 100644
index 0000000..b674467
Binary files /dev/null and b/figures/chap2fig7.png differ
diff --git a/references.tex b/references.tex
index 650bc39..c98a3d5 100644
--- a/references.tex
+++ b/references.tex
@@ -277,6 +277,9 @@ Cépaduès Ed. Toulouse, p. 195-208, ISBN: 978.2.36493.577.8.
 didactique des phénomènes d’ostension et de contradiction, Thèse de doctorat de l’Université de Rennes 1.
+\bibitem{Ritschard} Ritschard G., Marcellin S., Zighed D.A. (2009), Arbre de décision pour données déséquilibrées : sur la complémentarité de l’intensité d’implication et de l’entropie décentrée, Une méthode d'analyse de données pour la recherche de causalités, sous la direction de Régis Gras, réd. invités R. Gras, J.C. Régnier, F. Guillet, Cépaduès Ed. Toulouse, p. 207-219.
+
+
 \bibitem{Ndonga} Ndong L. (2008) La place des concepts de la didactique des sciences dans la formation des professeurs de lycée et collège de Sciences de la Vie et de la Terre en France et au Gabon. Thèse