Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.


Published on August 3, 2009

Author: abifet

Source: slideshare.net

Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos (Adaptive Data Mining and Learning Methods for Data Streams). Albert Bifet. LARCA: Laboratori d'Algorismica Relacional, Complexitat i Aprenentatge, Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya. June 2009, Santander.

Data Mining and Learning from Data Streams with Concept Drift
Extract information from a potentially infinite sequence of time-varying data, using few resources, by means of ADWIN (ADaptive sliding WINdow): an adaptive sliding window with no parameters.
[Slide image: Salvador Dalí, "The Disintegration of the Persistence of Memory", 1952-54.]

Massive Data Mining
Data explosion in recent years: 100 million searches per day; 20 million transactions per day; 1 billion credit-card transactions per month; 3 billion phone calls per day in the USA; 30 billion e-mails and 1 billion SMS per day; IP network traffic of 1 billion packets per hour per router.

Massive Data Mining
Massive data, 2007. Digital universe: 281 exabytes (billions of gigabytes); the amount of information created exceeded the available storage for the first time.
Green computing: the study and practice of using computing resources efficiently.
Algorithmic efficiency: one of the main routes to green computing.

Massive Data Mining
Koichi Kawana: simplicity means achieving maximum effect with minimum means.
Donald Knuth: "... we should make use of the idea of limited resources in our own education. We can all benefit by doing occasional 'toy' programs, when artificial restrictions are set up, so that we are forced to push our abilities to the limit."

Introduction: Data Streams
Data streams: a potentially infinite sequence; a large volume of data, hence sublinear space; a high arrival rate, hence sublinear time per example. Once an element of a data stream has been processed, it is discarded or archived.
Puzzle: find the missing number. Let π be a permutation of {1, ..., n}, and let π⁻¹ be π with one element missing; the values π⁻¹[i] arrive in increasing order. Task: determine the missing number.
- Using an n-bit vector to record every number seen takes O(n) space.
- A data-stream solution needs only O(log n) space: store n(n+1)/2 − ∑_{j≤i} π⁻¹[j].
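A minimal Python sketch of the O(log n) solution (function and variable names are illustrative; the slide gives only the formula). Note the sum trick works whatever the arrival order, not only increasing:

```python
# Streaming solution sketch: keep only n and a running sum, i.e. O(log n) bits.

def find_missing(stream, n):
    """Return the single value of {1, ..., n} absent from `stream`."""
    remainder = n * (n + 1) // 2      # sum of the complete permutation
    for x in stream:                  # one pass, constant work per item
        remainder -= x
    return remainder                  # what is left is the missing number

# Example: a permutation of {1, ..., 5} with 3 missing.
print(find_missing([1, 2, 4, 5], 5))  # -> 3
```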

Introduction: Data Streams
Problem: given n unordered numbers, e.g. 12, 35, 21, 42, 5, 43, 57, 2, 45, 67, find a number that lies in the upper half of the sorted list (here 42, 43, 45, 57, 67; the lower half is 2, 5, 12, 21, 35).
Algorithm: pick k numbers at random and return the largest of them.
Analysis: the answer is wrong only if all k sampled numbers fall in the lower half, which happens with probability (1/2)^k. To bound the failure probability by δ we use k = log(1/δ) samples.
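A short sketch of this sampling algorithm, assuming random access to the values for brevity (a true one-pass stream would combine this with reservoir sampling; sampling without replacement only lowers the failure probability):

```python
import math
import random

def upper_half_candidate(values, delta=0.01):
    """Return a value in the upper half of the sorted list,
    wrong with probability at most delta."""
    k = math.ceil(math.log2(1 / delta))            # k = log(1/delta) samples
    sample = random.sample(values, min(k, len(values)))
    return max(sample)   # fails only if every sample is in the lower half: (1/2)^k

print(upper_half_candidate([12, 35, 21, 42, 5, 43, 57, 2, 45, 67], delta=0.01))
```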

Outline
1. Introduction
2. ADWIN: Concept Drift Mining
3. Hoeffding Adaptive Tree
4. Conclusions

Data Streams
At any time t in the data stream, we would like the per-item processing time and storage to be simultaneously O(log^k(N, t)).
Approximation algorithms: small error rate with high probability. An algorithm (ε, δ)-approximates F if it outputs F̃ for which Pr[|F̃ − F| > εF] < δ.

Data Streams: Approximation Algorithms
Frequency moments of a stream A = {a_1, ..., a_N}:
F_k = ∑_{i=1}^{v} f_i^k
where f_i is the frequency of i in the sequence and k ≥ 0.
F_0: number of distinct elements in the sequence.
F_1: length of the sequence.
F_2: self-join size, the repeat rate, or Gini's index of homogeneity.
Sketches can approximate F_0, F_1, F_2 in O(log v + log N) space.
Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. 1996.
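For concreteness, a sketch computing the moments exactly from the definition; this needs O(v) counters, which is precisely what the AMS sketches avoid by approximating in O(log v + log N) space:

```python
from collections import Counter

def frequency_moment(stream, k):
    """Exact F_k = sum over distinct i of f_i^k (f_i = frequency of i)."""
    counts = Counter(stream)
    if k == 0:
        return len(counts)                    # F0: distinct elements
    return sum(f ** k for f in counts.values())

A = [1, 2, 1, 3, 1, 2]
print(frequency_moment(A, 0))  # 3  distinct elements
print(frequency_moment(A, 1))  # 6  = length of the sequence
print(frequency_moment(A, 2))  # 14 = 3^2 + 2^2 + 1^2, the self-join size
```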

Data Streams: Approximation Algorithms
Sliding window over a bit stream (e.g. 1011000111 1010101, the window advancing one bit at a time): we can maintain simple statistics over sliding windows using O((1/ε) log² N) space, where N is the length of the sliding window and ε is the accuracy parameter.
M. Datar, A. Gionis, P. Indyk, and R. Motwani. Maintaining stream statistics over sliding windows. 2002.
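A simplified sketch of the bucket-merging idea behind the Datar et al. result, counting 1s in a sliding window; the class and parameter names are illustrative and corner cases are ignored:

```python
class ExpHistogram:
    """Simplified exponential histogram: count the 1s among the last
    `window` bits keeping only O(k log N) buckets, at the price of a
    relative error of roughly 1/k contributed by the oldest bucket."""

    def __init__(self, window, k=4):
        self.window, self.k = window, k
        self.t = 0
        self.buckets = []   # [timestamp of newest 1, size], oldest first

    def add(self, bit):
        self.t += 1
        # Expire the oldest bucket once all of its 1s leave the window.
        if self.buckets and self.buckets[0][0] <= self.t - self.window:
            self.buckets.pop(0)
        if bit:
            self.buckets.append([self.t, 1])
            # Allow at most k+1 buckets per size; merge the two oldest otherwise.
            size = 1
            while sum(1 for b in self.buckets if b[1] == size) > self.k + 1:
                i = next(j for j, b in enumerate(self.buckets) if b[1] == size)
                self.buckets[i + 1][1] *= 2   # absorb into the next-oldest bucket
                self.buckets.pop(i)
                size *= 2

    def count(self):
        # Exact except for the oldest bucket, counted at half its size.
        if not self.buckets:
            return 0
        return sum(b[1] for b in self.buckets) - self.buckets[0][1] // 2

import random
eh = ExpHistogram(window=100)
bits = [int(random.random() < 0.4) for _ in range(2000)]
for b in bits:
    eh.add(b)
print(eh.count(), "vs exact", sum(bits[-100:]))
```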

Outline
1. Introduction
2. ADWIN: Concept Drift Mining
3. Hoeffding Adaptive Tree
4. Conclusions

Data Mining Algorithms with Concept Drift
[Diagram: with no concept drift, the DM algorithm maps input to output through a static model fed by simple counters (Counter1 ... Counter5). With concept drift, each counter is replaced by an estimator (Estimator1 ... Estimator5) and a change detector monitors the stream to update the model.]

Time Change Detectors and Predictors: A General Framework
[Diagram: the input x_t feeds an Estimator that outputs an estimation; a Change Detector raises an alarm when the distribution changes; a Memory module supports both components.]

Window Management Models
W = 101010110111111
- Equal & fixed-size subwindows: 1010 1011011 1111 [Kifer+ 04]
- Total window against a subwindow: 10101011011 | 1111 [Gama+ 04]
- Equal-size adjacent subwindows: 1010101 1011 1111 [Dasu+ 06]
- ADWIN: all adjacent subwindows, i.e. 1 | 01010110111111, 10 | 1010110111111, ..., 10101011011111 | 1

Algorithm ADWIN
ADWIN: ADaptive WINdowing Algorithm
Initialize window W
for each t > 0:
  W ← W ∪ {x_t} (i.e., add x_t to the head of W)
  repeat: drop elements from the tail of W
  until |µ̂_{W0} − µ̂_{W1}| < ε_c holds for every split of W into W = W0 · W1
  output µ̂_W

Example with W = 101010110111111: ADWIN examines every split W = W0 · W1, from W0 = 1, W1 = 01010110111111 through W0 = 10101011, W1 = 0111111 and beyond. At the split W0 = 101010110, W1 = 111111 it finds |µ̂_{W0} − µ̂_{W1}| ≥ ε_c: CHANGE DETECTED. It then drops elements from the tail of W, leaving W = 01010110111111.
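A naive Python sketch of this loop. The real algorithm (ADWIN2 in [BG07]) compresses W with an exponential histogram so that time and memory are logarithmic; here the whole window is kept for clarity, and the cut threshold ε_c below is one common instantiation, assumed rather than quoted from the slide:

```python
import math, random

class AdwinNaive:
    """Naive sketch of ADWIN: exact window, every split tested per item."""

    def __init__(self, delta=0.002):
        self.delta = delta
        self.w = []          # the window, oldest element first

    def add(self, x):
        self.w.append(x)     # W <- W ∪ {x_t}
        shrunk = False
        while self._has_cut():
            self.w.pop(0)    # drop elements from the tail (the oldest)
            shrunk = True
        return shrunk        # True signals "change detected"

    def _has_cut(self):
        n, total, s0 = len(self.w), sum(self.w), 0.0
        for n0 in range(1, n):             # every split W = W0 · W1
            s0 += self.w[n0 - 1]
            n1 = n - n0
            mu0, mu1 = s0 / n0, (total - s0) / n1
            m = 1.0 / (1.0 / n0 + 1.0 / n1)        # harmonic-mean term
            eps_c = math.sqrt(math.log(4.0 * n / self.delta) / (2.0 * m))
            if abs(mu0 - mu1) >= eps_c:
                return True
        return False

    def estimate(self):
        return sum(self.w) / len(self.w) if self.w else 0.0

# A Bernoulli stream whose mean jumps from 0.2 to 0.8 at t = 500.
adw = AdwinNaive()
for t in range(1000):
    if adw.add(int(random.random() < (0.2 if t < 500 else 0.8))):
        print("change detected around t =", t)
        break
```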

Algorithm ADWIN [BG07]
ADWIN has rigorous guarantees (theorems): on the rate of false positives, on the rate of false negatives, and on the relation between the size of the current window and the rate of change. Other methods in the literature ([Gama+ 04], [Widmer+ 96], [Last 02]) do not provide rigorous guarantees.

Algorithm ADWIN [BG07]
Theorem. At every time step we have:
1. (Few false positives guarantee) If µ_t remains constant within W, the probability that ADWIN shrinks the window at this step is at most δ.
2. (Few false negatives guarantee) If for any partition of W into two parts W0 · W1 (where W1 contains the most recent items) we have |µ_{W0} − µ_{W1}| > ε, and if
ε ≥ 4 · √( (3 · max{µ_{W0}, µ_{W1}} / min{n_0, n_1}) · ln(4n/δ) )
then with probability 1 − δ ADWIN shrinks W to W1, or shorter.

Outline
1. Introduction
2. ADWIN: Concept Drift Mining
3. Hoeffding Adaptive Tree
4. Conclusions

Classification Example
Data set describing e-mail features for deciding whether a message is spam:

Contains "Money" | Domain type | Has attach. | Time received | spam
yes              | com         | yes         | night         | yes
yes              | edu         | no          | night         | yes
no               | com         | yes         | night         | yes
no               | edu         | no          | day           | no
no               | com         | no          | day           | no
yes              | cat         | no          | day           | yes

Assume we have to classify the following new instance:

Contains "Money" | Domain type | Has attach. | Time received | spam
yes              | edu         | yes         | day           | ?

Decision Trees
Basic induction strategy:
- A ← the "best" decision attribute for the next node
- Assign A as the decision attribute for the node
- For each value of A, create a new descendant of the node
- Sort training examples to leaf nodes
- If training examples are perfectly classified, then STOP; else iterate over the new leaf nodes

Hoeffding Tree / CVFDT
Hoeffding Tree: VFDT. Pedro Domingos and Geoff Hulten. Mining high-speed data streams. 2000.
With high probability, it constructs a model identical to the one a traditional (greedy) method would learn, with theoretical guarantees on the error rate.
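The slide only names the guarantee; a small sketch of the Hoeffding-bound split test at the heart of VFDT may help. For information gain over c classes the range R is log2(c); 1.0 is used here for simplicity, and the numbers are illustrative:

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Hoeffding's inequality: after n observations of a variable with
    range R, the sample mean is within eps of the true mean w.p. 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(g_best, g_second, n, delta=1e-7, value_range=1.0):
    """Split once the gain gap between the two best attributes exceeds eps:
    then, with probability 1 - delta, the chosen attribute is the same one
    a batch learner would pick given all the data."""
    return (g_best - g_second) > hoeffding_bound(value_range, delta, n)

print(should_split(0.35, 0.25, n=200))    # False: not enough evidence yet
print(should_split(0.35, 0.25, n=1000))   # True: the gap exceeds the bound
```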

VFDT / CVFDT
Concept-adapting Very Fast Decision Trees: CVFDT. G. Hulten, L. Spencer, and P. Domingos. Mining time-changing data streams. 2001.
It keeps its model consistent with a sliding window of examples, constructs "alternative branches" in preparation for changes, and switches tree branches when the alternative branch becomes more accurate.

Decision Trees: CVFDT
There are no theoretical guarantees on the error rate of CVFDT.
CVFDT parameters:
1. W: the example window size.
2. T0: number of examples used to check at each node whether the splitting attribute is still the best.
3. T1: number of examples used to build the alternate tree.
4. T2: number of examples used to test the accuracy of the alternate tree.

Decision Trees: Hoeffding Adaptive Tree
Hoeffding Adaptive Tree:
- replaces frequency-statistics counters by estimators, so no window of examples needs to be stored: the estimators maintain the required statistics;
- changes the way the substitution of alternate subtrees is checked, using a change detector with theoretical guarantees.
Summary: theoretical guarantees, and no parameters.

What is MOA?
{M}assive {O}nline {A}nalysis is a framework for online learning from data streams, closely related to WEKA. It includes a collection of offline and online algorithms as well as tools for evaluation: boosting and bagging, and Hoeffding Trees with and without Naïve Bayes classifiers at the leaves.

Ensemble Methods
http://www.cs.waikato.ac.nz/~abifet/MOA/
New ensemble methods:
- ADWIN bagging: when a change is detected, the worst classifier is removed and a new classifier is added.
- Adaptive-Size Hoeffding Tree bagging.
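A schematic sketch of the ADWIN-bagging policy just described; this is not MOA's API. MajorityLearner and DriftStub are hypothetical stand-ins (a real version would use Hoeffding Trees and ADWIN itself as the detector), and Poisson(1) weighting is the standard online-bagging device:

```python
import math, random

def poisson1():
    """k ~ Poisson(1), Knuth's inversion method (online bagging weight)."""
    L, k, p = math.exp(-1.0), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

class MajorityLearner:
    """Trivial stand-in learner: predicts the majority class seen so far."""
    def __init__(self):
        self.counts = {}
    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1
    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else 0

class DriftStub:
    """Stand-in change detector over a 0/1 error stream; a real
    implementation would use ADWIN here."""
    def __init__(self):
        self.errs = []
    def add(self, e):
        self.errs = (self.errs + [e])[-200:]
        return (len(self.errs) == 200
                and abs(sum(self.errs[:100]) - sum(self.errs[100:])) > 40)
    def estimate(self):
        return sum(self.errs) / len(self.errs) if self.errs else 0.0

class AdwinBagging:
    """When any member's detector fires, the member with the worst
    recent error is replaced by a fresh classifier."""
    def __init__(self, make_learner, make_detector, n_models=10):
        self.make_learner, self.make_detector = make_learner, make_detector
        self.models = [make_learner() for _ in range(n_models)]
        self.detectors = [make_detector() for _ in range(n_models)]

    def train(self, x, y):
        change = False
        for model, det in zip(self.models, self.detectors):
            for _ in range(poisson1()):        # online bagging: Poisson(1) weight
                model.learn(x, y)
            change |= det.add(int(model.predict(x) != y))
        if change:
            worst = max(range(len(self.models)),
                        key=lambda i: self.detectors[i].estimate())
            self.models[worst] = self.make_learner()
            self.detectors[worst] = self.make_detector()

bag = AdwinBagging(MajorityLearner, DriftStub, n_models=5)
for t in range(2000):                          # label distribution drifts at t = 1000
    bag.train(x=None, y=int(random.random() < (0.2 if t < 1000 else 0.8)))
```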

Outline
1. Introduction
2. ADWIN: Concept Drift Mining
3. Hoeffding Adaptive Tree
4. Conclusions

Conclusions
Adaptive and parameter-free methods based on replacing frequency-statistics counters by ADWIN:
- no window of examples needs to be stored, since the ADWIN instances maintain the required statistics;
- ADWIN serves as a change detector with theoretical guarantees.
Summary: theoretical guarantees, no parameters needed, higher accuracy, less space needed.
