Information about Lecture11 - neural networks

Published on March 2, 2009

Author: aorriols

Source: slideshare.net

Recap of Lectures 5-10. Data classification: decision trees (C4.5) and instance-based learners (kNN and CBR). Slide 2 Artificial Intelligence Machine Learning

Recap of Lectures 5-10. Data classification: probabilistic-based learners, $P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}$, and linear/polynomial classifiers. Slide 3

Today’s Agenda. Why neural networks? Looking into a brain. Neural networks. Starting from the beginning: perceptrons. Multi-layer perceptrons. Slide 4

Why Neural Networks? Brain vs. machines. Machines are tremendously faster than brains in well-defined problems: inverting matrices, solving differential equations, etc. Brains are tremendously faster and more accurate than machines in ill-defined methods or problems that require a lot of processing, for example recognizing the character of objects in TV. Let’s simulate our brains with artificial neural networks: massive parallelism, with neurons interchanging signals. Slide 5

Looking into a Brain. $10^{11}$ neurons of more than 20 different types; 0.001 seconds of neuron switching time; $10^{4}$ to $10^{5}$ connections per neuron; 0.1 seconds of scene recognition time. Slide 6

Artificial Neural Networks. Borrow some ideas from the nervous systems of animals: $a_i = g(in_i) = g\left(\sum_j W_{j,i}\, a_j\right)$. THE PERCEPTRON (McCulloch & Pitts). Slide 7
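The unit formula above can be sketched in a few lines of Python. This is only an illustration: the names `unit_output`, `sigmoid` and `step` are hypothetical, and the logistic function is an assumed choice for the activation g, which the slide does not fix.

```python
import math

def sigmoid(x):
    """Logistic activation: an assumed choice for g, not given on the slide."""
    return 1.0 / (1.0 + math.exp(-x))

def step(x):
    """Hard threshold activation: another possible choice for g."""
    return 1.0 if x >= 0 else 0.0

def unit_output(weights, inputs, g=sigmoid):
    """Compute a_i = g(in_i) with in_i = sum_j W_ji * a_j for one unit."""
    in_i = sum(w * a for w, a in zip(weights, inputs))
    return g(in_i)

# Illustrative weights and inputs: in_i = 0.5*1.0 + (-0.25)*2.0 = 0.0
a = unit_output([0.5, -0.25], [1.0, 2.0])
print(a)  # sigmoid(0) = 0.5
```

Passing `g=step` instead turns the same unit into a threshold device.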

Adaline: Adaptive Linear Element. An adaptive linear combiner cascaded with a hard-limiting quantizer. The linear output is transformed to binary by means of a threshold device. Training = adjusting the weights. Activation functions. Slide 8

Adaline. Note that Adaline implements the function $f(\vec{x}, \vec{w}) = w_0 + \sum_{i=1}^{n} x_i w_i$. This defines a threshold when the output is zero: $f(\vec{x}, \vec{w}) = w_0 + \sum_{i=1}^{n} x_i w_i = 0$. Slide 9

Adaline. Let’s assume that we have two variables: $f(\vec{x}, \vec{w}) = w_0 + x_1 w_1 + x_2 w_2 = 0$. Therefore $x_2 = -\frac{w_1}{w_2} x_1 - \frac{w_0}{w_2}$. So, Adaline is drawing a linear discriminant that divides the space into two regions: a linear classifier. Slide 10
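As a rough illustration of the two-variable case, the sketch below evaluates the Adaline function and the boundary line it induces. The function names and the weight values are made up for the example, and `w2` is assumed to be nonzero so that the boundary can be solved for x2.

```python
def adaline_linear(x, w0, w):
    """Linear output f(x, w) = w0 + sum_i x_i * w_i."""
    return w0 + sum(xi * wi for xi, wi in zip(x, w))

def classify(x, w0, w):
    """Hard-limiting quantizer: +1 on one side of the boundary, -1 on the other."""
    return 1 if adaline_linear(x, w0, w) >= 0 else -1

def boundary_x2(x1, w0, w1, w2):
    """Point on the discriminant line: x2 = -(w1/w2)*x1 - w0/w2 (needs w2 != 0)."""
    return -(w1 / w2) * x1 - w0 / w2

# Hypothetical weights chosen only for illustration.
w0, w1, w2 = -1.0, 1.0, 2.0
x1 = 0.5
x2 = boundary_x2(x1, w0, w1, w2)
print(x2, adaline_linear([x1, x2], w0, [w1, w2]))  # the linear output is zero on the line
print(classify([2.0, 2.0], w0, [w1, w2]), classify([0.0, 0.0], w0, [w1, w2]))
```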

Adaline. So, we got a cool way to create linear classifiers. But are linear classifiers enough to tackle our problems? Can you draw a line that separates examples of class white and black for the last example? Slide 11

Moving to more Flexible NN. So, we want to classify problems such as x-or. Any idea? Polynomial discriminant functions. In this system: $f(\vec{x}, \vec{w}) = w_0 + x_1 w_1 + x_1^2 w_{11} + x_1 x_2 w_{12} + x_2^2 w_{22} + x_2 w_2 = 0$. Slide 12

Moving to more Flexible NN. With appropriate values of w, I can fit data that is not linearly separable. Slide 13
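To make the claim concrete, here is a small sketch of a second-order polynomial discriminant that separates x-or. The weights are hand-picked for this illustration (they are not from the lecture), and inputs are assumed to be coded as 0/1.

```python
def poly_discriminant(x1, x2, w):
    """f = w0 + x1*w1 + x1^2*w11 + x1*x2*w12 + x2^2*w22 + x2*w2."""
    return (w["w0"] + x1 * w["w1"] + x1 ** 2 * w["w11"]
            + x1 * x2 * w["w12"] + x2 ** 2 * w["w22"] + x2 * w["w2"])

# Hand-picked (hypothetical) weights: f = x1 + x2 - 2*x1*x2 - 0.5
xor_weights = {"w0": -0.5, "w1": 1.0, "w2": 1.0,
               "w11": 0.0, "w12": -2.0, "w22": 0.0}

def xor_class(x1, x2):
    """Class 1 if the discriminant is positive, class 0 otherwise."""
    return 1 if poly_discriminant(x1, x2, xor_weights) > 0 else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_class(a, b))
```

The cross term `x1*x2` is what bends the boundary; with `w12 = 0` the function reduces to the linear Adaline case.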

Even more Flexible: Multi-layer NN. So, we want to classify problems such as x-or. Any other idea? Madaline: multiple Adalines connected. This also enables the network to solve problems that are not linearly separable. Slide 14
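One way to picture the Madaline idea is two Adalines whose binary outputs feed a third unit. The sketch below is an assumption-laden illustration: the weights are fixed by hand (nothing is learned), inputs and outputs are coded as 0/1, and all names are hypothetical.

```python
def adaline(inputs, bias, weights):
    """Linear combiner followed by a hard-limiting quantizer with 0/1 output."""
    s = bias + sum(x * w for x, w in zip(inputs, weights))
    return 1 if s > 0 else 0

def madaline_xor(x1, x2):
    """Two first-layer Adalines combined by an output Adaline."""
    a = adaline([x1, x2], -0.5, [1.0, 1.0])    # fires when x1 OR x2
    b = adaline([x1, x2], 1.5, [-1.0, -1.0])   # fires unless x1 AND x2
    return adaline([a, b], -1.5, [1.0, 1.0])   # fires when both a AND b fire

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, madaline_xor(x1, x2))
```

Each first-layer unit draws one line; the output unit keeps only the region between the two lines, which no single line could isolate.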

But Step Down… How Do I Learn w? We have seen that different structures enable me to define different functions. But the key is to get a proper estimation of w. There are many algorithms: the perceptron rule, α-LMS, α-perceptron, May’s algorithm, and backpropagation. We are going to see two examples: α-LMS and backprop. Slide 15

Weight Learning in Adaline. Recall that we want to adjust w. Slide 16

Weight Learning in Adaline. Weight learning with the α-LMS algorithm. Incrementally update the weights as $W_{k+1} = W_k + \alpha \frac{\varepsilon_k X_k}{|X_k|^2}$. The error is the difference between the actual and the expected output: $\varepsilon_k = d_k - W_k^T X_k$. A change in the weights affects the error: $\Delta\varepsilon_k = \Delta(d_k - W_k^T X_k) = -X_k^T \Delta W_k$. And the weight change is $\Delta W_k = W_{k+1} - W_k = \alpha \frac{\varepsilon_k X_k}{|X_k|^2}$. Therefore $\Delta\varepsilon_k = -\alpha \frac{\varepsilon_k X_k^T X_k}{|X_k|^2} = -\alpha \varepsilon_k$. Slide 17

Weight Learning in Adaline. $\Delta\varepsilon_k = -X_k^T \Delta W_k$ and $\Delta W_k = \alpha \frac{\varepsilon_k X_k}{|X_k|^2}$. Slide 18
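The α-LMS update can be sketched as follows. This is a minimal illustration under stated assumptions: weights and inputs are plain Python lists, the first input component plays the role of the bias input, the input vector is nonzero, and the names `dot` and `alpha_lms_step` as well as the numbers are made up for the example.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def alpha_lms_step(w, x, d, alpha):
    """One alpha-LMS update: W_{k+1} = W_k + alpha * eps_k * X_k / |X_k|^2.

    Returns the new weights and the error eps_k measured before the update.
    Assumes x is not the zero vector.
    """
    error = d - dot(w, x)
    scale = alpha * error / dot(x, x)
    new_w = [wi + scale * xi for wi, xi in zip(w, x)]
    return new_w, error

# Illustrative values only.
w = [0.0, 0.0, 0.0]
x = [1.0, 2.0, 2.0]   # first component acts as the bias input
d = 1.0
alpha = 0.5

w_new, err_before = alpha_lms_step(w, x, d, alpha)
err_after = d - dot(w_new, x)
print(err_before, err_after)  # the error on this example shrinks by a factor (1 - alpha)
```

Because the step is normalized by $|X_k|^2$, the error on the presented example is reduced by the fraction α regardless of the input's magnitude, which is the result $\Delta\varepsilon_k = -\alpha\varepsilon_k$.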

Backpropagation. α-LMS works for networks with a single layer. But what happens in networks with multiple layers? Backpropagation (Rumelhart, 1986): the most influential development of NN in the 1980s. Here, we present the method conceptually (the math details are in the papers). Let’s assume a network with three neurons in the input layer and two neurons in the output layer. Slide 19

Backpropagation Strategy. Compute the gradient of the error: $\hat{\nabla}_k = \frac{\partial \varepsilon_k^2}{\partial W_k}$. Adjust the weights in the direction opposite to the instantaneous error gradient. Now, $W_k$ is a vector that contains all the components of the net. Slide 20

Backpropagation Algorithm. 1. Insert a new example $X_k$ into the network and sweep it forward till getting the output $y$. 2. Compute the square error of this attribute: $\varepsilon_k^2 = \sum_{i=1}^{N_y} \varepsilon_{ik}^2 = \sum_{i=1}^{N_y} (d_{ik} - y_{ik})^2$. For example, for two outputs (disregarding k): $\varepsilon^2 = (d_1 - y_1)^2 + (d_2 - y_2)^2$. 3. Propagate the error to the previous layer (back-propagation). How? Steepest descent: compute the derivative of the square error δ for each Adaline. Slide 21
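Step 2 is just a sum of squared differences over the outputs. A tiny sketch, with a hypothetical function name and made-up desired and actual outputs:

```python
def squared_error(desired, outputs):
    """eps^2 = sum_i (d_i - y_i)^2 over the network outputs."""
    return sum((d - y) ** 2 for d, y in zip(desired, outputs))

# Two outputs, illustrative values: (1.0 - 0.8)^2 + (0.0 - 0.3)^2
e2 = squared_error([1.0, 0.0], [0.8, 0.3])
print(e2)  # roughly 0.13
```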

Backpropagation Example. Example borrowed from: http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html Slide 22

Backpropagation Example. 1. Sweep the weights forward. Slide 23

Backpropagation Example. 2. Backpropagate the error. Slide 24

Backpropagation Example. 3. Modify the weights of each neuron. Slide 25

Backpropagation Example. 3.bis. Do the same for each neuron. Slide 26

Backpropagation Example. 3.bis2. Until reaching the output. Slide 27

Backpropagation for a Two-Layer Net. That is, the algorithm is: 1. Find the instantaneous square error derivative $\delta_j^{(l)} = -\frac{1}{2} \frac{\partial \varepsilon^2}{\partial s_j^{(l)}}$. This tells us how sensitive the square output error of the network is to changes in the linear output s of the associated Madaline. 2. Expanding the error term we get $\delta_1^{(2)} = -\frac{1}{2} \frac{\partial \left[ (d_1 - y_1)^2 + (d_2 - y_2)^2 \right]}{\partial s_1^{(2)}} = -\frac{1}{2} \frac{\partial \left[ d_1 - \mathrm{sgm}(s_1^{(2)}) \right]^2}{\partial s_1^{(2)}}$. 3. And recognizing that $d_1$ is independent of $s_1^{(2)}$: $\delta_1^{(2)} = \left[ d_1 - \mathrm{sgm}(s_1^{(2)}) \right] \mathrm{sgm}'(s_1^{(2)}) = \varepsilon_1^{(2)}\, \mathrm{sgm}'(s_1^{(2)})$. Slide 28

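Backpropagation for a Two-Layer Net. That is, the algorithm is: 4. Similarly for the hidden layers we have $\delta_1^{(1)} = -\frac{1}{2} \frac{\partial \varepsilon^2}{\partial s_1^{(1)}} = -\frac{1}{2} \left( \frac{\partial \varepsilon^2}{\partial s_1^{(2)}} \frac{\partial s_1^{(2)}}{\partial s_1^{(1)}} + \frac{\partial \varepsilon^2}{\partial s_2^{(2)}} \frac{\partial s_2^{(2)}}{\partial s_1^{(1)}} \right)$. 5. That is $\delta_1^{(1)} = \delta_1^{(2)} \frac{\partial s_1^{(2)}}{\partial s_1^{(1)}} + \delta_2^{(2)} \frac{\partial s_2^{(2)}}{\partial s_1^{(1)}}$. 6. Which yields $\delta_1^{(1)} = \delta_1^{(2)} \frac{\partial \left[ w_{10}^{(2)} + \sum_{i=1}^{3} w_{1i}^{(2)}\, \mathrm{sgm}(s_i^{(1)}) \right]}{\partial s_1^{(1)}} + \delta_2^{(2)} \frac{\partial \left[ w_{20}^{(2)} + \sum_{i=1}^{3} w_{2i}^{(2)}\, \mathrm{sgm}(s_i^{(1)}) \right]}{\partial s_1^{(1)}} = \delta_1^{(2)} w_{11}^{(2)}\, \mathrm{sgm}'(s_1^{(1)}) + \delta_2^{(2)} w_{21}^{(2)}\, \mathrm{sgm}'(s_1^{(1)}) = \left[ \delta_1^{(2)} w_{11}^{(2)} + \delta_2^{(2)} w_{21}^{(2)} \right] \mathrm{sgm}'(s_1^{(1)})$. Slide 29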

Backpropagation for a Two-Layer Net. Defining $\varepsilon_1^{(1)} \triangleq \delta_1^{(2)} w_{11}^{(2)} + \delta_2^{(2)} w_{21}^{(2)}$, we obtain $\delta_1^{(1)} = \varepsilon_1^{(1)}\, \mathrm{sgm}'(s_1^{(1)})$. Implementation details of each Adaline. Slide 30
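The delta computations for the two-layer net can be sketched in plain Python. Several things here are assumptions made for the illustration rather than details from the slides: sgm is taken to be the logistic function, the first layer has three Adalines and the second has two, each weight row stores its bias first, and the inputs, targets and weight values are arbitrary.

```python
import math

def sgm(s):
    """Sigmoid activation (assumed to be the logistic function)."""
    return 1.0 / (1.0 + math.exp(-s))

def sgm_prime(s):
    """Derivative of the logistic function."""
    y = sgm(s)
    return y * (1.0 - y)

def linear_outputs(weights, inputs):
    """s_j = w_j0 + sum_i w_ji * input_i; each row is [bias, w_j1, w_j2, ...]."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], inputs)) for w in weights]

def forward(x, W1, W2):
    """Sweep the example forward; return linear outputs of both layers and y."""
    s1 = linear_outputs(W1, x)
    s2 = linear_outputs(W2, [sgm(s) for s in s1])
    return s1, s2, [sgm(s) for s in s2]

def deltas(x, d, W1, W2):
    """Output-layer and hidden-layer deltas as derived above."""
    s1, s2, y = forward(x, W1, W2)
    # delta_j^(2) = (d_j - y_j) * sgm'(s_j^(2))
    delta2 = [(dj - yj) * sgm_prime(sj) for dj, yj, sj in zip(d, y, s2)]
    # eps_i^(1) = sum_j delta_j^(2) * w_ji^(2)   (index i+1 skips the bias)
    eps1 = [sum(delta2[j] * W2[j][i + 1] for j in range(len(W2)))
            for i in range(len(W1))]
    # delta_i^(1) = eps_i^(1) * sgm'(s_i^(1))
    delta1 = [e * sgm_prime(s) for e, s in zip(eps1, s1)]
    return delta1, delta2

# Arbitrary illustrative example: 3 inputs, 3 first-layer units, 2 outputs.
x = [1.0, 0.5, -1.0]
d = [1.0, 0.0]
W1 = [[0.1, 0.2, -0.3, 0.4], [-0.2, 0.5, 0.1, -0.1], [0.3, -0.4, 0.2, 0.6]]
W2 = [[0.05, 0.7, -0.5, 0.2], [-0.1, 0.3, 0.8, -0.6]]

delta1, delta2 = deltas(x, d, W1, W2)
print(delta1, delta2)
```

Since each delta equals minus one half of the derivative of the squared error with respect to a linear output, it can be sanity-checked numerically by nudging the corresponding bias weight and measuring the change in the squared error.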

Next Class: Support Vector Machines. Slide 31

Introduction to Machine Learning. Lecture 11: Neural Networks. Albert Orriols i Puig, aorriols@salle.url.edu. Artificial Intelligence, Machine Learning. Enginyeria i Arquitectura La Salle, Universitat Ramon Llull.
