Information about Lecture9 - Bayesian-Decision-Theory

Recap of Lecture 5-8 LET’S START WITH DATA CLASSIFICATION Slide 2 Artificial Intelligence Machine Learning

Recap of Lectures 5-8. We want to build decision trees. How can I automatically generate these types of trees? Decide which attribute we should put in each node, decide a split point, and rely on information theory. We also saw many other improvements. Slide 3

Recap of Lectures 5-8. From kNN to CBR (figure panels: 15-NN, 1-NN). Key aspects: value of k; distance functions. Slide 4

Today’s Agenda. Could we use probability to classify? Where it all began. Some anecdotes on the correct use of probabilities. Slide 5

Why Bother about Probability? The world is a very uncertain place. Almost 40 years of AI and ML dealing with uncertain domains. Some researchers decided to employ ideas from probability to model concepts. Before saying more, let’s go to the beginning. Slide 6

Meeting the Reverend Thomas Bayes. Two main works: Divine Benevolence, or an Attempt to Prove That the Principal End of the Divine Providence and Government is the Happiness of His Creatures (1731); An Introduction to the Doctrine of Fluxions, and a Defence of the Mathematicians Against the Objections of the Author of the Analyst (published anonymously in 1736). But we are especially interested in: Essay Towards Solving a Problem in the Doctrine of Chances (1764), which was actually published posthumously by Richard Price. Slide 7

Where Did These Ideas Come From? Bayes built his theory upon several ideas. Immanuel Kant (1724-1804): Copernican revolution: our understanding of the external world had its foundations not merely in experience, but in both experience and a priori concepts, thus offering a non-empiricist critique of rationalist philosophy. Isaac Newton (1643-1727): universal gravitation and three laws of motion, which dominated the scientific view of the physical universe for the next three centuries. Slide 8

What Was Bayes’ Point? Bayesian probability: the notion of probability interpreted as partial belief rather than as frequency. Bayesian estimation: calculate the validity of a proposition on the basis of a prior estimate of its probability and new relevant evidence. E.g.: before Bayes, forward probability: given a specified number of white and black balls in an urn, what is the probability of drawing a black ball? Bayes turned his attention to the converse problem: given that one or more balls have been drawn, what can be said about the number of white and black balls in the urn? Slide 9

Bayes’ Theorem. Outputs the most probable hypothesis h∈H, given the data D plus knowledge about the prior probabilities of the hypotheses in H. Terminology: P(h|D): probability that h holds given data D; the posterior probability of h, i.e. the confidence that h holds given D. P(h): prior probability of h (background knowledge we have that h is a correct hypothesis). P(D): prior probability that training data D will be observed. P(D|h): probability of observing D given that h holds.
$$P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}$$
Slide 10
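The converse urn problem can be worked through with Bayes' theorem directly. The sketch below is only an illustration under stated assumptions: the function name `urn_posterior`, the uniform prior over the number of black balls, and drawing with replacement are all choices made here, not part of the lecture.

```python
from fractions import Fraction

def urn_posterior(total_balls, draws):
    """Posterior over the number of black balls in the urn.

    draws is a string such as "BBW" (B = black, W = white).
    Assumptions (hypothetical, for illustration): balls are drawn with
    replacement, and the prior over 0..total_balls black balls is uniform.
    """
    prior = Fraction(1, total_balls + 1)           # P(h)
    unnormalized = {}
    for k in range(total_balls + 1):               # hypothesis h: k black balls
        p_black = Fraction(k, total_balls)
        likelihood = Fraction(1)                   # P(D | h)
        for d in draws:
            likelihood *= p_black if d == "B" else 1 - p_black
        unnormalized[k] = likelihood * prior       # P(D | h) P(h)
    evidence = sum(unnormalized.values())          # P(D)
    return {k: v / evidence for k, v in unnormalized.items()}

posterior = urn_posterior(total_balls=4, draws="BBW")
for k, p in posterior.items():
    print(f"{k} black balls: {float(p):.3f}")
```

Exact fractions are used so that the posterior sums to exactly 1; hypotheses that cannot produce the observed draws (here 0 or 4 black balls) end up with probability 0.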

Bayes’ Theorem. Given H, the space of possible hypotheses, the most probable hypothesis is the one that maximizes P(h|D):
$$h_{MAP} \equiv \arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} \frac{P(D \mid h)\,P(h)}{P(D)} = \arg\max_{h \in H} P(D \mid h)\,P(h)$$
P(D) can be dropped in the last step because it does not depend on h. Slide 11
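A minimal sketch of picking the MAP hypothesis from a finite hypothesis space follows. The function name `map_hypothesis` and the coin example (priors, likelihoods, and the observed data) are hypothetical values chosen for illustration only.

```python
def map_hypothesis(priors, likelihoods):
    """Return the hypothesis maximizing P(D|h) * P(h), plus all scores.

    priors maps each hypothesis h to P(h); likelihoods maps h to P(D|h).
    P(D) is never computed because it is the same for every h.
    """
    scores = {h: likelihoods[h] * priors[h] for h in priors}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical example: is a coin fair, or biased towards heads?
priors = {"fair": 0.9, "biased": 0.1}
# D = five heads in a row; the biased coin is assumed to show heads 90% of the time.
likelihoods = {"fair": 0.5 ** 5, "biased": 0.9 ** 5}

h_map, scores = map_hypothesis(priors, likelihoods)
# Normalizing by the sum of the scores recovers the full posterior P(h|D).
posterior = {h: s / sum(scores.values()) for h, s in scores.items()}
print(h_map, posterior)
```

Even with a strong prior for the fair coin, five heads in a row are enough for the biased hypothesis to win in this made-up setting.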

Is the Pope the Pope? The chances that a randomly chosen human being is the Pope are about 1 in 6 billion. Benedict XVI is the Pope. What are the chances that Benedict XVI is human? (Beck-Bornholdt and Dubben, 1996). Analogy to syllogistic reasoning: 1 in 6 billion. Slide 12

So, Is the Pope an Alien? Where is the trick? Probability of the data given a hypothesis H: P(D|H). Probability of the hypothesis given the data: P(H|D). P(D|H) is different from P(H|D). So, is the Pope an alien? Probability of being an alien: P(Alien). Probability of being human: P(Human). Probability that the Pope is an alien:
$$P(\text{Alien} \mid \text{Pope}) = \frac{P(\text{Pope} \mid \text{Alien})\,P(\text{Alien})}{P(\text{Pope} \mid \text{Human})\,P(\text{Human}) + P(\text{Pope} \mid \text{Alien})\,P(\text{Alien})}$$
Slide 13

So, Is the Pope an Alien? What’s missing? P(Pope|Alien), P(Human), P(Alien). Considering low values of P(Alien) and P(Pope|Alien), and large values of P(Human), we could “probably” say that the Pope is not an alien! Slide 14
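To see how the missing quantities drive the answer, the posterior P(Alien|Pope) can be evaluated for some made-up inputs. Every number below is an assumption invented for illustration (the lecture gives only the 1 in 6 billion figure), and the function name `posterior_alien_given_pope` is hypothetical.

```python
def posterior_alien_given_pope(p_alien, p_pope_given_alien, p_pope_given_human):
    """Evaluate P(Alien | Pope) with Bayes' theorem.

    Assumes every being is either human or alien, so P(Human) = 1 - P(Alien).
    """
    p_human = 1.0 - p_alien
    numerator = p_pope_given_alien * p_alien
    evidence = p_pope_given_human * p_human + numerator   # P(Pope)
    return numerator / evidence

# Assumed, illustrative values: aliens are very rare and very unlikely to be Pope.
p = posterior_alien_given_pope(
    p_alien=1e-9,
    p_pope_given_alien=1e-9,
    p_pope_given_human=1 / 6e9,   # "about 1 in 6 billion"
)
print(f"P(Alien | Pope) = {p:.2e}")
```

With these inputs the posterior is tiny; different assumed values would change the number, but not the point that P(Pope|Human) alone says nothing about P(Human|Pope).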

More Examples: Monty Hall. Stick or switch? Slide 15

Stick or Switch. I chose door number 3. Door 2 is uncovered and contains a sheep. They give me the chance to change the door. Should I? Use probability, not faith, to give an answer! Slide 16

Stick or Switch. I should switch! Slide 17
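The claim that switching is better can be checked by enumerating every case exactly. This sketch assumes the standard rules, which the slides do not spell out: the prize is placed uniformly at random, the host always opens a door that is neither the contestant's pick nor the prize, and chooses uniformly when two such doors exist. The names `DOORS` and `win_probabilities` are made up for this example.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)

def win_probabilities(pick=None, opened=None):
    """Return (P(win | stick), P(win | switch)), optionally conditioned on
    the contestant's pick and the door the host opened."""
    stick = switch = total = Fraction(0)
    for prize, first in product(DOORS, repeat=2):
        # Doors the host is allowed to open under the assumed rules.
        host_options = [d for d in DOORS if d not in (first, prize)]
        for host in host_options:
            if pick is not None and first != pick:
                continue
            if opened is not None and host != opened:
                continue
            p = Fraction(1, 9) / len(host_options)
            # The one door left to switch to.
            other = next(d for d in DOORS if d not in (first, host))
            total += p
            stick += p if first == prize else 0
            switch += p if other == prize else 0
    return stick / total, switch / total

print(win_probabilities())                   # over all games
print(win_probabilities(pick=3, opened=2))   # the situation on the slide
```

Both calls give 1/3 for sticking and 2/3 for switching, so the answer for the specific situation (door 3 chosen, door 2 opened) matches the overall one under these assumptions.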

Yet Another Example: The Defendant’s Fallacy. The story of a murder. A suspect was caught. The DNA test was positive. The DNA test fails only 1 in 1 million times. So, my suspect must be guilty, right? More specifically, the suspect will be guilty with p = 0.999999. Agree? Slide 18

The Defendant’s Fallacy. Where is the trick now? P(coincides | innocent) as opposed to P(innocent | coincides). P(coincides | innocent) is commonly misused as the probability of being innocent. P(innocent | coincides) is the probability of being innocent given that the test was positive! Does this really matter? Let’s assume a city of 10 million inhabitants. We apply the test to all the 10 million inhabitants. How many innocent inhabitants will test positive? About 10 (10 million × 1 in 1 million). Slide 19

The Defendant’s Fallacy. Two arguments. The prosecutor: there is a probability of 0.000001 that the suspect is innocent. The defendant: in this city of 10M people, the probability of the suspect being innocent is approximately 90%. Who is right? The defendant. Proof of that? You do the math. Slide 20
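One way to "do the math" is to count expected matches in the city. The sketch below rests on assumptions the slides leave implicit: exactly one guilty person lives in the city, every inhabitant is equally likely to be that person before testing, and the test always matches the guilty person. The function name `p_innocent_given_match` is hypothetical.

```python
def p_innocent_given_match(population, false_match_rate, guilty=1):
    """P(innocent | test coincides) by counting expected matches.

    Assumes the test always matches a guilty person and that the suspect
    was found by the match alone, with no other evidence.
    """
    innocents = population - guilty
    expected_false_matches = innocents * false_match_rate
    expected_true_matches = float(guilty)
    return expected_false_matches / (expected_false_matches + expected_true_matches)

p = p_innocent_given_match(population=10_000_000, false_match_rate=1e-6)
print(f"P(innocent | coincides) = {p:.3f}")
```

About 10 innocent inhabitants are expected to match alongside the single guilty one, so roughly 10 out of 11 matching people are innocent, which agrees with the defendant's figure of approximately 90%.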

Next Class. How we can use these concepts in machine learning. Slide 21

Introduction to Machine Learning. Lecture 9: Bayesian decision theory – An introduction. Albert Orriols i Puig, aorriols@salle.url.edu. Artificial Intelligence – Machine Learning. Enginyeria i Arquitectura La Salle, Universitat Ramon Llull
