# Homeland Security Congressional

Published on January 5, 2008

Author: Clown

Source: authorstream.com

Homeland Security: What Can Mathematics Do? Fred Roberts, Professor of Mathematics, Rutgers University; Chair, RU Homeland Security Research Initiative; Director, DIMACS Center.

Mathematical methods have become important tools in preparing plans for defense against terrorist attacks, especially when combined with powerful, modern computer methods for analysis and simulation.

Are you Serious?? What Can Mathematics Do For Us?

After Pearl Harbor: Mathematics and mathematicians played a vitally important role in the US World War II effort.

Critical War-Effort Contributions Included: Code breaking (the Enigma machine). Creation of the mathematics-based field of Operations Research: logistics, optimal scheduling, inventory, strategic planning.

But: Terrorism is Different. Can Mathematics Really Help? 5 + 2 = ? 1, 2, 3, …

I’ll Illustrate with Mathematics Projects I’m Involved in. There are Many Others.

OUTLINE: Bioterrorism Sensor Location; Monitoring Message Streams; Identification of Authors; Detecting a Bioterrorist Attack through “Syndromic Surveillance”.

The Bioterrorism Sensor Location Problem

Early warning is critical in defense against terrorism. This is a crucial factor underlying the government’s plans to place networks of sensors/detectors to warn of a bioterrorist attack. (The BASIS System, Salt Lake City.)

Locating Sensors is not Easy: Sensors are expensive. How do we select them and where do we place them to maximize “coverage,” expedite an alarm, and keep the cost down?
Approaches that improve upon existing, ad hoc location methods could save countless lives in the case of an attack and also money in capital and operational costs.

Two Fundamental Problems: Sensor Location Problem: Choose an appropriate mix of sensors and decide where to locate them for best protection and early warning.

Two Fundamental Problems: Pattern Interpretation Problem: When sensors set off an alarm, help public health decision makers decide: Has an attack taken place? What additional monitoring is needed? What was its extent and location? What is an appropriate response?

The Sensor Location Problem: Approach is to develop new algorithmic methods. Developing new algorithms involves fundamental mathematical analysis. Analyzing how efficient algorithms are involves fundamental mathematical methods. Implementing the algorithms on a computer is often a separate problem, which needs to go hand in hand with the basic mathematics of algorithm development.

Algorithmic Approaches I: Greedy Algorithms

Greedy Algorithms: Find the most important location first and locate a sensor there. Find the second-most important location. Etc. Builds on earlier mathematical work at the Institute for Defense Analyses (Grotte, Platt). “Steepest ascent approach.” No guarantee of “optimal” or best solution. In practice, gets pretty close to the optimal solution.

Algorithmic Approaches II: Variants of Classic Facility Location Theory Methods

Location Theory: Old problem in Operations Research: Where to locate facilities (fire houses, garbage dumps, etc.)
to best serve “users”? Often we deal with a network with nodes, edges, and distances along edges. Users u1, u2, …, un are located at nodes. One approach: locate the facility at node x chosen so that the sum of distances to users is minimized. Minimize: $\sum_{i=1}^{n} d(x, u_i)$

Location Theory: A Network: The 1’s represent distances along edges. Nodes are places for users or facilities.

With users u1, u2, u3, the sum $\sum_{i} d(x, u_i)$ for each candidate node x is: x=a: 1+1+2=4; x=b: 2+0+1=3; x=c: 3+1+0=4; x=d: 2+2+1=5; x=e: 1+3+2=6; x=f: 0+2+3=5. So x=b is optimal.

Variants of Classic Facility Location Theory Methods: Complications: We don’t have a network with nodes and edges; we have points in a city. Sensors can only be at certain locations (size, weight, power source, hiding place). We need to place more than one sensor. Instead of “users,” we have places where potential attacks take place. Potential attacks take place with certain probabilities. Wind, buildings, mountains, etc. add complications.

These more complex problems are hard! The best-known algorithms for solving these “higher-dimensional” variants of the classic location problem are due to Rafail Ostrovsky, a partner on our project. The mathematics-based approximation methods due to Ostrovsky and his colleagues are promising.

Algorithmic Approaches III: Variants of Air Pollution Monitoring Models

Variants of Air Pollution Monitoring Models: Long history of using mathematical models to locate air pollution monitors. Use fluid dynamics. Use plume models. Large computer simulations needed. Long used in nuclear weapons defense.
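The facility location example above (pick the node x that minimizes the sum of distances to the users) can be reproduced in a few lines. This is only a sketch: the original network figure is not part of the text, so the graph below is an assumption, a six-node cycle with unit edge lengths and users at f, b, and c, chosen because it yields exactly the distance sums listed in the example. The names `EDGES`, `USER_NODES`, and `best_location` are illustrative.

```python
from collections import deque

# Assumed network (the slide's figure is not available): cycle a-b-c-d-e-f-a,
# every edge of length 1. User positions are likewise an assumption.
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"), ("f", "a")]
USER_NODES = ["f", "b", "c"]  # u1, u2, u3


def build_adjacency(edges):
    """Undirected adjacency lists from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def hop_distances(adj, source):
    """Breadth-first search; equals shortest distance because all edges have length 1."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def total_distances(edges, user_nodes):
    """Sum of distances from each candidate node x to all users."""
    adj = build_adjacency(edges)
    totals = {}
    for x in sorted(adj):
        dist = hop_distances(adj, x)
        totals[x] = sum(dist[u] for u in user_nodes)
    return totals


def best_location(edges, user_nodes):
    """Node with the smallest total distance, plus the full table of totals."""
    totals = total_distances(edges, user_nodes)
    return min(totals, key=totals.get), totals


if __name__ == "__main__":
    node, totals = best_location(EDGES, USER_NODES)
    print(totals)
    print("optimal:", node)
```

Exhaustive evaluation like this is fine for one facility on a small network; the complications listed above (several sensors, attack probabilities, wind and buildings) are what make the real problem hard.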
Variants of Air Pollution Monitoring Models: Mathematical challenge: Modify air pollution monitor placement modeling tools for complex biological agents. E.g.: Complications arise when applying the models to cities: Buildings make it hard!

The Pattern Interpretation Problem

The Pattern Interpretation Problem (PIP): It will be up to the Decision Maker to decide how to respond to an alarm from the sensor network.

Approaching the PIP: Minimizing False Alarms: One approach: Redundancy. Could require two or more sensors to make a detection before an alarm is considered confirmed. Could require the same sensor to register two alarms: Portal Shield requires two positives for the same agent during a specific time period.

Approaching the PIP: Minimizing False Alarms: Could place two or more sensors at or near the same location. Require two proximate sensors to give off an alarm before we consider it confirmed. Redundancy has drawbacks: cost, delay in confirming an alarm. We need mathematical methods to analyze the tradeoff between lowered false alarm rate and extra cost/delay.

Approaching the PIP: Using Decision Rules: Existing sensors come with a sensitivity level specified and sound an alarm when the number of particles collected is sufficiently high, that is, above a threshold.

Approaching the PIP: Using Decision Rules: Let f(x) = number of particles collected at sensor x in the past 24 hours. Sound an alarm if f(x) > T. Alternative decision rule: alarm if two sensors reach 90% of threshold, three reach 75% of threshold, etc.
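The tiered rule just described can be sketched as a small function. This is a hedged illustration, not the project's implementation: the tier table holds only the three levels named in the text (the "etc." is left open), and the names `TIERS` and `should_alarm` are invented for the example.

```python
# (fraction of threshold T, number of sensors that must exceed that fraction).
# Only the three tiers named in the text; further tiers would be an assumption.
TIERS = [(1.0, 1), (0.9, 2), (0.75, 3)]


def should_alarm(counts, threshold, tiers=TIERS):
    """Return True if any tier of the decision rule is satisfied.

    counts: particle counts f(x), one entry per sensor, for the past 24 hours.
    threshold: the single-sensor alarm threshold T.
    """
    for fraction, needed in tiers:
        # Count the sensors whose reading is strictly above this tier's level.
        exceeding = sum(1 for c in counts if c > fraction * threshold)
        if exceeding >= needed:
            return True
    return False


if __name__ == "__main__":
    T = 100
    print(should_alarm([120, 5, 3], T))    # one sensor above T
    print(should_alarm([95, 92, 10], T))   # two sensors above 0.9 T
    print(should_alarm([95, 80, 10], T))   # no tier satisfied
```

Keeping the tiers in a table makes it easy to study the tradeoff between false alarms and delay by varying the fractions and sensor counts.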
Alarm if: f(x) > T for some x, or if f(x1) > 0.9T and f(x2) > 0.9T for some x1, x2, or if f(x1) > 0.75T and f(x2) > 0.75T and f(x3) > 0.75T for some x1, x2, x3.

Approaching the PIP: Using Decision Rules: Prior work along these lines in missile detection (Cherikh and Kantor).

Bioterrorism Sensor Location: Partner Agencies/Institutions: Defense Threat Reduction Agency; MITRE Corporation; Los Alamos National Laboratory; Institute for Defense Analyses; New York City Dept. of Health.

Monitoring Message Streams: Algorithmic Methods for Automatic Processing of Messages

Motivation: monitoring email traffic, news, communiques, faxes. Objective: Monitor huge communication streams, in particular, streams of textualized communication, to automatically detect pattern changes and "significant" events.

Given a stream of text in any language, decide whether "events" are present in the flow of messages. Event: new topic or topic with unusual level of activity. Suppose events have been classified into classes or groups: group 1, group 2, … A new message comes in. Does it fit into group 1? Into group 2? Or does it (and related messages) define a new group of interest?

Technical Approaches: One Approach: “Bag of Words”: List all the words of interest that may arise in the messages being studied: w1, w2, …, wn. The bag of words vector b has k as the ith entry if word wi appears k times in the message. Sometimes, use “bag of bits”: a vector of 0’s and 1’s; the ith entry is 1 if word wi appears in the message, 0 otherwise.
“Bag of Words” Example: Words: w1 = bomb, w2 = attack, w3 = strike, w4 = train, w5 = plane, w6 = subway, w7 = New York, w8 = Los Angeles, w9 = Madrid, w10 = Tokyo, w11 = London, w12 = January, w13 = March.

Message 1: Strike Madrid trains on March 1. Strike Tokyo subway on March 2. Strike New York trains on March 11. Bag of words b1 = (0,0,3,2,0,1,1,0,1,1,0,0,3).

The Approach: “Bag of Words”: Key idea: how close are two such vectors? Suppose known messages have been classified into different groups: group 1, group 2, … A message comes in. Which group should we put it in? Or is it “new”? You look at the bag of words vector associated with the incoming message and see if it “fits” closely to typical vectors associated with a given group.

The Approach: “Bag of Words”: Your performance can improve over time. You “learn” how to classify better. Typically you do this “automatically” and try to develop mathematical methods that will allow a machine to “learn” from past data.

Message 2: Bomb Madrid trains on March 1. Attack Tokyo subway on March 2. Strike New York trains on March 11. Bag of words b2 = (1,1,1,2,0,1,1,0,1,1,0,0,3).

Note that b1 and b2 are “close”: b1 = (0,0,3,2,0,1,1,0,1,1,0,0,3), b2 = (1,1,1,2,0,1,1,0,1,1,0,0,3). Closeness could be measured using the distance d(b1,b2) = number of places where b1 and b2 differ (the “Hamming distance” between vectors). Here d(b1,b2) = 3. The messages are “similar” and could belong to the same group or class of messages.
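The two example messages and their vectors can be checked with a short script. This is a minimal sketch under stated assumptions: matching is case-insensitive, and an optional trailing "s" is allowed so that "trains" counts toward "train", which is what the vectors above imply. The names `VOCAB`, `bag_of_words`, and `hamming` are illustrative.

```python
import re

# Vocabulary w1..w13 from the example, lower-cased; two-word entries are matched as phrases.
VOCAB = ["bomb", "attack", "strike", "train", "plane", "subway",
         "new york", "los angeles", "madrid", "tokyo", "london",
         "january", "march"]


def bag_of_words(message, vocab=VOCAB):
    """Entry i is the number of times vocabulary item i appears in the message.

    Assumption: a simple plural ("trains") counts as the singular ("train").
    """
    text = message.lower()
    return tuple(
        len(re.findall(r"\b" + re.escape(word) + r"s?\b", text))
        for word in vocab
    )


def hamming(b1, b2):
    """Number of positions in which the two vectors differ."""
    return sum(1 for x, y in zip(b1, b2) if x != y)


MESSAGE_1 = ("Strike Madrid trains on March 1. Strike Tokyo subway on March 2. "
             "Strike New York trains on March 11.")
MESSAGE_2 = ("Bomb Madrid trains on March 1. Attack Tokyo subway on March 2. "
             "Strike New York trains on March 11.")

if __name__ == "__main__":
    b1 = bag_of_words(MESSAGE_1)
    b2 = bag_of_words(MESSAGE_2)
    print(b1)
    print(b2)
    print("Hamming distance:", hamming(b1, b2))
```

Hamming distance ignores how much two counts differ (3 versus 1 for "strike" counts the same as 0 versus 1 for "bomb"), which is one reason other distance measures are worth comparing in practice.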
“Bag of Words”: Message 3: Go on strike against Madrid trains on March 1. Go on strike against Tokyo subway on March 2. Go on strike against New York trains on March 11. Bag of words b3 = same as b1. BUT: message 3 is quite different from message 1. This shows the complexity of the problem. Maybe we are missing some key words like “go,” or maybe we should use pairs of words like “on strike” (“bigrams”).

One Approach: k-Nearest Neighbor (kNN) Classifiers: How kNN classifiers work: Find the k most similar “training” messages (neighbors). Assign a message to those groups that are most common among the neighbors (using weighting by distance). kNN classifiers had been considered inefficient since finding neighbors is slow.

Speeding up kNN: Can finding neighbors be made fast enough to make kNN practical? Mathematics can help. Store text and classes “sparsely.” Use “inverted file” heuristics that group input by word, not by “document,” and compute similarities using only the few words occurring in the document. Result: New methods are 10 to 100 times faster with only a 2-10% loss in “effectiveness” (according to some standard measures). Software delivered to sponsors.

Streaming Data: We often have just one shot at the data as it comes “streaming by” because there is so much of it. This calls for powerful new algorithms.

Research Challenge: “Historic” Data Analysis: The accumulation of text messages is massive over time. We can only save summaries of the data. It is a great challenge to use only summarized historic data and see if a currently emerging phenomenon had precursors occurring in the past, since you don’t have the original data.
We have had some success with a novel architecture for historic and posterior analyses via small summaries (“sketches”).

Related Project: Author Identification: Develop and evaluate techniques for identifying authors in large collections of textual artifacts (e-mails, communiques, transcribed speech, etc.). Questions Addressed: Which of a set of authors wrote a particular document/message? Were two documents written by the same author?

Author Identification: We are using methods developed in the Monitoring Message Streams Project. Building on classical work in Statistics: Who wrote the Federalist papers, Hamilton or Madison? More complicated than conventional text classification: large number of possible authors; not much “training data”; authors write on multiple topics; authors write in different styles in different “genres.”

One Approach: In “Bag of Words”: Use “Function Words”: a, about, above, according, accordingly, actual, actually, after, afterward, afterwards, again, against, ago, ah, ain't, all, almost, along, already, also, although, always, am, among, an, and, another, any, anybody, anyone, anything, anywhere, are, aren't, around, art, as, aside, at, away.

Partner Agencies: Monitoring Message Streams and Author Identification Projects: Research sponsored by ITIC (Intelligence Technology Innovation Center), administratively under the CIA, through the interagency Knowledge, Discovery, and Dissemination (KDD) program.
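To make the function-word idea concrete, the sketch below builds a function-word profile for each document and assigns an unknown document to an author by a distance-weighted vote among its k nearest labeled documents, in the spirit of the kNN classifiers described earlier. Several choices here are assumptions rather than the project's method: relative frequencies as features, Euclidean distance, inverse-distance weights, and the tiny made-up training texts. Only a handful of the listed function words are used.

```python
from collections import Counter, defaultdict
import math
import re

# A few of the function words from the list above (a subset, for brevity).
FUNCTION_WORDS = ["a", "about", "after", "all", "also", "although", "always",
                  "an", "and", "any", "are", "as", "at"]


def function_word_profile(text, words=FUNCTION_WORDS):
    """Relative frequency of each function word among all tokens of the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # avoid division by zero on empty text
    return [counts[w] / total for w in words]


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_author(unknown_text, labeled_texts, k=3):
    """Distance-weighted vote among the k nearest labeled documents.

    labeled_texts: list of (author, text) pairs; hypothetical training data.
    """
    target = function_word_profile(unknown_text)
    scored = sorted(
        (euclidean(target, function_word_profile(text)), author)
        for author, text in labeled_texts
    )
    votes = defaultdict(float)
    for dist, author in scored[:k]:
        votes[author] += 1.0 / (dist + 1e-9)  # closer neighbors count more
    return max(votes, key=votes.get)


# Made-up training texts, purely for illustration.
TRAINING = [
    ("A", "a cat and a dog and a bird"),
    ("A", "a tree and a rock and a hill"),
    ("B", "all are always at home as any are"),
    ("B", "any are all as always at work"),
]

if __name__ == "__main__":
    print(knn_author("a fish and a frog and a toad", TRAINING))
    print(knn_author("all are as any at always", TRAINING))
```

Function words are attractive features because they depend less on topic than content words do, which matters when authors write on multiple topics.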
Bioterrorist Event Detection: Great concern about the deliberate introduction of diseases such as smallpox by bioterrorists has led to new challenges for mathematical scientists.

Bioterrorist Event Detection: Mathematical models of infectious diseases go back to Daniel Bernoulli’s mathematical analysis of smallpox in 1760. However, modern data-gathering methods bring with them new challenges for mathematicians. Methods used in the Monitoring Message Streams and Author ID projects enter into using large data sets to detect “bioterrorist events” or “emerging diseases” (SARS) through “syndromic surveillance.”

New Data Types for Public Health Surveillance: Managed care patient encounter data; pre-diagnostic/chief complaint (ED data); over-the-counter sales transactions (drug store, grocery store); 911-emergency calls; ambulance dispatch data; absenteeism data; ED discharge summaries; prescription/pharmaceuticals; adverse event reports.

Syndromic Surveillance: NYC Dept. of Health Data

Approach: As with Monitoring Message Streams and Author Identification, represent data by using a vector. For example, use “bag of bits” (0 or 1 only in each entry). If we use symptoms, then 1 or 0 represents presence or absence of symptoms such as coughing, fever over 102 degrees, achy legs, disoriented, etc.
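A “bag of bits” for symptoms might look like the following sketch. The symptom list is just the four examples named above, and the per-batch tally in `symptom_counts` is an added illustration of how such vectors could feed a surveillance count; neither is taken from the NYC data or the project's software.

```python
# Symptom vocabulary: the examples named in the text; a real list would be longer.
SYMPTOMS = ["coughing", "fever over 102 degrees", "achy legs", "disoriented"]


def bag_of_bits(observed, symptoms=SYMPTOMS):
    """Entry i is 1 if symptom i is present in the patient record, else 0."""
    present = set(observed)
    return [1 if s in present else 0 for s in symptoms]


def symptom_counts(records, symptoms=SYMPTOMS):
    """Add up the bit vectors over a batch of records (for example, one day)."""
    totals = [0] * len(symptoms)
    for record in records:
        for i, bit in enumerate(bag_of_bits(record, symptoms)):
            totals[i] += bit
    return dict(zip(symptoms, totals))


if __name__ == "__main__":
    day = [
        ["coughing", "disoriented"],
        ["coughing", "fever over 102 degrees"],
        ["achy legs"],
    ]
    print(bag_of_bits(day[0]))
    print(symptom_counts(day))
```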
Many New Mathematical Methods and Approaches under Development: Spatial-temporal “scan statistics”; statistical process control (SPC); Bayesian applications; “market-basket” association analysis; text mining; rule-based surveillance; change-point techniques.

Project a Collaboration between a Math/CS Research Center and a Government Agency: DIMACS: Center for Discrete Mathematics and Theoretical Computer Science. CDC: Centers for Disease Control and Prevention.

Would Mathematics Help Protect our Bridges and Tunnels? (George Washington Bridge, Lincoln Tunnel.)

Would Mathematics Help Protect our Borders?

Would it Help with a Deliberate Outbreak of Anthrax?

Similar approaches, using mathematical models, have proven useful in many other fields, to: make policy; plan operations; analyze risk; compare interventions; identify the cause of observed events.

Why not in homeland security?
