
Published on March 12, 2008

Author: Sophia

Source: authorstream.com

InfoXtract Location Normalization: A Hybrid Approach to Geographic References in Information Extraction
May 31, 2003, Edmonton, Alberta
NAACL-HLT Workshop on the Analysis of Geographic References
Huifeng Li, Rohini K. Srihari, Cheng Niu, and Wei Li, Cymfony Inc.

Contents
- Overview of the information extraction system InfoXtract
- Introduction to location normalization (LocNZ)
- Task of LocNZ
- Problems and proposed method
- Algorithm for LocNZ
- Experimental evaluation
- Future work

Overview of InfoXtract
InfoXtract produces the following information objects from a text:
- Named Entities (NEs): "Bill Gates, chairman of Microsoft ..."
- Correlated Entities (CEs): "Bill Gates, chairman of Microsoft ..."
- Subject-Verb-Object (SVO) triples: both syntactic and semantic forms of the structures
- Entity Profiles: profiles for entity types such as people and organizations
- General Events (GEs): domain-independent events; argument structures centering around the verb with the associated information "who did what to whom, when (or how often), and where"
- Predefined Events (PEs): domain-specific events

System components (NLP and machine learning integrated for IE):
- POS tagging
- Shallow and deep parsing
- Named entity tagging, combining supervised and unsupervised machine learning techniques
- Concept-based analysis
- Word sense disambiguation
- Location and time normalization
- Coreference analysis
- Entity profile fusion
- Event extraction, fusion, and linking

InfoXtract Architecture
[Architecture diagram: a source document arrives over HTTP and passes through the Document Processor (tokenizer, lexicon lookup, pragmatic filtering, POS tagging, named entity detection, shallow parsing, deep parsing, relationship detection), then through modules for time normalization, alias/coreference linking, location normalization, and profile/event linking and merging, supported by knowledge resources (lexicons, grammars, language models); modules are grammar-based, procedural/statistical, or hybrid, and the Output Manager returns an XML-formatted extracted document carrying NE, CE, SVO, CO, profile, GE, and PE objects.]

Introduction to Location Normalization
The task of location normalization (LocNZ) is to identify the correct sense of an ambiguous location named entity:
(1) Decide whether a location name is a city, a province, or a country, supporting the NE tagger in assigning sub-tags: New York (NeLoc) => New York (NeLoc, NeCty)
(2) Decide which province or country a city, island, or state belongs to; 18 US states have a city named Boston: Boston => Alabama, Arkansas, Massachusetts, Missouri, ...

The results of LocNZ can be used to:
(1) Support event extraction, merging, and visualization by indicating where an event occurred
(2) Support profile generation by providing the location of a person or an organization
(3) Support question answering by providing location information for document categorization

Event and Profile Generation
Input: "Alvin Karloff was replaced by John Doe as CEO of ABC at New York last month."
Event template (argument structures centering around the verb with the associated information):
<GeneralEvent 200> :: key verb: replace; who: John Doe; whom/what: Alvin Karloff; complement: CEO of ABC; when: last month; where: <LocationProfile 101>
Profile template (presenting the subject's most noteworthy characteristics and achievements):
<PersonProfile 001> :: Name: Julian Werner Hill; Position: research chemist; Age: 91; Birthplace: <LocationProfile 100>; Affiliation: Du Pont Co.; Education: MIT
<LocationProfile 100> :: Name: St. Louis; State: Missouri; Country: United States of America; Zip code: 63101; Latitude: 38.634616; Longitude: -90.191313; Related profiles: <PersonProfile 001>
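The two templates above map naturally onto simple record types. The following Python sketch uses the field names from the slide; the classes and string IDs are illustrative assumptions, not Cymfony's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Minimal sketch of the profile and event templates shown above.
# Field names follow the slide; the classes themselves are illustrative.

@dataclass
class LocationProfile:
    profile_id: str
    name: str
    state: str = ""
    country: str = ""
    zipcode: str = ""
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    related_profiles: List[str] = field(default_factory=list)

@dataclass
class GeneralEvent:
    event_id: str
    key_verb: str        # "who did what to whom, when, and where"
    who: str
    whom_what: str
    complement: str = ""
    when: str = ""
    where: Optional[LocationProfile] = None

# The example from the slide: LocNZ fills the "where"/"birthplace" slot
# with a fully normalized location profile.
st_louis = LocationProfile("LocationProfile100", "St. Louis",
                           state="Missouri",
                           country="United States of America",
                           zipcode="63101",
                           latitude=38.634616, longitude=-90.191313,
                           related_profiles=["PersonProfile001"])
```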
Event Visualization
The result of LocNZ indicates where an event occurred, so events can be placed on a map:
Event type: <Die: Event 200>; Who: <Julian Werner Hill: PersonProfile 001>; When: 1996-01-07; Where: <LocationProfile 103>; Preceding event: <hospitalize: Event 260>; Subsequent event: <bury: Event 250>

Problems in Location Normalization
How LocNZ differs from general word sense disambiguation (WSD):
- Selectional restrictions are not sufficient. WSD, verb sense tagging in particular, relies mainly on co-occurrence constraints of semantic structures (Verb-Subject and Verb-Object); LocNZ depends primarily on the co-occurrence of related location entities in the same discourse.
- A text offers fewer clues than verb and noun sense disambiguation: "located in" indicates only that "San Francisco" is a location (example: "The Golden Gate Bridge is located in San Francisco").
- Sources for default senses of location names are lacking; the Tipster Gazetteer provides only a small part of the default senses.
- There is little previous research on LocNZ.

Major Types of Ambiguities
- City versus country or state name: Canada (CITY, Kansas, United States), Canada (CITY, Kentucky, United States), Canada (COUNTRY); New York State versus New York City.
- The same city name across different provinces: the Gazetteer lists 33 Washington entries, e.g. Washington (CITY) in Arkansas, California, Connecticut, the District of Columbia, Georgia, Illinois, Indiana, Iowa, Kansas, Kentucky, and more, all in the United States.
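A toy gazetteer makes the ambiguity concrete. The (type, province, country) tuples below are an illustrative assumption, not the Tipster Gazetteer's actual format, and the entries are a tiny hand-picked subset.

```python
# Toy gazetteer illustrating the ambiguity classes above. The real
# Tipster Gazetteer has, e.g., 33 entries for "Washington" alone.
GAZETTEER = {
    "Canada": [
        ("COUNTRY", None, "Canada"),
        ("CITY", "Kansas", "United States"),
        ("CITY", "Kentucky", "United States"),
    ],
    "Washington": [
        ("PROVINCE", None, "United States"),
        ("CITY", "District of Columbia", "United States"),
        ("CITY", "Arkansas", "United States"),
        ("CITY", "Georgia", "United States"),
        # ... dozens more city entries in the full gazetteer
    ],
}

def candidate_senses(name):
    """Return every (type, province, country) sense for a location name."""
    return GAZETTEER.get(name, [])

# "Washington" cannot be resolved without discourse context:
print(candidate_senses("Washington"))
```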
Example of Text with Location Names
CNN news: http://www.cnn.com/2003/WEATHER/02/19/winter.storm.delays.ap/index.html

A traveler gets the bad news as he looks at the departures list that shows all canceled flights at the Philadelphia International Airport.

MIAMI (AP) -- Travelers heading to and from the Northeast faced continued uncertainty Tuesday, even as airports in the mid-Atlantic region began slowly digging themselves out from one of the worst winter storms on record. ... No flights left Florida for Baltimore-Washington International Airport until Tuesday afternoon. That airport was one of the hardest-hit by the storm, with a snowfall total of 28 inches. Rosanna Blum, 38, of Hunt Valley, Maryland, had a confirmed seat on a Miami to Baltimore flight Tuesday afternoon, but still wasn't optimistic that she'd actually have the chance to use it. ... Theresa York, from Maryland, works the phones at Miami Airport as she tries to find a flight back home. ... "It's surreal," said Dawn Shuford, 35, as she reclined against her suitcase in a darkened hallway at BWI. She'd been trying since Sunday morning to get home to Seattle. The Washington area's two other airports, Reagan National and Dulles, also had limited service. Marty Legrow, from Connecticut, rests on her suitcase at Ronald Reagan National Airport in Washington. Philadelphia International Airport resumed operations Tuesday but still expected to cancel about one-third of its flights. Flights slowly resumed at New York's LaGuardia, Kennedy and Newark airports, and Boston's Logan, where more than 2 feet of snow fell, had one runway open. ... Margie D'Onofrio, 48, of King Of Prussia, Pennsylvania, and a travel companion left the Bahamas on Sunday, hoping to fly back to Philadelphia. They made it to Miami, and D'Onofrio said she did not expect to be home anytime Tuesday. ... Passengers camped out overnight at many airports. Many fliers called ahead Tuesday and weren't clogging airports unnecessarily, Orlando International Airport spokeswoman Carolyn Fennell said.

Our Previous Method [Li et al. 2002]
(1) Lexical grammar processing with local context:
- Identify city or state: "City of Buffalo", "New York State"
- Disambiguate a name's sense from local patterns, e.g. "Williamsville, New York, USA"; "Brussels, Belgium" (see the sketch after this slide)
- Propagate the analysis result to other mentions within the text: one sense per discourse (Gale, Church, and Yarowsky, 1992)
(2) Construct a graph and compute a maximum-weight spanning tree (MST) over the global information with Kruskal's algorithm:
- Nodes are location name senses; edges carry similarity weights between two senses
- Compute similarities between locations in the graph from a predefined similarity table
- Choose the maximum-weight spanning tree, which reflects the most probable location senses in the document
(3) Default sense application: if the similarity value is below a threshold, apply default senses

Problems of the Previous Method
Computing the MST requires sorting all weighted edges. When a document mentions many locations and each location has over 20 senses, the number of edges grows sharply, edge sorting becomes slow, and the resulting weights are not distinctive enough.
Solution: adopt Prim's algorithm for the MST, combined with heuristics:
- Heuristic 1: if a location has a country sense, select it as that location's default sense
- Heuristic 2: if a location has province or capital senses, select that sense as the default after local context is applied
- Take into account the number of location mentions and the distance between them, factors the previous method could not reflect
- Assign weights to the sense nodes in the constructed graph and choose the node with maximum weight
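The lexical step (1) above, which the modified algorithm retains, can be sketched as follows. The real system uses a lexical grammar; the single regular expression, the tiny state list, and the function names here are all illustrative assumptions covering only the "City, State, Country" case.

```python
import re

# Illustrative sketch of step (1): one "City, State, Country" pattern
# plus one-sense-per-discourse propagation. Not the authors' grammar.
US_STATES = {"New York", "Missouri", "Maryland"}  # tiny illustrative subset

PATTERN = re.compile(r"([A-Z][A-Za-z]+), ([A-Z][A-Za-z ]+), (?:USA|United States)")

def resolve_by_local_pattern(text):
    """Resolve mentions like 'Williamsville, New York, USA'."""
    senses = {}
    for match in PATTERN.finditer(text):
        city, region = match.group(1), match.group(2)
        if region in US_STATES:
            senses[city] = ("CITY", region, "United States")
    return senses

def propagate_one_sense_per_discourse(senses, mentions):
    """Copy each resolved sense to all other mentions of the same name."""
    return [(name, senses.get(name)) for name in mentions]

doc = "She drove from Williamsville, New York, USA. Williamsville was quiet."
senses = resolve_by_local_pattern(doc)
print(propagate_one_sense_per_discourse(senses, ["Williamsville", "Williamsville"]))
```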
Weight Calculation
Table 1: Impact weight of Sense2 on Sense1. [Table contents not preserved in this transcript.]

Weight Assigned to Sense Nodes
Candidate senses for the location mentions of an example document:
- Canada {Kansas, Kentucky, country}
- Vancouver {British Columbia, Washington, port in USA, port in Canada}
- New York {province in USA, New York City, ...}
- Toronto {Ontario, New South Wales, Illinois, ...}
- Charlottetown {city in Prince Edward Island, ...}
- Prince Edward Island {island in Canada, island in South Africa, province in Canada}
- Quebec {city in Quebec, Quebec province, Connecticut, ...}

Modified Algorithm
1. Look up the location gazetteer to associate candidate senses with each location NE.
2. If a location has a country sense, select it as that location's default sense (heuristic 1).
3. Call the pattern-matching sub-module for local patterns like "Williamsville, New York, USA".
4. Apply the one-sense-per-discourse principle: propagate each disambiguated location name's selected sense to its other mentions within the document.
5. Apply the default-sense heuristic to locations with province or capital senses (heuristic 2).
6. Call Prim's algorithm in the discourse sub-module to resolve the remaining ambiguities.
7. If the difference between the maximum-weight sense and the next-largest-weight sense is at or below a threshold, choose the name's default sense from the lexicon; otherwise output the maximum-weight sense.
(An end-to-end sketch of steps 2, 6, and 7 follows the Future Work slide.)

Experimental Evaluation
[Results table not preserved in this transcript; the Discussion below refers to its columns.]

Discussion
Note: columns 5 through 9 use the default-sense heuristics.
- Local patterns alone (Col-4) contribute 12% to the overall performance.
- Proper use of default senses and the heuristics (Col-5) achieves close to 90%.
- Prim's algorithm (Col-7) clearly outperforms the previous Kruskal-based method (Col-6), by 13%, but neither graph method alone outperforms the default senses.
- Using all three types of evidence, the new hybrid method reaches the 96% shown in Col-9.

Future Work
- Extend the scope of location normalization to physical structures: famous buildings, bridges, airports, lakes, street names, ...
- Extend the gazetteer
- Introduce more context information for disambiguation
- Upgrade default meaning assignment
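Putting the steps together, here is a minimal, self-contained sketch of the modified algorithm's discourse stage. The impact weights stand in for Table 1, the threshold is arbitrary, the toy gazetteer is invented, and the margin test of step 7 is simplified to an absolute threshold; none of these values come from the paper, and the local-pattern and propagation steps (3 and 4) are omitted for brevity.

```python
# End-to-end sketch of steps 2, 6, and 7 of the modified algorithm.
# All weights, thresholds, and gazetteer entries are placeholders.
THRESHOLD = 0.0

def impact_weight(sense_a, sense_b):
    """Toy stand-in for Table 1 (impact weight of Sense2 on Sense1):
    co-occurring senses in the same province or country support each other."""
    type_a, prov_a, country_a = sense_a
    type_b, prov_b, country_b = sense_b
    if prov_a is not None and prov_a == prov_b:
        return 1.0
    if country_a == country_b:
        return 0.5
    return 0.0

def normalize(mentions, gazetteer, default_senses):
    resolved = {}
    # Step 2, heuristic 1: a country reading wins outright.
    for name in mentions:
        for sense in gazetteer[name]:
            if sense[0] == "COUNTRY":
                resolved[name] = sense
    # Step 6, Prim-style greedy growth: repeatedly fix the candidate
    # sense best supported by the senses already resolved.
    remaining = [n for n in mentions if n not in resolved]
    while remaining:
        best_weight, best_name, best_sense = -1.0, None, None
        for name in remaining:
            for sense in gazetteer[name]:
                w = sum(impact_weight(sense, s) for s in resolved.values())
                if w > best_weight:
                    best_weight, best_name, best_sense = w, name, sense
        # Step 7 (simplified): weak evidence falls back to the default sense.
        if best_weight > THRESHOLD:
            resolved[best_name] = best_sense
        else:
            resolved[best_name] = default_senses[best_name]
        remaining.remove(best_name)
    return resolved

GAZETTEER = {
    "Canada":  [("COUNTRY", None, "Canada"), ("CITY", "Kansas", "United States")],
    "Toronto": [("CITY", "Ontario", "Canada"), ("CITY", "Illinois", "United States")],
    "Quebec":  [("CITY", "Quebec", "Canada"), ("PROVINCE", "Quebec", "Canada")],
}
DEFAULTS = {name: senses[0] for name, senses in GAZETTEER.items()}

# "Canada" resolves to the country, which then pulls "Toronto" and
# "Quebec" toward their Canadian senses:
print(normalize(["Canada", "Toronto", "Quebec"], GAZETTEER, DEFAULTS))
```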
