Semantic Analysis


Published on September 29, 2007

Author: Connor


Semantic Analysis for Video Contents Extraction - Spotting by Association in News Video
Paper by Yuichi Nakamura and Takeo Kanade
Presented by Hemant Joshi

Introduction:
- Enormous amounts of multimedia data
- Linking two news matters together
- Semantic linking
- Using closed captions along with the video

Video Content Spotting by Association:
- The need for multiple modalities: video content extraction from language or image data alone is not reliable.
- "They say" - difficult to interpret without semantics.

Situation Spotting by Association:
- The association between language clues and image clues is the important key.
- Two advantages:
  - Reliable detection that uses both images and language.
  - Data explained by both modalities is clearly understandable to users.

Language Clue Detection:
- Simple keyword spotting
- Direct vs. indirect narration
- Keyword usage for speech
- Keyword usage for meeting and visiting

Screening Keywords:
To avoid false detection of keywords unrelated to the subject matter of interest, parse each sentence in the transcript, check the role of each keyword, and check the semantics of the subject, the verb, and the objects. Also consider the following:
- The part of speech of each word can be used in screening, e.g. "talk" as a verb.
- If a keyword is a verb, its subject or object is checked semantically; for this semantic check, the hypernym relation in WordNet is used.
- Negative sentences and those in the future tense can be ignored.
- A location name following prepositions such as "in" or "to" is considered a language clue.
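The screening rules above can be sketched as a simple filter. This is a hedged illustration, not the paper's implementation: the pre-parsed token fields, the tiny hand-coded hypernym table (a stand-in for WordNet), and all names below are assumptions.

```python
# Toy hypernym table standing in for WordNet (an assumption).
HYPERNYMS = {
    "president": "person",
    "senator":   "person",
    "reporter":  "person",
}

SPEECH_VERBS = {"say", "talk", "tell", "announce"}

def is_a(word, category):
    """Walk the toy hypernym chain to see if word falls under category."""
    while word is not None:
        if word == category:
            return True
        word = HYPERNYMS.get(word)
    return False

def screen_keyword(token, sentence):
    """Return True if token survives screening as a language clue.

    token:    dict with 'lemma', 'pos', optional 'subject' and 'prep'
    sentence: dict with 'negated' and 'future' flags from the parse
    (All field names are illustrative assumptions.)
    """
    # Negative sentences and those in the future tense are ignored.
    if sentence.get("negated") or sentence.get("future"):
        return False
    # Keep "talk"-like keywords only when they act as verbs.
    if token["lemma"] in SPEECH_VERBS and token["pos"] != "VERB":
        return False
    # A verb keyword needs a semantically compatible subject.
    if token["pos"] == "VERB":
        subj = token.get("subject")
        return subj is not None and is_a(subj, "person")
    # A location name counts only after prepositions like "in" or "to".
    if token["pos"] == "LOCATION":
        return token.get("prep") in {"in", "to"}
    return True
```

For example, "talk" tagged as a noun is rejected, while "say" as a verb with a person-like subject passes.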
Process - Conditions for Key-Sentence Detection:
- Keywords are detected from the transcripts.
- Keywords are checked syntactically and semantically, and evaluated using the parsing results.
- Focusing only on subjects and verbs makes the results more acceptable (80% correct on CNN news headlines).
- A sentence that includes one or more words satisfying these conditions is considered a key-sentence.

Process - Key-Sentence Detection Results:
The figure (X/Y/Z) in each table shows the numbers of detected key-sentences:
- X is the number of sentences that include keywords.
- Y is the number of sentences removed by the keyword screening above.
- Z is the number of sentences incorrectly removed.

Image Clue Detection - Key Images:
What counts as an image clue?
- Face close-ups
- People images
- Outdoor scenes
- Usage of face close-ups

Key Images - Usage of People Images:
A typical usage of people images is the description of crowds, such as people in a demonstration.

Key Images - Outdoor Scenes:
In the case of outdoor scenes, the images describe the place, the degree of a disaster, etc.

Key Image Detection:
- Face close-up detection: in this research, human faces are detected by a neural-network-based face detection program. Most face close-ups are easily detected because they are large and frontal. As a result, most frontal faces are detected, but less than half of the small faces and profiles are.
- People image and outdoor scene detection: for images with many people, the problem becomes harder because small faces and human figures are more difficult to detect. The same can be said of outdoor scene detection. Automatic detection of these is still under development; for the experiments in this paper, we picked them manually.
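The (X/Y/Z) bookkeeping above can be made concrete with a small sketch, assuming each transcript sentence carries flags from a detection run. The field names are illustrative, not from the paper.

```python
def tally(sentences):
    """Return the paper's (X, Y, Z) figures for a detection run.

    X: sentences that include keywords
    Y: sentences removed by keyword screening
    Z: sentences that were removed incorrectly
    Each sentence is a dict with boolean flags (names are assumptions).
    """
    x = sum(s["has_keyword"] for s in sentences)
    y = sum(s["removed"] for s in sentences)
    z = sum(s["removed_in_error"] for s in sentences)
    return x, y, z
```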
Since the representative image of each cut is detected automatically, it takes only a few minutes for us to pick those images from a 30-minute news video.

Association by Dynamic Programming:
Basic idea:
- The detected data are a sequence of key images and a sequence of key-sentences, each with a starting and ending time.
- If a key image duration and a key-sentence duration overlap sufficiently (or are close to each other) and the suggested situations are compatible, they should be associated.
Basic assumption:
- The order of the key image sequence and that of the key-sentence sequence are the same.
The basic idea is to minimize the following penalty value P:

P = Σ_{j ∈ Sn} Skip_s(j) + Σ_{k ∈ In} Skip_i(k) + Σ_{j ∈ S, k ∈ I} Match(j, k)

where S and I are the key-sentences and key images that have corresponding clues in the other modality, and Sn and In are those without corresponding clues. Skip_s is the penalty for a key-sentence without an inter-modal correspondence, Skip_i is the penalty for a key image without one, and Match(j, k) is the penalty for the correspondence between the j-th key-sentence and the k-th key image.

Association by DP - Cost Evaluation:
Skipping cost (Skip):
The penalty values are determined by the importance of the data, that is, the likelihood that each datum has an inter-modal correspondence. In this research, the importance of each clue is evaluated by the following formula, and the skip penalty is taken as Skip = -E:

E = E_type * E_data

where E_type is the evaluation of the clue type (for example, the evaluation of the type "face close-up") and E_data is the evaluation of the individual clue (for example, the face-size evaluation for a face close-up).
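The order-preserving minimization of P can be sketched as a standard edit-distance-style dynamic program. This is a minimal illustration, not the paper's code: the penalty values are plain numeric inputs here, whereas in the paper they come from the clue evaluations (Skip = -E, Match = Mtime * Mtype).

```python
def associate(skip_s, skip_i, match):
    """Align key-sentences with key images by minimizing total penalty P.

    skip_s: list, penalty for skipping sentence j
    skip_i: list, penalty for skipping image k
    match:  match[j][k], penalty for pairing sentence j with image k
            (use float('inf') when the suggested situations clash)
    Returns (P, pairs), where pairs lists the associated (j, k).
    """
    ns, ni = len(skip_s), len(skip_i)
    # cost[j][k] = best penalty for the first j sentences / k images
    cost = [[0.0] * (ni + 1) for _ in range(ns + 1)]
    for j in range(1, ns + 1):
        cost[j][0] = cost[j - 1][0] + skip_s[j - 1]
    for k in range(1, ni + 1):
        cost[0][k] = cost[0][k - 1] + skip_i[k - 1]
    for j in range(1, ns + 1):
        for k in range(1, ni + 1):
            cost[j][k] = min(
                cost[j - 1][k] + skip_s[j - 1],            # skip sentence
                cost[j][k - 1] + skip_i[k - 1],            # skip image
                cost[j - 1][k - 1] + match[j - 1][k - 1],  # associate pair
            )
    # Trace back to recover the associated pairs (order is preserved,
    # matching the basic assumption above).
    pairs, j, k = [], ns, ni
    while j > 0 and k > 0:
        if cost[j][k] == cost[j - 1][k - 1] + match[j - 1][k - 1]:
            pairs.append((j - 1, k - 1))
            j, k = j - 1, k - 1
        elif cost[j][k] == cost[j - 1][k] + skip_s[j - 1]:
            j -= 1
        else:
            k -= 1
    return cost[ns][ni], pairs[::-1]
```

Because the recurrence only ever advances through both sequences monotonically, the alignment respects the shared ordering of the two clue sequences.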
Example cost definitions:
- Key-sentence: speech 1.0, meeting 0.6, crowd 0.6, travel/visit 0.6, location 0.6
- Key image: face 1.0, people 0.6, scene 0.6

Association by DP - Cost Evaluation:
Matching cost (Match):
The evaluation of correspondences is calculated by the following formula:

Match(i, j) = M_time(i, j) * M_type(i, j)

where M_time is the duration compatibility between an image and a sentence: the more their durations overlap, the smaller the penalty becomes. A key image's duration (d_i) is the duration of the cut from which the key image is taken; the starting and ending times of a sentence in the speech are used for the key-sentence duration (d_s). Where the exact speech time is difficult to obtain, the time at which the closed caption appears is substituted. The actual values for M_type, shown in a table in the paper, are roughly determined by the number of correspondences in the sample videos.

Experiments & Results:

Usage of Results - Summarization and Presentation Tool:
- Around 70 segments are spotted in each 30-minute news video, an average of 2 to 3 segments per minute.
- If a topic is not too long, all of the segments in one topic can be placed in one window. This view can serve as a good presentation of a topic as well as a good summarization tool.
- Each picture-sentence pair is an associated pair: the picture is a key image, and the sentence is a key-sentence. The position of the pair is determined by the situations defined.
- This view lets us see at a glance how a topic is organized: visit and place information is given first, meeting information second, then a few public speeches and opinions.

Usage of Results (Continued.):
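One plausible way to realize the M_time term is as an overlap ratio between the two durations. The paper determines its exact values differently, so the formula below is an assumption; only the product form Match = M_time * M_type is taken from the text.

```python
def mtime(image_span, sentence_span):
    """Duration-compatibility penalty in [0, 1]: 0 for full overlap,
    1 for no overlap (an assumed overlap-ratio form, not the paper's).

    Spans are (start, end) in seconds. For a sentence, the time at
    which the closed caption appears may stand in for speech time.
    """
    (a0, a1), (b0, b1) = image_span, sentence_span
    overlap = max(0.0, min(a1, b1) - max(a0, b0))
    shorter = min(a1 - a0, b1 - b0)
    return 1.0 - overlap / shorter if shorter > 0 else 1.0

def match_penalty(image_span, sentence_span, mtype):
    """Match(i, j) = M_time(i, j) * M_type(i, j), as in the paper."""
    return mtime(image_span, sentence_span) * mtype
```

With this form, a face close-up cut that fully overlaps a speech key-sentence contributes no time penalty, and the penalty grows as the two intervals drift apart.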
- Data tagging to video segments
- News video topic explainer (category + time order)
- Details in the topic explainer

Conclusion:
- The idea of Spotting by Association in news video: video segments with typical semantics are detected by associating language clues with image clues.
- Most of the detected segments fit the typical situations.
- New applications using the detected news segments were proposed.

Future work:
- Improve key image and key-sentence detection.
- Check the effectiveness of this method on other kinds of videos.

Questions?
