Constructing Fuzzy Thesaurus for WWW


Published on December 6, 2007

Author: lusi

Source: authorstream.com

Constructing Fuzzy Thesaurus for WWW: Application to BeMySearch
  M. De Cock, S. Guadarrama and M. Nikravesh
  2003 BISC FLINT-CIBI
  BISC, UC Berkeley

Content:
  Introduction
  Basic Concepts
  Term Weighting
  The W^T W approach
  Association Rules
  Fuzzy Terms
  Examples
  Conclusion

Introduction:
  During the 70s-80s: small text collections, structured databases, Information Retrieval methods.
  Now: huge multimedia collections, the unstructured Web, Fuzzy Retrieval methods.

Fuzzy Thesaurus:
  A fuzzy thesaurus is a pair (T, R) consisting of a set T of terms and a set R of binary fuzzy relations.
  Examples of binary fuzzy relations are: Similarity, Broader, Narrower, Part Of, Instance Of, …

Basic Concepts:
  Document-term relation:
    Crisp W: D x T -> {0,1}
    Fuzzy W: D x T -> [0,1]
  Term-term relation R:
    Man-made: dictionaries, synonyms, ontologies, …
    Computer-made: W^T W, association rules, similarity and inclusion measures, …

Term Weighting:
  Here $f_{ij}$ is the frequency of term i in document j, n is the number of documents, and $\chi(f) = 1$ if $f > 0$ and 0 otherwise.
  Local term weights ($l_{ij}$):
    Binary: $\chi(f_{ij})$
    Logarithmic: $\log(1 + f_{ij})$
    Normalized: $\left(\chi(f_{ij}) + f_{ij} / \max_k f_{kj}\right) / 2$
    Term frequency: $f_{ij}$
  Global term weights ($g_i$):
    None: 1
    Entropy: $1 + \sum_j \frac{p_{ij} \log(p_{ij})}{\log(n)}$
    IDF: $\log\left(n / \sum_j \chi(f_{ij})\right)$
    GfIdf: $\sum_j f_{ij} / \sum_j \chi(f_{ij})$
    Normal: $1 / \sqrt{\sum_j f_{ij}^2}$
    Probabilistic Inverse: $\log\left(\left(n - \sum_j \chi(f_{ij})\right) / \sum_j \chi(f_{ij})\right)$
  Document normalization:
    None: 1
    Cosine: $\left(\sum_i (g_i l_{ij})^2\right)^{-1/2}$

Document-Term Matrix W:
  Binary
    [ 1 0 0 0 0 1 0 0 0 1 ]
    [ 0 1 0 1 0 0 0 1 1 1 ]
    [ 1 1 0 0 1 0 0 0 1 0 ]
    [ 0 0 1 0 0 0 1 0 0 1 ]
    [ 0 0 1 0 1 1 0 0 1 0 ]
    [ 0 0 0 0 0 0 1 0 1 0 ]
    [ 1 0 0 1 0 0 0 1 0 1 ]
    [ 0 1 0 0 1 1 1 0 0 0 ]
    [ 0 0 0 1 0 0 0 1 0 0 ]
    [ 1 0 0 1 1 0 0 1 1 0 ]
  TF-IDF
    [ 0.6 0   0   0   0   0.6 0   0   0   0.6 ]
    [ 0   0.5 0   0.5 0   0   0   0.5 0.5 0.5 ]
    [ 0.5 0.5 0   0   0.5 0   0   0   0.5 0   ]
    [ 0   0   0.6 0   0   0   0.6 0   0   0.6 ]
    [ 0   0   0.5 0   0.5 0.5 0   0   0.5 0   ]
    [ 0   0   0   0   0   0   0.7 0   0.7 0   ]
    [ 0.5 0   0   0.5 0   0   0   0.5 0   0.5 ]
    [ 0   0.5 0   0   0.5 0.5 0.5 0   0   0   ]
    [ 0   0   0   0.7 0   0   0   0.7 0   0   ]
    [ 0.5 0   0   0.5 0.5 0   0   0.5 0.5 0   ]

Crisp Document-Term Matrix: (figure not preserved)

Fuzzy Document-Term Matrix: (figure not preserved)

The W^T W approach: (figure not preserved)

Term-Term Matrix W^T W:
    [ 0.1033 0.0250 0      0.0450 0.0450 0.0333 0      0.0450 0.0450 0.0583 ]
    [ 0.0250 0.0700 0      0.0200 0.0500 0.0250 0.0250 0.0200 0.0450 0.0200 ]
    [ 0      0      0.0583 0      0.0250 0.0250 0.0333 0      0.0250 0.0333 ]
    [ 0.0450 0.0200 0      0.1150 0.0200 0      0      0.1150 0.0400 0.0450 ]
    [ 0.0450 0.0500 0.0250 0.0200 0.0950 0.0500 0.0250 0.0200 0.0700 0      ]
    [ 0.0333 0.0250 0.0250 0      0.0500 0.0833 0.0250 0      0.0250 0.0333 ]
    [ 0      0.0250 0.0333 0      0.0250 0.0250 0.1083 0      0.0500 0.0333 ]
    [ 0.0450 0.0200 0      0.1150 0.0200 0      0      0.1150 0.0400 0.0450 ]
    [ 0.0450 0.0450 0.0250 0.0400 0.0700 0.0250 0.0500 0.0400 0.1400 0.0200 ]
    [ 0.0583 0.0200 0.0333 0.0450 0      0.0333 0.0333 0.0450 0.0200 0.1117 ]

W^T W Term-Term Matrix: (figure not preserved)

Association Rules:
  The rows correspond to documents.
  The columns correspond to terms.
  We want to find association rules between terms.
  Rules A => B are defined by: (formulas not preserved)

Confidence or Relative Cardinality: (figure not preserved)

Compositional Approach: (figure not preserved)

Sup-Prod Composition: (figure not preserved)

Fuzzy Terms:
  The meaning of a term is a fuzzy set of documents.
    µ(t) = 0.8/d1 + 0.2/d2 + 0.0/d3 + …
  The meaning of a document is a fuzzy set of terms.
    (d) = 0.1/t1 + 0.0/t2 + 0.8/t3 + …
  Another interpretation of the document-term matrix:
    W = [µ(t1) µ(t2) µ(t3) …]
    W^T = [(d1) (d2) (d3) …]

Fuzzy Sets:
  Inclusion measures: (formulas not preserved)
  Similarity measures: (formulas not preserved)

Term-Document Matrix W^T:
  Fuzzy
    [0.6 0.0 0.5 0.0 0.0 0.0 0.5 0.0 0.0 0.4]
    [0.0 0.4 0.5 0.0 0.0 0.0 0.0 0.5 0.0 0.0]
    [0.0 0.0 0.0 0.6 0.5 0.0 0.0 0.0 0.0 0.0]
    [0.0 0.4 0.0 0.0 0.0 0.0 0.5 0.0 0.7 0.4]
    [0.0 0.0 0.5 0.0 0.5 0.0 0.0 0.5 0.0 0.4]
    [0.6 0.0 0.0 0.0 0.5 0.0 0.0 0.5 0.0 0.0]
    [0.0 0.0 0.0 0.6 0.0 0.7 0.0 0.5 0.0 0.0]
    [0.0 0.4 0.0 0.0 0.0 0.0 0.5 0.0 0.7 0.4]
    [0.0 0.4 0.5 0.0 0.5 0.7 0.0 0.0 0.0 0.4]
    [0.6 0.4 0.0 0.6 0.0 0.0 0.5 0.0 0.0 0.0]
  Crisp
    [ 1 0 0 0 0 1 0 0 0 1 ]
    [ 0 1 0 1 0 0 0 1 1 1 ]
    [ 1 1 0 0 1 0 0 0 1 0 ]
    [ 0 0 1 0 0 0 1 0 0 1 ]
    [ 0 0 1 0 1 1 0 0 1 0 ]
    [ 0 0 0 0 0 0 1 0 1 0 ]
    [ 1 0 0 1 0 0 0 1 0 1 ]
    [ 0 1 0 0 1 1 1 0 0 0 ]
    [ 0 0 0 1 0 0 0 1 0 0 ]
    [ 1 0 0 1 1 0 0 1 1 0 ]

2-D Projection of W: (figure not preserved)

Fuzzy terms 1, 9: (figure not preserved)

Similarity using Min: (figure not preserved)

Similarity using Prod: (figure not preserved)

Fuzzy Terms 2:
  The meaning of a term is a fuzzy set of terms.
    µ(t) = 0.5/t1 + 1.0/t2 + 0.2/t3 + …
  The meaning of a document is a fuzzy set of documents.
    (d) = 0.1/d1 + 0.5/d2 + 0.1/d3 + …
  Another interpretation of the term-term and document-document matrices:
    T = [µ(t1) µ(t2) µ(t3) …]
    D = [(d1) (d2) (d3) …]

Application to BeMySearch:
  Query Expansion.
  Query Refinement.
  Re-Ranking.
  Navigation.
  User Profile.
  …

Conclusions:
  General framework:
    W^T W
    Association rules
    Fuzzy relation composition
    Fuzzy terms
  The framework relies on a fuzzy document-term relation.
  The approach has traditionally been probabilistic.
  A genuinely fuzzy approach is needed.

Future Work:
  More relations:
    Sentence - Term
    Paragraph - Sentence
    Document - Paragraph
    Document - Document
  Clustering techniques:
    Clusters of documents, paragraphs or sentences.
    Clusters of terms.

Questions & Comments
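
The Term Weighting slides list local weights, global weights and a document normalization that are multiplied together to obtain one entry of the fuzzy document-term relation. The sketch below shows one possible combination (logarithmic local weight, entropy global weight, cosine normalization). The frequency table F is invented toy data, the definition p_ij = f_ij / sum_j f_ij is an assumption (the slides do not define p_ij), and all function names are hypothetical.

```python
import math

# F[i][j] = raw frequency of term i in document j.
# Toy data invented for illustration; it does not come from the slides.
F = [
    [1, 1, 1, 1],  # term 0: spread evenly over all documents
    [3, 0, 0, 0],  # term 1: concentrated in a single document
    [2, 0, 1, 0],  # term 2
]
N_DOCS = len(F[0])


def chi(f):
    """Indicator: 1 if the term occurs at all, 0 otherwise."""
    return 1 if f > 0 else 0


def local_log(f):
    """Logarithmic local weight log(1 + f_ij)."""
    return math.log(1 + f)


def local_normalized(i, j):
    """Normalized local weight (chi(f_ij) + f_ij / max_k f_kj) / 2."""
    max_f = max(F[k][j] for k in range(len(F)))
    return (chi(F[i][j]) + F[i][j] / max_f) / 2 if max_f else 0.0


def global_idf(i):
    """IDF global weight; assumes term i occurs in at least one document."""
    return math.log(N_DOCS / sum(chi(f) for f in F[i]))


def global_entropy(i):
    """Entropy global weight, assuming p_ij = f_ij / sum_j f_ij."""
    total = sum(F[i])
    acc = 0.0
    for f in F[i]:
        if f > 0:  # p * log(p) is taken as 0 when p = 0
            p = f / total
            acc += p * math.log(p)
    return 1 + acc / math.log(N_DOCS)


def weight(i, j):
    """Log-entropy weight of term i in document j, cosine-normalized."""
    column = [global_entropy(k) * local_log(F[k][j]) for k in range(len(F))]
    norm = math.sqrt(sum(x * x for x in column))
    return column[i] / norm if norm else 0.0
```

With this toy data the evenly spread term 0 gets an entropy weight of 0 and the concentrated term 1 gets a weight of 1, which is the intended behavior of the entropy scheme: terms that occur everywhere carry little information about any single document.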
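
The term-term matrix shown on the W^T W slide can be recomputed from the binary document-term matrix. The sketch below rests on an observation from the numbers, not on anything the slides state: the printed entries are consistent with cosine-normalizing each document row of the binary matrix and then dividing W^T W by the number of documents. The function names are hypothetical.

```python
import math

# Binary document-term matrix from the slides (rows = documents, columns = terms).
W_BINARY = [
    [1, 0, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0, 1, 0, 0, 1],
    [0, 0, 1, 0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0, 0, 1, 1, 0],
]


def cosine_normalize_rows(matrix):
    """Scale every document row to unit Euclidean length."""
    result = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        result.append([x / norm if norm else 0.0 for x in row])
    return result


def term_term(matrix):
    """Return (W^T W) / n_docs, the averaged term co-occurrence matrix."""
    n_docs, n_terms = len(matrix), len(matrix[0])
    return [
        [sum(row[a] * row[b] for row in matrix) / n_docs for b in range(n_terms)]
        for a in range(n_terms)
    ]


W = cosine_normalize_rows(W_BINARY)
T = term_term(W)

print(round(T[0][0], 4))  # 0.1033, the top-left entry on the slide
print(round(T[3][7], 4))  # 0.115, terms 4 and 8 occur in exactly the same documents
```

Terms 4 and 8 have identical columns in W, so their rows in the term-term matrix coincide, which is why the slide shows two identical rows.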
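
The formulas for confidence, inclusion and similarity did not survive in this transcript, so the following is only a sketch under stated assumptions. It treats the confidence of a rule A => B as the relative cardinality |A ∩ B| / |A| of two fuzzy sets of documents, and similarity as a Jaccard-style ratio |A ∩ B| / |A ∪ B|, with the intersection taken by min or by product. These are common choices in the fuzzy-set literature and are not confirmed by the slides. The two vectors are rows 1 and 9 of the fuzzy term-document matrix above; the function names are hypothetical.

```python
# Fuzzy terms 1 and 9: membership of each of the 10 documents in the term.
TERM_1 = [0.6, 0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.4]
TERM_9 = [0.0, 0.4, 0.5, 0.0, 0.5, 0.7, 0.0, 0.0, 0.0, 0.4]


def t_min(a, b):
    """Minimum t-norm."""
    return min(a, b)


def t_prod(a, b):
    """Product t-norm."""
    return a * b


def confidence(a, b, t_norm):
    """Relative cardinality |A and B| / |A|, read as confidence of A => B."""
    size_a = sum(a)
    return sum(t_norm(x, y) for x, y in zip(a, b)) / size_a if size_a else 0.0


def similarity(a, b, t_norm):
    """Jaccard-style similarity, with |A or B| = |A| + |B| - |A and B|."""
    inter = sum(t_norm(x, y) for x, y in zip(a, b))
    union = sum(a) + sum(b) - inter
    return inter / union if union else 0.0


print(round(confidence(TERM_1, TERM_9, t_min), 3))   # 0.45
print(round(confidence(TERM_9, TERM_1, t_min), 3))   # 0.36
print(round(similarity(TERM_1, TERM_9, t_min), 3))   # 0.25
print(round(confidence(TERM_1, TERM_9, t_prod), 3))  # 0.205
```

Confidence is asymmetric (0.45 in one direction, 0.36 in the other), which makes it a candidate for directed relations such as Broader and Narrower, while the symmetric ratio suits the Similarity relation.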
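
The Compositional Approach and Sup-Prod Composition slides survive only as titles. As a minimal sketch, the standard sup-product composition of two fuzzy relations R and S is (R ∘ S)(x, z) = sup_y R(x, y) · S(y, z); on finite sets the supremum is a maximum. Applying it to a term-document relation and its transpose to get a term-term relation is an assumption about how the slides use it, and the small matrix below is toy data.

```python
def sup_prod(r, s):
    """Sup-product composition of fuzzy relations given as nested lists.

    r is an X-by-Y matrix, s is a Y-by-Z matrix; the result is X-by-Z.
    """
    inner = len(s)
    cols = len(s[0])
    return [
        [max(row[y] * s[y][z] for y in range(inner)) for z in range(cols)]
        for row in r
    ]


def transpose(m):
    """Swap rows and columns of a nested-list matrix."""
    return [list(col) for col in zip(*m)]


# Toy term-document relation: 2 terms over 3 documents (illustrative values).
TERM_DOC = [
    [0.6, 0.0, 0.5],
    [0.0, 0.4, 0.5],
]

# Term-term relation: how strongly two terms are linked through their best shared document.
TERM_TERM = sup_prod(TERM_DOC, transpose(TERM_DOC))
```

Unlike the matrix product behind W^T W, which adds up contributions from all documents, this composition keeps only the single strongest linking document for each pair of terms.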
