Lecture15 2003 10 14 FINAL

Published on February 27, 2008

Author: Talya

Source: authorstream.com

Lecture 14: MPEG-7:
  Prof. Ray Larson & Prof. Marc Davis
  UC Berkeley SIMS
  Tuesday and Thursday 10:30 am - 12:00 pm
  Fall 2003
  http://www.sims.berkeley.edu/academics/courses/is202/f03/
  SIMS 202: Information Organization and Retrieval

Lecture Overview:
  Review
    XML and Markup
  MPEG-7
    MPEG-7 Standard
    MPEG-7 Tools
  Discussion Questions
  Action Items for Next Time

XML as a Common Syntax:
  XML (and SGML) provide a way of expressing the structure of documents that can be verified and validated by document processing systems
  “Documents” can be metadata structures
    Such as the description of a particular photograph in our Phone project
  XML thus provides a way of representing metadata descriptions as well as the content that they describe

XML as a Common Syntax:
  All XML documents follow some simple rules that make them interchangeable and usable across different systems
    All data and markup are in Unicode
    All elements are marked by begin and end tags
    All markup is case-sensitive
  XML DTDs and/or Schemas define the valid structure (and sometimes content) of the documents

Document Type Definitions:
  The DTD describes the structural elements and "shorthand" markup for a particular document type and defines:
    Names of "legal" elements
    How many times elements can appear
    The order of elements in a document
    Whether markup can be omitted (SGML only)
    Contents of elements (i.e., nested structures)
    Attributes associated with elements
    Names of "entities"
    Short-hand conventions for element tags (SGML only)

What are XML Schemas?:
  An XML vocabulary for expressing your data's structure AND content types, and even the business rules involved in processing the data
  Written in XML themselves
  Support namespaces for combining multiple schemas in the same documents
  The slides in this section are based on an XML tutorial by Roger L. Costello

Why Schemas? (Motivation for XML Schemas):
  People are dissatisfied with DTDs
    It's a different syntax
      You write your XML (instance) document using one syntax and the DTD using another syntax: bad, inconsistent
    Limited datatype capability
      DTDs support a very limited capability for specifying datatypes. You can't, for example, express "I want the <elevation> element to hold an integer with a range of 0 to 12,000"
      Desire a set of datatypes compatible with those found in databases
      DTDs support 10 datatypes; XML Schemas support 44+ datatypes

Highlights of XML Schemas:
  XML Schemas are a tremendous advancement over DTDs:
    Enhanced datatypes
      44+ versus 10
      Can create your own datatypes
        Example: "This is a new type based on the string type and elements of this type must follow this pattern: ddd-dddd, where 'd' represents a digit".
    Written in the same syntax as instance documents
      Less syntax to remember
    Object-oriented'ish
      Can extend or restrict a type (derive new type definitions on the basis of old ones)
    Can express sets, i.e., can define the child elements to occur in any order

What is the Problem?:
  Today people cannot easily create, find, edit, share, and reuse media
  Computers don’t understand media content
    Media is opaque and data rich
    We lack structured representations
  Without content representation (metadata), manipulating digital media will remain like word-processing with bitmaps

The Search for Solutions:
  Current approaches to creating metadata don’t work
    Signal-based analysis
    Keywords
    Natural language
  Need standardized metadata framework
    Designed for video and rich media data
    Human and machine readable and writable
    Standardized and scalable
    Integrated into media capture, archiving, editing, distribution, and reuse

Standards Overview:
  Why do we need multimedia standards?
    Reliability
    Scalability
    Interoperability
    Layered architecture
  De facto standards
    Not legislated, but widely adopted
  De jure standards
    Legislated, but not necessarily widely adopted

Multimedia Standards Process:
  Market dominance
    Microsoft
      Examples: Internet Explorer, Windows Media Player
    Sony
      Examples: VHS, MiniDV
    Adobe
      Examples: PDF
  International standards organizations
    ISO
    MPEG
    SMPTE

MPEG Standards:
  MPEG-1
    Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/sec
  MPEG-2
    Generic coding of moving pictures and associated audio information
  MPEG Audio Layer-3 (MP3)
    Audio compression
  MPEG-4
    Standardized technological elements enabling the integration of production, distribution, and content access paradigms

MPEG-4:
  Represents units of aural, visual or audiovisual content, called “media objects”
    These media objects can be of natural or synthetic origin (this means they could be recorded with a camera or microphone, or generated with a computer)
  Describes the composition of these objects to create compound media objects that form audiovisual scenes
  Synchronizes the data associated with media objects, so that they can be transported over network channels providing a QoS appropriate for the nature of the specific media objects
  Interacts with the audiovisual scene generated at the receiver’s end

MPEG Standards:
  MPEG-7
    Describing the multimedia content data that supports some degree of interpretation of the information’s meaning, which can be passed onto, or accessed by, a device or a computer code
  MPEG-21
    A normative open framework for multimedia delivery and consumption for use by all the players in the delivery and consumption chain

MPEG-7 Motivation:
  Create standardized multimedia description framework
  Enable content-based access to and processing of multimedia information on the basis of descriptions of multimedia content and structure (metadata)
  Support range of abstraction levels for metadata from low-level signal characteristics to high-level semantic information

MPEG-7 Query Examples:
  Play a few notes on a keyboard and retrieve a list of musical pieces similar to the required tune, or images matching the notes in a certain way, e.g., in terms of emotions
  Draw a few lines on a screen and find a set of images containing similar graphics, logos, ideograms,...
  Define objects, including color patches or textures, and retrieve examples among which you select the interesting objects to compose your design
  On a given set of multimedia objects, describe movements and relations between objects and so search for animations fulfilling the described temporal and spatial relations
  Describe actions and get a list of scenarios containing such actions
  Use an excerpt of Pavarotti’s voice to obtain a list of Pavarotti’s records, video clips where Pavarotti is singing, and photographic material portraying Pavarotti

MPEG-7 Sample Application Areas:
  Architecture, real estate, and interior design (e.g., searching for ideas)
  Broadcast media selection (e.g., radio channel, TV channel)
  Cultural services (history museums, art galleries, etc.)
  Digital libraries (e.g., image catalogue, musical dictionary, bio-medical imaging catalogues, film, video and radio archives)
  E-Commerce (e.g., personalized advertising, on-line catalogues, directories of e-shops)
  Education (e.g., repositories of multimedia courses, multimedia search for support material)
  Home Entertainment (e.g., systems for the management of personal multimedia collections, including manipulation of content, e.g. home video editing, searching a game, karaoke)
  Investigation services (e.g., human characteristics recognition, forensics)
  Journalism (e.g. searching speeches of a certain politician using his name, his voice or his face)
  Multimedia directory services (e.g. yellow pages, Tourist information, Geographical information systems)
  Multimedia editing (e.g., personalized electronic news service, media authoring)
  Remote sensing (e.g., cartography, ecology, natural resources management)
  Shopping (e.g., searching for clothes that you like)
  Social (e.g. dating services)
  Surveillance (e.g., traffic control, surface transportation, non-destructive testing in hostile environments)

MPEG-7 Scope: [figure]

MPEG-7 Metadata Framework:
  Data
    “multimedia information that will be described using MPEG-7, regardless of storage, coding, display, transmission, medium, or technology.”
  Feature
    “a distinctive characteristic of the data [that] signifies something to somebody.”

MPEG-7 Metadata Framework:
  Descriptor
    “A representation of a Feature. A Descriptor defines the syntax and the semantics of the Feature representation.”
  Description Scheme
    “The structure and semantics of the relationships between its components, which may be both Descriptors and Description Schemes.”
  Description Definition Language (XML Schema)
    “A language that allows the creation of new Description Schemes, and, possibly, new Descriptors.
    It also allows the extension and modification of existing Description Schemes.”

MPEG-7 Framework: [figure]

MPEG-7 Standard Parts:
  MPEG-7 Systems
    The binary format for encoding MPEG-7 descriptions and the terminal architecture
  MPEG-7 Description Definition Language
    The language for defining the syntax of the MPEG-7 Description Tools and for defining new Description Schemes
  MPEG-7 Visual
    The Description Tools dealing with (only) Visual descriptions
  MPEG-7 Audio
    The Description Tools dealing with (only) Audio descriptions

MPEG-7 Standard Parts:
  MPEG-7 Multimedia Description Schemes
    The Description Tools dealing with generic features and multimedia descriptions
  MPEG-7 Reference Software
    A software implementation of relevant parts of the MPEG-7 Standard with normative status
  MPEG-7 Conformance Testing
    Guidelines and procedures for testing conformance of MPEG-7 implementations
  MPEG-7 Extraction and Use of Descriptions
    Informative material (in the form of a Technical Report) about the extraction and use of some of the Description Tools (under development)

MPEG-7 Description Tools: [figure]
MPEG-7 Top Level Hierarchy: [figure]
MPEG-7 Still Image Description: [figure]
Referencing Temporal Media: [figure]
Spatio-Temporal Region: [figure]
MPEG-7 Video Segments Example: [figure]
MPEG-7 Segment Relationship Graph: [figure]
MPEG-7 Conceptual Description: [figure]
MPEG-7 Summaries: [figure]
MPEG-7 Collections: [figure]
MPEG-7 Application Framework: [figure]

MPEG-7 Applications Today:
  IBM MPEG-7 Annotation Tool
    Assists in annotating video sequences with MPEG-7 metadata
  Ricoh MPEG-7 MovieTool
    A tool for interactively creating video content descriptions conforming to MPEG-7 syntax
  Canon MPEG-7 Speech Recognition engine
    Web site allows you to create an MPEG-7 Audio “SpokenContent” description file from an audio file in “wav” format

IBM MPEG-7 Annotation Tool: [figure]

IBM MPEG-7 Annotation Tool:
  The IBM MPEG-7 Annotation Tool assists in annotating video sequences with MPEG-7 metadata
  Each shot in the video sequence can be annotated with static scene descriptions, key object descriptions, event descriptions, and other lexicon sets
  The annotated descriptions are associated with each video shot and are stored as MPEG-7 descriptions in an XML file
  Can also open MPEG-7 files in order to display the annotations for the corresponding video sequence
  Customized lexicons can be created, saved, downloaded, and updated

Ricoh MovieTool:
  Creates an MPEG-7 description by loading video data
  Provides visual clues to aid the user in creating the structure of the video
  Automatically reflects the structure in the MPEG-7 descriptions
  Visually shows the relationship between the structure and MPEG-7 descriptions
  Presents candidate tags to help choose appropriate MPEG-7 tags
  Validates the MPEG-7 descriptions against the MPEG-7 schema
  Can describe all metadata defined in MPEG-7
  Is able to reflect any future changes and extensions made to the MPEG-7 schema

Canon MPEG-7 ASR Tool: [figure]

MPEG-7 Resources:
  http://mpeg.telecomitalialab.com/
  http://www.mpeg-industry.com/
  http://www.josseybass.com/WileyCDA/WileyTitle/productCd-0471486787.html

MPEG-7 Future:
  New application specific profiles
  Integration into media production and reuse cycle
  Automated metadata creation in devices
  Use of MPEG-7 metadata in multimedia applications
  MPEG-21

Discussion Questions (MPEG-7):
  Lisa de Larios-Heiman on MPEG-7
    MPEG-7 is generic “so not all descriptive tools are necessary for all applications”
    Do you believe MPEG-7 as described in the two papers has avoided being too generic?
    Could each application be too specific, affecting their interoperability?

Discussion Questions (MPEG-7):
  Lisa de Larios-Heiman on MPEG-7
    The developers of MPEG-7 state that they are leaving the best methods for feature extraction to be decided in the marketplace. Similarly, “competition and innovation will produce the best results” for the consumption-end of the chain.
    Their papers do not express any concern that those companies that already dominate the software marketplace might dominate this niche as well, either providing inferior products or developing proprietary flavors of MPEG-7. Either of these two scenarios would affect the success of MPEG-7 and interoperability of applications, but are they very likely?

Discussion Questions (MPEG-7):
  Megan Finn on MPEG-7
    What obstacles will there be for the adoption of MPEG-7? For example, people will be required to learn a visual description tool. What can be done to enable technology adoption? Who will be the first groups to adopt MPEG-7?

Discussion Questions (MPEG-7):
  Megan Finn on MPEG-7
    Media Streams and MPEG-7 have many of the same goals. How could Media Streams use some of the description tools of MPEG-7 (and vice versa)? For example, it seems that the Audio-Visual Description Scheme could use Media Streams' annotation system.

Discussion Questions (MPEG-7):
  Jesse Mendelsohn on MPEG-7
    The authors say that the actual methods of extracting features are not part of the MPEG-7 standard because it is not within the standard's scope and because competition among providers will cause innovation in extraction methods. Does this mean the Description format of the MPEG-7 standard runs the risk of being deemed difficult, too detailed, too abstract, or even impossible to comply with once the creation of applications for feature extraction is attempted? Otherwise stated, is the creation of a standard this detailed comparable to inventing the car before the wheel?

Discussion Questions (MPEG-7):
  Jesse Mendelsohn on MPEG-7
    Who is MPEG-7 specifically for? What groups of people are going to be capable of adopting and using it? Images, moving images, sound, and semantics are not only part of motion pictures but many other things as well (e.g., medical imaging). Is the MPEG-7 standard described here powerful enough so that specialized communities can adapt it to their own needs?

Discussion Questions (MPEG-7):
  Jeannie Yang on MPEG-7
    An important factor of search success or search quality nowadays is ranking, or how relevant the search results are to the search terms. Is this concept applicable to searching through MPEG-7 media content at the highest abstraction level, semantic information? If it is, how does one determine ranking for semantic content? Who is to say that one video of the “sun setting” is more relevant or correct than another video of the “sun setting”? Who determines if one video is more “passionate” in its depiction of the sun setting than another?

Discussion Questions (MPEG-7):
  Jeannie Yang on MPEG-7
    Also, if a ranking system of semantic content does prevail, is there a danger of a dwindling supply of creative interpretations because media re-use would always use the same image? For example, if a stock image of a “setting sun” is always re-used, it becomes the de facto image and no more images of the “setting sun” will be taken or found.
    On the other hand, if the ranking or relevance concept is not applicable to searching through MPEG-7 media at the semantic level, would searching still be a useful application? How does MPEG-7 make searching for semantic content easier in this case, when there may be a thousand videos on the sun setting?

Discussion Questions (MPEG-7):
  Joseph Hall on MPEG-7
    Will businesses have to pay royalties to be able to use the MPEG-7 standard? How does this affect how this "standard" penetrates the global community? (A corollary: what's the impetus behind charging royalties for the use of standards?)

Phone Project Presentations:
  Flamenco Markup and Browsing of Photos
  COMING SOON!

Readings for Next Time:
  Introduction to IR and the Search Process (RRL)
  MIR Ch. 1
  Social Navigation of Information in Space (Alan Munro, Kristina Hook, and David Benyon)
  Where Did You Put It? Issues in the Design and Use of a Group Memory (Lucy Berlin et al.)
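
Worked example for the "Why Schemas?" and "Highlights of XML Schemas" slides: a DTD cannot state that <elevation> holds an integer from 0 to 12,000, or that a string must follow the ddd-dddd pattern. XML Schema expresses such rules declaratively, through facets such as minInclusive, maxInclusive, and pattern. The sketch below is not XML Schema and does not use a schema validator. It is a minimal Python illustration, using only the standard library, of the checks those two constraints amount to. The <site> and <phone> element names, the sample document, and all function names are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# The two constraints quoted on the slides, written as plain checks.
INTEGER = re.compile(r"[+-]?[0-9]+")        # lexical form of an integer
DDD_DDDD = re.compile(r"[0-9]{3}-[0-9]{4}")  # the "ddd-dddd" pattern


def check_elevation(text):
    """Return True if text is an integer in the range 0 to 12,000."""
    text = text.strip()
    if INTEGER.fullmatch(text) is None:
        return False
    return 0 <= int(text) <= 12000


def check_pattern(text):
    """Return True if text matches ddd-dddd, where d is a digit."""
    return DDD_DDDD.fullmatch(text.strip()) is not None


def validate(xml_text):
    """Collect constraint violations from a small instance document."""
    root = ET.fromstring(xml_text)
    errors = []
    for element in root.iter("elevation"):
        if not check_elevation(element.text or ""):
            errors.append(f"elevation out of range or not an integer: {element.text!r}")
    for element in root.iter("phone"):  # hypothetical element name
        if not check_pattern(element.text or ""):
            errors.append(f"phone does not match ddd-dddd: {element.text!r}")
    return errors


# Hypothetical instance document: well-formed XML, but three values break the rules.
SAMPLE = """
<site>
  <elevation>5280</elevation>
  <elevation>12001</elevation>
  <elevation>high</elevation>
  <phone>555-1234</phone>
  <phone>5551234</phone>
</site>
"""

if __name__ == "__main__":
    for message in validate(SAMPLE):
        print(message)
```

Every value in the sample is acceptable to a DTD that declares these elements as character data. Only the datatype-level checks separate the valid values from the invalid ones, which is the gap the slides say XML Schema was designed to close.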

