U William

Information about U William

Published on December 29, 2007

Author: Gabir

Source: authorstream.com

Myanmar Unicode Implementation Standards: William W.L.K (Contribution Member, Myanmar Unicode and NLP Lab). Thursday, May 04, 2006. The 5th Myanmar ICT Week 2006.

Overview: Myanmar Unicode Encoding Standard (simplified, standardized, and finalized). Myanmar Unicode implementations: rendering technologies (SIL Graphite, m17n, Uniscribe, Pango, ICU), applied, tested, enabled. Font technologies (OTF, TTF, pseudo Unicode): developed, tested, applied, debugged. Are we there yet? Simply, NO!

Myanmar Unicode Encoding Standard: Current standard (Unicode 4.1). Accepted new standard: ISO/IEC JTC1/SC2/WG2 N3043.

Myanmar Unicode Implementations: Localization/rendering technologies and standards: SIL Graphite, m17n, ICU, Pango OTF support.

Slide 5 (diagram of rendering paths through Pango). Labels: Pango, Pango modules, M17N Pango module, M17N libs, Graphite Pango module, application hack for Graphite-enabled applications, OTF Pango module, SCIM input method, Pango-supported applications.

Rendering Technologies and Standards: PANGO.
M17N (Multilingualization): Font Layout Table (.flt). Supports many Asian scripts. Pango module. Our Japanese friends (Dr. Handas and Dr. Takahashi).
Graphite: Graphite Description Language (.gdl). Supports many non-Roman scripts. Works well on M$ platforms. Pango hacks on the way. Our English friends (Mr. Martin Hoskin and Mr. Keith Stribley).
Myanmar OTF Pango module: by our Myanmar friend U Tin Myo Htet.
ICU (International Components for Unicode): Rendering for Myanmar already done! Java and C++ ready! Largely used by OpenOffice. We use it for "collation". Our Spanish friend (Dr. Javier Sola, living in Cambodia).

Myanmar1 OTF Font by the Lab

What are the things you do most in Data Processing?
Sorting (collation) and searching (tokenization).

Collation: The process of ordering units of textual information. Collation is usually specific to a particular language. Also known as alphabetizing or alphabetic sorting. The general term for the process and function of determining the sorting order of strings of characters. The culturally expected ordering of linguistic characters in a particular language.

Collation is not uniform!

So what? How can you build a Myanmar collation algorithm?
Canonical encoding order for Myanmar (Unicode 4: Table 10-3): http://www.unicode.org/versions/Unicode4.0.0/ch10.pdf
The "generic" Unicode Collation Algorithm: http://www.unicode.org/unicode/reports/tr10/
The generic algorithm is too generic for a script as complex as Myanmar. We need a "tailored" Unicode Collation Algorithm for Myanmar. (We need custom rules!)

Myanmar Collation Algorithm: Myanmar collation can be split into a 5-stage process. A generic Myanmar syllable can be encoded as: <consonant> <medial> <vowel> <final> <tone>. This is sorted in the order: 1. <consonant> 2. <medial> 3. <final> 4. <vowel> 5. <tone>

Consonant

Medial

Finals: Consonants followed by U+1039. If the virama is visible, a U+200C follows! If the U+200C is omitted, the following consonant is stacked underneath the final.

Vowels

Tones

Other Issues: Independent vowels. Contractions. Short forms. Myanmar symbols (various signs). Myanmar punctuation. Myanmar digits.

Examples

Myanmar Collation Algorithm Implementation: ICU (International Components for Unicode): ICU-compatible locale by Keith. Used in OO (OpenOffice), can test-sort!
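The five-stage ordering from the Myanmar Collation Algorithm slide can be sketched in a few lines. This is only an illustration, not the lab's or Keith's implementation: it assumes each syllable has already been split into its five parts, and it compares parts by raw code point, which is a stand-in for the real tailored weights. The names `Syllable`, `collation_key`, and `sort_words` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Syllable(NamedTuple):
    """Parts of one syllable, stored in encoding order (hypothetical type)."""
    consonant: str
    medial: str = ""
    vowel: str = ""
    final: str = ""
    tone: str = ""


def collation_key(s: Syllable) -> Tuple[str, str, str, str, str]:
    # Reorder from encoding order (consonant, medial, vowel, final, tone)
    # to sorting order (consonant, medial, final, vowel, tone).
    # Assumption: plain code point comparison stands in for real weights.
    return (s.consonant, s.medial, s.final, s.vowel, s.tone)


def sort_words(words: List[List[Syllable]]) -> List[List[Syllable]]:
    # A word is a list of syllables; compare words syllable by syllable.
    return sorted(words, key=lambda w: [collation_key(s) for s in w])
```

With this key, an open syllable (no final) sorts before a closed syllable on the same consonant regardless of their vowels, because the final is compared before the vowel. Sorting the raw encoding order would compare the vowel first and can give the opposite result.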
glibc implementation: Myanmar locale (by Myanmar NLP) and collation (by Keith). my_MM is used by GTK applications on Linux.

Sorting in action (in OO)

ICU challenges "Collation".

Searching: Tokenizing Myanmar. Tokenizing refers to the process of parsing a string and splitting it into different segments or tokens. Useful for searching: allows keyword indexes to be built for searching. May also be applicable for identifying syllables for line-breaking purposes. Traditional space-based tokenizing does not work well with Myanmar.

Syllable-based tokenizing using a pair comparison:
Step 1: Assign classes for each Myanmar code point.
Step 2: Analyze a potential break point by comparing the class of the code point before and after. In many cases this is enough to determine the break status.
Step 3: In a few cases more context-sensitive analysis is required.

Details of tokenizing: You don't want to know, trust me.

Applications for the tokenizing algorithm: Syllable-based line-breaking algorithm. Indexing text using a search engine library, e.g. Apache Lucene (Java & C++). Processing text into syllables for lexicon analysis, e.g.
Machine Translation (MT Engine). Checking for encoding errors: the pair algorithm can be used to detect invalid sequences and duplicate codes.

References:
http://www.thanlwinsoft.org/ by Keith Stribley
http://www.thanlwinsoft.org/ThanLwinSoft/MyanmarUnicode/Sorting/MyanmarCollation.pdf
http://www.unicode.org/versions/Unicode4.0.0/ch10.pdf
http://www.unicode.org/notes/tn11/
http://www.unicode.org/unicode/reports/tr10/
http://www.unicode.org/faq/collation.html
http://icu.sourceforge.net/userguide/Collate_Intro.html
http://en.wikipedia.org/wiki/Locale
http://www.mcf.org.mm/unicode/
http://download.microsoft.com/download/2/d/a/2daed6fd-9876-4894-92c2-4ffc51ce5c1a/collationintro-current.ppt
http://www.microsoft.com/typography/developers/uniscribe/
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3043.pdf
http://www.unicode.org/charts/PDF/U1000.pdf

Thanks! william.wlk@gmail.com
Any questions?
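The three tokenizing steps from the Searching slides (assign classes, compare the pair around a candidate break, fall back to context) might look roughly like the sketch below. The slides do not give the actual class table or rules, so everything here is an assumption for illustration: the class ranges follow the Unicode 4.1 Myanmar block, medials are taken to be U+1039 followed by ya, ra, wa, or ha, and a consonant followed by U+1039 and anything else is treated as a final. The names `char_class`, `is_break`, and `syllables` are hypothetical.

```python
from typing import List

CONSONANT, VOWEL, SIGN, VIRAMA, ZWNJ, OTHER = "C", "V", "S", "K", "Z", "O"
MEDIALS = {"\u101A", "\u101B", "\u101D", "\u101F"}  # ya, ra, wa, ha (assumed)


def char_class(ch: str) -> str:
    """Step 1: assign a class to a code point (simplified, assumed ranges)."""
    cp = ord(ch)
    if 0x1000 <= cp <= 0x1021:
        return CONSONANT
    if 0x102C <= cp <= 0x1032:
        return VOWEL
    if 0x1036 <= cp <= 0x1038:
        return SIGN
    if cp == 0x1039:
        return VIRAMA
    if cp == 0x200C:
        return ZWNJ
    return OTHER  # everything else is lumped together in this sketch


def is_break(text: str, i: int) -> bool:
    """Is there a syllable break between text[i - 1] and text[i]?"""
    before, after = char_class(text[i - 1]), char_class(text[i])
    # Step 2: pair comparison of the classes before and after.
    if after in (VOWEL, SIGN, VIRAMA, ZWNJ):
        return False  # dependent marks attach to what precedes them
    if after == OTHER:
        return before != OTHER  # keep runs of unclassified characters together
    if before == VIRAMA:
        return False  # consonant is stacked or a medial
    if before == OTHER:
        return True
    # Step 3: context. A consonant followed by a virama starts a new
    # syllable only if the virama introduces a medial; otherwise it is a final.
    if i + 1 < len(text) and char_class(text[i + 1]) == VIRAMA:
        following = text[i + 2] if i + 2 < len(text) else ""
        return following in MEDIALS
    return True


def syllables(text: str) -> List[str]:
    """Split text at every position where is_break reports a break."""
    if not text:
        return []
    out, start = [], 0
    for i in range(1, len(text)):
        if is_break(text, i):
            out.append(text[start:i])
            start = i
    out.append(text[start:])
    return out
```

Under these assumptions a stacked pair stays inside one token, which suits line breaking, since a line cannot be broken inside a stack. A real implementation would need a fuller class table covering independent vowels, digits, and punctuation.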
