SWP - A Generic Language Parser

Published on October 7, 2007

Author: kamaelian

Source: slideshare.net


This talk was part tongue in cheek, part serious, but entirely fun, and was given twice as a lightning talk: once at EuroPython and once at ACCU Python UK 05. It presents a generic Python-like language parser which does actually work. Think of it as an alternative to brackets in Lisp!

“SWP” A Generic Language Parser (Gloop?)
(SWP == Semantic Whitespace Parser for want of a better name)
Michael Sparks

Parse Anything
● Got bored of seeing “use Prothon”... “no”
● Hacking python to add a keyword whilst trivial wasn't trivial enough
● Got bored of seeing “use prothon's replacement”
● Thought it might be a fun thing to try
● Got very bored of seeing “use the replacement for prothon's replacement” etc

Parse Anything
Parse this:

    def displayResult(result,quiet):
        if not quiet:
            print "The result of parsing your program:"
            print result
            print
            if not result:
                print "Rule match/evaluation order"
                for rule in r:
                    print " ", rule
                end
            end
        else:
            if result is None:
                print "Parse failed"
            else:
                print "Success"
            end
        end
    end

Parse Anything

    # Parse this:
    # Sample logo like language using the parser
    #
    shape square:
        pen down
        repeat 4:
            forward 10
            rotate 90
        end
        pen up
    end

    repeat (360/5):
        square()
        rotate 5
    end

Parse Anything
Parse this:

    #
    # Example based on defining grammars for L-Systems.
    #
    OBJECT tree L_SYSTEM:
        ROOT G
        RULES:
            G -> T { G } { A G } { B G } { CG} (0.00 .. 0.15)
            G -> T { A B G } { B A G } { C AG} (0.15 .. 0.30)
            G -> T { A C G } { B B G } { C BG} (0.30 .. 0.45)
            G -> T { A A G } { B C G } { C CG} (0.45 .. 0.60)
            G -> T { A G } { C G } (0.70 .. 0.80)
            G -> T { A G } { B G } (0.80 .. 0.95)
            G -> T { A G } (0.95 .. 1.00)
            T -> T (0.00 .. 0.75)
        ENDRULES
    ENDOBJECT

Parse Anything
Parse this:

    #
    # An SML-like language using this parser.
    #
    structure Stk = struct :
        exception EmptyStack_exception
        datatype 'x stack = EmptyStack | push of ('x * 'x stack)
        fun pop(push(x,y)) = y
        fun pop EmptyStack = raise EmptyStack_exception
        fun top(push(x,y)) = x
        fun top EmptyStack = raise EmptyStack_exception
    end

Parse Anything, etc

    EXPORT OBJECT person:
        PRIVATE:
            flat name, telephone address::PTR TO LONG telephone
        ENDATTRS
    ENDOBJECT

    PROC compare_address(address1::PTR TO LONG, address2::PTR TO LONG):
        # Returns *TRUE* if the address2 exists _inside address1
        DEF result=TRUE, f
        FOR f:=0 TO 5:
            IF address2[f]:
                IF Not(((StrLen address2[f])==0) AND ((StrLen address1[f])==0)):
                    # The following line incorrectly(?) says that a
                    # NULL string does not exist inside a NULL string.
                    # The IF above corrects this
                    result:=result AND ( ((InStr address1[f],address2[f])<>-1) OR ((StrLe
                ENDIF
            ENDIF
        ENDFOR
    ENDPROC result

Parse This?!

    OBJECT tree L_SYSTEM:
        ROOT G
        structure Stk = struct :
            exception EmptyStack_exception
            datatype 'x stack = EmptyStack | push of ('x * 'x stack)
            shape square:
                repeat 4:
                    forward 10
                    rotate 90
                end
            end
        end
        RULES:
            G -> T { A G } { C G } (0.70 .. 0.80)
            G -> T { A G } { B G } (0.80 .. 0.95)
            G -> T { A G } (0.95 .. 1.00)
        ENDRULES
    ENDOBJECT

    if (__name__ == "__main__"):
        import sys
        assign lexonly False
        assign trace False
        for fields in using query:
            SELECT fname, lname, t.phone, tsite.name :
                FROM tcontact, tsite
                WHERE table_contact.objid = "CONTID"
                AND table_site.objid = "SITEID"
            ENDSELECT
        endfor
        if sys.argv[1]:
            assign source open(sys.argv[1]).read()
        else:
            assign source "junk"
        end
    end

Parsed!

['program', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'OBJECT'], ['factorlist', ['factorlist', ['factorlist', ['ID', 'tree']], ['trailedfactor', ['ID', 'L_SYSTEM'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'ROOT'], ['factorlist', ['ID', 'G']]]]], ['statement_list', ['assignment', '=', ['explist', ['functioncall', ['ID', 'structure'], ['factorlist', ['ID', 'Stk']]]], ['explist', ['functioncall', ['trailedfactor', ['ID', 'struct'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'exception'], ['factorlist', ['ID', 'EmptyStack_exception']]]]], ['statement_list', ['assignment', '=', ['explist', ['functioncall', ['ID', 'datatype'], ['factorlist', ['factorlist', ['ID', "'x"]], ['ID', 'stack']]]], ['explist', ['infixepr', '|', ['ID', 'EmptyStack'], ['explist', ['functioncall', ['ID', 'push'], ['factorlist', ['factorlist', ['ID', 'of']], ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '*', ['ID', "'x"], ['explist', ['functioncall', ['ID', "'x"], ['factorlist', ['ID', 'stack']]]]]]]]]]]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'shape'], ['factorlist', ['factorlist', ['trailedfactor', ['ID', 'square'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'repeat'], ['factorlist', ['factorlist', ['trailedfactor', ['number', 4], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'forward'], ['factorlist', ['number', 10]]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'rotate'], ['factorlist', ['number', 90]]]]]]]]]]], ['ID', 'end']]]]]]]]]], ['ID', 'end']]]]]]]]]]], ['factorlist', ['ID', 'end']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['trailedfactor', ['ID', 'RULES'], 
['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['infixepr', '->', ['ID', 'G'], ['explist', ['functioncall', ['ID', 'T'], ['factorlist', ['factorlist', ['factorlist', ['constructorexpression', ['constructorexpression', ['explist', ['functioncall', ['ID', 'A'], ['factorlist', ['ID', 'G']]]]]]], ['constructorexpression', ['constructorexpression', ['explist', ['functioncall', ['ID', 'C'], ['factorlist', ['ID', 'G']]]]]]], ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '..', ['dottedfactor', ['number', 0], ['attribute', ['number', 70]]], ['explist', ['expression', ['dottedfactor', ['number', 0], ['attribute', ['number', 80]]]]]]]]]]]]]]], ['statement_list', ['exprstatement', ['explist', ['infixepr', '->', ['ID', 'G'], ['explist', ['functioncall', ['ID', 'T'], ['factorlist', ['factorlist', ['factorlist', ['constructorexpression', ['constructorexpression', ['explist', ['functioncall', ['ID', 'A'], ['factorlist', ['ID', 'G']]]]]]], ['constructorexpression', ['constructorexpression', ['explist', ['functioncall', ['ID', 'B'], ['factorlist', ['ID', 'G']]]]]]], ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '..', ['dottedfactor', ['number', 0], ['attribute', ['number', 80]]], ['explist', ['expression', ['dottedfactor', ['number', 0], ['attribute', ['number', 95]]]]]]]]]]]]]]], ['statement_list', ['exprstatement', ['explist', ['infixepr', '->', ['ID', 'G'], ['explist', ['functioncall', ['ID', 'T'], ['factorlist', ['factorlist', ['constructorexpression', ['constructorexpression', ['explist', ['functioncall', ['ID', 'A'], ['factorlist', ['ID', 'G']]]]]]], ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '..', ['dottedfactor', ['number', 0], ['attribute', ['number', 95]]], ['explist', ['expression', ['dottedfactor', ['number', 1], ['attribute', ['number', 0]]]]]]]]]]]]]]]]]]]]], ['factorlist', ['ID', 'ENDRULES']]]]]]]]]]]], ['ID', 'ENDOBJECT']]]]], ['statement_list', 
['exprstatement', ['explist', ['functioncall', ['ID', 'if'], ['factorlist', ['factorlist', ['trailedfactor', ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '==', ['ID', '__name__'], ['explist', ['expression', ['string', '__main__']]]]]]], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'import'], ['factorlist', ['ID', 'sys']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'assign'], ['factorlist', ['factorlist', ['ID', 'lexonly']], ['ID', 'False']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'assign'], ['factorlist', ['factorlist', ['ID', 'trace']], ['ID', 'False']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'for'], ['factorlist', ['factorlist', ['factorlist', ['factorlist', ['factorlist', ['ID', 'fields']], ['ID', 'in']], ['ID', 'using']], ['trailedfactor', ['ID', 'query'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'SELECT'], ['factorlist', ['ID', 'first_name']]], ['explist', ['expression', ['ID', 'last_name']], ['explist', ['expression', ['dottedfactor', ['ID', 'table_contact'], ['attribute', ['ID', 'phone']]]], ['explist', ['expression', ['ID', 'e_mail']], ['explist', ['functioncall', ['dottedfactor', ['ID', 'table_site'], ['attribute', ['trailedfactor', ['ID', 'name'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'FROM'], ['factorlist', ['ID', 'table_contact']]], ['explist', ['expression', ['ID', 'table_site']]]]], ['statement_list', ['assignment', '=', ['explist', ['functioncall', ['ID', 'WHERE'], ['factorlist', ['dottedfactor', ['ID', 'table_contact'], ['attribute', ['ID', 'objid']]]]]], ['explist', ['expression', ['string', '<CASECONTACTID>']]]], ['statement_list', ['assignment', '=', ['explist', ['functioncall', ['ID', 'AND'], ['factorlist', ['dottedfactor', ['ID', 
'table_site'], ['attribute', ['ID', 'objid']]]]]], ['explist', ['expression', ['string', '<CASESITEID>']]]]]]]]]]]], ['factorlist', ['ID', 'ENDSELECT']]]]]]]]]]]]]], ['ID', 'endfor']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'if'], ['factorlist', ['factorlist', ['factorlist', ['dottedfactor', ['ID', 'sys'], ['attribute', ['trailedfactor', ['trailedfactor', ['ID', 'argv'], ['bracketedtrailer', ['explist', ['expression', ['number', 1]]]]], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'assign'], ['factorlist', ['factorlist', ['factorlist', ['ID', 'source']], ['ID', 'open']], ['dottedfactor', ['bracketedexpression', ['bracketedexpression', ['explist', ['expression', ['dottedfactor', ['ID', 'sys'], ['attribute', ['trailedfactor', ['ID', 'argv'], ['bracketedtrailer', ['explist', ['expression', ['number', 1]]]]]]]]]]], ['methodcall', 'read', ['bracketedexpression', None]]]]]]]]]]]]]], ['trailedfactor', ['ID', 'else'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'assign'], ['factorlist', ['factorlist', ['ID', 'source']], ['string', 'junk']]]]]]]]]], ['ID', 'end']]]]]]]]]]]]]], ['ID', 'end']]]]]]]]
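
The tree is plain nested Python lists, with each node's kind as its first element, so it can be explored with a few lines of ordinary code. The sketch below is not part of SWP: the helper names `flatten_factorlist` and `called_names` are made up for illustration, and the sample tree is rebuilt by hand from the `repeat 4:` fragment of the output above. It assumes, from that output, that `factorlist` nodes nest to the left and that a `functioncall` node holds the callee as its second element.

```python
def flatten_factorlist(node):
    """Flatten a left-nested ['factorlist', ...] node into a flat argument list."""
    if not (isinstance(node, list) and node and node[0] == "factorlist"):
        return [node]                      # a single argument, not a list of them
    args = []
    for child in node[1:]:
        args.extend(flatten_factorlist(child))
    return args


def called_names(node):
    """Collect, depth first, the names of all functions called via a plain ID."""
    names = []
    if isinstance(node, list):
        if (len(node) > 1 and node[0] == "functioncall"
                and isinstance(node[1], list) and node[1][:1] == ["ID"]):
            names.append(node[1][1])
        for child in node[1:]:
            names.extend(called_names(child))
    return names


# Hand-built copy of the "repeat 4: forward 10 / rotate 90 / end" subtree.
forward = ['functioncall', ['ID', 'forward'], ['factorlist', ['number', 10]]]
rotate = ['functioncall', ['ID', 'rotate'], ['factorlist', ['number', 90]]]
body = ['block', ['statement_list', ['exprstatement', ['explist', forward]],
                  ['statement_list', ['exprstatement', ['explist', rotate]]]]]
repeat = ['functioncall', ['ID', 'repeat'],
          ['factorlist',
           ['factorlist', ['trailedfactor', ['number', 4], ['blocktrailer', body]]],
           ['ID', 'end']]]

print(called_names(repeat))
print([arg[0] for arg in flatten_factorlist(repeat[2])])
```

Flattening the argument list of `repeat` gives two arguments: the number 4 with its trailing code block, and the identifier `end`, which shows that the closing keyword is just another argument.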

Grammar (SLR)

    program -> block
    block -> BLOCKSTART statement_list BLOCKEND
    statement_list -> statement*
    statement -> (expression | expression ASSIGNMENT expression | ) EOL
    expression -> oldexpression (COMMA expression)*
    oldexpression -> (factor [factorlist] | factor INFIXOPERATOR expression )
    factorlist -> factor* factor
    factor -> ( bracketedexpression | constructorexpression | NUMBER | STRING | ID | factor DOT dotexpression | factor trailer | factor trailertoo )
    dotexpression -> (ID bracketedexpression | factor )
    bracketedexpression -> BRA [ expression ] KET
    constructorexpression -> BRA3 [ expression ] KET3
    trailer -> BRA2 expression KET2
    trailertoo -> COLON EOL block
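
The grammar never mentions indentation. It only sees BLOCKSTART, BLOCKEND and EOL tokens, which suggests that whitespace is turned into tokens before parsing. The sketch below shows one common way to do that with an indent stack. It is an assumption about the general technique, not SWP's actual lexer: the function name `layout_tokens` and the catch-all `LINE` token (standing in for the real NUMBER, ID, COLON and other tokens) are invented here, and it handles neither tabs nor inconsistent dedents.

```python
def layout_tokens(source):
    """Turn indented source text into (kind, text) layout tokens."""
    tokens = [("BLOCKSTART", None)]        # the whole program is one block
    indents = [0]                          # stack of currently open indent widths
    for raw in source.splitlines():
        line = raw.rstrip()
        stripped = line.lstrip(" ")
        if not stripped or stripped.startswith("#"):
            continue                       # blank lines and comments carry no layout
        indent = len(line) - len(stripped)
        if indent > indents[-1]:           # deeper than before: open a block
            indents.append(indent)
            tokens.append(("BLOCKSTART", None))
        while indent < indents[-1]:        # shallower: close every finished block
            indents.pop()
            tokens.append(("BLOCKEND", None))
        tokens.append(("LINE", stripped))
        tokens.append(("EOL", None))
    while len(indents) > 1:                # close blocks still open at end of input
        indents.pop()
        tokens.append(("BLOCKEND", None))
    tokens.append(("BLOCKEND", None))      # close the program block
    return tokens


sample = "repeat 4:\n    forward 10\n    rotate 90\nend\n"
for kind, text in layout_tokens(sample):
    print(kind, text or "")
```

With this ordering, the line ending in a colon is followed by EOL and then BLOCKSTART, which matches `trailertoo -> COLON EOL block`, and the `end` keyword arrives right after BLOCKEND, before the next EOL, so it can be read as one more factor of the same statement.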

Notes
● Just uses a slightly modified PLY (1.5)
● All of the examples are parseable by the same parser - no changes to the lexer or parser.
● Just spits out a syntax tree
● Treats everything as a function

Everything's a function
● This is a function:

    if bar(bibble=>baz):
        bla bla bla
        bingle bongle
    else:
        babble babble
        this = bing

● Parsed as: Call function “if” with the arguments:
  bar(bibble=>baz), codeblock, “else”, codeblock, “endif”
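
To make the idea concrete, here is a minimal sketch of what a back end might do with such a call. SWP itself only produces the syntax tree, so everything here is hypothetical: the names `swp_if`, `FUNCTIONS` and `call` are invented, and code blocks are modelled as zero-argument Python callables so that only the chosen branch runs.

```python
def swp_if(condition, then_block, *rest):
    """`if` as an ordinary function: condition, codeblock, "else", codeblock, "endif"."""
    if condition:
        return then_block()
    if len(rest) >= 2 and rest[0] == "else":
        return rest[1]()                   # run the block that follows "else"
    return None                            # no else branch supplied


# A back end would look each called name up in a table like this one.
FUNCTIONS = {"if": swp_if}


def call(name, *args):
    """Dispatch a parsed function call to its implementation."""
    return FUNCTIONS[name](*args)


result = call("if", False,
              lambda: "then branch", "else",
              lambda: "else branch", "endif")
print(result)
```

Because `if` is looked up like any other name, adding a new "keyword" to a language built this way would mean adding an entry to the table rather than changing the parser.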

Where...?
● http://www.cerenity.org/SWP-0.0.0.tar.gz
● http://www.cerenity.org/SWP/
● I'd be curious to see someone put a lisp back end on it :-)
Actually no, don't do that, someone might use this - then...
