Saturday, 17 March 2012

An introduction to lex and yacc

Hi, I am posting here about lex and yacc. This is based on what I have understood so far, so please excuse any mistakes.

Let us begin with the definition of lex.

What is lex?
Lex is a lexical analyzer generator.
Lex is a program generator designed for lexical processing of character input streams. It accepts a high-level, problem-oriented specification for character string matching and produces a program in a general-purpose language that recognizes regular expressions. The regular expressions are specified by the user in the source specification given to Lex. The code generated by Lex recognizes these expressions in an input stream and partitions the input stream into strings matching the expressions.

To put it more simply, we can define it like this:

Lex is a program generator that generates lexical analyzers.
The main job of a lexical analyzer is to break up an input stream into more usable elements (tokens).
Example: a=b+c*d;

A lexical analyzer generated by lex breaks the above expression into a sequence of tokens: a, =, b, +, c, *, d and ;

Looking at the above example, we can say:
Lexical analyzers tokenize input streams.
Tokens are the terminals of a language, and we use regular expressions to define these tokens (terminals).
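
To make the idea of tokenizing concrete, here is a minimal sketch in Python (not lex itself) that splits the example expression into tokens using regular expressions. The token names (ID, ASSIGN and so on) and the `tokenize` function are my own assumptions for illustration; they do not come from lex. Also note that lex picks the longest match, while this sketch simply tries the patterns in order, which makes no difference here because the patterns do not overlap.

```python
import re

# Hypothetical token names; a real lex specification would choose its own.
TOKEN_SPEC = [
    ("ID",     r"[a-zA-Z_][a-zA-Z0-9_]*"),  # identifiers such as a, b, c, d
    ("ASSIGN", r"="),
    ("PLUS",   r"\+"),
    ("TIMES",  r"\*"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),                     # whitespace is matched but dropped
]

# Combine all patterns into one regular expression with named groups.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_name, matched_text) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("a=b+c*d;"))
```

Each pair in the printed list is one token: the name of the pattern that matched and the text it matched.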

Notice that as our understanding of lex grows, we keep refining its definition.
Here is a simpler definition that suits how we use lex in the SS lab:

Lex is a program (generator) that generates lexical analyzers and is widely used on Unix.
It is mostly used with the Yacc parser generator.
It was written by Eric Schmidt and Mike Lesk.
It reads an input stream specifying the lexical analyzer and outputs source code implementing the lexical analyzer in the C programming language.
Lex reads patterns (regular expressions) and then produces C code for a lexical analyzer that scans the input for text matching those patterns.

I think we have said enough about lex itself, so let us move on to how to write programs for a lexical analyzer.

Before we start writing source programs for a lexical analyzer, we need some knowledge of regular expressions, so I will first give some examples of regular expressions and then begin with lex.

Examples of regular expressions

The symbols used when writing regular expressions and their meanings are described below.
EXPRESSION          MEANING
  1. abc*           "ab" followed by zero or more "c"; ex: ab, abc, abcc, abccc, etc.
  2. abc+           "ab" followed by one or more "c"; ex: abc, abcc, etc.
  3. a(bc)+         "a" followed by one or more "bc"; ex: abc, abcbc, abcbcbc, etc.
  4. a(bc)?         "a" followed by zero or one "bc"; ex: a, abc (it accepts only these two)
  5. [abc]          either "a" or "b" or "c", i.e. one out of a, b, c
  6. [a-z]          any one letter from "a" to "z"; ex: a, b, c, d, ..., x, y, z
  7. [a\-z]         either "a" or "-" or "z", i.e. one of a, -, z; the backslash is needed because [a-z] would mean any letter from "a" to "z"
  8. [-az]          another way of writing the above expression: one out of -, a, z
  9. [^ab]          any single character except "a" or "b"
  10. a|b           either "a" or "b"
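
The table above can be checked with a short Python sketch. This uses Python's `re` module as a stand-in, on the assumption that these particular patterns behave the same way there as in lex; the `EXAMPLES` table and the `check` helper are names I made up for this illustration. Each pattern is listed with a few strings it should accept and a few it should reject.

```python
import re

# pattern -> (strings that should match, strings that should not match)
EXAMPLES = {
    r"abc*":   (["ab", "abc", "abccc"], ["a", "abcb"]),
    r"abc+":   (["abc", "abcc"], ["ab"]),
    r"a(bc)+": (["abc", "abcbc"], ["a", "abcb"]),
    r"a(bc)?": (["a", "abc"], ["abcbc"]),
    r"[abc]":  (["a", "b", "c"], ["d", "ab"]),
    r"[a-z]":  (["a", "m", "z"], ["A", "-"]),
    r"[a\-z]": (["a", "-", "z"], ["m"]),
    r"[-az]":  (["-", "a", "z"], ["m"]),
    r"[^ab]":  (["c", "-"], ["a", "b"]),
    r"a|b":    (["a", "b"], ["c", "ab"]),
}


def check(pattern, accepted, rejected):
    """True if the pattern matches every accepted string in full and no rejected one."""
    ok = all(re.fullmatch(pattern, s) for s in accepted)
    return ok and not any(re.fullmatch(pattern, s) for s in rejected)


for pattern, (accepted, rejected) in EXAMPLES.items():
    print(pattern, check(pattern, accepted, rejected))
```

`re.fullmatch` is used so that the whole string has to match the pattern, which is why "ab" is rejected by [abc] and by a|b: each of those patterns describes exactly one character.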
I hope the above examples convey the meaning of regular expressions clearly.
In the next post, I will explain how to write lex programs and how they work.
