Subject: Re: [antlr-interest] Using C++ types in an ANTLR-generated C parser
From: Jim Idle
Date: Feb 24, 2010 7:24:03 am

Return types from rules are structs in C, unless your output is not an AST and
there is only a single return value. For various reasons, the current version of
the C target tries to auto-initialize the return values, and when it does not
understand the type (well, it never understands the type), it defaults to = NULL.
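
To make the failure concrete, here is a minimal C++ sketch of the situation. It is not the code the C target emits: `RuleReturnSketch`, `multiValueRule`, `pointerRule` and `exerciseSketch` are hypothetical names, and `long` stands in for whatever integer-valued initializer the compiler reports in errors such as "conversion from 'long int'". The sketch only shows why an integer initializer is rejected for an enum or class type, while a struct of return values or a pointer compiles.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

enum Kind { PLUS, MINUS };
class Expr {};  // stub, as in expr.h

// An integer-valued initializer cannot initialize these C++ types, so a
// declaration of the form `Kind kind = NULL;` or `Expr e = NULL;` is rejected.
static_assert(!std::is_convertible<long, Kind>::value,
              "an integer does not implicitly convert to an enum in C++");
static_assert(!std::is_convertible<long, Expr>::value,
              "an integer does not implicitly convert to a class type");

// Hypothetical stand-in for a rule with more than one return value:
// the values travel inside a struct, so no per-value initializer is needed.
struct RuleReturnSketch {
    Kind kind;
    bool dummy;
};

RuleReturnSketch multiValueRule() {
    RuleReturnSketch retval = RuleReturnSketch();  // value-initialized
    retval.kind = MINUS;
    return retval;
}

// Pointers do accept NULL, which is why switching to Expr* compiles.
Expr* pointerRule() {
    Expr* e = NULL;
    return e;
}

// Small usage example for the two shapes above.
void exerciseSketch() {
    assert(multiValueRule().kind == MINUS);
    assert(pointerRule() == NULL);
}
```

The two `static_assert` lines are checked at compile time, so the sketch compiling at all confirms the conversions are unavailable.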

You should note that I have made corrections/simplifications to the initializing
stuff in the current development version of the C target, and this problem has
gone away. For now though, you will need your work around unless you are willing
to use the bleeding edge of development by getting the latest C.stg and using
that instead of the production version.

Make sure you always reference the return values via $paramname, not paramname.

Also, if you would use you could have found your answer.


-----Original Message-----
From: [mailto:antlr-interest-] On Behalf Of Christopher L Conway
Sent: Wednesday, February 24, 2010 7:51 AM
To:
Subject: [antlr-interest] Using C++ types in an ANTLR-generated C parser

I'm trying to use an ANTLR v3.2-generated parser in a C++ project using C as the output language, compiling the output as C++. I'm having trouble dealing with C++ types inside parser actions. Here's a C++ header file defining a few types I'd like to use in the parser:

    /* expr.h */
    enum Kind { PLUS, MINUS };

    class Expr {
      // stub
    };

    class ExprFactory {
    public:
      Expr mkExpr(Kind kind, Expr op1, Expr op2);
      Expr mkInt(std::string n);
    };

And here's a simple parser definition:

    /* Expr.g */
    grammar Expr;

    options { language = 'C'; }

    @parser::includes { #include "expr.h" }

    @members { ExprFactory *exprFactory; }

    start returns [Expr expr]
      : e = expression EOF { $expr = e; }
      ;

    expression returns [Expr e]
      : TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
        { e = exprFactory->mkExpr(k,op1,op2); }
      | INTEGER { e = exprFactory->mkInt((char*)$INTEGER.text->chars); }
      ;

    builtinOp returns [Kind kind]
      : TOK_PLUS { kind = PLUS; }
      | TOK_MINUS { kind = MINUS; }
      ;

    TOK_PLUS : '+';
    TOK_MINUS : '-';
    TOK_LPAREN : '(';
    TOK_RPAREN : ')';
    INTEGER : ('0'..'9')+;

The grammar runs through ANTLR just fine. When I try to compile ExprParser.c, I get errors like

1. `conversion from 'long int' to non-scalar type 'Expr' requested`
2. `no match for 'operator=' in 'e = 0l'`
3. `invalid conversion from 'long int' to 'Kind'`

In each case, the statement is an initialization of an `Expr` or `Kind` value to `NULL`.

I can make the problem go away for the `Expr`s by changing everything to `Expr*`. This is workable, though hardly ideal. But passing around pointers for a simple enum like `Kind` seems ridiculous. One ugly workaround I've found is to create a second return value, which pushes the `Kind` value into a struct and suppresses the initialization to `NULL`. I.e., `builtinOp` becomes

    builtinOp returns [Kind kind, bool dummy]
      : TOK_PLUS { $kind = PLUS; }
      | TOK_MINUS { $kind = MINUS; }
      ;

and the first `expression` alternative becomes

    TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
      { e = exprFactory->mkExpr(k.kind,*op1,*op2); }
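
A rough C++ picture of what that action ends up doing may help. In the sketch below, `KindResult` is a hypothetical stand-in for the struct that carries the two return values of `builtinOp`, `applyAction` is a made-up wrapper around the action body, and the bodies of `Expr` and `ExprFactory` are invented purely so the example is self-contained (the header above only declares them).

```cpp
#include <cassert>
#include <string>

enum Kind { PLUS, MINUS };

// Invented body: the original Expr is an empty stub.
class Expr {
public:
    Expr() : value(0) {}
    explicit Expr(long v) : value(v) {}
    long value;
};

// Invented implementation: the original header only declares these members.
class ExprFactory {
public:
    Expr mkExpr(Kind kind, Expr op1, Expr op2) {
        return Expr(kind == PLUS ? op1.value + op2.value
                                 : op1.value - op2.value);
    }
    Expr mkInt(std::string n) { return Expr(std::stol(n)); }
};

// Hypothetical stand-in for the two-value result of builtinOp.
struct KindResult {
    Kind kind;
    bool dummy;  // exists only to force the values into a struct
};

// Mirrors the action body: k.kind selects the enum value, and *op1 / *op2
// dereference the Expr* results of the two sub-rules.
Expr applyAction(ExprFactory* exprFactory, KindResult k, Expr* op1, Expr* op2) {
    assert(exprFactory != NULL && op1 != NULL && op2 != NULL);
    return exprFactory->mkExpr(k.kind, *op1, *op2);
}
```

With a factory `f`, operands built by `f.mkInt("7")` and `f.mkInt("2")`, and `KindResult k = {MINUS, false}`, calling `applyAction(&f, k, &a, &b)` goes through the same `k.kind`, `*op1`, `*op2` accesses as the grammar action.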

There has to be a better way to do things? Am I missing a configuration option to the C language backend? Is there another way to arrange my grammar to avoid this awkwardness? Is there a pure C++ backend I can use?
