Subject:[jamming] Re: Whitespace As Delimiter -- Yuk!
From:Roesler, Randy (rroe@mdsi.bc.ca)
Date:08/03/2001 11:34:48 AM
List:com.perforce.jamming

Comments on your comments ...

-----Original Message----- From: Glen Darling [mailto:gdar@cisco.com] Sent: Thursday, August 02, 2001 10:53 PM To: Roesler, Randy Cc: 'David Abrahams'; jamm@perforce.com; Roesler, Randy Subject: RE: [jamming] Re: Whitespace As Delimiter -- Yuk!

Hi Randy,

At 09:14 PM 08/02/2001 -0700, Roesler, Randy wrote:

I don't agree that whitespace as a delimiter is a bad thing. Careful design can build a lexer and parser that can handle the Jam language and give reasonable feedback about errors in parsing. Look at Doc++, which basically parses C++ in lex. If it can do that, we can surely handle little things like $(whatever) stuff.

I think that parsing C++ with lex is easier than parsing jam with lex. Whitespace is irrelevant to token separation in C++.

You should try to read the Doc++ lex grammar ... it's a full state machine in itself. It does not always ignore white space. It counts quotes, "{", "(", and switches state as it steps into and out of structs, classes, functions, etc.

Also, consider the following additional example which makes normal scanning of jam awkward:

rule foo { ... }
actions foo { ... }

To a scanner, these character sequences look essentially the same, but they must be tokenized very differently. The {} after "rule" are just two tokens delimiting a block. Everything inside the block has jam syntax and must be scanned/parsed. The {} after "actions" mean something very different to the scanner. They essentially delimit a character string, which cannot be parsed by jam (in fact the grammar it uses in there is not even known at this time, it could be Bourne Shell syntax or C-Shell syntax or anything). This requires the scanner to be modal -- doable, but yuk. There is nothing like this nastiness in C++.
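To make the "modal scanner" point concrete, here is a minimal Python sketch. It is not Jam's actual scanner: the function name `scan`, the `("RAW", ...)` token shape, and the brace-counting rule for finding the end of an actions body are all assumptions made for illustration. It tokenizes on whitespace, but after the keyword `actions` it switches mode at the next "{" and copies the body verbatim as a single string token.

```python
def scan(text):
    """Tokenize Jam-like text; an `actions` body becomes one raw string token."""
    tokens = []
    pos, n = 0, len(text)
    pending_actions = False  # True once the keyword `actions` has been seen
    while pos < n:
        if text[pos].isspace():
            pos += 1
            continue
        # Normal mode: read one whitespace-delimited word.
        end = pos
        while end < n and not text[end].isspace():
            end += 1
        word = text[pos:end]
        pos = end
        tokens.append(word)
        if word == "actions":
            pending_actions = True
        elif word == "{" and pending_actions:
            # Raw mode: copy everything up to the matching "}" unparsed.
            # Counting nested braces is an assumption of this sketch.
            depth, start = 1, pos
            while pos < n and depth > 0:
                if text[pos] == "{":
                    depth += 1
                elif text[pos] == "}":
                    depth -= 1
                pos += 1
            if depth != 0:
                raise ValueError("unterminated actions block")
            tokens.append(("RAW", text[start:pos - 1].strip()))
            tokens.append("}")
            pending_actions = False
    return tokens
```

With this sketch, `scan("rule foo { a = b ; }")` yields eight ordinary tokens, while `scan("actions foo { cc -o $(<) $(>) }")` yields `actions`, `foo`, `{`, one raw token holding the shell text, and `}`.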

Making ";" into a special character, at least in certain contexts would be good. But missing ";" are more of an issue than forgetting to place a whitespace before the ";".

They are essentially the same issue for us. They cause spurious parameters to get attached onto the end of the rule whose termination has been compromised. We manage this kind of stuff by a lot of parameter checking code. E.g., if $(4) { EXIT "too many parameters to rule foo." ; } But when the last parameter is a list of arbitrary length as it often is, this doesn't help.

Sorry ... A Rule a b c : d e f : g h i ;

Has 3 parameters ... any parameter can have any number of elements.
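As a rough illustration of how ":" separates parameters and why a compromised ";" makes a rule swallow whatever follows, here is a small Python sketch. The helper name `split_arguments` is hypothetical, and the sketch only models whitespace-delimited tokens, not Jam's real parser.

```python
def split_arguments(tokens):
    """Split one rule invocation into (rule name, list of parameter lists)."""
    name, rest = tokens[0], tokens[1:]
    params = [[]]
    for tok in rest:
        if tok == ";":        # a free-standing ";" ends the invocation
            break
        if tok == ":":        # a free-standing ":" starts the next parameter
            params.append([])
        else:
            params[-1].append(tok)
    return name, params


# Three parameters, each with three elements.
print(split_arguments("Rule a b c : d e f : g h i ;".split()))

# "b;" is a single token, so the first ";" is never seen and the next
# invocation is absorbed into the last parameter.
print(split_arguments("Foo a b; Bar c ;".split()))
```

The second call returns `("Foo", [["a", "b;", "Bar", "c"]])`: the invocation of `Bar` has become trailing elements of `Foo`'s last parameter, which is exactly the case a length check on the last parameter cannot catch.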

Do you have rules like

rule X {
    local i ;
    for i in ( 1 2 3 4 5 6 7 8 9 ) {
        if $($(i)) { }
    }
}

To make parsing (or execution) more fool-proof, it would be nice if rules/actions could specify the number of arguments they expect (the number of ":").

You might do something your self with ...

if $(3) { EXIT ; }

as a type of assertion that the rule was called with 1 or 2 arguments.

Yup we do this now.

Things that would have helped me are ...

a) globbing

b) functions (the current [] notation)

c) substitution (like ksh's $(var#) $(var~) or Perl's =~ )

All of the above would be nice. B and c are in ftjam already though, aren't they?

d) ability to read variables attached to objects

This one is ABSOLUTELY ESSENTIAL!!!! It is so painful not having this.

e) hiding rules (so they could not be called outside of a lexical context)

Excellent. C-like file scope would be sufficient for my purposes.

f) a "simple" report of what's out of date compared to X (make's -d, I believe) (i.e., why am I building X)

Okay, would be nice.

g) rule dependencies (like Sun's make, it remembers what command(s) it ran last time for a target, and if the list changes, it assumes that target is out of date)

I don't understand the above.

Sun's make did two things I really like.

1) It was "integrated" with the compilers using a normally undocumented (or it used to be undocumented) command line argument.

Each time a file was compiled, a file (.make.dependencies, I believe) was updated by the compiler (or by make looking at special compiler output). The file lists, for each target (.o file), which files were read during the compilation.

2) make had a file called .make.state (I believe) which was also maintained.

Each time a target was built, the make state would be updated. The next time you called make, make compared the exact command lines with those from the previous run, and if they differed, would assume the target needs to be rebuilt.

For example, make a.o might call the C++ compiler and the command lines would be stored in the make state.

a.o: CPP -o a.o a.cpp

Now you want to turn debugging on. export CXXFLAGS=-g; make a.o

Make says "last time I did 'CPP -o a.o a.cpp', but this time the macro expands to 'CPP -g -o a.o a.cpp'; that's different, therefore a.o needs to be rebuilt."

Notice, the user does not need to touch or remove anything to get this to happen.
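The command-line comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Sun make's implementation: the function name `needs_rebuild` is hypothetical, and an in-memory dict stands in for the .make.state file, whose real format is not shown here.

```python
def needs_rebuild(target, command, state):
    """Return True if `command` differs from the one last recorded for `target`.

    `state` maps target name -> command line used on the previous build.
    The new command is recorded either way, as a build would follow.
    """
    changed = state.get(target) != command
    state[target] = command
    return changed


state = {}
print(needs_rebuild("a.o", "CPP -o a.o a.cpp", state))     # True: nothing recorded yet
print(needs_rebuild("a.o", "CPP -o a.o a.cpp", state))     # False: same command as last time
print(needs_rebuild("a.o", "CPP -g -o a.o a.cpp", state))  # True: flags changed
```

A real tool would combine this check with the usual timestamp comparison, so a target is rebuilt if either its sources are newer or its command line has changed.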

h) knowledge that objects that are in an archive are in the archive (right now, you can have a different TARGET on a library and a library member, and if so, the behaviour is weird)

I don't think we have encountered this problem. Maybe because our use of static libraries is minimal.

i) Collapse multiple, identical "actions" on the same target (same targets, same sources). (But I can think of several problems with this, not the least of which is that jam might end up spinning during rule invocation if a lot of rule writers were not careful.)

I have been following the current thread on this, but it is not an issue for us. We force developers to either: - spin out a library for the duplicate code, or - use a CopyFile rule to copy the source, to separate the builds. We also try to discourage the use of either technique and when this comes up we look for alternatives.

This is also not an issue for us. But you must admit that something like

Main A : a.cpp c.cpp ; Main B : b.cpp c.cpp ;

would be nice if c.cpp was compiled to c.o only once. The above style is more deductive than forcing the user to build a library simply to avoid a second compile. In this case, .o files would work just as well.
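Here is a minimal Python sketch of what "compile c.cpp only once" could look like as a planning step. It is an illustration under simplifying assumptions, not how Jam binds targets: the function name `plan_compiles` is hypothetical, object names are derived by swapping the suffix for ".o", and per-program compiler flags (which would make the two compiles differ) are ignored.

```python
def plan_compiles(mains):
    """Return compile steps for all programs, emitting each (object, source) once.

    `mains` is a list of (program, [sources]) pairs, mirroring lines such as
    "Main A : a.cpp c.cpp ;".
    """
    seen = set()
    steps = []
    for _program, sources in mains:
        for src in sources:
            obj = src.rsplit(".", 1)[0] + ".o"
            if (obj, src) not in seen:   # identical action already planned
                seen.add((obj, src))
                steps.append((obj, src))
    return steps


print(plan_compiles([("A", ["a.cpp", "c.cpp"]), ("B", ["b.cpp", "c.cpp"])]))
# [('a.o', 'a.cpp'), ('c.o', 'c.cpp'), ('b.o', 'b.cpp')]
```

Collapsing is only safe when the two actions really are identical, which is one of the problems item (i) alludes to.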

In fact, there are cases when you need to force an object file to be linked into an application. Now I have to do things like ...

Main A : a.cpp ; Object c.cpp ;

ExtraObjects A : c.o ;

but with gristing, it's a bit more tricky.

j) Simple way to grist a value only if it did not already have grist. (This would allow gristing to be used everywhere, for example in the C++ rule and the Objects rule. Whichever rule added gristing first would win, unless the grist was reset with $(x:G=y).)

I don't understand this one. Can't you just use this: if ! $(x:G) { x = $(x:G=foo) ; } to set grist if not set.

I guess you could say we use grist "everywhere". That is all targets, and all headers use grist to allow us to unambiguously match them up when necessary.

You're correct, a rule could be written to grist stuff that is not already gristed. But usage of the $(X:G=g) notation is so nice and easy that you often don't build a temporary variable to get gristing sorted out.

A form like $(X:X=g) (where :X adds grist only if no grist is present) would be convenient.
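The intended behaviour of such a modifier can be written down in a few lines. This Python sketch is only a model of the semantics being asked for: the function name `grist_if_absent` is hypothetical, and it assumes the usual `<grist>name` spelling of a gristed value.

```python
def grist_if_absent(name, grist):
    """Return `name` prefixed with <grist>, unless it already carries grist."""
    if name.startswith("<") and ">" in name:
        return name                      # existing grist wins
    return "<" + grist + ">" + name


print(grist_if_absent("a.o", "foo"))       # <foo>a.o
print(grist_if_absent("<bar>a.o", "foo"))  # <bar>a.o (unchanged)
```

Applied element by element to a list, this is what "whichever rule added grist first would win" amounts to.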

I'm not sure if everything should be/needs to be gristed. But gristing should be consistent.

For example, in the rules distributed with Jam, the C++ rule does not grist its inputs. The Object rule does not grist its inputs; the Objects rule does.

So, mixing C++ rules and Objects rules in the same Jamfile is risky.

C++ a.o : a.cpp ;

MainFromObjects a : a.o ;

(very possibly) refer to different a.o files.

I would not object to jam saving dependency information someplace; (g) above would basically force this. But jam would need to timestamp the dependency information so that it could invalidate it if a file was modified.

Yup. This would be necessary.
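A timestamped dependency cache of the kind discussed here could look roughly like the following Python sketch. The function names and the cache layout are hypothetical, and timestamps are passed in as plain numbers so the example does not depend on any files existing.

```python
def record_deps(path, mtime, deps, cache):
    """Store the dependencies found for `path`, stamped with its modification time."""
    cache[path] = {"mtime": mtime, "deps": list(deps)}


def cached_deps(path, mtime, cache):
    """Return the saved dependencies, or None if `path` changed since they were saved."""
    entry = cache.get(path)
    if entry is None or entry["mtime"] != mtime:
        return None          # unknown or stale: the file must be rescanned
    return entry["deps"]


cache = {}
record_deps("a.cpp", 1000, ["a.h", "common.h"], cache)
print(cached_deps("a.cpp", 1000, cache))  # ['a.h', 'common.h']
print(cached_deps("a.cpp", 1005, cache))  # None: a.cpp was modified
```

In practice the current timestamp would come from something like `os.path.getmtime(path)`, and the cache would be written to disk between runs.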

I used to wish for access to the shell during rule invocation, and that might still be useful. The fact that somebody could invoke side effects during the rule phase is something that could be lived with.

I used to really want this. Now I look at this as another "would be nice" feature. I was thinking of something like a shell or Perl backtick thing. E.g., x = `ls *.c` ; which essentially gives you your globbing too.

Glen. ==========

-----Original Message----- From: David Abrahams [mailto:davi@rcn.com] Sent: Thursday, August 02, 2001 7:38 AM To: Glen Darling Cc: jamm@perforce.com; Roesler, Randy Subject: Re: [jamming] Re: Whitespace As Delimiter -- Yuk!

My view, in brief, is that whitespace delimiting is great for top-level Jamfiles but lousy for people writing Jam rules. The underlying Jam language is rather weak for the sort of large build-system construction jobs that I'm trying to throw at it. [It probably would also help everyone to have ";" be a delimiter].

----- Original Message ----- From: "Arnt Gulbrandsen" <ar@gulbrandsen.priv.no> To: "Glen Darling" <gdar@cisco.com> Cc: <jamm@perforce.com>; "Roesler, Randy" <rroe@mdsi.bc.ca> Sent: Thursday, August 02, 2001 7:09 AM Subject: [jamming] Re: Whitespace As Delimiter -- Yuk!

Glen Darling <gdar@cisco.com>

I advocate completely gutting out the home-grown scanner from JAM, and replacing it with a nice normal extensible lex-built scanner in which whitespace is irrelevant to token separation, as it is in most programming languages. This change implies some significant language usage changes though. For example, code like this: SubDirCcFlags -DDEBUG=1 ; would have to be changed to something like this: SubDirCcFlags "-DDEBUG=1"; to avoid separating this single SubDirCcFlags parameter into a bunch of tokens (which would later be parsed by jam into an expression that jam itself would try to evaluate -- not really what is wanted here). There are many other things that would require change too, since expressions like this: $(foo)bar would be indistinguishable from expressions like this: $(foo) bar when delivered as a token stream by a typical scanner. Of course these mean very different things to jam. So to code the former you would need to use: "$(foo)bar"

So changing the scanner the way I am advocating would require people to use quotes in a lot of places where they are not needed today. Would that be a difficult change for you and your jam users to accept?
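To see why the two spellings collapse, here is a small Python sketch of a whitespace-insensitive tokenizer. The token classes in the regular expression are assumptions chosen for illustration and do not come from any proposed Jam grammar.

```python
import re

# "$(" | ")" | identifier | any other single non-space character
TOKEN = re.compile(r"\$\(|\)|[A-Za-z_]\w*|\S")


def tokenize(text):
    """Return the token stream for `text`, discarding all whitespace."""
    return TOKEN.findall(text)


print(tokenize("$(foo)bar"))   # ['$(', 'foo', ')', 'bar']
print(tokenize("$(foo) bar"))  # ['$(', 'foo', ')', 'bar']
print(tokenize("-DDEBUG=1"))   # ['-', 'DDEBUG', '=', '1']
```

Both variable expressions produce the same stream, and the compiler flag falls apart into four tokens, which is why quoting would become necessary under such a scanner.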

Jam is at risk of forking. This is the sort of change that makes the fork certain. Unless David, David and the new perforce hire all agree... IIRC perforce hired someone who has jam as a large part of his job description as of August 1.

--Arnt