cpp replaced by m4?

Ric Werme ewerme at comcast.net
Tue Jul 3 16:07:38 EDT 2007


Thomas Charron may have opened the floodgates:

>  Can we have an example of why you want to do this?

I had refrained from suggesting (somewhat tongue in cheek) that you
should sit down with the MACRO-10 manual (the assembler for the PDP-10)
and implement its macro processor.  However, since I have an opening....

One of the things sadly lacking in C is the ability to create initialized
data structures with circularly linked lists, trees, etc.  I once wrote a
macro to take a list of command names and create a tree to walk character
by character.  TOPS-10 had a module where it would take the disk drive
and file system description and create at compile time all the data structures
the system needed for them.

We also had a most wonderful I/O system (it did some things that printf can't)
that included a tiny lexical analyzer.  The heart of it was a macro that
packed characters (or flags), labels, callbacks, jumps, calls, etc into
36 bit control words.  You could parse a TOPS-10 path that looked like
DEV:FILE.EXT[#,#] with this table:

	TBLBEG(FILSPC)

	PROD(	<SG>		,FILI, ,      )	;INIT FILE PARSER FLAGS
NXTATM:	PROD(	<SG>		,CALL, ,SIXSCN)	;GET A NAME
	PROD(	"_"		,NODE,*,NXTATM)	;UNDERSCORE MEANS A NODE
	PROD(	":"		,DEV ,*,NXTATM)	;COLON MEANS A DEVICE
	PROD(	"."		,NAME,*,NXTATM)	;AND PERIOD MEANS NAME
	PROD(	"["		,NAMX,*,PPNSCN)	;THEN BRACKET MEANS NAME OR EXT
	PROD(	<SG>		,NAMX, ,      )	;ANYTHING ELSE IS SAME
PPNDON:	PROD(	<SG>		,SRET, ,      )	;QUIT WHILE AHEAD

PPNSCN:	PROD(	<SG>		,GPRJ, ,      )	;GET PROJECT NUMBER
	PROD(	<SG>		,PROJ,*,      )	;SAVE PROJECT. PROJ WILL FAKE CALL
	PROD(	"]"		,    ,*,      )	;OPTIONAL CLOSE BRACKET
	PROD(	<SG>		,PROG, ,PPNDON)	;MERGE WITH PROJECT

SIXSCN:	PROD(	<BLANK>		,    ,*,.     )	;SKIP BLANKS
	PROD(	<SG>		,SIXI, ,      )	;SETUP SIXBIT PACKER
	PROD(	<LETTER!DIGIT>	,SIXS,*,.     )	;SAVE ANY ALPHANUMERICS
SKPBLA:	PROD(	<BLANK>		,    ,*,.     )	;IGNORE BLANKS
	PROD(	<SG>		,RET , ,      )	;AND RETURN

	TBLEND

The first column matched on characters or character types (<SG> matched
everything).  The column with 4 character names went to local callbacks to
process individual characters or other events.  (CALL and RET denote
subroutine calls and returns.)  The '*' meant to discard the character after
the callback.  The last field is the address of the next instruction: blank
says execute the next one, and '.' is the current location, so that
production repeats.
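
For readers more at home in C, the table-walking idea can be sketched
roughly as below.  This is only an illustration, not the TULIP code: the
struct layout, the helper names (is_alnum, set_dev, parse, and so on) and the
much simplified DEV:FILE.EXT grammar are all assumptions, and CALL/RET
and the [#,#] part are left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { STOP = -1 };               /* hypothetical "end of parse" marker */

struct prod {                     /* one row of the table */
    int  (*match)(int c);         /* column 1: character or class test */
    void (*action)(int c);        /* column 2: callback */
    int  consume;                 /* column 3: the '*' flag */
    int  next;                    /* column 4: next row, or STOP */
};

static char atom[32], dev[32], name[32], ext[32];
static size_t atom_len;

/* Move the atom collected so far into dst and start a new one. */
static void take(char *dst)
{
    atom[atom_len] = '\0';
    strcpy(dst, atom);
    atom_len = 0;
}

static int is_alnum(int c) { return isalnum(c) != 0; }
static int is_colon(int c) { return c == ':'; }
static int is_dot(int c)   { return c == '.'; }
static int is_any(int c)   { (void)c; return 1; }   /* plays the role of <SG> */

static void save_char(int c)
{
    if (atom_len + 1 < sizeof atom)
        atom[atom_len++] = (char)c;
}
static void set_dev(int c)  { (void)c; take(dev); }
static void set_name(int c) { (void)c; take(name); }
static void set_last(int c) { (void)c; take(name[0] ? ext : name); }

static const struct prod table[] = {
    /* 0 */ { is_alnum, save_char, 1, 0    },  /* collect alphanumerics */
    /* 1 */ { is_colon, set_dev,   1, 0    },  /* colon means a device */
    /* 2 */ { is_dot,   set_name,  1, 0    },  /* period means a name */
    /* 3 */ { is_any,   set_last,  0, STOP },  /* anything else: name or ext */
};

/* Walk the table: a row that does not match falls through to the next. */
static void parse(const char *s)
{
    int pc = 0;

    dev[0] = name[0] = ext[0] = '\0';
    atom_len = 0;
    while (pc != STOP) {
        const struct prod *p = &table[pc];
        int c = (unsigned char)*s;

        if (!p->match(c)) {       /* no match: try the next production */
            pc++;
            continue;
        }
        p->action(c);
        if (p->consume)           /* '*': discard the character */
            s++;
        pc = p->next;
    }
}
```

Calling parse("DEV:FILE.EXT") should leave "DEV", "FILE" and "EXT" in dev,
name and ext.  The last row matches anything, so the walk never runs off the
end of the table.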

See pages 20-22 of
http://pdp-10.trailing-edge.com/decuslib10-04/01/43,50347/tulip.doc.html
for better documentation, and see the bottom of
http://pdp-10.trailing-edge.com/tops10_tools_bb-fp64b-sb/01/10,7/nettst/netlib.mac.html
for the code my sample came from.

One very important part of MACRO-10 programming was defining a macro like:

DEFINE	FOO <
	X(item1)
	X(item2)
>

Then define X and call FOO.  Then redefine X and call FOO.

One time might count the number of entries, another might create a circularly
linked list of data structures, etc.  People generally don't redefine C
macros, though it is possible (use #undef first).  IRP and IRPC are very
handy directives for indefinite repeats: IRP steps through the items in a
list, and IRPC steps through the characters in a name.
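
The define-the-list-once, redefine-X trick does carry over to cpp, where it
usually goes by the name "X macros".  A minimal sketch follows, assuming a
made-up COMMANDS list and struct node (neither is from the MACRO-10 code):
the first pass counts the entries and the second builds a circularly linked
list at compile time.

```c
#include <stddef.h>

/* The list is written once; X is deliberately left undefined here. */
#define COMMANDS \
    X(help)      \
    X(list)      \
    X(quit)

/* Pass 1: give each entry an index; CMD_COUNT ends up as the entry count. */
#define X(name) CMD_##name,
enum { COMMANDS CMD_COUNT };
#undef X

struct node {
    const char *name;
    const struct node *next;
};

/* Pass 2: each node points at the following one, and the last wraps around. */
#define X(name) { #name, &ring[(CMD_##name + 1) % CMD_COUNT] },
static const struct node ring[CMD_COUNT] = { COMMANDS };
#undef X
```

Following next from ring[0] visits help, list, quit and then comes back to
ring[0].  The ring works only because the enum pass supplies compile-time
indexes; cpp has no counterpart to IRP or IRPC.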

All in all, cpp was a big disappointment, and m4 offered so little stuff that
was familiar I never bothered to try it.

    -Ric Werme


