Advanced shell scripting question :-)

Derek D. Martin ddm+gnhlug at pizzashack.org
Thu Sep 5 13:29:18 EDT 2002


At some point hitherto, Bob Bell hath spake thusly:
> > The problem is this: I expect that when data is written to a pipe, that 
> > the order is preserved. Using this construct, the order is *not* 
> > preserved. So on a p that produces many kilobytes of spew, I get big 
> > chunks in the BOTH file which are from each separate channel.
> 
>     I think what you are seeing is that p is writing to the pipe to "tee
> BOTH", and "tee ERR" is writing to the pipe to "tee BOTH".  Due to
> scheduling between the two processes, buffering, etc., ordering is not
> guaranteed.

Seems similar to (but different from) the problem that I ran into...

As a programming exercise, I decided to implement the algorithm I
outlined (code attached).  It mostly works, except for 2 points:

  - the output to the screen is in the wrong order
  - the output of both streams to a physical file is totally jumbled

I used the following test program to test it:

-=-=-=-

#include <stdio.h>

int main(void)
{
    int i;

    for (i = 0; i < 10; i++)
        printf("This is standard output! %d\n", i);
    for (i = 10; i < 15; i++){
        printf("This is standard output! %d\n", i);
        fprintf(stderr, "this is standard error! %d\n", i);
    }
    return 0;
}

-=-=-=-

The screen output looked like this:

-=-=-=-

$ ./redir of=junk1 ef=junk2 bf=junk3 ./tprg
This is standard output! 0
this is standard error! 10
This is standard output! 1
this is standard error! 11
This is standard output! 2
This is standard output! 3
this is standard error! 12
this is standard error! 13
This is standard output! 4
this is standard error! 14
This is standard output! 5
This is standard output! 6
This is standard output! 7
This is standard output! 8
This is standard output! 9
This is standard output! 10
This is standard output! 11
This is standard output! 12
This is standard output! 13
This is standard output! 14

-=-=-=-

If it's not obvious, 'of=' sets the file to send only stdout to; 'ef='
sets the file to send only stderr to; and 'bf=' sets the file to send
both to.  The program prints input from the subprocess's stdout
descriptor to stdout, and likewise with stderr, so both always appear
on the screen.

The bf file is a total mess; using select() does not allow you to say
"give me the input from each file descriptor, in the order you
received it" as far as I can tell...  And because of this, there's no
way (AFAICT) to order your output to your output file in the same
order it came in.  However if the output to the screen is
satisfactory (despite being out of order), one could duplicate it in a
file using the normal "> file 2>&1" and then "tail -f file" to get
the output to the screen.
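
To make the select() limitation concrete, here is a minimal sketch.  The
function name ready_mask() and the two throwaway pipes are made up for
illustration and are not part of redir.c.  It writes to pipe b first and
pipe a second, then asks select() what is readable.  Both descriptors
come back set, and an fd_set is only a set of bits, so nothing in it
says which write happened first:

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>

/*
 * Hypothetical demo, not part of redir.c.
 * Returns a bit mask: bit 0 set if pipe a is readable, bit 1 set if
 * pipe b is readable, or -1 on error.
 */
int ready_mask( void )
{
    int    a[2], b[2];
    fd_set rd;
    int    maxfd;
    int    mask = 0;

    if ( pipe(a) == -1 )
        return -1;
    if ( pipe(b) == -1 ){
        close(a[0]);
        close(a[1]);
        return -1;
    }

    /* b is written BEFORE a */
    if ( write(b[1], "b", 1) != 1 || write(a[1], "a", 1) != 1 )
        mask = -1;

    if ( mask == 0 ){
        FD_ZERO(&rd);
        FD_SET(a[0], &rd);
        FD_SET(b[0], &rd);
        maxfd = (a[0] > b[0]) ? a[0] : b[0];

        if ( select(maxfd + 1, &rd, NULL, NULL, NULL) == -1 )
            mask = -1;
        else{
            /* both are reported ready; arrival order is not recorded */
            if ( FD_ISSET(a[0], &rd) ) mask |= 1;
            if ( FD_ISSET(b[0], &rd) ) mask |= 2;
        }
    }

    close(a[0]); close(a[1]);
    close(b[0]); close(b[1]);
    return mask;
}
```

A caller that then tests the descriptors in a fixed order, the way
process_input() tests stdout before stderr, would handle a before b
even though b was written first.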

What I'm wondering is: there's got to be a way to do this, such that
the output is synchronized to the order of the input, doesn't there?
The operating system does (more or less) this all the time, whenever
it sends stdout and stderr to your terminal.
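
As far as I can tell, the terminal case keeps its order because
descriptors 1 and 2 both refer to the same terminal, so every write
lands in a single stream.  A rough sketch of the same idea with a pipe
follows; run_merged() is a hypothetical helper, not part of redir.c.
The child gets one pipe duplicated onto both stdout and stderr, and the
parent reads a single stream that is already in write order.  The cost
is that the reader can no longer tell which stream a given line came
from, so this does not by itself solve the of=/ef=/bf= case:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper, not part of redir.c: run a child whose stdout and
 * stderr are both duplicates of ONE pipe, and collect what it writes.
 * Returns the number of bytes stored in out, or -1 on error.
 */
int run_merged( char *out, size_t cap )
{
    int     p[2];
    pid_t   pid;
    size_t  total = 0;
    ssize_t n;
    int     i;

    if ( cap == 0 || pipe(p) == -1 )
        return -1;

    pid = fork();
    if ( pid == -1 ){
        close(p[0]);
        close(p[1]);
        return -1;
    }

    if ( pid == 0 ){
        /* child: descriptors 1 and 2 now share the pipe's write end */
        close(p[0]);
        dup2(p[1], 1);
        dup2(p[1], 2);
        close(p[1]);
        for ( i = 0; i < 3; i++ ){
            char line[32];
            int  len;

            len = snprintf(line, sizeof line, "out %d\n", i);
            if ( write(1, line, (size_t)len) != len ) _exit(1);
            len = snprintf(line, sizeof line, "err %d\n", i);
            if ( write(2, line, (size_t)len) != len ) _exit(1);
        }
        _exit(0);
    }

    /* parent: only one stream to read, so there is nothing to reorder */
    close(p[1]);
    while ( total < cap - 1 &&
            (n = read(p[0], out + total, cap - 1 - total)) > 0 )
        total += (size_t)n;
    out[total] = '\0';
    close(p[0]);
    waitpid(pid, NULL, 0);
    return (int)total;
}
```

Called with a buffer of a hundred bytes or so, this should hand back
the "out" and "err" lines alternating exactly as the child wrote them.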

Any I/O gurus out there know by what mechanism this could be achieved?
I'm now very curious...  It seems to me this is probably not the only
sort of application where you might want to process the input from
several descriptors in the order it was received.

-- 
Derek Martin               ddm at pizzashack.org    
---------------------------------------------
I prefer mail encrypted with PGP/GPG!
GnuPG Key ID: 0x81CFE75D
Retrieve my public key at http://pgp.mit.edu
Learn more about it at http://www.gnupg.org
-------------- next part --------------
/*
 * redir.c - generic redirection of stdout and stderr
 * copyright 2002 Derek D. Martin <ddm at pizzashack.org>
 *
 * This program is licensed under a BSD-style license, as follows: 
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/stat.h>
#include <fcntl.h>

#define O_OUTFILE  1
#define O_ERRFILE  (1<<1)
#define O_BOTHFILE (1<<2)

#define BUF_SZ 40

/* RDF = ReDir Files */
typedef struct redir_files{
    char   *outfile;    /* stdout goes here */
    char   *errfile;    /* stderr goes here */
    char   *bothfile;   /* both goes here */
    FILE   *of;         /* outfile file */
    FILE   *ef;         /* err */
    FILE   *bf;         /* both stdout and stderr */
} RDF;


void usage(void);
int process_args( int argc, char **argv, RDF *rdf );
int process_input( int outfd, int errfd, fd_set *inputs, RDF rdf );
int readp( int desc, void *buff, size_t count );
int writep( int desc, void *buff, size_t count );


int  options;              /* bit field representing options used */

int main( int argc, char **argv )
{
    fd_set inputs;              /* fd_set for reading from child */
    RDF    rdf;                 /* output file names and descriptors */
    int    pid;                 /* for fork() */
    int    prog_index;          /* argv[prog_index] is the program to run */
    int    pipe1[2];            /* for stdout from program */
    int    pipe2[2];            /* for stderr from program */
    int    status;              /* for return values from I/O functions */

    rdf.outfile = NULL;
    rdf.errfile = NULL;
    rdf.bothfile = NULL;

    /* find out what the user wants */
    prog_index = process_args(argc, argv, &rdf);

    /* 
     * Open the various files.  We could use a method to specify that we want
     * to append to the various files, rather than to overwrite them...
     */
    if ( options & O_OUTFILE ){
        rdf.of = fopen(rdf.outfile, "w");
        if ( rdf.of == NULL ){
            fprintf(stderr, "redir: couldn't open of\n");
            exit(2);
        }
        setlinebuf(rdf.of);
    }
    if ( options & O_ERRFILE ){
        rdf.ef = fopen(rdf.errfile, "w");
        if ( rdf.ef == NULL ){
            fprintf(stderr, "redir: couldn't open ef\n");
            exit(2);
        }
        setlinebuf(rdf.ef);
    }
    if ( options & O_BOTHFILE ){
        rdf.bf = fopen(rdf.bothfile, "w");
        if ( rdf.bf == NULL ){
            fprintf(stderr, "redir: couldn't open bf\n");
            exit(2);
        }
        setlinebuf(rdf.bf);
    }

    /* open our happy pipes */
    if ( (pipe(pipe1) == -1) ){
        fprintf(stderr, "redir: couldn't open stdout pipe\n");
        exit(3);
    }
    if ( pipe(pipe2) == -1 ){
        fprintf(stderr, "redir: couldn't open stderr pipe\n");
        exit(3);
    }

    pid = fork();
    if ( pid == -1 ){
        fprintf(stderr, "redir: fork failed\n");
        exit(3);
    }

    if ( !pid ){

        /* in the child */
        int chld_stderr;        /* save stderr */

        chld_stderr = dup(2);

        /* redirect stdout and stderr */
        dup2(pipe1[1],1);
        dup2(pipe2[1],2);

        /* close the half of the pipes we don't need */
        close(pipe1[0]);
        close(pipe2[0]);

        /* call our happy program */
        execvp( argv[prog_index], &argv[prog_index]);

        /* if we get here, the exec failed */
        dup2(chld_stderr, 2);

        fprintf(stderr, "redir: execvp() of %s failed!\n", argv[prog_index]);
        exit(4);

    }
    else{

        /* in the parent -- where the real fun happens */

        /* set line buffered mode on stdout and stderr */
        setlinebuf(stdout);
        setlinebuf(stderr);

        /* we don't need the write end of the pipe in the parent */
        close(pipe1[1]);
        close(pipe2[1]);

        /* set up fd_set */
        FD_ZERO(&inputs);
        FD_SET(pipe1[0], &inputs);
        FD_SET(pipe2[0], &inputs);

        /* get input from the child */
        while (1){
            status = process_input(pipe1[0], pipe2[0], &inputs, rdf);
            if ( status ) exit(0);
        }
    }

    /* this is really not needed */
    return 0;
}


/* process_args()
 *
 * Process the command line options, setting file vars to the value of their
 * parameters.   Returns the index of argv where the parameters don't match a
 * file arg
 */
int process_args( int argc, char **argv, RDF *rdf )
{

    int flag;
    int index;

    if ( argc < 2 ){
        usage();
        exit(1);
    }

    /* skip argv[0] */
    argc--;
    argv++;
    index = 1;

    /* check for keywords in cmd-line args */
    while ( argc ){

        flag = 0;

        if ( (strncmp("of=", *argv, 3) == 0) ){
            if ( strlen(*argv) < 4 )
                usage();
            rdf->outfile = strdup((*argv)+3);
            options |= O_OUTFILE;
            flag = 1;
        }
        if ( (strncmp("ef=", *argv, 3) == 0) ){
            if ( strlen(*argv) < 4 )
                usage();
            rdf->errfile = strdup((*argv)+3);
            options |= O_ERRFILE;
            flag = 1;
        }
        if ( (strncmp("bf=", *argv, 3) == 0) ){
            if ( strlen(*argv) < 4 )
                usage();
            rdf->bothfile = strdup((*argv)+3);
            options |= O_BOTHFILE;
            flag = 1;
        }

        if ( !flag ) return index;

        argc--;
        argv++;
        index++;
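    }

    /* every argument was a file option, so there is no program to run */
    usage();
    return index;       /* not reached; usage() exits */
}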

void usage(void)
{
    /* doesn't get much more terse than this! */
    printf("Bad command args\n");
    exit(1);
}


/* this probably needs to catch SIGPIPE */
int process_input( int outfd, int errfd, fd_set *inputs, RDF rdf )
{
    static int out_eof = 0;     /* set once the stdout pipe is finished */
    static int err_eof = 0;     /* set once the stderr pipe is finished */
    char   inbuf[BUF_SZ];       /* input buffers for reading from prog */
    char   errbuf[BUF_SZ];
    int    status, rc;
    int    maxfd = (outfd > errfd) ? outfd : errfd;

    /* call select to wait indefinitely for input from the program */
    if ( (status = select(maxfd + 1, inputs, NULL, NULL, NULL)) == -1 ){
        fprintf(stderr, "redir: error occurred in select\n");
        exit(5);
    }

    /* check to see if we have fd's to read */
    if ( status ){

        /* now see if stdout was one of them */
        if ( FD_ISSET(outfd, inputs) ){
            memset(inbuf, 0, BUF_SZ);
            if ( (rc = readp(outfd, inbuf, BUF_SZ - 1)) == -1 )
                fprintf(stderr, "redir: error reading from stdout pipe\n");

            /* figure out where to print stdout to, and do it */
            if ( rc > 0 ){
                if ( options & O_OUTFILE ){
                    if ( fprintf(rdf.of, "%s", inbuf) < 1 )
                        fprintf(stderr, "redir: write error on outfile\n" );
                }
                if ( options & O_BOTHFILE ){
                    if ( fprintf(rdf.bf, "%s", inbuf) < 1 )
                        fprintf(stderr, "redir: write error on bothfile\n" );
                }

                /* always print to stdout */
                printf("%s", inbuf);
            }
            else
                out_eof = 1;    /* EOF or read error: stop watching it */
        }

        /* now see if stderr was one of them */
        if ( FD_ISSET(errfd, inputs) ){
            memset(errbuf, 0, BUF_SZ);
            if ( (rc = readp(errfd, errbuf, BUF_SZ - 1)) == -1 )
                fprintf(stderr, "redir: error reading from stderr pipe\n");

            /* figure out where to print stderr to, and do it */
            if ( rc > 0 ){
                if ( options & O_ERRFILE ){
                    if ( fprintf(rdf.ef, "%s", errbuf) < 1 )
                        fprintf(stderr, "redir: write error on errfile\n" );
                }
                if ( options & O_BOTHFILE ){
                    if ( fprintf(rdf.bf, "%s", errbuf) < 1 )
                        fprintf(stderr, "redir: write error on bothfile\n" );
                }

                /* always print to stderr */
                fprintf(stderr, "%s", errbuf);
            }
            else
                err_eof = 1;    /* EOF or read error: stop watching it */
        }
    }

    /* select() overwrote the set, so rebuild it from the live descriptors */
    FD_ZERO(inputs);
    if ( !out_eof ) FD_SET(outfd, inputs);
    if ( !err_eof ) FD_SET(errfd, inputs);

    /* nonzero tells the caller that both streams are done */
    return out_eof && err_eof;
}



int readp( int desc, void *buff, size_t count )
{
    size_t  total_read;      /* number of bytes read so far */
    ssize_t chars_read;      /* bytes read in one read; -1 on error */
    size_t  chars_needed;    /* number of bytes left to read */
    char    *buff_tail;      /* pos in buff to put read chars */

    total_read = 0;
    buff_tail = buff;

    /*
     * loop until we get what we want or get to EOF; note that this blocks
     * until count bytes have arrived, even if another descriptor has data
     * waiting in the meantime
     */
    while ( total_read < count ){

        /* determine how many chars we need to read */
        chars_needed = count - total_read;

        /* try to read all of them */
        if ( ( chars_read = read(desc, buff_tail, chars_needed) ) == -1 ){
            *buff_tail = '\0';
            return -1;
        }

        /* are we at the end? */
        if ( chars_read == 0 ){
            *buff_tail = '\0';
            return total_read;
        }

        /* advance past what we got; the loop works out what is left */
        buff_tail += chars_read;
        total_read += chars_read;
    }
    return total_read;

}
      

int writep( int desc, void *buff, size_t count )
{

    size_t  total_written;     /* number of bytes written so far */
    ssize_t chars_written;     /* bytes written in one write; -1 on error */
    size_t  chars_togo;        /* number of bytes left to write */
    char    *buff_tail;        /* pos in buff of the next char to write */

    total_written = 0;
    buff_tail = buff;

    /* loop until everything has been written or an error occurs */
    while ( total_written < count ){

        /* determine how many chars we need to write */
        chars_togo = count - total_written;

        /* try to write all of them */
        if ( ( chars_written = write(desc, buff_tail, chars_togo) ) == -1 )
            return -1;

        /* advance past what went out; the loop works out what is left */
        buff_tail += chars_written;
        total_written += chars_written;

    }
    return total_written;

}

