[RndTbl] Ksh scripting, named pipes

Gilles Detillieux grdetil at scrc.umanitoba.ca
Tue Mar 4 11:37:45 CST 2014


Hey, Kevin.  I guess I misled you last week when I suggested two 
processes reading from the same pipe might work to implement a queue.  I 
thought the "read" calls in the script would do the right thing, but 
that appears not to be the case.

The key to avoiding race conditions between co-operating shell processes
is to find some sort of operation that's "atomic" - one that does its
work all in one shot, so it can't be interrupted mid-stream.  It appears
that "read" is atomic at the single-character level, not at the
single-line level as would be needed for this queue to work properly.  I
recall now that on Unix/Linux systems, the link() system call, or the ln
command, is good for that: if two processes try to make a hard link of
the same name to the same file without forcing it (-f), only one of them
will succeed.  So, instead of implementing a queue of file names to
process, you could make both processes use the same file list, and for
each file check that a) the file has not already been processed, and b)
you can link to the file; if both conditions are met, the loop can go
ahead and process that file.  If I recall correctly, the processing was
to compress each tar file, and you wanted two simultaneous processes to
speed up the task on a multi-core system.  Each process could be
implemented something like the following:

ls /zonebackup/*tar | while read -r F; do
    # skip work that's visibly done or claimed; the ln is the real
    # (atomic) test - the -e checks just avoid needless attempts
    if [ ! -e "$F.gz" -a ! -e "$F.lock" ] && ln "$F" "$F.lock" 2>/dev/null; then
        # -f: the lock link raises the file's link count, and gzip
        # normally refuses to compress a file that has other links
        gzip -f "$F"
        rm -f "$F.lock"
    fi
done
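
To run a pair of these in parallel, a parent script need only start the
same worker loop twice and wait, much as your original script launched
zip1 and zip2.  A minimal sketch (the /opt/cronjobs/worker path is just
a placeholder for wherever you save the loop above):

#!/bin/ksh
# start two copies of the locking loop; the ln locks keep them
# from ever picking the same file
/opt/cronjobs/worker &
/opt/cronjobs/worker &
wait    # block until both background workers exit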

Theoretically, you should be able to run two (or more) of these and they
should compress all the files without getting in each other's way.  The
loop won't even attempt the ln command if it sees the .gz file or .lock
link already there; but in the event of a race, where both processes
decide almost simultaneously that the lock isn't there and both run the
ln command, only one of the two ln's will succeed, and only the
successful one will go on to compress the file and remove the lock link.
The redirect to /dev/null just keeps the losing ln command quiet, and
the -f on gzip is needed because the lock link raises the file's link
count, and gzip normally refuses to compress a file that has other
links.  I haven't actually tested this whole script, but I have
successfully used ln in the past for locking files in this manner.
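
You can see the atomicity for yourself at the command line; only the
exit status of ln matters, so a quick test (file names here are just
for illustration) looks like:

touch demo.tar
ln demo.tar demo.tar.lock 2>/dev/null && echo "first ln wins"
ln demo.tar demo.tar.lock 2>/dev/null && echo "never printed - link exists"
rm -f demo.tar demo.tar.lock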

The next step would be to find the optimum number of processes - the
point beyond which adding more gives no further speedup of the overall
job.  That would likely depend on the number of files, their sizes, and
the number and speed of your CPU cores.  The faster each CPU core is,
the more the compression task shifts from being CPU-bound to being
I/O-bound, and the less you gain from additional parallel processes.
Compressing lots of small files, rather than a few large ones, would
likely skew things towards being I/O-bound too, because of the per-file
overhead of each gzip startup.
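
If you'd rather measure than guess, a rough driver (untested, and again
using /opt/cronjobs/worker as a placeholder) could time the whole batch
for each worker count you want to try:

#!/bin/ksh
# usage: timebatch [N] - run N workers in parallel and time the batch
N=${1:-2}
run_batch() {
    i=0
    while (( i < N )); do
        /opt/cronjobs/worker &
        let 'i=i+1'
    done
    wait    # return once every worker has finished
}
time run_batch

Bear in mind each run compresses (and so consumes) the test files, so
you'd need a fresh copy of the file set for every timing run.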

On 01/03/2014 4:48 PM, Kevin McGregor wrote:
> Aw. I was afraid of that. In my 'production' script there would be a 
> long time between most reads, so it's unlikely there would be a 
> problem, but I still don't want rare random failures. I'll find a 
> work-around.
>
>
> On Sat, Mar 1, 2014 at 2:12 PM, Adam Thompson <athompso at athompso.net> wrote:
>
>     Oh, it's obvious when I think about it - the behavior of a pipe
>     with multiple readers is OS- and C-library-dependent.
>     Each byte is only available to one reader, ever - if the reading
>     is interleaved, each reader will get garbage.
>     You can use tee(1) to write to multiple FIFOs at once, or just
>     adapt the writing script to do so manually.
>     -Adam
>
>     On Mar 1, 2014 1:35 PM, Kevin McGregor <kevin.a.mcgregor at gmail.com> wrote:
>     >
>     > The writer is:
>     > #!/bin/ksh
>     > PIPE=/tmp/backup.pipe
>     > [[ ! -a $PIPE ]] && mkfifo $PIPE
>     > # Start gzip processes
>     > /opt/cronjobs/zip1 &
>     > /opt/cronjobs/zip2 &
>     >
>     > # Process files needing compression
>     > let 'fc=0'
>     > ls /zonebackup/*tar | while read F; do
>     >         echo $F >$PIPE
>     >         let 'fc=fc+1'
>     > done
>     >
>     > echo "end of list" >$PIPE
>     > echo "end of list" >$PIPE
>     > exit 0
>     >
>     > The readers are:
>     > #!/bin/ksh
>     > PIPE=/tmp/backup.pipe
>     > NAME=zip1
>     > if [[ ! -a $PIPE ]]; then
>     >         logger -p local0.warning "$NAME can't find $PIPE -- exiting"
>     >         exit 1
>     > fi
>     >
>     > while (( 1 )); do
>     >         read F <$PIPE
>     >         if [[ "$F" = "end of list" ]]; then
>     >                 break
>     >         else
>     >                 echo "$NAME: $F"
>     >         fi
>     > done
>     >
>     >
>     >
>     > On Sat, Mar 1, 2014 at 1:29 PM, Kevin McGregor
>     > <kevin.a.mcgregor at gmail.com> wrote:
>     >>
>     >> I tried fiddling with IFS to no avail. I just changed it like this:
>     >> IFS='
>     >> '
>     >> And now the readers show all kinds of gibberish! All lines have
>     >> no whitespace, save for the newline at the end. I'm assuming it's
>     >> at the end.
>     >>
>     >>
>     >> On Sat, Mar 1, 2014 at 12:47 PM, Robert Keizer
>     >> <robert at keizer.ca> wrote:
>     >>>
>     >>> Have you tried setting IFS (the field separator)? Also you could
>     >>> enable raw mode with -r.
>     >>>
>     >>> Can you share the script?
>     >>>
>     >>> Are the same lines failing repeatedly?
>     >>>
>     >>> Rob
>     >>>
>     >>> On 2014-03-01 11:55 AM, "Kevin McGregor"
>     >>> <kevin.a.mcgregor at gmail.com> wrote:
>     >>>>
>     >>>> I have a main script which writes to a named pipe. Before it
>     >>>> starts writing, it starts two other scripts which read from this
>     >>>> pipe. The reading and writing is a list of file names, one per
>     >>>> line. How do I ensure that each script reads one complete line
>     >>>> from the pipe at a time (no more, no less)?
>     >>>>
>     >>>> I have a test set up, and it usually works, but sometimes a
>     >>>> reader will get a blank line or just a "/" (but not any other part
>     >>>> of a line)!
>     >>>>
>     >>>> Kevin
>

-- 
Gilles R. Detillieux              E-mail: <grdetil at scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 0J9  (Canada)
