[RndTbl] restarting a process via crontab?

athompso at athompso.net athompso at athompso.net
Sun Jun 13 14:36:58 CDT 2010

My server panics occasionally (bad RAM that I haven't replaced yet - it's
ECC but one bank occasionally hits an uncorrectable multi-bit error), which
of course leaves various state files lying around on disk without the
processes to match them.

In particular, fetchmail(1) is causing me problems; there's a
~/.fetchmail.pid file that's supposed to contain the PID of the running
fetchmail daemon.  It has to run (well, I *want* it to run) as me, not as
root, so starting it from init isn't a good solution.  I'd also like it to
recover gracefully if I have to kill it (or it crashes due to a bug), which
also contraindicates the obvious rc.local solution.

So far, the best solution I've come up with is a crontab entry (for me,
not root):

0 * * * * if [ -f /home/athompso/.fetchmail.pid ];then read FP</home/athompso/.fetchmail.pid;if [ -d /proc/$FP -a -r /proc/$FP/exe ];then BIN=$(stat -c \%N /proc/$FP/exe);BIN="${BIN#*-> }";BIN="${BIN#\`}";BIN="${BIN\%\'}";BIN=$(basename "${BIN}");if [ "$BIN" = "fetchmail" ];then exit 0;fi;fi;fi;fetchmail

(or, more readably:)
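        # Skip the restart if the PID file points at a live fetchmail.
        if [ -f /home/athompso/.fetchmail.pid ]; then
                read FP < /home/athompso/.fetchmail.pid
                if [ -d /proc/$FP -a -r /proc/$FP/exe ]; then
                        BIN=$(stat -c %N /proc/$FP/exe)
                        BIN="${BIN#*-> }"
                        BIN="${BIN#\`}"
                        BIN="${BIN%\'}"
                        BIN=$(basename "${BIN}")
                        if [ "$BIN" = "fetchmail" ]; then
                                exit 0
                        fi
                fi
        fi
        fetchmail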

I particularly hate the method of extracting the image name for a given
PID, but haven't come up with anything better yet.  Suggestions gratefully
appreciated.  Or even a better way of doing this altogether; this feels
like a terrible hack - it works, but doesn't seem elegant.
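
One possible way to sidestep the stat(1) parsing, as a minimal sketch rather
than a tested replacement: readlink(1) prints the target of /proc/PID/exe
with no quoting around it.  The function name pid_runs_image and its optional
third argument (an alternate /proc root, only there to make testing easier)
are made up for illustration.

```shell
#!/bin/sh
# pid_runs_image PIDFILE NAME [PROCROOT]
# Hypothetical helper: returns 0 only if PIDFILE names a PID whose
# executable's basename is NAME. PROCROOT defaults to /proc.
pid_runs_image() {
    pidfile=$1 want=$2 procroot=${3:-/proc}

    [ -f "$pidfile" ] || return 1

    # Use the first word of the first line as the PID; tolerate a file
    # that lacks a trailing newline (read fails but still sets pid).
    read -r pid _ < "$pidfile" || [ -n "$pid" ] || return 1

    # Reject anything that is not purely numeric.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # readlink prints the symlink target verbatim, with no quotes to strip.
    exe=$(readlink "$procroot/$pid/exe") || return 1

    # ${exe##*/} strips the directory part without forking basename(1).
    [ "${exe##*/}" = "$want" ]
}
```

With something like that saved in a script, the job itself could shrink to
`pid_runs_image "$HOME/.fetchmail.pid" fetchmail || fetchmail`.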

FYI: I decided against using sed(1) to parse the output because doing it
this way saves two fork(2)s and one exec(2).  I know I don't need to worry
about efficiency on a dual-Xeon server, but I'm trying to stay in the habit
because I'm also working on 200MHz ARM5 platforms where it does matter.
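
To illustrate the fork-saving point: the extraction, including the basename
step, can be written with parameter expansion alone, so nothing is forked
after stat(1) itself.  This is only a sketch; the helper name image_from_stat
and the sample string are assumptions for illustration, and the extra
leading-apostrophe case is there on the assumption that some stat(1) versions
quote with apostrophes on both sides.

```shell
#!/bin/sh
# image_from_stat is a hypothetical helper, not part of the crontab entry.
# Input:  one line of `stat -c %N` output for a symlink.
# Output: basename of the link target, using shell builtins only.
image_from_stat() {
    s=${1#*"-> "}    # keep only the text after "-> "
    s=${s#'`'}       # drop a leading backtick, if present
    s=${s#"'"}       # drop a leading apostrophe, if present
    s=${s%"'"}       # drop a trailing apostrophe, if present
    s=${s##*/}       # strip the directory part, replacing basename(1)
    printf '%s\n' "$s"
}

image_from_stat "\`/proc/1234/exe' -> \`/usr/bin/fetchmail'"   # prints: fetchmail
```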

(And who ever thought putting the output of stat(1) in typographical-style
quotes would be a good idea???)

Thanks for any suggestions,
