[RndTbl] disk usage

Trevor Cordes trevor at tecnopolis.ca
Thu Jul 13 04:17:02 CDT 2017


On 2017-07-11 Kevin McGregor wrote:
> Here's another attempt to send this out:
> 
> I copied a bunch of stuff with rsync from a Linux system to a BSD
> system. I'm fairly sure it worked fine, but as a check, I'd like to
> compare the total number of bytes in files on both systems.
> du -sk <dir> produces slightly different results on both systems
> (smaller on dest)

Ya, du counts allocated disk blocks, not bytes, so the totals will
differ between filesystems.  GNU du (what Linux ships) has a --bytes
(-b) option which should give you what you're looking for.  On BSD, if
you can get a GNU du, Bob's your uncle.
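
For what it's worth, a minimal sketch of the du -b route, assuming GNU
coreutils du (a stock BSD du won't take -b).  The name total_bytes is
made up for illustration.  Note du also counts the size of the
directories themselves, which varies by filesystem, so two identical
trees can still come out a little different.

```shell
# Sketch only: assumes GNU du. -b means --apparent-size --block-size=1.
# total_bytes is a hypothetical helper name, not a standard tool.
total_bytes() {
    # -s: print one summary line; cut keeps the number before the tab
    du -sb "$1" | cut -f1
}

# Usage: total_bytes /path/to/dir
```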

> I also tried out
> find <dir> -type f -print0 | xargs -0 stat --format=%s | awk '{s+=$1}
> END {print s}'
> on Linux and
> find rsync -type f -print0 | xargs -0 stat -f %Dz | awk '{s+=$1} END
> {print s}'

GNU find can print the size itself with -printf '%s\n', so you can skip
the extra xargs/stat bit of the pipe, but only if BSD has the GNU
find  <grin>

Otherwise your code above looks sane to me.
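
A rough sketch of the find-only variant, assuming GNU find on the box
you run it on (-printf is a GNU extension).  The name sum_file_bytes
is made up for illustration.

```shell
# Sketch only: assumes GNU find (for -printf).
# sum_file_bytes is a hypothetical helper name.
sum_file_bytes() {
    # %s is each file's size in bytes; awk adds them up.
    # printf "%.0f" avoids exponent notation on big totals in some awks.
    find "$1" -type f -printf '%s\n' | awk '{s+=$1} END {printf "%.0f\n", s}'
}

# Usage: sum_file_bytes /path/to/dir
```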

Instead of summing, I would dump the output to a file on each host,
sort it, and run diff on it.  (You could alter the stat/find to output
(relative) filename too to spot the offender.)
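
Something along these lines would do the list/sort/diff, again assuming
GNU find on the side you run it on (the BSD side would need the
stat-based equivalent).  list_sizes and the file names in the usage
comments are placeholders.

```shell
# Sketch only: assumes GNU find (for -printf).
# list_sizes is a hypothetical helper name.
list_sizes() {
    # cd first so the paths come out relative to the tree root;
    # LC_ALL=C keeps the sort order identical on both hosts.
    ( cd "$1" && find . -type f -printf '%p\t%s\n' | LC_ALL=C sort )
}

# Usage (paths and file names are placeholders):
#   list_sizes /data/src  > src.sizes     # on the source host
#   list_sizes /data/dest > dest.sizes    # on the destination host
#   diff src.sizes dest.sizes             # differing files show up by name
```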

Even better, have find exec md5sum (or another hash) on every file, and
output it all to a file with the filename, sort, and diff.  That will
catch bitflip type errors.  However, on a big set of files you'll need
to go grab a coffee.
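
A sketch of the hash version, assuming GNU find and coreutils md5sum
are available (on BSD the md5 tool and its output format differ, so
you'd have to normalize the output first).  list_hashes is a made-up
name.

```shell
# Sketch only: assumes GNU find and md5sum.
# list_hashes is a hypothetical helper name.
list_hashes() {
    # md5sum prints "<hash>  <path>"; sort on the path (field 2 onward)
    # so both hosts list files in the same order.
    ( cd "$1" && find . -type f -exec md5sum {} + | LC_ALL=C sort -k2 )
}

# Usage (paths and file names are placeholders):
#   list_hashes /data/src  > src.md5
#   list_hashes /data/dest > dest.md5
#   diff src.md5 dest.md5
```

Unlike the size listing, this will flag a file whose length matches but
whose contents don't.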

I have a perl script I wrote ages ago to compare two (possibly remote)
dir trees based on multiple criteria and bring up the results in meld
(GUI diff); I can email it to you off-list if you like.  However, I'm
not sure it will be portable to BSD as I just call find/etc under the
hood.  Never did add in hash support but I think I might now!

