[RndTbl] sshd: Corrupted MAC on input.
trevor at tecnopolis.ca
Wed Jul 29 20:31:53 CDT 2020
On 2020-07-29 Gilbert E. Detillieux wrote:
> What's the likely cause of this? A bad NIC? Bad RAM? (I'm guessing
> something is corrupting the packets once in a while, but I'm not sure
> what. If so, it seems to get past TCP's error correcting.)
I would try the same type of transfer using a different client to the
same server. Then try a different server for the same client. If you
can get the same behavior with a different server, that would be
You could also try using nc from /dev/zero from the server to the
client into a file, then use a script (or something) to check if the
file is all zeros. It would be neat to see the actual corruption that
occurs. Make sure nc is using TCP (though UDP would be an interesting
test as well, but not critical or required).
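For the "check if the file is all zeros" step, something like the following would do. It is only a minimal sketch: the function name find_nonzero and the chunk size are my own choices, and it assumes the nc output has already been saved to a file on the client. It prints the offset and value of the first few non-zero bytes, which is where any corruption would show up.

```python
import sys

CHUNK = 1 << 20  # read 1 MiB at a time so large captures don't fill RAM


def find_nonzero(path, limit=20):
    """Return up to `limit` (offset, byte_value) pairs for non-zero bytes."""
    hits = []
    offset = 0
    with open(path, "rb") as f:
        while len(hits) < limit:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            # Fast path: an all-zero chunk strips down to nothing.
            if chunk.strip(b"\x00"):
                for i, value in enumerate(chunk):
                    if value:
                        hits.append((offset + i, value))
                        if len(hits) >= limit:
                            break
            offset += len(chunk)
    return hits


# Only runs when a file name is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    found = find_nonzero(sys.argv[1])
    if not found:
        print("file is all zeros")
    for off, value in found:
        print(f"offset {off}: 0x{value:02x}")
```

Looking at the offsets (single flipped bits versus whole runs of garbage, and whether they line up on some boundary) might hint at where the damage is happening.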
You're right that TCP shouldn't really allow such (line) errors to get
through to the ssh layer.
If your NIC has TCP checksum offloading, try turning it off (ethtool is
what I used to use for that, not sure if it's still "the way"). That
will eliminate the NIC and bus from the equation, leaving you with
RAM/CPU and/or mobo between the two (but not out to the cards/bridge).
If you turn off offloading and the problem goes away, your transfer
performance should tank, because the kernel will now catch each
corrupted segment and force a TCP retransmit.
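To confirm whether checksum offloading is actually on or off before and after the change, the output of "ethtool -k <interface>" can be inspected. Below is a rough sketch that picks out the two checksumming lines; the function name is made up for illustration, and it assumes the usual "rx-checksumming: on" / "tx-checksumming: on" line format, which may differ between ethtool versions. On the systems I have used, the setting itself is changed with "ethtool -K <interface> rx off tx off", but check your own man page.

```python
import subprocess
import sys


def checksum_offload_state(ethtool_k_output):
    """Extract rx/tx checksumming state from `ethtool -k` text output."""
    state = {}
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name in ("rx-checksumming", "tx-checksumming"):
            words = value.split()
            # The first word is on/off; a trailing "[fixed]" is ignored.
            state[name] = words[0] if words else ""
    return state


# Only runs when an interface name is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    result = subprocess.run(
        ["ethtool", "-k", sys.argv[1]],
        capture_output=True, text=True, check=True,
    )
    print(checksum_offload_state(result.stdout))
```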
My guess, as always, is... wait for it... bad caps on the board, likely
near the NIC slot, or, if onboard, near the NIC onboard chip. I've had
weird NIC behavior before and it's always turned out to be the caps
near the card slot, usually little 1000 µF jobbers.
I just decommissioned the main workstation I'd used since 2008(!), which was
starting to get occasional VGA lockups, and lo and behold, the caps
near the slots were just starting to get puffy (on a very high end
Intel board). I'll be repairing them soon to repurpose the system.
P.S. If a repair or replacement isn't possible for a while, sometimes
moving the NIC as far away as possible from the puffiest caps can help
until more caps go bad. Each slot or pair of slots usually gets its own
cap(s).
Also, putting in a junkier NIC might help if it draws less power.
These cap problems are always exacerbated by higher (transient/peak)
Keep us posted!