[RndTbl] sshd: Corrupted MAC on input.
Gilbert E. Detillieux
gedetil at cs.umanitoba.ca
Tue Aug 4 09:56:59 CDT 2020
On 2020-07-31 11:42 p.m., Trevor Cordes wrote:
> On 2020-07-31 Hartmut W Sager wrote:
>> Oops, *sorry Gilbert*, I looked at this thread again, and it was
>> *your* position on checksums that I'm supporting. Maybe I have some
>> bad/failing Chinese capacitors in my head. :)
>
> Your creator should have sprung for the 5c better caps! ;-)
>
> Looks like you guys are right. TCP only has a 16-bit checksum, and
> it's a simple sum then one's complement over (most of) the whole packet.
That's what I thought, but it's been a while since I looked at TCP
headers in detail.
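
For reference, here is a minimal sketch of the checksum arithmetic described above (the one's-complement sum of 16-bit words, as in RFC 1071). It only shows the arithmetic over a byte string. A real TCP checksum also covers a pseudo-header, which is left out here, and the function name is just for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                        # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:                         # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF                     # one's complement of the folded sum


# Sample bytes from the worked example in RFC 1071.
rfc_example = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
print(hex(internet_checksum(rfc_example)))

# Reordering whole 16-bit words does not change the sum, so it goes undetected.
original = b"\x12\x34\x56\x78"
swapped = b"\x56\x78\x12\x34"
print(internet_checksum(original) == internet_checksum(swapped))
```

Because addition is commutative, any reordering of 16-bit words (or any set of changes that cancel out in the sum) yields the same checksum, which is why it is considered weak on its own.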
> Some post says Microsoft says (paraphrased): "Basically transmit 100MB+
> over a typical Internet connection and you are very likely to see a
> silent failure."
>
> I don't know about that! But, yes, even if you get 1 error through
> every 1GB TCP, that's pretty awful to contemplate.
As I said earlier, the Ethernet frame layer catches most bit errors for
you in the typical network setup. I think there's a 32-bit CRC there...
https://en.wikipedia.org/wiki/Frame_check_sequence
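
To illustrate the difference in strength, here is a rough sketch comparing the 16-bit sum with a 32-bit CRC on a pair of inputs that differ by two compensating changes. It uses Python's zlib.crc32, which (as far as I know) uses the same polynomial as the Ethernet FCS. It is not a byte-exact FCS calculation for a real frame, and the helper name checksum16 is made up for this example.

```python
import zlib


def checksum16(data: bytes) -> int:
    """Folded 16-bit one's-complement sum, complemented (TCP-style arithmetic)."""
    if len(data) % 2:
        data += b"\x00"                        # pad odd-length input
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total >> 16:                         # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


good = b"\x00\x01\x00\x02"
bad = b"\x00\x00\x00\x03"   # first word decreased by 1, second word increased by 1

same_sum = checksum16(good) == checksum16(bad)
same_crc = zlib.crc32(good) == zlib.crc32(bad)
print("16-bit checksum matches:", same_sum)
print("CRC-32 matches:", same_crc)
```

The two words still add up to the same total, so the 16-bit checksum matches for both inputs, while the CRC-32 values differ and the corruption would be caught at that layer.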
We were often seeing undetected TCP bit errors in PPP over serial
(modem) connections (ages ago), where there was no Ethernet frame layer
to do the heavy lifting.
But if the NIC isn't doing its work correctly (either at the Ethernet
frame level or TCP checksum offloads), this could result in bad data
getting up the food chain.
> I guess rsync detects/corrects for this automatically, but
> unfortunately Gilbert is seeing errors in the ssh wrapper layer, in a
> place where ssh is sensitive to errors and wants to barf instead of
> retry. It almost would be better if ssh would just pass up junk to
> rsync and let it deal with it.
I haven't found a way to disable the MAC support in ssh, only a way to
select which MAC algorithms are used at both the client and server end.
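
As a sketch of what that selection looks like on the client side: OpenSSH has a MACs keyword (in ssh_config and sshd_config, or via -o on the command line), and "ssh -Q mac" lists the algorithms a given build supports. The helper below only builds an rsync command line that passes such an option through rsync's -e option; it does not run anything. The function name, paths, and host are hypothetical, and hmac-sha2-256 is just one example algorithm, not a recommendation from this thread.

```python
import shlex


def rsync_over_ssh(src, dest, macs=("hmac-sha2-256",)):
    """Build an rsync argument list whose ssh transport is limited to the given MACs."""
    # MACs takes a comma-separated list, in order of preference.
    ssh_cmd = "ssh -o MACs=" + ",".join(macs)
    return ["rsync", "-a", "-e", ssh_cmd, src, dest]


# Hypothetical source path and destination host, for illustration only.
cmd = rsync_over_ssh("/data/", "backup.example.org:/backup/data/")
print(shlex.join(cmd))   # show the command as it would be typed in a shell
```

This only changes which MAC is negotiated. It does not turn integrity checking off, so a corrupted packet would still end the session.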
> Gilbert, to confirm, your bug hits after you have transferred lots of
> data, right? It's not giving this error right at the beginning upon
> connection, right? Do you have any stats on approx how much data goes
> across each time before the error hits? Is it consistent or all over
> the map?
Yes, these are on large file transfers. Yes, they are occasional, and
random. But I had to restart a large (41-ish GB?) file transfer 3 times
last week due to repeated errors. A typical nightly backup results in
160 GB or so to transfer, and lately, the rsync fails (somewhere along
the way) more often than not.
Gilbert
--
Gilbert E. Detillieux E-mail: <gedetil at cs.umanitoba.ca>
Dept. of Computer Science Web: http://www.cs.umanitoba.ca/~gedetil/
University of Manitoba Phone: (204)474-8161
Winnipeg MB CANADA R3T 2N2 Fax: (204)474-7609