diff options
author | est31 <MTest31@outlook.com> | 2016-05-18 06:18:08 +0200 |
---|---|---|
committer | est31 <MTest31@outlook.com> | 2016-05-22 15:56:54 +0200 |
commit | 423d8c1b0d3841f6ce9756ab0d6b2e10408fc89b (patch) | |
tree | 88a75511309d8bd1d27adb27f5d5f5d3845475cf /src | |
parent | f64a6259b2b5e79103cfd4bc151304e6da39bca9 (diff) | |
download | minetest-423d8c1b0d3841f6ce9756ab0d6b2e10408fc89b.tar.gz minetest-423d8c1b0d3841f6ce9756ab0d6b2e10408fc89b.tar.bz2 minetest-423d8c1b0d3841f6ce9756ab0d6b2e10408fc89b.zip |
Tolerate packet reordering in the early init process
Fixes a bug where packet reordering made the server give the
client two peer ids instead of one. This in turn confused
reliable packet sending and made connecting to the server fail.
The client usually sends three packets at init: one "dummy"
packet consisting of two 0 bytes, and the init packet as well as
its legacy counterpart. The last one can be turned off since commit
af30183124d40a969040d7de4b3a487feec466e4, but this is of lower
relevance for the bug. The relevant part here is that network
packet reorder (which is a normal occurence) can make the packets
reach the server in different order.
If reorder puts the dummy packet further behind, the following
would happen before the patch:
1. The server will get one of the init packets on channel 1 and
assign the client a peer id, as the packet will have zero as
peer id.
2. The server sends a CONTROLTYPE_SET_PEER_ID packet to inform
the client of the peer id.
3. The next packet from the client will contain the peer id set by
the server.
4. The server sets the m_has_sent_with_id member for the client's
peer structure to true.
5. Now the dummy packet arrives. It has a peer id of zero, therefore
the server searches whether it already has a peer id for the
address the packet was sent from. The search fails because
m_has_sent_with_id was set to true and the server only searched
for peers with m_has_sent_with_id set to false.
6. In a working setup, the server would assign the dummy packet to
the correct peer id. However the server instead now assigns a
second peer id and peer structure to the peer, and assign the
packet to that new peer.
7. In order to inform the peer of its peer id, the server sends a
CONTROLTYPE_SET_PEER_ID command packet, reliably, to the peer.
This packet uses the new peer id.
8. The client sends an ack to that packet, not with the new peer id
but with the peer id sent in 2.
9. This packet reaches the server, but it drops the ACK as the peer
id does not map to any un-ACK-ed packets with that seqnum. The
same time, the server still waits for an ACK with the new peer
id, which of course won't come. This causes the server to
periodically re-try sending that packet, and the client ACKing it
each time.
Steps 7-9 cause annoyances and erroneous output, but don't cause
the connection failure itself.
The actual mistake that causes the connection failure happens in 6:
The server does not assign the dummy packet to the correct peer, but
to a newly created one.
Therefore, all further packets sent by the client on channel 0 are
now buffered by the server as it waits for the dummy packet to reach
the peer, which of course doesn't happen as the server assigned
that packet to the second peer it created for the client.
This makes the connection code indefinitely buffer the
TOSERVER_CLIENT_READY packet, not passing it to higher level code,
which stalls the continuation of the further init process
indefinitely and causes the actual bug.
Maybe this can be caused by reordered init packets as well, the only
studied case was where network has reliably reordered the dummy
packet to get sent after the init packets.
The patch fixes the bug by not ignoring peers where
m_has_sent_with_id has been set anymore. The other changes of the
patch are just cleanups of unused methods and fields and additional
explanatory comments.
One could think of alternate ways to fix the bug:
* The client could simply take the new peer id and continue
communicating with that. This is however worse than the fix as
it requires the peer id set command to be sent reliably (which
currently happens, but it cant be changed anymore). Also, such a
change would require both server and client to be patched in order
for the bug to be fixed, as right now the client ignores peer id
set commands after the peer id is different from
PEER_ID_INEXISTENT and the server requires modification too to
change the peer id internally.
And, most importantly, right now we guarantee higher level server
code that the peer id for a certain peer does not change. This
guarantee would have to be broken, and it would require much
larger changes to the server than this patch means.
* One could stop sending the dummy packet. One may be unsure whether
this is a good idea, as the meaning of the dummy packet is not
known (it might be there for something important), and as it is
possible that the init packets may cause this problem as well
(although it may be possible too that they can't cause this).
Thanks to @auouymous who had originally reported this bug and who
has helped patiently in finding its cause.
Diffstat (limited to 'src')
-rw-r--r-- | src/network/connection.cpp | 13 | ||||
-rw-r--r-- | src/network/connection.h | 19 |
2 files changed, 8 insertions, 24 deletions
diff --git a/src/network/connection.cpp b/src/network/connection.cpp index f7452d8e4..02c0aa165 100644 --- a/src/network/connection.cpp +++ b/src/network/connection.cpp @@ -2167,12 +2167,12 @@ void ConnectionReceiveThread::receive() throw InvalidIncomingDataException("Channel doesn't exist"); } - /* preserve original peer_id for later usage */ - u16 packet_peer_id = peer_id; - /* Try to identify peer by sender address (may happen on join) */ if (peer_id == PEER_ID_INEXISTENT) { peer_id = m_connection->lookupPeer(sender); + // We do not have to remind the peer of its + // peer id as the CONTROLTYPE_SET_PEER_ID + // command was sent reliably. } /* The peer was not found in our lists. Add it. */ @@ -2214,11 +2214,6 @@ void ConnectionReceiveThread::receive() } } - - /* mark peer as seen with id */ - if (!(packet_peer_id == PEER_ID_INEXISTENT)) - peer->setSentWithID(); - peer->ResetTimeout(); Channel *channel = 0; @@ -2756,7 +2751,7 @@ u16 Connection::lookupPeer(Address& sender) for(; j != m_peers.end(); ++j) { Peer *peer = j->second; - if (peer->isActive()) + if (peer->isPendingDeletion()) continue; Address tocheck; diff --git a/src/network/connection.h b/src/network/connection.h index fe2c9819d..5ee53b9d4 100644 --- a/src/network/connection.h +++ b/src/network/connection.h @@ -663,8 +663,7 @@ class Peer { m_last_rtt(-1.0), m_usage(0), m_timeout_counter(0.0), - m_last_timeout_check(porting::getTimeMs()), - m_has_sent_with_id(false) + m_last_timeout_check(porting::getTimeMs()) { m_rtt.avg_rtt = -1.0; m_rtt.jitter_avg = -1.0; @@ -687,21 +686,16 @@ class Peer { virtual void PutReliableSendCommand(ConnectionCommand &c, unsigned int max_packet_size) {}; - virtual bool isActive() { return false; }; - virtual bool getAddress(MTProtocols type, Address& toset) = 0; + bool isPendingDeletion() + { MutexAutoLock lock(m_exclusive_access_mutex); return m_pending_deletion; }; + void ResetTimeout() {MutexAutoLock lock(m_exclusive_access_mutex); m_timeout_counter=0.0; }; bool isTimedOut(float timeout); - void setSentWithID() - { MutexAutoLock lock(m_exclusive_access_mutex); m_has_sent_with_id = true; }; - - bool hasSentWithID() - { MutexAutoLock lock(m_exclusive_access_mutex); return m_has_sent_with_id; }; - unsigned int m_increment_packets_remaining; unsigned int m_increment_bytes_remaining; @@ -776,8 +770,6 @@ class Peer { float m_timeout_counter; u32 m_last_timeout_check; - - bool m_has_sent_with_id; }; class UDPPeer : public Peer @@ -795,9 +787,6 @@ public: void PutReliableSendCommand(ConnectionCommand &c, unsigned int max_packet_size); - bool isActive() - { return ((hasSentWithID()) && (!m_pending_deletion)); }; - bool getAddress(MTProtocols type, Address& toset); void setNonLegacyPeer(); |