The claim is made that sending packets one at a time avoids incurring "additional latencies for TCP", but unless I'm missing something, that doesn't hold up.
The main sources of latency in TCP are cases where packets are lost en-route and need to be retransmitted before anything else is done. I don't see how sending a single packet at a time avoids this problem, and unlike UDP, there's no way to "move on" and include the previous data with your next update.
Sending a single "packet" at a time and awaiting an acknowledgement, on top of a protocol that has its own acknowledgements, seems like it would severely pessimize the amount of traffic you could put through a stable connection.
And how would multiple TCP connections speed things up? Head-of-line blocking and retransmits are "symptoms", where TCP is trying to provide reliability over some underlying lossiness (e.g., in the physical connection or routing of the packets). Trying to send on a different socket doesn't somehow remove that lossiness, does it?
Please tell me you have read the article to the end.
First and foremost: this idea is for people sitting behind a firewall who want to use UDP but cannot.
The main sources of latency in TCP are cases where packets are lost en-route and need to be retransmitted before anything else is done.
The TCP connection on the receiving end will wait for the lost packet and will not deliver later packets until the first one arrives. This makes the connection basically dead in the meantime.
The workaround is having multiple TCP connections. For each connection you keep a boolean telling you whether it is blocked. When you get an ACK, you "unblock" that connection, and you only send through unblocked connections. So you basically cycle through your connections, and by the time you want to send through the first connection again, it should have received its ACK and thus be unblocked.
For the receiving end this is basically like UDP: the multiple connections deliver packets/data that are not ordered, and there is no head-of-line blocking.
Now, what happens if a packet is lost? The connection that sent the packet is simply blocked until the packet is retransmitted. So if the sender wants to send a packet and that connection is still blocked, it just cycles through the connections until it finds one that is currently unblocked.
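A minimal sketch of the cycle-through-unblocked-connections idea described above (the actual sockets are stubbed out; all class and method names are mine, not from the article):

```python
from dataclasses import dataclass, field

@dataclass
class Conn:
    """Stand-in for one TCP connection (real code would hold a socket)."""
    id: int
    blocked: bool = False                     # True while an app-level ACK is outstanding
    sent: list = field(default_factory=list)  # record of payloads, for illustration

class MultiConnSender:
    def __init__(self, n_conns):
        self.conns = [Conn(i) for i in range(n_conns)]
        self.next = 0

    def send(self, packet):
        """Cycle through connections and send on the first unblocked one.
        Returns the connection id used, or None if every connection is
        still awaiting its ACK - in which case the caller may skip the
        packet, and (notably) it *knows* the packet was skipped."""
        for _ in range(len(self.conns)):
            c = self.conns[self.next]
            self.next = (self.next + 1) % len(self.conns)
            if not c.blocked:
                c.sent.append(packet)  # real code: c.socket.sendall(packet)
                c.blocked = True       # blocked until the app-level ACK comes back
                return c.id
        return None

    def on_ack(self, conn_id):
        """App-level ACK received on this connection -> unblock it."""
        self.conns[conn_id].blocked = False
```

With 3 connections, the first three sends go out on connections 0, 1 and 2; a fourth send returns None until some ACK arrives and unblocks a connection.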
and unlike UDP, there's no way to "move on" and include the previous data with your next update.
It is impossible to move on, as your first packet will go through anyway. But it is possible to lose packets if all connections are blocked and the sender decides to "skip" a packet. In that case the sender at least knows that the packet is lost.
Sending a single "packet" at a time and awaiting an acknowledgement, on top of a protocol that has its own acknowledgements, seems like it would severely pessimize the amount of traffic you could put through a stable connection.
There will be overhead. Let's be real, though: we are talking about modern internet connections. Even on smartphone internet, sending an additional 40 bytes per message is negligible. If we have a lot of lost packets, however, the overhead will grow significantly, as each lost packet will be transmitted again; this hurts more if an ACK is missing, since the original packet will then be retransmitted too. And as twice as many packets are sent, the probability of a lost ACK also grows.
So this can handle small amounts of packet loss, but larger amounts can quickly become a problem.
If there is a surge of high packet loss, this sort of setup will take longer to recover, as at least one connection has to retransmit successfully before the next packet can be sent.
The main sources of latency in TCP are cases where packets are lost en-route and need to be retransmitted before anything else is done.
Well, yes. It is known as "head of line blocking". However - HOL blocking is per-connection(!) - and this is what is being exploited here.
I don't see how sending a single packet at a time avoids this problem
Easy - if there are no outstanding packets within current_TCP_connection, it means that there is nothing to block our current packet (and HOL blocking cannot possibly happen), so it will go out immediately. Sure, this packet can be dropped - but so can a UDP packet, so there is no real difference here.
seems like it would severely pessimize the amount of traffic you could put through a stable connection.
Not really; my educated guess is that most of the time the TCP ACK will be piggy-backed on top of our app-level ACK - which means we'll be sending something like 41 bytes (40 bytes of headers + 1 byte of payload) instead of the 40 bytes for a usual TCP ACK (just the 40 bytes of headers, with no payload) - which is an overhead, but not really a noticeable one.
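A back-of-envelope check of the byte counts above (assuming IPv4 and TCP headers without options, 20 bytes each):

```python
# Header sizes for a minimal IPv4/TCP segment (no IP or TCP options)
IP_HDR = 20   # bytes, IPv4 header
TCP_HDR = 20  # bytes, TCP header

plain_tcp_ack = IP_HDR + TCP_HDR         # 40 bytes on the wire, no payload
piggybacked_app_ack = plain_tcp_ack + 1  # 41 bytes: TCP ACK carrying a 1-byte app-level ACK

print(plain_tcp_ack, piggybacked_app_ack)  # 40 41
```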
If you start encountering dropped packets and latencies, isn't the problem usually at some deeper level (e.g., the physical connection)?
Mechanics of packet loss were discussed in previous instalments. Very briefly - most of the time, there are two patterns observed for packet loss: (a) probabilistic loss (which BTW grows year-over-year due to increased overall utilization), and (b) "black hole" kind of loss (usually 1-2 minutes long - due to BGP convergence or modem retrain). For (a) and 5% packet loss over 20 packets/second game traffic - TCP will experience delays of about 7*initial_RTO ~= 1.5 seconds IIRC about once per 5 minutes; neither UDP nor the proposed method exhibits this problem (the latter one - supposedly ;-)). For (b) type of loss, due to the 30-year-old exponential-backoff algorithm in TCP, effective blockage (the one observed at TCP level) can easily last twice as long as the packet-level loss (and reconnect solves it very efficiently - this is an extremely practical observation). With an aggressive "opportunistic reconnect" (which is BTW already used by serious TCP-based games) - this problem is reduced very significantly too.
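The "7*initial_RTO" figure can be sanity-checked - this is my reading of the arithmetic, assuming the RTO doubles on each retransmission, as classic TCP exponential backoff does:

```python
# With the RTO doubling on each retransmission, n consecutive losses of the
# same packet block the connection for initial_RTO*(1 + 2 + ... + 2^(n-1))
# before the n-th retransmit goes out.
def blockage(initial_rto, consecutive_losses):
    return initial_rto * sum(2 ** i for i in range(consecutive_losses))

# With a ~200 ms minimum RTO and three back-to-back losses:
# 0.2 * (1 + 2 + 4) = 1.4 s - roughly the ~1.5 s figure quoted above.
print(blockage(0.2, 3))
```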
How would sending on a different socket help?
See above; in short - it removes head of line blocking for outgoing packets. Different socket also usually helps against exponential backoff - but this is a slightly different story.
So the gist is that TCP mechanisms like exponential backoff are often "overcorrections" to short-lived connectivity loss, and that amortizing those costs across multiple connections reduces latency? Perhaps, but:
Sure, this [TCP] packet can be dropped - but so is UDP packet, so there is no real difference here.
There's a crucial difference: the connection in question can't be used again until the dropped packet is resubmitted, received, and ACKed by the other end. In the meantime, you can send on your other connections, but that one remains unavailable unless you reset it or the packet you no longer care about makes its way over.
I'm skeptical that the improved latency you get from multiple connections wouldn't be offset by these sorts of overheads, but I'd be really interested if you came back with hard data.
I think the point is "you can do that if you can't use UDP for some reason", not "it is better than UDP for this job"
Exactly. It is not better-than-UDP (well, if UDP works at all). It is a kinda-simulation of UDP (an admittedly worse one, but still hopefully workable) - for those clients who can't run UDP (because clients don't support it - or because there is a UDP-blocking firewall in between).
Not sure why you're being downvoted here. It's a fairly novel approach if for whatever reason you can't get a UDP connection to be punched through. Having to use multiple sockets per client is unfortunate, but necessary. Depending on your transmission rate it might not be too horrible. I would be interested to see performance stats on this compared to reliable UDP implementations, though.
In the meantime, you can send on your other connections, but that one remains unavailable unless you reset it or the packet you no longer care about makes its way over.
Yes, but so what? If it is one packet that is lost - we're speaking about delays of about 0.2 seconds (+ usual RTT), which isn't that much.
but I'd be really interested if you came back with hard data.
Stand by for a long wait. "Hard data" is not possible for the real-world Internet (and those companies who have something resembling hard data will never share it). I have some experience with a game with hundreds of thousands of simultaneous players (with all kinds of stats etc. etc.) - but saying that I have any kind of "hard data" would be a big fat exaggeration.
Interesting idea.
One remark regarding the websocket question: browsers have a configured maximum number of parallel HTTP (and thereby TCP) connections, which is about 10, meaning they typically won't do more than 10 parallel HTTP requests to a server. I would guess that they also count websocket connections (which start their life as HTTP requests) toward that limit. But that should be pretty easy to check.
For the server, the requirements are quite a bit higher than with pure UDP (e.g. (nrClients * nrConnectionsPerClient) socket buffers and state instead of a single one). But maybe that doesn't matter for games with a low client count. In order to offload the game/application server, the TCP/UDP gateway could also run in a separate process or on a separate machine. For a pure TCP->UDP gateway, maybe TURN could already be an out-of-the-box solution? I forget whether it can accept frames over TCP and forward them as UDP frames.
they typically won't do more than 10 parallel HTTP requests to a server
But if this is indeed per_server and not global (which IIRC is the case, though I wouldn't bet my life on it) - it should still be enough (at least to try it ;-)). Moreover, even if we're off by a factor of 2 or so, my rather wild guess is that it will be much easier to get browser devs to increase the number of TCP connections than to make them add UDP (which they're really afraid of because of its potential for DDoS attacks).
But maybe that doesn't matter for games with a low client count.
Actually, it doesn't matter even for games with a high client count :-). For modern games, the "industry average" is currently around 100 players/core or 1000 players/2U server, and with 10 TCP connections per player - we're speaking about 10000 TCP connections per that 2S/2U server box; with 8K buffers per TCP connection (which is less than the default(!) - so it needs to be adjusted using setsockopt()) - it is a mere 80M of RAM, which is pretty much nothing by today's standards.
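Checking the memory arithmetic (numbers taken from the comment above):

```python
# Per-box memory for the TCP buffers, using the figures quoted above
players = 1000          # players per 2U server box ("industry average")
conns_per_player = 10   # TCP connections per player
buf_per_conn = 8 * 1024 # 8K per-connection buffer (set via setsockopt())

total_bytes = players * conns_per_player * buf_per_conn
print(total_bytes)  # 81920000 -> the "mere 80M of RAM" above
```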
The first point is a non-issue. WebSockets don't count toward the connection limit, at least not according to my tests in Chrome. I doubt they do in other browsers either, but I haven't tested yet.
There does appear to be a global limit, but if you need to open 255 WebSockets you've got a different problem.
Depending on your game, reliable+ordered UDP does not need to mean head-of-line blocking.
In a generic RUDP protocol that needs to be true, but for a specific game it does not need to be. My game, inspired by the old Q3 protocol, has no head-of-line blocking despite being reliable+ordered.
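For the curious, here is a hedged sketch of how a Q3-style snapshot protocol can be reliable+ordered without head-of-line blocking: each update carries a delta against the last state the receiver acknowledged, so lost updates are never retransmitted (this is my reconstruction of the general idea, not the commenter's actual code):

```python
class SnapshotSender:
    """Sends world-state updates as deltas against the last ACKed snapshot."""

    def __init__(self, initial_state=None):
        self.state = dict(initial_state or {})  # current world state (key -> value)
        self.history = {0: {}}                  # seq -> snapshot sent at that seq
        self.seq = 0
        self.last_acked = 0

    def make_update(self):
        """Build (seq, base_seq, delta) against the last ACKed snapshot.
        Note: deletions of keys are not handled here, to keep the sketch short."""
        base = self.history[self.last_acked]
        delta = {k: v for k, v in self.state.items() if base.get(k) != v}
        self.seq += 1
        self.history[self.seq] = dict(self.state)
        return self.seq, self.last_acked, delta

    def on_ack(self, seq):
        """Receiver confirmed it applied update `seq`; diff against it from now on."""
        if seq > self.last_acked:
            self.last_acked = seq
```

If an update is lost, the sender just keeps diffing against the old acknowledged base, so the next update that does arrive supersedes everything that was dropped - reliability without retransmission or blocking.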
Uh, isn't HoL blocking entirely controlled by the routers in the network, and unrelated to the transport-layer protocol? Unless he's referring to a different HoL blocking.
Being pedantic here, but I think the title doesn't make sense. Both UDP and TCP are protocols used to transfer information over IP, but as far as I know neither uses the other.
Edit: pedantic note
The title is a bit hard to understand. What he means is tunneling UDP over TCP when UDP is not available on your current network.
Ah, that makes more sense. Thanks.
Sounds like a bad idea, bunny man - you don't need reliable delivery for games; old data is not useful.
But you do need delivery. It's not about being reliable; it's about people sitting behind firewalls who can't use UDP.
EDIT: I fucked up
You do need reliable delivery for some things. If the player sends a single frame input or command to the server, I expect it not to be dropped. Regardless, this is not about reliability, it's about tunneling.
The fact that the packet can show up later is a side effect rather than an intended one.
"...I have only kinda-demonstrated its correctness, not tried it."
Well, there ya go, sport.