13

If my C program uses sockets, binds to localhost:9025, exchanges some data, and is then manually killed and restarted, it sometimes fails with the error:

Address already in use.

Every SE-recommended tool that I've tried for finding the “pid that uses port” has failed to return any process ID, so I assume that no process is using port 9025 at that time, which is how it should be.

Nonetheless, from what I've gathered from comments on similar questions, it seemed to me that an "Address" is "already in use" if and only if a process is using that particular address. Why is this false, then?

I assume the OS keeps track of which addresses are in use and which are not, but is that the case? If it is, I would love it if you could tell me how to correct this, because my best solution so far is “wait for an undetermined amount of time”.

EDIT: I use Linux 5.2.2-arch1-1-ARCH x86_64

MarianD
Captain Trojan

1 Answer

32

You are probably restarting your program too quickly, or the program is not closing the socket.

Even after the socket is closed, Linux keeps the connection in limbo for some time (the TCP TIME_WAIT state). During that time it will not accept any other connection for the same quadruplet of "source address, source port, destination address, destination port", and by default a new bind to the same local address and port fails with "Address already in use".

The solution is to set the SO_REUSEADDR socket option in the program with setsockopt, before calling bind, like this:

setsockopt(socket,SOL_SOCKET,SO_REUSEADDR ... )
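For illustration, here is a minimal sketch of where that call could sit in a server's startup code. It is not the asker's program: the helper name `open_listener` and the choice of the IPv4 loopback address are assumptions made for the example. The one point it is meant to show is that the option is set on the socket before `bind` is called.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): opens a TCP listening
 * socket on 127.0.0.1:<port> with SO_REUSEADDR enabled.
 * Returns the socket descriptor, or -1 on failure. */
int open_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return -1;
    }

    /* Must be set before bind(); it allows binding while old connections
     * on this address and port are still in TIME_WAIT. */
    int yes = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) < 0) {
        perror("setsockopt(SO_REUSEADDR)");
        close(fd);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* assumed: IPv4 localhost */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("bind");
        close(fd);
        return -1;
    }

    if (listen(fd, SOMAXCONN) < 0) {
        perror("listen");
        close(fd);
        return -1;
    }

    return fd;
}
```

A caller would use something like `int fd = open_listener(9025);` and check for `-1`. Note that on Linux this does not let two live listeners share the port; if another process is still listening on it, `bind` should still fail.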
harrymc
  • 1
    `SO_REUSEADDR` promises that your program will not get confused by data arriving for a previous instance of the program. For TCP, that is pretty much given. – Simon Richter Aug 26 '19 at 06:38
  • 1
    While `SO_REUSEADDR` is the commonly used approach (and indeed what I'd recommend, too), be aware that it is not 100% safe. There exists a small chance that delayed packets are wrongfully delivered to the new socket (if there happen to be any at all, _and_ if they happen to have the right remote port, etc.). Alas, most people deem this an acceptable risk, seeing how it's rather unlikely to be encountered and the benefits (server up again instantly rather than after 2 minutes) usually outweigh the risk. – Damon Aug 26 '19 at 08:18
  • @Damon note that randomized TCP sequence numbers mean the chance of wrongful delivery of a packet to the new process is less than 1 in 2^32. – Jeremy Friesner Aug 26 '19 at 14:17
  • @Damon Isn't the delay even 256 seconds instead of 2 minutes (because of TTL magic), or am I mixing that up with something? – Hagen von Eitzen Aug 26 '19 at 14:35
  • See https://unix.stackexchange.com/questions/17218/how-long-is-a-tcp-local-socket-address-that-has-been-bound-unavailable-after-clo for a short program to measure this delay. Note that `SO_REUSEADDR` makes it impossible to distinguish new connections from different clients behind the same NAT. – Eric Towers Aug 26 '19 at 15:15
  • @JeremyFriesner: But if a server serves a million clients per second, then in each 72-minute interval you should expect an average of 1 wrongful delivery of a packet due to collision of TCP sequence numbers. – user21820 Aug 27 '19 at 04:51
  • @user21820 if TCP packet-sequence numbers were the only field being checked in the packets, that might be the case, but the sequence number is only one of a number of checks made on each TCP packet -- source IP, destination IP, and source and destination port numbers must also match. So the real-world error rate is undoubtedly much lower than that. – Jeremy Friesner Aug 27 '19 at 05:22
  • @JeremyFriesner: Yes I know that, hence "due to collision of TCP sequence number". I was just objecting to only relying on TCP sequence numbers since 2^32 is not that big. =) – user21820 Aug 27 '19 at 05:29
  • @user21820: Well it is "big enough" for practical purposes, mostly. You need a lot of coincidence to happen in addition. So... possible and probably quite doable if you're maliciously tampering, but not very likely to occur in normal practice. Normally, client sends 2-3 packets, waits for answer, ACKs answer, and at some point server closes after getting ACKs. Maybe never receives FIN-ACK, who cares. When the same client makes another request a second or two later (only then `SO_REUSEADDR` comes to play), who cares. Only "real" issue is a server restart with _many_ active, NATted connections. – Damon Aug 27 '19 at 10:04
  • That's not really a problem either, though. Because when a server restarts mid-transfer, your client pops up an error message anyway. – Damon Aug 27 '19 at 10:06
  • @Damon: Agreed agreed. I just intended to say that we need to look at other factors as well. =) – user21820 Aug 27 '19 at 10:38
  • It's as good at protecting against reuse as the checksum is against errors. – Barmar Aug 30 '19 at 19:29