cross-posted from: https://infosec.pub/post/47574072

Hello lemmings, I made a program to ping every IPv4 address and collect data on respondents. I am almost done, but I want feedback on things I should change or how I can improve my current record of 5,000 pings / second.

I already know I need to properly parse what arrives on the receive socket, since the ICMP Time Exceeded messages that routers send when a TTL expires could be mistaken for echo replies. (ICMP is also how machines on a network inform each other of network problems.)
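
A minimal sketch of that parsing step, in Python for brevity. It assumes the packet was read from a raw ICMP socket on Linux, so it still starts with the IPv4 header; a dgram ICMP socket may hand back the ICMP message without that header, in which case the offset would differ. `classify_icmp` and `expected_id` are hypothetical names, not anything from PingStorm.

```python
import struct

ICMP_ECHO_REPLY = 0
ICMP_TIME_EXCEEDED = 11


def classify_icmp(packet: bytes, expected_id: int):
    """Return ('reply', seq), ('time_exceeded', None) or ('other', None).

    Assumption: `packet` begins with the IPv4 header (raw socket read).
    """
    if len(packet) < 20:
        return ("other", None)
    ihl = (packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or len(packet) < ihl + 8:
        return ("other", None)
    # ICMP header: type, code, checksum, identifier, sequence number
    icmp_type, _code, _csum, ident, seq = struct.unpack_from("!BBHHH", packet, ihl)
    if icmp_type == ICMP_ECHO_REPLY and ident == expected_id:
        return ("reply", seq)
    if icmp_type == ICMP_TIME_EXCEEDED:
        # Sent by a router, not by the pinged host; must not count as a reply.
        return ("time_exceeded", None)
    return ("other", None)
```

Only packets classified as `'reply'` would be recorded as respondents; everything else can be dropped or logged separately.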

One issue I’m aware of is the tuning needed to maximize throughput without freezing the whole system. My code seems to spend so much time opening sockets that it leaves no room for the rest of the OS. My only fix currently is capping the time spent sending at the ping timeout I use.

The overall program works like this: it keeps a pool of task objects on linked lists. All tasks start on the “free list” and move to the “active list” once they are associated with a socket. The program first checks for updated sockets with epoll, which finds a response on roughly 5% of sockets. It then closes any tasks that have timed out, unlinking each from its list in O(1). The slow part is creating new sockets, since it doesn’t really know when to stop, other than when the non-blocking socket reports that it would block. I added a time limit on sending, which is currently capped at the ping timeout. To increase throughput, it seems I need to streamline how I send ICMP packets.
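
The free list / active list bookkeeping described above can be sketched roughly as follows. This is a Python stand-in, not the PingStorm code: `TaskPool` and its methods are hypothetical, an `OrderedDict` plays the role of the linked list, and it assumes every ping uses the same timeout and that `now` never goes backwards, which is what keeps the oldest task at the front and makes each expiry O(1).

```python
from collections import OrderedDict


class TaskPool:
    """Fixed pool of task slots with a free list and an ordered active list."""

    def __init__(self, size, timeout):
        self.free = list(range(size))   # indices of unused task slots
        self.active = OrderedDict()     # slot -> (address, deadline), oldest first
        self.timeout = timeout

    def start(self, address, now):
        """Move a slot from the free list to the active list; None if pool is empty."""
        if not self.free:
            return None
        slot = self.free.pop()
        self.active[slot] = (address, now + self.timeout)
        return slot

    def finish(self, slot):
        """A reply arrived: unlink the task in O(1) and return its address."""
        address, _deadline = self.active.pop(slot)
        self.free.append(slot)
        return address

    def expire(self, now):
        """Pop timed-out tasks from the front; stops at the first live one."""
        expired = []
        while self.active:
            slot, (address, deadline) = next(iter(self.active.items()))
            if deadline > now:
                break
            del self.active[slot]
            self.free.append(slot)
            expired.append(address)
        return expired
```

A send loop built on this would stop as soon as `start` returns `None`, which gives it a natural stopping point besides the socket reporting that it would block.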

https://github.com/bneils/PingStorm

  • CameronDev@programming.dev
    5 days ago

    The streamlined way to do this is raw sockets. You only need one, and you can just craft all of the ICMP requests and send them continuously. Then in a separate thread you can read all the replies.
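
A rough sketch of the single-raw-socket approach from the comment above, assuming Linux and root (or CAP_NET_RAW). The function names, the `ident` value and the use of `print` as the reply handler are placeholders for illustration; `scan` is defined but not called here because it sends real traffic.

```python
import socket
import struct
import threading


def checksum(data: bytes) -> int:
    """Internet checksum (ones' complement sum of 16-bit words)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


def receive_loop(sock, on_packet):
    """Runs in its own thread and hands every incoming packet to a callback."""
    while True:
        packet, (addr, _port) = sock.recvfrom(2048)
        on_packet(addr, packet)


def scan(addresses, ident=0x1234):
    """Send one echo request per address over a single raw socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    threading.Thread(target=receive_loop, args=(sock, print), daemon=True).start()
    for seq, addr in enumerate(addresses):
        sock.sendto(echo_request(ident, seq & 0xFFFF), (addr, 0))
```

Since the sequence number is only 16 bits, it wraps during a full IPv4 sweep, so replies would still need to be matched on source address as well.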

    • sacred_font@infosec.pubOP
      5 days ago

      Is there any reason it has to be a raw socket rather than a dgram ICMP socket (which lets you run as non-root)? Silly me for not realizing I only needed one socket. Ofc my speed would only be at most 1-2 Mbps. I currently track outgoing pings in memory, so I would probably keep a single large circular array for timeout purposes instead, which is not too different from what I’m doing currently.
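
One way the circular array idea could look, as a hedged sketch: `PingRing` is a hypothetical name, the slot index doubles as the ICMP sequence number (so capacity should not exceed 65536), and all pings are assumed to share one timeout with a monotonic clock, so the oldest entry is always at the head.

```python
class PingRing:
    """Circular array of outstanding pings, oldest at `head`."""

    def __init__(self, capacity, timeout):
        self.slots = [None] * capacity  # each slot: (address, deadline) or None
        self.capacity = capacity
        self.timeout = timeout
        self.head = 0  # count of slots already retired
        self.tail = 0  # count of slots ever written; index = counter % capacity

    def push(self, address, now):
        """Record an outgoing ping; returns its slot/sequence, or None if full."""
        if self.tail - self.head == self.capacity:
            return None
        seq = self.tail % self.capacity
        self.slots[seq] = (address, now + self.timeout)
        self.tail += 1
        return seq

    def reply(self, seq, address):
        """Mark a ping answered; the address check rejects stale sequence numbers."""
        if not 0 <= seq < self.capacity:
            return False
        entry = self.slots[seq]
        if entry is None or entry[0] != address:
            return False
        self.slots[seq] = None
        return True

    def expire(self, now):
        """Advance `head` past answered and timed-out slots; return timed-out addresses."""
        timed_out = []
        while self.head < self.tail:
            i = self.head % self.capacity
            entry = self.slots[i]
            if entry is not None:
                if entry[1] > now:
                    break
                timed_out.append(entry[0])
                self.slots[i] = None
            self.head += 1
        return timed_out
```

When `push` returns `None` the ring is full, which doubles as a cap on how many pings are in flight at once.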

        • sacred_font@infosec.pubOP
          5 days ago

          I gave it some further thought and I might be completely misguided in trying to max out throughput. Datagrams are completely connectionless, so the sender can’t know whether the router’s send buffer is full, unless I’m missing something internal between the kernel and the router that makes sendto block (which AFAIK only happens when the socket’s own send buffer is full). Therefore most “extra” datagrams I send would just be dropped anyway. I know ICMP had a dated method of congestion control but have no idea if it would still be in use for simple pings.

          Edit: apparently source quench is a thing but still, no clue if the kernel intercepts this or if it is even sent in 2026 due to deprecation

  • Dave.@aussie.zone
    7 days ago

    Been a while since I’ve done this kind of thing, but you should be able to create a pool of sockets primed with the first batch of IP addresses, then send the packets.

    On reply or timeout, you don’t destroy the socket and create a new one. Instead you change the destination information on the existing socket so it points to the next address to check, and then go again.