1. Network Addressing

Addressing a socket is not a trivial topic. A port number, a transport-layer protocol, and possibly more than one IPv4 address, IPv6 address, or DNS name must be given. DNS names have to be resolved, addresses have to be chosen, service names have to be mapped to port numbers, and so on.

Fortunately, nearly everything needed to represent an Internet address can be encapsulated in an ordinary string, which can be entered by the user. This string is all that is needed to create an EndpointAddrlist object, although other information may be supplied (in particular, the transport-layer protocol -- UDP or TCP -- must be passed outside of the string, for security reasons). EndpointAddrlist attempts to take care of everything regarding addresses, including things that many applications would otherwise ignore (such as multiple addresses per name).

Endpoint accepts an EndpointAddrlist as its remote and local addresses, in which case each address in the list is tried in turn. EndpointAddress encapsulates just one address.

2. EndpointAddrlist Class

EndpointAddrlist encapsulates an array of addrinfo structures obtained through the protocol-independent getaddrinfo. A single name can yield multiple addresses, and all of them are listed within. Your application can choose to ignore all but one, but it's recommended to try the alternative addresses if the first one doesn't work, instead of giving up.

2.1. Constructors

The following arguments are accepted, in order:

string hostname

A hostname. This is passed to getaddrinfo, which either resolves the DNS name or uses inet_pton to convert from presentation form to numeric form. Note that inet_pton only supports dotted-decimal IPv4 or hex-string IPv6 addresses, and not the alternative IPv4 address formats that inet_aton (and therefore ping) supports.

Regular dotted-decimal strings such as "127.0.0.1" (IPv4) and hex strings such as "::1" (IPv6) are allowed. DNS names may resolve to either IPv4 or IPv6 addresses, depending on whether A or AAAA resource records exist. If your host doesn't support both IPv4 and IPv6, either one can be disabled by setting the default address family later on.

The hostname can also contain a service port, URL-style: a hostname followed by a colon and then a port. Netstat represents port 139 of 10.0.0.2 as "10.0.0.2.139", with a period as the separator, but this form is not accepted. "10.0.0.2:139" is supported by this class, and has the advantage of being aesthetically pleasing and easy to parse, with one exception.

IPv6 addresses contain embedded colons. However, they always contain more than one colon. "::1" is interpreted as an IPv6 address with no port, while "127.0.0.1:1" is interpreted as port 1 on 127.0.0.1. So how can ports be specified with IPv6 addresses? Several proposals have been made, but the best by far (and the one which this class understands) is to enclose the IPv6 address in brackets: "[::1]:80" means port 80 of ::1. IPv6-enabled web browsers understand this. The parser is loose about what is accepted: "[::1]80", "::1]:80" and even "::1]80" are interpreted identically, but you shouldn't rely on this. For orthogonality (the [] notation has roots in email systems), an IPv4 address is acceptable in brackets: "[127.0.0.1]:80" works as expected. Brackets also turn on AI_NUMERICHOST, which prevents DNS resolution -- "[example.com]:80" won't work, but "example.com:80" will. In short, brackets enclose literals; they are optional with IPv4 addresses but required with IPv6 addresses that carry a port.
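The bracket and colon rules above can be sketched as a small splitting function. This is only an illustration, not the class's actual parser: the names HostPort and split_host_port are hypothetical, and the loose forms such as "[::1]80" are deliberately not handled.

```cpp
#include <algorithm>
#include <string>

// Hypothetical result type; not part of the Endpoint library.
struct HostPort {
    std::string host;      // hostname or address literal, brackets removed
    std::string port;      // empty if the string carried no port
    bool numeric = false;  // true if the host was bracketed (a literal)
};

// Splits "host", "host:port", "[literal]" or "[literal]:port".
HostPort split_host_port(const std::string& s) {
    HostPort r;
    if (!s.empty() && s.front() == '[') {
        const std::size_t close = s.find(']');
        if (close != std::string::npos) {
            r.host = s.substr(1, close - 1);
            r.numeric = true;  // a caller would set AI_NUMERICHOST here
            if (close + 1 < s.size() && s[close + 1] == ':')
                r.port = s.substr(close + 2);
            return r;
        }
    }
    // Exactly one colon separates host and port; more than one colon
    // means an unbracketed IPv6 literal with no port.
    if (std::count(s.begin(), s.end(), ':') == 1) {
        const std::size_t pos = s.find(':');
        r.host = s.substr(0, pos);
        r.port = s.substr(pos + 1);
    } else {
        r.host = s;
    }
    return r;
}
```

With this sketch, "[::1]:80" yields host "::1" and port "80", while "::1" yields host "::1" and an empty port.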

If a port is given, it replaces the service variable, described below. This is to allow users to override default port numbers in case of an emergency. It should not be viewed as a security concern.

string service = ""

The service name resolves to a port number through the services file. This means you can specify human-readable service names such as "http" for TCP port 80. Numeric strings such as "80" are also allowed; they are passed to getaddrinfo and handled there.

Before the service string reaches getaddrinfo, this class interprets port numbers with commas in them. The port number is 16 bits wide, and its two octets can be specified as two integers in the range 0-255, separated by a comma; for example, "1,2" is 0x0102. This is most useful for raw ICMP sockets, which understand the first octet as the ICMP type and the second as the ICMP code (this is specific to the Endpoint library). Bases other than decimal are not supported.
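The comma notation amounts to packing two octets into one 16-bit value. A minimal sketch follows, assuming decimal-only input as described above; parse_octet and parse_comma_port are hypothetical names, not functions of the Endpoint library, and -1 is used here as an arbitrary error marker.

```cpp
#include <string>

// Parses a decimal integer in the range 0-255; returns -1 on any other input.
int parse_octet(const std::string& s) {
    if (s.empty() || s.size() > 3) return -1;
    int value = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return -1;
        value = value * 10 + (c - '0');
    }
    return value <= 255 ? value : -1;
}

// Combines "high,low" into a 16-bit port value; returns -1 if the string
// has no comma or either octet is invalid.
int parse_comma_port(const std::string& s) {
    const std::size_t comma = s.find(',');
    if (comma == std::string::npos) return -1;
    const int high = parse_octet(s.substr(0, comma));
    const int low = parse_octet(s.substr(comma + 1));
    if (high < 0 || low < 0) return -1;
    return (high << 8) | low;
}
```

For example, parse_comma_port("1,2") gives 0x0102, and an ICMP echo request (type 8, code 0) could be written as "8,0".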

This argument can be omitted if you know for sure that hostname will contain a port number. Specifying it is most useful for providing a default port in case none is given in hostname.

int type = 0

A bitmask of the following:

And:
  • The default is TCP | CLIENT, which seems to be the most common.

To use IPv4 raw sockets on Unix, you need root access. You also need to give the transport-layer protocol; the constants are derived from IPPROTO_, see protocols.h. For example, to send raw ICMP packets, use RAW_ICMP; to send raw UDP, use RAW_UDP. Endpoint will make the IP header for you unless you choose RAW_RAW, in which case you build it yourself. The idea is that RAW_xxx makes the raw IP header as well as the xxx header for you, although this is only implemented for RAW_UDP at present.

int family = EndpointAddrlist::g_default_family

Specifies the address family as well as the protocol family. Allowed values are AF_UNSPEC (IPv4 and/or IPv6), AF_INET (IPv4), and AF_INET6 (IPv6).

EndpointAddrlist::g_default_family can be set on a global (or at least program-wide) scale, to reflect which protocol families the host supports. An IPv4-only host would set this to AF_INET, an IPv6-only host to AF_INET6, and a dual-stack IPv4 and IPv6 host to AF_UNSPEC.
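As a rough illustration of how the constructor arguments might map onto a getaddrinfo call on a POSIX system, consider the sketch below. The resolve function and its parameter names are hypothetical; only the hints fields and the AI_PASSIVE and AI_NUMERICHOST flags are taken from this document. It is not the class's actual implementation.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <cstring>

// Hypothetical helper. Returns the getaddrinfo() result code; on success
// *out points to a list that must be released with freeaddrinfo().
int resolve(const char* host, const char* service,
            int socktype,       // SOCK_STREAM for TCP, SOCK_DGRAM for UDP
            int family,         // AF_UNSPEC, AF_INET or AF_INET6
            bool server,        // true: address is meant for bind()
            bool numeric_host,  // true: host was a bracketed literal
            addrinfo** out) {
    addrinfo hints;
    std::memset(&hints, 0, sizeof hints);
    hints.ai_family = family;
    hints.ai_socktype = socktype;
    if (server) hints.ai_flags |= AI_PASSIVE;
    if (numeric_host) hints.ai_flags |= AI_NUMERICHOST;  // no DNS lookup
    return getaddrinfo(host, service, &hints, out);
}
```

Passing AF_UNSPEC as the family lets the resolver return both IPv4 and IPv6 entries, which is what a dual-stack default would do.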

EndpointAddress GetAddress()

Returns the current address. Unless GetAddressNext() has been called, this is the first address. See GetAddressNext(), which is preferred for most applications. However, GetAddress() has its uses; sometimes there is no reason to try multiple addresses, for example when a literal address is given instead of a DNS name.

EndpointAddress GetAddressNext()

Returns the current address and moves to the next one (post-increment). Use this instead of trying only the first address. The proper procedure is to try each address until one succeeds.
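The "try each address until one succeeds" procedure can be sketched directly on the underlying addrinfo list, without the Endpoint classes. Both first_working and try_connect are hypothetical names used for illustration; try_connect is only a probe that closes the socket again, whereas a real client would keep the connected descriptor.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <unistd.h>

// Walks an addrinfo list and returns the first entry for which
// attempt(entry) returns true, or nullptr if every attempt fails.
template <typename Attempt>
const addrinfo* first_working(const addrinfo* list, Attempt attempt) {
    for (const addrinfo* ai = list; ai != nullptr; ai = ai->ai_next) {
        if (attempt(*ai)) return ai;
    }
    return nullptr;
}

// Example attempt: open a socket, try to connect, then close it again.
bool try_connect(const addrinfo& ai) {
    const int fd = socket(ai.ai_family, ai.ai_socktype, ai.ai_protocol);
    if (fd < 0) return false;
    const bool ok = connect(fd, ai.ai_addr, ai.ai_addrlen) == 0;
    close(fd);
    return ok;
}
```

Calling first_working(list, try_connect) on a getaddrinfo result then falls through to the second and later addresses when the first one refuses the connection.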

3. EndpointAddress

Encapsulates a single address. This is a subclass of the addrinfo structure, whose declaration is reproduced below:

    struct addrinfo {
            int     ai_flags;       /* AI_PASSIVE, AI_CANONNAME, AI_NUMERICHOST */
            int     ai_family;      /* PF_xxx */
            int     ai_socktype;    /* SOCK_xxx */
            int     ai_protocol;    /* 0 or IPPROTO_xxx for IPv4 and IPv6 */
            size_t  ai_addrlen;     /* length of ai_addr */
            char    *ai_canonname;  /* canonical name for hostname */
            struct  sockaddr *ai_addr;      /* binary address */
            struct  addrinfo *ai_next;      /* next structure in linked list */
    };
    

There is also a boolean member, m_bool, which is true if the fields above are valid.

3.1. Constructor

EndpointAddress(addrinfo*) is the only constructor. If the pointer is non-null, each field is copied shallowly and m_bool is set to true. If the pointer is null, m_bool is set to false.

operator string()

Stringifies the address by returning IP() + ":" + Port(). IP() is used rather than the DNS name for three reasons: not all hosts have DNS names, a DNS name may refer to more than one host, and forward and reverse DNS may not match. If IP() contains a colon, it is put in square brackets, as is customary with IPv6.

string IP()

Returns a human-readable form of the address. operator string() calls this. The IP address is returned as either a dotted-decimal (IPv4) or hex (IPv6) string.

This function internally calls inet_ntop (numeric to presentation) with the proper flags and structures, depending on whether the address is IPv4 or IPv6. Doing this manually is tedious. WinSock does not provide inet_ntop, so an implementation copied from BIND is provided.
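The following sketch shows one way the stringification described above could be written on a POSIX system with inet_ntop. The free functions ip_string, port_string and address_string are hypothetical stand-ins for IP(), Port() and operator string(); they are not the class's actual code.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <string>

// Numeric-to-presentation conversion; returns "" for unknown families.
std::string ip_string(const sockaddr* sa) {
    char buf[INET6_ADDRSTRLEN] = "";
    if (sa->sa_family == AF_INET) {
        const sockaddr_in* in4 = reinterpret_cast<const sockaddr_in*>(sa);
        inet_ntop(AF_INET, &in4->sin_addr, buf, sizeof buf);
    } else if (sa->sa_family == AF_INET6) {
        const sockaddr_in6* in6 = reinterpret_cast<const sockaddr_in6*>(sa);
        inet_ntop(AF_INET6, &in6->sin6_addr, buf, sizeof buf);
    }
    return buf;
}

// Port in host byte order, as a decimal string.
std::string port_string(const sockaddr* sa) {
    unsigned port = 0;
    if (sa->sa_family == AF_INET)
        port = ntohs(reinterpret_cast<const sockaddr_in*>(sa)->sin_port);
    else if (sa->sa_family == AF_INET6)
        port = ntohs(reinterpret_cast<const sockaddr_in6*>(sa)->sin6_port);
    return std::to_string(port);
}

// "ip:port", with the IP in square brackets when it contains a colon.
std::string address_string(const sockaddr* sa) {
    const std::string ip = ip_string(sa);
    if (ip.find(':') != std::string::npos)
        return "[" + ip + "]:" + port_string(sa);
    return ip + ":" + port_string(sa);
}
```

Port 80 on 127.0.0.1 comes out as "127.0.0.1:80", and port 443 on ::1 as "[::1]:443".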

string DNS()

Returns the canonical DNS name in ai_canonname, or an empty string if there is none.

string Name()

Returns DNS() if it is not empty, otherwise IP().

string Port()

Returns the port number.

operator bool()

Returns m_bool, which is false if there is no address.

4. File Locations

The services and hosts files are used during service and name resolution (name resolution additionally uses DNS). By default, both files can be found in the following locations:

5. Errors From getaddrinfo()

A non-zero return value from getaddrinfo is one of the following, listed in order of likelihood from greatest to least. The POSIX.1g and Win32 constants, as well as the BSD and Win32 gai_strerror results, are included.

5.1. Resolution Errors

If an error occurs, m_bool is set to false, m_error_code is set to EP_ERROR_GETADDRINFO, and m_error_str is set to a description of the problem.

Unless literals are given, the hostname and service name need to be resolved. Errors can happen in this process.

EAI_AGAIN = WSATRY_AGAIN

There was a temporary DNS failure; try again later.

EAI_FAIL = WSANO_RECOVERY

A non-recoverable failure in name resolution occurred.

EAI_NODATA = WSANO_DATA / EAI_NONAME

This code was removed from the RFC; whether it is equivalent to EAI_NONAME is unknown. The latest version of Windows defines EAI_NODATA as EAI_NONAME.

EAI_NONAME = WSAHOST_NOT_FOUND

The hostname or service name was not provided, or no address is associated with the hostname.

EAI_SERVICE = WSATYPE_NOT_FOUND

The symbolic service name could not be resolved because its entry in services is missing. The reference to "ai_socktype" in the error message means that if you try to connect to the "domain" service with TCP, and the services file only has an entry for UDP, you'll also get this error (although most systems now define both TCP and UDP for all services). This error never happens with numeric port numbers, so if you're concerned about portability, use them. Win32's services file is quite small and doesn't cover the more esoteric service names, although it has its uses with custom servers: if you use a symbolic service name in your server and the user wants to change the port, all they have to do is change the definition in services. However, your install program must add a line to the file.

5.2. Usage Errors

These errors are mostly the fault of the programmer: either you (the class user) or me (the class author). Errors that "should not happen" are ruled out by the class design, even if invalid parameters are passed to the class.

EAI_FAMILY = WSAEAFNOSUPPORT

hints.ai_family is invalid. The address family is specified as the last constructor argument, and can be AF_UNSPEC (IPv4 and/or IPv6), AF_INET (IPv4), or AF_INET6 (IPv6). If the argument is not given, EndpointAddrlist::g_default_family is used. This error means that either the default family (which defaults to AF_UNSPEC) or the passed family is invalid.

EAI_BADFLAGS = WSAEINVAL

hints.ai_flags is invalid. Valid values are AI_PASSIVE, AI_CANONNAME, and AI_NUMERICHOST. EndpointAddrlist only sets AI_PASSIVE (for servers) and AI_NUMERICHOST (for bracketed literals), so this should never happen.

EAI_MEMORY = WSA_NOT_ENOUGH_MEMORY

Memory could not be allocated.

EAI_ADDRFAMILY

Presumably this means an address family other than AF_UNSPEC was given, and no address of that family is associated with the hostname. Does not exist on Win32.

EAI_SOCKTYPE = WSAESOCKNOSUPPORT

hints.ai_socktype is invalid. EndpointAddrlist sets this field to either SOCK_DGRAM for UDP or SOCK_STREAM for TCP, so this should never happen.

EAI_SYSTEM

This error code is rarely used, but it means the errno variable contains additional error information. Does not exist on Win32 (and most Unixes).
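A description string like the one stored in m_error_str could be produced from the getaddrinfo return code roughly as follows on a POSIX system. The function name describe_gai_error is hypothetical and this is a sketch, not the library's code. Because EAI_SYSTEM is not defined everywhere, it is guarded by the preprocessor, and errno must still hold the value left by the failed call when the function runs.

```cpp
#include <netdb.h>
#include <cerrno>
#include <cstring>
#include <string>

// Turns a getaddrinfo() return code into a human-readable description.
// Returns an empty string for 0 (success).
std::string describe_gai_error(int rc) {
    if (rc == 0) return "";
#ifdef EAI_SYSTEM
    // errno carries the real cause; call this right after getaddrinfo().
    if (rc == EAI_SYSTEM)
        return std::string("system error: ") + std::strerror(errno);
#endif
    return gai_strerror(rc);
}
```

A caller that receives EAI_AGAIN might retry after a delay, while the other codes in this section are better reported to the user as they are.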