NAME
       tun - TUN and TAP devices

SYNOPSIS
       #include            // must be included first -- bug?
       #include            // ???
       #include
       #include

       open("/dev/net/tun", O_RDWR);

DESCRIPTION
       ... mumble mumble

       The analogy is to a _single_ Ethernet device and a NIC: userspace
       becomes the NIC. In most use cases userspace will be forwarding to some
       other NIC, often another userspace app using a TAP device, but as far
       as the interface is concerned, a single TUN/TAP device is a single
       virtual Ethernet card.

       A TUN (tunnel) device handles layer-3 packets, for instance IP.
       A TAP (network tap) device handles layer-2 Ethernet frames.

       struct ifreq contains the following fields (actually see netdevice(7)):

           struct ifreq {
               char  ifr_name[IFNAMSIZ];
               short ifr_flags;
           };

       ifr_name can contain a single "%d" to find a free name. No other escape
       sequences are permitted.

       flags:

       TUN_TUN_DEV
       TUN_TAP_DEV
       TUN_TYPE_MASK
              (don't document?)

       TUN_FASYNC
              afaict, not user-facing. Also arguably in the wrong namespace
              (this is a tun_file flag, not a tun flag).

       TUN_NOCHECKSUM
              Obsolete; disabled in 882553752196605bf27057e7adb298ecae8058c4.
              You should probably be using the GSO stuff instead?

       TUN_NO_PI
              Documented in Documentation/; see struct tun_pi below.

       TUN_ONE_QUEUE
              Obsolete. Before 5d097109 it determined whether all packets
              queue at the device (enabled), or a fixed number queue at the
              device and the rest at the qdisc (queueing discipline). Now it
              is always enabled.

       TUN_PERSIST
              Allows the device to continue to exist even when the last fd is
              closed. Useful for the user/group stuff.
              XXX how do you get rid of it? I guess you can open the device,
              ioctl TUNSETPERSIST off, and close it. I think `ip link delete`
              also works; maybe it needs to be ifdown.

       TUN_VNET_HDR
              Enables control of Generic Segmentation Offload (GSO). Requires
              you to pass a struct virtio_net_hdr with each packet. If
              TUN_NO_PI is unset, this comes after the struct tun_pi. Flags
              include VIRTIO_NET_HDR_F_NEEDS_CSUM. gso_type can include
              VIRTIO_NET_HDR_GSO_ECN and VIRTIO_NET_HDR_GSO_{TCPV[46],UDP}.
              You should probably call TUNSETVNETHDRSZ.

       TUN_TAP_MQ
              Permits you to open a tun device multiple times (specify the
              same name) and get multiple fds; this is useful for having
              multiple threads send or receive packets. Every open must (of
              course) use the same TUN_TUN_DEV or TUN_TAP_DEV type. Other
              settings can be changed provided the device is uninitialized
              (e.g., someone opened the device, set an owner, and then closed
              the fd). New in Linux 3.8.

       ioctls:

       int

       TUNSETNOCSUM
              Ignored, see TUN_NOCHECKSUM. Maybe don't document.

       TUNSETDEBUG
              Enables or disables debugging. Only valid if the kernel was
              compiled with the local #define TUN_DEBUG. Don't document.

       TUNSETIFF
              Takes a struct ifreq *. Modifies that structure, which basically
              means the name is filled in. Invalid flags are silently ignored;
              see TUNGETFEATURES if you want to see what the running kernel
              supports.

       TUNSETPERSIST
              Sets or clears TUN_PERSIST.

       TUNSETOWNER, TUNSETGROUP
              Set a uid or gid to own the device.

       TUNSETLINK
              Sets the link type of the device. The interface must be down,
              else EBUSY. Should be an ARPHRD_* constant. Defaults to
              ARPHRD_ETHER for TAP and ARPHRD_NONE for TUN. Useful for an
              emulated wireless interface, or something.

       uint

       TUNGETFEATURES
              Returns the valid flags for TUNSETIFF. Available since
              07240fd0902c872f044f523893364a1a24c9f278.

       TUNSETOFFLOAD
              Can be used to set TUN_F_CSUM, TUN_F_TSO[46], TUN_F_TSO_ECN, and
              TUN_F_UFO, each of which declares that userspace, as the network
              "card", handles the respective offload feature. Features that
              are set are automatically enabled, as if with ethtool(8);
              features that are not set cannot be enabled. Unlike ethtool,
              this only requires an open fd to the tun device, not
              CAP_NET_ADMIN. Returns EINVAL if a requested feature doesn't
              exist in the running kernel, so it can be used for feature
              detection. See also 882553752196605bf27057e7adb298ecae8058c4.

       TUNSETTXFILTER
              Only valid on TAP devices. Takes a struct tun_filter *, which
              specifies which Ethernet addresses to accept packets for.
              This can be used to drop packets not intended for us at the
              driver level.

       TUNGETIFF
              Returns a struct ifreq containing the device's current name and
              flags.

       int

       TUNGETSNDBUF
              Retrieves the current send buffer size. The default is currently
              very large (INT_MAX).

       TUNSETSNDBUF
              Sets the current send buffer size. From QEMU comments:

                  /* sndbuf implements a kind of flow control for tap.
                   * Unfortunately when it's enabled, and packets are sent
                   * to other guests on the same host, the receiver
                   * can lock up the transmitter indefinitely.
                   *
                   * To avoid packet loss, sndbuf should be set to a value lower than the tx
                   * queue capacity of any destination network interface.
                   * Ethernet NICs generally have txqueuelen=1000, so 1Mb is
                   * a good value, given a 1500 byte MTU.
                   */

       struct sock_fprog

       TUNATTACHFILTER
       TUNDETACHFILTER
              Berkeley Packet Filter / Linux Socket Filter support; cf.
              Documentation/networking/filter.txt. See
              99405162598176e830d17ae6d4f3d9e070ad900c. Only valid on TAP
              (layer 2) devices. Only one filter at a time can be attached.
              See socket(7).

       int

       TUNGETVNETHDRSZ, TUNSETVNETHDRSZ
              Get and set the size of the header used when TUN_VNET_HDR is
              set. The structure increased by two bytes (see struct
              virtio_net_hdr_mrg_rxbuf); this provides a way for apps to say
              how many bytes they're expecting prepended.

       TUNSETQUEUE
              Takes a struct ifreq *. The only thing used is whether
              IFF_ATTACH_QUEUE or IFF_DETACH_QUEUE is set in ifr->ifr_flags.
              Enables or disables this queue. You cannot attach an
              already-attached queue, or detach an already-detached queue.
              TUN_TAP_MQ must be set on the device. I really have no idea why
              this wasn't done like all the other enable/disables, given that
              nothing else in struct ifreq is used.

       The device also supports SIOCGIFHWADDR and SIOCSIFHWADDR, as documented
       in netdevice(7). If called on a tun device, then just ifr_hwaddr is
       used and the call is assumed to refer to the current tun device (unlike
       the same ioctl on a socket, which works off ifr_name).
       This change can be done even when the device is running.

       Flags for TUNSETIFF (this section probably wants to come first):

       Exactly one of the following must be set:

       IFF_TUN
       IFF_TAP

       You can also set:

       IFF_NO_PI
              See struct tun_pi below.

       IFF_ONE_QUEUE
              Deprecated.

       IFF_VNET_HDR
              Prepends a struct virtio_net_hdr to indicate "GSO and checksum
              information". Available since
              f43798c27684ab925adde7d8acc34c78c6e50df8; see above.

       IFF_TUN_EXCL
              Ensures that we are creating a new device, instead of opening an
              existing persistent device.

       IFF_MULTI_QUEUE
              See TUN_TAP_MQ above.

       The following are valid for TUNSETQUEUE only, in which case they are
       the only bits paid attention to in the struct. In other uses of this
       struct they are invalid:

       IFF_ATTACH_QUEUE
       IFF_DETACH_QUEUE

       Flags for TUNSETOFFLOAD; these match the corresponding NETIF_F_*
       options:

       TUN_F_CSUM       "You can hand me unchecksummed packets."
       TUN_F_TSO4       "I can handle TSO for IPv4 packets"
       TUN_F_TSO6       "I can handle TSO for IPv6 packets"
       TUN_F_TSO_ECN    "I can handle TSO with ECN bits"
       TUN_F_UFO        "I can handle UFO packets"

       TUN_PKT_STRIP

           struct tun_pi {
               __u16  flags;
               __be16 proto;
           };

       flags  Currently no flags for send (user to kernel) are defined. For
              receive (kernel to user), TUN_PKT_STRIP is set if the supplied
              buffer is not long enough for the packet and bytes were dropped.

       proto  Protocol number, equivalent to the Ethernet frame protocol
              number; see . Network byte order.

       If you're using a TUN device, and TUN_NO_PI is set, you can _only_ send
       IPv4 or IPv6 packets. If the "version" field of IP (the high four bits
       of the first byte of your data) is not 4 or 6, the packet is dropped
       and the send returns EINVAL (XXX how far up does this bubble).
       // line 1141

       If you're using a TAP device, on send, proto is always ignored and set
       to the type corresponding to TUNSETLINK, defaulting to Ethernet. (The
       net result is that the tun_pi struct is completely ignored for TAP
       devices.)

       The below interacts with TUNSETTXFILTER somehow....
       although that takes an int

           " * This stuff is applicable only to the TAP (Ethernet) devices.
             * If the count is zero the filter is disabled and the driver accepts
             * all packets (promisc mode).
             * If the filter is enabled in order to accept broadcast packets
             * broadcast addr must be explicitly included in the addr list."

       TUN_FLT_ALLMULTI

           struct tun_filter {
               __u16 flags;               /* can only be TUN_FLT_ALLMULTI */
               __u16 count;               /* Number of addresses */
               __u8  addr[0][ETH_ALEN];
           };

       Multiqueue:
       https://github.com/aliguori/qemu-patches/blob/9462c212685364c6091447321a8d9a7f8ad02802/Multiqueue-virtio-net/v2.1359160523/0000-Multiqueue-virtio-net.txt

NOTES
       This is unrelated to the ip tunnel support (see ip-tunnel(7)), which is
       kernelspace tunneling of IP over IP. Certainly that can be implemented
       in userspace with this driver, though.

       There exists a macvtap driver with a similar character-device interface
       to macvlan devices (virtual interfaces on an existing interface
       distinguished by MAC address).

SEE ALSO
       netdevice(7), socket(7)
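
EXAMPLE
       The following is a minimal sketch, not a reference implementation, of
       the TUNSETIFF and struct tun_pi handling described in this page. The
       helper names tun_prepare_ifreq(), tun_open() and tun_parse_pi() are
       hypothetical and exist only for illustration. The header choice
       (linux/if.h before linux/if_tun.h) is an assumption and may need
       adjusting on a given system.

```c
#include <linux/if.h>      /* struct ifreq, IFNAMSIZ (assumed to come first) */
#include <linux/if_tun.h>  /* TUNSETIFF, IFF_*, struct tun_pi, TUN_PKT_STRIP */
#include <sys/ioctl.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: fill in the request passed to TUNSETIFF.
 * `name` may contain a single "%d"; `flags` is IFF_TUN or IFF_TAP,
 * optionally ORed with flags such as IFF_NO_PI. */
void tun_prepare_ifreq(struct ifreq *ifr, const char *name, short flags)
{
    assert(ifr != NULL);
    memset(ifr, 0, sizeof(*ifr));
    if (name != NULL)
        strncpy(ifr->ifr_name, name, IFNAMSIZ - 1); /* memset left the NUL */
    ifr->ifr_flags = flags;
}

/* Hypothetical helper: open the clone device and attach to an interface.
 * Returns the fd, or -1 with errno set. If `name_out` is non-NULL it must
 * hold IFNAMSIZ bytes and receives the name the kernel filled in. */
int tun_open(const char *name, short flags, char *name_out)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0)
        return -1;
    tun_prepare_ifreq(&ifr, name, flags);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        int saved = errno;          /* keep the ioctl error across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    if (name_out != NULL)
        memcpy(name_out, ifr.ifr_name, IFNAMSIZ);
    return fd;
}

/* Hypothetical helper: decode the struct tun_pi that precedes each packet
 * read from the fd when IFF_NO_PI is not set. Returns the offset of the
 * payload within `buf`, or -1 if `len` is too short to hold the header. */
int tun_parse_pi(const unsigned char *buf, size_t len,
                 unsigned *proto, int *stripped)
{
    struct tun_pi pi;
    const unsigned char *be;

    if (len < sizeof(pi))
        return -1;
    memcpy(&pi, buf, sizeof(pi));
    be = (const unsigned char *)&pi.proto;
    *proto = ((unsigned)be[0] << 8) | be[1];     /* proto is big-endian */
    *stripped = (pi.flags & TUN_PKT_STRIP) != 0; /* packet was truncated */
    return (int)sizeof(pi);
}
```

       A caller might use tun_open("tap%d", IFF_TAP, name) to create a TAP
       device, read() a frame from the returned fd, and pass the buffer to
       tun_parse_pi() to skip the header. Opening /dev/net/tun normally
       requires appropriate privileges, so tun_open() may fail with -1 on an
       unprivileged system.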