Introduction to Generic Netlink, or How to Talk with the Linux Kernel

Published 2023-02-10 on Yaroslav's weblog


Tux has got some mail!

Are you writing some code in kernel-space and want to communicate with it from the comfort of user-space? You are not writing kernel code, but want to talk with the Linux kernel? Serve yourself some good java (the good hot kind, not the Oracle one), make yourself comfortable and read ahead!

I recently got myself into programming in the Linux Kernel, more specifically kernel space modules, and one of the APIs that I had to study was Generic Netlink. As is usual with most Linux Kernel APIs the documentation, outside of sometimes fairly well commented code, is a bit lacking and/or old.

Hence why I decided to make a little example for myself on how to use Generic Netlink for kernel-user space communications, and an introductory guide using the aforementioned example for my colleagues and any other person interested in using Generic Netlink but not knowing where to start.

This guide covers the following:

  • Registering Generic Netlink families in kernel.
  • Registering Generic Netlink operations and handling them in kernel.
  • Registering Generic Netlink multicast groups in kernel.
  • Sending "events" through Generic Netlink from kernel.
  • Connecting to Generic Netlink from a user program.
  • Resolving Generic Netlink families and multicast groups from a user program.
  • Sending a message to a Generic Netlink family from a user program.
  • Subscribing to a Generic Netlink multicast group from a user program.
  • Listening for Generic Netlink messages from a user program.

Netlink is a socket domain created with the task of providing IPC for the Linux Kernel, especially kernel<->user IPC. Netlink was created initially with the intention of replacing the aging ioctl() interface, by providing a more flexible way of communicating between kernel and user programs.

Netlink communication happens over standard sockets using the AF_NETLINK domain. Nonetheless, on the user land side of things, libraries exist that provide a more convenient way of using the Netlink interface, such as libnl [1].

That said, new Netlink families aren't being created anymore and new code doesn't use Netlink directly. The classic use of Netlink is relegated to already existing families such as NETLINK_ROUTE, NETLINK_SELINUX, etc. The main problem with Netlink is that it uses static allocation of IDs which are limited to 32 unique families, which greatly limits its users and may cause conflicts with out-of-tree modules.

Generic Netlink was created to fix the deficiencies of Netlink as well as bringing some quality of life improvements. It is not a separate socket domain though, it's more of an extension of Netlink. In fact, it is a Netlink family — NETLINK_GENERIC.

Generic Netlink has been around since 2005 so it is a well established interface for kernel<->userspace IPC. Some notable users of Generic Netlink include subsystems such as 802.11, ACPI, Wireguard, among others.

The main features that Generic Netlink brings to the table are dynamic family registration, introspection and a simplified kernel API. This tutorial is focused specifically on Generic Netlink, since it's the standard way of communicating with the kernel in ways that are more sophisticated than a simple sysfs file.

[Figure: Generic Netlink bus diagram]

Some theory

As I've already mentioned, Netlink works over the usual BSD sockets. A Netlink message always starts with a Netlink header, followed by a protocol header, which in the case of Generic Netlink is the Generic Netlink header.

The headers look like this:

[Figure: Netlink header]

[Figure: Generic Netlink header]

Or as described by the following C structures:

struct nlmsghdr {
	__u32		nlmsg_len;
	__u16		nlmsg_type;
	__u16		nlmsg_flags;
	__u32		nlmsg_seq;
	__u32		nlmsg_pid;
};

struct genlmsghdr {
	__u8	cmd;
	__u8	version;
	__u16	reserved;
};

Netlink header fields meaning:

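  • Length: the length of the whole message, including headers.
  • Type: the message type; in Generic Netlink this carries the ID of the Generic Netlink family the message belongs to.
  • Flags: request flags that, among other things, mark the message as a do or a dump; more on that later.
  • Sequence: sequence number; also more on that later.
  • Port ID: identifies the sending socket; set to 0 when the message is sent from the kernel.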

and for Generic Netlink:

  • Command: operation identifier as defined by the Generic Netlink family.
  • Version: the version of the Generic Netlink family protocol.
  • Reserved: as its name implies ¯\(ツ)/¯.

Most of the fields are pretty straightforward, and the header is not usually filled in manually by the Netlink user. Some of the information contained in the headers, such as the flags and sequence numbers, is provided by the user through the API when calling the different functions.
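To make the layout a bit more tangible, here is a minimal user space sketch that reads both headers out of a raw buffer. It only assumes the UAPI headers linux/netlink.h and linux/genetlink.h; the struct genl_hdr_info and the function parse_genl_headers() are names I made up for illustration, not part of any library.

```c
#include <linux/genetlink.h>
#include <linux/netlink.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of both headers, for illustration only */
struct genl_hdr_info {
	__u32 len;     /* nlmsg_len: whole message, headers included */
	__u16 family;  /* nlmsg_type: the Generic Netlink family ID */
	__u16 flags;   /* nlmsg_flags */
	__u32 seq;     /* nlmsg_seq */
	__u32 portid;  /* nlmsg_pid: 0 when the kernel is the sender */
	__u8  cmd;     /* genlmsghdr.cmd */
	__u8  version; /* genlmsghdr.version */
};

/* Returns 0 on success, -1 if the buffer cannot hold both headers */
static int parse_genl_headers(const void *buf, size_t buflen,
			      struct genl_hdr_info *out)
{
	const size_t	  hdrs = NLMSG_HDRLEN + GENL_HDRLEN;
	struct nlmsghdr	  nlh;
	struct genlmsghdr gh;

	if (buflen < hdrs) {
		return -1;
	}

	/* memcpy avoids any assumption about the alignment of buf */
	memcpy(&nlh, buf, sizeof(nlh));
	if (nlh.nlmsg_len < hdrs || nlh.nlmsg_len > buflen) {
		return -1;
	}
	/* The Generic Netlink header starts right after the aligned Netlink one */
	memcpy(&gh, (const unsigned char *)buf + NLMSG_HDRLEN, sizeof(gh));

	out->len     = nlh.nlmsg_len;
	out->family  = nlh.nlmsg_type;
	out->flags   = nlh.nlmsg_flags;
	out->seq     = nlh.nlmsg_seq;
	out->portid  = nlh.nlmsg_pid;
	out->cmd     = gh.cmd;
	out->version = gh.version;

	return 0;
}
```

With both headers present and no payload, the length field comes out as 20 bytes: 16 for the Netlink header plus 4 for the Generic Netlink one.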

There are three types of message operations that are usually performed over a Netlink socket:

  • A do operation
  • A dump operation
  • And multicast messages, or asynchronous notifications.

There are many different ways of sending messages over Netlink, but these are the most used in Generic Netlink.

A do operation is a single action kind of operation in which the user program sends the message and receives a reply that could be an acknowledgment or error message, or maybe even a message with some information.

A dump operation is one for (duh) dumping information, usually more than fits in one message. The user program also sends a single message, but receives multiple reply messages until it receives an NLMSG_DONE message that signals the end of the dump.
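On the wire, an acknowledgment is an NLMSG_ERROR message whose error field is 0, a failure is the same message type with a negative errno, and the end of a dump is NLMSG_DONE. Here is a small sketch of how a raw-socket receiver could tell them apart; the enum and the function name are hypothetical, only the constants and structures come from linux/netlink.h.

```c
#include <linux/netlink.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a received message */
enum reply_kind {
	REPLY_TRUNCATED, /* buffer too short to tell */
	REPLY_ACK,	 /* NLMSG_ERROR carrying error == 0 */
	REPLY_ERROR,	 /* NLMSG_ERROR carrying a negative errno */
	REPLY_DONE,	 /* NLMSG_DONE: end of a dump */
	REPLY_DATA,	 /* anything else: a message from a family */
};

/* Classify one message; on ack/error store the error code in *err */
static enum reply_kind classify_reply(const void *buf, size_t len, int *err)
{
	struct nlmsghdr nlh;
	struct nlmsgerr e;

	if (len < (size_t)NLMSG_HDRLEN) {
		return REPLY_TRUNCATED;
	}
	memcpy(&nlh, buf, sizeof(nlh));

	switch (nlh.nlmsg_type) {
	case NLMSG_DONE:
		return REPLY_DONE;
	case NLMSG_ERROR:
		if (len < (size_t)NLMSG_HDRLEN + sizeof(e)) {
			return REPLY_TRUNCATED;
		}
		/* The payload is a struct nlmsgerr: error code + original header */
		memcpy(&e, (const unsigned char *)buf + NLMSG_HDRLEN, sizeof(e));
		if (err) {
			*err = e.error;
		}
		return e.error == 0 ? REPLY_ACK : REPLY_ERROR;
	default:
		return REPLY_DATA;
	}
}
```

A library such as libnl does this kind of dispatching for you through its callbacks, so this is only meant to show what happens underneath.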

Whether an operation is a do or a dump is set using the flags field:

  • NLM_F_REQUEST | NLM_F_ACK for do.
  • NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP for dump.
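Expressed in code, the two flag combinations look like the sketch below. The helper names are mine; the NLM_F_* constants come from linux/netlink.h. One detail worth knowing is that NLM_F_DUMP is a combination of two bits (NLM_F_ROOT | NLM_F_MATCH), so a check for it should compare against the whole mask.

```c
#include <linux/netlink.h>
#include <stdbool.h>

/* Flags for a "do" request: a request that asks for an acknowledgment */
static inline unsigned short do_request_flags(void)
{
	return NLM_F_REQUEST | NLM_F_ACK;
}

/* Flags for a "dump" request: same as above plus NLM_F_DUMP */
static inline unsigned short dump_request_flags(void)
{
	return NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP;
}

/* NLM_F_DUMP is two bits, so both of them must be set */
static inline bool is_dump_request(unsigned short flags)
{
	return (flags & NLM_F_DUMP) == NLM_F_DUMP;
}
```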

The third type, multicast messages, is used for sending notifications to the users that have subscribed to them via a Generic Netlink multicast group.

As we saw, there's also a sequence number field in the Netlink header. However, unlike in other protocols, Netlink doesn't manipulate or enforce the sequence number itself. It's provided as a way to help keep track of messages and replies. In practice, the user program increases the sequence number with each message sent, and the kernel module sends the reply (or replies) with the same sequence number as the command message. Multicast messages are usually sent with a sequence number of 0.
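Since the bookkeeping is left to us, a user program could track sequence numbers with something as simple as the following sketch. Everything here (struct seq_tracker, seq_send(), seq_is_reply()) is hypothetical and assumes only one request in flight at a time; libnl keeps similar state inside its socket for you.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker for a socket with one request in flight */
struct seq_tracker {
	uint32_t next;	  /* sequence number for the next request */
	uint32_t pending; /* sequence number of the last request sent */
};

/* Pick the sequence number for a new request and remember it */
static uint32_t seq_send(struct seq_tracker *t)
{
	/* Skip 0 so it stays reserved for multicast notifications */
	if (t->next == 0) {
		t->next = 1;
	}
	t->pending = t->next++;
	return t->pending;
}

/* A reply belongs to our request if it echoes the same sequence number */
static bool seq_is_reply(const struct seq_tracker *t, uint32_t seq)
{
	return seq != 0 && seq == t->pending;
}
```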

Message payload

Netlink provides a system of attributes to encode data with information such as type and length. The use of attributes allows for validation of data and for a supposedly easy way to extend protocols without breaking backward compatibility.

You can also encode your own types, such as a struct in a single attribute. However, the use of Netlink attributes for each field is encouraged.

The attributes are encoded in LTV (length, type, value) format and are padded so that each attribute begins at an offset that is a multiple of 4 bytes. The length fields, both in the message header and in the attribute header, always include the header, but not the padding.
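Here is a rough sketch of what encoding a single attribute by hand would look like, mostly to show where the padding goes. The function attr_put() is a made-up name; struct nlattr, NLA_HDRLEN and NLA_ALIGN come from linux/netlink.h. In real code you would use nla_put() and friends instead.

```c
#include <linux/netlink.h>
#include <stddef.h>
#include <string.h>

/*
 * Append one attribute at offset off. Returns the offset right after the
 * padded attribute, or 0 if it does not fit.
 */
static size_t attr_put(void *buf, size_t cap, size_t off, __u16 type,
		       const void *data, size_t len)
{
	unsigned char *p = buf;
	struct nlattr  nla;
	size_t	       padded;

	/* nla_len is 16 bits wide and has to include the header */
	if (len > 0xffff - (size_t)NLA_HDRLEN) {
		return 0;
	}
	nla.nla_len  = (__u16)(NLA_HDRLEN + len); /* header + value, no padding */
	nla.nla_type = type;
	padded	     = NLA_ALIGN(nla.nla_len);	  /* rounded up to 4 bytes */

	if (off > cap || padded > cap - off) {
		return 0;
	}

	memcpy(p + off, &nla, sizeof(nla));
	memcpy(p + off + NLA_HDRLEN, data, len);
	/* Zero the padding so we don't leak stale bytes */
	memset(p + off + nla.nla_len, 0, padded - nla.nla_len);

	return off + padded;
}
```

For example, the 3-byte value "hi" with its terminating NUL gives a length field of 7 (4 + 3), while the attribute occupies 8 bytes in the buffer.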

[Figure: Netlink attribute diagram]

The attributes in a Netlink, and hence Generic Netlink, message are not necessarily added always in the same order, which is why they should be walked and parsed.
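Walking the attributes by hand could look something like this sketch: step from one attribute header to the next using the aligned length until the type we want shows up. attr_find() is a hypothetical helper; the macros and struct nlattr are from linux/netlink.h. The library parsing functions used later in this article do essentially this, plus validation.

```c
#include <linux/netlink.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the offset of the first attribute of the given type inside a
 * buffer of attributes, or -1 if it is absent or the data is malformed.
 */
static long attr_find(const void *buf, size_t len, __u16 type)
{
	const unsigned char *p	 = buf;
	size_t		     off = 0;

	while (off <= len && len - off >= (size_t)NLA_HDRLEN) {
		struct nlattr nla;

		memcpy(&nla, p + off, sizeof(nla));
		/* The length includes the header and must fit in what is left */
		if (nla.nla_len < NLA_HDRLEN || nla.nla_len > len - off) {
			break;
		}
		/* The top bits of nla_type are flags, mask them off */
		if ((nla.nla_type & NLA_TYPE_MASK) == type) {
			return (long)off;
		}
		/* The next attribute starts at the next 4-byte boundary */
		off += NLA_ALIGN(nla.nla_len);
	}

	return -1;
}
```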

Netlink provides a way to validate that a message is correctly formatted using so-called "attribute validation policies", represented by struct nla_policy. I do find it a bit strange that the structure is not exposed to user space, so by default validation seems to be performed only on the kernel side. It can also be done in user space, and libnl provides its own struct nla_policy, but it differs from the kernel one, which means you basically have to duplicate the work to validate the attributes on both sides.

The types of messages and operations are defined by the so called Netlink commands. A command correlates to one message type, and also might correlate to one op(eration).

Each command or message type can have or use one or more attributes.

Families

I have already mentioned families in this text and in different contexts. Unfortunately, as is common in the world of computer programming, things sometimes aren't named in the very best way hence we end up with situations like this one.

Sockets have families, of which we use the AF_NETLINK. Netlink also has families, of which there are only 32, and no more are planned or should be introduced to the Linux kernel; the family we use is NETLINK_GENERIC. Last but not least, Generic Netlink also has families, although these are dynamically registered and a whopping total of 1024 can be registered at a single time.

Generic Netlink families are identified by a string, such as "nl80211", for example. Since the families are registered dynamically, that means that their ID can change from one computer to another or even one boot to another, so we need to resolve them before we can send messages to a family.

Generic Netlink itself provides a single statically allocated family called nlctrl, which provides a command to resolve said families. Since not long ago it also provides a way of introspecting operations and exposing policies to user space, but we are not going to go into detail on that in this tutorial.

One more thing of note is that a single Generic Netlink socket is not bound to any one family. A Generic Netlink socket can talk to any family at any time, it just needs to provide the family ID when sending the message, by using the type field as we saw earlier.
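To illustrate both points (the type field carrying the family ID, and resolving a family through nlctrl), here is a sketch that builds the raw request a program would send to look up a family by name. The function name build_getfamily_req() is made up and the protocol version of 1 is an assumption on my part; GENL_ID_CTRL, CTRL_CMD_GETFAMILY, CTRL_ATTR_FAMILY_NAME and GENL_NAMSIZ are real constants from linux/genetlink.h.

```c
#include <linux/genetlink.h>
#include <linux/netlink.h>
#include <stddef.h>
#include <string.h>

/*
 * Build a CTRL_CMD_GETFAMILY request for the family called name.
 * Returns the total message length, or 0 if it does not fit.
 */
static size_t build_getfamily_req(void *buf, size_t cap, const char *name,
				  __u32 seq)
{
	unsigned char *p	= buf;
	size_t	       name_len = strlen(name) + 1; /* include the NUL */
	size_t	       attr_off = NLMSG_HDRLEN + GENL_HDRLEN;
	size_t	       total	= attr_off + NLA_ALIGN(NLA_HDRLEN + name_len);

	if (name_len > GENL_NAMSIZ || total > cap) {
		return 0;
	}

	struct nlmsghdr nlh = {
		.nlmsg_len   = (__u32)total,
		.nlmsg_type  = GENL_ID_CTRL, /* family ID of nlctrl */
		.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
		.nlmsg_seq   = seq,
		.nlmsg_pid   = 0, /* a library would put the socket's port ID */
	};
	struct genlmsghdr gh = {
		.cmd	 = CTRL_CMD_GETFAMILY,
		.version = 1, /* assumption: version used for this sketch */
	};
	struct nlattr nla = {
		.nla_len  = (__u16)(NLA_HDRLEN + name_len),
		.nla_type = CTRL_ATTR_FAMILY_NAME,
	};

	memset(p, 0, total); /* also zeroes the attribute padding */
	memcpy(p, &nlh, sizeof(nlh));
	memcpy(p + NLMSG_HDRLEN, &gh, sizeof(gh));
	memcpy(p + attr_off, &nla, sizeof(nla));
	memcpy(p + attr_off + NLA_HDRLEN, name, name_len);

	return total;
}
```

The reply from nlctrl carries, among others, a CTRL_ATTR_FAMILY_ID attribute with the numeric ID, which is the value that then goes into the type field of messages sent to that family. In the user space part of this article the library does all of this for us.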

Multicast groups

There are some messages that we would like to send asynchronously to user programs in order to notify them of some events, or just to communicate information as it becomes available. This is where multicast groups come in.

Generic Netlink multicast groups, just like families, are dynamically registered with a string name and receive a numeric ID upon registration. In other words, they must also be resolved before we are able to subscribe to them. Once subscribed to a multicast group, the user program will receive all messages sent to the group.

In order to avoid mixing sequence numbers with unicast messages and to make handling easier, it is recommended to use a different socket for multicast messages.
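On a raw Netlink socket, subscribing to a group comes down to a setsockopt() call with NETLINK_ADD_MEMBERSHIP and the resolved group ID. The wrapper below is a hypothetical helper, and the fallback definition of SOL_NETLINK is there only in case the libc headers don't provide it.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270 /* fallback for libc headers that lack it */
#endif

/* Join a multicast group by its resolved ID; returns 0 or -errno */
static int join_mcast_group(int fd, unsigned int group_id)
{
	if (setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &group_id,
		       sizeof(group_id)) < 0) {
		return -errno;
	}

	return 0;
}
```

It would be called on a dedicated socket opened with socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC), after resolving the group ID through nlctrl.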

Getting our hands dirty

There's much more about Generic Netlink and especially classic Netlink, but those were the most important concepts to know about when working with Generic Netlink. That said, it's not very interesting just knowing about something, we are here for the action after all.

I have made an example of using Generic Netlink that consists of two parts. A kernel module, and a userland program.

The kernel module provides a single Generic Netlink operation and a multicast group. The message structure is the same for the do op and the multicast notification. The former reads a string message, prints it to the kernel log, and sends its own message back; the latter sends a notification whenever a message is written to a sysfs file, echoing it.

The user space program connects to Generic Netlink, subscribes to the multicast group, sends a message to our family and prints out the received messages.

I'll be explaining them step by step with code listings in this article. The full source code for both parts can be found at https://git.yaroslavps.com/genltest/.

The land of the Kernel

Using Generic Netlink from kernel space is pretty straightforward. All we need to start using it is to include a single header in our file, net/genetlink.h. In total all the headers that we need to start working with our example are as follows:

#include <linux/module.h>
#include <net/genetlink.h>

We'll need some definitions and enumerations that will be shared between kernel space and user space; we'll put them in a header file that we'll call genltest.h:

#define GENLTEST_GENL_NAME "genltest"
#define GENLTEST_GENL_VERSION 1
#define GENLTEST_MC_GRP_NAME "mcgrp"

/* Attributes */
enum genltest_attrs {
	GENLTEST_A_UNSPEC,
	GENLTEST_A_MSG,
	__GENLTEST_A_MAX,
};

#define GENLTEST_A_MAX (__GENLTEST_A_MAX - 1)

/* Commands */
enum genltest_cmds {
	GENLTEST_CMD_UNSPEC,
	GENLTEST_CMD_ECHO,
	__GENLTEST_CMD_MAX,
};

#define GENLTEST_CMD_MAX (__GENLTEST_CMD_MAX - 1)

There we defined the name of our family, our protocol version, our multicast group name, the attributes that we will use in our messages and our commands.
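The __GENLTEST_A_MAX trick may look odd at first: the last enumerator is a sentinel, so GENLTEST_A_MAX always equals the highest valid attribute, and tables indexed by attribute type are declared with GENLTEST_A_MAX + 1 elements. Here is a tiny sketch of the idiom; the name table and attr_name() are hypothetical and only there to show the sizing.

```c
#include <stddef.h>

/* Same enumeration as in genltest.h, repeated to keep this self-contained */
enum genltest_attrs {
	GENLTEST_A_UNSPEC,
	GENLTEST_A_MSG,
	__GENLTEST_A_MAX, /* sentinel: one past the last real attribute */
};

#define GENLTEST_A_MAX (__GENLTEST_A_MAX - 1)

/* Hypothetical table indexed by attribute type, hence MAX + 1 slots */
static const char *const attr_names[GENLTEST_A_MAX + 1] = {
	[GENLTEST_A_UNSPEC] = "unspec",
	[GENLTEST_A_MSG]    = "msg",
};

/* Look up a name, rejecting types outside of the known range */
static const char *attr_name(int type)
{
	if (type < 0 || type > GENLTEST_A_MAX) {
		return NULL;
	}
	return attr_names[type];
}
```

Adding a new attribute before the sentinel automatically grows GENLTEST_A_MAX and every table sized from it.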

Back in our kernel code, we make a validation policy for our "echo" command:

/* Attribute validation policy for our echo command */
static struct nla_policy echo_pol[GENLTEST_A_MAX + 1] = {
	[GENLTEST_A_MSG] = { .type = NLA_NUL_STRING },
};

Make an array with our Generic Netlink operations:

/* Operations for our Generic Netlink family */
static struct genl_ops genl_ops[] = {
	{
		.cmd	= GENLTEST_CMD_ECHO,
		.policy = echo_pol,
		.doit	= echo_doit,
	 },
};

Similarly an array with our multicast groups:

/* Multicast groups for our family */
static const struct genl_multicast_group genl_mcgrps[] = {
	{ .name = GENLTEST_MC_GRP_NAME },
};

Finally the struct describing our family, where we include everything so far:

/* Generic Netlink family */
static struct genl_family genl_fam = {
	.name	  = GENLTEST_GENL_NAME,
	.version  = GENLTEST_GENL_VERSION,
	.maxattr  = GENLTEST_A_MAX,
	.ops	  = genl_ops,
	.n_ops	  = ARRAY_SIZE(genl_ops),
	.mcgrps	  = genl_mcgrps,
	.n_mcgrps = ARRAY_SIZE(genl_mcgrps),
};

On initialization of our module, we need to register our family with Generic Netlink. For that we just need to pass it our genl_family structure:

ret = genl_register_family(&genl_fam);
if (unlikely(ret)) {
	pr_crit("failed to register generic netlink family\n");
	// etc...
}

And similarly, on module exit we need to unregister it:

if (unlikely(genl_unregister_family(&genl_fam))) {
	pr_err("failed to unregister generic netlink family\n");
}

As you may have noticed, we set the doit callback for our "echo" command to an echo_doit function. Here's what it looks like:

/* Handler for GENLTEST_CMD_ECHO messages received */
static int echo_doit(struct sk_buff *skb, struct genl_info *info)
{
	int		ret = 0;
	void	       *hdr;
	struct sk_buff *msg;

	/* Check if the attribute is present and print it */
	if (info->attrs[GENLTEST_A_MSG]) {
		char *str = nla_data(info->attrs[GENLTEST_A_MSG]);
		pr_info("message received: %s\n", str);
	} else {
		pr_info("empty message received\n");
	}

	/* Allocate a new buffer for the reply */
	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
	if (!msg) {
		pr_err("failed to allocate message buffer\n");
		return -ENOMEM;
	}

	/* Put the Generic Netlink header */
	hdr = genlmsg_put(msg, info->snd_portid, info->snd_seq, &genl_fam, 0,
			  GENLTEST_CMD_ECHO);
	if (!hdr) {
		pr_err("failed to create genetlink header\n");
		nlmsg_free(msg);
		return -EMSGSIZE;
	}
	/* And the message */
	if ((ret = nla_put_string(msg, GENLTEST_A_MSG,
				  "Hello from Kernel Space, Netlink!"))) {
		pr_err("failed to create message string\n");
		genlmsg_cancel(msg, hdr);
		nlmsg_free(msg);
		goto out;
	}

	/* Finalize the message and send it */
	genlmsg_end(msg, hdr);

	ret = genlmsg_reply(msg, info);
	if (ret) {
		pr_err("failed to send reply\n");
	} else {
		pr_info("reply sent\n");
	}

out:
	return ret;
}

In summary, when handling a do command we follow these steps:

  1. Get the data from the incoming message from the genl_info structure.
  2. Allocate a new message buffer for the reply.
  3. Put the Generic Netlink header in the message buffer; notice that we use the same port id and sequence number as in the incoming message since this is a reply.
  4. Put all our payload attributes.
  5. Send the reply.

Now let's take a look at how to send multicast notifications. I've used sysfs to make this example a little bit more fun; I've created a kobj called genltest which contains a ping attribute from which we will echo what is written to it. For brevity I'll elide the sysfs code from the article and just add the function that forms and sends the message here:

/* Multicast ping message to our genl multicast group */
static int echo_ping(const char *buf, size_t cnt)
{
	int		ret = 0;
	void	       *hdr;
	/* Allocate message buffer */
	struct sk_buff *skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);

	if (unlikely(!skb)) {
		pr_err("failed to allocate memory for genl message\n");
		return -ENOMEM;
	}

	/* Put the Generic Netlink header */
	hdr = genlmsg_put(skb, 0, 0, &genl_fam, 0, GENLTEST_CMD_ECHO);
	if (unlikely(!hdr)) {
		pr_err("failed to create genetlink header\n");
		nlmsg_free(skb);
		return -EMSGSIZE;
	}

	/* And the message */
	if ((ret = nla_put_string(skb, GENLTEST_A_MSG, buf))) {
		pr_err("unable to create message string\n");
		genlmsg_cancel(skb, hdr);
		nlmsg_free(skb);
		return ret;
	}

	/* Finalize the message */
	genlmsg_end(skb, hdr);

	/* Send it over multicast to the 0-th mc group in our array. */
	ret = genlmsg_multicast(&genl_fam, skb, 0, 0, GFP_KERNEL);
	if (ret == -ESRCH) {
		pr_warn("multicast message sent, but nobody was listening...\n");
	} else if (ret) {
		pr_err("failed to send multicast genl message\n");
	} else {
		pr_info("multicast message sent\n");
	}

	return ret;
}

The process is very similar to the do operation, except that we are not responding to a request but sending an asynchronous message. Because of that we set the sequence number to 0, since it is of no consequence here, and the port ID to 0, the kernel port/PID. We also send the message via the genlmsg_multicast() function.

This is all for the kernel side of things for this tutorial. Now let's take a look at the user side of things.

User land

Netlink is a socket family, so it's possible to communicate over Netlink by just opening a socket and sending and receiving messages over it, something like this:

int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);

/* Format request message... */
/* ... */

/* Send it */
send(fd, &req, sizeof(req), 0);
/* Receive response */
recv(fd, &resp, BUF_SIZE, 0);

/* Do something with response... */
/* ... */

That said, it's better to make use of the libnl [1] library or similar, since they provide a better way to interface with Generic Netlink that is less prone to errors and already contains all the boilerplate that you would need to write anyway. This library is precisely the one that I'll be using in this example.

We'll need to include some headers from the libnl library to get started:

#include <netlink/socket.h>
#include <netlink/netlink.h>
#include <netlink/genl/ctrl.h>
#include <netlink/genl/genl.h>
#include <netlink/genl/family.h>

As well as our shared header with the enumerations and defines from the kernel module:

#include "../ks/genltest.h"

I've also made a little helper macro for printing errors:

#define prerr(...) fprintf(stderr, "error: " __VA_ARGS__)

As I mentioned in the introduction, it's easier to use different sockets for unicast and multicast messages, so that's what I'm going to be doing here, opening two different sockets to connect to Generic Netlink:

/* Allocate netlink socket and connect to generic netlink */
static int conn(struct nl_sock **sk)
{
	*sk = nl_socket_alloc();
	if (!*sk) {
		return -ENOMEM;
	}

	return genl_connect(*sk);
}

/*
 * ...
 */

struct nl_sock *ucsk, *mcsk;

/*
 * We use one socket to receive asynchronous "notifications" over
 * multicast group, and another for ops. We do this so that we don't mix
 * up responses from ops with notifications to make handling easier.
 */
if ((ret = conn(&ucsk)) || (ret = conn(&mcsk))) {
	prerr("failed to connect to generic netlink\n");
	goto out;
}

Next we need to resolve the ID of the Generic Netlink family that we want to connect to:

/* Resolve the genl family. One family for both unicast and multicast. */
int fam = genl_ctrl_resolve(ucsk, GENLTEST_GENL_NAME);
if (fam < 0) {
	prerr("failed to resolve generic netlink family: %s\n",
	      strerror(-fam));
	goto out;
}

A (Generic) Netlink socket is not associated with a family, so we are going to need the family ID when sending the message a little bit later.

The libnl library can do sequence checking for us, but we don't need it for multicast messages, so we disable it for our multicast socket:

nl_socket_disable_seq_check(mcsk);

We also need to resolve the multicast group name. In this case we are going to be using the resolved ID right away to subscribe to the group and start receiving the notifications:

/* Resolve the multicast group. */
int mcgrp = genl_ctrl_resolve_grp(mcsk, GENLTEST_GENL_NAME,
				  GENLTEST_MC_GRP_NAME);
if (mcgrp < 0) {
	prerr("failed to resolve generic netlink multicast group: %s\n",
	      strerror(-mcgrp));
	goto out;
}
/* Join the multicast group. */
if ((ret = nl_socket_add_membership(mcsk, mcgrp)) < 0) {
	prerr("failed to join multicast group: %s\n", strerror(-ret));
	goto out;
}

We need to modify the default callback so that we can handle the incoming messages:

/* Modify the callback for replies to handle all received messages */
static inline int set_cb(struct nl_sock *sk)
{
	/* Returns 0 on success or a negative error code, passed on as is */
	return nl_socket_modify_cb(sk, NL_CB_VALID, NL_CB_CUSTOM,
				   echo_reply_handler, NULL);
}

/*
 * ...
 */

if ((ret = set_cb(ucsk)) || (ret = set_cb(mcsk))) {
	prerr("failed to set callback: %s\n", strerror(-ret));
	goto out;
}

As you can see, we set the handler function to the same for both sockets, since we will be basically receiving the same message format for both the do request and the notifications. Our handler looks like this:

/*
 * Handler for all received messages from our Generic Netlink family, both
 * unicast and multicast.
 */
static int echo_reply_handler(struct nl_msg *msg, void *arg)
{
	int		   err	   = 0;
	struct genlmsghdr *genlhdr = nlmsg_data(nlmsg_hdr(msg));
	struct nlattr	  *tb[GENLTEST_A_MAX + 1];

	/* Parse the attributes */
	err = nla_parse(tb, GENLTEST_A_MAX, genlmsg_attrdata(genlhdr, 0),
			genlmsg_attrlen(genlhdr, 0), NULL);
	if (err) {
		prerr("unable to parse message: %s\n", strerror(-err));
		return NL_SKIP;
	}
	/* Check that there's actually a payload */
	if (!tb[GENLTEST_A_MSG]) {
		prerr("msg attribute missing from message\n");
		return NL_SKIP;
	}

	/* Print it! */
	printf("message received: %s\n", nla_get_string(tb[GENLTEST_A_MSG]));

	return NL_OK;
}

Nothing that fancy going on here; much of it is very similar to what we were doing on the kernel side of things. The main difference is that there the message was already parsed for us by the kernel API, while here we have the option to walk the attributes manually or parse them all into an array with the help of a library function.

Next we want to send a message to kernel space:

/* Send (unicast) GENLTEST_CMD_ECHO request message */
static int send_echo_msg(struct nl_sock *sk, int fam)
{
	int	       err = 0;
	struct nl_msg *msg = nlmsg_alloc();
	if (!msg) {
		return -ENOMEM;
	}

	/* Put the genl header inside message buffer */
	void *hdr = genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, fam, 0, 0,
				GENLTEST_CMD_ECHO, GENLTEST_GENL_VERSION);
	if (!hdr) {
		err = -EMSGSIZE;
		goto out;
	}

	/* Put the string inside the message. */
	err = nla_put_string(msg, GENLTEST_A_MSG,
			     "Hello from User Space, Netlink!");
	if (err < 0) {
		goto out;
	}

	/* Send the message; on success the return value is not negative. */
	err = nl_send_auto(sk, msg);
	if (err >= 0) {
		printf("message sent\n");
		err = 0;
	}

out:
	/* Free the message on every path, including the error ones */
	nlmsg_free(msg);

	return err;
}

/*
 * ...
 */

/* Send unicast message and listen for response. */
if ((ret = send_echo_msg(ucsk, fam))) {
	prerr("failed to send message: %s\n", strerror(-ret));
}

Also not that different from the kernel API. Now we listen once for the response to our command and indefinitely for incoming notifications:

printf("listening for messages\n");
nl_recvmsgs_default(ucsk);

/* Listen for "notifications". */
while (1) {
	nl_recvmsgs_default(mcsk);
}

As good hygiene, let's close the connection and socket before exiting our program:

/* Disconnect and release socket */
static void disconn(struct nl_sock *sk)
{
	nl_close(sk);
	nl_socket_free(sk);
}

/*
 * ...
 */

disconn(ucsk);
disconn(mcsk);

That's about it!

Conclusion

The old ways of interacting with the kernel and its different subsystems through such interfaces as sysfs and especially ioctl had many downsides and lacked some very needed features such as asynchronous operations and a properly structured format.

Netlink, and its extended form, Generic Netlink, provide a very flexible way of communicating with the kernel, solving many of the downsides and problems of the interfaces of old. It's certainly not a perfect solution (not very Unixy for instance), but it's certainly the best way we have to communicate with the kernel for things more complicated than setting simple parameters.

Post scriptum

At the moment when I started learning to use Generic Netlink, the 6.0 kernel wasn't yet stable, hence the excellent Netlink intro in the kernel docs [2] wasn't yet in the "latest" section, but rather in the "next" section. Since I was looking inside the then "latest" (v5.19) docs, I didn't notice it until I started writing my own article.

I'm not sure I would have started writing this article had I come across the new kernel docs page, but in the end I think it was worth it since this one is a good complement to the official docs if you want to get your hands dirty straight away, especially considering that it provides a practical example.

Do give a read to the kernel docs page! It covers some things that might not be covered here.

[1] Original site with documentation: https://www.infradead.org/~tgr/libnl/ (up-to-date repository: https://github.com/thom311/libnl)

[2] Very good introductory Netlink article, from the kernel docs themselves: https://kernel.org/doc/html/latest/userspace-api/netlink/intro.html

© 2018—2024 Yaroslav de la Peña Smirnov.