VerneMQ and Proxy Protocol

Let's have a look together at a nice little feature that makes integration between proxy components like load balancers and VerneMQ a little bit easier. HAProxy was the first to implement it, and you can find the current specification of the protocol here.

Now, most people simply report what they have heard other folks saying: use the proxy protocol to forward client information "like source IP address" to the backend service. As if there were much more to forward than the source IP. Well, there is a little bit more, so let's have a closer look. The spec says this:

"The information carried by the protocol are the ones the server would get using getsockname() and getpeername() :

- address family (AF_INET for IPv4, AF_INET6 for IPv6, AF_UNIX)
- socket protocol (SOCK_STREAM for TCP, SOCK_DGRAM for UDP)
- layer 3 source and destination addresses
- layer 4 source and destination ports if any"

So the proxy protocol tells you little more than the protocol family (TCP4 or TCP6) plus the source and destination IPs and ports. A sender implementing the protocol will only send this information once, just after opening the TCP connection. For protocol version 1 (which is the human-readable version, while v2 is binary), it will send a line like the following before sending anything else:

PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\n

The receiver then knows that the original connection request comes from 192.168.0.1:56324.
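To make the v1 line format concrete, here is a minimal Python sketch of what a receiver does with that line. It is an illustration only, not VerneMQ's actual implementation: the names `ProxyInfo` and `parse_proxy_v1` are made up for this example, and it deliberately ignores the `PROXY UNKNOWN` form that the spec also allows.

```python
from typing import NamedTuple


class ProxyInfo(NamedTuple):
    """Connection metadata carried by a v1 header (hypothetical helper type)."""
    family: str      # "TCP4" or "TCP6"
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def parse_proxy_v1(line: bytes) -> ProxyInfo:
    """Parse one PROXY protocol v1 header line, including its trailing CRLF."""
    if not line.endswith(b"\r\n"):
        raise ValueError("v1 header must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY":
        raise ValueError("not a PROXY v1 TCP header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    if family not in ("TCP4", "TCP6"):
        raise ValueError(f"unsupported family: {family}")
    # Field order on the wire: source IP, destination IP, source port, destination port.
    return ProxyInfo(family, src_ip, dst_ip, int(src_port), int(dst_port))


info = parse_proxy_v1(b"PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\n")
print(f"{info.src_ip}:{info.src_port}")
```

Note the field order: both addresses come first, then both ports.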

Now, HAProxy didn't invent this only to be useful for a proxy in front of a backend service. The idea is that you can use the proxy protocol to pass connection metadata through a chain of components, say firewall, load balancer, application proxy, etc. For that to work, every component needs to implement both sides of the proxy protocol, that is, the sender and the receiver side (except the edge components).
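The sender side is the easy half. As a rough sketch (the function name `build_proxy_v1` is hypothetical, and a real proxy would write these bytes to the upstream socket before relaying any client data), a component in the chain could build the v1 line like this:

```python
import ipaddress


def build_proxy_v1(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Build a PROXY protocol v1 header line for a TCP connection."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must share an address family")
    family = "TCP4" if src.version == 4 else "TCP6"
    # The header is a single ASCII line terminated by CRLF.
    return f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n".encode("ascii")


header = build_proxy_v1("192.168.0.1", "192.168.0.11", 56324, 443)
print(header)
```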

VerneMQ only implements the receiver side, in the form of a listener configuration in the vernemq.conf file. For instance, you can enable the proxy protocol for a specific listener like this:

listener.tcp.LISTENER.proxy_protocol = on

Why does VerneMQ only implement the receiver, you ask? Well, it wouldn't make much sense as a chaining component, and it certainly would break the decoupling of the Pub/Sub model.

Note that when you enable the proxy protocol on a listener, all clients connecting have to use the proxy protocol. That is, you can't mix connections coming from HAProxy with connections coming directly from MQTT clients on the same listener.

MQTT clients do not have to implement the proxy protocol. They connect to HAProxy (or any other proxy), and HAProxy will take care of forwarding the connection metadata. VerneMQ will understand both the readable and the binary version of the proxy protocol without further configuration.
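How can a receiver tell the two versions apart without configuration? The spec makes this simple: a v1 header starts with the ASCII text "PROXY " and a v2 header starts with a fixed 12-byte signature. The sketch below shows the idea in Python; it is not VerneMQ's code, and `detect_proxy_version` is a name invented for this example.

```python
V1_PREFIX = b"PROXY "
# Fixed 12-byte v2 signature from the spec: 0D 0A 0D 0A 00 0D 0A 51 55 49 54 0A
V2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def detect_proxy_version(first_bytes: bytes) -> int:
    """Return 1 or 2 for a PROXY protocol header, or 0 if neither prefix matches."""
    if first_bytes.startswith(V2_SIGNATURE):
        return 2
    if first_bytes.startswith(V1_PREFIX):
        return 1
    return 0


print(detect_proxy_version(b"PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\n"))
print(detect_proxy_version(V2_SIGNATURE + b"\x21\x11\x00\x0c"))
print(detect_proxy_version(b"\x10\x0c\x00\x04MQTT"))  # a bare MQTT CONNECT, no header
```

The last call also shows why you can't mix traffic on one listener: a plain MQTT CONNECT matches neither prefix, so a listener that expects the proxy protocol has nothing valid to parse.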

What is the proxy protocol used for?

To name just a few use cases:

  • You can log much more useful information in the backend (VerneMQ)
  • You can configure "sticky session" behaviour in downstream components (supposing you have more than two components chained; you can configure sticky sessions in a load balancer without the proxy protocol)
  • You can configure basic IP-level security in the backend

What about TLS?

The proxy protocol only works on pure TCP. The implementors even designed it to avoid possible confusion with higher-level protocols like HTTP, SMTP or TLS.

When you terminate TLS connections in VerneMQ, you can't use the proxy protocol.

When you terminate TLS on HAProxy directly (which is not the worst idea, by the way), you can still forward connection metadata using the proxy protocol, because HAProxy will connect to VerneMQ locally over TCP. In that case, you can even forward more information than source IP and port. Look at this config line, for instance.

listener.tcp.proxy_protocol_use_cn_as_username = on

This tells VerneMQ to use the client certificate common name as a user name. How is this even possible? Did we just extend the proxy protocol?

Yes, kind of. The proxy protocol has the following trick up its sleeve:

"If the length specified in the PROXY protocol header indicates that additional bytes are part of the header beyond the address information, a receiver may choose to skip over and ignore those bytes, or attempt to interpret those bytes."

Okay. We get some additional bytes for free, and as a receiver we are free to interpret or ignore them. Those extra bytes are arranged as Type-Length-Value (TLV) structures, and with them additional metadata can be transported. Most of it has to do with TLS, like the CN forwarding we capture with VerneMQ. For details on TLVs, have a look at the specification.
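For the curious, here is roughly what reading those TLVs looks like. TLVs only exist in the binary v2 header; each one is a 1-byte type, a 2-byte big-endian length, and then the value. The spec defines an SSL TLV (type 0x20) whose value starts with a 1-byte client field and a 4-byte verify field, followed by sub-TLVs such as the certificate Common Name (0x22). The Python below is a sketch under those assumptions, not VerneMQ's implementation; `iter_tlvs`, `extract_common_name`, `make_tlv` and the sample CN "sensor-42" are all made up for illustration.

```python
import struct
from typing import Iterator, Optional, Tuple

PP2_TYPE_SSL = 0x20        # container TLV for TLS information
PP2_SUBTYPE_SSL_CN = 0x22  # sub-TLV holding the client certificate Common Name


def iter_tlvs(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, value) pairs; stops when fewer than 3 header bytes remain."""
    offset = 0
    while offset + 3 <= len(data):
        tlv_type, length = struct.unpack_from("!BH", data, offset)
        offset += 3
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV")
        yield tlv_type, value
        offset += length


def extract_common_name(tlv_bytes: bytes) -> Optional[str]:
    """Return the forwarded certificate CN, or None if no SSL/CN TLV is present."""
    for tlv_type, value in iter_tlvs(tlv_bytes):
        if tlv_type == PP2_TYPE_SSL:
            # Skip the 1-byte client field and the 4-byte verify field.
            for sub_type, sub_value in iter_tlvs(value[5:]):
                if sub_type == PP2_SUBTYPE_SSL_CN:
                    return sub_value.decode("utf-8")
    return None


def make_tlv(tlv_type: int, value: bytes) -> bytes:
    """Helper for the demo: encode a single TLV."""
    return struct.pack("!BH", tlv_type, len(value)) + value


cn_tlv = make_tlv(PP2_SUBTYPE_SSL_CN, b"sensor-42")
ssl_tlv = make_tlv(PP2_TYPE_SSL, b"\x01" + b"\x00\x00\x00\x00" + cn_tlv)
print(extract_common_name(ssl_tlv))
```

A receiver that doesn't care about TLS metadata can skip the whole TLV area, which is exactly the freedom the spec quote above describes.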

That's it. I hope this was helpful! Oh, and good to know: Proxy protocol support comes open source and for free with VerneMQ!