[MidoNet-dev] Draft proposal: Ping through NAT support (#435)

Navarro, Galo galo at midokura.com
Fri Feb 15 14:40:31 UTC 2013

Hello dev force!

I'm sending below a draft proposal for issue #435. Thanks in advance
for any comments / doubts / corrections / suggestions / improvements.

# Use case

Currently it's impossible to ping from a private network behind NAT to
an external address. Translation is not applied to ICMP because ICMP
has no ports: we'd need to match on additional ICMP-specific fields
that neither OpenFlow nor OVS supports, so in practice we can't route
ICMP replies back to the correct sender.

This feature is typically supported by iptables and commodity routers,
so we'd like to make MidoNet able to circumvent OVS/OpenFlow's
limitations. This will involve:

1. Supporting ICMP messages in NAT rules
2. Forcing user-space processing of ICMP echo requests/replies

## Support ICMP messages in NAT rules

For ICMP messages, the identifier would act as transport source /
destination (see [RFC3022][1], esp. sections 2.2 and 4.1, as well as
[RFC5508][2]). The NatLeaseManager would treat these as ports, thus
being able to discriminate origins of ICMP echo requests sent to the
same destination:

Suppose both A and B send an ICMP(src,dst,type,id) to Z across a router R:

- A sends ICMP(A,Z,req,x), R translates to ICMP(R,Z,req,x')
- B sends ICMP(B,Z,req,y), R translates to ICMP(R,Z,req,y')

NAT mappings are: (A,x,Z,x') and (B,y,Z,y'). Thus:

- Z sends ICMP(Z,R,rep,x'), R translates to ICMP(Z,A,rep,x)
- Z sends ICMP(Z,R,rep,y'), R translates to ICMP(Z,B,rep,y)
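
The exchange above can be sketched as a small translation table. This is
only an illustration of the idea, not the NatLeaseManager code: the
`IcmpNat` class, its method names, and the identifier values are all
hypothetical and chosen for the example.

```python
from itertools import count


class IcmpNat:
    """Toy NAT table for ICMP echo, keyed on the ICMP identifier (hypothetical)."""

    def __init__(self, public_ip, first_id=1000):
        self.public_ip = public_ip
        self._next_id = count(first_id)  # source of translated identifiers
        self._fwd = {}  # (src, id, dst) -> translated id
        self._rev = {}  # (dst, translated id) -> (src, id)

    def translate_request(self, src, dst, ident):
        """Rewrite ICMP(src,dst,req,id) to ICMP(R,dst,req,id')."""
        key = (src, ident, dst)
        if key not in self._fwd:
            new_id = next(self._next_id)
            self._fwd[key] = new_id
            self._rev[(dst, new_id)] = (src, ident)
        return (self.public_ip, dst, self._fwd[key])

    def translate_reply(self, src, ident):
        """Rewrite ICMP(Z,R,rep,id') to ICMP(Z,orig_src,rep,id); None if unknown."""
        mapping = self._rev.get((src, ident))
        if mapping is None:
            return None
        orig_src, orig_id = mapping
        return (src, orig_src, orig_id)


nat = IcmpNat("R")
# A and B happen to pick the same identifier; R assigns distinct ones.
_, _, x_prime = nat.translate_request("A", "Z", 7)
_, _, y_prime = nat.translate_request("B", "Z", 7)
```

Because the reverse table is keyed on the translated identifier, replies
from Z find their way back to A or B even when both hosts chose the same
original identifier.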

One benefit of this approach is that we'd be able to reuse most of the
NatMapping code, rather than writing a separate mapping for ICMP
messages alone. The main shortcoming of the current implementation is
the possibility of clashes between ICMP identifiers and port numbers
used by other applications. This is solved in practice by including the
protocol in the mapping criteria (found references to this in [3] and
[4]).

This solution would require adding the protocol to the current
NatLeaseManager + NatMapping. The logic to allocate a NAT lease would
work like this:

- TCP/UDP: identical to current implementation
- ICMP: lease the ICMP identifier as both source and destination "port"
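
As a rough sketch of the allocation logic, assuming a lease key that
carries the protocol: `LeaseKey` and `lease_key` are hypothetical names
for illustration and do not correspond to existing NatLeaseManager or
NatMapping code.

```python
from collections import namedtuple

# Hypothetical lease key. The protocol is part of the key, so an ICMP
# identifier and a TCP/UDP port with the same numeric value never clash.
LeaseKey = namedtuple("LeaseKey", "proto src_ip src_port dst_ip dst_port")


def lease_key(proto, src_ip, dst_ip, src_port=None, dst_port=None, icmp_id=None):
    """Build the key under which a NAT lease would be stored."""
    if proto in ("tcp", "udp"):
        # TCP/UDP: same as today, real transport ports.
        return LeaseKey(proto, src_ip, src_port, dst_ip, dst_port)
    if proto == "icmp":
        # ICMP: the identifier stands in for both source and destination "port".
        return LeaseKey(proto, src_ip, icmp_id, dst_ip, icmp_id)
    raise ValueError("unsupported protocol: %r" % (proto,))


icmp_key = lease_key("icmp", "A", "Z", icmp_id=5000)
tcp_key = lease_key("tcp", "A", "Z", src_port=5000, dst_port=5000)
```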

This leaves one chance of collision: both A and B may send an ICMP
request with the same identifier, in which case R won't be able to
reverse-translate the reply. To get around this we can:

1. Drop an ICMP echo request if the identifier is already leased. These
   leases should probably have a low TTL.
2. Make the NatLeaseManager hold a separate list of free identifiers and
   assign them similarly as is done for ports.

(1) is less complex, at the cost of some dropped ICMP requests.
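
Option (1) could look roughly like the sketch below. It assumes leases
are tracked per (destination, identifier) pair and expire after a short
TTL; the `IcmpIdLeases` class, the 5-second default, and the injectable
clock are hypothetical choices made for the example.

```python
import time


class IcmpIdLeases:
    """Drop-on-collision leasing of ICMP identifiers (hypothetical sketch)."""

    def __init__(self, ttl_seconds=5.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._leases = {}  # (dst, icmp_id) -> (owner, expiry time)

    def try_lease(self, owner, dst, icmp_id):
        """Return True to forward the echo request, False to drop it."""
        now = self._clock()
        key = (dst, icmp_id)
        held = self._leases.get(key)
        if held is not None:
            holder, expiry = held
            if expiry > now and holder != owner:
                return False  # identifier in use by another sender: drop
        # Free, expired, or our own lease: take it or refresh it.
        self._leases[key] = (owner, now + self._ttl)
        return True


fake_now = [0.0]  # controllable clock for the example
leases = IcmpIdLeases(ttl_seconds=5.0, clock=lambda: fake_now[0])
a_first = leases.try_lease("A", "Z", 7)   # A gets the lease
b_clash = leases.try_lease("B", "Z", 7)   # B collides and is dropped
fake_now[0] = 6.0                         # A's lease has expired
b_retry = leases.try_lease("B", "Z", 7)   # B can now use the identifier
```

A low TTL keeps the window in which B's pings are dropped short, which
is the trade-off option (1) accepts in exchange for not managing a pool
of free identifiers.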

## Force userspace processing of ICMP messages

This is necessary because ODP does not parse the identifier field of
ICMP messages, which is the only way to map src and dst across NAT.

1. ICMP messages should be simulated normally, but never produce
   installed flows.
2. Make ForwardNatRule and ReverseNatRule deal with ICMP by themselves.

Rules are only provided with context that may be relevant for installed
flows, so the ICMP identifier is not there and NAT rules cannot perform
the translation. There are various options to solve this:

- Make the Router artificially set the WildcardMatch's transport dst and
  src to the ICMP identifier before they enter chain.apply. This
  approach is problematic because non-NAT rules would suddenly need to
  deal with a NAT-specific hack. Also, this solution simply will not
  work for ICMP errors since, as explained further down, we'll need
  rules to examine the contents of the ICMP payload.
- Extend WildcardMatch to include the original packet so that each rule
  can simply examine the contents freely.
- Extend WildcardMatch by adding the ICMP source/dst identifier.

The last option seems better because it makes it easy to reflect that,
in fact, we're extending the packet parsing capabilities of ODP.

WildcardMatch would start including "ODP-supported" fields, and
"ODP-unsupported" fields (ICMP id and payload would be part of these).
MM will emit the modified packet, but when it comes to installing new
flows the FlowController will simply ignore those that involve
unsupported matching fields. If further versions of ODP start parsing
unsupported fields, the corresponding flows can start being installed.
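
The install decision described above might be sketched as follows. The
field names, the `ODP_UNSUPPORTED_FIELDS` set, and both functions are
hypothetical; they only illustrate the split between emitting a packet
and installing a flow, not the actual FlowController code.

```python
# Hypothetical set of match fields that ODP cannot parse today.
ODP_UNSUPPORTED_FIELDS = frozenset({"icmp_id", "icmp_payload"})


def can_install_flow(used_fields):
    """A flow is installable only if the match used no unsupported field."""
    return ODP_UNSUPPORTED_FIELDS.isdisjoint(used_fields)


def handle_simulation_result(used_fields):
    """Always emit the modified packet; install a flow only when possible."""
    actions = ["emit_packet"]
    if can_install_flow(used_fields):
        actions.append("install_flow")
    return actions


tcp_actions = handle_simulation_result({"nw_src", "nw_dst", "tp_dst"})
icmp_actions = handle_simulation_result({"nw_src", "nw_dst", "icmp_id"})
```

If a later ODP version learns to parse one of these fields, removing it
from the unsupported set is enough for the corresponding flows to start
being installed.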

## Further support for ICMP error messages (#513)

As Jacob mentions, one of the most important ICMP error messages to
support across NAT would be ICMP Destination Unreachable, esp.
fragmentation required, etc. An ICMP error from Z to A triggered in
response to a TCP packet NAT'ed by R would look like this:

    ICMP(Z,R,dest-unreachable,frag-reqd, (A,x',Z,y'))

By examining the payload, R would be able to perform the reverse
mapping and deliver the ICMP error to A:

    ICMP(Z,A,dest-unreachable,frag-reqd, (A,x,Z,y))
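
A minimal sketch of this reverse translation, using the same symbolic
tuples as the two messages above. The function name and the shape of
`reverse_table` are assumptions made for the example, not existing
code.

```python
def reverse_translate_icmp_error(error, reverse_table):
    """Rewrite an ICMP error using the packet header embedded in its payload.

    error is (src, dst, type, code, embedded), where embedded is the
    (src, sport, dst, dport) tuple of the packet that triggered the error.
    Returns None when no NAT mapping matches the embedded tuple.
    """
    src, _dst, icmp_type, code, embedded = error
    original = reverse_table.get(embedded)
    if original is None:
        return None
    orig_src = original[0]
    # Deliver to the original sender, restoring the embedded header as well.
    return (src, orig_src, icmp_type, code, original)


# Hypothetical reverse mapping kept by R for the NAT'ed TCP flow.
reverse_table = {("A", "x'", "Z", "y'"): ("A", "x", "Z", "y")}
incoming = ("Z", "R", "dest-unreachable", "frag-reqd", ("A", "x'", "Z", "y'"))
delivered = reverse_translate_icmp_error(incoming, reverse_table)
```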

## API changes

Initially this should not necessarily require any API changes, since
most of the work is localized in core code. However, since part of the
proposal involves adding the protocol to NAT rules, we may consider
exposing this field in the API as part of this issue.

## References

[1]: <http://tools.ietf.org/html/rfc3022>
[2]: <http://tools.ietf.org/html/rfc5508#page-6>
[3]: <http://hasenstein.com/linux-ip-nat/diplom/node6.html>
[4]: <http://superuser.com/questions/135094/how-does-a-nat-server-forward-ping-icmp-echo-reply-packets-to-users#135098>

