[MidoNet-dev] API proposal - Optional Bridge ARP Cache

Pino de Candia gdecandia at midokura.com
Mon Feb 25 16:27:28 UTC 2013

On Monday, February 25, 2013 at 4:35 PM, Navarro, Galo wrote:
> Thanks for the feedback, Dave & Ryu. Some comments inline:
> > I think Dave is raising a really interesting point here. Having a cloud
> > orchestration do pre-seeding of port/mac/ip creates duplicate data (DHCP,
> > ARP, Bridge MAC table). Even if we leave DHCP as a separate service, are
> > there any problems with constructing both the bridge MAC table and the ARP
> > cache from the current port configs in ZK?
> > 

Hi Folks,
I don't understand this. BridgePorts don't have the MAC/IP information of the VM-nic
that's going to be attached. The DHCP information can help you with the ARP cache's
mac/ip entries, but you would have to make decisions for the client (permanent entries or
expiring? can they be over-written if we snoop the mac on a different ip?).

Somehow, I'm not so worried about duplicate information. After all, ARP, mac-table
learning and DHCP are separate concerns and would be separate in a real network.
If we expose the contents of those maps/functions, then it should be easy to debug
inconsistencies, and the API will be cleaner.
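
To make the debugging point concrete, here is a minimal sketch of the kind of check an operator could run once those maps are readable. It assumes both the DHCP host assignments and the bridge ARP cache can be dumped as plain IP -> MAC dictionaries; the function name and the data shapes are hypothetical, not part of any existing MidoNet API.

```python
def find_inconsistencies(dhcp_hosts, arp_cache):
    """Compare two IP -> MAC maps and report the IPs they disagree on.

    Both arguments are plain dicts (an assumed export format).
    Returns a sorted list of (ip, dhcp_mac, arp_mac) tuples for every IP
    present in both maps but bound to different MACs.
    """
    shared_ips = dhcp_hosts.keys() & arp_cache.keys()
    return sorted(
        (ip, dhcp_hosts[ip], arp_cache[ip])
        for ip in shared_ips
        if dhcp_hosts[ip] != arp_cache[ip]
    )
```

IPs that appear in only one of the maps are ignored here on purpose, since a host that never got a DHCP lease or has not ARPed yet is not necessarily an error.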

An alternate approach is to allow customers to specify mac/ip pairs and mac/vport pairs
for their L2 network in a non-technology-specific way. Then the virtual bridge could
have capabilities that are based on that information:
- 'enableDhcp' (requires the mac/ip pairs)
- 'enableMacTablePreSeeding' (requires the mac/vport pairs)
- 'enableArpCacheSnooping' 
- 'enableArpCachePreSeeding' (requires the mac/ip pairs)

That avoids data duplication in a clean way... but it's a big API change.
What do you think?
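
As a rough illustration of that alternate approach, the sketch below models a bridge holding the two kinds of pairs plus the four capability flags, and checks that each enabled capability has the data it requires. The class name, the snake_case field names and the dict shapes are assumptions for illustration only; only the flag names in the messages come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class BridgeL2Config:
    """Hypothetical per-bridge config: shared pairs plus capability flags."""
    mac_ip_pairs: dict = field(default_factory=dict)     # MAC -> IP
    mac_vport_pairs: dict = field(default_factory=dict)  # MAC -> virtual port id
    enable_dhcp: bool = False
    enable_mac_table_pre_seeding: bool = False
    enable_arp_cache_snooping: bool = False               # needs no pre-supplied data
    enable_arp_cache_pre_seeding: bool = False

    def validate(self):
        """Return messages for capabilities enabled without their required data."""
        problems = []
        if self.enable_dhcp and not self.mac_ip_pairs:
            problems.append("enableDhcp requires mac/ip pairs")
        if self.enable_mac_table_pre_seeding and not self.mac_vport_pairs:
            problems.append("enableMacTablePreSeeding requires mac/vport pairs")
        if self.enable_arp_cache_pre_seeding and not self.mac_ip_pairs:
            problems.append("enableArpCachePreSeeding requires mac/ip pairs")
        return problems

    def seed_arp_cache(self):
        """Derive IP -> MAC entries from the single mac/ip source, if enabled."""
        if not self.enable_arp_cache_pre_seeding:
            return {}
        return {ip: mac for mac, ip in self.mac_ip_pairs.items()}
```

The point of the sketch is that DHCP and the ARP cache both read from the same mac/ip pairs, so there is only one place to update them.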

> > A port config already has IP address and MAC
> > address fields. I still think it's worth having a separate ARP cache API
> > but only for doing proxy ARP, as Dave mentioned; The mapping in this table
> > set via REST API would simply override the implicit ARP cache created from
> > the port ZK configs, and they're always permanent entries. This idea
> > probably has already been discussed and dismissed but I would love to hear
> > some explanation from the core team as I am most likely missing something.
> > 
> Generally I prefer to minimise the likelihood of inconsistencies, so I
> agree that keeping a single place to set MAC-IP mappings would
> be highly preferable.
> I'd be in favour of pulling ZK port configs to build the cache
> because, as Pino mentions in #154, "we may want to make this feature
> available regardless of whether DHCP is configured". It's worth noting
> that DHCP snooping has the side effect of helping prevent ARP
> spoofing, but I'm not sure that this is an important requirement right
> now.
> In any case, exposing the bridge's ARP caches in read-only mode for
> debugging might still be convenient, what do you think?
> > At first glance, I think we can support this using static bridge ARP cache
> > entries. For example, if IP A is reachable via the router, but we want
> > machines on the bridge to believe that IP A is local, we can set a static entry
> > of (IP A, router's MAC) on the bridge. Machines on the bridge should then
> > send packets for A direct to the router's MAC, and the router should (I think)
> > just route the packet to where it needs to go. I think this is what Pino
> > describes in Github issue #154 [3].
> > 
> It would be feasible (adding API methods to update the ARP cache), but
> I was talking with Guillermo about this and we can't really see a use
> case for this, do you have an example?

See the GH issue. Cloudstack has such a use-case: it has 'service VMs' that
are assigned FloatingIPs in the same prefix as the tenant VMs. The service
and tenant VMs are supposed to be able to communicate as if they're in the
same L2 even though we have them in different L2 bridges.

ProxyArp probably requires ArpCache entries to allow the IP addresses to be prefixes
instead of /32 addresses. Galo, could you make a note of that somewhere in your doc?
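
To show what prefix entries could mean for lookups, here is a minimal sketch that resolves an IP by longest-prefix match, so a /32 entry for a local VM wins over a broader proxy entry that points at the router's MAC. The class and method names are hypothetical, and longest-prefix match is my assumption about how overlapping entries would be resolved; it uses only Python's standard `ipaddress` module.

```python
import ipaddress


class PrefixArpCache:
    """Hypothetical ARP cache whose keys are IP prefixes rather than single IPs."""

    def __init__(self):
        self._entries = {}  # ipaddress network object -> MAC string

    def add(self, prefix, mac):
        # A bare address such as "10.0.1.7" is stored as a /32 network.
        self._entries[ipaddress.ip_network(prefix)] = mac

    def lookup(self, ip):
        """Return the MAC of the most specific prefix containing ip, or None."""
        addr = ipaddress.ip_address(ip)
        matches = [net for net in self._entries if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self._entries[best]
```

With a proxy entry for `10.0.1.0/24` set to the router's MAC and a /32 entry for one VM, a lookup for that VM returns its own MAC while every other address in the prefix resolves to the router.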

> > > Physical bridge equivalent
> > > How does this ARP caching behavior compare to physical bridges? For
> > > example, if I locally (e.g. using ifconfig on the machines) change VM A to
> > > take VM B's IP, and VM B to take VM A's IP, I'd imagine a physical bridge
> > > would realize that quickly and adapt. If we set ARP entries statically and
> > > permanently, could that result in worse behavior (never adapts to the IP
> > > address switch)? Or is changing your IP address statically / locally on the
> > > VM just a weird case?
> > > 
> > 
> > 
> > It's probably acceptable not to handle this case from the cloud integration
> > point of view, as it's the same with EC2, and it hasn't seemed to cause
> > issues. Can midolman handle this case without any explicit API calls right
> > now?
> > 
> Bridges don't keep ARP tables right now so they wouldn't notice.
> Routers do; they would start dropping packets since the cached IP
> wouldn't match the new one. The network would eventually pick up
> the changes once the old ARP mappings expire.
> Initially bridges should behave similarly since we're talking about
> having a single implementation of the ARP cache shared by all devices.
> How we update the entries will really depend on your expectations. We can
> 1. Learn from ZK port config first, upon expiration renew by ARP
> snooping or by requerying ZK
> 2. Listen to ZK port config updates
> 3. Offer an API method to manually trigger invalidates / renewal of
> cached entries for a given port / mac / ip
> ..
> What behaviour would you expect? Should the ARP cache react
> immediately or is it ok to wait? Will this situation happen
> frequently?
> Cheers,
> /g
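
For reference, options 1 and 3 in the list above can be sketched together as a cache with expiring entries, a renewal callback and a manual invalidate. This is only an illustration under stated assumptions: the class name, the `requery` callback (standing in for a read of the ZK port config) and the injectable clock are all hypothetical, and option 2 (listening to ZK updates) would simply call `learn` from a watcher.

```python
import time


class ExpiringArpCache:
    """Hypothetical IP -> MAC cache with TTL expiry and on-demand renewal."""

    def __init__(self, ttl_seconds, requery, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._requery = requery  # assumed callback: ip -> MAC or None
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # ip -> (mac, expires_at)

    def learn(self, ip, mac):
        """Record a mapping from port config or from a snooped ARP packet."""
        self._entries[ip] = (mac, self._clock() + self._ttl)

    def invalidate(self, ip):
        """Option 3: drop an entry so the next lookup has to renew it."""
        self._entries.pop(ip, None)

    def lookup(self, ip):
        entry = self._entries.get(ip)
        if entry is not None and entry[1] > self._clock():
            return entry[0]
        # Missing or expired: renew through the callback (option 1).
        mac = self._requery(ip)
        if mac is None:
            self._entries.pop(ip, None)
            return None
        self.learn(ip, mac)
        return mac
```

In this sketch a changed mapping is only noticed after the TTL runs out or after an explicit `invalidate`, which is the trade-off behind the questions about reacting immediately versus waiting.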
