[MidoNet-dev] Feature Proposal: Tunnel Health checks

Dan Mihai Dumitriu dan at midokura.com
Thu Feb 21 07:26:48 UTC 2013


I think I prefer having metrics for each tunnel's RTT.  Yes, it's O(N^2)
messages and state, which is not great.  However, MN could then really know
and show the state of the underlay network to the operators, on a pairwise
basis.  And if there is path diversity because each host is multihomed, MN
can even decide which tunnel to use, based on the tunnel 'health'.
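
As an illustration only, a rough Java sketch of what pairwise RTT bookkeeping
could look like in an agent. The sendProbePacket/exportRttMetric hooks, the
probe format and the peer identifiers are assumptions, not existing MidoNet
code:

    import java.util.Map;
    import java.util.UUID;
    import java.util.concurrent.ConcurrentHashMap;

    /** Hypothetical per-agent bookkeeping of pairwise tunnel RTTs. */
    public class TunnelRttMonitor {

        // probe id -> send timestamp (nanoseconds)
        private final Map<UUID, Long> pendingProbes = new ConcurrentHashMap<UUID, Long>();
        // peer host id -> last measured RTT (nanoseconds)
        private final Map<String, Long> lastRttNanos = new ConcurrentHashMap<String, Long>();

        /** Called periodically for every peer this host tunnels to: O(N) per host, O(N^2) overall. */
        public UUID probe(String peerHost) {
            UUID probeId = UUID.randomUUID();
            pendingProbes.put(probeId, System.nanoTime());
            // sendProbePacket(peerHost, probeId);   // assumed hook: small probe sent over the tunnel
            return probeId;
        }

        /** Called when the peer echoes the probe back. */
        public void onProbeReply(String peerHost, UUID probeId) {
            Long sentAt = pendingProbes.remove(probeId);
            if (sentAt != null) {
                long rtt = System.nanoTime() - sentAt;
                lastRttNanos.put(peerHost, rtt);
                // exportRttMetric(peerHost, rtt);   // assumed hook: push to the metrics store
            }
        }

        /** With multihomed hosts, the tunnel with the lowest recent RTT could be preferred. */
        public Long rttTo(String peerHost) {
            return lastRttNanos.get(peerHost);
        }
    }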


On Thu, Feb 21, 2013 at 2:25 AM, de Palol, Marc <marc at midokura.jp> wrote:

> comments inline
>
>> Hi Marc, thanks for the feedback.
>>
>> If I understand your suggestion correctly, the problem I see is that
>> the ephemeral-node-based solution will signal problems on the hosts,
>> but not on the wire. Two peers A, B might be alive and have
>> connectivity with ZK, and the tunnel might be technically alive and
>> working A->B, but failing B->A. In this situation, the ephemeral nodes
>> wouldn't be deleted. Does that make sense?
>>
>>
> I was not talking about having the host itself in zk; that way we would
> have the problem you mention. I was thinking of having a connection to zk
> (as an ephemeral node) tied to the tunnel itself.
> But after reading your e-mail I think that the "hey, there is no incoming
> data on our tunnels" solution you are proposing is far easier to understand
> and implement. This zk solution was a bit of overkill.
>
>
>> On a side note, and more directed to @Adam:
>>
>> Pino and I were discussing yesterday whether we really need that much
>> granularity in the diagnostics.
>>
>> Our current proposal gives value under the assumption that each host
>> depends on specific network conditions and therefore it's important to
>> report exactly what host, what tunnel, and what direction seems to be
>> dead.
>>
>> But in practice, a bunch of hosts will depend on the same network
>> conditions (e.g. all are in the same subnet). Therefore, if the subnet
>> is unreachable from outside, all their tunnels will be dead, so
>> reporting the problem for each tunnel is simply redundant.
>>
>> With this in mind, it may be enough to implement a much simpler
>> proposal whereby every MM agent would simply report whether it's
>> receiving inbound data on its tunnel ports. When it isn't, having all
>> hosts on the subnet saying "hey, there is no incoming data on our
>> tunnels" is probably good enough to know where to investigate. For
>> example, would this have been enough in the Netflix PoC?
>>
>> What do you think?
>>
>> Thanks!
>> /g
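
To make the "good enough to know where to investigate" point concrete, a
small illustrative sketch of how such per-host reports could be correlated
on the operator side. The report source and the grouping by subnet are
assumptions, not an existing interface:

    import java.util.List;
    import java.util.Map;

    /** Illustrative only: correlate per-host "no inbound data on tunnel ports" reports by subnet. */
    public class TunnelSilenceCorrelator {

        /** reportsBySubnet: subnet -> one flag per host, true if that host sees no inbound tunnel data. */
        public static void summarize(Map<String, List<Boolean>> reportsBySubnet) {
            for (Map.Entry<String, List<Boolean>> entry : reportsBySubnet.entrySet()) {
                boolean allSilent = !entry.getValue().isEmpty();
                for (Boolean hostSilent : entry.getValue()) {
                    allSilent &= hostSilent;
                }
                if (allSilent) {
                    // Every host in the subnet reports silence: investigate the subnet's
                    // reachability rather than each individual tunnel.
                    System.out.println("Investigate reachability of subnet " + entry.getKey());
                }
            }
        }
    }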
>>
>> On 19 February 2013 23:12, de Palol, Marc <marc at midokura.jp> wrote:
>> > Hi all,
>> >
>> > I agree that the results will need to be stored in Cassandra, for the
>> > reasons Galo gave: the metrics are there and the GUI already knows how
>> > to get them.
>> >
>> > About this 'are you there' problem: I wonder if we could use zookeeper's
>> > ephemeral nodes. These nodes exist in zookeeper as long as the session
>> > that created them still exists. We could create a znode for every
>> > tunnel, tied to the session. If a tunnel disappears or stops working,
>> > the ephemeral node disappears (I don't know exactly how; we should work
>> > out the details here). There could be some watchers set in place to
>> > notify whoever is responsible for recreating the tunnel.
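
For illustration, a minimal sketch of that idea using the plain ZooKeeper
client. The /tunnels path layout, how "tunnel stops working" would translate
into the owning session closing, and the recreateTunnel hook are assumptions
the thread itself leaves open:

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs.Ids;
    import org.apache.zookeeper.ZooKeeper;

    public class TunnelZnodes {

        /** Register a tunnel as an ephemeral znode; it vanishes if the owning ZK session dies. */
        public static void register(ZooKeeper zk, String localHost, String peerHost)
                throws KeeperException, InterruptedException {
            String path = "/tunnels/" + localHost + "-" + peerHost;   // path layout is an assumption
            zk.create(path, new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        }

        /** Watch a tunnel's znode; a NodeDeleted event would notify whoever recreates the tunnel. */
        public static void watch(ZooKeeper zk, String path)
                throws KeeperException, InterruptedException {
            zk.exists(path, new Watcher() {
                public void process(WatchedEvent event) {
                    if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                        // recreateTunnel(...);   // assumed hook
                    }
                }
            });
        }
    }

Note that, as pointed out earlier in the thread, the znode only tracks the
owning ZK session, not whether packets actually flow in each direction of
the tunnel.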
>> >
>> >
>> > On Tue, Feb 19, 2013 at 5:27 PM, Navarro, Galo <galo at midokura.jp> wrote:
>> >>
>> >> Just to clarify after talking with Guillermo:
>> >>
>> >> Even though we don't need to actively listen for an ACK (for the
>> >> reasons explained before), we do need to implement a mechanism to
>> >> receive and reply to "are-you-there" messages arriving on one side of
>> >> the tunnel.
>> >>
>> >> /g
>> >>
>> >> On 19 February 2013 16:50, Navarro, Galo <galo at midokura.jp> wrote:
>> >> > On 19 February 2013 16:31, Guillermo Ontañón <guillermo at midokura.jp> wrote:
>> >> >> On Tue, Feb 19, 2013 at 4:21 PM, Navarro, Galo <galo at midokura.jp> wrote:
>> >> >>>
>> >> >>> Hi Guillermo, thanks for the quick feedback! Some comments below
>> >> >>>
>> >> >>> >> - TunnelPorts become active on each side of the tunnel, the TunnelDoc
>> >> >>> >>   becomes aware of local ports and starts taking care of them.
>> >> >>> >> - Regularly, for each cared-for tunnel the TunnelDoc:
>> >> >>> >>     - Sends a packet to the other peer
>> >> >>> >>     - Logs variation on RX value of the PortStats on the tunnel's
>> >> >>> >>       local port.
>> >> >>> >>     - If variation = 0, increment a "no-increment" counter
>> >> >>> >>     - If "no-increment" counter > threshold, trigger alert message
>> >> >>> >>       for lack of connectivity on the REVERSE direction of the tunnel
>> >> >>> >>       (e.g.: if the TunnelDoc at A spots no RX, the alert refers to
>> >> >>> >>       loss of connectivity from B to A).
>> >> >>> >>     - Implement whatever corrective measures upon receiving the alert
>> >> >>> >>       (typically, the DatapathController could recreate the tunnel)
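
A compact sketch of that per-tunnel loop, illustrative only: the PortStats
source, the probe sender and the alert hook are placeholders, and the class
name simply follows the proposal's "TunnelDoc" wording:

    /** One instance per cared-for tunnel, driven on a fixed interval. */
    public class TunnelDoc {

        private final int threshold;        // cycles without RX growth before alerting
        private long lastRxPackets = -1;    // previous RX counter of the tunnel's local port
        private int noIncrementCount = 0;

        public TunnelDoc(int threshold) {
            this.threshold = threshold;
        }

        /** rxPackets: current RX counter of the local tunnel port's PortStats (source assumed). */
        public void onInterval(long rxPackets) {
            // sendPacketToPeer();   // assumed hook: probe sent to the other end of the tunnel

            long variation = (lastRxPackets < 0) ? 1 : rxPackets - lastRxPackets;
            lastRxPackets = rxPackets;

            if (variation == 0) {
                noIncrementCount++;
            } else {
                noIncrementCount = 0;
            }

            if (noIncrementCount > threshold) {
                // No RX here means the REVERSE direction looks dead: at A this is a B->A
                // alert, and the DatapathController could react by recreating the tunnel.
                // raiseReverseDirectionAlert();   // assumed hook
                noIncrementCount = 0;
            }
        }
    }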
>> >> >>>
>> >> >>> > This is not a lot of extra traffic, but the number of tunnels does
>> >> >>> > grow quadratically with the number of MM agents. I propose a slight
>> >> >>> > variation on the above to avoid sending traffic on non-idle tunnels,
>> >> >>> > along the lines of what is done by IPsec's dead peer detection:
>> >> >>> >
>> >> >>> > http://www.ietf.org/rfc/rfc3706.txt
>> >> >>> >
>> >> >>> > Basically, from the point of view of one of the nodes, it looks like
>> >> >>> > this:
>> >> >>> >
>> >> >>> >    * Monitor idleness (by looking at RX as you outline above), do
>> >> >>> >      nothing, and consider the tunnel healthy while idleness doesn't
>> >> >>> >      go above a certain threshold.
>> >> >>> >    * When the tunnel becomes idle, send an "are-you-there" packet to
>> >> >>> >      the peer (we could just use the tunnel-key for this).
>> >> >>> >    * When an "are-you-there" packet is received, reply to it with an
>> >> >>> >      Ack.
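
A sketch of this variation, with the same caveats (hypothetical hooks and
message format). The difference from the previous loop is that the probe is
only sent once the tunnel has gone idle, and that an incoming are-you-there
is always answered:

    /** Per-tunnel state for the dead-peer-detection-style variant (illustrative only). */
    public class DpdStyleTunnelCheck {

        private final int idleThreshold;    // idle cycles tolerated before probing
        private long lastRxPackets = -1;
        private int idleCycles = 0;

        public DpdStyleTunnelCheck(int idleThreshold) {
            this.idleThreshold = idleThreshold;
        }

        /** Driven on a fixed interval with the tunnel port's current RX counter. */
        public void onInterval(long rxPackets) {
            boolean idle = (lastRxPackets >= 0 && rxPackets == lastRxPackets);
            lastRxPackets = rxPackets;
            idleCycles = idle ? idleCycles + 1 : 0;

            if (idleCycles > idleThreshold) {
                // Only now does any probe traffic go out; non-idle tunnels cost nothing.
                // sendAreYouThere();   // assumed hook, e.g. a probe identified by the tunnel-key
            }
        }

        /** Incoming probes must always be answered. */
        public void onAreYouThere(long tunnelKey) {
            // sendAck(tunnelKey);   // assumed reply hook
        }
    }

Whether the sender also needs to track the Ack itself is exactly what the
rest of the thread discusses below.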
>> >> >>>
>> >> >>> This is definitely better. I messed up the copy-pastes badly, but the
>> >> >>> idea was basically what you explain: the "send packet to the other
>> >> >>> peer" step would be conditioned on several cycles without an
>> >> >>> increment of the "no-data-increment" counter.
>> >> >>
>> >> >>
>> >> >>
>> >> >> But I think that for this to work you need the 'ack' reply; would it
>> >> >> be included? Otherwise a host may be receiving traffic (non-idle) but
>> >> >> not sending, and would never send any 'are-you-there' packets to the
>> >> >> other side because its RX is increasing.
>> >> >
>> >> > But note that A is only monitoring *incoming* connectivity (B->A).
>> >> > This is because once the packet leaves A, A's agent can tell that
>> >> > something is broken on the line, but not in which direction (is the
>> >> > PING lost because A->B is cut, or the ACK lost because B->A is cut?).
>> >> > We need to report the health of each direction.
>> >> >
>> >> > So, A doesn't care about A->B. It only asserts that data is arriving
>> >> > from B. With this in mind, once A's agent sends the "are-you-there"
>> >> > message it doesn't really need to pay attention to the ACK.
>> >> >
>> >> > On the other side, B will do the same in reverse. If the
>> >> > "are-you-there" never arrives because A->B is broken, B will notice
>> >> > the static RX count and start a health check of the A->B direction.
>> >> >
>> >> > Does that make sense?
>> >> > /g
>> >
>> >
>>
>
>

