[MidoNet-dev] Feature Proposal: Resource Tagging

Navarro, Galo galo at midokura.com
Wed Mar 13 15:50:24 UTC 2013


Hi Pino et al.,

Just to add some info on the redundancy point: Solr can provide replication
for Lucene (I used this a few years ago, when it was really just based on
rsync, but it seems to have grown up a bit since :). There is info at
https://wiki.apache.org/solr/SolrReplication.

cheers,
/g

On 13 March 2013 14:41, Pino de Candia <gdecandia at midokura.com> wrote:

>  Hi Ryu,
>
> thanks for the great write-up!
>
> Something was bothering me about this, but I had to sleep on it to figure
> it out: before we commit to the scan-ZK/index-in-memory approach, I'd like
> to compare it with keeping the relations in a data store because:
> - I'd like to keep them outside of ZK (minor point though)
> - I'd like to be able to examine them without the API server, and to be
> able to run more than one API server.
>
> Can Cassandra serve this purpose? Doesn't it have most of the features
> you're looking for? The only comparison points I can come up with are:
> - inconsistency window with Cassandra vs. no server redundancy/scalability
> with Lucene.
> - adding relationships after the fact without triggering ZK watchers.
> - ease of implementation - not sure which one wins.
> - API server startup speed - I think the Cassandra approach wins
> - query speed - the Lucene approach definitely wins.
>
> Separately, and it goes for whatever approach we take - assuming we have
> to migrate Netflix to this new model... how will that work?
>
> Finally, overall I think this is a great idea - we really, really have to
> have some search capabilities in the API.
>
> thanks,
> Pino
>
>
> On Friday, March 8, 2013 at 10:12 AM, Ishimoto, Ryu wrote:
>
> Hi Devs,
>
> I have started writing down my proposal for resource tagging (phase 1) in
> wiki:
> https://sites.google.com/a/midokura.jp/wiki/midonet/resource-tagging
>
> There are still some TODOs in the document because I need to consult the
> Lucene developers we are planning to do this project with for more
> information, but I thought I had enough to get started.
>
> Feedback appreciated!
>
> Best,
> Ryu
>
>
> ---------------------------------------------------------------------------------
>
> Resource Tagging - Phase 1
> This document proposes the integration of Lucene
> (http://lucene.apache.org/core/), an open-source indexing library, to
> enhance the search capabilities of the MidoNet API.  In addition, it
> illustrates the improvements to the current Zookeeper directory structure
> brought about by the Lucene integration.
>
> *Introduction*
>
> MidoNet resources are stored in Zookeeper, and for those that need to be
> searched by clients (integration projects such as OpenStack or CloudStack,
> and the MidoNet Control Panel), index directories are created in Zookeeper
> during the creation process and cleaned up during the deletion process by
> the MidoNet code.  This has proven to be highly inefficient because it
> requires a non-trivial amount of development work just to add search
> capabilities for the resources.  Furthermore, numerous Zookeeper
> directories are created only for indexing purposes, which increases the
> number of directories unneeded by Midolman and makes the data difficult to
> inspect.
>
> *Assumptions*
>
> Only one instance of the MidoNet API server runs at any time.  The
> detailed reasons why running multiple instances is difficult with Lucene
> integration are explained in a later section.
>
> It is currently planned that some parts of the project will be developed
> by external Lucene developers.  Thus, some parts of the documentation are
> incomplete because they require further consultation with them.  They will
> be filled in later.
>
> *Search capabilities*
>
> *Searchable items*
>
> The end goal is to implement search by all fields of all resources.
> However, in this section, only those exposed by the API are mentioned.
>
> Router, Bridge, Port, Port Group, Chain, Rule, Route, BGP, AdRoute, DHCP
> Subnet, DHCP Host, Host, and Tunnel Zone resource types are searchable.
> They are searchable by their *id*, *tags*, *properties*, and the unique
> identifier of their parent resource.  The search scope is per resource
> type (i.e. search by a tag for a particular resource type).
>
> *id* is the unique identifier of each resource.  Most of them are of type
> UUID.  DHCP Host, however, does not have an *id* field, and it requires a
> combination of bridge ID and DHCP subnet (IP prefix/len) to uniquely
> identify the resource.
>
> *Tags* are a list of arbitrary strings associated with each resource
> object.  They can be set by external clients via the MidoNet API.  Up to a
> configurable maximum number of tags can be added to each resource.  *tags*
> are not considered first-class citizens by either Midolman or the MidoNet
> API.  They are meant to be data managed outside of MidoNet, by clients
> such as OpenStack and the MidoNet Control Panel.
>
> *Properties* are key-value pairs of strings associated with each resource
> object.  These are used internally and cannot be accessed by external
> clients directly.  The data stored in *properties* is not relevant to
> Midolman, but it is relevant to the MidoNet API.  Examples of *properties*
> are:
>
>    - *tenant_id*: ID of the resource owner.  This is the ID used to
>    perform authentication and authorization by MidoNet API.
>
> Parent resource ID is used to search for a list of sub-resources belonging
> to the parent resource.  For example, you could do a search for a list of
> router ports given the ID of the router.   The following sub-resource
> searches are supported:
>
>    - router -> ports
>    - bridge -> ports
>    - chain -> rules
>    - router -> routes
>    - bgp -> ad_routes
>    - bridge -> dhcp_subnet
>    - dhcp_subnet -> dhcp_host
>
> TODO: add mac table and arp cache when they are done
>
> A parent-child relationship can be many-to-many, in which case each
> resource type can be queried given the ID of the other resource.  For
> example, tunnel zones can be searched by host ID and hosts can be searched
> by tunnel zone ID.  The following list shows the resources with such a
> relationship:
>
>    - tunnel_zone <-> host
>    - port_group <-> port
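
A minimal sketch of the two-way lookup described above, assuming a plain
in-memory structure rather than Lucene itself. The class and method names
(ManyToManyIndex, link, rights_of, lefts_of) are hypothetical and do not
come from the proposal; "left" and "right" stand for the two sides of a
relation such as tunnel_zone <-> host.

```python
from collections import defaultdict


class ManyToManyIndex:
    """Hypothetical in-memory index for one many-to-many relation."""

    def __init__(self):
        # left id -> set of right ids (e.g. tunnel zone -> hosts)
        self._forward = defaultdict(set)
        # right id -> set of left ids (e.g. host -> tunnel zones)
        self._reverse = defaultdict(set)

    def link(self, left_id, right_id):
        """Record the relation in both directions."""
        self._forward[left_id].add(right_id)
        self._reverse[right_id].add(left_id)

    def unlink(self, left_id, right_id):
        """Remove the relation; a no-op if it was never recorded."""
        self._forward[left_id].discard(right_id)
        self._reverse[right_id].discard(left_id)

    def rights_of(self, left_id):
        """All right-side ids related to the given left-side id."""
        return sorted(self._forward.get(left_id, ()))

    def lefts_of(self, right_id):
        """All left-side ids related to the given right-side id."""
        return sorted(self._reverse.get(right_id, ()))


zones_hosts = ManyToManyIndex()
zones_hosts.link("tz1", "host1")
zones_hosts.link("tz1", "host2")
zones_hosts.link("tz2", "host1")
print(zones_hosts.rights_of("tz1"))   # hosts in tunnel zone tz1
print(zones_hosts.lefts_of("host1"))  # tunnel zones containing host1
```

Keeping both directions updated in link/unlink is what lets either resource
type be queried by the ID of the other without scanning.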
>
>
> *Pagination*
>
> The pagination feature is built into Lucene.
>
> TODO:  Explain the actual pagination mechanism implemented by Lucene and
> the corresponding fields introduced in MidoNet API.
>
> *Sorting*
>
> The sorting feature is built into Lucene.
>
> TODO:  Explain the actual sorting mechanism implemented by Lucene and the
> corresponding fields introduced in MidoNet API.
>
>
> *Indexing*
>
> *RAM Index*
>
> Lucene allows indices to be stored in memory, and this is the mode used by
> the MidoNet API.  The indices will be stored in the memory of the same
> host that the API server runs on.  This means, however, that if multiple
> API instances are running, the indices must be replicated across all of
> them.  A typical deployment scenario could be multiple API servers
> running behind a load balancer.  Indices that are out of sync among them
> would expose incorrect behavior to clients.  This problem will be the
> main focus of Phase 2.
>
> TODO: Explain the details of how Lucene stores indices in RAM.
>
> *MidoNet API Server Start*
>
> The Zookeeper resource data is scanned and indexed when the MidoNet API
> server starts.  A failure in the indexing process causes the server to
> shut down.  In a single API server deployment, no new indices should be
> introduced while this startup indexing is in progress.
>
> TODO: Explain the details of how Lucene indexes Zookeeper data.
>
> *New Resource Types*
>
> When new resource types are added to the system, the indexer is required
> to pick them up quickly and with little additional effort.
>
> TODO: Explain the actual implementation to achieve this requirement.
>
> *API Changes*
>
> *General Assumptions*
>
> When searching by unique resource ID or by parent resource ID, the API
> remains unchanged from the current version.  For all other types of search,
> query strings are used to filter resources.
>
>
> *Tenant Property Search URI*
>
> To search by the *tenant_id* property, use the URI template given in the
> Application resource response.
>
> Method: GET
> Accept: vnd.org.midonet.Application.v1+json
> URI: http://api.example.com/
>
> => {"tenant_routers_template":
>             "http://api.example.com/routers?tenant_id={tenant_id}",
>     "tenant_bridges_template":
>             "http://api.example.com/bridges?tenant_id={tenant_id}",
>     "tenant_chains_template":
>             "http://api.example.com/chains?tenant_id={tenant_id}",
>     "tenant_port_groups_template":
>             "http://api.example.com/port_groups?tenant_id={tenant_id}",
>     ...}
>
> The actual URI can be constructed by replacing '{tenant_id}' with the
> actual tenant ID value:
>
> Method: GET
> Accept: vnd.org.midonet.Router.collection.v1+json
> URI: http://api.example.com/routers?tenant_id=foo
>
> => [{"id": "router1", "tenant_id": "foo", ...},
>     {"id": "router3", "tenant_id": "foo", ...}]
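
A client-side sketch of the template expansion described above. The helper
name expand_template is hypothetical; percent-encoding the substituted
value is my assumption (the proposal only says to replace the token), but
it keeps IDs containing reserved characters from breaking the query string.

```python
from urllib.parse import quote


def expand_template(template, **values):
    """Replace each '{name}' token with its percent-encoded value."""
    uri = template
    for name, value in values.items():
        token = "{" + name + "}"
        if token not in uri:
            raise KeyError(f"template has no {token} token")
        # safe="" also encodes '/', so a value cannot add path segments.
        uri = uri.replace(token, quote(str(value), safe=""))
    return uri


routers_template = "http://api.example.com/routers?tenant_id={tenant_id}"
print(expand_template(routers_template, tenant_id="foo"))
```

The same helper would apply to any other template returned by the API that
uses the '{name}' token style.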
>
> *Tag URIs*
>
> The *tags* location can be discovered from the *tags* URI field in the
> response of every resource object:
>
> Method: GET
> Accept: vnd.org.midonet.Router.v1+json
> URI: http://api.example.com/routers/1
>
> => {"id": "router1", "tags": "http://api.example.com/routers/1/tags",
>     "tag_template": "http://api.example.com/routers/1/tags/{tag}"}
>
> *tag_template* contains a '{tag}' token, which should be replaced with the
> actual tag to construct the URI for the DELETE operation.
>
> *Tag Media Type*
>
> The new media types *vnd.org.midonet.Tag.v1+json* and
> *vnd.org.midonet.Tag.collection.v1+json* represent a tag object and a
> collection of tag objects, respectively.  The *tag* media type has the
> following fields:
>
> *Name*   *Type*        *Description*
> tag    String      Tag of the resource
> uri    URI         URI representing the location of this tag
>
> *Tag Search queries*
>
> Doing a GET on the *tags* URI  returns all the tags associated with the
> resource:
>
> Method: GET
> Accept: vnd.org.midonet.Tag.collection.v1+json
> URI: http://api.example.com/routers/1/tags
>
> => [{"tag": "foo", "uri": "http://api.example.com/routers/1/tags/foo"},
>     {"tag": "bar", "uri": "http://api.example.com/routers/1/tags/bar"}]
>
> Doing a POST on the *tags* URI adds a new tag:
>
> Content-type: vnd.org.midonet.Tag.v1+json
> Method: POST
> URI: http://api.example.com/routers/1/tags
>
> Body:
> {"tag": "foo"}
>
> * This adds a new tag, 'foo', to this router.
>
> Doing a DELETE on the URI of an individual tag deletes the tag:
>
> Method: DELETE
> URI: http://api.example.com/routers/1/tags/foo
>
> * This deletes a tag, 'foo', from this router.  It is an idempotent
> operation.
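
A small model of the tag operations above, under stated assumptions: the
class name ResourceTags, the TagLimitExceeded error, and the default limit
of 10 are all hypothetical (the proposal only says the maximum is
configurable). Treating a POST of an already-present tag as a no-op is also
my assumption; the proposal only states that DELETE is idempotent.

```python
class TagLimitExceeded(Exception):
    """Raised when adding a tag would exceed the configured maximum."""


class ResourceTags:
    """Hypothetical per-resource tag set with a configurable maximum."""

    def __init__(self, max_tags=10):  # default value is illustrative only
        self._max_tags = max_tags
        self._tags = set()

    def add(self, tag):
        """Model of POST on the tags URI."""
        if tag in self._tags:
            return  # assumption: re-adding an existing tag changes nothing
        if len(self._tags) >= self._max_tags:
            raise TagLimitExceeded(f"at most {self._max_tags} tags allowed")
        self._tags.add(tag)

    def remove(self, tag):
        """Model of DELETE on a tag URI; idempotent by construction."""
        self._tags.discard(tag)

    def list(self):
        """Model of GET on the tags URI."""
        return sorted(self._tags)


router_tags = ResourceTags(max_tags=2)
router_tags.add("foo")
router_tags.add("bar")
router_tags.remove("foo")
router_tags.remove("foo")  # second delete is harmless
print(router_tags.list())
```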
>
>
> *Resource search by tags*
>
> To search resources by tags:
>
> Method: GET
> Accept: vnd.org.midonet.Router.collection.v1+json
> URI: http://api.example.com/routers?tag=os-router&tag=os-router-id|foo
>
> => [{"id": "router2", ...}, {"id": "router3", ...}, ...]
>
> This query would return a list of routers that have tags 'os-router' and
> 'os-router-id|foo'.
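
A sketch of how a client might build the multi-tag query above, plus a
plain-Python model of the matching rule. The function names are
hypothetical. Two assumptions: the '|' character is percent-encoded as %7C
on the wire, and a resource matches only if it carries every requested tag,
which is how I read the example.

```python
from urllib.parse import urlencode


def tag_search_uri(collection_uri, tags):
    """Build '<collection>?tag=a&tag=b' with each value percent-encoded."""
    query = urlencode([("tag", tag) for tag in tags])
    return f"{collection_uri}?{query}"


def filter_by_tags(resources, required_tags):
    """Keep resources whose 'tags' list contains all required tags."""
    required = set(required_tags)
    return [r for r in resources if required <= set(r.get("tags", ()))]


print(tag_search_uri("http://api.example.com/routers",
                     ["os-router", "os-router-id|foo"]))

routers = [
    {"id": "router1", "tags": ["os-router"]},
    {"id": "router2", "tags": ["os-router", "os-router-id|foo"]},
]
print([r["id"] for r in filter_by_tags(routers,
                                       ["os-router", "os-router-id|foo"])])
```

The filter only models the expected result set; in the proposal the actual
matching would be done by the Lucene index, not by a linear scan.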
>
> TODO: Give examples for sort and pagination.
>
> *Zookeeper directories*
>
> *Removing indexing directories*
>
> There are Zookeeper directories that exist only to provide indexing, and
> they are no longer necessary once Lucene does the indexing.   The following
> is a list of directories that can be removed:
>
>  /tenants
>       /bridge-names
>       /port_group-names
>       /chain-names
>       /router-names
>
>
> *Removing unneeded directories*
>
> There may be other Zookeeper directories that can be removed after Lucene
> takes over the API queries.  For example, there are directories accessible
> by IDs:
>
> /resource/<id>
>
> These may no longer be necessary because ID searches would no longer need
> to go to Zookeeper.  The end goal is to have the Zookeeper directories
> contain only the minimum amount of data required by Midolman.
>
> This type of clean-up will be left for Phase 2.
>  _______________________________________________
> MidoNet-dev mailing list
> MidoNet-dev at lists.midonet.org
> http://lists.midonet.org/listinfo/midonet-dev