Subject:Re: [Openstack] [keystone] v3 API draft (update and questions to the community)
From:Jay Pipes (jayp@gmail.com)
Date:Jun 12, 2012 10:30:49 am
List:net.launchpad.lists.openstack

On 06/12/2012 12:21 PM, Adam Young wrote:

On 06/12/2012 04:24 AM, Gabriel Hurley wrote:

That said, we have also considered the case you propose where you effectively "request everything and handle it on the client-side"... however, I see that as a tremendously lazy solution. On the service-provider end you have access to powerful database methods that can do these operations in fractions of the time the client-side can (especially with good indexes, etc.). And if you've ever worked in mobile applications you'll know that minimizing data across the wire is crucial. The only argument I've heard in favor of that is basically "it's easier for us not to add API features".

At the expense of loading your database. Server-side paging and filtering both require one of two things: caching or additional database queries, and both increase your server footprint. For small datasets, or for limited queries, this is not a problem, but for scalability you want to limit the work you do on the server.

This is actually incorrect for relational databases. Passing filter and pagination/offset parameters in the API allows *more efficient* queries to be executed on the database server (given proper indexing, of course...). Not passing in these parameters in the API means more full table scans, more rows transmitted over the wire, and more work done by the database server.
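To make the comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module. The users table, its columns, the index, and both function names are hypothetical and are not Keystone's schema or API. The first function pushes the filter and the page window into the query; the second is the "request everything and handle it on the client side" approach.

```python
import sqlite3

# Hypothetical schema for illustration only; not Keystone's actual tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, tenant_id TEXT)"
)
# An index covering the filter column and the sort key lets the database
# seek to the matching rows instead of scanning the whole table.
conn.execute("CREATE INDEX ix_users_tenant ON users (tenant_id, id)")
conn.executemany(
    "INSERT INTO users (name, tenant_id) VALUES (?, ?)",
    [("user%d" % i, "tenant%d" % (i % 10)) for i in range(1000)],
)


def list_users_server_side(conn, tenant_id, limit, offset):
    """Filter and paginate inside the database; only one page is returned."""
    rows = conn.execute(
        "SELECT id, name FROM users "
        "WHERE tenant_id = ? ORDER BY id LIMIT ? OFFSET ?",
        (tenant_id, limit, offset),
    )
    return rows.fetchall()


def list_users_client_side(conn, tenant_id, limit, offset):
    """Fetch every row, then filter and slice in the caller."""
    everything = conn.execute(
        "SELECT id, name, tenant_id FROM users ORDER BY id"
    ).fetchall()
    matching = [(uid, name) for uid, name, tid in everything if tid == tenant_id]
    return matching[offset:offset + limit]


page = list_users_server_side(conn, "tenant3", 5, 10)
```

Both functions return the same five rows, but the client-side version reads and transfers all 1000 rows to produce them.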

For Keystone using the LDAP backend, caching and pagination are extremely expensive, and not something I would like to support.

This is a problem with LDAP, not with SQL backends, which are specifically built for querying in this manner. That said, however, there *are* certain filters that LDAP can deal with pretty well, right? Things like limiting to a specific OU can allow LDAP to winnow results, correct?

An LDAP query is not guaranteed to come back in any particular order, so you can't just do the SQL trick of executing the query at offset + window size. You have to do the equivalent of a Cursor, and this places serious load on the LDAP server and the Keystone server, and possibly impacts other apps dependent on LDAP.

If this is the case, then one solution might be to raise NotImplementedError when an API filter is not supported by a backend, and leave it up to the client to retry the "full set of results" request and do the filtering/pagination itself?
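A rough sketch of that fallback, in Python. Everything here is hypothetical: LdapLikeBackend, SUPPORTED_FILTERS, and list_users_with_fallback are illustrative names, not Keystone code, and the backend is an in-memory stand-in rather than a real LDAP client.

```python
class LdapLikeBackend:
    """Hypothetical backend that can only filter on 'ou' natively."""

    SUPPORTED_FILTERS = {"ou"}

    def __init__(self, users):
        self._users = users

    def list_users(self, filters=None):
        filters = filters or {}
        unsupported = set(filters) - self.SUPPORTED_FILTERS
        if unsupported:
            # Signal that the caller must filter on its own side.
            raise NotImplementedError(
                "unsupported filters: %s" % sorted(unsupported)
            )
        return [
            u for u in self._users
            if all(u.get(k) == v for k, v in filters.items())
        ]


def list_users_with_fallback(backend, filters, limit, offset):
    """Try the filtered request; on NotImplementedError, fetch all and filter."""
    try:
        results = backend.list_users(filters)
    except NotImplementedError:
        everything = backend.list_users()
        results = [
            u for u in everything
            if all(u.get(k) == v for k, v in filters.items())
        ]
    # The backend gives no ordering guarantee, so sort on a stable key
    # before slicing out the requested page.
    results = sorted(results, key=lambda u: u["name"])
    return results[offset:offset + limit]


users = [
    {"name": "carol", "ou": "eng", "enabled": True},
    {"name": "alice", "ou": "eng", "enabled": False},
    {"name": "bob", "ou": "ops", "enabled": True},
    {"name": "dave", "ou": "eng", "enabled": True},
]
backend = LdapLikeBackend(users)
native = list_users_with_fallback(backend, {"ou": "eng"}, 2, 0)
fallback = list_users_with_fallback(backend, {"enabled": True}, 2, 1)
```

The cost of the fallback path is exactly the one under debate: the full result set crosses the wire once before the client can slice it.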

To speak on the specific feature of pagination, the problem of 'corruption' by simultaneous writers is no excuse for not implementing it. You think Google, Facebook, Flickr, etc. etc. etc. don't have this problem? If you consume their feeds you'll notice you can fetch offset-based pagination with ease. You'd never expect to see a navigation element at the bottom of Google search results that said "take me to results starting with the letter m".

There is a major difference. We are working with data that has to be ACID. Google, Facebook and Flickr do not. Before you migrate a VM, you need to know if the host meets the criteria for the VM. If it changes between when you check and when you reserve the space for the VM, you have just overcommitted. "Get it right eventually" does not work for management apps.

This isn't necessarily true. Nova's compute layer goes through a number of steps to ensure a semi-transactional nature to certain operations like resizing. Sometimes a query needs to indicate that it intends to make a reservation of resources (see the quota/reservation system now; this is the SELECT FOR UPDATE paradigm) and other times, the query doesn't care about such things. In the latter case, there aren't expectations that the list returned is 100% accurate according to the state of the database at a particular timestamp of when the transaction occurred. In this case, filters and optimistic pagination work perfectly fine, IMHO.
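The distinction between a reserving query and a plain listing can be sketched as follows. This is not Nova's quota/reservation code. SQLite has no SELECT FOR UPDATE, so the sketch swaps in a different technique with the same goal: a single conditional UPDATE that checks and reserves in one statement. The hosts table and both function names are hypothetical.

```python
import sqlite3

# Hypothetical table for illustration only; not Nova's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hosts (id INTEGER PRIMARY KEY, free_ram_mb INTEGER NOT NULL)"
)
conn.execute("INSERT INTO hosts (id, free_ram_mb) VALUES (1, 4096)")
conn.commit()


def list_hosts(conn, min_free_ram_mb):
    """Optimistic listing: the result may be stale by the time it is used."""
    return conn.execute(
        "SELECT id, free_ram_mb FROM hosts "
        "WHERE free_ram_mb >= ? ORDER BY id",
        (min_free_ram_mb,),
    ).fetchall()


def reserve_ram(conn, host_id, ram_mb):
    """Check and reserve in one statement so the host cannot be overcommitted.

    The WHERE clause re-checks capacity at write time, so a stale listing
    can lead to a failed reservation but never to a negative balance.
    """
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE hosts SET free_ram_mb = free_ram_mb - ? "
            "WHERE id = ? AND free_ram_mb >= ?",
            (ram_mb, host_id, ram_mb),
        )
    return cur.rowcount == 1


first = reserve_ram(conn, 1, 3072)   # capacity is available
second = reserve_ram(conn, 1, 3072)  # would overcommit, so it is refused
```

A listing that only feeds a UI page can use list_hosts and tolerate staleness; only the operation that commits resources needs the guarded write.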

None of this is a case of "someone might use it". The Horizon team has been loudly asking for these features for 8+ months now. And not just from Keystone but from all the projects. I have a list a mile long of API features we need to really deliver a compelling experience. I was just adding some items to it today, in fact.

The rest of your points I have no strong feelings on and generally agree, but when it comes to API features... I feel *very* strongly.

Note that I am not saying "don't do pagination" as I agree, it is essential for good user experience. What I am stating is that we need to be smart about the techniques and technologies we choose, as there is always an upside and a downside.

Sure, agreed. :)

Best, -jay