|Subject:||Re: [Repost] Re: A question about your HttpClient example and beyond|
|From:||Trustin Lee (이희승) (trus...@gmail.com)|
|Date:||Sep 20, 2009 7:17:05 pm|
Thanks, Frederic, for the nice guidance. Happy to see such a great conversation!
— Trustin Lee, http://gleamynode.net/
On Sat, Sep 12, 2009 at 2:44 PM, J. Mi <jmi...@gmail.com> wrote:
You understood my application correctly. Thanks so much for the help. Following your suggestion, I was able to maximize the concurrency by adding a ChannelFutureListener on the connect operation. In its operationComplete(), the channel sends an HTTP request and counts down a CountDownLatch, since I do need to 'join' all responses.
So now, I have used 3 loops (vs previously 4 loops), plus more concurrency, like this:
- Set up the first CountDownLatch
- Loop 1: connect n times with a listener for each connect. In the listener's operationComplete(), the request is sent, the handler is added to the handler list, and the CountDownLatch is counted down.
- Wait on the first CountDownLatch
- Loop 2: use the handler list to retrieve each response.
- Set up the second CountDownLatch
- Loop 3: call channel.getCloseFuture().addListener. In the listener's operationComplete(), simply count down the CountDownLatch.
- Wait on the second CountDownLatch
- bootstrap.releaseExternalResources
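The first half of the flow above (fire n asynchronous operations, let each completion listener record its result and count down, then wait once) can be sketched with the JDK alone. This is only an illustration under assumptions: `connectAndRequest` is a hypothetical stand-in for "connect, then send the request", a `CompletableFuture` callback plays the role of the ChannelFutureListener, and no Netty API is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class LatchFanOut {
    // Hypothetical stand-in for "connect, then send the request"; not a Netty call.
    static CompletableFuture<String> connectAndRequest(int id) {
        return CompletableFuture.supplyAsync(() -> "response-" + id);
    }

    public static List<String> fetchAll(int n) {
        CountDownLatch allAnswered = new CountDownLatch(n);
        ConcurrentLinkedQueue<String> responses = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < n; i++) {
            // The callback plays the role of the listener's operationComplete().
            connectAndRequest(i).whenComplete((body, error) -> {
                if (error == null) {
                    responses.add(body);
                }
                allAnswered.countDown(); // count down on success and on failure
            });
        }
        try {
            // Single "join" point: returns once every operation has completed.
            if (!allAnswered.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for responses");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(responses);
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(5).size() + " responses collected");
    }
}
```

Counting down on failure as well as on success keeps the waiting thread from hanging when one operation fails; the 10-second timeout is an arbitrary value chosen for the sketch.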
I think I'm now observing better performance than before, but only after a threshold. In my case, when serving fewer than 100 requests, multi-threading plus the synchronous Apache HttpClient does better. At 100 requests, they break even. At 200 and 300 requests, Netty does better. My VMware Workstation 6 with CentOS 5.3 and 5 GB of memory on a 4-core desktop cannot handle more than 300 requests in my simple testing application.
Next, I'm going to work on HttpChunkAggregator as you suggested. I'm not sure what else I need to do other than uncommenting that line in the snoop example. I'll look into it.
Again, thanks so much for guiding me as a newbie to this framework.
On Fri, Sep 11, 2009 at 12:30 AM, Frederic Bregier <fred...@free.fr> wrote:
Again, I don't feel able to answer everything, but I will start by answering some points...
One of the interests of the NIO model is the asynchronous part. In your example, if I get it correctly, you do something like this:
For all host/port: connect
For all connects: wait for the connection to finish and send the request
For all connected: wait for the answer to one request, in order
Then you are implementing something halfway between synchronous and asynchronous. I would suggest the following idea (using the full ChannelFuture capability of Netty):
For all host/port: connect and add a ChannelFutureListener for the completion of the connection
In the ChannelFutureListener, send the request => each request will add its result (not necessarily in order) to your ArrayList
Wait for the list to be full (n requests => n answers, or use a countdown latch from the concurrent package)
Then connecting, sending requests and receiving answers can all overlap across the several requests.
What you have done tends to be synchronous, though not completely, since you overlap the connections across all channels; but then you wait until all of them are done... You can see it with this "picture":
you create n tasks (connection)
you wait until all n tasks are done (connected), so a global synchronisation of all threads
then you create n tasks (request)
you wait until all n tasks are done (answered), so again a global synchronisation
What I suggest is:
you create n tasks (connection); they continue by sending the request (no synchronisation)
you wait until all n tasks (connected and answered) are done (in any order), so one global synchronisation, but based on the slowest answer from the remote hosts.
Of course, if you can avoid waiting for all answers to be there and work with each answer one by one, then you can even avoid such a global wait on the slowest answer. But it depends on your business logic there...
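Working with each answer as soon as it arrives, rather than after the slowest one, could look roughly like the sketch below. It is a JDK-only illustration, not Netty code: the submitted tasks are hypothetical stand-ins for remote fetches, and `ExecutorCompletionService` is used here simply because it hands back results in completion order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class HandleAsCompleted {
    // Returns the length of each body, in the order in which the fetches completed.
    public static List<Integer> lengthsInCompletionOrder(List<String> bodies) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<String> completed = new ExecutorCompletionService<>(pool);
        try {
            for (String body : bodies) {
                Callable<String> fetch = () -> body; // hypothetical stand-in for a remote fetch
                completed.submit(fetch);
            }
            List<Integer> lengths = new ArrayList<>();
            for (int i = 0; i < bodies.size(); i++) {
                // take() blocks only until the NEXT answer is ready, not until all are.
                String answer = completed.take().get();
                lengths.add(answer.length()); // "business logic" applied per answer
            }
            return lengths;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller still loops n times, but each iteration can do useful work on one answer while the slower requests are still in flight.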
Reusing a connection is not quite possible in Netty, but there are some handlers/code that allow reconnection (Trustin posted an example to the ML a few days ago).
Now for the chunk part: yes, chunking should be supported by any HTTP server. The reason is that when a request is bigger than 8KB, it is supposed to be chunked. However, Netty has a handler (HttpChunkAggregator) that lets you get the full body (only the body is concerned by chunking) in one ChannelBuffer. This handler accumulates all chunks up to the last one and passes the result to the next handler once it is complete. It is obviously the simplest option for a standard program. However, take care of one thing: if you have 100 requests and each of them carries 1 MB of body, then you will have 100 MB in memory (at least), since it stores all bodies in memory until all their chunks have been decoded. In my work, I use the HTTP codec chunk by chunk, since my business model lets me handle data chunk by chunk and so keep memory usage as low as possible. But if that is not your case, just use the HttpChunkAggregator handler; it works perfectly, and then you can ignore whether the answer is chunked or not. The snoop example shows how to use it.
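The memory trade-off described above can be illustrated with a small accumulator. This is not how HttpChunkAggregator is implemented; `ChunkAccumulator` and its methods are hypothetical names, and the sketch only shows the general idea of buffering every chunk until the last one arrives, plus the 100 x 1 MB arithmetic.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class ChunkAccumulator {
    private final ByteArrayOutputStream body = new ByteArrayOutputStream();
    private boolean complete;

    // Buffers one chunk; returns the full body on the last chunk, otherwise null.
    public byte[] offer(byte[] chunk, boolean last) {
        if (complete) {
            throw new IllegalStateException("body already complete");
        }
        body.write(chunk, 0, chunk.length);
        if (!last) {
            return null; // still waiting: everything received so far stays in memory
        }
        complete = true;
        return body.toByteArray();
    }

    // Bytes currently held for this one response.
    public int bufferedBytes() {
        return body.size();
    }

    // Lower bound on memory when every in-flight response is fully buffered.
    public static long worstCaseBytes(int concurrentResponses, long bodyBytes) {
        return Math.multiplyExact((long) concurrentResponses, bodyBytes);
    }

    public static void main(String[] args) {
        ChunkAccumulator acc = new ChunkAccumulator();
        acc.offer("Hel".getBytes(StandardCharsets.UTF_8), false);
        byte[] full = acc.offer("lo".getBytes(StandardCharsets.UTF_8), true);
        System.out.println(new String(full, StandardCharsets.UTF_8));
        System.out.println(worstCaseBytes(100, 1L << 20) + " bytes for 100 bodies of 1 MiB");
    }
}
```

Handling data chunk by chunk avoids this buffer entirely, at the cost of business logic that must cope with partial bodies.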
J. Mi wrote:
Thanks to Frederic for the overview. It's very helpful for me.
I have come up with an approach to replace my multi-thread model with Netty's HttpClient. It's pretty much based on the snoop example. I just added 3 loops to achieve the concurrency (multiple HTTP requests at the same time). The first loop was around the call to bootstrap.connect(new InetSocketAddress(host, port)). The second loop waited for each connection attempt to succeed and then sent the request. The third loop used the handler to retrieve each HTTP response through a LinkedBlockingQueue. I used ArrayLists to maintain a list of ChannelFutures, a list of Channels and a list of HttpResponseHandlers across these 3 loops.
Everything worked well for me with this approach. However, my test results didn't seem to show this approach outperforming my multi-thread model, i.e. one thread (java.util.concurrent) for each HTTP request, which was made with Apache Commons HttpClient (a synchronous model). I measured performance by timing the total time spent making n HTTP requests and retrieving the n HTTP responses end-to-end.
With fewer than 50 requests, the multi-thread model performed a little better. I was hoping Netty's way could catch up and scale better, because I was concerned that the current multi-thread model may not scale well when getting hundreds of requests at the same time. But I still failed to observe any increased performance relative to the multi-thread model when serving 50, 100, 200...800 concurrent requests.
One thing I need to understand more (Frederic already touched some basics here) is about the connection management. I felt that Apache Commons HttpClient seemed to manage the connection with possible reuse. Not exactly sure about how Netty does that.
One more question about Netty's HttpClient. In its HttpResponseHandler.java, the messageReceived() method only receives a portion of the response at a time, and it depends on the server responding with a "chunked" Transfer-Encoding header and content to detect the end of the response. This raises 2 questions: (1) is there a way to receive the response in one shot, like Apache's HttpClient; and (2) are all HTTP servers required to respond with "chunked" content? In my case, I need to retrieve online responses from different web sites.
On Thu, Sep 10, 2009 at 6:45 AM, Frederic Bregier <fred...@free.fr> wrote:
I will not talk about the specific HTTP part of Netty but about its main interest, the NIO side of Netty. Of course, Trustin or others can be more precise than me. This is just my general understanding (I'm neither an NIO expert nor a Netty expert, so it is just my understanding as an end user).
Compared to standard blocking IO, Netty uses fewer threads to manage the same behaviour. For instance, if you think about Apache or Tomcat, one connection will be handled by at least one thread for the full life of the connection. So if you have 1000 connections, you will have at least 1000 threads. In Netty, a thread will be active when data arrives at the server (the general idea is greatly simplified here; do not take it as the exact reality). For instance, of those 1000 connections, maybe at most 100 are really sending something to the server at the same time, so around 100 threads will be used. Netty does something like reusing threads, whatever the connection is.
Another point, of course, is the non-blocking way. Once you send something, you have the choice to continue the job without waiting for the data to actually be sent (of course, you have to take care of it, for instance before closing the channel). So you can overlap sending data with other computations (for instance, for the next packet to be sent). Compare this to blocking IO, where you wait for the data to be really sent (or at least buffered).
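As a rough illustration of overlapping a send with the preparation of the next packet, here is a JDK-only sketch. The asynchronous "write" is a hypothetical stand-in built on `CompletableFuture`, not a Netty channel write, and the final `join()` models the care needed before closing the channel.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

final class OverlapSend {
    // Sends packets one after another without blocking the caller between writes.
    public static int sendAll(List<String> packets) {
        AtomicInteger bytesSent = new AtomicInteger();
        CompletableFuture<Void> pending = CompletableFuture.completedFuture(null);
        for (String packet : packets) {
            // Encoding runs on the calling thread while earlier writes may still be in flight.
            byte[] encoded = packet.getBytes(StandardCharsets.UTF_8);
            // Hypothetical asynchronous write: queued behind the previous one, returns at once.
            pending = pending.thenRunAsync(() -> bytesSent.addAndGet(encoded.length));
        }
        // Before "closing the channel", wait until the last queued write has finished.
        pending.join();
        return bytesSent.get();
    }

    public static void main(String[] args) {
        System.out.println(sendAll(List.of("ab", "cde")) + " bytes sent");
    }
}
```

With blocking IO, each write would have to finish before the loop could start encoding the next packet.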
So on many points, the Netty approach should perform better than blocking IO. I say "should" since there are some counterexamples where blocking IO is faster, because NIO introduces some extra computation compared to blocking IO. However, most of the time this extra cost is masked by Netty's implementation, and the result is quicker than blocking IO. But I do recall some such examples.
Also, Netty can have different kinds of transport (nio, oio, ...), so the behaviour can differ from one low-level network transport to another.
This is not the full idea of Netty, but the start of an answer to your question. For more information, other people can continue this thread (or correct me where I am wrong), and of course you can read the examples that come with Netty (even those not about HTTP) and the Netty documentation.
HTH, Cheers, Frederic
J. Mi wrote:
I guess my fundamental question here is whether, in theory at least, Netty provides a better asynchronous mechanism than the java.util.concurrent.* package in terms of performance. Does Netty internally use multi-threading, java.nio, both, or neither?
If Netty does better than java.util.concurrent.* for performance, is there any example or tutorial that can guide me a little in replacing my current multi-threaded process, which I described in my previous email?
Many thanks to you for sharing your expertise, Jason
On Wed, Sep 2, 2009 at 12:11 PM, J. Mi <jmi...@gmail.com> wrote:
Currently, my application's process flow logic is like this:
-> A controlling process receives one request for data, which will be fetched from multiple online sources.
-> The controlling process spawns multiple threads. Each of these threads will (1) use Apache's synchronous Commons HttpClient to fetch the data; (2) parse the data; and (3) return the data to the controlling process.
-> The controlling process joins all threads and returns the combined data to the requestor.
So basically, each thread uses a synchronous httpclient to fetch the data and then parse it.
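The flow just described (one worker thread per source, a blocking fetch, a parse, then a join) can be sketched as follows. The `fetch` and `parse` methods are hypothetical stand-ins; the real application would call Apache Commons HttpClient inside `fetch`, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadPerSource {
    // Hypothetical stand-in for a blocking HTTP GET against one source.
    static String fetch(String source) {
        return "data from " + source;
    }

    // Hypothetical stand-in for the parsing step.
    static String parse(String raw) {
        return raw.toUpperCase(Locale.ROOT);
    }

    public static List<String> fetchAll(List<String> sources) {
        // One thread per source, as in the flow described above.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, sources.size()));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String source : sources) {
                tasks.add(() -> parse(fetch(source)));
            }
            List<String> combined = new ArrayList<>();
            // invokeAll blocks until every task is done: this is the "join" step.
            for (Future<String> result : pool.invokeAll(tasks)) {
                combined.add(result.get());
            }
            return combined;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Each worker thread sits idle while its fetch is blocked on the network, which is the cost the question about blocking I/O is getting at.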
In reading the org.jboss.netty.example.http.snoop package, I have the following question: if I just replace Apache's synchronous HttpClient with Netty's org.jboss.netty.handler.codec.http.* as the example does, will I benefit performance-wise? I heard that blocking I/O hurts multi-threading. If so, should Netty's package work better for me?
Or should I actually get rid of the existing multi-threading by using Netty's framework? If so, which of your examples would be the best reference for my purpose?
Thanks in advance, Jason
----- Hardware/Software Architect
-- View this message in context:
http://n2.nabble.com/A-question-about-your-HttpClient-example-and-beyond-tp3568879p3617420.html Sent from the Netty Developer Group mailing list archive at Nabble.com.