

11 messages in org.apache.ws.synapse-dev: Re: Transport appears to be hanging b...

| From | Sent On | Attachments |
|---|---|---|
| ant elder | Mar 23, 2007 1:50 am | |
| Oleg Kalnichevski | Mar 23, 2007 2:37 am | |
| Asankha C. Perera | Mar 23, 2007 3:48 am | |
| Asankha C. Perera | Mar 23, 2007 5:50 am | |
| Asankha C. Perera | Mar 23, 2007 10:24 am | |
| ant elder | Mar 24, 2007 3:56 am | |
| Asankha C. Perera | Mar 24, 2007 5:05 am | |
| ant elder | Mar 25, 2007 2:43 am | |
| Oleg Kalnichevski | Mar 25, 2007 8:10 am | |
| Asankha C. Perera | Mar 26, 2007 12:46 am | |
| Oleg Kalnichevski | Mar 26, 2007 2:15 am |

| Subject: | Re: Transport appears to be hanging because an unchecked exception caused the I/O dispatch thread to terminate |
|---|---|
| From: | Oleg Kalnichevski (ole...@apache.org) |
| Date: | Mar 25, 2007 8:10:22 am |
| List: | org.apache.ws.synapse-dev |

On Sun, 2007-03-25 at 09:43 +0000, ant elder wrote:
The symptoms I get do seem to match what you describe. There are still two problems with that, though, which I'd like to understand better.
1) Why don't I see this with the non-NIO transport? For example, I can run the Synapse server samples either in the Synapse sample server, which uses the NIO transport, or in a separate axis2-1.1.1 distro with the non-NIO transport. When using JMeter against axis2-1.1.1 it works fine and I can send tens of thousands of requests without any errors. What's different here? The underlying TCP stack and config are the same, aren't they?
Anthony, Asankha, et al.
The problem appears to be caused by Synapse opening an I/O pipe for *every* incoming and outgoing HTTP message. On some platforms this can be a very expensive operation, both in terms of performance and system resources. On Windows, opening an I/O pipe apparently requires a local IP port to be allocated. No wonder Synapse chokes after only a few thousand requests.
I see absolutely no reason why Synapse should make use of I/O pipes. Essentially, pipes are being used to bridge event-driven NIO and stream-based classic I/O. There are other ways to get the job done. A trivial shared buffer with synchronized access should suffice perfectly well. I'll happily lend you a helping hand if necessary.
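To make the shared-buffer idea concrete, here is a minimal sketch of what such a bridge could look like. It is an illustration only, not code from Synapse or HttpCore: the class name SharedBufferSketch and its methods offer, read and close are hypothetical. The event-driven side calls offer and never blocks; the stream-based side calls read and waits until data arrives.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical bounded buffer shared between an event-driven producer
 * (for example an I/O reactor thread) and a blocking consumer (a worker thread).
 * Illustration only; not code from Synapse or HttpCore.
 */
public class SharedBufferSketch {
    private final byte[] data;
    private int readPos;
    private int writePos;
    private int count;
    private boolean endOfStream;

    public SharedBufferSketch(int capacity) {
        data = new byte[capacity];
    }

    /** Non-blocking: copies as many bytes as fit from src and returns the number copied. */
    public synchronized int offer(ByteBuffer src) {
        int n = 0;
        while (src.hasRemaining() && count < data.length) {
            data[writePos] = src.get();
            writePos = (writePos + 1) % data.length;
            count++;
            n++;
        }
        if (n > 0) {
            notifyAll(); // wake up readers waiting for data
        }
        return n;
    }

    /** Marks the end of input so that blocked readers return -1 once the buffer is drained. */
    public synchronized void close() {
        endOfStream = true;
        notifyAll();
    }

    /** Blocking: waits for data and returns the number of bytes read, or -1 at end of stream. */
    public synchronized int read(byte[] dst, int off, int len) throws InterruptedException {
        while (count == 0 && !endOfStream) {
            wait();
        }
        if (count == 0) {
            return -1;
        }
        int n = Math.min(len, count);
        for (int i = 0; i < n; i++) {
            dst[off + i] = data[readPos];
            readPos = (readPos + 1) % data.length;
        }
        count -= n;
        return n;
    }
}
```

Because offer returns fewer bytes than requested when the buffer is full, the producer would have to stop reading from the channel until the consumer catches up. How that back-pressure is signalled is left out of this sketch.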
2) Synapse often hangs after the IO error and needs to be restarted. Is there any way we can make it recover from this without requiring a restart? By handling the exception differently or something?
Please let me know if you see any unchecked exceptions thrown by I/O reactors, as those exceptions cause I/O dispatch threads to terminate, effectively locking up the I/O reactor.
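As a rough illustration of why an unchecked exception is fatal for a dispatch thread, the sketch below shows a generic event loop that guards each handler call. It is not the HttpCore I/O reactor: the class GuardedDispatchSketch and its methods are hypothetical. Whether swallowing the exception is the right policy is a separate question; it keeps the loop alive but can hide bugs.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical dispatch loop for illustration; not the HttpCore I/O reactor. */
public class GuardedDispatchSketch implements Runnable {
    // Sentinel event that tells the loop to exit.
    private static final Runnable STOP = () -> { };

    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
    private final AtomicInteger failures = new AtomicInteger();

    public void submit(Runnable event) {
        events.add(event);
    }

    public void shutdown() {
        events.add(STOP);
    }

    public int failureCount() {
        return failures.get();
    }

    @Override
    public void run() {
        try {
            while (true) {
                Runnable event = events.take();
                if (event == STOP) {
                    return;
                }
                try {
                    event.run();
                } catch (RuntimeException ex) {
                    // Without this catch the exception would propagate out of run(),
                    // the thread would terminate, and later events would never be processed.
                    failures.incrementAndGet();
                    System.err.println("Event handler failed: " + ex);
                }
            }
        } catch (InterruptedException ex) {
            // Restore the interrupt flag and let the thread exit.
            Thread.currentThread().interrupt();
        }
    }
}
```

In this sketch a failing handler is counted and logged, and the next queued event is still processed. Removing the inner catch reproduces the lockup pattern: the first RuntimeException ends the loop while callers keep queuing events that nobody handles.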
Oleg
...ant
On 3/24/07, Asankha C. Perera <asan...@wso2.com> wrote:
Ant
This is the same error seen by Indika on Windows, and I think my analysis is correct. If you run the test for the first time, or a few minutes after the last run, you should be able to reach around 1000 iterations. Once you start to hit this issue, even 200 iterations will give you the error. At that point, running netstat -na should show that most of the TCP ports are in the TIME_WAIT state. It can take at least one minute for the OS to release a port. The tuning parameters I specified for Linux tell the OS to use the full port range for applications and set the TCP FIN timeout to 30 seconds, to clear up the ports as quickly as possible. Without *any* OS tuning on a Windows XP system, you will definitely encounter this issue.
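The port-exhaustion argument can be put into rough numbers. The sketch below is a back-of-the-envelope model only: the class PortBudgetSketch is hypothetical, the port count of 3976 is an assumed example figure (check the articles linked in this thread for the actual defaults on your system), and the number of ports consumed per request is a guess. Only the 60-second hold time comes from the message above.

```java
/** Hypothetical back-of-the-envelope model of ephemeral port exhaustion. */
public class PortBudgetSketch {

    /** Upper bound on requests per second that can be sustained before ports run out. */
    static double sustainableRate(int usablePorts, int secondsHeldPerPort, int portsPerRequest) {
        return (double) usablePorts / ((double) secondsHeldPerPort * portsPerRequest);
    }

    /** Number of back-to-back requests that fit before the first port is released. */
    static int burstCapacity(int usablePorts, int portsPerRequest) {
        return usablePorts / portsPerRequest;
    }

    public static void main(String[] args) {
        int usablePorts = 3976;      // assumption: example size of an untuned ephemeral port range
        int secondsHeldPerPort = 60; // from the message: at least one minute until a port is cleared
        int portsPerRequest = 2;     // assumption: more than one local port is consumed per request

        System.out.println("Sustainable requests/sec: "
                + sustainableRate(usablePorts, secondsHeldPerPort, portsPerRequest));
        System.out.println("Burst capacity (requests): "
                + burstCapacity(usablePorts, portsPerRequest));
    }
}
```

The model shows why both tuning knobs help: widening the port range raises usablePorts, and shortening the timeout lowers secondsHeldPerPort. It also shows why a second run shortly after the first fails sooner, since ports from the earlier run are still held.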
asankha
ant elder wrote:
> I've tried again with the latest Synapse and HTTP components
> code and several JVMs. The results feel slightly different
> than before but the end result is still always the root
> exception included below. Sometimes it doesn't occur till
> around 1000 requests, but sometimes it happens after not
> many requests at all.
>
> ...ant
>
> java.io.IOException: Unable to establish loopback connection
>     at sun.nio.ch.PipeImpl$Initializer.run(Unknown Source)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at sun.nio.ch.PipeImpl.<init>(Unknown Source)
>     at sun.nio.ch.SelectorProviderImpl.openPipe(Unknown Source)
>     at java.nio.channels.Pipe.open(Unknown Source)
>     at org.apache.axis2.transport.nhttp.ServerHandler.requestReceived(ServerHandler.java:108)
>     at org.apache.axis2.transport.nhttp.LoggingNHttpServiceHandler.requestReceived(LoggingNHttpServiceHandler.java:83)
>     at org.apache.http.impl.nio.DefaultNHttpServerConnection.consumeInput(DefaultNHttpServerConnection.java:96)
>     at org.apache.axis2.transport.nhttp.PlainServerIOEventDispatch.inputReady(PlainServerIOEventDispatch.java:67)
>     at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:68)
>     at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:160)
>     at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:145)
>     at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:127)
>     at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:153)
>     at java.lang.Thread.run(Unknown Source)
> Caused by: java.net.BindException: Address already in use: connect
>     at sun.nio.ch.Net.connect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
>     at java.nio.channels.SocketChannel.open(Unknown Source)
>
> On 3/23/07, Asankha C. Perera <asan...@wso2.com> wrote:
> Ant
>
> I am quite sure that the problem seen by Indika now
> was related to the ports being exhausted - see the
> following articles, especially the "MaxUserPort" and
> "TcpTimedWaitDelay" parameters that could be tweaked
> to be consistent with what I use before running
> a load test on Linux. I will ask Indika to check
> these on Monday - but you may try this in the
> meantime if you get a chance.
>
> http://www.microsoft.com/technet/network/deploy/depovg/tcpip2k.mspx
> http://www.microsoft.com/technet/community/columns/cableguy/cg1205.mspx
> http://www.psc.edu/networking/projects/tcptune/OStune/winxp/winxp_stepbystep.html
>
>
> asankha
>
> Asankha C. Perera wrote:
> > Hi Ant
> >
> > I fixed this for Linux and JDK 1.5 - I am
> > confident of this fix as I was able to first
> > recreate the issue consistently and then see the
> > fix in action using 5 concurrent users sending a
> > total of 5000 messages multiple times. However
> > Indika is still seeing a 'similar' issue in
> > Windows using JDK 1.4. We will try to see if it's
> > related to JDK 1.4 or Windows. If you get the
> > latest nhttp code and build the nhttp JAR you
> > could verify this fix - and let me know.
> >
> > I am listing some of the Linux commands that came
> > in handy for the resolution in case someone wants
> > to check this.
> >
> > lsof -p 7426 => lists the open files for the pid
> > given after the -p option
> >
> > ls -l /proc/9976/fd | wc -l => for each process
> > the /proc filesystem lists the files used and thus
> > you could count the open files with this command
> >
> > asankha
> >
> > Asankha C. Perera wrote:
> > > Ant / Oleg
> > >
> > > I can recreate this issue on both Windows and
> > > Linux and think it's caused by my code related to
> > > use of Pipes.. and I am actively looking into
> > > this right now.. will get back to you on what I
> > > find.
> > >
> > > asankha
> > >
> > > ant elder wrote:
> > > > I've tried on several JDKs now and _always_
> > > > get similar intermittent I/O related errors. I
> > > > can use JMeter directly against Axis2-1.1.1
> > > > without any problems at all, so this does look
> > > > like some issue with the NIO transport. Be
> > > > really good to hear from other Windows users
> > > > to see if this is just my specific environment
> > > > or a more general problem.
> > > >
> > > > To recreate:
> > > >
> > > > 1) build Synapse server sample by running
> > > > 'ant' in the samples\axis2Server\src
> > > > \SimpleStockQuoteService directory
> > > > 2) start the sample service by running samples
> > > > \axis2Server\axis2server.bat
> > > > 3) get the Synapse config (either 8 or 501)
> > > > from http://people.apache.org/~antelder/temp/,
> > > > put in repository\conf\sample and start
> > > > Synapse: bin\synapse.bat -sample=8
> > > > 4) get the JMeter config test1.jmx from
> > > > http://people.apache.org/~antelder/temp/,
> > > > start JMeter and File -> Open and point to the
> > > > test1.jmx file
> > > > 5) JMeter Run -> Start and after not too long
> > > > IO errors should appear in the Synapse
> > > > console
> > > >
> > > > ...ant
> > > >
> > > > ---------- Forwarded message ----------
> > > > From: Asankha C. Perera <asan...@wso2.com>
> > > > Date: Mar 22, 2007 4:58 PM
> > > > Subject: Re: [jira] Resolved: (HTTPCORE-60)
> > > > Transport appears to be hanging because an
> > > > unchecked exception caused the I/O dispatch
> > > > thread to terminate
> > > > To: HttpComponents Project
> > > > <http...@jakarta.apache.org>
> > > >
> > > > Oleg/Ant
> > > >
> > > > I am guessing this is something to do with
> > > > Windows or the JDK you use.. But I am unable
> > > > to test this week, so will do my best to
> > > > try this sometime next week. As I said, on
> > > > Linux I have run the system through thousands
> > > > of messages and multiple threads concurrently
> > > > and have fixed all the issues I came across.
> > > >
> > > > So Oleg, I do not see this as a blocker for
> > > > the HttpCore release - but I will use your
> > > > latest snapshots in Synapse to check on this
> > > > in future if it occurs again
> > > >
> > > > thanks
> > > > asankha
> > > >
> > > > Oleg Kalnichevski (JIRA) wrote:
> > > > > [ https://issues.apache.org/jira/browse/HTTPCORE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> > > > >
> > > > > Oleg Kalnichevski resolved HTTPCORE-60.
> > > > > ---------------------------------------
> > > > >
> > > > >
> > > > >
> > > > > Resolution: Fixed
> > > > >
> > > > > Anthony
> > > > > It turned out ClosedChannelException is a checked I/O exception so it
> > > > > cannot kill the I/O dispatch thread. So, apparently I was wrong in my initial
> > > > > assertion about the cause of the Synapse I/O transport lockup. I tweaked
> > > > > HttpCore code a little and changed the IOSessionImpl to catch all
> > > > > ClosedChannelException-s thrown by the underlying byte channel just in case.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Please review the changes and let me know if it is okay to proceed
> > > > > with the release
> > > > >
> > > > > Oleg
> > > > >
> > > > >
> > > > > > Transport appears to be hanging because an unchecked exception
> > > > > > caused the I/O dispatch thread to terminate
> > > > > > ------------------------------------------------------------------
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Key: HTTPCORE-60
> > > > > > URL: https://issues.apache.org/jira/browse/HTTPCORE-60
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Project: HttpComponents Core
> > > > > > Issue Type: Bug
> > > > > > Affects Versions: 4.0-alpha4
> > > > > > Reporter: ant elder
> > > > > > Assigned To: Oleg Kalnichevski
> > > > > > Fix For: 4.0-alpha4
> > > > > >
> > > > > >
> > > > > >
> > > > > > See discussion on synapse-dev mailing list:
> > > > > > http://www.nabble.com/Intermittent-IO-Errors-using-Synapse-tf3439957.html
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > The transport appears to be hanging because an unchecked exception
> > > > > > caused the I/O dispatch thread to terminate. I believe there are several
> > > > > > different types of problems (at least two) that we are seeing here.
> > > > > >
> > > > > > [I/O reactor worker thread 5] ERROR ServerHandler - I/O Error : null
> > > > > >
> > > > > > > java.nio.channels.ClosedChannelException
> > > > > > >     at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:112)
> > > > > > >     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:139)
> > > > > > >
> > > > > > >
> > > > >
> > > > --------------------------------------------------------------------- To
unsubscribe, e-mail: http...@jakarta.apache.org For
additional commands, e-mail: http...@jakarta.apache.org
> > > --------------------------------------------------------------------- To
unsubscribe, e-mail: syna...@ws.apache.org For additional
commands, e-mail: syna...@ws.apache.org







