atom feed39 messages in ru.sysoev.nginxRe: Nginx stops sending file after ~1...
FromSent OnAttachments
Yo'av MosheJan 11, 2009 5:43 am 
Jure PečarJan 11, 2009 9:50 am 
Yo'av MosheJan 11, 2009 10:46 am 
Jure PečarJan 11, 2009 11:07 am 
Yo'av MosheJan 11, 2009 11:22 am 
Jure PečarJan 11, 2009 11:35 am 
Igor SysoevJan 12, 2009 2:28 am 
Yo'av MosheJan 18, 2009 3:35 pm 
Dave CheneyJan 18, 2009 4:24 pm 
Yo'av MosheJan 19, 2009 2:49 am 
Just MarcJan 19, 2009 4:25 am 
Yo'av MosheJan 19, 2009 4:59 pm 
Igor SysoevJan 19, 2009 11:23 pm.Other
Yo'av MosheJan 20, 2009 1:44 am 
Igor SysoevJan 20, 2009 2:08 am 
Yo'av MosheJan 20, 2009 10:25 am 
jeff emmingerJan 20, 2009 11:31 am 
Kiril AngovJan 20, 2009 11:38 am 
Yo'av MosheJan 20, 2009 3:52 pm 
ResicowJan 20, 2009 4:17 pm 
Igor SysoevJan 20, 2009 9:26 pm 
Yo'av MosheJan 23, 2009 4:34 am 
Igor SysoevJan 23, 2009 5:04 am 
Yo'av MosheJan 25, 2009 8:26 am 
Igor SysoevJan 25, 2009 9:33 am 
Yo'av MosheJan 25, 2009 11:04 am 
Igor SysoevJan 25, 2009 11:20 am 
Yo'av MosheJan 26, 2009 4:05 am.gz
Yo'av MosheJan 27, 2009 12:42 am 
Igor SysoevJan 27, 2009 12:50 am 
Igor SysoevJan 27, 2009 1:00 am 
Yo'av MosheJan 27, 2009 1:24 am 
Igor SysoevJan 27, 2009 1:39 am.nodelay
Yo'av MosheJan 27, 2009 1:58 am 
Igor SysoevJan 27, 2009 2:05 am 
Yo'av MosheJan 27, 2009 2:24 am 
Dave CheneyJan 27, 2009 3:03 am 
Igor SysoevJan 27, 2009 3:17 am 
Jure PečarJan 28, 2009 1:52 pm 
Subject:Re: Nginx stops sending file after ~1.5mb ?
From:Yo'av Moshe (bje@gmail.com)
Date:Jan 20, 2009 10:25:34 am
List:ru.sysoev.nginx

Hey,

It's gets even weirder. I found out that if I limit the download speed, I can get the whole file.

Could anyone confirm that this works: wget --limit-rate=3500 http://www.noal.org.il/static/temp/barvazi2.pdf while this doesn't: wget http://www.noal.org.il/static/temp/barvazi2.pdf

Thank you. Yo'av

On Tue, Jan 20, 2009 at 12:08 PM, Igor Sysoev <is@rambler-co.ru> wrote:

On Tue, Jan 20, 2009 at 11:45:13AM +0200, Yo'av Moshe wrote:

Sorry, but it is still the same... server is now running the modified nginx, but this file still can't be fetched: http://www.noal.org.il/static/temp/barvazi2.pdf . I just can't think what the problem might be.

If it helps, my VPS is running CentOS 5.2 with kernel 2.6.18.

Any more tests I can run?

The patch tries to sendfile() until EAGAIN, even it already said that it sent smaller chunk than it was requested.

Could you create a new strace and grep single pid to ensure that nginx calls sendfile() until EAGAIN ?

If the patch does not help, I suspect a bug in kernel/VPS/etc, as kernel does not return event via epoll_wait().

Thanks again. Yo'av.

2009/1/20 Igor Sysoev <is@rambler-co.ru>

On Tue, Jan 20, 2009 at 02:59:34AM +0200, Yo'av Moshe wrote:

Thanks. I'm not an strace guru but I don't think it shows anything unusual...

http://paste-it.net/public/mffe731/

You can see download starts right at the beginning, and at 02:25:25

you

can

see the last sendfile64 event. After that, no data was sent and I couldn't find anything related to that in this strace log.

The only thing this strace log is telling me is that nginx knows that there is still more data to send, since it recognized the right size of the file in the beginning (line 13: sendfile64(22, 23, [0], *1950230)*), and

it

knows

it has more to send by the last sendfile event (line 1173: sendfile64(22, 23, [1618265], *331965*)).

What stops the sending?

Looking at these lines:

2140 02:25:24 epoll_wait(11, <unfinished ...> 2140 02:25:24 <... epoll_wait resumed> {{EPOLLOUT, {u32=140043201, u64=13817743186826159041}}}, 512, 22224) = 1 2140 02:25:24 gettimeofday({1232411124, 913871}, NULL) = 0 2140 02:25:24 sendfile64(22, 23, [1559225], 391005) = 28800

2140 02:25:24 epoll_wait(11, {{EPOLLOUT, {u32=140043201, u64=13817743186826159041}}}, 512, 22077) = 1 2140 02:25:25 gettimeofday({1232411125, 66164}, NULL) = 0 2140 02:25:25 sendfile64(22, 23, [1588025], 362205) = 30240

2140 02:25:25 epoll_wait(11, {{EPOLLOUT, {u32=140043201, u64=13817743186826159041}}}, 512, 21924) = 1 2140 02:25:25 gettimeofday({1232411125, 211986}, NULL) = 0 2140 02:25:25 sendfile64(22, 23, [1618265], 331965) = 28800

I see that client socket (22) is associated with u64=13817743186826159041 in epoll. Also I see that the socket was added and modified in epoll in the start of the transfer:

2140 02:25:12 epoll_ctl(11, EPOLL_CTL_ADD, 22, {EPOLLIN|EPOLLET, {u32=140043201 , u64=579069412499841985}}) = 0

2140 02:25:12 epoll_ctl(11, EPOLL_CTL_MOD, 22, {EPOLLIN|EPOLLOUT|EPOLLET, {u32=140043201, u64=13817743186826159041}}) = 0

However, I do not see socket deletion from epoll or closing. I see only that after 02:25:25 epoll_wait(), i.e. kernel, never returned any events about 22 socket.

You may try the attached patch.