Subject: Re: libssh2_sftp_init() blocks and does not return

Re: libssh2_sftp_init() blocks and does not return

From: Heiner Steven <heiner.steven_at_nexgo.de>
Date: Fri, 20 Mar 2009 17:33:15 +0100

Daniel Stenberg wrote:
> On Fri, 20 Mar 2009, Heiner Steven wrote:
>
>>> Are you using a blocking or non-blocking socket?
>> I tried both. As a default I used a blocking socket, resulting in the
>> symptom decribed. When I modified the code to use a non-blocking socket
>> (libssh2_session_set_blocking(session, 0)) the function returns several
>> times (maybe 10, 20 times) with
>>
>> LIBSSH2SFTP_EAGAIN == LIBSSH2_ERROR_EAGAIN == -37
>>
>> and blocks afterwards.
>
> Why does it return that 10-20 times? Do you call it immediately in a
> busy-loop?
>
> What function blocks with a non-blocking socket? That surely can't hang on the
> recv() then at least...

well, it "hangs" on recvfrom(3:

The relevant part of the code:

--------------------------------------------------------------------------
[...]
     /* No SFTP session: create a new one */
     if (!(session = getsession(conn, url))) {
         return NULL;
     }

     libssh2_session_set_blocking(cp->session, 0);

     while (!(cp->sftp = libssh2_sftp_init(cp->session))) {
         errmsg("DEBUG: libssh2_sftp_init failed: %d\n",
                 libssh2_session_last_errno(cp->session));

         if (libssh2_session_last_errno(cp->session) != LIBSSH2SFTP_EAGAIN)
             break;
         sleep(1);
     }

     if (!cp->sftp) {
         ssherrmsg(cp->session, "ERROR: could not initiate SFTP session");
         return NULL;
     }
--------------------------------------------------------------------------

The output of the program:

     scopy: DEBUG: getsession: authenticate_session succeeded; session=0xa486b0
     scopy: DEBUG: libssh2_sftp_init failed: -37
     scopy: DEBUG: libssh2_sftp_init failed: -37
     scopy: DEBUG: libssh2_sftp_init failed: -37
     [The program "hangs" here]

The accompanying "strace" output:
--------------------------------------------------------------------------
write(2, "DEBUG: getsession: authenticate_s"..., 68) = 68
write(2, "\n"..., 1) = 1
fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
sendto(3, "\t'\34\361\33I\212\22\222-\200\3661\201Ae@$QA\320\254\300I\331\323\5H\17\233\30o\374"..., 68, MSG_NOSIGNAL, NULL, 0) = 68
recvfrom(3, 0x164c8b0, 16384, 16384, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], NULL, NULL, {60, 0}) = 1 (in [3], left {59, 901096})
write(2, "scopy: "..., 7) = 7
write(2, "DEBUG: libssh2_sftp_init failed: "..., 36) = 36
write(2, "\n"..., 1) = 1
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, {1, 0}) = 0
recvfrom(3, "\315?:U\224+\322\374\351\220\224\271zt{h\330+\242\374\257\365\236\362\335+\3534yHb\365Q"..., 16384, MSG_NOSIGNAL, NULL, NULL) = 52
sendto(3, "#\237\367\230\0\317\310\2521n\16Ul\221i([\271\f\363JC^\352F\374\320u\244\256\277\347\233"..., 68, MSG_NOSIGNAL, NULL, 0) = 68
recvfrom(3, 0x164c8b0, 16384, 16384, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], NULL, NULL, {60, 0}) = 1 (in [3], left {59, 999431})
write(2, "scopy: "..., 7) = 7
write(2, "DEBUG: libssh2_sftp_init failed: "..., 36) = 36
write(2, "\n"..., 1) = 1
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, {1, 0}) = 0
recvfrom(3, "8I\302\"/k\\<Eb\342\35\246<\36Z\377z\221\314\243\23Bu\32f\304T\203\345|%\225"..., 16384, MSG_NOSIGNAL, NULL, NULL) = 88
sendto(3, "dz\223\376\212k\336\346\272\276\3553\333\222F\16]n\16\322]\31\305n\t\222\213\274\276:\344p\241"..., 52, MSG_NOSIGNAL, NULL, 0) = 52
recvfrom(3, 0x164c8b0, 16384, 16384, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
write(2, "scopy: "..., 7) = 7
write(2, "DEBUG: libssh2_sftp_init failed: "..., 36) = 36
write(2, "\n"..., 1) = 1
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, {1, 0}) = 0
recvfrom(3, "br\0366[5\33\312\271\17\253\3313c\211\305%\262\246\242\214A\264\332:`\tq\227}\342H\270"..., 16384, MSG_NOSIGNAL, NULL, NULL) = 148
recvfrom(3, 0x164c8b0, 16384, 16384, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
sendto(3, "\233\373\310\356Oe2\312|\3 @\234\343z1\242\365z\303\323\204\262\17AC@\303a\321\255x\243"..., 52, MSG_NOSIGNAL, NULL, 0) = 52
recvfrom(3, 0x164c8b0, 16384, 16384, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
fcntl(3, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(3, F_SETFL, O_RDWR) = 0
sendto(3, "Ld\10\244;n>\1\202<\6YsYG\"Dd\242\1*\360\3723\236\267\305\357\204E\374o;"..., 68, MSG_NOSIGNAL, NULL, 0) = 68
recvfrom(3, "\217'\37\245\234\301\314\3\227\247U\202] \4t\367tpQDN\224zK>9tx\354TA\334"..., 16384, MSG_NOSIGNAL, NULL, NULL) = 84
recvfrom(3, 0x164c8b0, 16384, 16384, 0, 0) = ? ERESTARTSYS (To be restarted)

[The program hangs here]
--- SIGINT (Interrupt) @ 0 (0) ---

--------------------------------------------------------------------------

Here is a stack backtrace I got after compiling the program with
"-g" and aborting it using SIGQUIT:

--------------------------------------------------------------------------
     scopy: DEBUG: getsession: authenticate_session succeeded; session=0xa486b0
     scopy: DEBUG: libssh2_sftp_init failed: -37
     scopy: DEBUG: libssh2_sftp_init failed: -37
     scopy: DEBUG: libssh2_sftp_init failed: -37
     ^\Quit(coredump)

     $ gdb scopy core
     [...]
     (gdb) bt
#0 0x00007fbbefe3e0d5 in recv () from /lib/libc.so.6
#1 0x0000000000428416 in _libssh2_packet_read (session=0xa486b0)
     at transport.c:362
#2 0x000000000040f504 in libssh2_channel_read_ex (channel=0xa50a20,
     stream_id=0, buf=0xa4def0 "\b", buflen=37) at channel.c:1526
#3 0x000000000041b56b in sftp_packet_read (sftp=0xa51770) at sftp.c:188
#4 0x000000000041ba07 in sftp_packet_requirev (sftp=0xa51770,
     num_valid_responses=2, valid_responses=0x4354a1 "ieLStatting",
     request_id=0, data=0x7ffff8f07538, data_len=0x7ffff8f07540) at sftp.c:354
#5 0x0000000000421aba in libssh2_sftp_stat_ex (sftp=0xa51770,
     path=0xa46ff0 "/tmp", path_len=4, stat_type=0, attrs=0xa46f40)
     at sftp.c:2175
#6 0x0000000000409079 in remote_fstat (connp=<value optimized out>,
     url=0xa46e80, mustexist=false)
     at smove/remoteops.c:965
#7 0x000000000040537b in main (argc=4, argv=0x7ffff8f08148)
     at smove/smove.c:850
(gdb)
--------------------------------------------------------------------------

>>> No it isn't, but I'll admit I don't do a lot of tests with blocking
>>> sockets.
>> The problem also exists with non-blocking sockets.
>
> I've done quite a lot of tests with non-blocking and I've not seen this, and
> I've also got reports from others who've done the libcurl-based (~30) tests
> without experiencing this. It might indicate that the problem happens only in
> more specific situations, like with a particular server on a particular OS or
> something else...

This could be the case, but on the other hand version 0.18 and 1.0 both
run without problems on the same system.

I wonder what the error code "ERESTARTSYS" means, which recvfrom()
returns on the SIGQUIT signal. Shouldn't the system call
return with EINTR? Both are different:

     85 ERESTART Interrupted system call should be restarted
     4 EINTR Interrupted system call

The errno result code does not change even when I catch the signal
SIGQUIT and install a signal handler without SA_RESTART flags.

Heiner

------------------------------------------------------------------------------
Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are
powering Web 2.0 with engaging, cross-platform capabilities. Quickly and
easily build your RIAs with Flex Builder, the Eclipse(TM)based development
software that enables intelligent coding and step-through debugging.
Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com
_______________________________________________
libssh2-devel mailing list
libssh2-devel_at_lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libssh2-devel
Received on 2009-03-20