Subject: libssh2 API brokenness

libssh2 API brokenness

From: Henrik Nordström <>
Date: Thu, 01 Sep 2011 09:37:22 +0200

tor 2011-09-01 klockan 09:36 +0800 skrev m odd:
> Yes,The user level will never see LIBSSH2_ERROR_BAD_USE.
> I handle it in the function _libssh2_channel_receive_window_adjust(),

You should never see it internally either. It's bad use of the
transport_send function, outside of how it's intended to be used in
current design.

Thanks for finding this, it illustrates a family of problems that have
been nagging in the back of my head for a while but which I have not yet
looked for in the source. I however incidentally stumbled over a related
issue in the keep-alive probe while reading the source earlier.

The underlying issue currently hits

* Keep-alive probes.

* TCP window adjusts

* want_reply response messages

all are situations where libssh2 may want to internally send a packet in
response to other activity, which will fail if the transport is
currently blocked. And not only fail but may also corrupt the transport
if only part of the message it tries to send fits the transport. All
three already have FIXME markers in my tree due to the latter issue
already identified.

And quite likely there is similar issues in key exchange if
renegotiation is attempted while the outgoing transport is currently
blocking with EAGAIN.

In addition the state machine in sftp_recv() fails to follow the EAGAIN
API design, returning a positive response even if the transport is
currently blocking, possibly luring the application into trying to make
a non-sftp call which will then break with BAD_USE.

More intriguing the API is also somewhat borked when it comes to
multiple concurrent operations, even in the simple case of both reading
& writing from a single channel. The breakage there is that EAGAIN is
returned when reading waits for more data from the network, but API
design says the same call should be repeated until EAGAIN is no longer
returned which in effect forbids trying to send data while also trying
to read. Keep in mind that trying to read MAY also block with EAGAIN in
the outgoing direction from the above three internally generated

Some of this can be patched over without changing the public API for
blocking mode, but not entirely sure how to cleanly solve it in
non-blocking mode without adding some other session pump than the
existing API calls.

What I am thinking of is making libssh2_transport_send() always succeed,
and queue transmits if the transport is currently blocking. Then pump
the transport in BLOCK_ADJUST both before and after the actual call.
Problem there in non-blocking mode is that we may need to return a
positive API response even while the transport is still blocking pending
(i.e. just like sftp_recv currently does) but we don't really know when
the caller will call another API call after that.

Thinking about it the latter is actually a problem in itself already.
Consider calling a server such as OpenSSH which uses keep-alive in an
active manner, terminating the session if there is no response in a
timely fashion. This will fail if the application is currently only
doing writes and waiting internally for more data to write and hence not
making any libssh2 calls at the moment (other than possibly keep-alive
probing if enabled, but that may be far away in time).

Can obviously be solved by changing the non-blocking API to add a libssh
pump call of some kind which non-blocking applications should call when
there is activity on the socket or timer (the latter for encapsulating
keep-alive, something I think applicaitons should not need to manually
care for).


Received on 2011-09-01