Subject: Re: SFTP character encoding and problem with agent auth

From: Alexander Lamaison <>
Date: Wed, 11 Aug 2010 23:15:15 +0100

2010/8/11 Željko Marjanović <>:
> Is it possible to determine the character encoding the SSH/SFTP server is
> using? I have read the protocol
> specs for SFTP v3 and there is no mention of it, but in v4 default encoding
> is UTF-8.  Is it safe to assume
> and use UTF-8 for default encoding?

Short answer: yes, if you are connecting to machines running modern
Unices.

The reason the v3 spec didn't mandate UTF-8 for filenames is probably
that some servers can't guarantee that. On Linux, for instance, you
can give a file a name using an arbitrary encoding of your choice -
the filesystem just stores a sequence of bytes [1][2]. When `ls`
displays the contents of a directory, it decides how to decode the
filenames based on the user's locale, typically set through the LANG
environment variable. For instance, on my Ubuntu machine, this is
en_GB.UTF-8 so all filename data is interpreted as UTF-8. If an Arabic
filename happened to be encoded in MacArabic instead, it would be
garbled in the listing.
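
To make that concrete, here is a minimal Python sketch of the same
effect. It is only an illustration: the filename is made up, and
ISO-8859-6 stands in for a legacy Arabic encoding such as MacArabic.

```python
# Hypothetical filename: three Arabic letters (meem, lam, feh) plus ".txt".
name = "\u0645\u0644\u0641.txt"

# What the filesystem would store: plain bytes, with no encoding attached.
# ISO-8859-6 is used here as a stand-in for any legacy Arabic encoding.
raw = name.encode("iso8859_6")

# Decoding with the encoding the name was written in recovers it.
roundtrip = raw.decode("iso8859_6")

# A UTF-8 locale reads the very same bytes differently.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Lenient decoding substitutes U+FFFD for the bytes it cannot interpret,
# which is roughly what a garbled directory listing looks like.
garbled = raw.decode("utf-8", errors="replace")

print(roundtrip == name)  # True
print(valid_utf8)         # False
```

The bytes never change; only the reader's assumption about them does.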

This explains the problems encountered with a local `ls` but, of
course, a remote listing over SFTP faces all the same issues; the
filenames sent to the client can be a mix of UTF-8 and non-UTF-8. I
have no idea how SFTP v4 expects servers to guarantee they supply
UTF-8 when the server doesn't even know the encoding of its own
filenames.

In practice, however, modern Unices default to UTF-8 so it would be
unusual to encounter a filename with a different encoding. My project
assumes all filenames are UTF-8. A more correct solution would be to
default to UTF-8 but provide the user with an option to specify a
custom encoding.
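
A rough Python sketch of that "default to UTF-8, let the user
override" idea might look like the following. The function names are
hypothetical and this is not how my project does it; the
"surrogateescape" error handler is my own addition so that names that
don't decode cleanly can still be sent back to the server unchanged.

```python
def decode_filename(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode a filename received from the server.

    `encoding` defaults to UTF-8 but can be set from a user option.
    Bytes that are invalid in that encoding are preserved as lone
    surrogates rather than raising, so nothing is lost.
    """
    return raw.decode(encoding, errors="surrogateescape")


def encode_filename(name: str, encoding: str = "utf-8") -> bytes:
    """Turn a decoded filename back into the bytes to send to the server."""
    return name.encode(encoding, errors="surrogateescape")


# Default case: the server really does use UTF-8.
utf8_name = decode_filename("na\u00efve.txt".encode("utf-8"))

# User override: the server is known to use Latin-1.
latin1_name = decode_filename(b"caf\xe9", encoding="latin-1")

# Wrong guess: the bytes are not valid UTF-8, but they survive a round trip.
legacy = b"caf\xe9"
restored = encode_filename(decode_filename(legacy))
```

With the wrong guess the name still displays incorrectly, but at least
a later open or rename refers to the right file.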

Swish - Easy SFTP for Windows Explorer (
Received on 2010-08-12