On Sat, May 31, 2008 at 04:09:25PM -0700, Toshio Kuratomi wrote:
However, the flipside of this is if a program has an xml config file that the user is expected to edit manually in a text editor and the program will adapt to multiple encodings (for instance, by using libxml2 to parse the file[1]_) having it exist in utf-8 is much better than having it exist in SOME_EXOTIC_ENCODING. In this case it's the program
I disagree. It is not an obvious choice and should be left to the maintainer. It depends on the software's target users, for instance.
not upstream agrees. In the case of documentation that does specify the encoding I lean towards converting [2]_. In the case of a file that is used by a program we should definitely have a conversation with upstream about it, although we could convert locally with upstream's blessing (i.e., upstream says: "I'm going to continue writing my xml config file in latin-1. If you want to convert them to utf-8 for your users that's fine -- I'm going to continue to use a library for xml parsing that understands encodings.")
Once again, better to leave it to the maintainer. This doesn't prevent us from issuing recommendations, though.
-- Pat