Programming Forums

UnKnown X Feb 9th, 2007 5:35 PM

Inputting Non-ASCII Characters
 
More specifically, the characters in the extended ASCII set.

Admittedly, I've only tried raw_input(), but it's the only input method I know of besides reading from a file, which I'd rather avoid.

Whenever I input a character in the extended ASCII set (say, "æ"), I get a UnicodeEncodeError exception. Google has not helped much: I can only find descriptions of how to use foreign characters in the Python source file itself, which is of no use to me.

Thanks for any help, if you have any.

Dietrich Feb 9th, 2007 7:27 PM

First of all, read up on Unicode and source file encodings in PEP 263:
http://www.python.org/peps/pep-0263.html

Also here is a small code sample I wrote a while ago that might help you in your quest:
Code:

# Python supports Unicode strings; on "narrow" builds each code
# unit is 16 bits. The full Unicode set contains Cyrillic, Chinese,
# Japanese and other language characters. Whether they show up
# depends on the fonts and console encoding of your computer.
# You can play with characters up to hexadecimal FF though.
# A Unicode string literal is preceded with a 'u'

# not much happens here ...
a = u'How are you?'
print a
# result is  How are you?

# here I used the unicode \uxxxx where xxxx is a four digit
# hexadecimal number to put in two typical Spanish characters
# 00bf = ¿  and  00f3 = ó  that are not on my US keyboard
a = u'\u00bfC\u00f3mo es usted?'
print a
# result is ¿Cómo es usted?

# both code points are below 0x100 (Latin-1 range, not ASCII), so the shorter \xXX escape works too
a = u'\xbfC\xf3mo es usted?'
print a

Just a note of caution: not all editors handle Unicode properly!
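
The two escape forms in the sample above denote the same code points. A quick check, written so that it runs on Python 2 and on Python 3.3 or later (the variable names are made up for illustration):

```python
# The same Spanish sentence written with \uXXXX and with \xXX escapes.
long_form = u'\u00bfC\u00f3mo es usted?'
short_form = u'\xbfC\xf3mo es usted?'

# Both literals produce the same Unicode string.
assert long_form == short_form

# Collect the code points above 127, i.e. the ones outside plain ASCII.
non_ascii = [hex(ord(ch)) for ch in long_form if ord(ch) > 127]
print(non_ascii)
```

Both 0xbf and 0xf3 are above 127, so they fall outside 7-bit ASCII even though they fit in a single byte in Latin-1.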

UnKnown X Feb 10th, 2007 6:17 AM

That's not what I was asking for. I already know that and use it extensively in my programme; what I'm looking for is a way to get Unicode input from the user.

On a related note, though, I tried using some Cyrillic characters (not even outputting them on the display, simply storing them in a variable), let's say Б (or \ud091), but my IDE, ActivePython, gives me the following error:
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 1-5: truncated \uXXXX escape

Surely this shouldn't happen. It won't even let me run the programme; the error appears when I run the error check on the .py file.

Edit: I found the problem with the Cyrillic characters, which was a fairly silly typo.
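
For reference, Б is code point U+0411; the bytes D0 91 are its UTF-8 encoding rather than its code point. The actual typo is not shown in the thread, so the following is only a sketch of how the two values relate and of one way a "truncated \uXXXX escape" error can arise (fewer than four hex digits after \u). It assumes Python 2.6+ or 3.3+, and decode_escapes is a hypothetical helper, not part of any library:

```python
# Cyrillic capital letter BE, written by code point.
be = u'\u0411'

# Encoding it as UTF-8 yields the two bytes 0xD0 0x91.
utf8_bytes = be.encode('utf-8')
print(repr(utf8_bytes))


def decode_escapes(raw):
    """Decode backslash escapes in a byte string.

    Returns None when an escape is malformed, for example
    when \\u is followed by fewer than four hex digits.
    """
    try:
        return raw.decode('unicode_escape')
    except UnicodeDecodeError:
        return None


# A complete four-digit escape decodes to the expected character.
assert decode_escapes(b'\\u0411') == be

# Dropping one hex digit makes the escape truncated, so decoding fails.
assert decode_escapes(b'\\u041') is None
```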

pal Feb 11th, 2007 7:16 AM

Quote:

Originally Posted by UnKnown X (Post 123760)
That's not what I was asking for. I already know that and use it extensively in my programme; what I'm looking for is a way to get Unicode input from the user.

It's always a good idea to post the code if you're running into a problem. Others may misunderstand what you're asking for if you only describe it in words.

Arevos Feb 11th, 2007 8:04 AM

It may be that it's just ActivePython that's messing up.
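
On the original question of getting Unicode input from the user: in Python 2, raw_input() returns a byte string, so one common approach is to decode it with the console's encoding. A minimal sketch, assuming sys.stdin.encoding is set correctly by the environment (which may not hold inside every IDE); to_unicode is a hypothetical helper name:

```python
import sys


def to_unicode(raw, encoding=None):
    """Return raw as a Unicode string, decoding it if it is a byte string."""
    if isinstance(raw, type(u'')):
        # Already Unicode text (always the case for input() on Python 3).
        return raw
    # Fall back to UTF-8 when the console encoding is unknown.
    encoding = encoding or getattr(sys.stdin, 'encoding', None) or 'utf-8'
    return raw.decode(encoding)


# Typical use on Python 2 (commented out so the sketch does not block on input):
# name = to_unicode(raw_input('Name: '))

# The UTF-8 bytes for "æ" decode to the single code point U+00E6.
assert to_unicode(b'\xc3\xa6', 'utf-8') == u'\xe6'
```

If the decoded text is later printed to a console that cannot represent the character, an encoding error can still occur at output time, so the console encoding matters in both directions.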

