Programming Forums
Feb 9th, 2007, 5:35 PM   #1
UnKnown X
Inputting Non-ASCII Characters

More specifically, the characters in the extended ASCII set.

Admittedly, I've only tried raw_input(), but that's the only method I know of apart from file input, which I'm trying to avoid resorting to.

Whenever I input a character in the extended ASCII set (say, "æ"), I get a UnicodeEncodeError exception. Google has not helped me much, as I am only able to find descriptions of how to use foreign characters in the Python file itself, which is of no use to me.

Thanks for any help, if you have any.
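For readers who have not run into this exception, here is a minimal sketch of one common way a UnicodeEncodeError arises. It assumes the text ends up being encoded with the ASCII codec somewhere along the way; the thread does not say where this happens in the original program. The variable names are illustrative only.

```python
# The ae ligature from the question is code point U+00E6,
# which lies outside the 7-bit ASCII range.
text = u'\u00e6'

try:
    # Encoding with the ASCII codec fails for any code point above 0x7F.
    text.encode('ascii')
    error_name = None
except UnicodeEncodeError as exc:
    error_name = type(exc).__name__

print(error_name)

# Encoding with a codec that covers the character succeeds.
utf8_bytes = text.encode('utf-8')      # two bytes: 0xC3 0xA6
latin1_bytes = text.encode('latin-1')  # one byte: 0xE6
```

If this is what is happening, the fix is usually to encode explicitly with a codec that covers the characters in question rather than relying on an ASCII default.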
Feb 9th, 2007, 7:27 PM   #2
Dietrich
First of all, read up on Unicode in Python source files (PEP 263, on source code encodings) at:
http://www.python.org/peps/pep-0263.html

Also, here is a small code sample I wrote a while ago that might help you in your quest:
# In Python 2, a Unicode string literal is preceded by a 'u'.
# Depending on how Python was built, each character is stored in
# 16 or 32 bits. The full Unicode set contains Cyrillic, Chinese,
# Japanese and other language characters, which may not display
# correctly if the console font or encoding does not support them.
# Characters up to hexadecimal FF (the Latin-1 range) are a safe
# place to start experimenting.

# not much happens here ...
a = u'How are you?'
print a
# result is  How are you?

# here I used the Unicode escape \uxxxx, where xxxx is a four-digit
# hexadecimal number, to put in two typical Spanish characters
# 00bf = ¿  and  00f3 = ó  that are not on my US keyboard
a = u'\u00bfC\u00f3mo es usted?'
print a
# result is ¿Cómo es usted?

# these two are below hexadecimal 100 (Latin-1, not 7-bit ASCII),
# so the shorter \xNN escape works as well
a = u'\xbfC\xf3mo es usted?'
print a
# result is ¿Cómo es usted?
Just a note of caution: not all editors handle Unicode properly!
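PEP 263, linked above, covers declaring the encoding of the source file itself. A minimal sketch, assuming the editor really does save the file as UTF-8 (the variable names are illustrative):

```python
# -*- coding: utf-8 -*-
# The line above must be the first or second line of the file (PEP 263).
# It tells Python how the bytes of this source file are encoded.

typed = u'¿Cómo es usted?'                 # characters typed directly
escaped = u'\u00bfC\u00f3mo es usted?'     # same string written with escapes

# Both spellings should produce the same Unicode string.
same = (typed == escaped)
print(same)
```

If the editor saves the file in a different encoding than the one declared, the typed form can come out garbled, while the escaped form does not depend on the file's encoding.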
Feb 10th, 2007, 6:17 AM   #3
UnKnown X
That's not what I was asking for. I already know that and use it extensively in my programme; what I'm looking for is a way to get Unicode input from the user.

On a related note, though, I tried using some Cyrillic characters (not even outputting them on the display, simply storing them in a variable), let's say Б (or \ud091), but my IDE, ActivePython, gives me the following error:
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 1-5: truncated \uXXXX escape

Surely, this shouldn't happen. It won't let me run the programme, even. It just displays this when I run that error-check on the .py file.

Edit: I found the problem with the Cyrillic characters, which was a fairly silly typo.
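For reference, here is a short sketch of the two forms that are easy to mix up. Б is code point U+0411, while D0 91 is its two-byte UTF-8 encoding. Whether that mix-up was the typo mentioned above is an assumption, since the thread does not say. The sketch also reproduces the "truncated \uXXXX escape" message by compiling a literal with too few hex digits; the names are illustrative.

```python
# CYRILLIC CAPITAL LETTER BE is code point U+0411.
be = u'\u0411'
code_point = ord(be)

# Encoding it as UTF-8 yields the two bytes 0xD0 0x91.
be_utf8 = be.encode('utf-8')

# A \u escape needs exactly four hex digits. Compiling a literal with
# only three digits is rejected before the program can run at all.
try:
    compile("s = u'\\u041'", "<example>", "exec")
    compile_error = None
except (SyntaxError, UnicodeDecodeError) as exc:
    compile_error = str(exc)

print(hex(code_point))
print(compile_error)
```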

Last edited by UnKnown X; Feb 10th, 2007 at 7:02 AM.
Feb 11th, 2007, 7:16 AM   #4
pal
Quote:
Originally Posted by UnKnown X
That's not what I was asking for. I already know that and use it extensively in my programme, though I'm looking for a way to get unicode input from the user.
It's always a good idea to post your code if you're running into a problem. Others may misunderstand what you're asking for if you just babble in words.
Feb 11th, 2007, 8:04 AM   #5
Arevos
It may be that it's just ActivePython that's messing up.
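As for the original question, here is a hedged sketch of one approach. In a plain console under Python 2, raw_input() returns a byte string, which can be decoded to Unicode using the console's encoding. The helper name to_text and the UTF-8 fallback are assumptions made for illustration, and none of this rules out an IDE-specific problem.

```python
import sys

def to_text(raw, encoding=None):
    """Return console input as a Unicode string.

    to_text is a hypothetical helper, not part of the standard library.
    """
    if isinstance(raw, bytes):
        # Byte strings (what Python 2's raw_input() returns in a plain
        # console) must be decoded. UTF-8 is an assumed fallback for
        # when the console encoding is unknown.
        encoding = encoding or getattr(sys.stdin, 'encoding', None) or 'utf-8'
        return raw.decode(encoding)
    # Already text (Python 3's input() returns str), so pass it through.
    return raw

# Usage in an interactive Python 2 program would look like:
#     name = to_text(raw_input('Name: '))
sample = to_text(b'\xc3\xa6', 'utf-8')   # the UTF-8 bytes for U+00E6
print(repr(sample))
```

Passing the encoding explicitly, as in the sample call, is the safer choice whenever the console encoding is known, because sys.stdin.encoding can be None when input is piped or supplied by an IDE.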