Quote:
|
Originally Posted by Vb Programmer
most of the charactes may already be capitals, depending on what the user types in so i just need the non caps to be capital
|
Benoit answered this one in the first post in the thread, and I elaborated on it. You have two choices.
The first is to subtract from the character values. In ASCII, the uppercase letters are values 65 through 90 (decimal), and the lowercase letters are 97 through 122. The order is purely alphabetical, so 65 is 'A', 66 is 'B', and so on. Thus, if you subtract 32 from the value of a lowercase letter, it becomes the matching uppercase one. For example, 'h' is 104, and 104 - 32 = 72, which is 'H'.
My elaboration on Benoit's post was simply pointing out that you first need to make sure the character is a lowercase letter. If it's something else, subtracting 32 will turn it into a different character entirely (an 'A', value 65, would become '!', value 33), so you can do something like this (NASM-style syntax, but you can adapt it):
    mov cx, length      ; load the string length
    jcxz L3             ; guard against zero-length string; if this is the case, exit
    mov si, string      ; load the string start address
    mov al, 32          ; keep 32 in a register; mem, reg operations are more efficient than mem, imm8
L1:
    cmp byte [si], 97
    jb L2               ; go on to next iteration if [si] < 97, i.e. if it's < 'a'
    cmp byte [si], 122
    ja L2               ; go on to next iteration if [si] > 122, i.e. if it's > 'z'
    sub [si], al        ; lowercase letter: subtract 32 to make it uppercase
L2:
    inc si              ; point at next character
    loop L1             ; loop to process remaining characters
L3:                     ; all done now
If you're unsure of any of the asm mnemonics I used, you can look them up here.
The second option is to load a table (either 256 or 128 bytes in size) with the 'convert to' values. Then you do a lookup to convert a value into another value, or more specifically, use value A as an index into the table to fetch value B. In your specific case, you'd have each value equal its own index, except for the lowercase letter indices; these would be equal to their uppercase counterparts. Thus, index 10 would store the value 10, index 65 would hold 65, but index 97 would also hold 65. Get it?
Then you use the XLATB (table lookup translation) instruction, like so:
    mov cx, length      ; load the string length
    jcxz L2             ; guard against zero-length string; if this is the case, exit
    mov si, string      ; load the string start address
    mov bx, table       ; load the lookup table address (XLATB reads from DS:BX + AL)
L1:
    mov al, [si]        ; fetch the current byte of the string
    and al, 07Fh        ; limit it to ASCII range (0-127); only needed for a 128-byte table
    xlatb               ; transform it: AL = table[AL]
    mov [si], al        ; and write it back
    inc si              ; point at next character
    loop L1             ; loop to process remaining characters
L2:                     ; all done now
This one relies on you having the lookup table initialized (you should be able to figure out how to do this, as it's just a block of 128 or 256 bytes, and I explained it above), and is arguably faster than the first method, since there is less conditional branching. The only branch that runs on every character is the main loop itself, which branch prediction in modern CPUs handles well. In the first method the branches depend on each character's value, which is much harder to predict.
Note the "and al, 07Fh" line in there. It can be deleted if your table is 256 bytes in size, but if you want strict conformance with ASCII, and don't care about 'extended ASCII', you can cut the table to 128 bytes. If you do this, that line will prevent you from indexing past the end of your lookup buffer. Keep in mind that with the mask in place, any byte above 127 in the string gets folded into the 0-127 range when it is written back.
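If you get stuck on initializing the table, here is a rough sketch of one way to fill it at run time, in the same NASM-style syntax. It is only an illustration: the labels fill and store are names I made up, and it assumes table is a 256-byte buffer you have already reserved in your data segment (reachable through DS).

```nasm
    mov di, table       ; DS:DI -> start of the table (assumed to be reserved elsewhere)
    xor al, al          ; AL = current index, starting at 0
    mov cx, 256         ; one entry per possible byte value
fill:
    mov ah, al          ; default: the entry equals its own index
    cmp al, 97
    jb store            ; below 'a': keep the default
    cmp al, 122
    ja store            ; above 'z': keep the default
    sub ah, 32          ; lowercase letter: store its uppercase counterpart instead
store:
    mov [di], ah        ; write the entry
    inc di              ; point at the next entry
    inc al              ; next index (wraps to 0 after 255, but the loop ends there)
    loop fill
```

For the 128-byte variant, load 128 into CX instead of 256. You could also generate the table at assembly time rather than at run time, but the loop above keeps it close to the first routine, so the range check should look familiar.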
Quote:
|
Originally Posted by Vb Programmer
and how do i delete letters from a string, for example if i have a string called hello i want to delete the 'll'
|
This I will leave as an exercise for you, since I already wrote you two routines. You can use REP MOVSB to do this for you (bonus points if you optimize it to use MOVSW and MOVSD when appropriate).