PxPlus User Forum
Main Board => Discussions => Programming => Topic started by: Thomas Bock on August 19, 2020, 10:06:43 AM
-
A customer expects, for example, "B%C3BCchse" for the German word "Büchse". This looks like a URL-encoded UTF-8 character, so I gave it a try, but have had no luck so far. This is what I tried:
thing$="Büchse"
thingUTF8$=cvs(thing$,"ASCII:UTF8")
print hta(thingUTF8$), " OK"
thingURL$=cvs(thingUTF8$,"UTF8:URL")
print hta(thingURL$)," nOK"
print thingURL$
Is there a way to do it with CVS()?
-
"B%C3BCchse" is not the correct URL encoding for "Büchse". The correct encoding would be either "B%FCchse", keeping it ASCII, or "B%C3%BCchse", encoding UTF-8.
With CVS you can get "B%FCchse" by just doing CVS(thing$,"ASCII:URL"). See Mike's post for how to get the other format.
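As a cross-check on the byte values, here is a small sketch in Python (not PxPlus, purely to show which bytes each encoding produces). It assumes the single-byte form is ISO 8859-1, where "ü" is 0xFC, and uses the standard urllib.parse.quote function for the percent-encoding.

```python
from urllib.parse import quote

# "Büchse", written with an escape so the source file encoding does not matter
word = "B\u00fcchse"

# Percent-encode the ISO 8859-1 byte of "ü" (0xFC)
latin1_url = quote(word, encoding="latin-1")

# Percent-encode the two UTF-8 bytes of "ü" (0xC3 0xBC), one % per byte
utf8_url = quote(word, encoding="utf-8")

print(latin1_url)  # B%FCchse
print(utf8_url)    # B%C3%BCchse
```

Note that standard URL encoding always puts a % in front of every byte, which is why the customer's "B%C3BCchse" does not match either result.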
-
Thomas
Are you sure about what the customer expects?
If I convert the value you have first from ANSI (ISO 8859-1) to UTF8 then to URL encoding I get the following:
->thing$="Büchse"
->thingUTF8$=cvs(thing$,"ASCII:UTF8")
->print thingUTF8$
Büchse
->print cvs(thingUTF8$,"ASCII:URL")
B%C3%BCchse
That's awfully close to what you posted so is it possible in your example you missed the second %?
-
According to his specification, all Unicode characters must be written using the pattern %NNNN. There are several examples showing this.
That kind of encoding is new to me, too. Perhaps I can convince him to use Mike's approach.
-
The URL encoding was just my guess because of the leading "%".
I think I must encode/decode this myself, as CVS has no option for that kind of notation.
-
Generally you don't apply URL encoding to raw Unicode data but rather to UTF-8 data. Here is a bit of discussion on the subject, which generally recommends using UTF-8.
https://stackoverflow.com/questions/912811/what-is-the-proper-way-to-url-encode-unicode-characters
-
If they are not using this for a URL and need the non-standard %NNNN encoding, then yes, you would have to do it yourself. One possible way is to go through the string character by character and run CVS(chrstr$,"ASCII:UTF8") on each one. If the result differs from the input, append a % followed by the hex form of the result (e.g. via HTA(), as in your test) to the output string; if the result is the same, append the character as is.
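The character-by-character idea can be sketched as follows, again in Python rather than PxPlus, so treat it as an illustration of the logic only. The function names encode_percent_nnnn and decode_percent_nnnn are made up for this example, and it assumes the customer's %NNNN means "one % followed by the hex of all UTF-8 bytes of the character", which is what the "B%C3BCchse" sample suggests. How the specification treats characters that need three or four UTF-8 bytes, or a literal % in the data, is not known from this thread.

```python
def encode_percent_nnnn(text: str) -> str:
    """Write each non-ASCII character as % plus the hex of its UTF-8 bytes."""
    out = []
    for ch in text:
        utf8 = ch.encode("utf-8")
        if len(utf8) == 1:
            out.append(ch)  # plain ASCII: copy unchanged
        else:
            out.append("%" + utf8.hex().upper())  # e.g. "ü" -> "%C3BC"
    return "".join(out)


def decode_percent_nnnn(text: str) -> str:
    """Reverse of encode_percent_nnnn (assumes well-formed input).

    The UTF-8 lead byte tells how many bytes follow, so hex-looking
    letters right after the sequence (like the "c" in "chse") are
    not swallowed by mistake.
    """
    out = []
    i = 0
    while i < len(text):
        if text[i] != "%":
            out.append(text[i])
            i += 1
            continue
        lead = int(text[i + 1:i + 3], 16)
        if lead >= 0xF0:
            nbytes = 4
        elif lead >= 0xE0:
            nbytes = 3
        elif lead >= 0xC0:
            nbytes = 2
        else:
            nbytes = 1
        hex_run = text[i + 1:i + 1 + 2 * nbytes]
        out.append(bytes.fromhex(hex_run).decode("utf-8"))
        i += 1 + 2 * nbytes
    return "".join(out)


print(encode_percent_nnnn("B\u00fcchse"))  # B%C3BCchse
print(decode_percent_nnnn("B%C3BCchse"))
```

The decoder is the trickier half: because only one % introduces the whole sequence, it has to work out the length from the lead byte instead of just reading hex digits until they stop.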