[OPEN] UTF-8 characters in pathway scrambled!



lorezyra
3 Dec 2012, 12:59 AM
It would appear that Architect does not properly support UTF-8 pathways/filenames!

I click on publish in the IDE and this command prompt pops up.

As shown below, the wrong pathway is used.

Below, I typed in the correct pathway.

[Attachment 40505]

jarrednicholls
3 Dec 2012, 1:29 PM
Hi lorezyra,

Can you paste the path you are trying here into the forum thread so we can copy it verbatim for some tests?

Thanks.

Phil.Strong
3 Dec 2012, 1:30 PM
Just to verify: it's the 1st argument (project path) that's being "scrambled"?

Phil.Strong
3 Dec 2012, 4:44 PM
So I've looked into it further, and it turns out we're running a bat file from your temp directory.

Can you locate it? It will be called temp.bat and sits somewhere in the temp directory:
<userhome>\appData\local\temp\....\temp.bat

Note: this bat file is reused by other Architect workflows, so you may have to run the publish again to get the correct bat file.

Let's see if the path is wrong in that file.

lorezyra
3 Dec 2012, 6:28 PM
Here's the exact output:


C:\Program Files (x86)\SenchaArchitect>xcopy "E:\AxlBit.com\險育判\DataHotel\KesukomuTool" "C:\WebProjects\Axlbit.com\kesukomu" /exclude:C:\Users\lorezyra\AppData\Local\Temp\exclude.txt /s /e /y /d /i
ファイルが見つかりません - KesukomuTool
0 個のファイルをコピーしました

C:\Program Files (x86)\SenchaArchitect>xcopy "E:\AxlBit.com\計画\DataHotel\KesukomuTool" "C:\WebProjects\Axlbit.com\kesukomu" /exclude:C:\Users\lorezyra\AppData\Local\Temp\exclude.txt /s /e /y /d /i



To translate: 「ファイルが見つかりません - KesukomuTool」 -> "Cannot find file/folder - KesukomuTool"


Additionally, I'm using SA v2.1.0 Build 676 on Win7x64 (Japanese OS)

lorezyra
3 Dec 2012, 6:50 PM
Phil,
I found the batch file you are referring to. Here are its contents:


xcopy "E:\AxlBit.com\險育判\DataHotel\KesukomuTool" "C:\WebProjects\Axlbit.com\kesukomu" /exclude:C:\Users\lorezyra\AppData\Local\Temp\exclude.txt /s /e /y /d /i


Apparently, the text in the batch file is incorrect.

It seems like the bits of the Japanese kanji are getting truncated/shifted...

When I open the batch file in Notepad, it looks correct. However, when I use a different editor such as vim, it shows the same text I see in the command prompt. Apparently Notepad performs some autocorrection before display.


I originally thought the file was processed differently depending on the extension used (cmd vs bat). However, the reason it worked after renaming the file was Notepad: I used Notepad's _save as_ rather than the rename command at the command prompt, and Notepad fixed the issue. But SA still writes the text file with the wrong encoding.

Phil.Strong
10 Dec 2012, 7:55 AM
Oh man just seeing this. lorezyra I give you permission to ping me if I've taken longer than 24 hours. This got buried in a sea of emails last week.

Jarred Nicholls is looking into it.

jarrednicholls
10 Dec 2012, 8:45 AM
Hi lorezyra,

Sorry for the delayed reply, I'm subscribing to this thread so I can keep in touch with a faster turnaround time.

I believe the issue may be that cmd.exe is using an incorrect codepage, and thus will read the characters not as UTF-8 but as "something else", whatever that may be. There is no UTF-8 BOM at the beginning of the file, so it's not out of the question that the file is being decoded incorrectly/inconsistently by different editors or programs.
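
For what it's worth, the scrambled folder name in the output above is consistent with this hypothesis. Here is a minimal Python sketch, assuming the batch file holds plain UTF-8 bytes and that cmd.exe on the Japanese system decodes them with code page 932 (Shift-JIS); the thread does not confirm which code page is active.

```python
# Assumption: temp.bat is stored as UTF-8 and cmd.exe reads it as CP932.
correct = "計画"                          # folder name the user typed

utf8_bytes = correct.encode("utf-8")      # bytes as they would sit on disk
scrambled = utf8_bytes.decode("cp932")    # what a CP932 reader would display

# Hex dump of the folder name, similar to what a hex editor would show
print(utf8_bytes.hex(" "))

# Compare with the folder name shown in the failing xcopy command
print(scrambled == "險育判")

# Reversing the mis-decoding recovers the original text, which suggests
# the bytes are decoded improperly rather than corrupted on disk.
recovered = scrambled.encode("cp932").decode("utf-8")
print(recovered == correct)
```

If the attached file contains the same six bytes for that folder name, the content on disk is fine and only the decoding step is at fault.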

Can you attach the file to this thread here? I'd like to inspect the contents of it in hexadecimal form to be more confident in my assumption above. I just want to make sure the characters are not being corrupted on disk, but instead are being decoded improperly, and can further investigate the issue from there.

Thanks!

lorezyra
10 Dec 2012, 5:31 PM
Sure...
[Attachment 40692]

jarrednicholls
12 Dec 2012, 8:19 AM
Here is a copy of the temp.bat file, but with a UTF-8 BOM at the beginning of it. Try double-clicking this and see if it runs properly for you.

[Attachment 40773]

lorezyra
12 Dec 2012, 5:14 PM
Here's the screenshot of what I see when I run the script. (I extracted it to my c:\temp directory.)

[Attachment 40793]

As you can see, there is something before the xcopy command. Not sure how you inserted the BOM signature, but it appears not to be read properly by DOS.
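
A rough Python sketch of why extra characters would show up before xcopy: as far as I know, cmd.exe does not strip a UTF-8 BOM, so the three BOM bytes become part of the first command name. The command line below is a shortened, hypothetical stand-in for the real temp.bat contents.

```python
import codecs

# Hypothetical, shortened stand-in for the command written to temp.bat.
command = 'xcopy "E:\\src" "C:\\dst" /s /e /y /d /i\r\n'

# File contents with a UTF-8 BOM (EF BB BF) in front of the command
with_bom = codecs.BOM_UTF8 + command.encode("utf-8")

# A reader that does not skip the BOM sees it glued to the first token,
# so the command name is no longer exactly "xcopy".
first_token = with_bom.split(b" ", 1)[0]
print(first_token)
```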

lorezyra
17 Feb 2013, 9:47 PM
Hey guys, I'm still seeing this issue. I'm currently using SA v2.1.0 Build 678 on Win7x64 (Japanese OS).

Have you been able to determine the source of this issue? To me, it seems to be an encoding issue in how Windows handles the text/batch file.

My only work-around is to use ASCII text only and no kanji text in the folder/file names.

Phil.Strong
22 Feb 2013, 10:21 AM
Thanks for the report! I have opened a bug in our bug tracker.

lorezyra
24 Mar 2013, 3:26 AM
FYI, I have confirmed that this is also an issue on Mac OSX 10.8.3.


rsync does not understand UTF-8 characters!


Plus, SA does not escape the parentheses "(" and ")" in the pathway. This causes an error when trying to copy.
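
To illustrate the escaping part on OS X: a small Python sketch using the standard library's shlex.quote, which wraps a path in single quotes so a POSIX shell treats spaces, parentheses and non-ASCII characters literally. The path and the rsync invocation are made up for the example; this is not how SA builds its command.

```python
import shlex

# Hypothetical project path containing kanji, a space and parentheses.
path = "/Users/me/計画 (v2)/KesukomuTool"

# shlex.quote returns a string that is safe to paste into a POSIX shell line.
quoted = shlex.quote(path)

# Hypothetical rsync call; only the quoting of the source path matters here.
command_line = "rsync -a " + quoted + " /tmp/out/"

# Splitting the line the way a POSIX shell would gives back the original path.
parts = shlex.split(command_line)
print(parts[2] == path)
```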

Phil.Strong
25 Mar 2013, 10:35 AM
We investigated a fix using chcp 65001 on Windows, but this fails on Windows XP. I think the real fix would be to not write the file to disk at all and just run it as a concatenated command.

lorezyra
25 Mar 2013, 4:52 PM
Hey Phil,

That might work for Windows, but how about the MacBook?

Phil.Strong
27 Mar 2013, 12:05 PM
Yeah, this would work on the MacBook as well:

do stuff && do more stuff && ...
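
A variation on this idea, sketched in Python rather than as a single `&&` line: each command is passed as an argument list, so no batch file is written and no shell has to decode or escape the path. The function names are hypothetical, the rsync flags are an assumption, and the /exclude option from the original xcopy call is left out for brevity.

```python
import subprocess
import sys

def build_copy_command(src, dst):
    """Return an argument list for the copy step (no shell, no temp file)."""
    if sys.platform == "win32":
        # Same xcopy switches as in the thread, minus /exclude.
        return ["xcopy", src, dst, "/s", "/e", "/y", "/d", "/i"]
    # Assumed rsync equivalent for OS X; not taken from the thread.
    return ["rsync", "-a", src + "/", dst]

def run_all(commands):
    """Run commands in order and stop at the first failure, like `&&`."""
    for argv in commands:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return result.returncode
    return 0

if __name__ == "__main__":
    # Harmless demo: two no-op Python processes chained together.
    noop = [sys.executable, "-c", "pass"]
    print(run_all([noop, noop]))
```

Because the arguments never pass through a text file, the code page used by cmd.exe should no longer affect how the path is read.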