2009-07-26

How to display UTF-8 encoded files in the Windows console?

How to display UTF-8 encoded files in the Windows console?

When I run

get-content some-utf8-file.txt

the Unicode characters in that file display as gibberish.

I searched for information on the Windows console. It seems it can display Unicode, but I didn't find an exact solution. The characters in question are Chinese characters and some math symbols. They display fine in IE and Notepad.

The -Encoding option of Get-Content seems to apply to output only.

Thanks.

Xah
∑ http://xahlee.org/



Xah Lee wrote:
> How to display utf-8 encoded files in windows console?

On Jul 26, 12:37 pm, Larry__Weiss wrote:
> Maybe this article is related to that?
>
> http://blogs.msdn.com/powershell/archive/2006/12/11/outputencoding-to-the-rescue.aspx

On Jul 26, 2:57 pm, Joel Bennett wrote:
> You can't. The console can't display most unicode characters, even
> though you can do whatever you want with them in scripts.
>
> That's the main reason why the PowerShell team created ISE.
>
> I don't know, off the top of my head, what the level of Unicode support
> is in the third party consoles like PS+ or PowerGUI -- it should work ok
> in my PoshConsole though ;)

Thanks a lot to both! That solved the problem.

In summary, set:

$OutputEncoding = New-Object -typename System.Text.UTF8Encoding

Then, to display a UTF-8 file, I run

get-content -encoding utf8 myfile.txt
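
The two steps above can be collected into one short sketch. It is only an illustration: myfile.txt is a placeholder for any UTF-8 encoded file, and the last line, which inspects [Console]::OutputEncoding, is an addition that does not come from the thread.

```powershell
# Sketch only; myfile.txt is a placeholder name for a UTF-8 encoded file.

# $OutputEncoding sets the encoding PowerShell uses when it pipes text
# to native (external) programs.
$OutputEncoding = New-Object -TypeName System.Text.UTF8Encoding

# -Encoding tells Get-Content how to decode the bytes it reads from the file.
Get-Content -Encoding UTF8 myfile.txt

# Assumption, not from the thread: show the encoding the console host itself uses.
[Console]::OutputEncoding
```

Decoding the file correctly and having a console font that contains the needed glyphs are separate matters, so correct decoding alone may not be enough for every character to render.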

Now Chinese shows as squares instead of gibberish. Some math symbols show correctly, though.

After a little research, I tried setting the console's font to Lucida Console. Still no go, even though the file displays fine in Notepad, which uses Lucida Console.

But the Windows PowerShell ISE solves the problem! I'll be using ISE now.

A little puzzle, though. According to the blog
http://blogs.msdn.com/powershell/archive/2006/12/11/outputencoding-to-the-rescue.aspx

the author seems to display Chinese fine. Maybe the script output in the blog is from ISE too.

Xah
∑ http://xahlee.org/
