unicode - Another Encoding/Decoding in C# Issue
I've been working on a GUI for a CLI. Rather than write to a text file, I'm redirecting standard output and creating an object from the output that I can use and reuse within the GUI code. I have tried every possible solution I have come across, and have yet to resolve the issue. It looks as if each line in the return is encoded in two different ways. Here is what I have for the command line interface:
class CmdToolInteraction
{
    private static string returnString = null;

    public string RunCommandLine(string argumentString)
    {
        UnicodeEncoding uni = new UnicodeEncoding();
        Process proc;
        proc = new Process();
        proc.StartInfo.FileName = "cmd.exe";
        proc.StartInfo.Arguments = argumentString;
        proc.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
        proc.StartInfo.UseShellExecute = false;
        proc.StartInfo.RedirectStandardOutput = true;
        proc.StartInfo.RedirectStandardInput = true;
        proc.Start();
        StreamWriter cmdStreamWriter = proc.StandardInput;
        cmdStreamWriter.Write(argumentString);
        cmdStreamWriter.Close();
        returnString = uni.GetString(proc.StandardOutput.CurrentEncoding.GetBytes(proc.StandardOutput.ReadToEndAsync().Result));
        proc.WaitForExit();
        Console.Write(returnString);
        return returnString;
    }
}
Where I'm running into an issue is the output. Some of it is readable English, while the rest is gibberish/Chinese, i.e.
"someone@somewhere.net 䰀愀渀最甀愀最攀㨀 攀渀ഀ\n successഀ"
In some instances the entire line or return looks like the second half of the above, when I know there should be English alphanumerics instead.
halp!
Edit:
I updated the code above to add proc.StartInfo.StandardOutputEncoding = Encoding.Unicode
I still get the string "someone@somewhere.net 䰀愀渀最甀愀最攀㨀 攀渀ഀ\n successഀ"
but I know why now. The second part in this case is in big-endian Unicode, whereas the rest is little-endian. I'm now trying to figure out how to clean up the uninterpreted parts.
Edit #2: At Roeland's suggestion I took the Unicode output and tried to convert it to ASCII. Similar issue, but I feel I'm getting closer. "someone@somewhere.net 䰀愀渀最甀愀最攀㨀 攀渀ഀ\n successഀ"
now reads "someone@somewhere.net???????????????\n success??"
I have the decoding set up like this:
byte[] bytes = Encoding.ASCII.GetBytes(proc.StandardOutput.ReadToEnd());
returnString = Encoding.ASCII.GetString(bytes);
I think it's indeed an encoding issue. Here is a list of the bytes (part of the string), assuming the string is UTF-16 little endian. Look closely at the bytes around the newline:
119 'w' 0  104 'h' 0  101 'e' 0  114 'r' 0  101 'e' 0  46 '.' 0  110 'n' 0  101 'e' 0  116 't' 0
13 CR
10 LF 0
32 ' ' 0  32 ' ' 0  76 'L' 0  97 'a' 0  110 'n' 0  103 'g' 0  117 'u' 0
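At some point the UTF-16 byte stream was interpreted as ANSI text, and the newlines ("\n") were expanded to CR-LF pairs. The extra CR is a single byte, so every two-byte unit after it is shifted by one byte, which corrupts the UTF-16 string.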
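The effect is easy to reproduce in isolation. The following is only a sketch, not the code of the tool in question: the class and method names are made up for illustration, and ExpandLfToCrLf merely imitates what a text-mode writer would do to the raw bytes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical demo class; the names are illustrative only.
static class CrLfCorruptionDemo
{
    // Imitates a text-mode writer that treats the data as single-byte text
    // and inserts a CR byte (13) in front of every LF byte (10).
    public static byte[] ExpandLfToCrLf(byte[] input)
    {
        var output = new List<byte>(input.Length + 8);
        foreach (byte b in input)
        {
            if (b == 10)
            {
                output.Add(13);
            }
            output.Add(b);
        }
        return output.ToArray();
    }

    // Encodes text as UTF-16 LE, damages it the same way, and decodes it again.
    public static string Corrupt(string text)
    {
        byte[] utf16 = Encoding.Unicode.GetBytes(text);
        byte[] damaged = ExpandLfToCrLf(utf16);
        return Encoding.Unicode.GetString(damaged);
    }
}
```

Calling CrLfCorruptionDemo.Corrupt("a\nLa\n") gives 'a', U+0A0D, U+4C00 (䰀), U+6100 (愀), U+0D00 (ഀ) and '\n'. The first inserted CR shifts the following characters by one byte, and the second inserted CR shifts them back into alignment, which matches the pattern of readable and unreadable stretches in the question.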
The solution depends on how your program works. Do you need to run the program via the cmd command processor? If so, can you use the /U option? Otherwise, can you open the I/O streams in binary mode?
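If going through cmd is required, the /U route could look roughly like the sketch below. It is untested against the tool in the question, and CmdUnicodeLauncher, BuildStartInfo and Run are hypothetical names. Note that /U only makes cmd's internal commands write Unicode to a pipe or file. An external program started through cmd still decides its own output encoding, so this may not be enough on its own.

```csharp
using System.Diagnostics;
using System.Text;

// Hypothetical helper; the names are illustrative only.
static class CmdUnicodeLauncher
{
    // Builds start info for "cmd.exe /U /C <command>" and tells the
    // redirected reader to decode the output as UTF-16 little endian.
    public static ProcessStartInfo BuildStartInfo(string command)
    {
        return new ProcessStartInfo
        {
            FileName = "cmd.exe",
            Arguments = "/U /C " + command,
            UseShellExecute = false,
            RedirectStandardOutput = true,
            CreateNoWindow = true,
            StandardOutputEncoding = Encoding.Unicode
        };
    }

    // Windows only: runs the command and returns everything it wrote to stdout.
    public static string Run(string command)
    {
        using (Process proc = Process.Start(BuildStartInfo(command)))
        {
            // Read before waiting so a full pipe buffer cannot block the child.
            string output = proc.StandardOutput.ReadToEnd();
            proc.WaitForExit();
            return output;
        }
    }
}
```

A call such as CmdUnicodeLauncher.Run("dir") would be the simplest check, since dir is an internal command and should come back fully readable if the decoding side is set up correctly.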