Outputting UTF-8 encoded text in Java

Answers: 4 Bounty: 30

Resolved: 2021-04-27 18:06

Asked by user: 时间却是纷扰
2021-04-27 15:08

How do I output UTF-8 encoded text in Java?

Best answer

Level-2 knowledge expert, user: 狙击你的心
2021-04-27 15:40

response.setHeader("Content-type", "text/html;charset=UTF-8");
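The header above only declares the charset to the browser; the bytes written must actually be UTF-8 as well. Outside a servlet container, the same idea can be sketched with the standard library by naming the charset explicitly instead of relying on the platform default. The class and method names below are hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class Utf8OutputSketch {
    // Hypothetical helper: writes text to any OutputStream as UTF-8 bytes.
    static void writeUtf8(OutputStream out, String text) {
        try {
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            writer.write(text);
            writer.flush(); // flush, but leave the underlying stream open for the caller
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // "\u4ECA\u5929" is 今天, written as escapes so the source file encoding does not matter
        writeUtf8(buffer, "\u4ECA\u5929");
        System.out.println(buffer.size()); // two CJK characters, 3 bytes each
    }
}
```

The same helper could be pointed at System.out or a FileOutputStream; whether a console then displays the text correctly depends on the console's own encoding.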

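UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode. It was created by Ken Thompson and Rob Pike in 1992 and is now standardized as RFC 3629. The original design allowed sequences of up to 6 bytes, but RFC 3629 limits UTF-8 to 1 to 4 bytes per Unicode character. On a web page, it lets a single page display Simplified and Traditional Chinese together with other languages (such as English, Japanese, and Korean).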
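To see the variable length in practice, here is a minimal sketch that counts the UTF-8 bytes for a few sample characters. The class and method names are hypothetical; the sample characters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

class Utf8LengthSketch {
    // Number of bytes the given text occupies once encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // U+0041 (ASCII): 1 byte
        System.out.println(utf8Length("\u00E9"));       // U+00E9: 2 bytes
        System.out.println(utf8Length("\u4E2D"));       // U+4E2D: 3 bytes
        System.out.println(utf8Length("\uD83D\uDE00")); // U+1F600 (a surrogate pair in Java): 4 bytes
    }
}
```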

All answers

Reply 1, user: 哭不代表软弱
2021-04-27 17:59

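The UTF-8 encoding rules, and how to tell whether text is UTF-8 encoded.

The UTF-8 encoding rules are simple; there are only two:

1) For a single-byte symbol, the first bit of the byte is 0 and the remaining 7 bits hold the symbol's Unicode code point. For English letters, UTF-8 is therefore identical to ASCII.
2) For an n-byte symbol (n > 1), the first n bits of the first byte are 1 and bit n+1 is 0; the first two bits of every following byte are 10. All remaining bits hold the symbol's Unicode code point.

Based on these rules, here is a piece of Java code that guesses whether a byte array is UTF-8. It is a heuristic: it only scores two-byte and three-byte sequences, and it returns false for pure ASCII input even though ASCII is also valid UTF-8.

    public static boolean isutf8(byte[] rawtext) {
        int score = 0;
        int i, rawtextlen = 0;
        int goodbytes = 0, asciibytes = 0;
        // maybe also use utf8 byte order mark: ef bb bf
        // check to see if characters fit into acceptable ranges
        rawtextlen = rawtext.length;
        for (i = 0; i < rawtextlen; i++) {
            if ((rawtext[i] & (byte) 0x7f) == rawtext[i]) {
                // ASCII character: the highest bit is 0
                asciibytes++;
                // ignore ascii, can throw off count
            } else if (-64 <= rawtext[i] && rawtext[i] <= -33 // -0x40..-0x21, i.e. 0xC0..0xDF unsigned: two bytes
                    && i + 1 < rawtextlen
                    && -128 <= rawtext[i + 1] && rawtext[i + 1] <= -65) { // continuation byte 0x80..0xBF
                goodbytes += 2;
                i++;
            } else if (-32 <= rawtext[i] && rawtext[i] <= -17 // 0xE0..0xEF unsigned: three bytes
                    && i + 2 < rawtextlen
                    && -128 <= rawtext[i + 1] && rawtext[i + 1] <= -65
                    && -128 <= rawtext[i + 2] && rawtext[i + 2] <= -65) {
                goodbytes += 3;
                i += 2;
            }
        }
        if (asciibytes == rawtextlen) {
            // pure ASCII (or empty) input: nothing to score
            return false;
        }
        score = 100 * goodbytes / (rawtextlen - asciibytes);
        // if not above 98, reduce to zero to prevent coincidental matches
        // allows for some (few) bad formed sequences
        if (score > 98) {
            return true;
        } else if (score > 95 && goodbytes > 30) {
            return true;
        } else {
            return false;
        }
    }

In addition, some UTF-8 text files begin with the three bytes EF BB BF (the byte order mark), which identify the file as UTF-8. The approach below is usually not relied on, however, because many files do not carry this marker:

    public static boolean getbyteencode(byte[] b) {
        // a 3-byte array can already hold the whole marker, so test length >= 3
        if (b != null && b.length >= 3) {
            byte utf8[] = {(byte) 0xef, (byte) 0xbb, (byte) 0xbf};
            if ((b[0] == utf8[0]) && (b[1] == utf8[1]) && (b[2] == utf8[2]))
                return true;
        }
        return false;
    }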
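If a strict yes/no answer is wanted instead of a score, one alternative (not the method used in this reply) is to let the JDK's own UTF-8 decoder validate the bytes and report any malformed sequence. This is a minimal sketch; the class and method names are hypothetical. Unlike the heuristic above, it accepts pure ASCII, since ASCII is valid UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictUtf8Check {
    // Returns true only if the whole array decodes as well-formed UTF-8.
    static boolean isValidUtf8(byte[] raw) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)       // throw instead of substituting U+FFFD
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

Passing this check only means the bytes are well-formed UTF-8; short byte sequences from other encodings can occasionally pass by coincidence.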

Reply 2, user: 花一样艳美的陌生人
2021-04-27 17:29

    package test;

    import java.io.UnsupportedEncodingException;

    public class TestString {
        // Converts each byte to a two-digit hexadecimal string
        public static String byte2hex(byte[] b) {
            String hs = "";
            String stmp = "";
            for (int n = 0; n < b.length; n++) {
                // mask with 0xFF so negative bytes are shown as 00..FF
                stmp = (java.lang.Integer.toHexString(b[n] & 0xFF));
                if (stmp.length() == 1)
                    hs = hs + "0" + stmp;
                else
                    hs = hs + stmp;
                hs = hs + " ";
            }
            return hs.toUpperCase(); // convert to upper case
        }

        public static void main(String[] args) {
            String s = "今天天气不错";
            try {
                byte b[] = s.getBytes("UTF-8");
                System.out.print(TestString.byte2hex(b));
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
        }
    }

Does that work for you? Output: E4 BB 8A E5 A4 A9 E5 A4 A9 E6 B0 94 E4 B8 8D E9 94 99
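A slightly shorter variant of the same idea, offered as a sketch: passing StandardCharsets.UTF_8 (available since Java 7) instead of the string "UTF-8" avoids the checked UnsupportedEncodingException. The class and method names here are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class Utf8HexSketch {
    // Encodes the text as UTF-8 and renders each byte as two upper-case hex digits.
    static String toHex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' '); // separator between bytes, no trailing space
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\u4ECA\u5929" is 今天; prints E4 BB 8A E5 A4 A9
        System.out.println(toHex("\u4ECA\u5929"));
    }
}
```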

Reply 3, user: 猖狂的痴情人
2021-04-27 15:55

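Here is a hint:

    import java.net.URLEncoder;

    yourNewString = URLEncoder.encode(yourString, "UTF-8");

Likewise, to decode, change both of the above to Decoder (java.net.URLDecoder and URLDecoder.decode). This uses a built-in package and is the simplest way; give it a try first. Note that it produces the percent-encoded form used in URLs and form data.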
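As a rough illustration of the hint, the sketch below round-trips a string through URLEncoder and URLDecoder. The wrapper class and method names are hypothetical; the two JDK calls are the ones named in the reply.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UrlCodecSketch {
    // Percent-encodes the UTF-8 bytes of s (form encoding: a space becomes '+').
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform
            throw new IllegalStateException(e);
        }
    }

    // Reverses encode(): turns %XX escapes back into characters using UTF-8.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("\u4ECA\u5929 ok"); // "今天 ok"
        System.out.println(encoded);                 // %E4%BB%8A%E5%A4%A9+ok
        System.out.println(decode(encoded).equals("\u4ECA\u5929 ok")); // true
    }
}
```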
