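Python encoding problems
Garbled output in the Windows DOS console
A file saved as UTF-8 is interpreted as GBK on Windows, so printed text comes out garbled. The same happens when text read from a web page is printed in the DOS console: the console uses GBK, so UTF-8 bytes are displayed incorrectly or trigger errors.
Solution: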
print "大家好".decode('utf-8').encode('GBK')
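The one-liner above relies on Python 2, where a plain string literal is already a byte string. As a rough sketch of the same transcoding step in Python 3 syntax (the helper name `utf8_to_gbk` is hypothetical, not from the original), the UTF-8 bytes are first decoded to text and then re-encoded as GBK:

```python
def utf8_to_gbk(data):
    """Re-encode UTF-8 bytes as GBK bytes (hypothetical helper for illustration)."""
    text = data.decode("utf-8")   # bytes -> text
    return text.encode("gbk")     # text -> bytes in the console's encoding


utf8_bytes = "大家好".encode("utf-8")
gbk_bytes = utf8_to_gbk(utf8_bytes)

# Each of the three characters takes 3 bytes in UTF-8 and 2 bytes in GBK.
print(len(utf8_bytes), len(gbk_bytes))
```

The byte sequences differ for the same text, which is why UTF-8 bytes sent unchanged to a GBK console show up as garbage.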
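Another case: some editors (for example Notepad) insert an invisible BOM (0xEF 0xBB 0xBF) at the start of a file when saving it as UTF-8.
The codecs module can be used to handle it: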
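import codecs

# Read raw bytes so the BOM comparison works on the exact file content
filehandle = open("test.txt", 'rb')
content = filehandle.read()
filehandle.close()
# Drop the 3-byte UTF-8 BOM if the file starts with one
if content[:3] == codecs.BOM_UTF8:
    content = content[3:]
print content.decode("utf-8")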
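As a hedged alternative to slicing the first three bytes by hand, Python's built-in `utf-8-sig` codec drops a leading BOM while decoding. The sketch below is in Python 3 syntax, and `strip_utf8_bom` is a hypothetical helper name:

```python
import codecs


def strip_utf8_bom(data):
    """Return data without a leading UTF-8 BOM (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Simulate a file saved by an editor that prepends a BOM.
raw = codecs.BOM_UTF8 + "测试".encode("utf-8")

manual = strip_utf8_bom(raw).decode("utf-8")   # explicit check, as in the text
builtin = raw.decode("utf-8-sig")              # the codec strips the BOM itself
print(manual == builtin)
```

Both routes yield the same text; `utf-8-sig` also decodes input that has no BOM, so it can be applied unconditionally.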
PS: a BOM can be used to bypass some file content checks (xdcms 2015 code audit, challenge 4):
private function check_content($name) {
    if(isset($_FILES[$name]["tmp_name"])) {
        $content = file_get_contents($_FILES[$name]["tmp_name"]);
        if(strpos($content, "<?") === 0) {
            return false;
        }
    }
    return true;
}
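To see why the check above can be bypassed, here is a rough Python re-creation of its logic (the PHP original is the authoritative version; this port and its sample payloads are illustrative assumptions). `strpos(...) === 0` only matches when `<?` sits at byte offset 0, and a BOM pushes it to offset 3:

```python
import codecs


def check_content(content):
    """Python port of the PHP check: reject only if "<?" is at offset 0."""
    return content.find(b"<?") != 0


plain = b"<?php echo 1; ?>"
with_bom = codecs.BOM_UTF8 + plain

print(check_content(plain))      # False: the upload is rejected
print(check_content(with_bom))   # True: "<?" now starts at offset 3
```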
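No encoding declared in the file header
s = "测试"
print s
  File "/Users/l3m0n/study/program/python/code_study/test3.py", line 1
SyntaxError: Non-ASCII character '\xe6' in file /Users/l3m0n/study/program/python/code_study/test3.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
Python 2 assumes a source file is ASCII unless an encoding is declared, so the interpreter rejects the non-ASCII bytes in the string literal while parsing the file, before print even runs.
Solution:
# coding=utf-8
or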
#!/usr/bin/python
# -*- coding: utf-8 -*-
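The declaration forms above follow PEP 263. As a small illustration, the standard library's `tokenize.detect_encoding` reports which encoding the interpreter would pick from the first lines of a source file. This sketch assumes Python 3, where the fallback without a declaration is UTF-8 rather than ASCII; `declared_encoding` is a hypothetical helper name:

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python would use to read this source (hypothetical helper)."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Short form on the first line.
print(declared_encoding(b"# coding=gbk\ns = 1\n"))
# Emacs-style form on the second line, after a shebang.
print(declared_encoding(b"#!/usr/bin/python\n# -*- coding: utf-8 -*-\ns = 1\n"))
# No declaration at all: Python 3 falls back to UTF-8.
print(declared_encoding(b"s = 1\n"))
```

The declaration is only honoured on the first or second line of the file, which is why it is placed directly after the shebang.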
Errors when concatenating strings
# coding=utf-8
s = "测试" + u"一下"
print s
Traceback (most recent call last):
  File "/Users/l3m0n/study/program/python/code_study/test3.py", line 2, in <module>
    s = "测试" + u"一下"
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)
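The left operand is a byte string (str) containing Chinese characters, and the right operand is unicode. To concatenate them, Python 2 first converts the str to unicode by decoding it with the system default encoding, ASCII. Bytes in the range 0-127 decode fine, but any byte of 128 or above cannot be handled by ASCII, so an exception is raised.
Two ways to fix this:
1. Convert the str to unicode, decoding with the encoding the file was saved in (UTF-8 here): s = "测试".decode("utf-8") + u"一下"
2. Encode the unicode as UTF-8, so that both operands are str:
s = "测试" + u"一下".encode("utf-8")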
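The failure and both fixes can be sketched as follows. This is written in Python 3 syntax as an assumption, so the implicit ASCII decoding that Python 2 performs is spelled out by hand with `.decode("ascii")`; the variable names are illustrative:

```python
raw = "测试".encode("utf-8")   # byte string, like a Python 2 str literal
text = "一下"                   # text, like a Python 2 u"" literal

# What Python 2 does behind the scenes when the two are concatenated.
try:
    raw.decode("ascii") + text
    failed = False
except UnicodeDecodeError:
    failed = True               # byte 0xe6 is outside the ASCII range

# Fix 1: decode the bytes with the encoding they were really written in.
as_text = raw.decode("utf-8") + text

# Fix 2: encode the text so that both operands are bytes.
as_bytes = raw + text.encode("utf-8")

print(failed, as_text == as_bytes.decode("utf-8"))
```

Either way, the point is that both operands have the same type before they are joined, so no implicit conversion is needed.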
Problems with the default encoding
Traceback (most recent call last):
  File "/Users/l3m0n/study/program/python/code_study/mangzhu.py", line 14, in <module>
    print sqli(1);
UnicodeEncodeError: 'ascii' codec can't encode characters in position 275-281: ordinal not in range(128)
Solution:
import sys
reload(sys)  # restores setdefaultencoding, which site.py removes at startup
sys.setdefaultencoding('utf8')
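Changing the process-wide default affects every implicit conversion in the program. A commonly used alternative, sketched here as an assumption rather than as the original author's fix, is to encode explicitly where the text leaves the program. The example uses Python 3 syntax and a hypothetical helper `to_output_bytes`:

```python
def to_output_bytes(text, encoding="utf-8"):
    """Encode text explicitly instead of relying on the default codec."""
    return text.encode(encoding)


message = "结果: 测试"   # illustrative text, not from the original script

try:
    to_output_bytes(message, "ascii")      # mirrors the implicit ASCII encode
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

utf8_bytes = to_output_bytes(message)      # UTF-8 can represent this text
print(ascii_ok, utf8_bytes.decode("utf-8") == message)
```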
Source: utcz.com/z/388836.html