Character encoding in Python
1. In Python 2, the str type can be thought of as a binary block, i.e. a multibyte string
2. multibyte_str.decode("<multibyte_encode_method>") -> unicode
3. unicode_str.encode("<multibyte_encode_method>") -> multibyte_str (binary block)
4. Arguments passed to unicode_str methods should also be unicode, e.g. unicode_str.find("样本".decode("utf-8"))
5. The u prefix in source code produces a unicode string automatically (the #coding:*** declaration at the top of the source file determines how the multibyte bytes are converted to unicode)
6. Python's print sends a binary block to the console, and the console displays that block using the system's multibyte_encode_method
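The conversion rules in the list can be tried out in a short sketch. It is written to run on both Python 2.7 and Python 3, which is an assumption beyond the notes (they describe Python 2 only): byte strings carry the b prefix, text carries the u prefix, and the non-ASCII sample word is spelled with \u escapes so that no #coding line is needed. The variable names and the choice of UTF-8 and GBK are illustrative, not taken from the original.

```python
# Sketch of rules 2-4; names and codecs here are illustrative assumptions.

# A unicode string holding the two characters of the sample word.
unicode_str = u"\u6837\u672c"

# Rule 3: unicode -> multibyte (binary block). The bytes depend on the codec.
utf8_bytes = unicode_str.encode("utf-8")
gbk_bytes = unicode_str.encode("gbk")

# Rule 2: multibyte -> unicode, using the codec that produced the bytes.
assert utf8_bytes.decode("utf-8") == unicode_str
assert gbk_bytes.decode("gbk") == unicode_str

# Rule 4: decode first, then pass a unicode argument to a unicode method.
text = u"abc" + unicode_str
position = text.find(utf8_bytes.decode("utf-8"))
```

The same two characters occupy 6 bytes in UTF-8 and 4 bytes in GBK, which is why a binary block can only be decoded correctly when the encoding that produced it is known.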
REF
http://blog.sina.com.cn/s/blog_620c017e0100erh8.html
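Related documents:
This series of articles does not cover topics such as installing Python, nor is it meant to be a Python textbook. In fact, programmers do not need a thick introductory book for Python; having the Python documentation at hand is enough. You will find that when you try running a piece of code you think might work, Python behaves the way you expect.
The articles introduce two protagonists, Xiaocai and Xiaobai, both third-year university students who have taken courses in C and Java. This article ......
When Python processes non-ASCII text, the following error often appears: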
UnicodeDecodeError: 'ascii' codec can't decode byte 0x?? in position 1: ordinal not in range(128)
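0x?? is a byte value outside the ASCII range (128 or above). By default Python assumes the text is ASCII-encoded, so it cannot handle other encodings; Python's default encoding has to be set to the encoding that is actually needed.
One solution is ......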
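The solution in the original text is cut off, so the following is not a reconstruction of it. It is only a minimal sketch, under the assumption that the bytes are known to be UTF-8, of how the error arises and how decoding with an explicit codec avoids it. It runs on Python 2.7 and Python 3; the variable names are illustrative.

```python
# Sketch: reproduce the error, then avoid it by decoding explicitly.
# Assumption: the input bytes are UTF-8 (two Chinese characters here).
raw = u"\u4e2d\u6587".encode("utf-8")  # every byte in this block is >= 0x80

try:
    # Decoding as ASCII fails on the first byte outside the range 0-127.
    raw.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Naming the real encoding sidesteps the ASCII default entirely.
text = raw.decode("utf-8")
```

Decoding at the point where bytes enter the program keeps the fix local, without depending on any interpreter-wide default.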
To summarize, here are several ways to download a web page in Python
1
fd = urllib2.urlopen(url_link)
data = fd.read()
This is the most concise approach; it is, of course, also a GET request
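A slightly more defensive variant of the same two lines might look like the sketch below. The function names download and to_text, the 10-second timeout, and the fallback import for Python 3 are all assumptions added for illustration; the original uses only urllib2.urlopen and read. Since read returns a binary block, the sketch also decodes it, assuming the page encoding is known in advance.

```python
try:
    from urllib2 import urlopen  # Python 2, as in the text
except ImportError:
    from urllib.request import urlopen  # Python 3 equivalent


def download(url, timeout=10):
    """Fetch url with a plain GET request and return the raw bytes."""
    fd = urlopen(url, timeout=timeout)
    try:
        return fd.read()
    finally:
        fd.close()  # release the connection even if read() fails


def to_text(raw, encoding="utf-8"):
    """Decode a downloaded binary block; undecodable bytes become U+FFFD."""
    return raw.decode(encoding, "replace")
```

A call such as to_text(download(url_link), "gbk") would then yield a unicode string for a GBK-encoded page; the right encoding has to come from the page itself or its HTTP headers.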
2
Using the GET method
def GetHtmlSource(url):
    try:
        htmSource = ''
        ......
import urllib2
import time
import socket
from datetime import datetime
from thread_pool import *
def main():
    url_list = {"sina":"http://www.sina.com.cn",
                "sohu":"http://www.sohu.com",
                "yahoo":"http://www.yahoo.com",
                "xiaonei":"http://www.x ......