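Python Library Explained: Networking (2)
Yesterday I tried parsing web pages with the HTMLParser class, but the results were not ideal. Either way, I'll write down the process first, in the hope that later readers can build on it and solve the problems I ran into.
I wrote two solutions, though each only works for a specific website. Here I mainly cover parsing the BBC home page, www.bbc.co.uk, and the NetEase home page, www.163.com.
For the BBC:
This one is much simpler, probably because the page's encoding is fairly standard.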
import html.parser
import urllib.request
class parseHtml(html.parser.HTMLParser):
    def handle_starttag(self, tag, attrs):
        print("Encountered a {} start tag".format(tag))
    def handle_endtag(self, tag):
        print("Encountered a {} end tag".format(tag))
    def handle_charref(self,name):
        print("charref")
    def handle_entityref(self,name):
        print("entityref")
    def handle_data(self,data):
        print("data")
    def handle_comment(self,data):
        print("comment")
    def handle_decl(self,decl):
        print("decl")
    def handle_pi(self,decl):
        print("pi")
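# Start reading from here. The subclass above is simple: it just overrides the parent class's handler methods.
# Save the BBC page in binary write mode. This was covered in the previous post (http://blog.csdn.net/xiadasong007/archive/2009/09/03/4516683.aspx), so I won't repeat it.
file=open("bbc.html",'wb') #it's 'wb',not 'w'
url=urllib.request.urlopen("http://www.bbc.co.uk/")
while(1):
    line=url.readline()
    if len(line)==0:
        break
    file.write(line)
file.close() # flush the data to disk before the file is reopened for reading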
# Create a parser object
pht=parseHtml()
# For this site I open the file as 'utf-8', otherwise an error occurs. Other sites may not need this. utf-8 is a Unicode encoding.
file=open("bbc.html",encoding='utf-8',mode='r')
# Process the page with feed
while(1):
    line=
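The feed loop above can be sketched as follows. This is only a minimal illustration, not the original author's code. It assumes a small in-memory document in place of bbc.html, and both the `TagCollector` class and the `feed_lines` helper are hypothetical names introduced here. The parser records events in a list instead of printing them, so the result is easy to inspect.

```python
import html.parser
import io

class TagCollector(html.parser.HTMLParser):
    """Hypothetical subclass: records parser events instead of printing them."""
    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only text such as the newlines between tags.
        if data.strip():
            self.events.append(("data", data.strip()))

def feed_lines(parser, source):
    """Feed a text file-like object to the parser one line at a time."""
    for line in source:
        parser.feed(line)
    parser.close()  # process any text still buffered in the parser
    return parser

# Stand-in for open("bbc.html", encoding='utf-8', mode='r') (an assumption).
sample = io.StringIO("<html><body>\n<p>Hello</p>\n</body></html>\n")
collector = feed_lines(TagCollector(), sample)
print(collector.events)
```

Iterating over the file object replaces the `while(1)` / `readline()` pattern, and it works the same way on a real file opened in text mode with the right encoding.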