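Python Libraries in Detail: Networking (2)
Yesterday I tried parsing web pages with the HTMLParser class, but the results were not ideal. Either way, I am writing down the process first, in the hope that later readers can build on it and solve the problems I ran into.
I wrote two solutions, and each of them only works for one specific site. Here I mainly explain how I parsed the BBC homepage, www.bbc.co.uk, and the NetEase homepage, www.163.com.
For the BBC:
This one is much simpler, probably because the page's encoding is fairly standard.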
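Since the encoding is what makes one site easier than the other, here is a minimal sketch of one way to find out which encoding a page declares. The helper name declared_charset and the sample header values are my own assumptions for illustration; they do not come from either site. With urllib.request the response's headers object should offer the same get_content_charset() method, so the helper mainly shows the idea without needing a network connection.

```python
from email.message import Message


def declared_charset(content_type, default="utf-8"):
    """Return the charset named in a Content-Type header value, or `default`.

    `declared_charset` is a hypothetical helper, not part of any library.
    """
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present
    return msg.get_content_charset() or default


# Sample header values, made up for illustration
print(declared_charset("text/html; charset=GB2312"))
print(declared_charset("text/html"))
```

The returned name can then be passed as the encoding argument of open() or bytes.decode() instead of hard-coding one value per site.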
import html.parser
import urllib.request

class parseHtml(html.parser.HTMLParser):
    # Each handler only reports which kind of event the parser produced.
    def handle_starttag(self, tag, attrs):
        print("Encountered a {} start tag".format(tag))
    def handle_endtag(self, tag):
        print("Encountered a {} end tag".format(tag))
    # Note: handle_charref and handle_entityref are only called when the
    # parser is created with convert_charrefs=False (the default is True since Python 3.5).
    def handle_charref(self, name):
        print("charref")
    def handle_entityref(self, name):
        print("entityref")
    def handle_data(self, data):
        print("data")
    def handle_comment(self, data):
        print("comment")
    def handle_decl(self, decl):
        print("decl")
    def handle_pi(self, data):
        print("pi")

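# Start reading from here. The subclass above is simple: it just overrides the parent class's handler methods.
# Save the BBC page in binary write mode. This was covered in the previous post
# (http://blog.csdn.net/xiadasong007/archive/2009/09/03/4516683.aspx), so I won't repeat it.
file=open("bbc.html",'wb') #it's 'wb',not 'w'
url=urllib.request.urlopen("http://www.bbc.co.uk/")
while True:
    line=url.readline()
    if len(line)==0:
        break
    file.write(line)
file.close() # flush the saved page to disk before reopening it
# Create a parser object
pht=parseHtml()
# For this site I open the file as 'utf-8', otherwise an error occurs. Other sites may not need this. utf-8 is a Unicode encoding.
file=open("bbc.html",encoding='utf-8',mode='r')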
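# Process the page: feed
# (the source is cut off after "line="; the rest of this loop is an assumed completion)
while True:
    line=file.readline()
    if len(line)==0:
        break
    pht.feed(line)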
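
To see what the parser does without downloading anything, here is a minimal sketch that feeds a small in-memory page to an HTMLParser subclass. The TagCollector class, the parse_bytes helper and the sample markup are my own assumptions for illustration and are not the code from this post. It records events in a list instead of printing them, which makes the result easier to inspect.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records parser events instead of printing them."""

    def __init__(self):
        # convert_charrefs=True turns references such as &amp; into text
        # before handle_data is called
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only chunks
            self.events.append(("data", text))


def parse_bytes(raw, encoding="utf-8"):
    """Decode raw page bytes and return the list of recorded events."""
    parser = TagCollector()
    # errors="replace" keeps going when a byte does not fit the encoding
    parser.feed(raw.decode(encoding, errors="replace"))
    parser.close()
    return parser.events


# Made-up sample page standing in for a downloaded file
sample = b"<html><body><p>Hello &amp; welcome</p></body></html>"
events = parse_bytes(sample)
for event in events:
    print(event)
```

Decoding with errors="replace" is one possible way around the decode error mentioned above, at the cost of losing the characters that could not be decoded.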