[Nutch] Single-machine search on Linux with the Nutch 1.0 web front end
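Nutch's crawler and its search component are effectively two separate pieces: crawling can run as a MapReduce (M/R) job, but searching is not an M/R job. Search can work in two ways: either place the crawl data (also called the index data) on a local disk and search it there, or search the crawl data in HDFS directly.
This post describes how to use the nutch-1.0 web front end to search local crawl data: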
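(1) Nutch search can run independently of the Hadoop cluster. Copy the crawled data to any machine, install Tomcat on that machine, run the web front end that ships with Nutch, and configure it accordingly; search then works.
(2) Put the data directory produced by the command bin/nutch crawl -dir data -depth 3 -topN 5 into some local directory. (For a distributed crawl, the data is in HDFS; use the command "bin/hadoop dfs -copyToLocal data <local directory>" to copy the crawl data to a local directory.) For example, copy the generated data directory to /home/nutch/nutchinstall/crawltest/. (To be safe, make sure the directory path contains no spaces, since they may cause problems.)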
Note:
The data directory is the directory generated by the crawler. It contains these subdirectories: crawldb, index, indexes, linkdb, segments
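Before pointing the web front end at a copied crawl directory, it can help to confirm that the copy is complete. The following is a minimal sketch, assuming a POSIX shell; check_crawl_dir is a hypothetical helper name and is not part of Nutch. It checks the two things the text mentions: the path has no spaces, and the five listed subdirectories exist.

```shell
#!/bin/sh
# Sketch only: sanity-check a copied Nutch crawl directory.
# check_crawl_dir is a hypothetical helper, not a Nutch command.

check_crawl_dir() {
    dir="$1"

    # The post advises keeping spaces out of the path.
    case "$dir" in
        *" "*)
            echo "path contains a space: $dir" >&2
            return 1
            ;;
    esac

    # Subdirectories the post lists under the crawl output directory.
    for sub in crawldb index indexes linkdb segments; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing subdirectory: $dir/$sub" >&2
            return 1
        fi
    done
    return 0
}
```

With the example location from the text, a call might look like: check_crawl_dir /home/nutch/nutchinstall/crawltest/data && echo "crawl data looks complete"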
(3) Install Tomcat. Make sure the installation path contains no spaces. This is important: on Windows, a space in the path caused search to always return 0 results.
(4) Copy the web front end nutch-1.0.war from the Nutch home directory to /usr/program/apache-tomcat-6.0.18/webapps/ (the Apache Tomcat installation directory is /usr/program/apache-tomcat-6.0.18).
(5) Enter http://localhost:8080/nutch-1.0 in a browser; nutch-1.0.war is unpacked automatically.
(6) Configure the nutch-site.xml file in the web front end. After changing it, you must restart Tomcat (run /usr/program/apache-tomcat-6.0.18/bin/shutdown.sh, then startup.sh).
nutch-site.xml is in the directory /usr/program/apache-tomcat-6.0.18/webapps/nutch-1.0/WEB-INF/classes/.
Configure it as follows:
<property>
<name>http.agent.name</name> <!-- required; otherwise search returns no results -->
<value>nutch-1.0</value>
<description>HTTP 'User-Agent' request header.</description>
</property>
<property>
<name>http.robots.agents</name>
<value>nutch-1.0,*</value>
<description>The agent strings we'll look for in robots.txt files,
comma-separated, in decreasing order of precedence. You should
put the value of http.agent.name as the first agent name, and keep the
default * at the end of the li
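Steps (4) and (6) above can be sketched as two small shell functions. This is only an illustration under stated assumptions: TOMCAT_HOME defaults to the path used in this post, NUTCH_HOME is a hypothetical location for the Nutch home directory, and the pause between stop and start is an arbitrary value. Tomcat's stock scripts are named shutdown.sh and startup.sh.

```shell
#!/bin/sh
# Sketch only: deploy the Nutch WAR and restart Tomcat.
# TOMCAT_HOME default is the path from the post.
# NUTCH_HOME is a hypothetical path; adjust it to the real Nutch home.
TOMCAT_HOME="${TOMCAT_HOME:-/usr/program/apache-tomcat-6.0.18}"
NUTCH_HOME="${NUTCH_HOME:-/home/nutch/nutchinstall/nutch-1.0}"

# Step (4): copy the web front end into Tomcat's webapps directory.
deploy_war() {
    cp "$NUTCH_HOME/nutch-1.0.war" "$TOMCAT_HOME/webapps/"
}

# Step (6): restart Tomcat so an edited nutch-site.xml is picked up.
restart_tomcat() {
    sh "$TOMCAT_HOME/bin/shutdown.sh"
    # Arbitrary pause to let Tomcat stop (assumed value, tune as needed).
    sleep "${RESTART_DELAY:-5}"
    sh "$TOMCAT_HOME/bin/startup.sh"
}
```

A typical sequence would be deploy_war, then opening http://localhost:8080/nutch-1.0 once so the WAR is unpacked, then editing WEB-INF/classes/nutch-site.xml and calling restart_tomcat.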