Ò׽ؽØͼÈí¼þ¡¢µ¥Îļþ¡¢Ãâ°²×°¡¢´¿ÂÌÉ«¡¢½ö160KB

[Nutch] Single-machine search on Linux with the Nutch 1.0 web front end

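Nutch's crawler and its search are essentially two separate parts: crawling can run as a MapReduce (M/R) job, but search is not an M/R job. Search can work in two ways. The first is to put the crawl data (also called the index data) on a local disk and search it there. The second is to search the crawl data directly in HDFS.
This article describes how to use the nutch-1.0 web front end to search crawl data stored locally: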
(1) Nutch search does not depend on the Hadoop cluster. Copy the crawled data to any machine, install Tomcat on that machine, then deploy and configure the web front end bundled with Nutch, and search is ready to use.
(2) Put the data directory produced by the command bin/nutch crawl -dir data -depth 3 -topN 5 into a local directory. (For a distributed crawl the data is in HDFS; the command "bin/hadoop dfs -copyToLocal data <local directory>" copies it to a local directory.) For example, copy the generated data directory to /home/nutch/nutchinstall/crawltest/. (To be safe, make sure the directory path contains no spaces, since they may cause problems.)
Note:
data is the directory generated by the crawl. It contains these subdirectories: crawldb, index, indexes, linkdb, segments
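Before pointing the web front end at the copied data, it can help to confirm that the directory has the layout described above. The following is a minimal sketch, not part of Nutch: the function name check_crawl_dir is hypothetical, and it only checks the two things the article mentions (no spaces in the path, and the five subdirectories).

```shell
#!/bin/sh
# check_crawl_dir DIR (hypothetical helper, not shipped with Nutch):
# verify that DIR looks like the output of "bin/nutch crawl".
# Prints one line per problem and returns non-zero if any is found.
check_crawl_dir() {
    dir=$1
    status=0
    # The article advises against spaces anywhere in the path.
    case $dir in
        *" "*) echo "path contains a space: $dir"; status=1 ;;
    esac
    # Subdirectories the article lists for a finished crawl.
    for sub in crawldb index indexes linkdb segments; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing subdirectory: $sub"
            status=1
        fi
    done
    return $status
}
```

With the example path from step (2), the check would be run as check_crawl_dir /home/nutch/nutchinstall/crawltest/data; no output and a zero exit status mean the layout matches.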
(3) Install Tomcat. Make sure the installation path contains no spaces. This matters: on Windows, a space in the path caused every search to return 0 results.
(4) Copy the web front end nutch-1.0.war from the Nutch home directory to /usr/program/apache-tomcat-6.0.18/webapps/ (Tomcat is installed in /usr/program/apache-tomcat-6.0.18).
(5) Open http://localhost:8080/nutch-1.0 in a browser; nutch-1.0.war is unpacked automatically.
(6) Edit the nutch-site.xml file of the web front end. After changing it you must restart Tomcat (run /usr/program/apache-tomcat-6.0.18/bin/shutdown.sh, then startup.sh).
nutch-site.xml is in the directory /usr/program/apache-tomcat-6.0.18/webapps/nutch-1.0/WEB-INF/classes/.
Configure it as follows:
<property>
  <name>http.agent.name</name>   <!-- required; otherwise the search returns no results -->
  <value>nutch-1.0</value>
  <description>HTTP 'User-Agent' request header.</description>
</property>
<property>
  <name>http.robots.agents</name>
  <value>nutch-1.0,*</value>
  <description>The agent strings we'll look for in robots.txt files,
  comma-separated, in decreasing order of precedence. You should
  put the value of http.agent.name as the first agent name, and keep the
  default * at the end of the li
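
Since a missing http.agent.name leaves the search without results, it may be worth checking the deployed file before restarting Tomcat. A minimal sketch, assuming grep is available; the function name has_agent_name is hypothetical, and the check only looks for the name element on a single line, not for a valid value.

```shell
#!/bin/sh
# has_agent_name FILE (hypothetical helper): succeed if FILE contains
# a <name>http.agent.name</name> element on one line.
has_agent_name() {
    grep -q '<name>http\.agent\.name</name>' "$1"
}
```

For the installation described above it could be called as has_agent_name /usr/program/apache-tomcat-6.0.18/webapps/nutch-1.0/WEB-INF/classes/nutch-site.xml || echo "http.agent.name is not set".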

