
Indexing HTML Documents with Lucene (深未来技术)


1. Most web documents are in HTML format.
2. This example uses the following HTML document:
<html>
   <head>
      <title>
         Laptop power supplies are available in First class only
      </title>
   </head>
   <body>
      <h1>code,write,fly</h1>
   </body>
</html>
3. Using JTidy
JTidy is the Java version of Tidy, written by Andy Quick.
import java.io.InputStream;

import org.apache.lucene.document.Field;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.w3c.tidy.Tidy;

// DocumentHandler and DocumentHandlerException are not shown in this article.
public class JTidyHTMLHandler implements DocumentHandler {

   // Takes an InputStream representing the HTML document.
   public org.apache.lucene.document.Document getDocument(InputStream is)
         throws DocumentHandlerException {
      Tidy tidy = new Tidy();
      tidy.setQuiet(true);
      tidy.setShowWarnings(false);
      // Parse the InputStream representing the HTML document.
      org.w3c.dom.Document root = tidy.parseDOM(is, null);
      Element rawDoc = root.getDocumentElement();

      org.apache.lucene.document.Document doc =
            new org.apache.lucene.document.Document();
      String title = getTitle(rawDoc); // get the title
      String body = getBody(rawDoc);   // get all elements between <body> and </body>
      if ((title != null) && (!title.equals(""))) {
         doc.add(Field.Text("title", title));
      }
      if ((body != null) && (!body.equals(""))) {
         doc.add(Field.Text("body", body));
      }
      return doc;
   }

   protected String getTitle(Element rawDoc) {
      if (rawDoc == null) {
         return null;
      }

      String title = "";
      NodeList children = rawDoc.getElementsB
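
The listing above is cut off inside getTitle, so the rest of the handler is not available here. As a rough illustration only (not a reconstruction of the original code), the sketch below shows one way title and body text could be pulled from a DOM tree with the standard org.w3c.dom API. The class name DomTextExtractor and the parse helper are hypothetical, and the JDK's built-in XML parser stands in for JTidy so the example runs without extra libraries. It therefore accepts only well-formed markup such as the sample document in step 2. JTidy's DOM implementation may not support getTextContent, in which case the text nodes would have to be walked by hand.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the article's JTidyHTMLHandler.
public class DomTextExtractor {

    // Returns the trimmed text of the first <title> element, or "" if there is none.
    public static String getTitle(Element rawDoc) {
        if (rawDoc == null) {
            return null;
        }
        NodeList titles = rawDoc.getElementsByTagName("title");
        if (titles.getLength() == 0) {
            return "";
        }
        return titles.item(0).getTextContent().trim();
    }

    // Returns all text under the first <body> element with whitespace runs collapsed.
    public static String getBody(Element rawDoc) {
        if (rawDoc == null) {
            return null;
        }
        NodeList bodies = rawDoc.getElementsByTagName("body");
        if (bodies.getLength() == 0) {
            return "";
        }
        return bodies.item(0).getTextContent().trim().replaceAll("\\s+", " ");
    }

    // Assumption: the input is well-formed XML; the JDK parser replaces JTidy here.
    public static Element parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed", e);
        }
    }
}
```

Calling DomTextExtractor.parse on the sample document and passing the result to getTitle and getBody yields the two strings that the handler would then add to the Lucene document as the "title" and "body" fields.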


Related documents:

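Implementing simple HTML compression

PageReleaser needed an HTML compression algorithm. After a long search on Google, it turned out that if the goal is only to strip whitespace and comments, XLinq makes this easy. First, here is what MSDN says: a common scenario is to read indented XML, create an in-memory XML tree without any whitespace text nodes (that is, without preserving whitespace), perform some operations on the XML, and then save the XML with indentation. When serializing formatted XML ......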
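
The excerpt above refers to XLinq in .NET, and its own code is not shown. The same idea (load the markup into a tree, drop whitespace-only text nodes and comments, then serialize without indentation) can be sketched in Java with the JDK's DOM and transformer classes. The class name MarkupMinifier is hypothetical, and the sketch assumes the input is well-formed XML or XHTML rather than arbitrary HTML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: a Java stand-in for the XLinq approach described above.
public class MarkupMinifier {

    // Recursively removes comment nodes and whitespace-only text nodes.
    static void strip(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember the sibling before any removal
            short type = child.getNodeType();
            boolean blankText = type == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty();
            if (type == Node.COMMENT_NODE || blankText) {
                node.removeChild(child);
            } else {
                strip(child);
            }
            child = next;
        }
    }

    // Parses the markup, strips it, and serializes it without indentation.
    public static String minify(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            strip(doc);
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.setOutputProperty(OutputKeys.INDENT, "no");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
    }
}
```

Text nodes that contain anything other than whitespace are left untouched, so spacing inside running text is preserved. Only the indentation between elements and the comments are removed.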

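Getting the absolute path of an uploaded file in HTML

When a file is uploaded in HTML with input type="file", the code can obtain only the file name. Some special requirements, however, call for the absolute path of the uploaded file, so we use JavaScript to obtain it.
The details are as follows.
Page code (only the key code is pasted):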
<form name="thisform" method="post"
action="<%=request.getContextPath()%>/movi ......

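[HTML Editor] An HTML editor written in C#: principles

Author: 光脚丫思考 Date: 12/23/2009 1:51:00 PM
At first I assumed that an HTML editor was something deeply arcane, and that casually putting one together would not be easy. Later I learned a little about the WebBrowser control, though only superficially. I knew only that this control lets you build something like a web browser into your own program. It never occurred to me that an HTML editor could also be implemented with this control ......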

HTML DOM Window object


Window object
The Window object is the top-level object in the JavaScript hierarchy.
The Window object represents a browser window or a frame.
A Window object is created automatically each time a <body> or <frameset> appears.
See the detailed description of the Window object.
IE: Internet Explorer, F: Firefox, O: Opera.
Window object collections
Collection   Description   IE   F   O
fr ......

HTML escape characters

Common HTML symbols:
Displays a space    &nbsp;    &#160;
<    less than       &lt;      &#60;
>    greater than    &gt;      &#62;
&    ampersand       &amp;     &#38;
"    double quote    &quot;    &#34;
Other commonly used character entities
Result    Description    Entity Name    Entity Number
©    copyright    &copy;    &#169;
®    registered trademark ......
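
The first table above can be turned into a small escaping routine. The following is a minimal sketch that covers only the five symbols listed under "Common HTML symbols". The class name HtmlEscaper is hypothetical, and mapping the non-breaking space character (U+00A0) to &nbsp; is an assumption about how the "space" row would be applied.

```java
// Hypothetical helper covering only the five entities in the table above.
public class HtmlEscaper {

    // Replaces each special character with its named entity.
    public static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                case '\u00A0': out.append("&nbsp;"); break; // non-breaking space
                default: out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because the input is scanned character by character in a single pass, an ampersand produced by an earlier replacement is never escaped a second time.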
© 2009 ej38.com All Rights Reserved.