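Using Java HTML Parser
I recently needed the Java HTML Parser library for a project and studied how to use it. These notes record the key points.
Main classes:
1. Parser class
The main parser class. It loads HTML source and parses it.
2. Node interface
Represents a syntactic unit produced during parsing. Take the following HTML fragment as an example: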
<span> ----Tag node
text ----Text Node
</span>
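The text and the tag are separate nodes. The text node is a child node of the span tag.
3. NodeFilter
The node filter interface. It is used to pick out nodes of a particular kind from a Parser or a NodeList.
4. NodeList
A data structure that represents a collection of Node objects.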
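Points that need special attention:
Both Parser and NodeList have a method named extractAllNodesThatMatch(NodeFilter filter) that returns the nodes matching a condition, but the two are implemented differently.
Parser implements it with an iterator on top of the parsing machinery. After each call you must call reset(), otherwise the result of the next call is affected.
NodeList loops over its internal array and tests each element, so separate calls do not interfere with each other. It is also more efficient than the Parser version, so it is the recommended choice.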
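The difference between the two approaches can be sketched with plain Java collections. This is not HTML Parser code: CursorSource and ArrayBackedList are hypothetical stand-ins, and the sketch assumes that the effect of a missing reset() is a read position left at the end of the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a parser that reads its input through a forward-only cursor.
class CursorSource {
    private final List<String> items;
    private int position = 0; // advances as items are read and is never rewound implicitly

    CursorSource(List<String> items) {
        this.items = new ArrayList<>(items);
    }

    // Reads from the current cursor position to the end, keeping matching items.
    List<String> extractAll(Predicate<String> filter) {
        List<String> matches = new ArrayList<>();
        while (position < items.size()) {
            String item = items.get(position++);
            if (filter.test(item)) {
                matches.add(item);
            }
        }
        return matches;
    }

    // Rewinds the cursor so the next extraction starts from the beginning again.
    void reset() {
        position = 0;
    }
}

// Hypothetical stand-in for an in-memory list that is filtered by a plain loop.
class ArrayBackedList {
    private final List<String> items;

    ArrayBackedList(List<String> items) {
        this.items = new ArrayList<>(items);
    }

    // Loops over the stored items; no state survives between calls.
    List<String> extractAll(Predicate<String> filter) {
        List<String> matches = new ArrayList<>();
        for (String item : items) {
            if (filter.test(item)) {
                matches.add(item);
            }
        }
        return matches;
    }
}

class ResetDemo {
    public static void main(String[] args) {
        List<String> tags = Arrays.asList("div", "span", "div");
        Predicate<String> isDiv = "div"::equals;

        CursorSource source = new CursorSource(tags);
        System.out.println(source.extractAll(isDiv).size()); // 2
        System.out.println(source.extractAll(isDiv).size()); // 0: cursor is already at the end
        source.reset();
        System.out.println(source.extractAll(isDiv).size()); // 2 again after reset

        ArrayBackedList list = new ArrayBackedList(tags);
        System.out.println(list.extractAll(isDiv).size()); // 2
        System.out.println(list.extractAll(isDiv).size()); // 2: calls are independent
    }
}
```

The cursor-based version carries state from one call to the next, which is why a rewind is needed between extractions. The list-based version keeps no such state.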
Code example:
Implementing getElementById
<code>
import org.htmlparser.Node;
import org.htmlparser.NodeFilter;
import org.htmlparser.Tag;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

// Accepts the opening tag whose "id" attribute equals the given id.
public class NodeIDFilter implements NodeFilter {
    private String id;

    public NodeIDFilter(String id) {
        this.id = id;
    }

    public boolean accept(Node node) {
        // Only tag nodes carry attributes; end tags such as </span> are skipped.
        if (node instanceof Tag) {
            Tag tag = (Tag) node;
            if (!tag.isEndTag()) {
                String s = tag.getAttribute("id");
                if (s != null)
                    return s.equals(this.id);
            }
        }
        return false;
    }
}

public class MHTMLParser
{
    ....
    protected Node getElementById(String id) throws ParserException
    {
        // Filtering the NodeList needs no parser reset between calls.
        if (this.mNodeList == null || this.mNodeList.size() == 0) return null;
        NodeIDFilter nodef = new NodeIDFilter(id);
        // true = also search child nodes recursively
        NodeList nl = this.mNodeList.extractAllNodesThatMatch(nodef, true);
        if (nl.size() != 0)
        {
            return nl.elementAt(0);
        }
        return null;
    }
}
</code>
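The second argument passed to extractAllNodesThatMatch in getElementById is true, which asks NodeList to search child nodes as well as the top-level ones. What that flag changes can be sketched without the library. SimpleNode and IdSearch below are hypothetical stand-ins written for this illustration; they are not HTML Parser classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tag node: an optional id plus child nodes.
class SimpleNode {
    final String name;
    final String id; // null when the tag has no id attribute
    final List<SimpleNode> children = new ArrayList<>();

    SimpleNode(String name, String id) {
        this.name = name;
        this.id = id;
    }

    // Appends a child and returns this node so calls can be chained.
    SimpleNode add(SimpleNode child) {
        children.add(child);
        return this;
    }
}

class IdSearch {
    // Returns the first node whose id equals the given id, or null if none is found.
    // With recursive == false only the given nodes are checked; with true their
    // descendants are searched as well (depth-first, in document order).
    static SimpleNode findById(List<SimpleNode> nodes, String id, boolean recursive) {
        for (SimpleNode node : nodes) {
            if (id.equals(node.id)) {
                return node;
            }
            if (recursive) {
                SimpleNode found = findById(node.children, id, true);
                if (found != null) {
                    return found;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Models <div id="outer"><span id="inner"></span></div>
        SimpleNode root = new SimpleNode("div", "outer").add(new SimpleNode("span", "inner"));
        List<SimpleNode> topLevel = new ArrayList<>();
        topLevel.add(root);

        System.out.println(findById(topLevel, "inner", false));     // null: top level only
        System.out.println(findById(topLevel, "inner", true).name); // span
    }
}
```

Without the recursive search, an id lookup would only ever match elements at the top level of the list, which is rarely what a getElementById-style helper wants.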