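How to remove redundant code after converting Word to HTML
I have tens of thousands of HTML files that were converted from Word, but the conversion turned .doc files of around 100 KB into HTML files of several MB, sometimes tens of MB.
It turns out the conversion to HTML generates a large amount of redundant code. Is there a way to strip out this junk?
I need program code.
I had run out of points earlier, but I have some again now, so I can award extra points.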
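// requires: using System.Collections.Specialized; (for StringCollection)
/// <summary>
/// Cleans up the redundant HTML generated by Word
/// </summary>
/// <param name="html">The HTML text exported from Word</param>
/// <returns>The HTML text with the redundant markup removed</returns>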
public static string CleanWordHtml(string html)
{
StringCollection sc = new StringCollection();
// get rid of unnecessary tag spans (comments and title)
sc.Add(@"<!--(\w|\W)+?-->");
sc.Add(@"<title>(\w|\W)+?</title>");
// Get rid of classes and styles
sc.Add(@"\s?class=\w+");
sc.Add(@"\s+style='[^']+'");
// Get rid of unnecessary tags
//sc.Add(@"
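Since the listing above breaks off at this point, here is a minimal self-contained sketch of the same regex-replace idea. It is not the original poster's full pattern list: the class name `WordHtmlCleaner`, the method name `Clean`, and every pattern below are assumptions chosen for illustration, and regex-based cleanup of HTML is only approximate (for example, it could also strip text that merely looks like a `class=` attribute).

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper: the names and patterns here are illustrative
// assumptions, not the truncated original's list.
public static class WordHtmlCleaner
{
    private static readonly string[] Patterns =
    {
        @"<style[^>]*>[\s\S]*?</style>",          // embedded style sheets
        @"<!--[\s\S]*?-->",                       // comments, incl. conditional comments
        @"<title>[\s\S]*?</title>",               // document title
        @"</?o:[^>]*>",                           // Office namespace tags such as <o:p>
        @"\s+class=(""[^""]*""|'[^']*'|[^\s>]+)", // class attributes (quoted or bare)
        @"\s+style=(""[^""]*""|'[^']*')",         // inline style attributes (quoted)
    };

    // Applies each pattern in turn and deletes whatever it matches.
    public static string Clean(string html)
    {
        foreach (string pattern in Patterns)
        {
            html = Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
        }
        return html;
    }
}
```

For a batch of files, each one could be read with `File.ReadAllText`, passed through `WordHtmlCleaner.Clean`, and written back with `File.WriteAllText`. The right text encoding depends on how the files were exported, so it is worth testing on copies first.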
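Related Q&A:
I'm building a website in ASP and want to generate static HTML. Once the files are generated, how do I call them?
For example, the page I currently call is http://192.168.0.100/jdasp/x.asp?cnmai=1795 , and it generates the file x1795.html,
how do I, when calling x ......
Example: run the line of code below through a loop 10 times so that the resulting page source shows 10 rows, with strID followed by the loop count, e.g. strID1, strID2, strID3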
<tr>
&nbs ......
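As the title says:
Can the title of a web page be changed dynamically?
What kind of page is it? If it is plain HTML, it may be more complicated.
Yes, as long as the page can run server-side scripts.
It is just a plain HTML page.
JavaScript
Right, so it has to be done with JavaScript. How do I implement that?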
<sc ......
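Phones can open .html websites, so why build a WAP site at all? What is the benefit of browsing a WAP site on a phone?
The WAP site our company built is in HTML.
Following.
Many low-end phones can still only display WML. WML was designed specifically as a page display language for mobile phones ......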
C# code:
protected void Button1_Click(object sender, EventArgs e)
{
string str = HttpContext.Current.Server.MapPath("/WebSite1");
str += @"\index.htm";
......