Python word
超群.com blog
Converting Office Word files to HTML with Python
ÕâÀï²âÊԵĻ·¾³ÊÇ£ºwindows xp,office 2007,python 2.5.2,pywin32 build
213£¬ÔÀíÊÇÀûÓÃwin32com½Ó¿ÚÖ±½Óµ÷ÓÃoffice
API£¬ºÃ´¦ÊǼòµ¥¡¢¼æÈÝÐԺã¬Ö»ÒªofficeÄÜ´¦ÀíµÄ£¬python¶¼¿ÉÒÔ´¦Àí£¬´¦Àí³öÀ´µÄ½á¹ûºÍoffice wordÀïÃæ“Áí´æÎª”Ò»Ö¡£
#!/usr/bin/env python
#coding=utf-8
from win32com import client as wc

# Start Word through its COM automation interface.
word = wc.Dispatch('Word.Application')
doc = word.Documents.Open('d:/labs/math.doc')
# 8 is the numeric value of wdFormatHTML.
doc.SaveAs('d:/labs/math.html', 8)
doc.Close()
word.Quit()
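The key line is doc.SaveAs('d:/labs/math.html', 8). Many articles online write it as doc.SaveAs('d:/labs/math.html', win32com.client.constants.wdFormatHTML), which fails immediately with:
AttributeError: class Constants has no attribute 'wdFormatHTML'
This typically happens because win32com.client.constants is only populated once makepy support for the Word type library has been generated, so passing the numeric value avoids the problem.
Of course, you can also use the code above to convert a Word file to any other format that Office 2007 supports; for example, to convert a Word file to PDF, just change 8 to 17. Below is the complete table of file format values supported by Office 2007: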
wdFormatDocument = 0
wdFormatDocument97 = 0
wdFormatDocumentDefault = 16
wdFormatDOSText = 4
wdFormatDOSTextLineBreaks = 5
wdFormatEncodedText = 7
wdFormatFilteredHTML = 10
wdFormatFlatXML = 19
wdFormatFlatXMLMacroEnabled = 20
wdFormatFlatXMLTemplate = 21
wdFormatFlatXMLTemplateMacroEnabled = 22
wdFormatHTML = 8
wdFormatPDF = 17
wdFormatRTF = 6
wdFormatTemplate = 1
wdFormatTemplate97 = 1
wdFormatText = 2
wdFormatTextLineBreaks = 3
wdFormatUnicodeText = 7
wdFormatWebArchive = 9
wdFormatXML = 11
wdFormatXMLDocument = 12
wdFormatXMLDocumentMacroEnabled = 13
wdFormatX
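To avoid scattering bare numbers such as 8 and 17 through a script, the values from the table can be kept in a small lookup. The following is only a sketch: the mapping from file extensions to formats and the helper name save_format_for are assumptions made for illustration, while the numeric values are taken from the table above.

```python
import os

# Numeric values copied from the table above. The choice of file
# extension for each format is an assumption made for this sketch.
WD_SAVE_FORMATS = {
    '.doc': 0,    # wdFormatDocument
    '.txt': 2,    # wdFormatText
    '.rtf': 6,    # wdFormatRTF
    '.html': 8,   # wdFormatHTML
    '.mht': 9,    # wdFormatWebArchive
    '.docx': 12,  # wdFormatXMLDocument
    '.pdf': 17,   # wdFormatPDF
}


def save_format_for(path):
    """Return the save format value matching the extension of path."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return WD_SAVE_FORMATS[ext]
    except KeyError:
        raise ValueError('no save format known for %r' % ext)
```

With this helper, the call in the script above could be written as doc.SaveAs(target, save_format_for(target)), where target is the output path.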
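Related documents:
I ran into trouble today:
When running Python code from Eclipse to insert data into a sqlite database, I kept getting an encoding error. Setting the Eclipse workspace to utf-8 encoding did not help.
Yet the same program, written and saved in Eclipse, ran fine when opened in IDLE, which was strange.
I suspected a configuration problem and went through every setting, including the project files, but found nothing wrong.
Finally I opened it with xvi32 ......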
When we create a file like this:
f = file('x1.txt', 'w')
f.write(u'中文')
f.close()
the immediate result should be something like:
f.write(u'中文')
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-16: ordinal not in range(128)
So how do we write a utf-8 file directly?
import codecs
f = codecs. ......
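The original snippet is cut off after "codecs.", so what follows is only a sketch of one common way to write UTF-8 text with the standard codecs module, not necessarily the code the author went on to show. The file name x1.txt is reused from the example above, but placing it in a temporary directory is an assumption made to keep the sketch self-contained.

```python
import codecs
import os
import tempfile

# Hypothetical location: a fresh temporary directory, so nothing is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'x1.txt')

# codecs.open returns a file object that encodes unicode text on write.
f = codecs.open(path, 'w', encoding='utf-8')
try:
    f.write(u'\u4e2d\u6587')  # the two characters of the sample string
finally:
    f.close()

# Read the raw bytes back to confirm they are UTF-8 encoded.
with open(path, 'rb') as raw:
    data = raw.read()
```

Because the encoding is stated when the file is opened, the write no longer falls back to the ascii codec that caused the UnicodeEncodeError.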
»ù±¾É϶¼ÊÇʹÓÃpythonÀ´½âÎöxmlÎļþµÄ¡£
±ÈÈçÎÒÒª½«ÄÚÈÝΪ
<?xml version="1.0" encoding="utf-8"?>
<root>
<book isbn="34909023">
<author>
&n ......
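Since the sample document above is truncated, here is a minimal sketch of parsing a similar document with the standard library's xml.etree.ElementTree. Only the root/book/author structure and the isbn value come from the text above; the author text and the closing tags are placeholders invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical completion of the truncated sample: the author text and
# the closing tags are placeholders, not the original content.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<root>
<book isbn="34909023">
<author>placeholder author</author>
</book>
</root>"""

root = ET.fromstring(SAMPLE)

# Collect (isbn, author) pairs for every book element under the root.
books = []
for book in root.findall('book'):
    books.append((book.get('isbn'), book.findtext('author')))

print(books)
```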
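%a Abbreviated weekday name
%A Full weekday name
%b Abbreviated month name
%B Full month name
%c Standard date and time string
%C Century (the year divided by 100, i.e. the first two digits of the year)
%d Day of the month as a decimal number
%D Month/day/year
%e Day of the month as a decimal number, in a two-character field
%F Year-month-day
%g Last two digits of the year, using the week-based year
%G Year, using the week-based year
%h Abbreviated month name ......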
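As a quick illustration of a few of the codes above, here is a minimal sketch using strftime from Python's datetime module. The date is an arbitrary example value, and the English names assume the default C locale. Support for several of the other codes (for example %C, %D, %e, %F, %g, %G and %h) depends on the platform's C library, so they are left out here.

```python
import datetime

# Arbitrary example moment: Saturday, 14 March 2009, 15:09:26.
moment = datetime.datetime(2009, 3, 14, 15, 9, 26)

# %a and %b give abbreviated names, %A and %B the full names,
# and %d the zero-padded day of the month.
short_names = moment.strftime('%a %b %d')
full_names = moment.strftime('%A %B %d')

print(short_names)
print(full_names)
```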
Use xlrd to read the file. For downloading and installing xlrd, see:
Python "xlrd" package for extracting data from Excel files
---------------------------------------------------------------------------------
#coding=utf-8
import xlrd
import os, types, datetime
# directory where the Excel files are stored
dir = u'D:\\temp\\excel'
......