Anyways – any insight on the following:
- open a web page in IE
- highlight a table
- copy the table
- start MS-Word
- paste table into word
The HTML table renders as a Word table with all formatting, etc.
This is different from including HTML format text in a WordML element
(assuming that that is what you are driving at!)
When you copy/paste from IE to Word, you are not copying HTML format text
from one .htm file to another .htm/.xml/.doc file. You are copying
material from one application to another via the Windows clipboard.
So let's make the simplifying assumption that when you copy a table in
IE, IE places a copy of the original HTML in the clipboard.
The question is what happens when you then paste into Word. Well, Word can
understand HTML too, so what really happens is that Word reads the table
HTML and converts it so that it becomes part of Word's in-memory
document representation. However, when you save that Word document, that
table will be rendered as HTML if you are saving as a Web page; as RTF
if you are saving in .rtf format; as .doc format if you are saving in
that format; and as WordML if you are saving as WordML .xml or Word 2007
.docx format.
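To make the clipboard step a little more concrete: on Windows, HTML usually travels on the clipboard in the registered "HTML Format" (CF_HTML), which is a short text header of byte offsets followed by the markup. The Python sketch below only builds and re-parses such a payload in memory. It does not touch the real clipboard, the function names are hypothetical, and it makes no claim about what IE or Word actually do internally.

```python
import re

# Header fields are zero-padded so the header length is known in advance.
HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)
PREFIX = b"<html><body><!--StartFragment-->"
SUFFIX = b"<!--EndFragment--></body></html>"


def build_cf_html(fragment: str) -> bytes:
    """Wrap an HTML fragment in a CF_HTML-style payload (illustrative only)."""
    frag = fragment.encode("utf-8")
    start_html = len(HEADER.format(0, 0, 0, 0).encode("ascii"))
    start_frag = start_html + len(PREFIX)
    end_frag = start_frag + len(frag)
    end_html = end_frag + len(SUFFIX)
    header = HEADER.format(start_html, end_html, start_frag, end_frag)
    return header.encode("ascii") + PREFIX + frag + SUFFIX


def extract_fragment(payload: bytes) -> str:
    """Read the byte offsets from the header and slice out the fragment."""
    start = int(re.search(rb"StartFragment:(\d+)", payload).group(1))
    end = int(re.search(rb"EndFragment:(\d+)", payload).group(1))
    return payload[start:end].decode("utf-8")


cell = '<td width="4%" valign="top"><font size="2">1.</font></td>'
payload = build_cf_html(cell)
recovered = extract_fragment(payload)
```

The point is simply that the receiving application gets markup plus offsets, and it is then up to that application (here, Word) to decide how to convert the fragment into its own document model.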
e.g. Suppose I start with the following chunk of HTML that represents
the first cell of a table where the cell contains the text "1."
<table border="1" width="100%">
<tr>
<td width="4%" valign="top"><font size="2">1.</font></td>
If I copy that from IE7 and paste it into Word 2007 as HTML, then save
the document as a Web page and re-open it as a plain text file, I see
<table class=MsoNormalTable border=1 cellpadding=0 width="100%"
style='width:100.0%;mso-cellspacing:1.5pt;border:outset #660000 1.0pt;
mso-border-alt:outset #660000 .75pt;mso-yfti-tbllook:1184'>
<tr style='mso-yfti-irow:0;mso-yfti-firstrow:yes'>
<td width="4%" valign=top style='width:4.0%;border:inset #660033 1.0pt;
mso-border-alt:inset #660033 .75pt;padding:.75pt .75pt .75pt .75pt'>
<p class=MsoNormal
style='margin-bottom:0cm;margin-bottom:.0001pt;line-height:
normal'><span style='font-size:10.0pt;font-family:"Book Antiqua","serif";
mso-fareast-font-family:"Times New Roman";mso-bidi-font-family:"Times
New Roman";
color:black'>1.</span><span style='font-size:12.0pt;font-family:"Book
Antiqua","serif";
mso-fareast-font-family:"Times New Roman";mso-bidi-font-family:"Times
New Roman";
color:black'><o:p></o:p></span></p>
</td>
Not surprisingly, since this is an HTML file, the table is rendered
using the <TABLE>, <TR> and <TD> elements: there's no other way to do
it. But the content is radically different
from the original because Word needs to record loads of layout
information that it has in effect added.
If I save as Word 2003 format WordML, the equivalent chunk is
<w:tbl><w:tblPr><w:tblW w:w="5000" w:type="pct"/><w:tblCellSpacing
w:w="15" w:type="dxa"/><w:tblBorders><w:top w:val="outset" w:sz="6"
wx:bdrwidth="15" w:space="0" w:color="660000"/><w:left w:val="outset"
w:sz="6" wx:bdrwidth="15" w:space="0" w:color="660000"/><w:bottom
w:val="outset" w:sz="6" wx:bdrwidth="15" w:space="0"
w:color="660000"/><w:right w:val="outset" w:sz="6" wx:bdrwidth="15"
w:space="0" w:color="660000"/></w:tblBorders><w:tblCellMar><w:top
w:w="15" w:type="dxa"/><w:left w:w="15" w:type="dxa"/><w:bottom w:w="15"
w:type="dxa"/><w:right w:w="15" w:type="dxa"/></w:tblCellMar><w:tblLook
w:val="04A0"/></w:tblPr><w:tblGrid><w:gridCol w:w="404"/><w:gridCol
w:w="1554"/><w:gridCol w:w="1554"/><w:gridCol w:w="2899"/><w:gridCol
w:w="2735"/></w:tblGrid><w:tr wsp:rsidR="00596248"
wsp:rsidRPr="00596248"><w:trPr><w:tblCellSpacing w:w="15"
w:type="dxa"/></w:trPr><w:tc><w:tcPr><w:tcW w:w="200"
w:type="pct"/><w:tcBorders><w:top w:val="outset" w:sz="6"
wx:bdrwidth="15" w:space="0" w:color="660033"/><w:left w:val="outset"
w:sz="6" wx:bdrwidth="15" w:space="0" w:color="660033"/><w:bottom
w:val="outset" w:sz="6" wx:bdrwidth="15" w:space="0"
w:color="660033"/><w:right w:val="outset" w:sz="6" wx:bdrwidth="15"
w:space="0" w:color="660033"/></w:tcBorders></w:tcPr><w:p
wsp:rsidR="00596248" wsp:rsidRPr="00596248" wsp:rsidRDefault="00596248"
wsp:rsidP="00596248"><w:pPr><w:spacing w:after="0" w:line="240"
w:line-rule="auto"/><w:rPr><w:rFonts w:ascii="Book Antiqua"
w:fareast="Times New Roman" w:h-ansi="Book Antiqua"/><wx:font
wx:val="Book Antiqua"/><w:color w:val="000000"/><w:sz
w:val="24"/><w:sz-cs w:val="24"/></w:rPr></w:pPr><w:r
wsp:rsidRPr="00596248"><w:rPr><w:rFonts w:ascii="Book Antiqua"
w:fareast="Times New Roman" w:h-ansi="Book Antiqua"/><wx:font
wx:val="Book Antiqua"/><w:color w:val="000000"/><w:sz
w:val="20"/><w:sz-cs w:val="20"/></w:rPr><w:t>1.</w:t></w:r></w:p></w:tc>
No sign of any HTML there - it's all WordML. However, if you opened that
.xml document and saved it as .html, you'd probably see roughly the same
thing as the previous .htm chunk.
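One way to convince yourself that the two saved forms carry the same information is to convert the WordML numbers back into the units used in the HTML. A rough Python sketch, assuming the usual WordML conventions (pct widths in fiftieths of a percent, w:sz font sizes in half-points, border w:sz in eighths of a point); the helper names are hypothetical and the input values are just the ones quoted above:

```python
def pct_to_percent(w: int) -> float:
    """w:type="pct" widths are stored in fiftieths of a percent."""
    return w / 50


def half_points_to_pt(sz: int) -> float:
    """Font sizes in <w:sz> are stored in half-points."""
    return sz / 2


def border_eighths_to_pt(sz: int) -> float:
    """Border widths in w:sz are stored in eighths of a point."""
    return sz / 8


table_width = pct_to_percent(5000)    # <w:tblW w:w="5000" w:type="pct"/>
cell_width = pct_to_percent(200)      # <w:tcW w:w="200" w:type="pct"/>
font_size = half_points_to_pt(20)     # <w:sz w:val="20"/> on the run
border = border_eighths_to_pt(6)      # w:sz="6" on the borders

print(table_width, cell_width, font_size, border)
```

Under those assumptions the values line up with width:100.0%, width:4.0%, font-size:10.0pt and the .75pt mso-border-alt in the saved Web page.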
Peter Jamieson
http://tips.pjmsn.me.uk