Word's text format

Ben Bullock

Recently I have been working on a program that pulls the text out of
text boxes in Word using OLE. I've noticed that the text, when it is
pulled directly from the text box, seems to contain ASCII control
characters such as 0x1 ("start of heading"), and that it seems to use
only carriage returns for the ends of lines. Word appears to use the
0x1's fairly consistently to mark the beginnings of headers.

Am I right in thinking this, and if so, can anyone point me to a
reference for which characters I might expect in the text?

Thanks.
 
Tony Jollans

The 0x1 is a placeholder character in text that contains an in-line textbox;
it has nothing to do with headers. It is not part of the textbox itself, so if
you are working with the contents of the textbox you should not see it.

Word doesn't actually mark an end of line at all, but it does use a carriage
return to mark the end of each paragraph.
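
As a rough illustration of the two points above, here is a minimal Python
sketch that tidies a string already pulled out of Word. It assumes the text
is available as an ordinary Python string; clean_word_text is a hypothetical
helper name, not part of Word or of any OLE library. It handles only the two
characters discussed in this thread (0x1 and the carriage return).

```python
def clean_word_text(raw: str, placeholder: str = "\x01") -> str:
    """Tidy text extracted from Word (hypothetical helper, not a Word API)."""
    # Drop the 0x1 placeholder that stands in for an in-line object.
    text = raw.replace(placeholder, "")
    # Word ends each paragraph with a bare carriage return; turn it into
    # a newline. Handle "\r\n" first so existing pairs are not doubled.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text


sample = "First paragraph\x01\rSecond paragraph\r"
print(repr(clean_word_text(sample)))
```

Whether to drop the placeholder or replace it with a visible marker depends
on what the extracted text is for; the placeholder argument is there so a
different character can be stripped if needed.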
 
