Stripping all formatting from Word doc

Kurt

I think that has more to do with the original text brought into
TextEdit. I've never had that problem unless I was bringing in text
copied from an email.

I saved as Text Only out of Word, opened in TextEdit, then copy and pasted.
Every paragraph ended with a line break. Not that it matters, since I have a
workflow that works for me.

Re your problem, Kurt:
I wonder whether formatting the doc in Word, and then bringing it over might
be an efficient approach? Figure out what GL will bring over as you want it,
and use that formatting, copy directly from Word, then finish off the
formatting in GL. (I've kinda lost track of this thread, but it doesn't
seem to have been resolved)

All formatting in Word set to Normal style and no character formatting
should just come over as a plain paragraph, and it's pretty easy to wipe a
Word doc down to that. What types of CSS formatting are you trying to get
the text to automatically pick up? I'm having a hard time picturing what you
wrote below.

Generally body text, but I cut and paste heads, too.

Although, once a workflow is decided upon, it should be easy to automate
whether it's Word or TextEdit, but I would assume using one less program
would be slightly simpler.

Agreed, but Word is far too bloated and cumbersome to work with most of
the time. Stripping it in TextEdit, and then using it out of there is
much better, not to mention the lack of screen clutter with all the Word
palettes, etc. I can put TextEdit up in a corner of my other monitor (I
use 2)
 
Daiya Mitchell

What types of CSS formatting are you trying to get the text to
automatically pick up?

Generally body text, but I cut and paste heads, too.

Body text and headings should come over into GL with no problem at all, and
heading 1 in Word, heading 2 etc should transfer properly. At least they do
for me here (CS 1). Perhaps I'm just too much of a web novice and body text
in CSS is not actually the same thing as just <p></p> tags as I have been
assuming?

Kurt said:
Agreed, but Word is far too bloated and cumbersome to work with most of
the time. Stripping it in TextEdit, and then using it out of there is
much better, not to mention the lack of screen clutter with all the Word
palettes, etc. I can put TextEdit up in a corner of my other monitor (I
use 2)

You can close all those palettes, and/or replace them with a custom-built
toolbar that takes up only the space you deem necessary.

Stripping text of formatting in Word requires 3 key combinations.

But, if you would rather use TextEdit, then why did you paste your question
in the first place? :) You seemed to want to get away from using an
intermediary.

Not that it hasn't produced an interesting conversation, which is reason
enough, of course.

Daiya
 
Kurt

Generally body text, but I cut and paste heads, too.

Body text and headings should come over into GL with no problem at all, and
heading 1 in Word, heading 2 etc should transfer properly. At least they do
for me here (CS 1). Perhaps I'm just too much of a web novice and body text
in CSS is not actually the same thing as just <p></p> tags as I have been
assuming?

I usually set up a few CSS definitions that can be applied to plain text
as I create the page. It gives me more flexibility with styles.

You can close all those palettes, and/or replace them with a custom-built
toolbar that takes up only the space you deem necessary.

Yes, but oftentimes I use the full toolbar/palette setup for other
projects. Word is not a primary program for me, so I generally like to
see all the tools when I'm creating a new doc.

Stripping text of formatting in Word requires 3 key combinations.

But, if you would rather use TextEdit, then why did you paste your question
in the first place? :) You seemed to want to get away from using an
intermediary.

No, I was actually hoping just to use Word, but as the discussion
progressed, it seems as though I'll have to spend some more time playing
with it to see if it's really faster to work within it.

Not that it hasn't produced an interesting conversation, which is reason
enough, of course.

Daiya

I really intended to stay on topic. This board is probably the best free
online resource out there for Office products.
 
Helpful Harry

"Klaus Linke" said:
As long as you just edit the text (and don't format something in ways that
can't be saved in plain text format), Word will save in the same (text)
format you opened, without any warning (since there's no formatting to
lose) and without any changes.

You're missing the point - the document on screen is not the file
that's saved on the disk, it's only a temporary copy.

When you open the file, Word reads in the data from the file and
converts IN MEMORY to a Word document to be displayed, regardless of
what file type is stored on the disk or whether you make any changes or
not. Internally while the document is open Word works on it as a Word
document - the file type of the version still on the disk is completely
irrelevant (until it comes time to save it again).

This allows you to use any Word tool on any open document, and
therefore any copying from that document is also in Word format. Any
temporary saves are also in Word format. (I'm not sure about
auto-saves, they may complain about incompatibilities or simply use a
Word document type.)

It's only if you then try to Save or Save As the document that Word
(re)converts it to the appropriate format to be written to a file, with
any appropriate warnings first.

For example, open a Plain Text file in Word. You can then use drawing
tools to create boxes, circles and insert pictures to your heart's
content, but it's not until you try to resave the document as Plain
Text that Word complains about losing bits due to file type
incompatibilities.
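
For anyone who thinks better in code, here is a rough Python sketch of that idea. The `Document` class and both function names are made up for illustration, and this is certainly not how Word is really implemented; it only shows "convert on open, work on the internal copy, convert (and warn) on save".

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical in-memory model: text plus things plain text can't hold."""
    paragraphs: list[str] = field(default_factory=list)
    drawings: list[str] = field(default_factory=list)


def open_plain_text(data: str) -> Document:
    # Opening converts the file's contents into the internal model.
    return Document(paragraphs=data.split("\n"))


def save_as_plain_text(doc: Document) -> tuple[str, list[str]]:
    # Saving converts back; whatever the target format can't hold is
    # reported as a warning and left out of the output.
    warnings = []
    if doc.drawings:
        warnings.append(f"{len(doc.drawings)} drawing(s) will be lost")
    return "\n".join(doc.paragraphs), warnings


doc = open_plain_text("First line\nSecond line")
doc.drawings.append("circle")  # allowed: the open copy is richer than the file
text, warnings = save_as_plain_text(doc)
print(text)
print(warnings)
```

The file type on disk only matters in the two conversion functions; everything in between works on the one internal model.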


Helpful Harry
Hopefully helping harassed humans happily handle handiwork hardships ;o)
 
Clive Huggan

On 27/8/05 6:14 AM, "Kurt" wrote:

No, I was actually hoping just to use Word, but as the discussion
progressed, it seems as though I'll have to spend some more time playing
with it to see if it's really faster to work within it.

And from Kurt's earlier comment:
Am I the only one still lamenting that MS should have stopped at Word 5?
Fast startup, fine features, and easy to edit.

Hello Kurt,

This has been an interesting thread!

If Word 5's operation is familiar to you, you might get some use, when
familiarizing yourself with Word 2004, from some notes titled "Bend Word to
Your Will", which are available as a free download from the Word MVPs'
website (http://word.mvps.org/MacWordNew/Bend/BendWord.htm).

They started off in 2001 when, screaming, I made the migration from Word
5.1a to 2001. Despite growing in size (the page extent is now about 170
pages) and covering significant changes in Word 2004, the underlying
starting point is still migration from 5.1. ("Bend Word to Your Will"
doesn't cover Word X -- I skipped that version.)

[Note: "Bend Word to Your Will" is designed to be used electronically and
most subjects are self-contained dictionary-style entries. Be sure to read
the front end so you can use the document to best advantage and select the
right settings for reading it.]

I have yet to meet one person who used Word 5.1 <nostalgic sigh> and
denigrates it. But the world moves on (not necessarily upwards).

Cheers,

Clive Huggan
Canberra, Australia
(My time zone is at least 7 hours different from the US and Europe, so my
follow-on responses to those regions can be delayed)
============================================================
 
Klaus Linke

Hi Harry,

What about TextEdit or Windows Notepad? They have formatting (bold,
underlined, centered ...) too.
As long as you don't use any formatting, it's still a plain text file. Even
if you do and discard the formatting, it's again a plain text file.

On Windows, Word is often used as a plain text editor for Mail ..., just by
forbidding any formatting. Not necessarily a good solution, but possible.

As I said, we probably just have different perspectives: Let's agree to
disagree.

Klaus
 
Helpful Harry

"Klaus Linke" said:
What about TextEdit or Windows Notepad? They have formatting (bold,
underlined, centered ...) too.
As long as you don't use any formatting, it's still a plain text file. Even
if you do and discard the formatting, it's again a plain text file.

I can't say I've ever tried opening a formatted TeachText / SimpleText
/ TextEdit / Notepad file in another text-only editor or any other editor,
but I'd guess that if you use formatting then it either saves some
strange codes (which other editors will display as garbage) OR as a
"Rich Text" / RTF file OR as some hybrid file type.

In fact, I've just tested this and it's true for SimpleText:

- a document saved from SimpleText (Mac OS 9) with various
styles, fonts and colours,

- a second document of tab-separated data exported from a
FileMaker database.

I loaded both into Resourcer (a file / hex editor) to see what is
REALLY saved in each "text" file.

What you get for the SimpleText "text" file is the plain text in the
Data Fork and all the style / font / colour (and presumably image)
information in the Resource Fork in a "styl" resource. This is NOT a
true Text file (aka "ASCII" file).

On the other hand, the tab-separated "text" file only has the Data Fork
with the plain text. This IS a true Text file, it has absolutely
nothing but the text.

Obviously, similar to what I said about Word, SimpleText uses its own
format internally as well as for the saved file. But in this case the
saved format is a hybrid file type that's similar to what you were
saying about the clipboard. It basically stores the normal Text version
and the formatting commands separately. These formatting commands will
be understood by any Mac applications that know about them, but ignored
by those Mac (and Windows) applications that don't, leaving just the
plain text.

Notepad no doubt does something similar.

Almost every application (whether that's Word, Photoshop, or anything
else) that can open / save document formats for other applications will
convert the file when it is opened into its own INTERNAL format for
viewing and working on. When you save the file it is then converted
back to the appropriate format.

It's only common sense since it makes writing an application MUCH
easier - you only have to worry about one INTERNAL format rather than
trying to have a multitude of "similar, but different" sections of the
code trying to keep the memory version of the document the same format
as the various readable / writeable types. You worry about the other
formats only when opening or saving the file, and when copying you only
have to worry about the OS's clipboard format(s).



Helpful Harry
Hopefully helping harassed humans happily handle handiwork hardships ;o)
 
Daiya Mitchell

I usually set up a few CSS definitions that can be applied to plain text
as I create the page. It gives me more flexibility with styles.

But you do that in GL, right? Is there any way you want Word/TextEdit to
help with that?

Although, possibly, I'm thinking it's the fact that Word does come over with
<p></p> tags that ruins this, since each <p> probably needs to be formatted
separately, and even pasting into the middle of an existing paragraph
probably still brings the ...

Kurt said:
No, I was actually hoping just to use Word, but as the discussion
progressed, it seems as though I'll have to spend some more time playing
with it to see if it's really faster to work within it.

Hmm. I started playing with this, and it's pretty easy to get rid of all
text formatting by setting to Normal and ResetChar to clear character
formatting. There's also a Find and Replace to eliminate all extra spaces
and tabs, I think, though I can't remember it just now.

Line breaks I'm not so sure--easy enough to delete and replace with
paragraphs, but would that really be what you wanted?
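
If the cleanup ends up being scripted outside Word, the same tidy-up could be sketched in Python along these lines. The function name `tidy_plain_text` is invented for this example, and it assumes every run of line breaks should become one paragraph break, which may or may not be what you want.

```python
import re


def tidy_plain_text(text: str) -> str:
    # Normalise Windows and classic Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Drop the stray space left at the start or end of each line.
    text = re.sub(r" ?\n ?", "\n", text).strip()
    # Assumption: any run of line breaks means one paragraph break.
    return re.sub(r"\n+", "\n\n", text)


sample = "Head\t\tline  \r\n\r\n\r\nBody   text\rmore "
print(tidy_plain_text(sample))
```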

Tables are a bit of a problem, and I'm not sure what would happen with
graphics. Kinda depends on what type of crazy things you need to deal with.

I am sure that a macro/script in Word could be worked out (I'm pretty sure
John McGhie has one), but if you have a workflow that works for you, might
be better to put the energy into automating the three programs rather than
trying to get it down to two. On general principle, I should think energy
is better spent automating a 10 step process than cutting it down to 5.

Daiya
 
Kurt

I usually set up a few CSS definitions that can be applied to plain text
as I create the page. It gives me more flexibility with styles.

But you do that in GL, right? Is there any way you want Word/TextEdit to
help with that?

Not really, as I mentioned, I get all my copy from clients in Word. Just
wanted to cut and paste easily out of it.

I am sure that a macro/script in Word could be worked out (I'm pretty sure
John McGhie has one), but if you have a workflow that works for you, might
be better to put the energy into automating the three programs rather than
trying to get it down to two. On general principle, I should think energy
is better spent automating a 10 step process than cutting it down to 5.

That's my philosophy - the path of least resistance. I work in too many
different programs during a day. I wouldn't have a life if I devoted
myself to learning all the nuances of all of them.
 
Kurt

Hi Clive,

Look forward to reading what you compiled. It's not that I dislike the
fact that software developers want to add more features each year, it's
more that they often write them as if the world is supposed to revolve
around their program and they need to be everything for everybody.

And when a program gets to be an industry standard, we all become slaves
to the programmer's whims.

Kind of like having to buy a Ford Excursion SUV when all you need is a
reliable car.
 
Clive Huggan

Kurt said:
Hi Clive,

Look forward to reading what you compiled. It's not that I dislike the
fact that software developers want to add more features each year, it's
more that they often write them as if the world is supposed to revolve
around their program and they need to be everything for everybody.

And when a program gets to be an industry standard, we all become slaves
to the programmer's whims.

Kind of like having to buy a Ford Excursion SUV when all you need is a
reliable car.

How right you are, Kurt!

Clive
=====
 
John McGhie [MVP - Word and Word Macintosh]

Word does NOT paste 'plain text' unless you force it to.

Depending on what the receiving application asks for, it pastes RTF, HTML,
XHTML or XML...


Not sure how to describe this, but cutting and pasting Word plain text
into say, my web program, GoLive. produces text that I need to reformat
to the existing CSS. The same text cut and pasted from TextEdit assumes
the CSS definitions of where I paste it into.
Wonder why there should be any difference between plain text in Word and
the same in TextEdit?

--

Please reply to the newsgroup to maintain the thread. Please do not email
me unless I ask you to.

John McGhie
Microsoft MVP, Word and Word for Macintosh. Consultant Technical Writer
Sydney, Australia +61 4 1209 1410
 
Klaus Linke

John McGhie said:
Word does NOT paste 'plain text' unless you force it to.

Word does NOT paste, period ;-)

Word might not even be running any more at the time you paste.

Depending on what the receiving application asks for, it pastes RTF,
HTML, XHTML or XML...

... or plain text (Unicode/ANSI), or enhanced metafile (EMF), or a bitmap
graphic, ..., depending on what formats Word has put on the clipboard, and
what format the application you paste into decides on.

In Windows, you can start clipbrd.exe to see what formats an application has
made available on the clipboard.
It's probably similar (at least in principle) on the Mac.
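
In very rough Python terms, the idea looks something like the sketch below. The format names, the `choose_format` function and the two "applications" are invented for illustration; no real clipboard API is being called here.

```python
def choose_format(offered: dict[str, str], preferred: list[str]) -> tuple[str, str]:
    """Return the first format the receiving app prefers that is on the clipboard."""
    for fmt in preferred:
        if fmt in offered:
            return fmt, offered[fmt]
    raise LookupError("no usable format on the clipboard")


# What the copying application put on the clipboard (hypothetical contents).
clipboard = {
    "rtf": r"{\rtf1 {\b Hello}}",
    "html": "<p><b>Hello</b></p>",
    "text": "Hello",
}

web_editor = ["html", "rtf", "text"]  # prefers rich formats
plain_editor = ["text"]               # only understands plain text

print(choose_format(clipboard, web_editor))
print(choose_format(clipboard, plain_editor))
```

The same copy gives a different paste depending on which format the receiving application asks for first, which would explain why text from Word and from TextEdit can land differently in the same program.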

Greetings,
Klaus
 
Kurt

Word does NOT paste 'plain text' unless you force it to.

Depending on what the receiving application asks for, it pastes RTF, HTML,
XHTML or XML...

And how does one "force" it to?
 
John McGhie [MVP - Word and Word Macintosh]

Sorry, I should have added that teensy little bit of essential detail ...
{Duh!}

Use Edit>Paste Special... You get a choice of formats, depending on what
you copied to the clipboard and what you are pasting into.

You can choose "Unformatted Text" to force a conversion of whatever is on
the clipboard to plain text.

Since I use this facility a lot, I have made a macro to do it, which I use
the Insert key for. Any time I want to strip the formatting on incoming
text, I just hit Insert instead of Paste.

That won't work for you, because you want to paste into GoLive. You should
find that GoLive has Edit>Paste Special available, most applications do...
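
For the curious, "converting whatever is on the clipboard to plain text" can be pictured with a small Python sketch like this one. It assumes the clipboard holds HTML, and the `TextOnly` class and `unformatted_text` function are names I made up; it is not what Word or GoLive actually run.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text content and ignore all tags and their attributes."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Start a new line for each paragraph or line break after the first.
        if tag in ("p", "br") and self.parts:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def unformatted_text(html: str) -> str:
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


pasted = '<p class="MsoNormal"><b>Bold</b> and <i>plain</i></p><p>Second</p>'
print(unformatted_text(pasted))
```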

Cheers


And how does one "force" it to?

 
Kurt

Sorry, I should have added that teensy little bit of essential detail ...
{Duh!}

Use Edit>Paste Special... You get a choice of formats, depending on what
you copied to the clipboard and what you are pasting into.
You can choose "Unformatted Text" to force a conversion of whatever is on
the clipboard to plain text.

Since I use this facility a lot, I have made a macro to do it, which I use
the Insert key for. Any time I want to strip the formatting on incoming
text, I just hit Insert instead of Paste.

That won't work for you, because you want to paste into GoLive. You should
find that GoLive has Edit>Paste Special available, most applications do...

It has that feature. Excellent tip! Thanks!
 
