Lengthy merge code

Heather

Hi, all - I have a question that's probably painfully easy and obvious,
but I'm stumped.

I'm working with a very large (1400+ columns) csv file to create
customized letters for our students. I used merge code to determine
scholarships and special program offers based on things like admission
averages. Unfortunately, an admission average can appear in one of ten
positions, and within the one in ten, in one of six positions. So,
naturally, my merge code is long. I don't know sql and there isn't time
to write new programming, so I'm stuck with my pages and pages of merge
code.

To try to improve performance, I have code stored in different
files...so with the main letter, if the individual record meets certain
criteria, there's an includetext function for a document. That
document, in turn, has includetext functions from a master document
that inserts appropriate paragraphs. It works - it's not pretty, but it
works.

The merge takes a long time, and I assume it's because Word is ticking
through each line of code and saying to itself 'yes/no.'

My question is: is there something I can insert into my code to say
that if the condition is found (positive result), Word should stop
searching through the rest of the code in that linked document and move on?

Thank you again for your help. This group has been an invaluable
resource.

Heather
 
Peter Jamieson

The only type of code that does anything like that is a { SKIPIF } code and
as far as I can tell that's not appropriate for your situation - i.e. it's
there to stop processing of the current record altogether when the condition
specified in the SKIPIF is met.

Not much helps when you're dealing with large numbers - you can't convert
your data to Access, for example, because it has a 255-column maximum, and
that's why you probably wouldn't be able to use SQL even if you wanted to.

<<
The merge takes a long time, and I assume it's because Word is ticking
through each line of code and saying to itself 'yes/no.'
>>

It could well be that. It could also be because Word takes quite a long time
even to read a long text record and split it up into separate fields. (It
shouldn't, but as far as I can tell, it does.) Doing lots of INCLUDETEXT
fields is likely to have a significant impact as well. There are some unusual
text/formatting features that can slow processing down (e.g. "formatted
bullets"), but I think you would be able to tell if you have any of those by
taking out most of your fields and seeing if the merge is still slow. You
may also find that Word gets progressively slower as it works its way
through your data file, and if that is the case, it /might/ make sense to do
the merge in multiple chunks.
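
A minimal sketch of the "multiple chunks" idea, assuming the data source is a plain CSV with a single header row. It is written in Python purely for brevity; the function names and the chunk size are illustrative, and whether chunking actually helps would still need to be measured.

```python
import csv
import io


def split_records(rows, chunk_size):
    """Yield lists of at most chunk_size data rows."""
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def split_csv_text(text, chunk_size):
    """Return CSV strings, each holding the header row plus up to chunk_size records."""
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)
    pieces = []
    for chunk in split_records(reader, chunk_size):
        out = io.StringIO(newline="")
        # CRLF record delimiter, so each piece looks like the original file.
        writer = csv.writer(out, lineterminator="\r\n")
        writer.writerow(header)
        writer.writerows(chunk)
        pieces.append(out.getvalue())
    return pieces
```

Each returned string could be saved as its own file and used as the data source for a separate merge run, so that every run starts from a freshly opened document.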

However, in my view, this is a classic case of "if it ain't broke, don't
'fix' it": if I had the option, I'd probably just want to check that
increased memory or processor speed made a difference, but that's about it.

Peter Jamieson
 
Heather

I'm with you - it ain't broke (yet) and I'd rather not have to fix it.
:)

I'm not sure if the SKIPIF would be a good solution unless I create
multiple templates for different kinds of students and run the merges
using the same data. I have to run my merges in groups of 500 or fewer,
because of our internal document imaging software limitations, but our
daily files are about 50-100 records, with a few huge files in the
summer (3000+).

With regard to Word getting progressively slower - is there an upper
limit, do you think? 30+ records? I have a LOT of includetext fields -
the programming for this particular letter is about 90 pages of merge
code, broken down into five documents.

Heather
 
Peter Jamieson

<<
the programming for this particular letter is about 90 pages of merge
>>

Rather you than me :) I've come across some complex merge documents over
the years but I think yours beats the lot!

Reports suggest that there are "boundaries" at which things start to slow
significantly, but I've never really had a good go at pinning down any
specific causes. One possibility in many cases is that even with quite large
amounts of RAM in a system, software (i.e. code/programs) can still take so
much of it that an application may run into a RAM limitation quite quickly
when it is loading or generating data, at which point Windows will spend
more time playing with its virtual memory. You /might/ be able to check that
by removing as many programs and inessential services from your system as
possible and having a look at real/virtual memory usage. The trouble is that
performance testing of this kind can be very time-consuming, and if the real
scenario involves real printing, your tests aren't necessarily realistic
unless you're actually using up paper or using something like the full
Acrobat or Microsoft Office Document Imaging as your output printer.

Peter Jamieson
 
Heather

Yes - the merge code seems rather insane. We've converted to a
proprietary software package - I don't dare name it - and the merge
seems to be the best way to handle the types of communications we send
out. I haven't come across much information on merge code of this
magnitude.

Do you think bumping the virtual memory might help? We're running the
merges on a dedicated machine that's faster and better than our other
desktop jobbies. All it does is the merges. It lives for merging. ;)

Heather
 
Peter Jamieson

<<
Do you think bumping the virtual memory might help? We're running the
merges on a dedicated machine that's faster and better than our other
desktop jobbies. All it does is the merges. It lives for merging. ;)
>>

Shouldn't think so, but in this kind of situation I would try anything I was
convinced I could back out of. However, since you've dedicated a system to
it:
a. it's probably going to be worth taking a fairly systematic approach to
performance monitoring, using the Windows performance monitoring facilities,
Task Manager, and so on to try to discover what is going on.
b. if there's plenty of free (real) memory even before Word starts its
merge, the chances are that the problem has nothing to do with memory.
c. if not, I'd probably try to remove/disable anything I didn't need. Not
always easy to discover, of course.

<<
Yes - the merge code seems rather insane. We've converted to a
proprietary software package - I don't dare name it - and the merge
seems to be the best way to handle the types of communications we send
out.
>>

From what you said earlier it sounds as if you're now committed to this
approach, but are there other options? In particular, is there any way to do
more of the decision-making work when exporting from your package (if that's
what you're doing) and less using field coding?

Peter Jamieson
 
Heather

There must be, but I don't have the programming skills to work
something else out. Perl would be difficult because of the multiple
positions for the data, I think. If I had Excel 2007, I could probably
use data sorts and some fancy macros, but I don't have the software.
 
Peter Jamieson

Do any of your data fields contain any of the delimiter characters (i.e. the
text delimiter - double quote ", the field delimiter, usually a comma but
could be a tab or something else, or the record delimiter, usually CRLF) ?
Also, is all your data ASCII/ANSI or does it contain "international
characters" such as accented characters?

If the data is sufficiently simple, it might make sense to write some simple
VBA to preprocess it.
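
One way to answer the questions above is to scan the file once and report any field that would trip up simple preprocessing. This is a rough sketch in Python rather than VBA, assuming a comma-delimited file with double quotes as the text delimiter; the function name `find_awkward_fields` is made up for illustration.

```python
import csv
import io


def find_awkward_fields(text, delimiter=","):
    """Return (record_number, column_number, reason) for each problematic field.

    Record and column numbers start at 1; the header row counts as record 1.
    """
    problems = []
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    for rec_no, row in enumerate(reader, start=1):
        for col_no, value in enumerate(row, start=1):
            if '"' in value:
                problems.append((rec_no, col_no, "embedded double quote"))
            if delimiter in value:
                problems.append((rec_no, col_no, "embedded field delimiter"))
            if "\r" in value or "\n" in value:
                problems.append((rec_no, col_no, "embedded line break"))
            if not value.isascii():
                problems.append((rec_no, col_no, "non-ASCII character"))
    return problems
```

An empty result suggests the data could be split on commas and line breaks without a full CSV parser, which is roughly what "sufficiently simple" means here.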

Peter Jamieson
 
Heather

Yes - they are comma separated values, and the data has all ASCII/ANSI
characters, and each record is delimited by a CRLF. No international
characters that I'm aware of.

The data is relatively simple, but needs a lot of logic applied to it
(if this type of student, if this type of admission rating, if this
type of average, then this...) to generate the appropriate letter and
scholarship offer.
 
Peter Jamieson

If it were my project, I'd keep the existing merge, but given the resource
(enough of my time, in my case :) ) I'd probably
a. try to work out what fields I actually needed to have in my data source
to reduce the "field logic" to a bunch of MERGEFIELD fields, perhaps
INCLUDETEXT fields, and as few IF fields as possible
b. if there were a lot more fields than I began with, I'd probably stick
with what I had
c. otherwise, I'd consider writing some VBA to preprocess the fields, or
even just to load them up into SQL Server or some such, get them back via
some views, stitch them together again (assuming more than 255) and merge.
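
To make options a. and c. concrete, here is a minimal preprocessing sketch, in Python rather than VBA. Everything specific in it is an assumption: the column names in `CANDIDATE_COLUMNS`, the thresholds in `OFFER_RULES`, and the new `AdmissionAverage` and `OfferCode` columns are hypothetical, since the thread does not give the real headers or scholarship rules. The idea is only that the "which position holds the average" search happens once, outside Word.

```python
import csv
import io

# Hypothetical column names: the real export's headers are not given in the thread.
CANDIDATE_COLUMNS = ["Avg_1_1", "Avg_1_2", "Avg_2_1"]

# Hypothetical rules, for illustration only: (minimum average, offer code),
# listed from highest threshold to lowest.
OFFER_RULES = [(90.0, "GOLD"), (80.0, "SILVER")]


def first_non_blank(record, columns):
    """Return the first non-blank value among the given columns, or ""."""
    for name in columns:
        value = (record.get(name) or "").strip()
        if value:
            return value
    return ""


def offer_code(average_text):
    """Map an average (as text) to an offer code, or "" if none applies."""
    try:
        average = float(average_text)
    except ValueError:
        return ""
    for minimum, code in OFFER_RULES:
        if average >= minimum:
            return code
    return ""


def preprocess(text):
    """Return the CSV text with AdmissionAverage and OfferCode columns appended.

    Assumes every record has the same number of fields as the header row.
    """
    reader = csv.DictReader(io.StringIO(text, newline=""))
    fieldnames = list(reader.fieldnames) + ["AdmissionAverage", "OfferCode"]
    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\r\n")
    writer.writeheader()
    for record in reader:
        average = first_non_blank(record, CANDIDATE_COLUMNS)
        record["AdmissionAverage"] = average
        record["OfferCode"] = offer_code(average)
        writer.writerow(record)
    return out.getvalue()
```

With columns like these in the data source, the letter could in principle use a plain { MERGEFIELD AdmissionAverage } and one short IF on OfferCode instead of testing every possible position.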

Also, have you considered using Word events to try to move some of the
processing logic from Word fields to VBA? (I'm not particularly keen on this
approach myself but I'd probably try to discover if there were performance
benefits.)

Up to you, of course :)

Peter Jamieson
 
Heather

Well, it's reassuring to know that I'm basically taking the approach
you would have. I'm merging in the least amount of data needed into a
generic template letter.

The logic is built mainly around the admission average and scholarship
offers. I've been working on improving the speed of the merge, and I've
got it down to about 25-30 seconds per record (very slow, but works!).
I did this by having the main template, and then a bunch of referring
documents that split off into trees. 28 of them. So, for example: if
high school, include this document, which asks what KIND of high school
student, and breaks off into another document, that asks what kind of
program, and then breaks down into the final document, which includes
the logic.

I figure that this way, it cuts down the merge having to process
unnecessary logic. I'm trying to put the most common type of student at
the top of the tree, too.

The fun part will be writing it up at the end of the project. Finally,
my English degree will be useful!

Heather
 
Peter Jamieson

<<
Finally, my English degree will be useful!
>>

I hope so - all you have to do is find a reader who'll appreciate the fact
that you can spell, write grammatically, and so on. If your project's for UK
Gov, try writing in newspeak - it appears to be in vogue :)

Over and out for now, and good luck,

Peter Jamieson

 
Heather

Thanks very much for your help!

Heather
