File Diff

Srikanth

Hi,

I have been working on a file comparison implementation which needs to
highlight the differences between two files. Both files are almost alike but
have a few differences (in the sense that they are revised versions of the
same file).
I have tried out some approaches, like finding the Longest Common Subsequence
between them.
I also tried to implement the algorithm found at
http://delivery.acm.org/10.1145/360...GUIDE&dl=GUIDE&CFID=16728805&CFTOKEN=10032967

The algorithm works fine for most cases.

But in some cases, as mentioned in the link, it fails.

Can anyone suggest a better algorithm?

Thanks in advance
Srikanth
 
John

Srikanth said:
Hi,

I have been working on a file comparison implementation which needs to
highlight the differences between two files. Both files are almost alike but
have a few differences (in the sense that they are revised versions of the
same file).
I have tried out some approaches, like finding the Longest Common Subsequence
between them.
I also tried to implement the algorithm found at
http://delivery.acm.org/10.1145/360000/359467/p264-heckel.pdf?key1=359467&key2=7794270321&coll=GUIDE&dl=GUIDE&CFID=16728805&CFTOKEN=10032967

The algorithm works fine for most cases.

But in some cases, as mentioned in the link, it fails.

Can anyone suggest a better algorithm?

Thanks in advance
Srikanth

Srikanth,
Several years ago I developed a Project file comparison macro. This was
before Microsoft released their Compare Project Versions utility. I
haven't used my compare macro for a few years but as far as I know it
still works with any version of Project (Project 95 through Project
2007).

I'm not familiar with Longest Common Subsequence so I can't comment on
whether it is applicable to Project files. My macro and, as far as I
know, Microsoft's utility, use the Unique ID field as the indexing
reference for comparison. That's why both files must be genetically
related (one derived from the other, so that Unique IDs match) for the
macro/utility to work.

My macro color codes changed field cells in the later version file and
also uses a spare field to enter a code for changes (e.g. changed data,
deleted task, and added task). Microsoft's utility creates a separate
file with spare fields showing change information.

In terms of implementation, this is what my macro does.
1. Sorts both files by Unique ID
3. Creates two index registers for task rows in each file and stores
that data in arrays
4. Uses the index register of the later version file as a reference
5. Steps through both index register arrays. If an element of each array
is equal (i.e. that Unique ID appears in both files), it stores a data
array of fields to be compared
6. If the index registers do not agree, sets up a difference array for
use later to annotate either an added line (later file) or deleted line
(original file).
7. When the field cell data array from the later file is all gathered,
compares those values with the same fields of the original file. Where
differences exist (i.e. field cell changes), it stores those field cell
locations in a changed array. (Yeah, I know, there are a lot of arrays.
However, I found that storing the data in arrays lets the overall
comparison process run faster than if the two files were compared in
real time on a field cell by field cell basis.)
8. Once the compare is complete, uses the changed array to color code
those field cells of the later file and annotate a spare field with a
change code. To speed this process, 10 field cells are selected and
color coded as a group. Ten is the maximum multiple cell selection in
Project.
9. Finally, goes back to the original file and uses the same spare field
to annotate any task lines that were deleted in the later version.
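
For readers who want to try the idea outside Project, the steps above can be roughly sketched in Python. This is only an illustration under stated assumptions, not the macro itself: each file is modeled as a dict mapping Unique ID to a dict of field values, and the function name `compare_by_unique_id` and the field names in the usage example are hypothetical. Color coding and spare-field annotation are left out; the sketch only gathers the added, deleted, and changed lists.

```python
def compare_by_unique_id(original, later, fields):
    """Compare two task tables keyed by Unique ID (hypothetical data model).

    original, later: dict mapping Unique ID -> dict of field name -> value
    fields: field names to compare for IDs present in both tables
    Returns (added, deleted, changed), where changed holds (id, field) pairs.
    """
    # Sort both ID indexes, as in steps 1 and 3.
    orig_ids = sorted(original)
    later_ids = sorted(later)

    added, deleted, changed = [], [], []
    i = j = 0

    # Step through both sorted indexes in parallel (steps 5 and 6).
    while i < len(orig_ids) and j < len(later_ids):
        o, n = orig_ids[i], later_ids[j]
        if o == n:
            # Same Unique ID in both files: compare field by field (step 7).
            for f in fields:
                if original[o].get(f) != later[n].get(f):
                    changed.append((o, f))
            i += 1
            j += 1
        elif o < n:
            # ID exists only in the original file: task was deleted.
            deleted.append(o)
            i += 1
        else:
            # ID exists only in the later file: task was added.
            added.append(n)
            j += 1

    # Whatever is left over in either index is unmatched.
    deleted.extend(orig_ids[i:])
    added.extend(later_ids[j:])
    return added, deleted, changed


original = {
    1: {"Name": "Design", "Duration": 5},
    2: {"Name": "Build", "Duration": 10},
    3: {"Name": "Test", "Duration": 3},
}
later = {
    1: {"Name": "Design", "Duration": 7},
    3: {"Name": "Test", "Duration": 3},
    4: {"Name": "Ship", "Duration": 1},
}
added, deleted, changed = compare_by_unique_id(original, later, ["Name", "Duration"])
print(added, deleted, changed)
```

Because both indexes are sorted, a single pass is enough to classify every task, which is the main reason a stable key such as Unique ID makes this simpler than a general text diff.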

And no, my macro is not available.

John
Project MVP
 