paddys
Suppose I have a fixed number of text (.doc) files of limited size
(say, no more than 500 words each), and I have to discover whether any text
portions (i.e., a set of words, phrases, clauses, or entire sentences)
repeat across these files. In other words, it is a search, within a given
set of text files, for the files that share repeating text portions. The
challenge is to discover these portions intelligently, without any
pre-specified text to look for. A simple 'find' mechanism is cumbersome and
tedious, especially when you have to search a number of text files.
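
One possible way to approach this, offered only as a rough sketch: split each file into overlapping runs of n consecutive words (word n-grams), record which files each run occurs in, and report the runs that appear in more than one file. The sketch below assumes the text has already been extracted from the .doc files into plain strings; the function names, the sample file names, and the choice of n = 4 are all illustrative assumptions, not part of the original question.

```python
import re
from collections import defaultdict

def word_ngrams(text, n):
    """Return the set of n-word runs in text, lowercased, punctuation ignored."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def repeated_portions(docs, n=4):
    """Map each n-word run found in two or more documents to those documents.

    docs is a dict of {file name: extracted plain text} (hypothetical input shape).
    """
    seen_in = defaultdict(set)
    for name, text in docs.items():
        for gram in word_ngrams(text, n):
            seen_in[gram].add(name)
    return {gram: sorted(names) for gram, names in seen_in.items() if len(names) > 1}

# Made-up sample texts standing in for the extracted file contents.
docs = {
    "a.txt": "The quick brown fox jumps over the lazy dog near the river.",
    "b.txt": "Yesterday the quick brown fox jumps over a fence.",
    "c.txt": "Nothing in common here at all.",
}

for gram, names in sorted(repeated_portions(docs).items()):
    print(gram, "->", names)
```

A smaller n catches short repeated phrases but also reports more incidental matches; a larger n reports only longer shared passages. Overlapping runs that share files could be merged afterwards to recover whole repeated sentences.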