Before you start this quixotic enterprise, are you certain that every
Unicode character has an equivalent in ISCII (which I know nothing
about)? Does ISCII have a separate character for every possible
conjunct akshara with every possible matra, the way Unicode Korean has
a separate character for every possible syllable block?
That seems unlikely ...
Yes, every Unicode (Devanagari) character has an equivalent in ISCII.
ISCII stands for "Indian Standard Code for Information Interchange"
[Ref: Indian Standard Document 13194, Bureau of Indian Standards,
1991], and the Devanagari part of Unicode is based on it.
ISCII does not have a separate precomposed character for every conjunct
akshara with every matra, but it can represent all of them as sequences
of basic characters. Everything in Unicode (Devanagari part) is there in
ISCII.
Does ISCII automatically form conjuncts, or do you have to input the
reduced form alongside the full form of the base character?
No. ISCII doesn't automatically form conjuncts.
ISCII text by itself isn't readable Devanagari text; ISCII is simply a
storage format. It contains only the basic "consonants", "vowels",
"matras of vowels" and "other marks", just as Unicode (Devanagari) does.
Showing the visible conjuncts on the screen is the job of the
programmer. In this respect ISCII is just like the Devanagari part of
Unicode. A Microsoft Word file with Unicode Devanagari text stores only
the "consonants", "vowels", "matras of vowels" etc., and the conjuncts
that are visibly displayed on screen are rendered by a component of the
operating system (Windows) called "Uniscribe". Likewise, ISCII contains
only "consonants", "vowels", "matras of vowels" etc., and NOT the
conjunct forms.
There is a practically one-to-one relationship between the characters
of ISCII and the characters of Unicode (Devanagari), with some
exceptions, but these exceptions can be handled easily.
India still depends very much on old non-Unicode text formats for
publishing books in Devanagari (desktop publishing), because publishing
software such as QuarkXPress, Adobe InDesign etc. either does not
support Unicode Devanagari text or has only recently introduced it. So
people in India haven't got rid of the OLD non-Unicode text formats yet.
However, since Unicode Devanagari text is becoming popular on the
Internet, on web sites (such as Google) and in emails, people frequently
need to convert NON-UNICODE text to Unicode and vice versa.
There are many third-party software packages in India (such as ISM,
Shree-Lipi and Indica) which provide converters from ISCII to their own
(non-Unicode) formats and vice versa, and I have purchased many of them.
Hence I can readily convert ISCII text to any of the popular non-Unicode
text formats of India and vice versa. Because these third-party packages
are available, for me, doing Unicode-to-non-Unicode conversion (and vice
versa) is equivalent to doing ISCII-to-Unicode conversion (and vice
versa).
Of course, these packages also provide ISCII-to-Unicode conversion and
vice versa (using non-VBA programming), but I want to do my own
conversion in both directions, using my own Microsoft Word macros.
Firstly, their conversions aren't perfect. Secondly, I have already
successfully replaced many of their conversion tools with my own VBA
Word macros, and I want to do the same here.
I have already successfully created an ISCII-to-Unicode (Devanagari)
macro in Microsoft Word. Now only the macro for the reverse direction,
the Unicode-to-ISCII macro, has to be created, and that is giving me
problems, as described in my previous message.
I have written the forward direction macro (ISCII-to-Unicode macro)
successfully using "Find and Replace" commands. The macro simply issues
the following block of statements repeatedly with different values in
each occurrence.
For example,
'ISCII-to-Unicode macro [forward macro]
'
'WORKS SUCCESSFULLY
'
'e.g.
'
'(1)
'ISCII code of Devanagari consonant 'ma'
Selection.Find.Text = "^0204"
'Unicode Devanagari consonant 'ma' [92E hex or 2350 decimal]
Selection.Find.Replacement.Text = "^u2350"
Selection.Find.Execute Replace:=wdReplaceAll
'(2)
'ISCII code of Devanagari matra of vowel "hrasva i"
Selection.Find.Text = "^0219"
'Unicode Devanagari 'matra of hrasva i' [93F hex or 2367 decimal]
Selection.Find.Replacement.Text = "^u2367"
Selection.Find.Execute Replace:=wdReplaceAll
'(3) etc.
As mentioned above, this forward macro works perfectly and
successfully.
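The repeated block above could also be driven from a table. The
following is only a minimal sketch: the Sub name is hypothetical, only
the two code pairs quoted above are listed, and it assumes the same
"^0nnn" and "^unnnn" codes behave as they do in the individual
statements.

```vba
Sub IsciiToUnicodeTableDriven()
    ' Hypothetical name. Each pair is (ISCII decimal code, Unicode decimal code).
    ' Only the two mappings quoted in the text are listed here.
    Dim pairs As Variant
    Dim i As Long

    pairs = Array( _
        Array(204, 2350), _
        Array(219, 2367))

    For i = LBound(pairs) To UBound(pairs)
        With Selection.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .MatchWildcards = False
            ' Builds "^0204", "^0219", ... as in the hand-written statements
            .Text = "^0" & CStr(pairs(i)(0))
            ' Builds "^u2350", "^u2367", ...
            .Replacement.Text = "^u" & CStr(pairs(i)(1))
            .Execute Replace:=wdReplaceAll
        End With
    Next i
End Sub
```

Adding a character then means adding one more pair to the array rather
than another three-statement block.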
But the reverse macro gives problems.
'Unicode-to-ISCII macro [reverse macro]
'
'GIVES PROBLEMS
'BECAUSE THE CHARACTER Unicode Devanagari 'ma' [92E hex or 2350 decimal]
'ISN'T FOUND BY MICROSOFT WORD WHEN IT IS IMMEDIATELY FOLLOWED (IN THE
'FILE) BY A VOWEL-MATRA, SUCH AS Unicode Devanagari 'matra of hrasva i'
'[93F hex or 2367 decimal]
'
'IT IS NOT FOUND BY THE FOLLOWING COMMAND, WHEN IT IS FOLLOWED BY ANY
'VOWEL-MATRA.
'
'e.g.
'
'(1)
'Unicode Devanagari consonant 'ma' [92E hex or 2350 decimal]
Selection.Find.Text = "^u2350"
'ISCII code of Devanagari consonant 'ma'
Selection.Find.Replacement.Text = "^0204"
Selection.Find.Execute Replace:=wdReplaceAll
'(2)
'Unicode Devanagari 'matra of hrasva i' [93F hex or 2367 decimal]
Selection.Find.Text = "^u2367"
'ISCII code of Devanagari matra of vowel "hrasva i"
Selection.Find.Replacement.Text = "^0219"
Selection.Find.Execute Replace:=wdReplaceAll
'(3) etc.
As mentioned above (as a comment in the macro), the character Unicode
Devanagari 'ma' [92E hex or 2350 decimal] isn't found by Microsoft Word
when it is immediately followed (in the file) by a vowel-matra such as
Unicode Devanagari 'matra of hrasva i' [93F hex or 2367 decimal].
But when that character (Unicode Devanagari 'ma') stands alone in the
file [i.e. when it is NOT followed by any vowel matra], it is replaced
by the macro.
So the command to replace a Unicode consonant with an ISCII consonant
sometimes succeeds and sometimes doesn't: when the consonant stands
alone in the file, it succeeds, and when the consonant is followed by a
vowel matra, it fails.
This is faulty behavior in Microsoft Word. The command should always
replace the consonant, whether or not it is followed by a vowel matra.
That is what I am talking about.
[As a matter of fact, the matra of a vowel ALSO isn't found when it is
combined with a consonant. It is found only when it stands alone, which
is meaningless. (The matra of a vowel cannot appear alone meaningfully,
although it is possible to type a lone vowel-matra, which would have no
meaning.)]
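One way to sidestep Find altogether would be to walk through the text
character by character and map each code directly. This is only a
sketch, not a tested converter: the function name is hypothetical, only
the two mappings used in the examples above are included, and it assumes
that an ISCII code nnn should end up as the character with code nnn
(which is what "^0nnn" appears to produce on a Western code page for
codes of 160 and above).

```vba
Function UnicodeToIsciiSketch(ByVal s As String) As String
    ' Hypothetical helper: maps Unicode Devanagari characters to
    ' characters carrying the ISCII code, one character at a time.
    Dim i As Long
    Dim cp As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        ' AscW returns a signed Integer; mask it to get 0..65535
        cp = AscW(ch) And &HFFFF&
        Select Case cp
            Case 2350           ' Unicode 'ma' (92E hex)
                result = result & ChrW(204)
            Case 2367           ' Unicode matra of hrasva i (93F hex)
                result = result & ChrW(219)
            Case Else           ' anything unmapped is passed through
                result = result & ch
        End Select
    Next i

    UnicodeToIsciiSketch = result
End Function
```

Because the loop looks at each stored character separately, a consonant
is treated the same way whether or not a matra follows it. Writing the
result back with something like Selection.Text =
UnicodeToIsciiSketch(Selection.Text) would be expected to drop
character formatting inside the selection, so this would suit plain
text best.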
In Word, you should be able to search any sequence of consonants and
vowels (whether or not they are combined), and you might even make
some shortcuts using wildcards, but it isn't entirely clear what
you're trying to do.
You cannot.
When any Devanagari Unicode consonant appears singly in the file, then
you can search for it using Find command (either through UI or
programmatically), but when it is followed by a vowel sign, it is NOT
found by Microsoft Word (neither through UI nor programmatically), as
explained above, AND also as explained in my previous message. [Both the
messages use the same examples.]
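To check what is actually stored in the file for a given syllable, the
code points of the selected text can be listed in the Immediate window.
A minimal sketch (the Sub name is hypothetical):

```vba
Sub DumpSelectionCodePoints()
    ' Hypothetical diagnostic: prints position, hex code and decimal
    ' code of every character in the current selection.
    Dim s As String
    Dim i As Long
    Dim cp As Long

    s = Selection.Text
    For i = 1 To Len(s)
        ' AscW returns a signed Integer; mask it to get 0..65535
        cp = AscW(Mid$(s, i, 1)) And &HFFFF&
        Debug.Print i, "U+" & Right$("0000" & Hex$(cp), 4), cp
    Next i
End Sub
```

For the example used here, selecting 'ma' followed by the 'matra of
hrasva i' should list 092E and then 093F as two separate characters,
even though they are displayed as one combined shape.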
Is there any option in Microsoft Word which would enable it to find the
consonants embedded in consonant clusters (or a consonant embedded in
consonant+vowel-matra)?
OR should I ask in another way? Has anyone made a Unicode Devanagari
font which doesn't implement ANY conjuncts, so that the text shows up
just as it would under Windows 98, where no "Uniscribe" is present and
the operating system wouldn't display any conjuncts: just plain
consonants, vowels and matras of vowels (which, of course, isn't
readable)?
[If anyone has made such a font, then I can solve my problem
temporarily. I would just apply that font to my text, and then run my
macro on it, which will replace all characters now, and my
Unicode-to-ISCII conversion, USING MACRO, would become successful.]