Spotlight not indexing/searching text WITHIN .ppt or .pptx in Tiger

Corentin Cras-Méneur

> step 3 didn't work for the .ppt that wasn't appearing fully indexed.

Well what does mdimport return??
Obviously, there is a nasty spotlight issue on your Mac :-\
A conflict maybe??

mdimport should be able to import the file.
If it fails, then there is no need to re-index the entire drive (since mdimport
would then do the same thing, on a much larger scale, simply failing
over and over again on each and every Office file).

You might want to consider booting from your Mac OS X DVD and running
Disk Utility from there to repair the drive.
You might also want to repair permissions from Disk Utility (booting
from your own drive this time).


Corentin
 
tkjazzer

OK, what happened was I wrote an extremely long post for this forum and the post got cut off, so I lost about 3/4 of it. Of course I didn't think of copying the whole thing in case that happened.

So once I need a break from studying I'll try to replicate more of the post.

So Office Word and Acrobat Professional 8 files are being indexed fine, EVERYTHING in them.

It is just ppt and pptx files.

I will show you later exactly what is being indexed from those files, and will then proceed if you still suggest doing the things you mentioned above.

Thank you so much for your time. I will post soon. I just am so frustrated at mactopia for cutting off my post that I need a few more hours before entering terminal land again.
 
tkjazzer

Oh cool, terminal saved what I did yesterday:

mdimport -n -d2 /Users/***removedinfo****/Desktop/***removedinfo****/10\ -\ G.I.\ Liver/080311_0800_***removedinfo****_drug_induced_liver_slides08.ppt
2008-04-17 16:32:28.830 mdimport[4462] Import '/Users/***removedinfo****/Desktop/***removedinfo****/10 - G.I. Liver/080311_0800_***removedinfo****_drug_induced_liver_slides08.ppt' type 'com.microsoft.powerpoint.ppt' using 'file://localhost/Library/Spotlight/Microsoft%20Office.mdimporter/'
2008-04-17 16:32:34.171 mdimport[4462] Sending attributes of '/Users/***removedinfo****/Desktop/***removedinfo****/10 - G.I. Liver/080311_0800_***removedinfo****_drug_induced_liver_slides08.ppt' to server. Attributes: '{
"_kMDItemImporterCrashed" = <null>;
"com_apple_metadata_modtime" = 230060323;
kMDItemAuthors = ("***removedinfo****");
kMDItemContentCreationDate = 2008-03-11 00:34:12 -0700;
kMDItemContentModificationDate = 2008-04-16 10:38:43 -0700;
kMDItemContentType = "com.microsoft.powerpoint.ppt";
kMDItemContentTypeTree = (
"com.microsoft.powerpoint.ppt",
"public.data",
"public.item",
"public.presentation",
"public.composite-content",
"public.content"
);
kMDItemDisplayName = {"" = "080311_0800_***removedinfo****_drug_induced_liver_slides08.ppt"; };
kMDItemKind = {"" = "Microsoft PowerPoint document"; };
\tFXR farnesoid X receptor\nLigand binding domain\nDNA binding domain\n Ligand binding pocket of human SXR isClotrimazole\nBile acids\nCYP3A4*\nCYP2B\nMDR1* (p-gp)\nCYP2C\***removedinfo****GI/Liver\nMRP2\nCYP1A1\nSulfotransferase and UDGPT isozymes\n* Intestinal first pass protection\nRifampin\nPXR\nCYP3A\nCYP2B\nPhenobarbital\nCAAdv Drug Delivery Reviews, 2001\nP\na\nt\nt\ne\nr\nn\nE\nf\nf\ne\nc\nt\n \no\nn\n \nt\na\nr\ng\ne\nt\nc\ne\nl\nl\nI\nn\nf\nl\na\nm\nm\na\nt\ni\no\nn\nD\nr\nu\ng\nM\ni\nc\nr\no\ns\nc\no\np\ni\nc\nc\nh\no\nl\na\nn\ng\ni\nt\ni\ns\nB\ni\nl\ne\n \nd\nu\nc\nt\n \nc\ne\nl\nl\ni\nn\nj\nu\nr\ny\nP\no\nr\nt\na\nl\n \nt\nr\ni\na\nd\nC\nh\nl\no\nr\np\nr\no\nm\na\nz\ni\nn\ne\n;\nE\nr\ny\nt\nh\nr\no\nm\ny\nc\ni\nn\ne\ns\nt\no\nl\na\nt\ne\nB\nl\na\nn\nd\nc\nh\no\nl\ne\ns\nt\na\ns\ni\ns\nI\nn\nh\ni\nb\ni\nt\ni\no\nn\n \no\nf\nh\ne\np\na\nt\no\nc\ny\nt\ne\nt\nr\na\nn\ns\np\no\nr\nt\nN\no\nn\ne\nE\ns\nt\nr\no\ng\ne\nn\n,\nC\ny\nc\nl\no\ns\np\no\nr\ni\nn\n \nA\nHow can wof cholestatic liver injury?\nBile acid independent\nBile acid excretion\nBile Flow\nBile acid dependent\nToxic Drug\nInhibition of bile acid transport (bland cholestasis)\n\Uffb1 to hepatocytes (mixed injury)\nMetabolitGSH\nBile duct cells\***removedinfo**** Div. GI/Liver\nToxic concentrations of bile salts\nBA\nHepatocyte injury (mixedBosentan\nEstradiol-17b-glucuronide\***removedinfo**** Div. GI/Liver\nMRP2\nFlucloxacillin\n5-OH-Methylflucloxacillin\nCYP3A4\nBile Duct Cells\nToxicity to bile duct cells >> hepatocytes\nLakehal et al, Chem Res Toxicol, 2001\nFaSurvivors: serum phosphate <1.2 mmol/l at 48 to 96 hours\nAPAP= acetaminophen; d/c=discontinue; NPO- nothing bmouth; prn=as needed, Rx=prescribe\n";
kMDItemTitle = "Drug-Induced Hepatotoxicity";
}'
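
One thing the dump above makes visible is that part of the indexed text arrives one character per line (P, a, t, t, e, r, n, ...). The Python sketch below is only an illustration of why such text might not match a whole-word search even though it was technically imported. `SAMPLE` is two short excerpts copied from the dump and joined together. The function names are hypothetical, and the whole-word matching is an assumption for the sake of the example, not a description of how Spotlight actually tokenizes text.

```python
import re

# Two excerpts copied from the mdimport dump, joined for brevity.
# The dump prints line breaks as the two characters backslash + n.
SAMPLE = (
    r"Drug Delivery Reviews, 2001\nP\na\nt\nt\ne\nr\nn\nE\nf\nf\ne\nc\nt"
    r"\nBile Duct Cells\nToxicity to bile duct cells >> hepatocytes"
)


def indexed_words(text_content):
    """Return the lowercase words found in the escaped dump text.

    Assumption: a search only matches whole words (hypothetical model).
    """
    unescaped = text_content.replace("\\n", "\n").replace("\\t", "\t")
    return {w.lower() for w in re.findall(r"[A-Za-z0-9]+", unescaped)}


def single_letter_runs(text_content, min_len=4):
    """Join runs of one-character lines back into readable strings."""
    runs, current = [], []
    for line in text_content.split("\\n"):
        if len(line) == 1:
            current.append(line)
            continue
        if len(current) >= min_len:
            runs.append("".join(current))
        current = []
    if len(current) >= min_len:
        runs.append("".join(current))
    return runs


if __name__ == "__main__":
    words = indexed_words(SAMPLE)
    print("duct" in words)       # word stored normally
    print("pattern" in words)    # word stored one letter per line
    print(single_letter_runs(SAMPLE))
```

Under that assumption, "duct" is found as a word, while "pattern" exists in the dump only as seven separate one-letter lines.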
 
tkjazzer

So again, I can't seem to type any words I see within that file and have them show up in Spotlight.

However, I can type a word that appears in the file name and that shows up.

The Office Word documents index fine. The PDFs index fine.

What should I do again?

Can you break it down into steps?

Thank you!
 
tkjazzer

ok, something did work.

but only some words inside were indexed.

If you look at the output above: I typed "duct cells" in Spotlight, which appears to have been indexed, and Spotlight found it.

SO WHY ARE ONLY SOME WORDS INSIDE THE PPT FILES BEING INDEXED WHILE OTHERS ARE NOT?

so frustrating.
 
tkjazzer

> Well what does mdimport return??

The exact same partial index that it did the first time I checked what was indexed.

> Obviously, there is a nasty spotlight issue on your Mac :-\

Yup, very frustrating. I don't know what to do about it.

> A conflict maybe??

How do I tell? It only has a problem with .ppt files, where the text is only partially indexed. Very few words get indexed.

Microsoft Word and PDF files are indexed in their entirety.

> mdimport should be able to import the file.

It imported the exact information that it already had: a partial index. Not every word in the .ppt gets indexed, far from it.

> If it fails, then no need re-indexing the entire drive (since mdimport
> would then do the same thing, on a much larger scale, simply failing
> over and over again on each and every Office file).

What counts as failure? I mean, it indexed something, but not everything.

And not every Office file is the problem, just .ppt.

> You might want to consider booting on your MacOS X DVD to run
> DiskUtility from there to fix the drive,

I have no idea how to do this.

> You might also want to repair Permissions from Disk Utility (booting
> from your own drive this time).

I have no idea how to do this.

Thank you so much for your time and help,
 
Corentin Cras-Méneur

> OK, what happen was I made this extremely long post to put on this
> forum and the post got cut off and I lost like 3/4 of the post. Of course I
> didn't think of copying the whole thing in case it happened.

Hehehe, that's why I always use a dedicated newsreader for newsgroups
instead of a Web Interface ;-)

> So once I need a break from studying I'll try to replicate more of the post.
>
> So Office Word and Acrobat Profession 8 is Indexing EVERYTHING fine.
>
> It is just ppt and pptx files.


I just played around with a pptx file.
I got:
Corentin:~ corentin$ mdimport -n -d2
/Volumes/Gloubi/Users/me/Documents/Office\ Projects/Présentations/Beta\
cell\ regeneration\ review.pptx
2008-04-18 13:03:22.752 mdimport[20552:10b] Imported
'/Volumes/Gloubi/Users/me/Documents/Office Projects/Présentations/Beta
cell regeneration review.pptx' of type
'org.openxmlformats.presentationml.presentation' with no plugIn.
2008-04-18 13:03:22.755 mdimport[20552:10b] Attributes: {
"_kMDItemFinderLabel" = <null>;
"com_apple_metadata_modtime" = 223533394;
kMDItemContentCreationDate = 2008-01-31 22:36:34 -0600;
kMDItemContentModificationDate = 2008-01-31 22:36:34 -0600;
kMDItemContentType =
"org.openxmlformats.presentationml.presentation";
kMDItemContentTypeTree = (
"org.openxmlformats.presentationml.presentation",
"org.openxmlformats.openxml",
"public.zip-archive",
"com.pkware.zip-archive",
"public.data",
"public.item",
"com.apple.bom-archive",
"public.archive",
"public.presentation",
"public.composite-content",
"public.content"
);
kMDItemDisplayName = {
"" = "Beta cell regeneration review.pptx";
};
kMDItemKind = {
"" = "Microsoft PowerPoint presentation";
};
}



As you can see, the file is indexed, but none of its content is.
That's not a bug though, it's a limitation. The mdimporter doesn't index
the content.
I believe that even though the mdimporter is made by MS, it actually
ships through Apple with the System.
Any improvement in this respect could only come through Apple.
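
The content type tree above lists public.zip-archive: a .pptx file is a zip archive, and the slide text sits in XML parts inside it. As a rough way to see what text a file actually holds when no importer indexes it, here is a small Python sketch. It is not something Apple or Microsoft ship, and the function name `pptx_slide_text` is hypothetical. It assumes the standard Office Open XML layout (slides under ppt/slides/, text in DrawingML `a:t` elements) and does not apply to the older binary .ppt format.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace used for text runs (<a:t>) in Office Open XML slides.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml")


def pptx_slide_text(path):
    """Return {slide part name: [text runs]} for a .pptx file.

    Hypothetical helper: reads the zip archive directly, no Office needed.
    """
    texts = {}
    with zipfile.ZipFile(path) as archive:
        names = [n for n in archive.namelist() if SLIDE_RE.fullmatch(n)]
        # Sort numerically so slide10 comes after slide2.
        names.sort(key=lambda n: int(SLIDE_RE.fullmatch(n).group(1)))
        for name in names:
            root = ET.fromstring(archive.read(name))
            texts[name] = [t.text for t in root.iter(A_NS + "t") if t.text]
    return texts


if __name__ == "__main__":
    import sys

    # Usage: python pptx_text.py some_presentation.pptx
    for slide, runs in pptx_slide_text(sys.argv[1]).items():
        print(slide, runs)
```

Comparing this output with what Spotlight finds would at least show whether the text is present in the file and simply not indexed.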


Corentin
 
Corentin Cras-Méneur

> so again, I can't seem to type any words I see within that file and have
> them show up in the spotlight.
>
> However, I can type a word that appears in the file name and that shows
> up.


So if I'm getting this right:
- words appearing in the index are fine, but not everything is indexed??
That's not a failure, it's more likely to be a bug in the importer (or a
corruption in the file)

Corentin
 
Corentin Cras-Méneur

> ok, something did work.
>
> but only some words inside were indexed.
>
> so if you look up. I typed "duct cells" in spotlight which appears to
> have been indexed, and spotlight showed it.
>
> SO WHY ARE ONLY SOME WORDS INSIDE THE PPT FILES BEING INDEXED WHILE
> OTHERS ARE NOT?


Well, as I was saying in another post, either the file is corrupted, or
the mdimporter is buggy.
It could also be that the mdimporter stops importing after a certain
size in the file to avoid over-crowding the index (that would be sad...
but it is a possibility)
That's the only explanation that comes to my mind.

Reindexing could help (since you saw some improvement here), but it
doesn't look like it will index your files in their entirety.

Out of curiosity... Are your files big?? Do they start off with a bunch of
graphics??
I'm really starting to wonder whether the mdimporter could simply stop
indexing after reaching some sort of pre-defined limit...

Corentin
 
Corentin Cras-Méneur

It might not be related to your problem after all, but these two tips
are worth knowing about

> I have no idea how to do this

Only repair a Tiger drive with a Tiger DVD, a Leopard drive with a
Leopard DVD, etc.
Put the DVD in the DVD drive, launch System Preferences and change
the startup options to boot from the DVD.
After the regular installation dialogs (select your language, etc.), you
should see the menu bar appear. Of course, DON'T choose to reinstall
your system. All you need to do is get to the first dialog with the
menu bar.
In the options in the menu bar, you can find Disk Utility. Select it to
launch it. In the application, select your internal drive and hit
"repair".
Once you are done, go back to the menu bar to find the startup
preferences. Reselect your internal drive and reboot from it.

> I have no idea how to do this.
>
> Thank you so much for your time and help,

On your computer, launch the Disk Utility application
(/Applications/Utilities). Select your hard drive on the left and click
the Repair Permissions button.


These two tricks can correct quite a few problems on your Mac. They are
worth running every once in a while.

Corentin
 
tkjazzer

Most of my ppt files are huge! (50 MB+)

However, I specifically chose this one because it is almost ALL text, even though it has 90+ slides of text.

This ppt file is only 5.9 MB.

So yes,

to clarify.
EVERY word in WORD files is being indexed.

Only SOME words in PPT files are being indexed.

The same words in the PPT files were indexed before and after reindexing that file.

What should I do?
tell microsoft?
tell apple?
ask a mac genius?
 
Corentin Cras-Méneur

> to clarify.
> EVERY word in WORD files are being indexed.
>
> Only SOME words in PPT files are being indexed.
>
> The same words in the PPT files were indexed before and after reindexing
> that file.
>
> What should I do?
> tell microsoft?
> tell apple?
> ask a mac genius?

I doubt the Mac Genius will be able to do anything for you since this
looks like a limitation of the mdimporter.
You can use the Send Feedback command in PPT to let MS know what's going
on, but don't expect a reply (though it will matter since it will let MS
know there is a problem: you can't fix a bug if you don't know it
exists).

I tried to escalate the information to the contacts we have at MS and I
hope it will get noticed...

Corentin
 
