Rob Garrett - Blogs


Software/Technology Discussion

Software and Technology Tid-bits

Visio 2003 IFilter and MTA

I recently had a problem instantiating the Visio 2003 IFilter from C#.

The following code instantiates an IFilter using COM Interop. The CLSID is obtained from the registry (method excluded for brevity) and then used to obtain the COM object type; the activator then instantiates the object from that type. I tested the code from a C# console application and it works great: the CLSID is correctly obtained from the registry for the VSD extension, and the activator instantiates the COM object.

When I call this same code from a WinForms application, the method throws an invalid cast exception when casting the activated object to an IFilter. Stepping through with the debugger shows me that the CLSID is obtained correctly, the type is returned from Type.GetTypeFromCLSID, and the call to the activator also succeeds.
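The original listing is not reproduced on this page, so here is a minimal sketch of the steps described above, not the author's actual code. It assumes the CLSID that appears in the registry path later in this post is the Visio IFilter's class ID, and it declares a stripped-down `IFilter` interface (methods omitted) that is only good for testing the cast. The names `VisioFilterDemo` and `CreateFilter` are hypothetical. `Thread.GetApartmentState` requires .NET 2.0 or later.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Stripped-down stand-in for the IFilter COM interface. Only the IID is
// needed for the cast (QueryInterface); the real interface also declares
// Init, GetChunk, GetText, GetValue and BindRegion, omitted here.
[ComImport]
[Guid("89BCB740-6119-101A-BCB7-00DD010655AF")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IFilter
{
}

static class VisioFilterDemo
{
    // Assumption: the CLSID from the registry path quoted in this post.
    static readonly Guid FilterClsid =
        new Guid("FAEA5B46-761B-400E-B53E-E805A97A543E");

    static void CreateFilter()
    {
        // Report which apartment the calling thread lives in.
        Console.WriteLine("Apartment: {0}",
            Thread.CurrentThread.GetApartmentState());

        // Resolve the COM type from the CLSID, then activate it.
        Type comType = Type.GetTypeFromCLSID(FilterClsid);
        object instance = Activator.CreateInstance(comType);

        try
        {
            // The cast triggers QueryInterface for IID_IFilter.
            IFilter filter = (IFilter)instance;
            Console.WriteLine("Cast to IFilter succeeded.");
        }
        catch (InvalidCastException ex)
        {
            Console.WriteLine("Cast to IFilter failed: {0}", ex.Message);
        }
        finally
        {
            Marshal.ReleaseComObject(instance);
        }
    }

    static void Main()
    {
        // No [STAThread] attribute, so a console Main runs in the MTA.
        CreateFilter();
    }
}
```

Printing the apartment state next to the cast result makes it easier to compare the console and WinForms cases, since the same method can be called from either host.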


I read somewhere that the threading model of a COM component matters when it is called from a multithreaded environment, so I made sure the code calling my method runs on a single-threaded apartment (STA) thread. This change from multithreaded apartment (MTA) to STA fixed the same problem with the Adobe PDF IFilter, but did nothing for the Visio IFilter.

I had just about given up when I decided to check the threading model in the registry. I noticed that the PDF IFilter's threading model is set to Apartment, whereas the Visio IFilter's threading model is blank. Setting the following registry REG_SZ value to Apartment fixes the problem for Visio as well: HKLM\SOFTWARE\Classes\CLSID\{FAEA5B46-761B-400E-B53E-E805A97A543E}\InprocServer32\ThreadingModel. I'm not sure whether this is a problem with the installation of the Visio IFilter or intended behavior. When I reinstalled the IFilter, the threading model was reset to empty. It's not an elegant solution, but it seems to work.
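For anyone who wants to apply the same change without editing the registry by hand, the value described above could be expressed as a .reg file along these lines. This is only a sketch of the workaround: it uses the key and value exactly as quoted in this post, it has to be re-applied after reinstalling the IFilter, and it is worth exporting the key as a backup first.

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\{FAEA5B46-761B-400E-B53E-E805A97A543E}\InprocServer32]
"ThreadingModel"="Apartment"
```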
Published Monday, January 24, 2005 9:52 AM by Rob Garrett


Comments

 

Raldo said:

I don't know if anyone else has had this before, but I have tried (to no avail) to get the IFilter working on all Word docs. Using the LoadIFilter method seems to work fine on 90-95% of the docs on my PC.

I then used the same example as above, creating a type from the CLSID, but the problem persists. The error thrown is "Catastrophic failure" as soon as the Load method is called. I also tried spawning a separate thread and setting it to STA.

I haven't found any information about this problem anywhere on the web. Do you think it is the way the filter is initialized, or is it actually a problem with Microsoft's offfilt.dll?
August 14, 2005 9:33 AM
 

Rob Garrett said:

Raldo,
I managed to get the IFilter to work for Office documents; I only ran into problems with the IFilter for Visio. I used the same code as above. If you're getting a catastrophic error, then I would guess that there is a problem with your installation of the Office IFilter, or even Office itself. Try your code on another machine (if possible) with a new installation of Office. I would be happy to look at your code if you emailed it to me.
August 14, 2005 11:05 PM
 

Raldo said:

Rob,
Thanks for the offer to help. I've got quite a bit of other code to finalise, and then I may send you a sample if I haven't managed to solve it myself.
August 18, 2005 7:36 AM
 

Raldo said:

Hi Rob

I have made some interesting discoveries with regard to the IFilter and would like to have your opinion. Microsoft provides an IFilter testing framework with the Platform SDK: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/indexsrv/html/ixufilt_96n9.asp

When I run this test on my Word docs, it always fails on the same set of docs that fail in my own code. From this I concluded (or at least assume) that nothing is wrong with my code. To take it further, I installed the latest Windows Desktop Search on a test machine and ran the same tests on the same set of docs, and all worked fine. (I believe this testing framework uses LoadIFilter.)

This led me to think that Microsoft's default IFilter shipped with Windows 2000/XP etc. cannot filter all docs, and that this is why Microsoft is shipping updated IFilters with its Desktop Search tool.

My actual question is: I use LoadIFilter through Platform Invoke to determine which IFilter to use. I see you prefer to get the type based on the doc CLSID and create an IFilter from that. Is there any particular reason why you prefer doing it this way?
September 13, 2005 11:13 AM
 

Raldo said:

Sorry, I didn't actually go over your code again. I see you do use LoadIFilter if the CreateInstance call fails. But still, is this better, and if so, why?

I would have thought it would be the other way around.
September 13, 2005 11:30 AM
 

Raldo said:

Hi

I tried your method and got this error.

System.Runtime.InteropServices.COMException (0x8000FFFF): Catastrophic failure
at System.Runtime.InteropServices.UCOMIPersistFile.Load(String pszFileName, Int32 dwMode)

I also get this catastrophic failure with the Microsoft IFilter testing framework for the test set that doesn't want to filter. This means that there could hopefully be a workaround for this (which I have been unable to find), or I have to write my own IFilter like Microsoft did for MSN Desktop Search, or wait until Microsoft provides a free update I can ship with our product.
September 13, 2005 12:17 PM
 


Rob Garrett said:

Raldo,
Thanks for the numerous comments that you posted. Sorry if I have been a little behind in getting back to you.

From the nature of your posts, it sounds like you have a lot more knowledge on this subject than I do. I found that a lot of experimenting was required to get IFilters to work correctly in custom code.

With regard to your Word docs, if the desktop framework gives you an error, then it can mean either that the IFilter is broken or that your Word doc has something funky in it that prevents it from opening. I found that documents with any form of security set on them would cause problems - but this generally meant that the Load method came back with E_FAIL, not a catastrophic error. If I can find some time, I could check out your documents and find out why my code does not work with them.

BTW, there are a ton more comments to my other post about IFilters here: http://robgarrett.com/Blogs/software/archive/2005/01/11/442.aspx

I'm not sure how much more time I can put into the IFilter problems - I essentially wrote some code for a company I was working for a while ago, and I chalked IFilters up to being unreliable at best. As with all third-party libraries, your code's stability is at the mercy of the implementation of the IFilters you use, which concerned me. I'll do my best to assist you.
October 6, 2005 9:31 AM
 

alon said:

Hi.
I am trying to extract text from Excel documents using the Microsoft IFilter (OFFFilt.dll).

My problem is with the cell order - the extracted text is not in the order in which it appears in the spreadsheet (down then over / over then down). I think it is related to the time the cells were created/modified.

Can I control that extraction order?
Thanx alon.
January 3, 2006 5:37 AM
 

Eyal Post said:

See http://www.epocalipse.com/blog for an article about using IFilter in C#
March 12, 2006 8:32 AM

Rob Garrett is a British Expat living in Maryland USA. Rob is a trained software engineer and experienced in Windows .NET development.

Rob enjoys listening to Rock music, posting to blogs, driving in the country with the sunroof open, beer (not in conjunction with country driving) and spending time with his family.
