Microsoft Word Parsing
I am attempting to parse a Microsoft Word document that has form fields in it, and my normal method of parsing Word documents isn't going to work here. Generally I use some API to open the document, save it as HTML, and then parse the HTML. However, when I try that here, I can't tell whether the check boxes are checked, because the check state is not saved to the HTML. Has anyone tried this before? My next option is to look into C# and the Word API to see if I can do it that way.
|
You might look into VBA/VB. It's ugly, I know, but it deals with the underlying objects, even if it looks like Quasimodo.
|
Thanks DaWei, I am looking into that now and you are correct. It's very ugly, but it seems that VBA is what I need to get the job done.
|
What version of Word? I thought the latest version saved things as XML?
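If the file is (or can be re-saved as) a Word 2007 .docx, here is a rough sketch of what reading the check state straight from the XML might look like. It assumes the check boxes are legacy form fields, which as far as I know are stored in word/document.xml as `w:ffData` elements containing a `w:checkBox`, with `w:checked` holding the current state and `w:default` the initial one. The function name `checkbox_states` is made up for illustration, and none of this applies to the older binary .doc format or to Word 2003's XML, which uses a different schema.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by .docx files (Word 2007 and later).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def _is_on(elem):
    """Read an on/off element: a missing w:val attribute counts as 'on'."""
    val = elem.get(f"{{{W}}}val")
    return val is None or val.lower() in ("1", "true", "on")


def checkbox_states(docx_file):
    """Return {field name: bool} for legacy check box form fields.

    Hypothetical helper: docx_file may be a path or a file-like object.
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    states = {}
    for ff in root.iter(f"{{{W}}}ffData"):
        box = ff.find("w:checkBox", NS)
        if box is None:
            continue  # a text or drop-down form field, not a check box

        name_el = ff.find("w:name", NS)
        if name_el is not None:
            name = name_el.get(f"{{{W}}}val")
        else:
            name = f"checkbox{len(states) + 1}"  # assumed fallback label

        checked = box.find("w:checked", NS)
        default = box.find("w:default", NS)
        if checked is not None:
            states[name] = _is_on(checked)   # state set by the user
        elif default is not None:
            states[name] = _is_on(default)   # never toggled: use the default
        else:
            states[name] = False
    return states
```

Calling `checkbox_states("form.docx")` (the file name is only a placeholder) would give back a dictionary keyed by the bookmark name of each check box. Reading the zip directly avoids automating Word at all, though the VBA or C# route is probably still the safer bet for .doc files.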
|
I haven't seen an example of 2007's XML support, but if you ever saw a sample from 2003, you would throw up.
|
Copyright ©2007 DaniWeb® LLC