Help Parsing 1.2 Gig XML File
Here is my situation: each month I receive a very large XML file containing data that I need to parse and load into a database. I just finished a very complex system to handle the data in this file, but I've run into a problem.
The problem is that I'm forced to open the file twice. There is a date in the header of the file that I need to check before I know whether I want to parse the file. I'm currently using Dom4j to parse the file, with a callback handler set to trigger when it finds my date node. What I'd like to do is stop the parser once I find the date node. Currently, I've got this code:
public void handleEffectiveDate() {
I was hoping that by removing the only handler the parser would stop, but it very obviously does not. Any recommendations? |
Re: Help Parsing 1.2 Gig XML File
XStream and Properties.load()
|
Re: Help Parsing 1.2 Gig XML File
How in the world would that help me?
|
Re: Help Parsing 1.2 Gig XML File
I'm not sure that reading only part of a document is an intended use of Dom4j. You might try throwing an exception from the handler once you have the date, and catching it where you call the method that begins parsing. This is just a shot in the dark.
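To make the idea concrete, here is a minimal sketch of the throw-to-stop pattern. It uses the JDK's built-in SAX parser rather than Dom4j, so it is an illustration of the technique and not the poster's actual code. The element name `effectiveDate`, the class `HeaderDatePeek`, and the `StopParsing` exception are all made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class HeaderDatePeek {

    /** Hypothetical marker exception, thrown on purpose to abort the parse. */
    static final class StopParsing extends SAXException {
        final String date;
        StopParsing(String date) {
            super("header date found");
            this.date = date;
        }
    }

    /** Returns the text of the first effectiveDate element, or null if there is none. */
    public static String readEffectiveDate(InputStream in) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside the date element

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("effectiveDate".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) throws SAXException {
                if ("effectiveDate".equals(qName)) {
                    // Nothing after this point in the file is read.
                    throw new StopParsing(text.toString().trim());
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (StopParsing stop) {
            return stop.date; // the expected way out when the date exists
        }
        return null; // reached the end of the document without seeing a date
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed><header><effectiveDate>2008-05-01</effectiveDate></header>"
                   + "<records><record/><record/></records></feed>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readEffectiveDate(in));
    }
}
```

Catching only the dedicated exception type keeps genuine parse errors from being mistaken for the deliberate stop.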
|
Re: Help Parsing 1.2 Gig XML File
It's ugly, but might work. I wouldn't think that one exception per undesirable file is a serious performance concern, either. There simply doesn't seem to be a clean or official way to go about it.
|
Re: Help Parsing 1.2 Gig XML File
I was able to figure out a solution. The date event handler now validates that the file is one I want to process. If it is, I set the additional event handlers; if not, I just kill the process.
It's ugly, I know, but I'd much rather kill the job than let it run and do nothing for another half hour. Using Dom4j, it takes only a few seconds to get the date and kill the job if I determine that the file shouldn't be processed. When the file should be processed, it takes about 6 minutes to get through the entire 1.2 gig file. I'm still a huge fan of Dom4j. |
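For anyone who wants the same check-the-date-then-continue flow without killing the process, one option is a pull parser, where stopping is simply a matter of not asking for the next event. The sketch below uses the JDK's StAX API instead of Dom4j, so it is a swapped-in technique and not what was described above. The element names `effectiveDate` and `record`, the class `GatedImport`, and the `wanted` predicate are assumptions for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Predicate;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class GatedImport {

    /**
     * Reads the document in a single pass. Returns the number of record
     * elements seen after an accepted date, or -1 if the header date was
     * rejected and the rest of the document was skipped.
     */
    public static int process(Reader source, Predicate<String> wanted) throws Exception {
        XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(source);
        try {
            boolean accepted = false;
            int records = 0;
            while (xml.hasNext()) {
                if (xml.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = xml.getLocalName();
                if (!accepted && "effectiveDate".equals(name)) {
                    String date = xml.getElementText().trim();
                    if (!wanted.test(date)) {
                        return -1; // stop pulling events; the rest is never parsed
                    }
                    accepted = true;
                } else if (accepted && "record".equals(name)) {
                    records++; // real code would load the record into the database here
                }
            }
            return records;
        } finally {
            xml.close(); // frees parser resources; the caller still closes the underlying source
        }
    }

    public static void main(String[] args) throws Exception {
        String doc = "<feed><header><effectiveDate>2008-05-01</effectiveDate></header>"
                   + "<records><record/><record/></records></feed>";
        System.out.println(process(new StringReader(doc), d -> d.startsWith("2008-05"))); // 2
        System.out.println(process(new StringReader(doc), d -> d.startsWith("2008-06"))); // -1
    }
}
```

The file is opened once either way: a rejected date returns after reading only the header, and an accepted date carries straight on into the records.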
Re: Help Parsing 1.2 Gig XML File
I am still curious to know what null_ptr0 is talking about with XStream and Properties.load(). I don't see how that is relevant to this at all.
|
Re: Help Parsing 1.2 Gig XML File
That still doesn't make sense. A lot of things "load from XML"... but whatever, I've got my solution.
|