Programming Forums
Old Jan 11th, 2008, 3:08 PM   #1
fahlyn
Newbie
 
Join Date: Nov 2007
Posts: 15
Rep Power: 0 fahlyn is on a distinguished road
Help Parsing 1.2 Gig XML File

Here is my situation: I currently get a very large XML file each month that contains data I need to parse and load into a database. I just finished a very complex system to handle the data in this file, but I've run into a problem.

The problem is that I'm forced to open the file twice. There is a date in the header of the file that I need to grab and look at before I know whether I want to parse the file. I'm using Dom4j to parse the file, and I set a callback handler that triggers when it finds my date node. What I'd like to be able to do is stop the parser once I find the date node. So far I've got this code:

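	// Uses org.dom4j.Element, org.dom4j.ElementHandler and org.dom4j.ElementPath;
	// reader is assumed to be the org.dom4j.io.SAXReader that parses the file.
	public void handleEffectiveDate() {
		reader.addHandler("/Package/PackageHeader/AsOfDate",
			new ElementHandler() {
				public void onStart(ElementPath path) {}
				public void onEnd(ElementPath path) {
					Element asOfDate = path.getCurrent();
					// AsOfDate holds three integers separated by "-"; trim stray whitespace first
					String[] date = asOfDate.getTextTrim().split("-");
					DataWarehouse32Assembler assembler =
						(DataWarehouse32Assembler) getBatchUploadProcesser().getAssembler();
					assembler.setUploadEffectiveDate(Integer.parseInt(date[0]),
						Integer.parseInt(date[1]), Integer.parseInt(date[2]));
					System.out.println("Data Effective Date saved in assembler!");
					// there can be only one, like the Highlander, so once you find it stop looking
					// (removing the handler only stops callbacks; the parse itself keeps running)
					reader.removeHandler("/Package/PackageHeader/AsOfDate");
					// detach the element so it is not kept in the in-memory tree
					asOfDate.detach();
				}
			}
		);
	}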

I was hoping that removing the only handler would make the parser stop, but it very obviously does not.

Any recommendations?
Old Jan 11th, 2008, 4:12 PM   #2
null_ptr0
11 years old
 
Join Date: Nov 2007
Posts: 79
Rep Power: 1 null_ptr0 is on a distinguished road
Re: Help Parsing 1.2 Gig XML File

XStream and Properties.load()
__________________
iload_0 iconst_1 ishl or
iload_0 iconst_2 idiv or
iload_0 iconst_2 iconst_1 imul idiv
[1] & [2] use the smallest stack size
Old Jan 11th, 2008, 9:20 PM   #3
fahlyn
Newbie
 
Join Date: Nov 2007
Posts: 15
Rep Power: 0 fahlyn is on a distinguished road
Re: Help Parsing 1.2 Gig XML File

How in the world would that help me?
Old Jan 11th, 2008, 9:46 PM   #4
Dameon
Troll
 
Join Date: Apr 2005
Location: Texas
Posts: 732
Rep Power: 4 Dameon is on a distinguished road
Re: Help Parsing 1.2 Gig XML File

I'm not sure that reading only part of a document is an intended use of Dom4j. You might try throwing an exception and catching it where you call the method to begin parsing. This is just a shot in the dark.
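The "throw an exception and catch it at the call site" idea can be sketched as follows. This is only an illustration under stated assumptions: it uses the SAX parser that ships with the JDK rather than Dom4j, and the class name `HeaderDateProbe`, the exception `StopParsing`, and the sample document are all made up for the example. Only the element path `/Package/PackageHeader/AsOfDate` comes from the original post.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class HeaderDateProbe {

    /** Hypothetical exception, thrown on purpose to abort the parse. */
    static final class StopParsing extends SAXException {
        final String asOfDate;

        StopParsing(String asOfDate) {
            super("AsOfDate found; stopping parse");
            this.asOfDate = asOfDate;
        }
    }

    /** Returns the text of /Package/PackageHeader/AsOfDate, or null if absent. */
    public static String readAsOfDate(InputStream in) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder path = new StringBuilder();
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                path.append('/').append(qName); // track the current element path
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) throws SAXException {
                if (path.toString().equals("/Package/PackageHeader/AsOfDate")) {
                    // Abort here: nothing after this element is read.
                    throw new StopParsing(text.toString().trim());
                }
                path.setLength(path.length() - qName.length() - 1);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (StopParsing stop) {
            return stop.asOfDate; // the expected way out once the date is found
        }
        return null; // whole document read without finding the date
    }

    public static void main(String[] args) throws Exception {
        // Deliberately truncated after the header: the parse never gets that far.
        String xml = "<Package><PackageHeader><AsOfDate>2008-01-11</AsOfDate>"
                   + "</PackageHeader><Body><Row>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readAsOfDate(in));
    }
}
```

Carrying the date inside the exception keeps the catch site simple, and one exception per file costs next to nothing compared with reading 1.2 GB. Whether Dom4j's reader passes such an exception through unchanged is not confirmed in this thread and would need checking.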
__________________
MD5(sig) = bcef75433db02e9ad9bf81d6f7c5c270
Old Jan 11th, 2008, 9:59 PM   #5
null_ptr0
11 years old
 
Join Date: Nov 2007
Posts: 79
Rep Power: 1 null_ptr0 is on a distinguished road
Re: Help Parsing 1.2 Gig XML File

Quote:
Originally Posted by Dameon View Post
I'm not sure that reading only part of a document is an intended use of Dom4j. You might try throwing an exception and catching it where you call the method to begin parsing. This is just a shot in the dark.
I DON'T think throwing errors as a kind of event-handling mechanism is a good thing for the JVM, or anything.
Old Jan 11th, 2008, 11:10 PM   #6
Dameon
Troll
 
Join Date: Apr 2005
Location: Texas
Posts: 732
Rep Power: 4 Dameon is on a distinguished road
Re: Help Parsing 1.2 Gig XML File

It's ugly, but might work. I wouldn't think that one exception per undesirable file is a serious performance concern, either. There simply doesn't seem to be a clean or official way to go about it.
Old Jan 12th, 2008, 7:41 AM   #7
fahlyn
Newbie
 
Join Date: Nov 2007
Posts: 15
Rep Power: 0 fahlyn is on a distinguished road
Re: Help Parsing 1.2 Gig XML File

I was able to figure out a solution. The date event handler now validates that the file is one I want to process. If it is, I set the additional event handlers; if not, I just kill the process.

It's ugly, I know, but I'd much rather just kill the job than let it run and do nothing for another half hour.

Using Dom4j it takes only a few seconds to get the date and kill the job if I determine that the file shouldn't be processed. When I determine that the file should be processed it takes about 6 minutes to process the entire 1.2 gig file.

I'm still a huge fan of Dom4j.
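For comparison, the same "check the date first, then decide" flow can be done in a single pass without killing the process by using a pull parser, where the caller simply stops asking for events. The sketch below uses the JDK's StAX API rather than Dom4j. The class name `SinglePassGate`, the `Record` element, and the rule "process only when the date equals an expected string" are assumptions invented for illustration; the thread does not say what the real validation or record elements look like.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class SinglePassGate {

    /**
     * Reads the stream once. Checks AsOfDate first, then either counts the
     * (hypothetical) Record elements that follow, or returns -1 without
     * reading any further when the date is not the wanted one.
     */
    public static int processIfWanted(Reader source, String wantedDate) throws Exception {
        XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(source);
        try {
            boolean accepted = false;
            int records = 0;
            while (xml.hasNext()) {
                if (xml.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = xml.getLocalName();
                if (!accepted && "AsOfDate".equals(name)) {
                    if (!wantedDate.equals(xml.getElementText().trim())) {
                        return -1; // unwanted file: stop pulling events
                    }
                    accepted = true; // from here on, the real work is enabled
                } else if (accepted && "Record".equals(name)) {
                    records++; // stand-in for the real per-record processing
                }
            }
            return records;
        } finally {
            xml.close();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Package><PackageHeader><AsOfDate>2008-01-01</AsOfDate></PackageHeader>"
                   + "<Record/><Record/><Record/></Package>";
        System.out.println(processIfWanted(new StringReader(xml), "2008-01-01"));
        System.out.println(processIfWanted(new StringReader(xml), "2007-12-01"));
    }
}
```

The `accepted` flag plays the role of "setting the additional event handlers" once the date has been validated, and the early `return` replaces killing the job.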
Old Jan 12th, 2008, 7:43 AM   #8
fahlyn
Newbie
 
Join Date: Nov 2007
Posts: 15
Rep Power: 0 fahlyn is on a distinguished road
Re: Help Parsing 1.2 Gig XML File

I am still curious to know what null_ptr0 is talking about with XStream and Properties.load(). I don't see how that is relevant to this at all.
Old Jan 16th, 2008, 5:01 PM   #9
null_ptr0
11 years old
 
Join Date: Nov 2007
Posts: 79
Rep Power: 1 null_ptr0 is on a distinguished road
Re: Help Parsing 1.2 Gig XML File

Quote:
Originally Posted by fahlyn View Post
I am still curious to know what null_ptr0 is talking about with XStream and Properties.load(). I don't see how that is relevant to this at all.
They load from XML.
Old Jan 17th, 2008, 6:09 PM   #10
fahlyn
Newbie
 
Join Date: Nov 2007
Posts: 15
Rep Power: 0 fahlyn is on a distinguished road
Re: Help Parsing 1.2 Gig XML File

That still doesn't make sense. A lot of things "load from XML"... but whatever, I've got my solution.
DaniWeb IT Discussion Community
All times are GMT -5.

Powered by vBulletin® Version 3.7.0, Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Copyright ©2007 DaniWeb® LLC