#1 |
|
Newbie
Join Date: Nov 2007
Posts: 15
Rep Power: 0
Help Parsing 1.2 Gig XML File
Here is my situation: every month I receive a very large XML file containing data that I need to parse and load into a database. I just finished a very complex system to handle the data in this file; however, I've run into a problem.
The problem is that I'm forced to open the file twice. There is a date in the header of the file that I need to grab and look at before I know if I want to parse the file. Currently, I'm using Dom4j to parse the file. I'm setting a callback handler to trigger when it finds my date node. What I'd like to be able to do is stop the parser once I find the date node. Currently, I've got this code:

    public void handleEffectiveDate() {
        reader.addHandler("/Package/PackageHeader/AsOfDate",
            new ElementHandler() {
                public void onStart(ElementPath path) {}
                public void onEnd(ElementPath path) {
                    Element asOfDate = path.getCurrent();
                    String[] date = asOfDate.getStringValue().split("-");
                    ((DataWarehouse32Assembler) getBatchUploadProcesser().getAssembler())
                        .setUploadEffectiveDate(Integer.parseInt(date[0]), Integer.parseInt(date[1]), Integer.parseInt(date[2]));
                    System.out.println("Data Effective Date saved in assembler!");
                    //there can be only one, like the highlander, so once you find it stop looking
                    reader.removeHandler("/Package/PackageHeader/AsOfDate");
                    asOfDate.detach();
                }
            }
        );
    }

I was hoping that by removing the only handler the parser would stop, however it very obviously does not. Any recommendations?
|
|
|
|
|
#2 |
|
12 years old
Join Date: Nov 2007
Posts: 80
Rep Power: 1
Re: Help Parsing 1.2 Gig XML File
XStream and Properties.load()
__________________
iload_0 iconst_1 ishl or iload_0 iconst_2 idiv or iload_0 iconst_2 iconst_1 imul idiv [1] & [2] use the smallest stack size |
|
|
|
|
|
#3 |
|
Newbie
Join Date: Nov 2007
Posts: 15
Rep Power: 0
Re: Help Parsing 1.2 Gig XML File
How in the world would that help me?
|
|
|
|
|
|
#4 |
|
Troll
Join Date: Apr 2005
Location: Texas
Posts: 732
Rep Power: 4
Re: Help Parsing 1.2 Gig XML File
I'm not sure that reading only part of a document is an intended use of Dom4j. You might try throwing an exception and catching it where you call the method to begin parsing. This is just a shot in the dark.
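For what it's worth, the throw-and-catch idea could look roughly like the sketch below. It rests on a few assumptions: it uses the JDK's built-in SAX parser rather than Dom4j (so it runs without extra jars), and the `DateFound` exception, the `readAsOfDate` method, and the sample document are all made up for illustration. The handler throws as soon as the AsOfDate element closes, and the caller catches that specific exception, so no further handler callbacks run.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EarlyExitSketch {

    /** Hypothetical marker exception: carries the date out and signals "stop parsing". */
    static final class DateFound extends SAXException {
        final String date;

        DateFound(String date) {
            super("AsOfDate found");
            this.date = date;
        }
    }

    /** Returns the text of the first AsOfDate element, or null if the document has none. */
    static String readAsOfDate(InputStream in)
            throws IOException, SAXException, ParserConfigurationException {
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <AsOfDate>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("AsOfDate".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) throws SAXException {
                if ("AsOfDate".equals(qName)) {
                    // Abort the parse here; the caller below catches this exact type.
                    throw new DateFound(text.toString().trim());
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
            return null; // reached the end of the document without seeing AsOfDate
        } catch (DateFound found) {
            return found.date;
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Package><PackageHeader><AsOfDate>2007-11-30</AsOfDate></PackageHeader>"
                   + "<Body><Row/><Row/></Body></Package>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readAsOfDate(in)); // prints 2007-11-30
    }
}
```

Catching only the dedicated exception type keeps genuine parse errors visible. In Dom4j the same idea would presumably need an unchecked exception, since the `onEnd` method shown in the first post declares no checked exceptions; I have not verified how the reader surfaces it.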
__________________
MD5(sig) = bcef75433db02e9ad9bf81d6f7c5c270 |
|
|
|
|
|
#5 |
|
12 years old
Join Date: Nov 2007
Posts: 80
Rep Power: 1
Re: Help Parsing 1.2 Gig XML File
I DON'T think throwing exceptions as a kind of event-handling mechanism is a good thing for the JVM, or anything.
|
|
|
|
|
#6 |
|
Troll
Join Date: Apr 2005
Location: Texas
Posts: 732
Rep Power: 4
Re: Help Parsing 1.2 Gig XML File
It's ugly, but might work. I wouldn't think that one exception per undesirable file is a serious performance concern, either. There simply doesn't seem to be a clean or official way to go about it.
|
|
|
|
|
#7 |
|
Newbie
Join Date: Nov 2007
Posts: 15
Rep Power: 0
Re: Help Parsing 1.2 Gig XML File
I was able to figure out a solution. The date event handler now validates that the file is one I want to process. If it is, I register the additional event handlers; if not, I just kill the process.
It's ugly, I know, but I'd much rather kill the job than let it run and do nothing for another half hour. Using Dom4j, it takes only a few seconds to get the date and kill the job if I determine that the file shouldn't be processed. When the file should be processed, it takes about 6 minutes to process the entire 1.2 gig file. I'm still a huge fan of Dom4j.
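For anyone who finds this thread later, here is a rough sketch of the same "check the date first, then decide" gating done in a single pass without killing the JVM. It is not the Dom4j code described above: it swaps in the JDK's StAX pull parser, where stopping early is just a matter of not asking for the next event. The `Record` element name, the `process` method, and the date test are assumptions made up for the example.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Predicate;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class GatedParseSketch {

    /**
     * Reads the document once. Returns the number of Record elements seen after
     * the header date was accepted, or -1 if the date was rejected and the rest
     * of the input was left unread.
     */
    static int process(Reader source, Predicate<String> wanted) throws XMLStreamException {
        XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(source);
        try {
            boolean accepted = false;
            int records = 0;
            while (xml.hasNext()) {
                if (xml.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = xml.getLocalName();
                if ("AsOfDate".equals(name)) {
                    String date = xml.getElementText().trim();
                    if (!wanted.test(date)) {
                        return -1; // unwanted file: stop pulling events
                    }
                    accepted = true; // from here on, the "real" handling is switched on
                } else if (accepted && "Record".equals(name)) {
                    records++; // stand-in for the actual per-record work
                }
            }
            return records;
        } finally {
            xml.close(); // releases parser resources; does not close the underlying Reader
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String doc = "<Package><PackageHeader><AsOfDate>2007-11-30</AsOfDate></PackageHeader>"
                   + "<Record/><Record/></Package>";
        System.out.println(process(new StringReader(doc), d -> d.startsWith("2007-11"))); // prints 2
        System.out.println(process(new StringReader(doc), d -> d.startsWith("2008")));    // prints -1
    }
}
```

With a real 1.2 gig file the source would be a buffered file reader or stream instead of a `StringReader`.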
|
|
|
|
|
#8 |
|
Newbie
Join Date: Nov 2007
Posts: 15
Rep Power: 0
Re: Help Parsing 1.2 Gig XML File
I am still curious to know what null_ptr0 is talking about with XStream and Properties.load(). I don't see how that is relevant to this at all.
|
|
|
|
|
|
#9 |
|
12 years old
Join Date: Nov 2007
Posts: 80
Rep Power: 1
Re: Help Parsing 1.2 Gig XML File
They load from XML.
|
|
|
|
|
#10 |
|
Newbie
Join Date: Nov 2007
Posts: 15
Rep Power: 0
Re: Help Parsing 1.2 Gig XML File
That still doesn't make sense. A lot of things "load from XML"... but whatever, I've got my solution.
|
|
|
|