Programming Forums
Jul 17th, 2006, 4:21 AM   #1
quantalfred
Data getting question

I want to get some data but the data is in the following format:

A-MAX < U40,000-0.172 20,000-0.172 U55,000-0.172 100,000-0.172 >[
520,000-0.17 210,000-0.172 190,000-0.17 60,000-0.172
210,000-0.17 100,000-0.169 130,000-0.168 80,000-0.166
20,000-0.165 15,000-0.168 Y50,000-0.168 50,000-0.165
200,000-0.168 960,000-0.165 Y50,000-0.165 420,000-0.163
50,000-0.162 130,000-0.161 100,000-0.162 400,000-0.161
Y100,000-0.16 110,000-0.16 280,000-0.161 300,000-0.162
400,000-0.163 200,000-0.164 145,000-0.163 200,000-0.164
350,000-0.165 250,000-0.164 700,000-0.165 150,000-0.166
Y50,000-0.166 200,000-0.166 Y320,000-0.166 80,000-0.165
50,000-0.166 100,000-0.165 200,000-0.166 Y560,000-0.166
440,000-0.166 80,000-0.165 100,000-0.166 695,000-0.165
220,000-0.164 100,000-0.163 200,000-0.164 800,000-0.163
390,000-0.162 255,000-0.161 100,000-0.162 Y15,000-0.161
200,000-0.161 2,395,000-0.16 770,000-0.159 695,000-0.158
405,000-0.159 200,000-0.157 350,000-0.158 725,000-0.157
705,000-0.158 140,000-0.157 305,000-0.158 415,000-0.159
1,135,000-0.157 150,000-0.156 140,000-0.155 290,000-0.156
65,000-0.157 160,000-0.156 260,000-0.155 Y100,000-0.155
120,000-0.155 1,150,000-0.156 300,000-0.157 480,000-0.156
Y385,000-0.157 615,000-0.157 330,000-0.155 270,000-0.156
750,000-0.157 40,000-0.156 100,000-0.157 420,000-0.156
Y70,000-0.156 100,000-0.156 100,000-0.157 400,000-0.156
730,000-0.155 Y20,000-0.155 20,000-0.155 1,180,000-0.156
50,000-0.155 100,000-0.156 40,000-0.155 645,000-0.156
540,000-0.157 655,000-0.156 100,000-0.157 570,000-0.155
Y45,000-0.155 615,000-0.155 150,000-0.157 30,000-0.156
10,000-0.157 500,000-0.158 20,000-0.157 100,000-0.158
500,000-0.159 100,000-0.158 70,000-0.157 Y30,000-0.157
250,000-0.157 540,000-0.158 350,000-0.159 Y20,000-0.158
760,000-0.159 900,000-0.158 300,000-0.159 10,000-0.158
140,000-0.159 1,775,000-0.16 775,000-0.161 Y10,000-0.161
1,600,000-0.162 20,000-0.163 320,000-0.162 1,600,000-0.161
200,000-0.162 675,000-0.161 925,000-0.16 185,000-0.159
100,000-0.16 100,000-0.159 120,000-0.16 50,000-0.159
1,150,000-0.16 Y70,000-0.16 360,000-0.16 100,000-0.161
50,000-0.16 500,000-0.161 40,000-0.16 150,000-0.161
Y100,000-0.161 1,295,000-0.161 100,000-0.162 665,000-0.161
1,000,000-0.162 100,000-0.161 160,000-0.162 1,030,000-0.163
960,000-0.164 1,400,000-0.165 100,000-0.164 210,000-0.165
670,000-0.164 350,000-0.163 1,000,000-0.162 50,000-0.163
1,000,000-0.162 10,000-0.163 220,000-0.162 10,000-0.163
300,000-0.162 100,000-0.161 50,000-0.162 40,000-0.161
30,000-0.162 1,450,000-0.161 10,000-0.16 200,000-0.161
1,400,000-0.16 850,000-0.161 Y5,000-0.161 680,000-0.16
300,000-0.161 100,000-0.16 800,000-0.161 200,000-0.16
50,000-0.161 330,000-0.16 50,000-0.161 1,380,000-0.16
40,000-0.159 300,000-0.161 30,000-0.16 330,000-0.161
Y50,000-0.161 260,000-0.161 80,000-0.16 240,000-0.161
120,000-0.16 80,000-0.161 65,000-0.16 100,000-0.161
100,000-0.16 175,000-0.161 400,000-0.162 100,000-0.161
1,840,000-0.162 10,000-0.161 405,000-0.162 Y50,000-0.162
625,000-0.162 160,000-0.163 580,000-0.162 190,000-0.163
1,475,000-0.162 50,000-0.163 50,000-0.162 100,000-0.161
50,000-0.162 500,000-0.161 60,000-0.162 15,000-0.161
530,000-0.162 25,000-0.161 Y10,000-0.162 220,000-0.162 ]/-//[
1,165,000-0.161 270,000-0.162 485,000-0.16 60,000-0.162
650,000-0.161 160,000-0.162 300,000-0.161 Y40,000-0.161
180,000-0.161 Y100,000-0.162 200,000-0.161 40,000-0.162
Y230,000-0.161 60,000-0.161 3,830,000-0.16 80,000-0.161
120,000-0.16 10,000-0.161 1,000,000-0.16 260,000-0.161
100,000-0.16 700,000-0.161 100,000-0.16 590,000-0.161
90,000-0.16 25,000-0.161 235,000-0.16 100,000-0.161
100,000-0.16 300,000-0.161 50,000-0.16 110,000-0.161
300,000-0.16 Y20,000-0.161 390,000-0.161 100,000-0.16
500,000-0.161 Y110,000-0.161 100,000-0.161 Y90,000-0.161
420,000-0.161 10,000-0.16 300,000-0.161 280,000-0.16
150,000-0.161 660,000-0.16 120,000-0.161 Y150,000-0.161
400,000-0.161 150,000-0.16 Y20,000-0.161 150,000-0.16
1,680,000-0.161 140,000-0.162 720,000-0.161 100,000-0.16
1,730,000-0.161 200,000-0.162 30,000-0.161 290,000-0.162
100,000-0.161 1,060,000-0.162 50,000-0.161 100,000-0.162
1,765,000-0.161 ]

Essentially, these are stock prices and volumes. I'd like to know which programming language is good for handling this. I want to extract the data to find how much volume was traded at each price. The data is all in an .htm file available on a website, but the same file contains many stocks, and I am interested in one stock at a time. Any input is welcome. Thanks a lot!
Jul 17th, 2006, 5:57 AM   #2
Mocker
You can parse this with any number of languages. I am guessing Perl, Python or PHP (why all p's?) would be best suited for it since they are quick, easy and especially good at parsing text.

I don't really get what the correlation is. The first step is figuring out the pattern:
Quote:
A-MAX < U40,000-0.172 20,000-0.172 U55,000-0.172 100,000-0.172 >[
What is this? Is A-MAX the stock name or something? I am going to assume it is some type of header, so you'd parse out the header, then grab the data between the '[' and ']' for the meat of it. I am doing a lot of assuming here, though.

In Perl, to grab the header you could do
# assume $bigstring has the whole thing in it
$bigstring =~ /(.+?)<(.+?)>\[(.+?)\]/s; # [ and ] are escaped; /s lets . match newlines
$headername = $1; # A-MAX
$headerdata = $2; # the numbers between the < >
$maindata = $3; # the giant chunk of data between the first [ and ]
Other languages have functions for regular expressions, or a split function to break a string apart on a set of symbols. Either would work.
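For comparison, here is a minimal Python sketch of the same header/data split using the standard re module. The sample string is a shortened stand-in for the posted data, and the names split_record and PATTERN are made up for this illustration. It assumes, as above, that the text before "<" is the stock name.

```python
import re

# Shortened stand-in for the posted data (same layout, fewer entries).
SAMPLE = """A-MAX < U40,000-0.172 20,000-0.172 >[
520,000-0.17 210,000-0.172
Y50,000-0.168 50,000-0.165 ]"""

# name, then <header data>, then the first [main data] block.
# re.DOTALL lets "." cross line breaks; the brackets are escaped so they
# match literally instead of opening a character class.
PATTERN = re.compile(r"(.+?)<(.+?)>\s*\[(.+?)\]", re.DOTALL)

def split_record(text):
    """Return (name, header_data, main_data), or None if nothing matches."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return tuple(part.strip() for part in match.groups())

name, header_data, main_data = split_record(SAMPLE)
print(name)         # A-MAX
print(header_data)  # U40,000-0.172 20,000-0.172
```

The non-greedy (.+?) groups stop at the first closing bracket, so only the first [...] block is captured here.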

From there you could put each volume-price match into an array (as an example in Perl)
%priceassoc = (); # empty hash (associative array)
@pricearray = split(' ', $maindata); # volume-price pairs, split on any whitespace
foreach $entry (@pricearray){
($volume, $price) = split(/-/, $entry);
if(exists($priceassoc{$price})){ # there is already an entry for this price
$priceassoc{$price} .= ", $volume"; # append the next volume to it
} else {
$priceassoc{$price} = $volume; # first volume seen at this price
}
}

This builds an associative array keyed by price (a Perl hash is not sorted, so sort the keys if you need the prices in order), so you could do
print $priceassoc{'0.161'};
and get something like
"130,000, 400,000, 280,000 ... etc etc"

You could then parse that, or keep a running count somewhere else if you just want a raw number.
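If the goal is the raw number, a running total per price can be kept directly. The following Python sketch is only an illustration. It assumes the leading U/Y letters are flags that can be ignored when summing (their meaning is not explained in the thread), and the names ENTRY and volume_by_price are made up for the example.

```python
import re
from collections import defaultdict

# One entry: optional letter flag, volume with thousands separators, "-", price.
ENTRY = re.compile(r"^([A-Z]?)([\d,]+)-(\d+(?:\.\d+)?)$")

def volume_by_price(main_data):
    """Sum the traded volume at each price; prices stay strings as written."""
    totals = defaultdict(int)
    for token in main_data.split():
        match = ENTRY.match(token)
        if match is None:
            continue  # skip anything that is not a volume-price pair
        _flag, volume, price = match.groups()
        totals[price] += int(volume.replace(",", ""))
    return dict(totals)

# A few entries copied from the posted data.
sample = "50,000-0.162 130,000-0.161 100,000-0.162 400,000-0.161 Y100,000-0.16"
totals = volume_by_price(sample)
for price in sorted(totals, key=float):
    print(price, totals[price])
# 0.16 100000
# 0.161 530000
# 0.162 150000
```

Prices are kept as strings, so "0.16" and "0.160" would count as different keys. Convert them to a decimal type first if the source mixes both spellings.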

EDIT: I just noticed that the data contains a couple of sets of [] brackets, so code that grabs only one bracketed block will ignore the second half. It isn't too hard to add the second set, but you need to check why it is there and whether it means anything.
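One way to pick up every bracketed set, sketched in Python. Whether the second set should be counted together with the first is an open question, since the meaning of the "]/-//[" separator is not known. The helper name bracket_blocks is made up for this example.

```python
import re

def bracket_blocks(text):
    """Return the contents of every [...] block, in order of appearance."""
    return [block.strip() for block in re.findall(r"\[(.*?)\]", text, re.DOTALL)]

# Shortened stand-in for the posted data, including the "]/-//[" separator.
sample = """A-MAX < U40,000-0.172 >[
520,000-0.17 210,000-0.172 ]/-//[
1,165,000-0.161 270,000-0.162 ]"""

blocks = bracket_blocks(sample)
print(len(blocks))  # 2

# Treating both blocks as one list of entries (an assumption, see above).
all_entries = " ".join(blocks).split()
print(all_entries)
```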
__________________
#programmingforums relay - http://thegupstudio.com/cgi-bin/pforelay.cgi
freelance scripts - http://ryanguthrie.com/index.html
Jul 19th, 2006, 9:46 AM   #3
quantalfred
Thanks a lot!! Which language is the best if I want to have access to internet to get the data?
Jul 19th, 2006, 10:07 AM   #4
Arevos
Any language with a "urlopen" feature (assuming that the data is accessed through HTTP), and any language with regular expressions (for parsing the data), should be fine.

Python, Ruby and Perl all have such features. I prefer Python myself, but it's really a matter of taste.
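As a rough illustration in Python, the fetch and the per-stock lookup could look like the sketch below. The URL in the final comment, the "XYZ" symbol and the latin-1 encoding are all placeholders, and the pattern assumes each stock appears in the page as "NAME < ... >[ ... ]" like the posted sample.

```python
import re
from urllib.request import urlopen

def fetch_text(url, encoding="latin-1"):
    """Download url and decode the body; the encoding is an assumption."""
    with urlopen(url) as response:
        return response.read().decode(encoding)

def find_stock(page, name):
    """Return the first [...] block that follows "<name> < ... >", or None."""
    pattern = re.compile(
        r"(?<!\S)" + re.escape(name) + r"\s*<(.+?)>\s*\[(.+?)\]",
        re.DOTALL,
    )
    match = pattern.search(page)
    return None if match is None else match.group(2).strip()

# Inline stand-in for a downloaded page holding two stocks ("XYZ" is made up).
page = """XYZ < 10,000-1.25 >[
30,000-1.25 20,000-1.26 ]
A-MAX < U40,000-0.172 >[
520,000-0.17 210,000-0.172 ]"""

print(find_stock(page, "A-MAX"))  # 520,000-0.17 210,000-0.172
# Against a real site: page = fetch_text("http://example.com/quotes.htm")  # hypothetical URL
```

If the real page writes the angle brackets as HTML entities, html.unescape from the standard library can convert them before matching.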
Jul 19th, 2006, 10:20 AM   #5
Game_Ender
Yes, it is really a matter of taste. Pythons are much tastier than Perls, and easier to chew. That said, Perl was made for exactly this kind of task: it stands for Practical Extraction and Report Language (I just remembered that today, so I thought I would share).
Jul 19th, 2006, 10:26 AM   #6
Marvin
Perl also stands for Pathologically Eclectic Rubbish Lister

edit:

source: http://www.perl.com/doc/manual/html/pod/perl.html

on the last line of the Bugs section
__________________
"Why should I want to make anything up? Life's bad enough as it is without trying to invent any more of it."
Jul 19th, 2006, 10:38 AM   #7
LOI Kratong
I'm working on something similar in C++, which may be slightly more work, but you can build an executable from it and not have to worry about having an interpreter installed. But like Game_Ender suggested, it's all a matter of taste!
__________________
www.heldtogether.co.uk
Jul 19th, 2006, 10:45 AM   #8
DaWei
Perl is one of those languages that was written with a definite purpose in mind. To Wall's credit, it was so facile to use that people chose to use it in a general-purpose way. (Clipper also comes to mind.) Also to Wall's credit, the language managed to stand up under the traffic. Sure, it's obsolescent. Things move on (hopefully). The term "pathological" should probably be reserved for guys, like the one who wrote that line at that link, who floor the accelerator of their tongue without engaging the clutch of their brain. Not many people critique the Model-T, but not many enter it in the Daytona 500, either. Just sayin'.
__________________
Abstraction doesn't make it impossible to write bad code; it makes it possible to write superior code.
Contributor's Corner: Grumpy on C++ Exceptions DaWei on Pointers
Jul 22nd, 2006, 3:39 AM   #9
quantalfred
Thank you all! I have decided to try Java first. It has the java.util.regex package, so let's see if that saves a lot of work.