Programming Forums

kumar310 Nov 5th, 2007 8:49 PM

Design ?
 
Hello everyone,

I just recently got a new job at a startup company. For my first project, they want me to design a program that automatically retrieves information from a webpage (for example, taking stock quotes from E*TRADE) and puts it into a database. I have no idea how to do that :( I have no experience with SQL or any database programming languages. I also don't know C# that well, but I have a solid understanding of C++. So here's my question(s):

a) How would I retrieve information from a webpage? Should I use macros?

b) Do you think I should learn C#, or will I be able to do this with Visual C++?

c) What's the most efficient way to get this done?

Thanks for your time

DaWei Nov 5th, 2007 9:03 PM

Re: Design ?
 
You might consider Python and a tool called "Beautiful Soup." Works for me. Massage the output so that you can access the results from any language.

Your alternatives are to search for similar tools for your language of interest, or to write one yourself. I'll leave it to you to judge effectiveness/efficiency.

Ghost Nov 5th, 2007 9:09 PM

Re: Design ?
 
I'm not sure what DaWei's suggestion is, but it is more than likely a good one. As far as choosing C# vs Visual C++ I think you should stick to what you know and not hurt yourself with the learning curve of a new language.

A) There are many ways to retrieve data via the net; it would help to know exactly how the data you are trying to retrieve is formatted.

B) Visual C++ should allow you to use all the power of the .NET Framework that C# uses, so I see no reason why you would not be able to do so. (You are making this a desktop application and not a web application, correct?)

C) I do not believe this question can be answered with much merit until A and B are answered.

kumar310 Nov 5th, 2007 9:14 PM

Re: Design ?
 
Thanks for replying, everyone. Well, the application basically needs to connect to the internet, log in to a specific site, retrieve the information, and store it in a database. I have no clue what the website is written in, but I believe it is JavaScript.

Ghost Nov 5th, 2007 9:18 PM

Re: Design ?
 
Kumar, I'm about to go offline to clean my keyboard (haha G15's get dirty) but any information you can give us will help us guide you better.

So the website url would give us a hand in helping you.

kumar310 Nov 5th, 2007 9:32 PM

Re: Design ?
 
Thanks a lot, man. Alright, it needs to go to www.trapac.com, and on the left-hand side of the page you will see something called quickcheck. The webpage asks for a container number (of a boat). Basically, I want the program to take a container number from the user and enter it into the website without them even having to go to the website (hope that makes sense). Once this is done, I need to retrieve data from the site, which can be found at: http://www.trapac.com/EquipmentHisto...er=ecmu1538359

The only information I need is in the fields labeled DATE: and ACTION:
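
A minimal sketch of the storage half of this task, assuming SQLite through Python's standard `sqlite3` module (the thread never names a database). The table name `container_history`, its columns, the function names, and the sample rows are all hypothetical; none of them come from the TraPac page.

```python
import sqlite3


def save_history(conn, container, rows):
    """Store (date, action) pairs for one container number."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS container_history ("
        "container TEXT NOT NULL, "
        "event_date TEXT NOT NULL, "
        "action TEXT NOT NULL)"
    )
    # Parameter placeholders (?) keep user input out of the SQL text.
    conn.executemany(
        "INSERT INTO container_history (container, event_date, action) "
        "VALUES (?, ?, ?)",
        [(container, date, action) for date, action in rows],
    )
    conn.commit()


def load_history(conn, container):
    """Return the stored (date, action) pairs for a container, in insert order."""
    cur = conn.execute(
        "SELECT event_date, action FROM container_history "
        "WHERE container = ? ORDER BY rowid",
        (container,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, for the demo only
    # Made-up sample values, for illustration only.
    sample = [("2007-11-01", "GATE IN"), ("2007-11-03", "GATE OUT")]
    save_history(conn, "ecmu1538359", sample)
    print(load_history(conn, "ecmu1538359"))
```

Keeping the DATE and ACTION values as plain text columns avoids guessing at the site's date format; they could be converted once the real format is known.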

DaWei Nov 5th, 2007 10:14 PM

Re: Design ?
 
It doesn't matter how the site content is produced; the content is HTML (or XHTML). If you're interested in the content, the producer doesn't matter. If you're interested in events, or how the content is produced, then you're in a different ballgame.

I would suggest that you research the production and presentation of web pages a tad.

kumar310 Nov 5th, 2007 11:27 PM

Re: Design ?
 
Yeah, I get you. I'm only interested in the content and not the production, but I'll look into that.

DaWei Nov 6th, 2007 12:31 AM

Re: Design ?
 
Do a "View Source" with your browser. That's essentially content. You may or may not see some script and styling. You may or may not see some references to script or styling. That's the content that you'll be trying to parse into useful information.

Beautiful Soup (and possibly other tools) can build a tree from that and objectify it to some degree. This may seem a little esoteric (I didn't write it for beauty), but consider this: I had a requirement to search all table cells for a link. Once I found that cell, subsequent cells would contain the data I coveted.

The object, "soup", represents the tree:

    for i in soup.findAll('td'):  # find all TDs in the document
        elstringo = "Failure"
        if i.string is None:  # i is the tag; if it has no plain string of its own...
            j = i.find('a')  # ...look for a link in the content
            if j is not None and j.string is not None:
                # if there's a link, and it has content, it's my key
                elstringo = str(j.string).lstrip()

Obviously, you have to be able to codify, in some way, the characteristics that distinguish the information that you want to garner.
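
The same idea, a cell containing a link acting as the key and the cells after it carrying the data, can be sketched without Beautiful Soup by using the standard library's `html.parser`. This is a swapped-in technique and not the code from the post; the class name `LinkKeyedCells` and the sample HTML are hypothetical, and real pages will likely need different distinguishing rules.

```python
from html.parser import HTMLParser


class LinkKeyedCells(HTMLParser):
    """Per table row: use the first link's text as a key, collect later cells."""

    def __init__(self):
        super().__init__()
        self.records = {}       # key -> texts of the cells after the link cell
        self._in_td = False
        self._in_a = False
        self._key = None        # key found so far in the current row
        self._cell_text = []
        self._link_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._key = None    # every row starts without a key
        elif tag == "td":
            self._in_td = True
            self._cell_text = []
            self._link_text = []
        elif tag == "a" and self._in_td:
            self._in_a = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_a = False
        elif tag == "td" and self._in_td:
            self._in_td = False
            link = "".join(self._link_text).strip()
            if link and self._key is None:
                # First cell in this row with non-empty link text: the key.
                self._key = link
                self.records[self._key] = []
            elif self._key is not None:
                # Any cell after the key cell carries data for that key.
                self.records[self._key].append("".join(self._cell_text).strip())

    def handle_data(self, data):
        if self._in_td:
            self._cell_text.append(data)
            if self._in_a:
                self._link_text.append(data)


if __name__ == "__main__":
    # Made-up HTML, for illustration only.
    sample = (
        "<table>"
        '<tr><td><a href="/c/1">ECMU1538359</a></td>'
        "<td>2007-11-01</td><td>GATE IN</td></tr>"
        "<tr><td>no link here</td><td>ignored</td></tr>"
        "</table>"
    )
    parser = LinkKeyedCells()
    parser.feed(sample)
    print(parser.records)
```

Rows without a link cell are skipped entirely, which mirrors the "Failure" default in the loop above: no key, no data.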

Infinite Recursion Nov 6th, 2007 9:24 AM

Re: Design ?
 
a)
If you want to use C#, look into the WebRequest and WebResponse classes; they may point you in a viable direction. I believe there is a web browser component also.

b)
I think if you are writing code for Windows, you should definitely pick up some C# at some point. If this project is relatively critical and you cannot afford to trip through a new language, use what you are familiar with.

c)
Depends on how you define efficiency.
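
Since the code elsewhere in this thread is Python, here is a rough Python counterpart of the request/response step from (a), using the standard `urllib` modules rather than the C# classes. The base URL, the `container` parameter name, the User-Agent string, and the function names are placeholders: the real TraPac address is truncated in this thread and its query format is not known. The login step mentioned earlier is not handled.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder endpoint; the actual page address is not known from the thread.
BASE_URL = "http://example.com/history"


def build_request(container_number):
    """Build a GET request carrying the container number as a query parameter."""
    query = urlencode({"container": container_number.strip()})
    return Request(
        f"{BASE_URL}?{query}",
        headers={"User-Agent": "quickcheck-sketch/0.1"},  # hypothetical name
    )


def fetch_html(container_number, timeout=10):
    """Send the request and return the response body as text (network I/O)."""
    with urlopen(build_request(container_number), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Only builds the request; nothing is sent until fetch_html() is called.
    print(build_request("ecmu1538359").full_url)
```

Separating `build_request` from `fetch_html` lets the URL construction be checked without touching the network.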


@Ghost... didn't know you were a fed. Cool.

@DaWei... One of these days, I'm going to write more Python... I'm inspired. :)


All times are GMT -5.