Old Aug 22nd, 2007, 5:20 PM   #3
lectricpharaoh
I suggest you follow DaWei's advice and learn about relational databases and about normalizing your data.

The basic ideas here are to reduce redundant information and to make the system more extensible. For example, I once heard a horror story from someone who was required to maintain a table for something or other. The original designer of the database (not the person relating the story) had come up with the brilliant solution of having the years as columns in one big table. This meant that every year, they had to add a new column to the table and modify all the queries.

In a properly designed relational database, some of the information would live in one table, and a second table would contain a year column and a column of foreign keys. The foreign keys are essentially pointers to primary keys in the first table. Then, by searching the second table for all records where the year column matches what you're looking for (whether exact or in a range, such as 'all records from 2005 or earlier'), you get the keys of all records in the first table where the year is what you want. You could then look up each of those keys in the first table yourself, or better yet, you could design your query to do both operations at once with a join. This is more efficient, as a) you only call the database once, and b) you take advantage of the many optimizations built into the DBMS software.
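
To make the two-table idea concrete, here is a rough sketch using Python's built-in sqlite3 module. The table and column names (items, item_years, amount) and the sample rows are made up for illustration, since the horror story never said what the data actually was. Treat it as one possible layout, not the one true design.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- First table: the information that is not tied to a year.
    CREATE TABLE items (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Second table: one row per (item, year), with a foreign key
    -- pointing back at the primary key of the first table.
    CREATE TABLE item_years (
        item_id INTEGER NOT NULL REFERENCES items(id),
        year    INTEGER NOT NULL,
        amount  INTEGER NOT NULL
    );
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)",
                 [(1, "widgets"), (2, "gadgets")])
conn.executemany(
    "INSERT INTO item_years (item_id, year, amount) VALUES (?, ?, ?)",
    [(1, 2004, 10), (1, 2006, 12), (2, 2005, 7), (2, 2007, 9)])


def rows_up_to(conn, year):
    """Filter by year and fetch the matching item names in one query."""
    return conn.execute("""
        SELECT i.name, y.year, y.amount
        FROM item_years AS y
        JOIN items AS i ON i.id = y.item_id
        WHERE y.year <= ?
        ORDER BY y.year, i.name
    """, (year,)).fetchall()


old_rows = rows_up_to(conn, 2005)
print(old_rows)
```

A new year here is just more rows in item_years. No columns get added and the query stays the same.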

A better example might be a company that sells stuff, say books. They have a list of invariant information for each book, such as ISBN, author, title, publisher, and date published. Then they have a second table with the variable information, such as number of copies in stock, number of copies on order, supplier, price, etc.

Now imagine this company wants to track what books customers buy (there are many reasons for this, ranging from buyer incentive programs to targeted advertising). Thus, they have a third table for customers. Now, one way to design this table would be something like this:
Name     Purchase1    Purchase2    Purchase3    Purchase4
As you can probably see, this is a totally stupid design; if a customer purchases fewer than four titles, space is wasted, and if they purchase five or more, some purchases cannot be recorded. Thus, a better solution would be to add a table of purchases, with each purchase containing both a customer ID value, and an identifier for what was purchased (such as ISBN).
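
Here is a minimal sketch of the purchases-table approach, again with Python's sqlite3 module. The names, the placeholder ISBN strings and the sample customers are all invented for the example. The point is only that a customer can have zero, one, or five purchases without changing the table layout.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE books (
        isbn  TEXT PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- One row per purchase instead of Purchase1..Purchase4 columns.
    CREATE TABLE purchases (
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        isbn        TEXT    NOT NULL REFERENCES books(isbn)
    );
""")

# Hypothetical data: placeholder ISBNs, not real ones.
books = [("isbn-%d" % n, "Book %d" % n) for n in range(1, 6)]
db.executemany("INSERT INTO books (isbn, title) VALUES (?, ?)", books)
db.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
               [(1, "Alice"), (2, "Bob"), (3, "Carol")])

# Alice buys five titles, Bob buys one, Carol buys nothing.
db.executemany("INSERT INTO purchases (customer_id, isbn) VALUES (?, ?)",
               [(1, isbn) for isbn, _ in books] + [(2, "isbn-3")])

# LEFT JOIN keeps customers who have no purchases at all.
purchase_counts = db.execute("""
    SELECT c.name, COUNT(p.isbn)
    FROM customers AS c
    LEFT JOIN purchases AS p ON p.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()
print(purchase_counts)
```

Nothing is wasted for Carol or Bob, and Alice's fifth purchase is just one more row.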

The above is intentionally simplified. A real system would likely have an intermediary table (call it 'transactions') between the customer and the purchase, because a customer can buy several items at once. When you start reading about the different kinds of relationships, you will see why this is necessary. You cannot directly represent a many-to-many relationship in a relational database; instead, you put a table in the middle and split the relationship into a one-to-many and a many-to-one, with the middle table on the 'many' side of both.
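
As a sketch of how that could look, the example below adds a transactions table and uses purchases as the table in the middle between transactions and books. All names and rows are hypothetical, and a real schema would carry more (dates, quantities, prices). One assumption worth flagging: the composite primary key means a title appears at most once per transaction.

```python
import sqlite3

shop = sqlite3.connect(":memory:")
shop.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE books (
        isbn  TEXT PRIMARY KEY,
        title TEXT NOT NULL
    );
    -- One customer has many transactions.
    CREATE TABLE transactions (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
    -- Middle table: transactions and books are many-to-many,
    -- so each row pairs one transaction with one book.
    CREATE TABLE purchases (
        transaction_id INTEGER NOT NULL REFERENCES transactions(id),
        isbn           TEXT    NOT NULL REFERENCES books(isbn),
        PRIMARY KEY (transaction_id, isbn)
    );
""")

# Hypothetical sample data.
shop.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob")])
shop.executemany("INSERT INTO books VALUES (?, ?)",
                 [("isbn-1", "Book A"), ("isbn-2", "Book B")])
shop.executemany("INSERT INTO transactions VALUES (?, ?)",
                 [(1, 1), (2, 2), (3, 1)])
shop.executemany("INSERT INTO purchases VALUES (?, ?)",
                 [(1, "isbn-1"), (1, "isbn-2"), (2, "isbn-1"), (3, "isbn-2")])


def titles_in(transaction_id):
    """All titles bought together in one transaction."""
    rows = shop.execute("""
        SELECT b.title
        FROM purchases AS p
        JOIN books AS b ON b.isbn = p.isbn
        WHERE p.transaction_id = ?
        ORDER BY b.title
    """, (transaction_id,)).fetchall()
    return [title for (title,) in rows]


def buyers_of(isbn):
    """All customers who bought a given book, in any transaction."""
    rows = shop.execute("""
        SELECT DISTINCT c.name
        FROM customers AS c
        JOIN transactions AS t ON t.customer_id = c.id
        JOIN purchases AS p ON p.transaction_id = t.id
        WHERE p.isbn = ?
        ORDER BY c.name
    """, (isbn,)).fetchall()
    return [name for (name,) in rows]


print(titles_in(1))
print(buyers_of("isbn-1"))
```

Both directions of the many-to-many go through the purchases table: one transaction maps to several books, and one book maps to several transactions.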

Anyways, read up on it. An understanding of the basic principles of modern relational databases is almost essential in the programming world today, even if learning SQL isn't (depends on what you want to do).