Programming Forums

Programming Forums (http://www.programmingforums.org/forumindex.php)
-   Coder's Corner Lounge (http://www.programmingforums.org/forum11.html)
-   -   Complex SQL Question Question (http://www.programmingforums.org/showthread.php?t=14860)

kurifu Jan 3rd, 2008 4:34 PM

Complex SQL Question Question
 
Hey guys; Trying to optimize some code that I wrote to not kill our DB server, I was hoping someone could give me some insightful feedback, as googling this isn't very easy.

Basically I have some cats in a database table, all with their own id. Each cat gets certain vaccinations, but never all at once. So I have another table where a record is inserted in a column indicating which shots they have received and when. There is a single column for each of these vaccinations, I put a "y" in it when complete.

So I want to generate a report which displays shots not administered, right now I grab each cat id, and loop through each one finding which ones are done, and which are not.

I am hoping I can take a single query and filter out results which have a Y in each column, problem is that this can be over multiple records. This will save a HUGE load on the DB. Any input?

Thanks,.

Sane Jan 3rd, 2008 4:51 PM

Re: Complex SQL Question Question
 
Could you clarify your question, or how the tables are set up, so someone can give the right answer? ;) Maybe even an example of what's in the DB would be great.

But I would have the tables set up so I can just do something like:

SELECT cat_id, vaccination_id FROM shots_table WHERE complete = 'n' ORDER BY cat_id

Then each row will be all the shots that the cat didn't receive, for each cat, ordered by the cat that didn't receive it. That is if this is how your tables are set up. But this seems way too simple an answer.

mbd Jan 3rd, 2008 6:04 PM

Re: Complex SQL Question Question
 
assuming one cat has many vaccinations and you want to find cats without any vaccinations
:

  1. SELECT * FROM cats
  2. LEFT OUTER JOIN vaccinations
  3. ON vaccinations.cat_id = cats.cat_id
  4. WHERE vaccinations.cat_id IS NULL


Jimbo Jan 3rd, 2008 10:29 PM

Re: Complex SQL Question Question
 
Not knowing how many columns there are, is it feasible to do something like this:
:

  1. SELECT *
  2. FROM Cats c
  3.   LEFT OUTER JOIN Vaccinations v ON c.cat_id = v.cat_id
  4. WHERE v.vaccination1 != 'Y' OR v.vaccination2 != 'Y' OR ...


Otherwise, what if you redid the database like this:
:

Cats (Id (PK), Name, Owner, whatever)
Vaccinations (Id(PK), Name, whatever)
Cat_Vacs (CatId (FK), VacId (FK))

and then used a query like this:
:

  1. SELECT c.Id, c.Name
  2. FROM Cats c LEFT OUTER JOIN
  3.     Cat_Vacs cv ON c.Id = cv.CatId
  4. GROUP BY c.Id, c.Name
  5. HAVING COUNT(cv.VacId) < (SELECT COUNT(*) FROM Vaccinations))

Logical idea: get all cats who haven't had the number of vaccines in the Vaccines table.

Of course, this is technically 2 queries, but the inner one will be quick (the DB will know how many rows are in the table so this query is easily optimized internally).

kurifu Jan 4th, 2008 12:56 AM

Re: Complex SQL Question Question
 
I always thought that LEFT OUTER JOIN required matching records in both tables, now that I know it does not, it should actually work very well. Thanks.

kurifu Jan 4th, 2008 2:09 PM

Re: Complex SQL Question Question
 
Ok, so officially this is what solved my problem, and despite the number of inner queries, it runs much much faster...

:

SELECT DISTINCT cat_id, cat_name, cst_status, cat_birthday, cat_intake_date, CONVERT(VARCHAR(10), cat_birthday, 111) AS [YYYY/MM/DD]
FROM MEOW_Cats
LEFT JOIN MEOW_CatsStatus ON MEOW_Cats.cat_id = MEOW_CatsStatus.cst_cat_id
WHERE (cst_status <> 'D' AND cst_status <> 'R' AND cst_status <> 'C') AND
(SELECT COUNT(*) FROM MEOW_HealthRecords WHERE hre_cat_id = cat_id AND hre_fvrcpc_initial = 'Y') = 0 OR
(SELECT COUNT(*) FROM MEOW_HealthRecords WHERE hre_cat_id = cat_id AND hre_fvrcpc_booster = 'Y') = 0 OR
(SELECT COUNT(*) FROM MEOW_HealthRecords WHERE hre_cat_id = cat_id AND hre_fevl_initial = 'Y') = 0 OR
(SELECT COUNT(*) FROM MEOW_HealthRecords WHERE hre_cat_id = cat_id AND hre_fevl_booster = 'Y') = 0 OR
(SELECT COUNT(*) FROM MEOW_HealthRecords WHERE hre_cat_id = cat_id AND hre_rabies = 'Y') = 0 OR
(SELECT COUNT(*) FROM MEOW_HealthRecords WHERE hre_cat_id = cat_id AND hre_deworm = 'Y') = 0 OR
(SELECT COUNT(*) FROM MEOW_HealthRecords WHERE hre_cat_id = cat_id AND hre_alter = 'Y') = 0
ORDER BY cat_name ASC


mbd Jan 4th, 2008 2:26 PM

Re: Complex SQL Question Question
 
try modifying your query to use this approach. i have no way of testing this, but i think you could eliminate those subqueries
:

  1. SELECT DISTINCT cat_id, cat_name, cst_status, cat_birthday, cat_intake_date, CONVERT(VARCHAR(10), cat_birthday, 111) AS [YYYY/MM/DD]
  2. FROM MEOW_Cats
  3. LEFT JOIN MEOW_CatsStatus ON MEOW_Cats.cat_id = MEOW_CatsStatus.cst_cat_id
  4. JOIN MEOW_HealthRecords ON hre_cat_id = cat_id
  5. WHERE hre_fvrpc_initial != 'Y' OR hre_fvrcpc_booster != 'Y' ...


Jimbo Jan 4th, 2008 9:32 PM

Re: Complex SQL Question Question
 
So, are the counts ever going to be more than 1? If so, presumably that means that either the cat will have to get another one in the future and you maybe should somehow be tracking the date it was last administered, unless getting the shot extra times is optional. If not, then using the inner query will be a performance cost, as you're doing a join and a projection for each sub-query when you could just do the join once and compare to each column.

kurifu Jan 8th, 2008 12:52 PM

Re: Complex SQL Question Question
 
Quote:

Originally Posted by mbd (Post 139114)
try modifying your query to use this approach. i have no way of testing this, but i think you could eliminate those subqueries
:

  1. SELECT DISTINCT cat_id, cat_name, cst_status, cat_birthday, cat_intake_date, CONVERT(VARCHAR(10), cat_birthday, 111) AS [YYYY/MM/DD]
  2. FROM MEOW_Cats
  3. LEFT JOIN MEOW_CatsStatus ON MEOW_Cats.cat_id = MEOW_CatsStatus.cst_cat_id
  4. JOIN MEOW_HealthRecords ON hre_cat_id = cat_id
  5. WHERE hre_fvrpc_initial != 'Y' OR hre_fvrcpc_booster != 'Y' ...


This won't work because there are multiple health records per cat, the WHERE clause would always return true, returning multiple records for each cat.

I should add that even when using multiple subqueries this improved DB performance by a massive amount.

mbd Jan 8th, 2008 1:38 PM

Re: Complex SQL Question Question
 
i suspect that your database is not normalized


All times are GMT -5. The time now is 3:37 AM.

Powered by vBulletin® Version 3.7.0, Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Copyright ©2007 DaniWeb® LLC