Difference between pages "Self join" and "SELECT .. GROUP BY"

From SQLZOO
(Difference between pages)
Jump to: navigation, search
(As stated, Question 7 result should only give one line for each unique pair of company, num; thus added DISTINCT.)
 
 
Line 1: Line 1:
==Edinburgh Buses==
 
<p>[[Edinburgh_Buses. |Details of the database]] Looking at the data</p>
 
stops('''id''', name)
 
route('''num''','''company''','''pos''', ''stop'')
 
  
<div class='schema'></div>
+
<h1>SELECT .. GROUP BY</h1>
<div class = 'ref_section'>
+
Host cities and continents for the Olympics Games are stored in the table <code>games</code>.
<table class = 'db_ref'>
+
Notice that Europe appears in the table twice:
<tr><th>'''stops'''</th><th>'''route'''</th></tr>
+
<table>
<tr><td>id</td><td>num</td></tr>
+
<caption align='center'>'''games'''</caption>
<tr><td>name</td><td>company</td></tr>
+
<tr><th align='center'>'''yr'''</th><th align='center'>'''city'''</th><th align='center'>'''continent'''</th></tr>
<tr><td></td><td>pos</td></tr>
+
<tr><td>2000</td><td align='left'>Sydney</td><td align='left'>Australasia</td></tr>
<tr><td></td><td>stop</td></tr>
+
<tr><td>2004</td><td align='left'>Athens</td><td align='left'>Europe</td></tr>
<tr><td></td><td></td></tr>
+
<tr><td>2008</td><td align='left'>Beijing</td><td align='left'>Asia</td></tr>
 +
<tr><td>2019</td><td align='left'>London</td><td align='left'>Europe</td></tr>
 
</table>
 
</table>
</div>
+
 
+
<div class='ht'>
<div class="progress_panel"><div>
+
<div class='params'>schema:scott</div>
  <div class="summary">Summary</div>
+
<source lang=sql class='tidy'> DROP TABLE games</source>
  <div class="progressbarbg">
+
<source lang=sql class='setup'> CREATE TABLE games(
    <div class="progressbar"></div>
+
  yr INTEGER,
  </div>
+
  city VARCHAR(20),
</div></div>
+
  continent VARCHAR(20));
 
+
INSERT INTO games VALUES (2000,'Sydney','Australasia');
<div class='qu'>
+
INSERT INTO games VALUES (2004,'Athens','Europe');
How many '''stops''' are in the database. 
+
INSERT INTO games VALUES (2008,'Beijing','Asia');
<source lang='sql' class='def'>
+
INSERT INTO games VALUES (2012,'London','Europe');
 
</source>
 
</source>
 
+
In a GROUP BY statement only <i>distinct</i> values are shown for the column in the GROUP BY.
<source lang='sql' class='ans'>
+
This example shows the continents hosting the Olympics with the count of the number of games held.
SELECT COUNT(*)  
+
<source lang='sql' class='def e-oracle'>
FROM stops
+
SELECT continent, COUNT(yr) FROM scott.games
 +
GROUP BY continent
 
</source>
 
</source>
</div>
 
 
<div class='qu'>
 
Find the '''id''' value for the stop 'Craiglockhart'   
 
 
<source lang='sql' class='def'>
 
<source lang='sql' class='def'>
</source>
+
SELECT continent, COUNT(yr) FROM games
 
+
GROUP BY continent
<source lang='sql' class='ans'>
+
SELECT id
+
FROM stops
+
WHERE name='Craiglockhart'
+
 
</source>
 
</source>
 
</div>
 
</div>
  
 
+
<p>See also</p>
<div class='qu'>
+
<ul>
Give the '''id''' and the '''name''' for the '''stops''' on the '4' 'LRT' service.   
+
   <li>[[SUM_and_COUNT |SUM and COUNT]]</li>
<source lang='sql' class='def'>
+
   <li>[[SELECT_.._WHERE |SELECT WHERE]]</li>
</source>
+
</ul>
 
+
<source lang='sql' class='ans'>
+
SELECT id, name FROM stops, route
+
   WHERE id=stop
+
    AND company='LRT'
+
    AND num='4'
+
</source>
+
</div>
+
 
+
<h2>Routes and stops</h2>
+
<div class='qu'>
+
The query shown gives the number of routes that visit either London Road (149) or Craiglockhart (53). Run the query and notice the two services that link these '''stops''' have a count of 2.
+
Add a HAVING clause to restrict the output to these two routes.   
+
<source lang='sql' class='def'>
+
SELECT company, num, COUNT(*)
+
FROM route WHERE stop=149 OR stop=53
+
GROUP BY company, num
+
</source>
+
 
+
<source lang='sql' class='ans'>
+
SELECT company, num, COUNT(*)
+
FROM route WHERE stop=149 OR stop=53
+
GROUP BY company, num
+
HAVING COUNT(*)=2
+
</source>
+
</div>
+
 
+
<div class='qu'>
+
Execute the self join shown and observe that b.stop gives all the places you can get to from Craiglockhart, without changing routes.
+
Change the query so that it shows the services from Craiglockhart to London Road.     
+
<source lang='sql' class='def'>
+
SELECT a.company, a.num, a.stop, b.stop
+
FROM route a JOIN route b ON
+
   (a.company=b.company AND a.num=b.num)
+
WHERE a.stop=53
+
</source>
+
 
+
<source lang='sql' class='ans'>
+
SELECT a.company, a.num, a.stop, b.stop
+
FROM route a JOIN route b ON
+
  (a.company=b.company AND a.num=b.num)
+
WHERE a.stop = 53 AND b.stop=149
+
</source>
+
</div>
+
 
+
<div class='qu'>
+
The query shown is similar to the previous one, however by joining two copies of the '''stops''' table we can refer to '''stops''' by '''name''' rather than by number.
+
Change the query so that the services between 'Craiglockhart' and 'London Road' are shown. If you are tired of these places try 'Fairmilehead' against 'Tollcross'     
+
<source lang='sql' class='def'>
+
SELECT a.company, a.num, stopa.name, stopb.name
+
FROM route a JOIN route b ON
+
  (a.company=b.company AND a.num=b.num)
+
  JOIN stops stopa ON (a.stop=stopa.id)
+
  JOIN stops stopb ON (b.stop=stopb.id)
+
WHERE stopa.name='Craiglockhart'
+
</source>
+
 
+
<source lang='sql' class='ans'>
+
SELECT a.company, a.num, stopa.name, stopb.name
+
FROM route a JOIN route b ON
+
  (a.company=b.company AND a.num=b.num)
+
  JOIN stops stopa ON (a.stop=stopa.id)
+
  JOIN stops stopb ON (b.stop=stopb.id)
+
WHERE stopa.name='Craiglockhart'
+
  AND stopb.name='London Road'
+
</source>
+
</div>
+
 
+
<h2>[[Using a self join]]</h2>
+
<div class='qu'>
+
Give a list of all the services which connect stops 115 and 137 ('Haymarket' and 'Leith')     
+
<source lang='sql' class='def'>
+
</source>
+
 
+
<source lang='sql' class='ans'>
+
SELECT DISTINCT R1.company, R1.num
+
  FROM route R1, route R2
+
  WHERE R1.num=R2.num AND R1.company=R2.company
+
    AND R1.stop=115 AND R2.stop=137
+
</source>
+
</div>
+
 
+
<div class='qu'>
+
Give a list of the services which connect the '''stops''' 'Craiglockhart' and 'Tollcross'     
+
<source lang='sql' class='def'>
+
</source>
+
 
+
<source lang='sql' class='ans'>
+
SELECT R1.company, R1.num
+
  FROM route R1, route R2, stops S1, stops S2
+
  WHERE R1.num=R2.num AND R1.company=R2.company
+
    AND R1.stop=S1.id AND R2.stop=S2.id
+
    AND S1.name='Craiglockhart'
+
    AND S2.name='Tollcross'
+
</source>
+
</div>
+
 
+
<div class='qu'>
+
Give a list of the '''stops''' which may be reached from 'Craiglockhart' by taking one bus. Include the details of the appropriate service.       
+
<source lang='sql' class='def'>
+
</source>
+
 
+
<source lang='sql' class='ans'>
+
SELECT S2.id, S2.name, R2.company, R2.num
+
  FROM stops S1, stops S2, route R1, route R2
+
  WHERE S1.name='Craiglockhart'
+
    AND S1.id=R1.stop
+
    AND R1.company=R2.company AND R1.num=R2.num
+
    AND R2.stop=S2.id
+
</source>
+
</div>
+
 
+
<div class='qu'>
+
Try to show it is possible to get from Sighthill to Craiglockhart by changing lanes twice.
+
This DB doesn't make this easy!       
+
<source lang='sql' class='def'>
+
</source>
+
 
+
<source lang='sql' class='ans'>
+
select distinct a.name, c.name
+
from stops a JOIN route z ON a.id=z.stop
+
JOIN route y ON y.num = z.num
+
JOIN stops b ON y.stop=b.id
+
JOIN route x ON x.num = y.num
+
JOIN stops c ON c.id=x.stop
+
where a.name='Craiglockhart'
+
AND c.name ='Sighthill'
+
</source>
+
</div>
+
 
+
<div>
+
<div class="lsclear">Clear your results</div>
+
<p><div class="quizlink">[[Self join Quiz]]</div></p>
+
</div>
+

Revision as of 21:30, 7 May 2013

SELECT .. GROUP BY

Host cities and continents for the Olympics Games are stored in the table games. Notice that Europe appears in the table twice:

games
yrcitycontinent
2000SydneyAustralasia
2004AthensEurope
2008BeijingAsia
2019LondonEurope
schema:scott
 DROP TABLE games
 CREATE TABLE games(
  yr INTEGER,
  city VARCHAR(20),
  continent VARCHAR(20));
INSERT INTO games VALUES (2000,'Sydney','Australasia');
INSERT INTO games VALUES (2004,'Athens','Europe');
INSERT INTO games VALUES (2008,'Beijing','Asia');
INSERT INTO games VALUES (2012,'London','Europe');

In a GROUP BY statement only distinct values are shown for the column in the GROUP BY. This example shows the continents hosting the Olympics with the count of the number of games held.

SELECT continent, COUNT(yr) FROM scott.games
 GROUP BY continent
SELECT continent, COUNT(yr) FROM games
 GROUP BY continent

See also

Personal tools
Namespaces

Variants
Actions
Reference
Toolbox
Google AdSense