Annye Driscoll | annyedriscoll.com
Girl Develop It is here to provide affordable and accessible programs to learn software through mentorship and hands-on instruction.
Some rules
We're going to make a table out of this info in part 2!
By the end of the class, you will be able to query and modify a SQL table to make sense of the data you have.
What's a database? What's SQL?
A set of related data and the way it is organized!
A collection of tables, schemas, queries, reports, views, and other objects (it’s not just tables of data!) often organized in a way that supports processes or reflects real life in order to facilitate business or functions.
A collection of tables, schemas, queries, views…
Relational database = organizes data into one or more tables of columns and rows with a unique key identifying each row (record).
Tables are joined on special fields called keys
| Oracle | MySQL |
| SQL Server | Postgres |
| Access | Sybase |
| DB2 |
All different flavors of the same language
What's a database? What's SQL?
The power is in the relationship between data sets.
| 1 SELECT | selects variables |
| 2 FROM | opens datasets |
| 3 WHERE | restricts observations |
| 4 GROUP BY | groups observations |
| 5 HAVING | restricts groups |
| 6 ORDER BY | sorts results |
We'll use MySQL, a popular version of SQL.
If you've done anything online, chances are the transation went through MySQL (Walmart, Kayak, Facebook, Etsy, Verizon).
We'll log on to the front-end of MySQL, PHPMyAdmin.
Logon site: https://demo.phpmyadmin.net/STABLE/
Login username is "root". Leave the password field blank.
Already loaded are a bunch of employee records we'll be using together.

employees is the database and it has 6 tables.
Database → Schema → Table
Book → Chapter → Page
MySQL Database → Employees Schema → employees Table
Usually it's bad practice to name tables the same thing as the database (confusing), but we'll be ok with the small test data set.
SELECT * FROM employees.employees;
Type out all the SQL you write. It'll be easier to learn.
Try a couple of these:
SELECT * FROM employees.titles;
SELECT * FROM employees.dept_emp;
SELECT * FROM employees.salaries;
and check out the tables and info.
Now let's try selecting only certain columns
SELECT title, emp_no FROM employees.titles;
A note about the semi-colon: depending on which flavor of SQL you're using you may not need it, but it's good practice in general to include it.
Selecting one column:
SELECT dept_name FROM employees.departments;
Selecting multiple columns:
SELECT emp_no, dept_no FROM employees.dept_emp;
SELECT AVG(salary) FROM employees.salaries;
SELECT MIN(from_date) FROM employees.dept_manager;
SELECT DISTINCT title FROM employees.titles;
SELECT COUNT(DISTINCT title) FROM employees.titles;
6
SELECT COUNT(title) FROM employees.titles;
14,782
Distinct: Selecting only unique instances
Which result set is bigger?
What is each one telling us?
SELECT COUNT(title) FROM employees.titles;
14,782
SELECT COUNT(*) FROM employees.salaries;
94,917
SELECT COUNT(*) from employees.departments;
9
SELECT COUNT(*) FROM employees.dept_emp;
11,050
SELECT COUNT(*) FROM employees.dept_manager;
24
SELECT COUNT(*) FROM employees.employees;
10,000
SELECT COUNT(*) FROM employees.salaries;
94,917
SELECT COUNT(*) FROM employees.titles;
14,782
| 1 SELECT | selects variables |
| 2 FROM | opens datasets |
| 3 WHERE | restricts observations |
| 4 GROUP BY | groups observations |
| 5 HAVING | restricts groups |
| 6 ORDER BY | sorts results |
SELECT type_of_dog,COUNT(*)
FROM animals.total_dogs
GROUP BY type_of_dog;
Group statements corral data together into categories when we want to count things by groups of things
i.e. How many total dogs vs puppies vs parents.

SELECT title, COUNT(*) FROM employees.titles
GROUP by title;
SELECT dept_name, COUNT(*) FROM employees.departments
GROUP BY dept_name;
SELECT first_name, COUNT(*)
FROM employees.employees
GROUP BY first_nameSELECT first_name, last_name, COUNT(*)
FROM employees.employees
GROUP BY first_name, last_nameHow many unique salaries does each employee have? (how many times have they been given raises?)
SELECT emp_no, COUNT(DISTINCT salary)
FROM employees.salaries
GROUP BY emp_no;
How many managers has each department had?
SELECT dept_no, COUNT(emp_no)
FROM employees.dept_manager
GROUP BY dept_no;
How many people that were born on the same day, have the same first name?
SELECT birth_date,first_name, COUNT(*)
FROM employees.employees
GROUP BY birth_date, first_name;
Which title has been around the longest?(Smallest employee number)
SELECT title, MIN(emp_no)
FROM employees.titles
GROUP BY title;
Will sort by ascending by default.
SELECT first_name, gender, COUNT(*)
FROM employees.employees
GROUP BY first_name, gender
ORDER BY first_name;
SELECT first_name, gender, COUNT(*)
FROM employees.employees
GROUP BY first_name, gender
ORDER by first_name DESC;
Ordering by more than one column, will sort by the first, then by the second, ascending
SELECT gender, birth_date FROM employees.employees
ORDER BY gender, birth_date;
Order by a number ascending
SELECT emp_no, salary FROM employees.salaries
ORDER BY salary ASC;
Order by a number descending
SELECT emp_no, salary FROM employees.salaries
ORDER BY salary DESC;
| 1 SELECT | selects variables |
| 2 FROM | opens datasets |
| 3 WHERE | restricts observations |
| 4 GROUP BY | groups observations |
| 5 HAVING | restricts groups |
| 6 ORDER BY | sorts results |
You only want to select certain things.
SELECT first_name, last_name
FROM employees.employees
WHERE first_name='Adam';
"Like" is used in a WHERE clause to search for a specific pattern.
SELECT last_name, COUNT( * )
FROM employees.employees
WHERE last_name LIKE 'S%' -- begins with 'S'
GROUP BY last_name;
WHERE column_name BETWEEN value1 AND value2;
WHERE column_name IN (value1, value2, ...);
WHERE condition1 AND condition2 AND condition3 ...;
WHERE condition1 OR condition2 OR condition3 ...;
All people named Mary or Elvis:
SELECT * FROM employees.employees
WHERE first_name IN ('Mary', 'Elvis');
SELECT * FROM employees.employees
WHERE first_name = 'Mary' OR first_name = 'Elvis';
All of the engineers before 2000:
SELECT * FROM employees.titles
WHERE title LIKE '%Engineer%' AND from_date < '2000-01-01';
One table is great, but the power of SQL comes from being able to join related tables to get information.
What's a database? What's SQL?
Tables are joined on special fields called keys
emp_no is the column that's the primary key for all three tables that allows us to join them
We join primary keys to foreign keys!

Often when we join tables, we make them easier to reference later on by aliasing them, but at this point it's a style choice.
Find out salary history:
SELECT first_name, last_name, hire_date,salary, from_date, to_date
FROM employees.employees a
JOIN employees.salaries b ON a.emp_no = b.emp_no
| first_name | last_name | hire_date | salary | from_date | to_date |
|---|---|---|---|---|---|
| Georgi | Facello | 6/26/86 | 60117 | 6/26/86 | 6/26/87 |
| Georgi | Facello | 6/26/86 | 62102 | 6/26/87 | 6/25/88 |
| Georgi | Facello | 6/26/86 | 66074 | 6/25/88 | 6/25/89 |
When you have two columns from different tables with the same name, you have to reference them specifically with the alias.
SELECT a.first_name, a.last_name, a.hire_date, b.salary, b.from_date, b.to_date, c.title, c.from_date, c.to_date
FROM employees.employees a
JOIN employees.salaries b
ON a.emp_no = b.emp_no
JOIN employees.titles c
ON a.emp_no = c.emp_no

Left join: All the fields in the first table, plus the ones that match in the second table
SELECT first_name, last_name, hire_date,title, c.from_date, c.to_date
FROM employees.employees a
LEFT JOIN employees.titles c ON a.emp_no = c.emp_no;
What if you'd done a right join instead?
What is this query telling us?
SELECT first_name, last_name, a.emp_no, hire_date,
COUNT(DISTINCT title), COUNT(DISTINCT salary)
FROM employees.employees a
JOIN employees.salaries b ON a.emp_no = b.emp_no
JOIN employees.titles c ON a.emp_no = c.emp_no
where first_name like 'A%'
GROUP BY first_name, last_name, hire_date, a.emp_no;
What is this query bring back?
SELECT first_name, last_name, hire_date, a.emp_no, salary, from_date as
salary_start_date, to_date AS salary_end_date
FROM employees.employees a
JOIN employees.salaries b ON a.emp_no = b.emp_no
WHERE salary > 60000
AND from_date > '2000-01-01'
ORDER BY last_name
Using nested logic datasets to query other data sets
A subquery (or inner query or nested query) is a query within a query. Most of the time, a subquery is used when you know how to search
for a value using a SELECT statement, but not the exact value
Exercises:
SELECT COUNT(DISTINCT emp_no)
FROM employees.employees;
SELECT * from (
SELECT emp_no, COUNT(*) AS number_of_salaries
FROM employees.salaries
WHERE from_date BETWEEN '1996-01-01' AND '1997-01-01'
GROUP BY emp_no) a
WHERE number_of_salaries > 1
SELECT title, COUNT(DISTINCT a.emp_no)
FROM employees.titles a
JOIN employees.employees b
ON a.emp_no = b.emp_no
GROUP BY title
Exercises:
What did we do so far?
Now that you see how powerful SQL is, let's make some tables of our own. What if we have tons of data we want to query?
You have the ability to create tables, but also to delete them by accident, to create duplicate data, or to confuse people with data.
You have great power. Use it wisely :)
CREATE DATABASE annye
Name it with your first name or something easy you can remember. You'll have to keep typing it, so under 10 characters is optimal.
By going to STRUCTURE
Run in the same place you did your select statements:
CREATE TABLE employees (
emp_no INT NOT NULL,
first_name VARCHAR(14) NOT NULL,
last_name VARCHAR(16) NOT NULL,
gender ENUM ('M','F') NOT NULL,
hire_date DATE NOT NULL,
PRIMARY KEY (emp_no);
Which field is the primary key?
Which fields are numbers?
What's an enum*?
What's meant by "NOT NULL"?
SELECT * FROM annye.moviesample;
SELECT first_name FROM annye.moviesample;
SELECT COUNT(*) FROM annye.moviesample;
What do we get?
Nothing! because our table has no data...yet
Now that we have the table set up, we can insert the values. Let's load a row of data.
INSERT INTO
yourname.tablename (first_name, favorite_movie, times_watched) -- column names
VALUES ('Jasmine', 'Aladdin', '456'); -- column values
INSERT INTO annye.moviesample(first_name, favorite_movie, times_watched)
VALUES ('Alice','Alice in Wonderland', 5);
INSERT INTO annye.moviesample
(first_name, favorite_movie,times_watched)
VALUES ('Annye', 'Return of the King', '20');
INSERT INTO annye.moviesample
(first_name, favorite_movie,times_watched)
VALUES ('Jasmine', 'Aladdin','456');
What kind of issues do you see arising from wrong data entry?
Let's answer some exercises with SQL:
What's the most popular movie?SELECT favorite_movie,
COUNT(DISTINCT first_name)
FROM yourname.moviesample
GROUP BY favorite_movie
ORDER BY COUNT(DISTINCT first_name) DESC;
We want to see which decade was most popular among GDI members. This one will have the data for the movie name and the year it was made.
INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);
SELECT * FROM annye.moviesample a
JOIN annye.movie_year b
ON b.movie_title = a.favorite_movie
What kind of issues do you see arising from wrong data entry?
This is the part where you have to be really careful. Most users won't have access to delete tables.
Easy way in UI: The delete button. Be careful!
SQL can be much more accurate:
DELETE FROM annye.moviesample WHERE first_name = 'Shrek';
DELETE FROM moviesample
WHERE first_name like '%S';
DELETE FROM moviesample
WHERE times_watched >= '500';
SELECT is to look at data
CREATE is to create a table
DELETE is to erase data or a table
UPDATE is to change specific data entries
UPDATE annye.moviesample
SET first_name = 'Chihiro', times_watched = 3
WHERE movie = 'Spirited Away';
Try it now
SELECT is to look at data
CREATE is to create a table
DELETE is to erase data or a table
ALTER is to fundamentally change a table:
ALTER TABLE icecream DROP COLUMN flavor;
ALTER TABLE icecream ADD UNIQUE (quantity);
ALTER TABLE icecream MODIFY flavor VARCHAR(35);
Try it now:
ALTER TABLE annye.moviesample ADD COLUMN release_yr
VARCHAR(4) AFTER times_watched;
Here's what we've learned:
Practice, practice, practice! Upload your Excel spreadsheets.
Annye Driscoll | annye.driscoll@gmail.com