Datacamp Sql Cheat Sheet

  



Datacamp Sql Cheat Sheet

I've found DataCamp, whose cheat sheets comprise a large portion of the above links, to provide very clear explanations and am currently enjoying their Python Data Science courses. Steve Testa. 2 years ago. Options. DataCamp offers interactive R, Python, Sheets, SQL and shell courses. All on topics in data science, statistics and machine learning. Learn from a team of expert teachers in the comfort of your browser with video lessons and fun coding challenges and projects. SQL CHEAT SHEET SQL GROUP BY The GROUP BY clause is usually used with an aggregate function (COUNT, SUM, AVG, MIN, MAX). It groups the rows by a given column value (specified after GROUP BY) then calculates the aggregate for each group and returns that to the screen. SELECT column1, COUNT(column2) FROM tablename GROUP BY column1.

Download this 2-page SQL Window Functions Cheat Sheet in PDF or PNG format, print it out, and stick to your desk.

The SQL Window Functions Cheat Sheet provides you with the syntax of window functions, a list of window functions, and examples. You can download this cheat sheet as follows:

Window Functions

Window functions compute their result based on a sliding window frame, a set of rows that are somehow related to the current row.

Aggregate Functions vs. Window Functions

Unlike aggregate functions, window functions do not collapse rows.

Syntax

Named Window Definition

PARTITION BY, ORDER BY, and window frame definition are all optional.

PARTITION BY

PARTITION BY divides rows into multiple groups, called partitions, to which the window function is applied.

Default Partition: With no PARTITION BY clause, the entire result set is the partition.

ORDER BY

ORDER BY specifies the order of rows in each partition to which the window function is applied.

Pandas cheat sheet pdf

Default ORDER BY: With no ORDER BY clause, the order of rows within each partition is arbitrary.

Window Frame

A window frame is a set of rows that are somehow related to the current row. The window frame is evaluated separately within each partition.

The bounds can be any of the five options:

  • UNBOUNDED PRECEDINGi
  • n PRECEDING
  • CURRENT ROW
  • n FOLLOWING
  • UNBOUNDED FOLLOWING

The lower_bound must be BEFORE the upper_bound.

As of 2020, GROUPS is only supported in PostgreSQL 11 and up.

Abbreviations

AbbreviationMeaning
UNBOUNDED PRECEDINGBETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
n PRECEDINGBETWEEN n PRECEDING AND CURRENT ROW
CURRENT ROWBETWEEN CURRENT ROW AND CURRENT ROW
n FOLLOWINGBETWEEN AND CURRENT ROW AND n FOLLOWING
UNBOUNDED FOLLOWINGBETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING

Default Window Frame

  • If ORDER BY is specified, then the frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
  • Without ORDER BY, the frame specification is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

Logical Order of Operations in SQL

  1. FROM, JOIN
  2. WHERE
  3. GROUP BY
  4. aggregate functions
  5. HAVING
  6. window functions
  7. SELECT
  8. DISTINCT
  9. UNION/INTERSECT/EXCEPT
  10. ORDER BY
  11. OFFSET
  12. LIMIT/FETCH/TOP

You can use window functions in SELECT and ORDER BY. However, you can't put window functions anywhere in the FROM, WHERE, GROUP BY, or HAVING clauses.

  • Ranking Functions
    • row_number()
    • rank()
    • dense_rank()
  • Distribution Functions
    • percent_rank()
    • cume_dist()
  • Analytic Functions
    • lead()
    • lag()
    • ntile()
    • first_value()
    • last_value()
    • nth_value()
  • Aggregate Functions
    • avg()
    • count()
    • max()
    • min()
    • sum()

Ranking Functions

  • row_number() - unique number for each row within partition, with different numbers for tied values
  • rank() - ranking within partition, with gaps and same ranking for tied values
  • dense_rank() - ranking within partition, with no gaps and same ranking for tied values
Sql

ORDER BY and Window Frame:rank() and dense_rank() require ORDER BY, but row_number() does not require ORDER BY. Ranking functions do not accept window frame definition (ROWS, RANGE, GROUPS).

Distribution Functions

  • percent_rank() - the percentile ranking number of a row—a value in [0, 1] interval: (rank-1) / (total number of rows - 1)
  • cume_dist() - the cumulative distribution of a value within a group of values, i.e., the number of rows with values less than or equal to the current row’s value divided by the total number of rows; a value in (0, 1] interval

ORDER BY and Window Frame: Distribution functions require ORDER BY. They do not accept window frame definition (ROWS, RANGE, GROUPS).

Analytic Functions

  • lead(expr, offset, default) - the value for the row offset rows after the current; offset and default are optional; default values: offset = 1, default = NULL
  • lag(expr, offset, default) - the value for the row offset rows before the current; offset and default are optional; default values: offset = 1, default = NULL
  • ntile(n) - divide rows within a partition as equally as possible into n groups, and assign each row its group number.

ORDER BY and Window Frame:ntile(), lead(), and lag() require an ORDER BY. They do not accept window frame definition (ROWS, RANGE, GROUPS).

  • first_value(expr) - the value for the first row within the window frame
  • last_value(expr) - the value for the last row within the window frame

Note: You usually want to use RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING with last_value(). With the default window frame for ORDER BY, RANGE UNBOUNDED PRECEDING, last_value() returns the value for the current row.

  • nth_value(expr, n) - the value for the n-th row within the window frame; n must be an integer

ORDER BY and Window Frame:first_value(), last_value(), and nth_value() do not require an ORDER BY. They accept window frame definition (ROWS, RANGE, GROUPS).

Aggregate Functions

  • avg(expr) - average value for rows within the window frame
  • count(expr) - count of values for rows within the window frame
  • max(expr) - maximum value within the window frame
  • min(expr) - minimum value within the window frame
  • sum(expr) - sum of values within the window frame
Download

ORDER BY and Window Frame: Aggregate functions do not require an ORDER BY. They accept window frame definition (ROWS, RANGE, GROUPS).

Datacamp Sql Cheat Sheet Free

You may also like