SQL Editor Queries
Last updated
Last updated
In Acho Studio, you can use the "SQL action" to insert customized SQL query for your table. This page will explain all the SQL queries that Acho Studio currently supports. The tables we are going to use are all from the sample-project created when you first log into Acho Studio.
Description
Select is the beginning keyword for every SQL query, it indicates that the user would like to view the following data.
Description
DISTINCT
function helps the user to identify unique values within a data field.
Example
DISTINCT
can also be used to identify unique combinations of multiple data fields.
Description
WHERE
is used to identify the beginning of filtering conditions being applied to the dataset. Within one table query section, there should only be one WHERE
and that should be located after the “From” statement after the name of the table. WHERE
should be followed immediately by the first filtering expression.
Example
AND
, OR
, NOT
are all supplementary expressions to help combine and refine filter expressions.
AND
allows users to obtain the intersecting or overlapping sections of two filtered datasets;
OR
allows users to obtain the union of two filtered datasets;
NOT
allows users to obtain the complement of a filtered dataset;
The three filter operations can be easily combined.
Description
ORDER BY
allow the user to sort the table based on one or multiple fields. The default sorting mechanism is ascending, alphabetical for string fields and from smallest to largest for numeric fields. ORDER BY
can also be used in combination with DESC
to reverse the sorting order into descending.
Example
INSERT INTO
can be used to add new records into a table. In other words, this is equivalent to make a manual entry. Note that Insert Into is not a query operation, the processed result will just be more records added to the table.
When using INSERT INTO
, users can choose to outline the value for every field;
Or just specify the select fields that the new record will have value in, the un-called out fields will simply have “null” as value.
NULL
is a special word in SQL to identify truly empty field values, please note that null is not equivalent to an empty string ("").
LIMIT
is used to select a requested number of sample records. Sometimes the table is very large and a simple select * statement can take longer to process. Hence during data exploration Limit can be used to get a small sample of the data. Limit is usually written at the very end of a query statement.
Min
and Max
are mathematical operators in SQL that will allow the user to obtain the minimum or maximum value of a specified field or set. It can be used on numeric values as well as strings, where the strings are ranked in alphabetically ascending order.
COUNT
, AVG
, and SUM
are mathematical operations that can be used to summarize field data. Note that COUNT
does not need to be applied to a numeric field, but AVG
and SUM
should. All three expressions need to be followed by brackets, “()”, and there can be further operations within the brackets. COUNT(*)
can be used to calculate the total number of records.
LIKE
is usually used in string expressions to find close lookalikes when exact match strings are cumbersome to express. It is often used together with the character wildcard %
or _
.
There are several wildcards in SQL that can be used with LIKE
to find strings that match a certain pattern.
%
is the most popular wildcard. It can represent any or no character.
_
represent one single character, having multiple _
in a row will represent that exact number of missing characters.
IN
can be used when specifying not one but multiple values, it can be useful in setting filters by combining several OR
expression into one single expression. It can also be combined with NOT
to form NOT IN
which will reflect the inverse of multiple exclusions. IN
always need to be followed by a bracket ()
where the selected matching values will be listed.
BETWEEN
can be used to identify ranges of values, it can be applied to numeric values or strings, where strings are ranked alphabetically ascending. BETWEEN
needs to be followed by an AND
in between the specified upper and lower bound values.
SQL has an easy alias function to assign nicknames to tables and or fields to save some typing within a query. The alias can be noted with a keyword AS
. Or, you can simply skip the AS
and directly type the nickname after the full expression.
In SQL, the keyword JOIN
is equivalent to INNER JOIN
. Both expressions will work in a query. INNER JOIN
will bring two tables together on records that match the identified key field(s) only. In one sense, this means the resulting table will be intercept of the two source tables, unmatched records from either side of the table will be dropped.
Note that SELECT *
in a JOIN
statement will output all columns from both tables. If only selected columns are needed then the chosen columns should be specified after SELECT
. Each column name should have a prefix of the table name, which indicates the column is from which source table.
If we view a join statement as a line of text on paper, the first-named table will physically be located on the left side since we write left to right, and therefore can be referred to as the left table. A left join will keep the first-named table to be joined intact, no record will be dropped from this first table, even if it did not have a match from the second table through the join key. Unmatched records from the first or left table will be carried forward into the resulting table with the new fields from the right table having null values. Unmatched records from the second or right table will be dropped.
In SQL LEFT JOIN
is equivalent to LEFT OUTER JOIN
, both can be used as keywords in a query.
In this following example which is exactly the same as the one we saw in inner join except the join type has change from inner to left, records from mask_use_by_county table which had null value for county_fips_code will be kept, but records from us_counties table that had null county_fips_code will be dropped.
SELECT b.state_name,b.county,b.date,a.never FROM mask_use_by_county a
left join us_counties b on a.county_fips_code = b.county_fips_code;
RIGHT JOIN
is the opposite of LEFT JOIN
where the unmatched records from the second named or right table will be kept regardless but the unmatched records from the first table will be dropped. It is equivalent to RIGHT OUTER JOIN
and will both work as keywords in a query.
FULL JOIN
is the no data loss join. In this case, unmatched records from both the left and right tables will be kept. That is, no records will be dropped from either side of the join. It is equivalent to FULL OUTER JOIN
.
A SELF JOIN
is an unconventional join statement expression since the keyword JOIN
is actually not found in the query anywhere. It is one way to obtain combinations within the same dataset.
Note that in this case where the field names are identical from the two sides. Thus, remember to use AS
to rename two tables and the outputted columns, so SQL will not show errors due to multiple columns with the same name.
SELECT A.state_name as higher_state,B.state_name as lower_state,A.deaths as higher_state_deaths,B.deaths as lower_state_deaths
FROM us_states A, us_states B
where A.state_name <> B.state_name
and A.confirmed_cases > B.confirmed_cases;
While JOIN
is the concept of merging multiple tables together horizontally (more added fields), UNION ALL
is the concept of appending two separate tables together vertically, by combing the total amount of records from both sides. In a union the user must be very careful with specifying the fields to bring from each table and make sure the correct fields are consolidating together. In a union expression, the keyword UNION ALL
is located in between two separate select statements.
GROUP BY
is an extremely useful function in SQL to aggregate data to a higher level. It can also generate a pivot table on top of datasets.
A GROUP BY
statement needs to have two parts that are in sync. After the initial SELECT
, a set of columns that will be used as aggregation key needs to be listed, followed by the calculated metrics that will be calculated at the aggregated level. Filters, joins, and other operations can follow just like in any regular query. After filtering conditions, a GROUP BY
key word is needed followed by the exact same set of fields used as aggregation key.
HAVING
is a very useful filtering mechanism that allow users to place filtering conditions on aggregated results after a group by query. In one sense this allows the user to do query on top of another query. This can be extremely useful in data exploration.
Note that the filtering conditions after HAVING
must be elements of the top select statement.
Exists in SQL is a binary test, it will return True/False for a select statement if it contains records or not. It is another way to place an advanced filter in a query as the test performed is not limited to the current table. The key word “exist” is placed after where, and then following by a bracketed separate select statement which contains the conditions for the test.
CASE
is the SQL way of placing “If & else” logic into a query. It means generating a new field using case logic into the existing table, usually used to categorize records. Although the case statement itself can take several rows to complete, it is equivalent to one new field, therefore it usually goes immediately behind SELECT
parallel to the fields to be displayed and in front of WHERE
.
The keyword COALESCE
can be used to embed a simple case logic when encountering null values, it can be very useful in calculations as null values may easily skew or null out the calculated results. COALESCE
is followed by a bracket with two parameters, the first parameter is the field where the null check will be performed, the second parameter is the value to replace NULL
with.
SQL Stored Procedures can only do select in Acho.
Special character /* and */ can be used to indicate start and end of a comment or none-executable section of the query code. Within these two special characters, any content can be written, no syntax needs to be followed. Usually the comment function is used to leave written comments on complex queries, it can also be used to comment out sections of the query codes to allow user to do flexible testing.
Note that is only /* is used, all written content after /* will all be commented out and become none executable.
The following are arithmetic operators that can be used in SQL query:
+ | addition |
- | subtraction |
* | multiplication |
/ | division |
% | remainder or modulus |
The following comparison operators can be used in SQL query:
= | equal |
> | greater than |
< | less than |
>= | greater than or equal to |
<= | less than or equal to |
Furthermore, “!” can be used as a prefix to any of the comparison operators to achieve the inverse operation. For example “!=” means “not equal to”, “!>” means “not greater than” which is equivalent to “<=”.
Additionally, there exists another distinctive operator denoted as "<>", which serves the purpose of expressing "not equal to" or "different from." This operator is suitable for numeric fields as well as other field types.