SQL on Mongo (legacy)
IMPORTANT
This guide applies to the Quasar-based SQL query engine, which worked for Mongo versions up to 3.4.Â
If you are using the new SQL engine or a Mongo 4.X version you should read this guide:Â New SQL Engine on MongoDB
The Semantic Data Hub has among other objectives to decouple the ontology from the underlying database.Â
For that, we propose to use SQL query language, so the platform offers a SQL Query Engine for each of the supported databases.
Here you can find Reference Guides for this language:
SELECT Operations
Data Types
Type | Description | Examples | MongoDB native-supported |
---|---|---|---|
Null | Indicates missing information. | null | Yes |
Boolean | true or false. | true , false | Yes |
Integer | Whole numbers (no fractional component). | 1 , -2 | Yes |
Decimal | Decimal numbers (optional fractional components). | 1.0 , -2.19743 | Yes |
String | Text. | '221B Baker Street' | Yes |
DateTime | Date and time, in ISO8601 format. | TIMESTAMP '2004-10-19 10:23:54' | Yes |
Time | Time in the format HH:MM:SS. | TIME '10:23:54' | No |
Date | Date in the format YYYY-MM-DD. | DATE '2004-10-19' | No |
Interval | Time interval, in ISO8601 format. | INTERVAL 'P3DT4H5M6S' | No |
Object ID | Unique object identifier. | OID '507f1f77bcf86cd799439011' | Yes |
Ordered Set | Ordered list with no duplicates allowed. | (1, 2, 3) | No |
Array | Ordered list with duplicates allowed. | [1, 2, 2] | Yes |
Clauses
The following clauses are supported:
Type | Clauses |
---|---|
Basic | SELECT , AS , FROM |
Joins | LEFT OUTER JOIN , RIGHT OUTER JOIN , INNER JOIN , FULL JOIN , CROSS |
Filtering | WHERE |
Grouping | GROUP BY , HAVING |
Conditional | CASE  , WHEN , DEFAULT |
Paging | LIMIT , OFFSET |
Sorting | ORDER BY  , DESC , ASC |
The following operators are supported:
Type | Operators |
---|---|
Numeric | + , - , * , / , % |
String | ~  , ~* , !~ , !~* , LIKE , || |
Array | || , [ ... ] |
Relational | = , >= , <= , <> , BETWEEN , IN , NOT IN |
Boolean | AND , OR , NOT |
Projection | foo.bar , foo[2] , foo{*} , foo[*] |
Date/Time | TIMESTAMP , DATE , INTERVAL , TIME |
Identity | OID |
Note:Â ~
 , ~*
, !~
, and !~*
 are regular expression operators. ~*
, !~
, and !~*
 are preliminary and may not work in the current release.
Note: The ||
 operator for strings will concatenate two strings; for example, you can create a full name from a first and last name property: c.firstName || ' ' || c.lastName
. The ||
 operator for arrays will concatenate two arrays; for example, if xy
 is an array with two values, then c.xy || [0]
 will create an array with three values, where the third value is zero.
The following functions are supported:
Type | Functions |
---|---|
String | CONCAT , LOWER , UPPER , SUBSTRING , LENGTH , SEARCH |
DateTime | DATE_PART , TO_TIMESTAMP |
Nulls | COALESCE |
Arrays | ARRAY_LENGTH , FLATTEN_ARRAY |
Objects | FLATTEN_OBJECT |
Set-Level | DISTINCT , DISTINCT_BY |
Aggregation | COUNT , SUM , MIN , MAX , AVG |
Identity | SQUASH |
Basic Queries
select * from ISO3166_1 as c returns info and ContextData:
select c.ISO3166 as ISO3166 from ISO3166_1 as c returns only Data:
select _id,c from ISO3166_1 Â returns * including id from Registry:
Filtering
You can filter a result set using the WHERE clause. The following operators are supported:
- Relational:Â
-
,Â=
,Â>=
,Â<=
,Â<>
,ÂBETWEEN
,ÂIN
,ÂNOT IN
- Boolean:Â
AND
,ÂOR
,ÂNOT
Examples:
select c.ISO3166 as ISO3166 from ISO3166_1 as c where c.ISO3166.name="Zambia"
Numeric and String Operations
You can use any of the operators or functions listed in the Clauses, Operators, and Functions section on numbers and strings. Some common string operators and functions include:
Operator or Function | Description |
---|---|
|| | Concatenates. |
LOWER | Converts to lowercase. |
UPPER | Converts to uppercase. |
SUBSTRING | Returns a substring. |
LENGTH | Returns length of string. |
Examples:
select c.ISO3166.number/100+1000 as calculated from ISO3166_1 as c where c.ISO3166.name="Zambia" and c.ISO3166.language="EN"
select c.ISO3166.name as name from ISO3166_1 as c where c.ISO3166.name like "S%" and c.ISO3166.language="EN"
Dates and Times
Filter by dates and times using the TIMESTAMP
, TIME
, and DATE
 operators. Also, you can also use the DATEPART
 operator for selection to select part of a date, such as the day.
Note: Some databases will automatically convert strings into dates or date/times. SlamData does not perform this conversion, since the underlying database has no schema and no fixed type for any field. As a result, an expression like WHERE ts > '2015-02-10'
 compares string-valued ts
 fields with the string '2015-02-10'
 instead of a date comparison. If you want to embed literal dates, timestamps, etc. into your SQL queries, you should use the time conversion operators, which accept a string and return value of the appropriate type. For example, the above snippet could be converted to WHERE ts > DATE '2015-02-10'
, which looks for date-valued ts
 fields and compares them with the date 2015-02-10
.
Note:Â If your database data does not use a native date/time type, and instead, you store your timestamps as epoch milliseconds in a numeric value, then you should either compare numbers or use the TO_TIMESTAMP function.
Filter based on a timestamp (date and time).
Use the TIMESTAMP operator to convert a string into a date and time. The string should have the format YYYY-MM-DDTHH:MM:SS
.
select c.type,c.name,c.dateLastUpdate,DATE_PART("year",c.dateLastUpdate) as year from CLM_BusStop as c
select c from CLM_BusStop as c WHERE c.CLM_BusStop.lastUpdateMillis > TO_TIMESTAMP(1446335999)
select c from CLM_BusStop as c WHERE c.CLM_BusStop.lastUpdate > TIMESTAMP('2018-11-06T12:00:00.000Z')
Function NOW()
This function returns the current system date, you can use it in the WHERE of your queries to bring data.
NOW(“format“,'unitTime', amount)
- “format“: we will give the format that we need to date, for example "yyyy-MM-dd'T'HH: mm: ss'Z '".
- 'unitTime': to date we can increase or decrease a number of hours, days, ... the possible values for 'unitTime' are: 'year', 'month', 'date', 'hour', 'minute', 'second', 'millisecond'.
- "amount": is a positive or negative integer value that is added or subtracted depending on the unitTime selected.
Examples:
select * from MyOntology as c where c.date.timestamp>NOW()
Grouping
SQL on Mongo allows you to group data by fields and by date parts.
Group based on a single field.
Use GROUP BY to group results by a field.
Example:
SELECT c.age, COUNT(*) AS cnt FROM Users as c GROUP BY c.age
SELECT c.age, c.gender, COUNT(*) AS cnt FROM Users as c GROUP BY c.age, c.gender
Group based on date part.
Use the DATE_PART function to group by a part of a date, such as the month.
Example:
SELECT DATE_PART('day', c.ts) AS day, COUNT(*) AS cnt FROM Events as c GROUP BY DATE_PART('day', c.ts)
Filter within a group.
Filter results within a group by adding a HAVING clause followed by a Boolean predicate.
Example:
SELECT DATE_PART('day', c.ts) AS day, COUNT(*) AS cnt FROM Events as c GROUP BY DATE_PART('day', c.ts) HAVING c.gender = 'female'
Double grouping
Perform double-grouping operations by putting operators inside other operators. The inside operator will be performed on each group created by the GROUP BY clause, and the outside operator will be performed on the results of the inside operator.
Example:
This query returns the average population of states. The outer aggregation function (AVG) operates on the results of the inner aggregation (SUM) and GROUP BY clause.
SELECT AVG(SUM(p.pop)) FROM Population as p GROUP BY p.state
Nested Data and arrays
Unlike a relational database, many NoSQL databases allows data to be nested (that is, data can be objects) and to contain arrays.
Nesting
Nesting is represented by levels separated by a period (.
).
Arrays
Array elements are represented by the array index in square brackets ([n]
).
Example:
SELECT c.profile.allAddress[0].street.number FROM Users as c
Flattening
You can extract all elements of an array or all field values simultaneously, essentially removing levels and flattening the data. Use the asterisk in square brackets ([*]
) to extract all array elements.
Example:
SELECT c.profile.allAddresses[*] FROM Users as c
Use the asterisk in curly brackets ({*}
) to extract all field values.
Example:
SELECT c.profile.{*} FROM Users as c
Filtering using arrays
You can filter using data in all array elements by using the asterisk in square brackets ([*]
) in a WHERE clause.
Example:
SELECT DISTINCT * FROM Users as c WHERE c.profile.allAddresses[*].street.number = '221B'
Pagination and Sorting
Pagination
Pagination is used to break large return results into smaller chunks. Use the LIMIT operator to set the number of results to be returned and the OFFSET operator to set the index at which the results should start.
Example (Limit results to 20 entries):
SELECT u FROM Users as u LIMIT 20
Example (Limit results to 20 entries empezando en el 10):
SELECT u FROM Users as u OFFSET 10 LIMIT 20
Sorting
Use the ORDER BY clause to sort the results. You can specify one or more fields for sorting, and you can use operators in the ORDER BY arguments. Use ASC for ascending sorting and DESC for decending sorting.
Example (Sort users by ascending age):
SELECT u FROM Users as u ORDER BY u.age ASC
Joins
Use the JOIN operator to join different collections.
Examples:
This example returns the names of employees and the names of the departments they belong to by matching up the employee department ID with the department’s ID, where both IDs are ObjectID types.
SELECT emp.name, dept.name FROM Employee as emp JOIN Department as dept ON dept._id = emp.departmentId
Conditional and Nulls
Conditionals
Use the CASE expression to provide if-then-else logic to SQL². The CASE sytax is:
SELECT (CASE <field>
WHEN <value1> THEN <result1>
WHEN <value2> THEN <result2>
...
[ELSE <elseResult>
END)
FROM "<path>"
Example:
The following example translates string number values into actual numbers.
select
  (CASE c.Ticket.number
   WHEN 'one' THEN 1
   WHEN 'two' THEN 2
   END)
as nameParsed
from Ticket as c
Considerations on ID
MongoDB has special rules about fields called _id
. For example, they must remain unique, which means that some queries (such as SELECT myarray[*] FROM foo
) will introduce duplicates that MongoDB won’t allow. In addition, other queries change the value of _id
 (such as grouping).
SELECT * FROM Ticket WHERE _id = OID( 'abc123')
SELECT _id, u AS email FROM users as u
Considerations on wildcard *
For performance reasons, we encourage you to avoid the use of the wildcard '*' on SELECTs, instead replace it with either projections or explicit fields:
SELECT * FROM Ticket -> SELECT t FROM Ticket as t
SELECT * FROM Ticket -> SELECT t.Ticket.name as name, t.Ticket.number as number FROM Ticket as t
Complex examples
Here we show some examples of queries that we've used in different projects:
select
t.tempo, e.lotto, count(1)
from TempoDesc as t
inner join
(select
c.lotto, c.epochEnd - d.minStart as durataTotale
from
(select
min(epochStart) as minStart, lotto
from Eventi1
group by lotto) as d
inner join Eventi1 as c
on c.lotto=d.lotto ) as e
on e.durataTotale <= t.tempo
group by t.tempo, e.lotto
select
   re.countrysrc,re.countrydest,re.count, iso.ISO3166.latitude , iso.ISO3166.longitude
   from
    ( select rx.routesexten.countrysrc as countrysrc, rx.routesexten.countrydest as countrydest, count(re.routesexten.countrysrc) as count
from routesexten as rx
group by rx.routesexten.countrysrc, rx.routesexten.countrydest
order by count desc) as re
   inner join
    ISO3166_1 As iso on re.countrydest = iso.ISO3166.name
   order by
   re.count desc
   limit 10
UPDATE Operations
You can also perform UPDATE operations in SQL. Here we will explain the peculiarities of this type of SQL directives.
Simple syntax
As in standard SQL, the update directive is built as follows:
UPDATE column SET attributes WHERE attributesFilter
Here are some examples of basic Update operations:
Let the Restaurants ontology be our working example with the following structure:
update Restaurants set Restaurant.borough="Queens" where Restaurant.borough="Manhattan"
update Restaurants set Restaurant.borough="Queens", Restaurant.restaurant_id=5400 where Restaurant.borough="Manhattan" and Restaurant.borough="Dj Reynolds Pub And Restaurant"
ObjectId
If you want to filter by Mongo ObjectId, you don't need to use any special function, you just need to indicate the quoted value of the ObjectId, for example:
update Restaurants set Restaurant.borough="Manhattan", Restaurant.restaurant_id=5400 where _id="5d14883e29a91330a9625d0c"
ISODates
As for the ObjectId, if you want work with Mongo dates, you just need to put the quoted value of the date in ISODate Format (yyyy-MM-dd'T'HH:mm:ss.SSS'Z'):
update ThermometerData set ThermometerData.value=60 where ThermometerData.timestamp="2019-06-26T13:57:01.000Z"
The compiled query:
 Result:
Booleans
If you want to set or filter boolean values, you have to use the SQL function Boolean() as follows:
update ThermometerData set ThermometerData.bool=Boolean(true) where ThermometerData.timestamp="2019-06-26T13:57:01.000Z"
Arrays
If you want to add elements to an array field, you have to use the APPEND() function.
APPEND(i, value) accepts two parameters:
- i: This is optional (default is last position in array). It indicates the position where the value is going to be pushed.
- value: This is the value being pushed to the array, it can be an object, string, number... This parameter can be also an array of values.
For example:
update ThermometerData set ThermometerData.array=APPEND("{\"Property\":\"value\"}")
update ThermometerData set ThermometerData.array=APPEND([true, false])
update ThermometerData set ThermometerData.array=APPEND([true, false, 45, "somestring", "2019-06-26T13:57:01.000Z"])
The result is:
Important note: If you want to push an ISODate (Mongo date) to the array, you want to surround the isodate value by ISODate function:
update ThermometerData set ThermometerData.array=APPEND([true, false, 45, "somestring", ISODate("2019-06-26T13:57:01.000Z")])
Â
DELETE operations
Delete directive works as in standard SQL.
DELETE FROM column WHERE attributes
ObjectId, ISODate, and Boolean operations works as with the updates. You can't use any type of array operations yet in this directive.
Examples:
DELETE FROM Restaurants WHERE _id="5d14883e29a91330a9625d0c"
DELETE FROM Restaurants WHEREÂ timestamp="2019-06-26T13:57:01.000Z"
DELETE FROM Restaurants WHERE isClosed=Boolean(true)