I have now for quite some weeks been writing on this #series of articles about the #SQL improvements in MapInfo Pro v2019. We aren't quite done yet so here's another article for you.
This time we will dive into the world of joins.
Support for Cross Join
Until now you have only been able to join two tables in MapInfo Pro by using the equal to (=) operator or one of the spatial operators. With MapInfo Pro v2019, we have extended this in a number of ways.
1. Support for = has been extended to not only support columns but also support a variable expression, a constant or a column on either side of the operand.
2. Support for other operators: <, >, <=, >= and <>.
3. You can combine two conditions but only using the AND operator.
Cross Joins will join every record in the first table with every record in the second table. This is also referred to as the Cartesian Product of records from the two tables. The SQL in MapInfo Pro still requires that you also specify a Join Condition between the two tables but as mentioned above, we have extended the way you can specify this Join Condition. The Join Condition will still need to reference a column or the spatial object from both tables.
Another thing to keep in mind is that it's worthwhile to try and limit the number of records in the two tables as that can affect the calculation time. If you have twenty records in both tables, MapInfo Pro needs to compare 400 combinations. If you can cut the records in both tables to the half, MapInfo Pro will only have to compare 100 records, 1/4 of the original. Sometimes it is worth doing one or two queries and using the resulting queries in the Cross Join instead of using the base tables.
These improvements mean a great deal to what you actually can do via a join now. Let's take a look at an example.
Finding objects nearby
Until now, you have only been able to do spatial joins on objects that in some way touches each other. This limit has now been removed. You can now also create joins on tables and find objects that are nearby.
Here's an example where we are looking for address points near a planning zone. In our example, I'm looking for addresses near a specific planning zone. I'll select that planning zone on the map and use the selection in my query. I could also have specified the ID of the planning zome in the query.
The query I'm using looks like this:
Select a.Address, q.plannavn
From Addresses As "a", Selection As "q"
Where CartesianObjectDistance(a.obj, q.obj, "m") < 250
Into Selection
I'm using the function CartesianObjectDistance() to calculate the distance between the planning zone and the address points. If this distance is less than 250 meters, the query will return the address point.
When executed, you can see that I get not only the addresses inside the planning zone selected, but also those that are outside but within the given distance. The selected addresses are here highlighted in red.
You can also extend this a bit to not include the addresses inside the planning zone:
Select a.Address, q.plannavn
From Addresses As "a", Selection As "q"
Where CartesianObjectDistance(a.obj, q.obj, "m") < 250
And NOT a.Obj Within q.Obj
Into Selection
As you can see you are not limited to only one condition in your Join statement
Finding Addresses Near Highways
In this second example, we are looking for addresses near highways. Our highways in the highways table aren't just one polyline for each highway. Each highway is divided into multiple polylines, often from street intersection to street intersection but it is also split when it crosses an administrative boundary. The highways cover around 80 km and the table holds almost 600 records.
Also worth considering is that we don't need to know which specific segment an address is near-by, we just want to find all the addresses that are closer than 200 meters to a highway.
In order to limit the number of polylines in order to join, we want to merge them based on the name of the highway. In this way, we can see which highway the address is near and also if an address is near multiple highways. This means that we divide our task into two Select statements: One that merges the highways and another that joins the merged highways with the addresses. Luckily, the new SQL Window allows us to run them both at the same time.
The two Select statements look like this:
Select STREET, AggregateCombine(obj)
From Highways
Group By STREET
Into qAggregatedHighways NoSelect
Select a.Address, h.STREET
From Addresses As "a", qAggregatedHighways As "h"
Where CartesianObjectDistance(a.obj, h.obj, "m") < 200
Into Selection
As I mentioned, I can type them both into the SQL Window and execute them at the same time.
The result was that in this specific case 89 addresses were closer than 200 meters to a highway. No addresses were within 200 meters of more than one highway. These two queries took around 15 seconds to execute.
I also tried using the two base tables and so joining every address to every highway polyline. That took more than 9 minutes to complete. Sometimes it worth writing an additional Select statement.
In the first case, I was comparing 14,000 address records to 2 highway records resulting in 28,000 combinations. In the second case, I was it was still 14,000 address records but now 600 highway records resulting in 8,400,000 combinations.
I hope you have found this post useful. Can you see some use cases where this will help you? Feel free to share your thoughts in the comments section below.
------------------------------
Peter Horsbøll Møller
Distinguished Engineer
Pitney Bowes Software & Data
------------------------------