MapInfo Pro

 View Only

MapInfo Monday: Two Ways to Update a Table with Values from another Table

  • 1.  MapInfo Monday: Two Ways to Update a Table with Values from another Table

    Employee
    Posted 03-25-2024 03:00

    Happy #MapInfoMonday

    This article is based on a question our technical support was given last week.

    A customer needed to update one table with values from another table. Not that uncommon a question and the answer is straightforward: Use Update Column.

    From the Table tab, click on the Update Column control.

    In the Update Column dialog, select the table and column to update, select the table and the column to get the value from, and finally check the join condition to ensure MapInfo Pro got that right too.
    In my example, I'm using a table of addresses that I want to enrich with values from a table of parcels. I'm using a spatial join: where parcels contains addresses parcels.
    You can use attribute joins too if there is no spatial relationship.
    If you have the MapBasic window open when running the Update Column process, you will see a statement like this:
    Add Column "Addresses" (ejerlavkode) From Parcels Set To ELAVSKODE Where contains
    Back to the question to our support team. The question wasn't as much about how to do it but more about optimizing it to run faster. He had to update not just one column, but multiple columns with values from a second table.
    The obvious solution is to repeat the command above as many times as you have columns to update. 
    Add Column "Addresses" (ejerlavkode) From Parcels Set To ELAVSKODE Where contains
    Add Column "Addresses" (ejerlavnavn) From Parcels Set To ELAVSNAVN Where contains
    Add Column "Addresses" (matrikelnr) From Parcels Set To MATRNR Where contains
    But when you start to look into the performance of it - that's not a good idea. It isn't a big issue if you only have a few hundred thousand records. But when you cross into the millions, you will start noticing the performance issue.
    I ran the test with a table of addresses and parcels for Denmark: 3.7 million addresses and 2.5 million parcels.
    Running one of the Add Column statements above take 5-6 minutes with my sample data. The second also takes this amount of time and the third and fourth. I think you get the point.
    There is another approach where you will start seeing the benefit when you have more than one column to update.
    This other approach has two steps to it:
    1. Join the two tables and include the columns to write to and read from.
    2. Use an Update statement to update the columns in the resulting query.

    This will typically be faster than running 2 of the Add Column statements.

    Here's an example of the Select statement:

    Select a.ejerlavkode As "WriteTo1"
        , a.ejerlavnavn As "WriteTo2"
        , a.matrikelnr As "WriteTo3"
        , p.ELAVSKODE As "ReadFrom1"
        , p.ELAVSNAVN As "ReadFrom2"
        , p.MATRNR As "ReadFrom3"
    From Addresses As "a", Parcels As "p"
    Where a.OBJ Within p.OBJ
    Into __To_Update NoSelect

    Notice that I give my column aliases. This makes it easier for me when creating the Update statement. That's especially the case when the columns share the same names in the two tables.

    After that, we have to update the resulting query with an Update statement:

    Update __To_Update
        Set   WriteTo1 = ReadFrom1
            , WriteTo2 = ReadFrom2
            , WriteTo3 = ReadFrom3

    Do note that the Update statement updates all three columns in one statement. You don't have to use a Update statement for each column. This is also one of the time-saving elements of the second method.

    Using this approach and updating all 3 columns took around 8-10 minutes - for all three columns. The initial approach took around 15-18 minutes for all three columns.

    I used the MapBasic window to build these scripts but you can also use the SQL window.
    This is the full script you can see in the MapBasic window. It also includes some timing variables making it possible to compare the two methods:
    Dim tStart As Time
    Dim fTime As Float
    tStart = CurTime()
    Print Time(24) + " Starting..."
    Select a.ejerlavkode As "WriteTo1"
    	, a.ejerlavnavn As "WriteTo2"
    	, a.matrikelnr As "WriteTo3"
    	, p.ELAVSKODE As "ReadFrom1"
    	, p.ELAVSNAVN As "ReadFrom2"
    	, p.MATRNR As "ReadFrom3"
    From Addresses As "a", Parcels As "p"
    Where a.OBJ Within p.OBJ
    Into __To_Update NoSelect
    fTime = CurTime() - tStart
    Print Time(24) + " Tables have been joined: " + fTime + " secs"
    Set Table Addresses FastEdit On Undo Off
    Update __To_Update
    	Set   WriteTo1 = ReadFrom1
    		, WriteTo2 = ReadFrom2
    		, WriteTo3 = ReadFrom3
    fTime = CurTime() - tStart
    Print Time(24) + " Table Addresses has been updated: " + fTime + " secs"
    Commit Table Addresses Interactive

    Similarly, the first script with the Add Column statement with timing variables:

    Dim tStart As Time
    Dim fTime As Float
    tStart = CurTime()
    Set Table Addresses FastEdit On Undo Off
    Print Time(24) + " Starting..."
    Add Column "Addresses" (ejerlavkode) From Parcels Set To ELAVSKODE Where contains
    fTime = CurTime() - tStart
    Print Time(24) + " 'Ejerlavkode' updated: " + fTime + " secs"
    Add Column "Addresses" (ejerlavnavn) From Parcels Set To ELAVSNAVN Where contains
    fTime = CurTime() - tStart
    Print Time(24) + " 'Ejerlavnavn' updated: " + fTime + " secs"
    Add Column "Addresses" (matrikelnr) From Parcels Set To MATRNR Where contains
    fTime = CurTime() - tStart
    Print Time(24) + " 'Matrikelnr' updated: " + fTime + " secs"
    Commit Table Addresses Interactive

    If you pay attention you will also spot the Set Table ... FastEdit On statement in the two examples above. This puts the table in a state where the changes are written directly to the table and not to a transaction file. This also speeds up the process but comes with the downside that you can't undo or revert your changes. You have to be sure you are doing the right thing - or ensure you have a backup of your table.

    I hope you found this useful even though it got a bit technical too. But then again, we all love a bit of MapBasic code, eh?



    ------------------------------
    Peter Horsbøll Møller
    Principal Presales Consultant | Distinguished Engineer
    Precisely | Trust in Data
    ------------------------------