Q-Base

SQL Technology Specialists

The Power of the Over() Clause

clock November 5, 2011 06:19 by author Jelluh

In SQL 2005 the Over Clause MSDN has been introduced. I think it's a great asset because of its power to be efficient and easy to use.

To see the difference lets work out an example for Adventureworks and try different solutions and compare the code with each other.

Case:
Multiple salespersons have been complaining to the management that other salespersons still sell orders in their territory.
The Management has asked you to give them a list which gives them information about the last time salespersons sold outside their territory.


Ofcourse both solutions will give use the exact same output:

Solution 1 using Group By

;WITH cteLastSalesOutsideGivenTerrority AS
(    
    SELECT MAX(tSSOH.SalesOrderNumber)          AS SalesOrderNumber
          ,tSSOH.SalesPersonID                  AS SalesPersonID
    FROM Sales.SalesOrderHeader             tSSOH
         INNER JOIN Sales.SalesPerson       tSSP    ON tSSP.SalesPersonID    = tSSOH.SalesPersonID
                                                       AND tSSP.TerritoryID != tSSOH.TerritoryID
    GROUP BY tSSOH.SalesPersonID                                                              
)
SELECT  cLSOGT.SalesPersonID                 AS SalesPersonID
       ,tPC.FirstName + ' ' + tPC.LastName   AS SalesPerson
       ,tSST_SSOH.Name                       AS TerroritySold
       ,tSST_SSP.Name                        AS TerrorityGiven
       ,tSSOH.OrderDate                      AS OrderDate
       ,tSSOH.SalesOrderNumber               AS SalesOrderNumber
FROM cteLastSalesOutsideGivenTerrority  cLSOGT
     INNER JOIN Sales.SalesOrderHeader  tSSOH       ON tSSOH.SalesOrderNumber      = cLSOGT.SalesOrderNumber
     INNER JOIN Sales.SalesPerson       tSSP        ON tSSP.SalesPersonID          = tSSOH.SalesPersonID   
     INNER JOIN HumanResources.Employee tHRE        ON tHRE.EmployeeID             = cLSOGT.SalesPersonID 
     INNER JOIN Person.Contact          tPC         ON tPC.ContactID               = tHRE.ContactID
     INNER JOIN Sales.SalesTerritory    tSST_SSOH   ON tSST_SSOH.TerritoryID       = tSSOH.TerritoryID
     INNER JOIN Sales.SalesTerritory    tSST_SSP    ON tSST_SSP.TerritoryID        = tSSP.TerritoryID

 

In the example above i first group the data to find the last order a salesperson has made outside his terrority. In this case i may assume that the max ordernumber gives the last order. The cteLastSalesOutsideGivenTerrority is then been used to find the rest of the information that is needed. You will see that i need to join most of the tables used in the cte again to get the rest of the information needed for the report. I rather wouldn't do this but i have no other option here.

When we look at the execution plan we will see what we would expect, first the grouping is being handled:

And then the rest of the query:

Solution 2 using Over Clause

If we would solve the same case with the over clause then the code would be like this:

;WITH cteSalesOutsideGivenTerrority AS
(
    SELECT ROW_NUMBER() OVER(PARTITION BY tSSOH.SalesPersonID ORDER BY tSSOH.SalesOrderNumber DESC) AS RowNumber
          ,tSSOH.SalesPersonID                  AS SalesPersonID
          ,tPC.FirstName + ' ' + tPC.LastName   AS SalesPerson
          ,tSST_SSOH.Name                       AS TerroritySold
          ,tSST_SSP.Name                        AS TerrorityGiven
          ,tSSOH.OrderDate                      AS OrderDate
          ,tSSOH.SalesOrderNumber               AS SalesOrderNumber
    FROM Sales.SalesOrderHeader             tSSOH
         INNER JOIN Sales.SalesPerson       tSSP        ON tSSP.SalesPersonID          = tSSOH.SalesPersonID
                                                            AND tSSP.TerritoryID      != tSSOH.TerritoryID
         INNER JOIN HumanResources.Employee tHRE        ON tHRE.EmployeeID             = tSSP.SalesPersonID 
         INNER JOIN Person.Contact          tPC         ON tPC.ContactID               = tHRE.ContactID
         INNER JOIN Sales.SalesTerritory    tSST_SSOH   ON tSST_SSOH.TerritoryID       = tSSOH.TerritoryID
         INNER JOIN Sales.SalesTerritory    tSST_SSP    ON tSST_SSP.TerritoryID        = tSSP.TerritoryID
)
SELECT   cSOGT.SalesPersonID
        ,cSOGT.SalesPerson
        ,cSOGT.TerroritySold
        ,cSOGT.TerrorityGiven
        ,cSOGT.OrderDate
        ,cSOGT.SalesOrderNumber
FROM cteSalesOutsideGivenTerrority cSOGT
WHERE RowNumber = 1;

 

With the row_number() function i'll give a number to each row starting from 1 that meets my over clause conditions.
Within the over clause i say it needs to group (=partition) my result by salesperson and order it by salesordernumber descending.
So the Row_Number function will start to count the rows by salesperson starting by the highest salesordernumber:

You will notice i dont have to join the same table multiple times to lookup the extra information, i'll now can handle it in one query only.

With this result given we then can say that we want all results with rownumber 1
When we look at the execution plan then we notice the it will first handle the joins

And then the result get scanned once and sorted by SalesPersonID ascending, SalesNumber descending and uses the segment operator the calculate the rownumbers which gets filtered later on.

If we compare the performance in one batch we see that the over clause is a much more efficient solution.

Both query will give use the exact same result, only with different performance!

 



Execute a storedprocedure from a batchfile with arguments

clock October 30, 2011 04:58 by author Jelluh

Sometimes you want to be flexible and control a storedprocedure with commands executed from a batch file, which for example you can fire from command promp or SQL Agent.
I made an example which shows you how to store entries to a table given by a batch file.


Lets start with making a table and storedprocedure to for example store the given argument(s).
The following code makes a table named tLogEntry and a Stored Procedure named p_InsertLogEntry which stores the given values to the table tLogEntry.

USE AdventureWorks
GO
 

/* Create Sample Table to store input values from batch command */
CREATE TABLE [dbo].[tLogEntry]
( LogID INT IDENTITY(1,1) NOT NULL
, Value NVARCHAR(50) NOT NULL
, LogDate DATETIME DEFAULT GETDATE() NOT NULL
, CONSTRAINT [PK_LogID] PRIMARY KEY CLUSTERED ( LogID ASC ) ON [PRIMARY]
)ON [PRIMARY]

GO

/* Create Stored Procedure with parameter to save entries to the Log table */
CREATE PROC [dbo].[p_InsertLogEntry]
@pstrLogValue NVARCHAR(50)
AS
BEGIN

INSERT INTO [dbo].[tLogEntry] (Value)  VALUES( @pstrLogValue );    

END

GO

 


Now i'll use NotePad and store the following code and save the file as LogEvent.bat

@echo Off

if "%1" == "" goto error

set LogEvent=%1

sqlcmd -E -d AdventureWorks -Q "exec p_InsertLogEntry @pstrLogValue=$(myLogValue)" /v myLogValue=%LogEvent%

goto end

:error
echo missing argument!
echo usage : LogEvent.bat <argument> 
:end

The %1 will catch the first argument given to the batch file and stores it into a LogEvent variable, If you want to use more batch arguments then you can use %2, %3 etc.
With SQLCMD we can execute a sql statement on a database, for more information about the SQLCMD see MSDN

I'll execute the storedprocedure p_InsertLogEntry with parameter @pstrLogValue and fill the entry with a variable called myLogValue, then i'll use the /v command to fill myLogValue with the given batch argument which i stored into %LogEvent% earlier.


Now you can fire the batch file with an argument to test it for example:

When we query the table we will see the logentries are saved with the values given to the batchfile.

The same thing can be achieved with powershell, which we will show you in an upcoming blog!

Happy programming!