Re-post: Pagination on SQL Server

Coming from a background in MySQL I always found the lack of a LIMIT clause in MSSQL to be frustrating. In particular, it has always made performing paginated queries expensive and difficult. The most common solutions always seems to be creating two queries with identical WHERE clauses, one which counts the total values and one which returns a subset up the data. This of course means more calls to the database and slightly more code once you have your responses as well as either maintaining the separate WHERE clauses or creating some unreadable method of including a single WHERE clause in multiple queries.

A few years ago Scott Smith gave a talk on SQL optimization and general tips for writing better performing Transact-SQL. One of the tips involved selecting all results and including the ROW_NUMBER() function to number each result, capturing the results within the WITH statement as a temporary named results set or CTE (Common Table Expression) and then immediately performing another select statement on that set which captures the COUNT of the results and limits them by row number. You might say to yourself that this is still executing two queries, and while this is true from the database’s perspective, handling queries is something it does well, and the results speak for themselves.

Example:

WITH foo AS (
    SELECT id
        , make
        , model
        , color
        , ROW_NUMBER() OVER (ORDER BY make, model) AS row
    FROM cars
    WHERE color = 'red'
)

SELECT *, ( SELECT COUNT( id ) FROM foo ) AS total_rows
FROM foo
WHERE row BETWEEN 20 AND 30
ORDER BY row

Notice that all of the conditional logic and ordering is done in the CTE, and the only condition logic or ordering in the final SELECT is performed on the row value.

What I like to do is implement the final query within a function which takes in any criteria I may want to search on, the page I want, and the number of rows I want returned per page. This way all of the necessary logic is wrapped up nicely, and I don’t ever have to write it again. You will also notice that additional meta data about the result set is returned with the query; this means that, for any use of this data, I will not need to perform any calculations other than to determine what the next page is, and this too could be included if you so choose.

Example:

WITH cars_0 AS (

    SELECT id
        , make
        , model
        , color
        , ROW_NUMBER() OVER (ORDER BY make, model, color) as row
    FROM cars
    WHERE 1=1
        AND make IN ()
        AND model IN ()
        AND color IN ()
)

SELECT cars_0.*
    , meta.total_rows
    , meta.total_pages
    ,  as current_page
    ,  as start_row
    ,  as end_row
FROM cars_0,
    (
         SELECT COUNT(1) AS total_rows
             , CEILING( COUNT(1) / ) AS total_pages
         FROM cars_0
    ) as meta
WHERE cars_0.row BETWEEN
    AND
ORDER BY row

One big difference with this query I would like to point out (besides the additional complexity in conditional logic) is that the sub-select which retrieves the total number of records in the CTE has been moved into the FROM clause. This is a significant performance boost for SQL, because instead of performing the COUNT calculation for every record returned by the second query, it only has to do it once and then make those values available to the SELECT statement. Keep this in mind the next time you write ANY query where you feel the need to include a sub-select.

One issue you may encounter with this example is that I have omitted any handling for the inevitable situation that will arise of a query returning zero results. Currently, if that is the case, SQL will throw an error in the second query as it cannot divide by zero.

Special Thanks to Scott Smith for sharing this technique.


This article was originally written on the WolfNet developer blog. Given that the developer blog was not well adopted within the company I am reposing here in case it is abandoned all together.