When SQL queries result in hundreds and then thousands of records being pulled into an application, memory starts to become a real problem, and that hits performance.
It's pretty rare that a user will want to view 40,000 records in one go, so most applications will present the data to a user across a number of pages. So why pull all that data into the application only to show the user a fraction of the records (probably never more than 100 records at any one time)?
Typically, when working with embeded SQL, the application will handle a lot more data than necessary.
Really, we only ever want to call the records that we’re going to display on a particular page.
"Select * from Orders" isn't going to cut it.
Developers working on MySQL have a very easy solution. Limit. We can easily get the specific number of records that we want to display on a page.
This will give us the first 10 records (our first page):
"Select * FROM Orders Limit 10"
This will give us the next 10 (page 2):
"Select * FROM Orders Limit 10, 10"
However developers working with SQL Server don’t have access to this very handy "Limit" option (that MySQL has been allowing developers to use for as long as I can remember), but there is a simple alternative.
Some of the more up to date Microsoft options (such as Linq-to-SQL), or indeed utilising Stored Procedures, make this a lot easier, but we need to support those applications that didn’t benefit from such “modern advancements”.
And, although it’s not as simple as "Limit", it’s only a few lines of code, and is well worthwhile.
The solution is to run the query by proxy. We can create our normal query, but we won’t run this query ourselves. We’ll ask another query to run it for us, eliminate all of the records that we're not interested in and pass us the ones that we are interested in.
So, starting with our standard query, we need to make one small tweak. We want to know the index of each row that we’ve retrieved. We can do this by adding an additional column RowNumber as follows:
SELECT *, ROW_NUMBER() OVER (ORDER BY OrderDate) AS 'RowNumber' FROM Orders
Note that we need to supply an order by here. This is important, but you can simply use the identity (e.g. OrderID) if needed).
If we run this query, we'll receive the full set of results, with the addition of our new RowNumber column.
Any required filtering should be added to this initial query (such as "Where CustomerID=10"). We want to get a full set of results here.
This is where we start to notice the real change. Below we have a piece of SQL that can be heavily reused.
The following SQL Query will run our requested search through the data and return a subset of the data. Our paged data.
WITH TheTable AS
(SELECT *, ROW_NUMBER() OVER (ORDER BY OrderDate) AS 'RowNumber' FROM Orders)
SELECT * FROM TheTable
WHERE RowNumber BETWEEN 11 AND 20
ORDER BY RowNumber
You'll notice our initial query embedded within this larger query. This, along with the numbers (11 & 20) control that data that we get back. The rest is totally generic and will save your application a great deal of memory.