Using LightSpeed 5.0.2872, specifically when paging, LightSpeed generates code (in my project) as follows:
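(The SQL snippet is missing from this copy of the thread. Based on the description in the rest of the thread, the generated paging query would have had roughly the following shape; the table and column names — RWXListings, DeletedOn, EndDTTM — come from later posts, and the sort direction and the @p0/@p1 paging parameters are assumptions.)

    -- Approximate shape only, not the verbatim LightSpeed output:
    -- every column is projected into the derived table alongside ROW_NUMBER().
    SELECT *
    FROM (
        SELECT RWXListings.*,
               ROW_NUMBER() OVER (ORDER BY RWXListings.EndDTTM DESC) AS RowNumber
        FROM RWXListings
        WHERE RWXListings.DeletedOn IS NULL
    ) AS PagedResults
    WHERE PagedResults.RowNumber > @p0
      AND PagedResults.RowNumber <= @p0 + @p1
    ORDER BY PagedResults.RowNumber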
The Reads count is 106,106 (from SQL Profiler), or approximately 829 MB. This appears to be projecting first and then filtering, because if I re-write it as follows:
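(The rewritten query is also missing; going by the description below — a CTE that projects just the primary key and RowNumber, an IN over the paged keys, and a re-sort — it would look something like this sketch, under the same naming assumptions as above.)

    -- Illustrative sketch, not the original text: only Id and RowNumber are
    -- projected inside the CTE, so a narrow index can satisfy it.
    ;WITH Keys AS (
        SELECT RWXListings.Id,
               ROW_NUMBER() OVER (ORDER BY RWXListings.EndDTTM DESC) AS RowNumber
        FROM RWXListings
        WHERE RWXListings.DeletedOn IS NULL
    )
    SELECT RWXListings.*
    FROM RWXListings
    WHERE RWXListings.Id IN (SELECT Keys.Id
                             FROM Keys
                             WHERE Keys.RowNumber > @p0
                               AND Keys.RowNumber <= @p0 + @p1)
    ORDER BY RWXListings.EndDTTM DESC  -- re-sort to preserve the order over the "in" clause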
the number of reads drops to 203, or approximately 1.6 MB. That is 0.2% of the original number of reads (CPU and duration are also greatly decreased, even with the potential double sort to maintain proper ordering). Am I missing something, or could LightSpeed's paging implementation be greatly improved?

Of course many other factors may be coming into play here (indexes, statistics, etc.)... However, LightSpeed's query text appears to project all of the resultant columns into the derived table in addition to RowNumber, from which only one page's worth is actually selected (however large that page may be). Using a CTE, I'm able to project just the primary key and RowNumber, and then perform the paging operation (and re-sort to maintain the order over the "in" clause, though even that seems to be unnecessary).

(SQL Version: SQL Server 2012 Service Pack 2, LightSpeed dataProvider="SqlServer2008")

Thanks, ~Bill
What is the query you are running in your code? The first query would be generated if we cannot translate the query into a server-side projection, so we have to select the whole table to perform a client-side projection. Note: we will never produce a CTE as part of the SQL translation, as that syntax is not available across all providers.
This is my code:
LightSpeed generates the following SQL:
If I run just the following (manually replacing @p0 and @p1):
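(The snippets above aren't preserved; the generated SQL would presumably be the same derived-table shape sketched in the first post. With the two parameters replaced by literals it would look something like the following — the values used here are purely hypothetical, since the real ones aren't given.)

    -- Hypothetical literals in place of @p0/@p1 (the actual values aren't preserved):
    SELECT *
    FROM (
        SELECT RWXListings.*,
               ROW_NUMBER() OVER (ORDER BY RWXListings.EndDTTM DESC) AS RowNumber
        FROM RWXListings
        WHERE RWXListings.DeletedOn IS NULL
    ) AS PagedResults
    WHERE PagedResults.RowNumber > 5990
      AND PagedResults.RowNumber <= 6000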
Reads are 80,591. If I convert the above as follows:
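(Again a sketch: the same CTE conversion from the first post, with the same hypothetical literals.)

    ;WITH Keys AS (
        SELECT RWXListings.Id,
               ROW_NUMBER() OVER (ORDER BY RWXListings.EndDTTM DESC) AS RowNumber
        FROM RWXListings
        WHERE RWXListings.DeletedOn IS NULL
    )
    SELECT RWXListings.*
    FROM RWXListings
    WHERE RWXListings.Id IN (SELECT Keys.Id FROM Keys
                             WHERE Keys.RowNumber > 5990 AND Keys.RowNumber <= 6000)
    ORDER BY RWXListings.EndDTTM DESC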
Reads are 256. I understand CTEs are not available across all providers, but I'm specifically using the LightSpeed "SqlServer2008" provider, and SQL Server 2008 supports CTEs (actually, SQL Server 2005-2014 as far as I can tell). I would have thought specific providers would be coded specifically for their target databases...

It just so happens that I have an index that contains "EndDTTM", "DeletedOn", and implicitly "Id", which is where the performance comes from, but projecting all columns in the derived table forces SQL to never use this index (whether it exists or not) and to incur a very costly clustered key lookup to fulfill the projection... I don't think mine is that much of an edge case; I'm simply sorting and paging entities with LightSpeed...

Please don't misunderstand, I'm not trying to criticize Mindscape or LightSpeed (it is, after all, my OR/M tool of choice). However, if I did stumble on an area for a (potentially great, 314 times fewer reads in my case) performance improvement, I'd like to see it built into LightSpeed :)

Thanks, ~Bill
Hi Bill, Thanks for the clarification. We select all columns because they are needed to hydrate the entities in the result set. If you approached this with two queries (e.g. first select the Ids, then select the entities that match those Ids), I am guessing you would see similar results in terms of the number of reads. I certainly see what you mean about CTEs being something we could leverage when using the SQL 2005 or higher providers (clearly they do perform a lot better for this type of case!). It's not really supportable in our current query pipeline, but I'll do some investigation into how we might be able to approach something like this and let you know if we can make an improvement here.
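(For illustration only, the two-query approach described here would look roughly like this in SQL; the parameter names and the shape of the IN list in the second query are assumptions.)

    -- Query 1: fetch only the page of primary keys.
    SELECT Keys.Id
    FROM (
        SELECT RWXListings.Id,
               ROW_NUMBER() OVER (ORDER BY RWXListings.EndDTTM DESC) AS RowNumber
        FROM RWXListings
        WHERE RWXListings.DeletedOn IS NULL
    ) AS Keys
    WHERE Keys.RowNumber > @p0
      AND Keys.RowNumber <= @p0 + @p1

    -- Query 2: hydrate the full entities for just those keys.
    SELECT RWXListings.*
    FROM RWXListings
    WHERE RWXListings.Id IN (@id0, @id1, @id2)  -- one parameter per key returned by query 1
    ORDER BY RWXListings.EndDTTM DESC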
I understand the need to select all columns to hydrate the entity, which is why I have the following in my code:
Also, my revised query:
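(The snippets in this post aren't preserved either; the revised query is presumably the CTE rewrite sketched in the first post, repeated here with its two parts labelled because the next sentences refer to them.)

    -- Part A: the CTE, projecting only the key and the row number.
    ;WITH Keys AS (
        SELECT RWXListings.Id,
               ROW_NUMBER() OVER (ORDER BY RWXListings.EndDTTM DESC) AS RowNumber
        FROM RWXListings
        WHERE RWXListings.DeletedOn IS NULL
    )
    -- Part B: the outer select that pages over the CTE and hydrates the full rows.
    SELECT RWXListings.*
    FROM RWXListings
    WHERE RWXListings.Id IN (SELECT Keys.Id FROM Keys
                             WHERE Keys.RowNumber > @p0 AND Keys.RowNumber <= @p0 + @p1)
    ORDER BY RWXListings.EndDTTM DESC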
appears to actually be considered a single query (both by the actual execution plan and by Profiler, as a complete SQL batch), and this part:
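(Part A of the sketch above, the CTE definition:)

    ;WITH Keys AS (
        SELECT RWXListings.Id,
               ROW_NUMBER() OVER (ORDER BY RWXListings.EndDTTM DESC) AS RowNumber
        FROM RWXListings
        WHERE RWXListings.DeletedOn IS NULL
    )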
cannot be run independently of this part:
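(Part B of the sketch above, the outer select:)

    SELECT RWXListings.*
    FROM RWXListings
    WHERE RWXListings.Id IN (SELECT Keys.Id FROM Keys
                             WHERE Keys.RowNumber > @p0 AND Keys.RowNumber <= @p0 + @p1)
    ORDER BY RWXListings.EndDTTM DESC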
The total reads remain at ~200. What I believe is happening is that your derived table, the inner:
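(Going by the approximate shape from the first post, the inner derived table would be:)

    SELECT RWXListings.*,
           ROW_NUMBER() OVER (ORDER BY RWXListings.EndDTTM DESC) AS RowNumber
    FROM RWXListings
    WHERE RWXListings.DeletedOn IS NULL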
is reading the entirety of the RWXListings table (WHERE RWXListings.DeletedOn IS NULL) simply to prepare for the final select, filtered by RowNumber:
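(i.e. the outer wrapper of that same sketch:)

    SELECT *
    FROM ( /* the derived table above */ ) AS PagedResults
    WHERE PagedResults.RowNumber > @p0
      AND PagedResults.RowNumber <= @p0 + @p1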
I understand your current query pipeline wouldn't really support a modification like this, and you therefore may never implement it, but I appreciate the consideration.

Thanks, ~Bill
Additionally, you mentioned: "If you approached this with two queries (e.g. first select the Ids, then select the entities that match those Ids), I am guessing you would see similar results in terms of the number of reads."
This would be very true if I didn't just so happen to have an index on "EndDTTM", "DeletedOn", and implicitly "Id". But by virtue of having such a small index, the reads are MUCH lower to get the PKs (Id) for the final result set (an index scan of a small index vs. a table scan of a large table). The performance gap between the two queries (LightSpeed's and my proposal) is greatly amplified the larger (or wider) the entity table is relative to the requested page size. Say my RWX_Listings table contained a billion rows and I wanted to return the 600th page (of size 1): as written, SQL would read and fully project all columns for a billion rows (potentially spilling to tempdb, etc.) just to return a single row's worth of data.

Again, I thank you for the consideration.

Thanks again, ~Bill