Selects With No Limits noreply@blogger.com (Kent Milligan – IBM Technology Expert Labs)

[[{“value”:”There’s no
limit on the number of rows that an SQL SELECT statement can return to an
application by default. There are a couple of different techniques, however,
that programmers can employ to change this default behavior. One direct option
is adding a Fetch or Limit
clause to set the maximum numbers of rows the query can return. Usage of these
clauses can prevent accidentally sending millions of rows across the network or
ensuring a query only returns the number of rows that can be displayed on the
end user interface.
Like many
syntax options, there’s nothing that prevents these clauses from being used
incorrectly. The following examples highlight a poor coding practice with these
clauses that has recently emerged in SQL performance reviews performed by our Technology
Expert Labs team. Both of these SQL examples will fail if the specified SELECT
returns more than one row.
SELECT c1 INTO :hv1 FROM t1 WHERE c2=1 AND c3=’A’ FETCH FIRST 1 ROWS ONLY

SET :hv1 = (SELECT c1 FROM t1 WHERE c2=1 AND c3=’A’ LIMIT 1)

While these
limiting clauses prevent the SQL from failing with a More Than One Row
error, there are several issues with this coding approach.
One big issue
is data integrity. If these clauses are
being used just as a safety net – meaning the query shouldn’t return more than row but it’s
added just in case that happens, then these clauses will prevent the
application from ever detecting that the query is returning more than one row.
Multiple rows being returned could be caused by a data integrity problem, but
that potential problem is hidden from the application with this coding practice.
Furthermore, if it’s normal that the query could return more than one row from
time to time, these SQL requests have told Db2 it’s okay to randomly select
which row is returned to the program. That potential randomness exists because
these SELECT statements don’t have an ORDER BY clause. In this case, Db2 is
allowed to return the selected rows in any order that it wants. The chances of
the randomness goes up even higher if these queries are eligible for parallel
processing with Db2 SMP. I think
you’d agree it’s a much cleaner and safer approach to control which row is
assigned to the host variable instead of assigning a value at random.
The other
issue is performance. These clauses require Db2 to track how many rows a query
is returning and that tracking takes time and resources. Some internal tests
show these clauses adding 0.5% overhead to the execution of a query. That
overhead may seem like nothing to worry about, but just multiply that little
overhand times thousands or millions of statement executions.
The Sky’s the Limit when it comes to the scalability of your SQL applications…
assuming that you’re utilizing good SQL coding practices. If you’re company
needs help learning efficient SQL coding practices, then our IBM Technology Expert Labs
team is here to help.”}]] Read More