SQL — Cursors and Their Usage in Databases

An introduction to working with the cursor in PostgreSQL and MS SQL Server

Published in

Stackademic

7 min readJan 14, 2024

When it comes to working with structured data, SQL is still the most important language to use. Although we have multiple options to perform data processing, like using Python as the main language and its useful frameworks, many operations can be performed easily on the database side.

However, SQL is a declarative language and has a few weaknesses, such as working with iterations over a table or part of the data. Today, I’m introducing you to a great tool inside any SQL dialect — the cursor. To put it simply, the cursor allows us to consume the results while the database engine is generating them, thus opening up many possible solutions. So, let’s start with the basics.

I use the dataset from the NASDAQ stock screener, which was also used in the previous article on SQL indexes. You can also directly access the CSV file with the dataset from my GitHub. The same dataset is used for both PostgreSQL and MS SQL Server examples.

SQL — Indexes in the Databases and Why Do We Need Them

A comprehensive explanation of indexed in relational databases with the PostgreSQL

blog.stackademic.com

medium-materials/sql-data at main · IhorLuk/medium-materials

Contribute to IhorLuk/medium-materials development by creating an account on GitHub.

github.com

Why do we need cursors in SQL?

Although cursors are not commonly used in SQL statements, they are still a crucial concept to understand. When working with a database from Python, you will undoubtedly use a cursor to execute your statements. Therefore, it is important to comprehend how it works on the SQL side. You may have used the following code structure previously to access the database.

import psycopg2

conn = psycopg2.connect()
cur = conn.cursor()

cur.execute("SELECT * FROM test_table;")
res = cur.fetchall()

cur.close()
conn.close()

In certain cases, you may want to process rows individually or in batches, particularly if you are dealing with large amounts of data and memory capacity is a concern. You can fetch a small number of rows at a time to avoid any potential memory issues. In Postgres, for-loops automatically use cursors during processing.

Additionally, you can return a cursor reference from the function. This provides an efficient way to output a large number of rows from the function, allowing the caller to use them as needed.

Basic cursor operations

Working with a cursor is a simple process that involves four main operations. These logical steps provide you with the functionality to work with data in a loop-like structure. The four operations are cursor declaration (DECLARE), opening the newly created cursor (OPEN), fetching data from the cursor (FETCH), and finally closing it once you no longer need it (CLOSE). Let me explain each of these operations in detail.

Cursor operations in one diagram. Source.

Declaring cursor

To begin with, we need to declare a new cursor. This can be done by specifying the name and processing the query. For instance, consider the following query where we want to select the top 5 companies with the highest volume of stocks that entered the market in 2021 (ipoyear). The only difference between the two syntaxes is that in the SQL Server case, we use TOP 5 instead of LIMIT 5.

-- PostgreSQL
DECLARE
    -- Declare cursor
    stock_cursor CURSOR FOR
    SELECT symbol
    FROM public.nasdaq_stocks
    WHERE ipoyear = 2021
    ORDER BY volume
    LIMIT 5;

-- MS SQL Server
DECLARE
    stock_cursor CURSOR FOR
    SELECT TOP 5 symbol
    FROM dbo.nasdaq_data
    WHERE ipoyear = 2021
    ORDER BY volume;

It’s worth mentioning that there are some options available for additional functionality. One particularly popular option is SCROLL, which allows you to move back through the cursor data. You can easily include this keyword in your declaration to enable this functionality.

-- for both cases
DECLARE
    stock_cursor SCROLL CURSOR FOR
    SELECT symbol
    FROM nasdaq_stocks;

I believe that in most cases, a bounded cursor is used which involves a query in its declaration. In both cases, we use the FOR keyword. However, there is also an option to create an unbounded cursor. I could not find a practical use case for this type since it still involves some form of query, so we will not discuss it further.

Opening

Once we declare our cursor, we need to open it to start working with it. This can be done using the OPEN keyword. It’s important to note that when we open the cursor, a new data structure is created on the database side, which allows us to access data from the cursor. However, in the case of Postgres, this step is automatically done for us, so there’s no need to perform it as a separate step.

OPEN stock_cursor;

Fetching data

We have reached the most important step of our topic which is fetching the data. This is where the cursor comes in handy as it allows us to fetch data in batches or by a single row, and perform different logic based on a certain condition. As with the previous operation, we can use the FETCH keyword to retrieve the desired amount of data. Let’s begin with a basic example and get the first rows from the cursor.

-- PostgreSQL
DECLARE
    -- Declare cursor
    stock_cursor CURSOR FOR
    SELECT symbol, volume
    FROM public.nasdaq_stocks
  ORDER BY volume DESC;
 
FETCH NEXT FROM stock_cursor;

-- SQL SERVER
DECLARE
    -- Declare cursor
    stock_cursor CURSOR FOR
    SELECT symbol, volume
    FROM dbo.nasdaq_data
ORDER BY volume DESC;

OPEN stock_cursor; -- the only difference - open the cursor
FETCH NEXT FROM stock_cursor;
-- MCOM, 345306502

It’s not useful to get only one row from the results, so we can use the LIMIT statement to retrieve more. Now, let’s perform fetching in a loop, which is possible for both SQL syntaxes. However, there are some differences between them. For example, in PostgreSQL, we use a variable to process each row in each iteration. In this example, I simply notify that the row has been processed.

DO $$ 
DECLARE
    -- Declare variables
    row_data public.nasdaq_stocks%ROWTYPE;
    -- Declare cursor
    symbol_cursor CURSOR FOR
  SELECT symbol, volume
    FROM public.nasdaq_stocks
    WHERE ipoyear = 2021
  ORDER BY volume DESC;
BEGIN
    -- Loop to fetch rows
    FOR row_data IN symbol_cursor
    LOOP
        -- Process the row
        RAISE NOTICE 'Processing row: %', row_data;

        -- Add your processing logic here
    END LOOP;
    CLOSE symbol_cursor;

END $$;

Let’s now explain the same logic for SQL Server. One of the great features of T-SQL is the @@FETCH_STATUS variable, which stops the iteration once all the data from the cursor has been processed. Additionally, we use variables to obtain data from each iteration.

GO
-- declaring variables and cursor
DECLARE @Symbol VARCHAR(50), @Volume INT;
DECLARE
    stock_cursor CURSOR FOR
    SELECT symbol, volume
    FROM dbo.nasdaq_data
 WHERE ipoyear = 2021
 ORDER BY volume DESC;

OPEN stock_cursor; -- don't forget to open for SQL Server

-- Fetch first row before iterating
FETCH NEXT FROM stock_cursor  
INTO @Symbol, @Volume;

WHILE @@FETCH_STATUS = 0
BEGIN
 -- print to see the output
 PRINT @Symbol + ' - ' + CAST(@Volume AS VARCHAR);
  
 -- fetch the following row
 FETCH NEXT FROM stock_cursor  
 INTO @Symbol, @Volume;
END

-- close the cursor in the end
CLOSE stock_cursor;  
DEALLOCATE stock_cursor;  
GO

Like in a previous example, all the data was processed separately.

You can find various useful features for cursors in T-SQL, and I recommend exploring the documentation. Different types of cursors might benefit your use cases.

Cursors (Transact-SQL) - SQL Server

Cursors (Transact-SQL)

- SQL Server Cursors (Transact-SQL)learn.microsoft.com

Closing

As a final step, you need to close your cursor and deallocate it from memory for SQL Server. Add the CLOSE statement at the end of your transaction.

-- PostgreSQL
CLOSE stock_cursor;

-- MS SQL Server
CLOSE stock_cursor;
DEALLOCATE stock_cursor;

Possible alternatives

Although it may be a useful approach, implementing and working with cursors can be more complex in general. Therefore, I suggest exploring simpler and more common options before resorting to using cursors. Some other options to consider are:

Common Table Expressions (CTEs) — these allow you to write complex queries using a usual syntax. You may be able to achieve your desired functionality without the need for iterations.
Window functions — these provide easy ways to categorize or aggregate your rows, particularly on a micro level.
Recursive queries — while not necessarily the easiest option, these could be helpful for cases involving hierarchical data.

Conclusions

Hopefully, you now have a basic understanding of the cursor and its potential use cases. However, the functionality of the cursor is vast and cannot be fully covered in one article. By understanding these examples, you should be able to delve deeper into the documentation for your specific database system. Keep in mind that if you have a complex workflow that requires precise control on a row-by-row basis, you should consider using a cursor.

I can be found on LinkedIn and I am looking forward to connecting with you. Let’s engage in discussions about the intricate world of data science and data engineering.

Stackademic

Thank you for reading until the end. Before you go:

Please consider clapping and following the writer! 👏
Follow us X | LinkedIn | YouTube | Discord
Visit our other platforms: In Plain English | CoFeed | Venture

SQL — Cursors and Their Usage in Databases

An introduction to working with the cursor in PostgreSQL and MS SQL Server

SQL — Indexes in the Databases and Why Do We Need Them

A comprehensive explanation of indexed in relational databases with the PostgreSQL

medium-materials/sql-data at main · IhorLuk/medium-materials

Contribute to IhorLuk/medium-materials development by creating an account on GitHub.

Why do we need cursors in SQL?

Basic cursor operations

Declaring cursor

Opening

Fetching data

Cursors (Transact-SQL) - SQL Server

Cursors (Transact-SQL)

Closing

Possible alternatives

Conclusions

Stackademic

Written by Ihor Lukianov