Splitting Comma-Separated Lists into Rows in SAP Datasphere with SQLScript

Q: Which SQL functions are key to this solution?

OCCURRENCES_REGEXPR (to count commas), SERIES_GENERATE_INTEGER (to generate row numbers dynamically), SUBSTR_BEFORE, SUBSTRING, and LOCATE_REGEXPR (to extract text parts between commas).

Fotios06 November 2025 9 min read

When working with business data, it’s not uncommon to find columns that store multiple values in a single field, often separated by commas.
For example, an account may have multiple classifications (A1,A2,A3), or a sales order might include several product codes (P01,P02,P03) within the same record.
While this approach might simplify the source system’s structure, it makes analytics more complex, because most reporting or modeling tools prefer normalized data, where each value appears in its own row.
In SAP Datasphere (and SAP HANA), this type of transformation can be done efficiently without using loops, thanks to SQLScript built-in functions like SERIES_GENERATE_INTEGER and OCCURRENCES_REGEXPR.

In this article, we’ll explore a real-world use case and walk through a SQLScript that cleanly and dynamically splits comma-separated lists into multiple rows.

The Business Use Case

Imagine getting data from multiple systems where account codes or classifications come packed into one string. We’ll unpack this typical scenario and understand why separating values into distinct rows is essential for clean data models.

Let’s say your company collects data from various financial systems, and one of them provides an Account field like this:

01.Account_Field_SQLScript

This structure causes issues when:

You need to join with a master data table on individual account codes.
You want to aggregate or filter based on each specific account.
Or you’re building a data model that feeds into SAP Analytics Cloud and requires atomic values.

The goal is to transform each comma-separated list into separate rows like this:

02. Split_Lists_SQLScript

Watch the recording of our webinar:

"SAP Datasphere and the Databricks Lakehouse Approach"

The Core Challenge - The Set-Based Way to Split Lists

Loops may sound like a quick fix and you might be tempted to solve this with a loop or procedural logic, iterating through each record, counting commas, and slicing text.
But in SQL-based engines like SAP Datasphere, loops are inefficient and break the idea of set-based processing.

The code snippet below provides a fully set-based solution. It uses only SQLScript functions to detect how many elements exist in each list, to dynamically generate the required number of rows and extract each substring corresponding to its position.

03.SQLScript_Full

The SQLScript Explained

Let’s break it down step by step.

Step 1: Identify the Maximum Number of Commas

Count the Commas in Each Row. Start by finding out how many elements each list holds. With OCCURRENCES_REGEXPR, you can easily detect the number of commas — and therefore, how many new rows you’ll need to create.

lt_table = SELECT 
   "Id",
   "Account",
   OCCURRENCES_REGEXPR(',' IN "Account") AS "occurrences"
FROM "Accounts_Table";

This part creates a local table that scans through the chosen column (“Account” for this example) and counts the number of commas in each row using OCCURRENCES_REGEXPR.

If a row has A1,A2,A3 the result is 2 (two commas).
We’ll later use this information to determine how many rows to generate.

Step 2: Generate a Dynamic Row Series

Instead of hard-coding anything, create a flexible number series using SERIES_GENERATE_INTEGER. This ensures your transformation can handle any list length automatically.

DECLARE lv_occ INT;
SELECT MAX("occurrences") + 2 INTO lv_occ FROM :lt_table;
lt_numbers = SELECT "ELEMENT_NUMBER"
             FROM SERIES_GENERATE_INTEGER(1, 1, :lv_occ);

Here, SERIES_GENERATE_INTEGER creates a virtual table of integers (1, 2, 3, … n),
where n is based on the maximum number of commas found + 2.
In order to use it dynamically, a variable is created (lv_occ) that holds the maximum occurrences +2 value.

Why +2?
Because if the maximum number of commas is 5, that means there are 6 actual values (5 commas + 1 element) and we add one more for an open-ended range.
This ensures we generate enough rows to cover all possible list elements.

Step 3: Cross Join and Split Dynamically

The magic happens here: here’s where the transformation happens. Using a CROSS JOIN and string functions like SUBSTR_BEFORE and LOCATE_REGEXPR, you can extract every value cleanly into its own row.

SELECT
    T."Id",
    T."Account",
    CASE 
        WHEN "occurrences" = 0 THEN TRIM("Account")
        WHEN S."ELEMENT_NUMBER" = 1
            THEN TRIM(SUBSTR_BEFORE(T."Account", ','))
        WHEN LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER") = 0
            THEN TRIM(SUBSTRING(T."Account",
                 LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER" - 1) + 1))
        ELSE TRIM(SUBSTRING(
                 T."Account",
                 LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER" - 1) + 1,
                 LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER")
                 - LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER" - 1)
                 - 1))
    END AS "Splitted"
FROM :lt_table AS T
CROSS JOIN :lt_numbers AS S
WHERE S."ELEMENT_NUMBER" <= OCCURRENCES_REGEXPR(',' IN T."Account") + 1;

In this part of code:

The CROSS JOIN pairs each record with a number series (1, 2, 3...) — one for each possible element in the list.
For example, a record with 3 elements will get 3 matching rows.
Depending on the element number, the code uses SUBSTR_BEFORE, SUBSTRING, and LOCATE_REGEXPR to extract the right part of the text between commas.

No loops or procedural iteration are used, just pure SQL text functions and dynamic row generation.

Making Sense of the CASE WHEN Logic

Now that we’ve seen how the script joins our data with a series of numbers, let’s take a closer look at the CASE WHEN block. Each condition in the CASE block has a purpose — from handling records without commas to correctly extracting the first, middle, and last elements. This is where the actual text splitting happens. Let’s simplify what happens behind each case.
Here’s the part of the CASE WHEN code again:

CASE 
    WHEN "occurrences" = 0 THEN TRIM("Account")
    WHEN S."ELEMENT_NUMBER" = 1
        THEN TRIM(SUBSTR_BEFORE(T."Account", ','))
    WHEN LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER") = 0
        THEN TRIM(SUBSTRING(
                T."Account",
                LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER" - 1) + 1
             ))
    ELSE TRIM(SUBSTRING(
                T."Account",
                LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER" - 1) + 1,
                LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER")
                - LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER" - 1)
                - 1
             ))
END AS "Splitted"

At first glance, it looks complicated, but it’s actually just handling four different situations. Let’s break them down in plain language.

1. When there are no commas at all

No commas? Just keep the original value.

WHEN "occurrences" = 0 THEN TRIM("Account")

If the column doesn’t have any commas, there’s nothing to split.
The code simply returns the original value. The TRIM function is used here in order to remove any leading or trailing spaces.

Example:
B1 → B1

It’s a simple “do nothing” case that keeps single-value records intact.

2. The first element in the list

First item? Grab everything before the first comma.

WHEN S."ELEMENT_NUMBER" = 1
THEN TRIM(SUBSTR_BEFORE(T."Account", ','))

When the element number is 1, we’re dealing with the first value in the list.

The SUBSTR_BEFORE function is used and simply takes everything before the first comma.

Example:
A1,A2,A3 → A1

This is usually the cleanest case since the function doesn’t have to calculate any positions.

3. The last element in the list

Last item? Take what comes after the last comma.

WHEN LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER") = 0
THEN TRIM(SUBSTRING(
T."Account",
LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER" - 1) + 1
))

When there are no more commas left to find, it means we’ve reached the end of the string.
Here, the code looks for the position of the last comma, adds one, and takes everything after that point.
The OCCURRENCE part in the LOCATE_REGEXPR function is expecting an integer after it so that it knows in which occurrence time of the comma it looks for (e.g the 1st time it is found, the 2nd etc.). By using the “ELEMENT_NUMBER” column, the code looks dynamically for the occurrences of each row.

Example:
A1,A2,A3 → A3

So this case handles the tail end of each list.

4. Everything in between

Middle items? Extract what lies between commas.

ELSE TRIM(SUBSTRING(
T."Account",
LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER" - 1) + 1,
LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER")
- LOCATE_REGEXPR(',' IN T."Account" OCCURRENCE S."ELEMENT_NUMBER" - 1)
- 1
))

This one covers all the “middle” values, anything that isn’t the first or last element.
The idea is to find the comma before and the comma after the current value, then extract everything between them.

Example:
A1,A2,A3,A4 → A2, A3

The LOCATE_REGEXPR calls are just finding the positions of those commas so that SUBSTRING can pull out the right piece. The SUBSTRING function expects the text to take the substring for (“Account”), the position it should start from (1,2,3 etc.) and the number of characters to march forward( 1,2,3 etc.) as its parameters.

In this part, the position it should start from is the first occurrence of comma it finds + 1, so that the comma is not included in the text it extracts, and the length of characters to extract is dynamically created for each row using the next and previous comma occurrence position (therefore the subtraction part of the LOCATE_REGEXPR expressions).
Each rule ensures your data stays consistent and readable.

Putting It All Together in SAP Datasphere

Once you run the full SQLScript view, you’ll see each comma-separated list expanded into separate rows — ready for joins, aggregations, and analytics in SAP Datasphere or SAP Analytics Cloud.

04.Full_SQLScript_View_SAP_Datasphere

Splitting Comma-Separated Lists into Rows with SQLScript: Our Conclusion

Comma-separated fields might seem like a harmless shortcut in source systems, but they quickly become a barrier to reliable analytics and scalable data models. With a few powerful SQL functions, you can elegantly transform complex string data into a clean, normalized format — all without loops or heavy scripting. The approach shown here runs efficiently in SAP Datasphere, making it ideal for large data volumes and near-real-time transformations.

If your data landscape still contains legacy patterns that slow down analytics, our team can help. We specialize in designing performant, cloud-ready data models, optimizing SQLScript transformations, and integrating SAP Datasphere seamlessly into your architecture.

Let’s discuss how you can modernize your data layer and unlock better reporting performance — get in touch with us for a short consulting session or hands-on use-cases.

Don't miss out on our free webinar "Bridging Business and Analytics: The Plug-and-Play Future of Data Platforms" next week! Register here.

FAQ - Splitting Comma Separated List with SQLScript

These are some of the most frequently asked questions about Splitting Comma Separated List into Rows in SAP Datasphere with SQLScript.

Why is it a problem to store multiple values in one field separated by commas? Because such structures prevent proper joins, filtering, and aggregation. Analytical tools like SAP Datasphere or SAP Analytics Cloud require normalized data — one value per row — for consistent reporting and modeling.

Can this transformation be done without loops in SAP Datasphere? Yes. The blog shows a fully set-based SQLScript approach using built-in functions like SERIES_GENERATE_INTEGER and OCCURRENCES_REGEXPR, avoiding loops entirely.

What is the benefit of a set-based solution compared to procedural loops? Set-based processing runs in parallel and is optimized for SQL engines, resulting in higher performance and scalability for large datasets.

Which SQL functions are key to this solution? OCCURRENCES_REGEXPR (to count commas), SERIES_GENERATE_INTEGER (to generate row numbers dynamically), SUBSTR_BEFORE, SUBSTRING, and LOCATE_REGEXPR (to extract text parts between commas).

How does the CROSS JOIN work in this context? Each record is paired with a generated series of numbers representing element positions, producing multiple rows per original record — one for each comma-separated value.

Can this method be reused for other list-like fields? Absolutely. It can be applied to any text column with delimited values (comma, semicolon, etc.) in SAP Datasphere or HANA — simply adjust the delimiter in the functions.

Fotios

Fotios Vlantis has been in the SAP world as an ERP consultant in Greece since 2021. He is currently working as a SAP BW/BI consultant with his special focus being on SAP BW/4HANA. As a former javelin throw athlete, he likes to play sports, read and travel.

Splitting Comma-Separated Lists into Rows in SAP Datasphere with SQLScript

The Business Use Case

Watch the recording of our webinar:

"SAP Datasphere and the Databricks Lakehouse Approach"

The Core Challenge - The Set-Based Way to Split Lists

The SQLScript Explained

Making Sense of the CASE WHEN Logic

Putting It All Together in SAP Datasphere

Splitting Comma-Separated Lists into Rows with SQLScript: Our Conclusion

FAQ - Splitting Comma Separated List with SQLScript

Fotios

Got a question about this blog?
Ask Fotios

Blog - NextLytics AG

Subscribe to our newsletter

Related Posts

Recent Posts

Posts by Tags

Splitting Comma-Separated Lists into Rows in SAP Datasphere with SQLScript

The Business Use Case

Watch the recording of our webinar:

"SAP Datasphere and the Databricks Lakehouse Approach"

The Core Challenge - The Set-Based Way to Split Lists

The SQLScript Explained

Making Sense of the CASE WHEN Logic

Putting It All Together in SAP Datasphere

Splitting Comma-Separated Lists into Rows with SQLScript: Our Conclusion

FAQ - Splitting Comma Separated List with SQLScript

Fotios

Got a question about this blog?Ask Fotios

Blog - NextLytics AG

Subscribe to our newsletter

Related Posts

Recent Posts

Posts by Tags

Got a question about this blog?
Ask Fotios