Teradata左连接行数异常及Error 3807/3653报错求助

阿华AIGC实验室

2026-5-8

Let's break down your issues one by one and fix them step by step:

1. Why your LEFT JOIN returns way more rows than expected

Your base volatile_table1 has 559,157 rows, but the LEFT JOIN returns over 2 million rows. The root cause is duplicate matching records in database_table1 (alias b) or volatile_table2 (alias c) for the join keys.

For example, if one row in a matches 3 rows in b and 2 rows in c, that single row from a will expand to 3*2=6 rows in the result. Adding DISTINCT won’t fix this because the columns from b/c are different across those matched rows, so they can’t be deduplicated.

Fix: Aggregate `b` and `c` first to ensure unique join keys

Before joining, pre-aggregate b and c so each a.column1 maps to exactly one row in each table. This prevents row explosion while preserving the data you need:

WITH aggregated_b AS (
    SELECT 
        b.column0, 
        MAX(b.column1) AS column1,  -- Use MAX/MIN/STRING_AGG based on your actual data needs
        MAX(b.column2) AS column2,
        MAX(b.column3) AS column3,
        MAX(b.column4) AS column4,
        MAX(b.column5) AS column5,
        MAX(b.column6) AS column6
    FROM database_table1 b
    GROUP BY b.column0
),
aggregated_c AS (
    SELECT 
        c.column2,
        MAX(c.column3) AS column3,
        SUM(c.column3) AS Total_Column3_Profit,
        AVG(c.column3) AS Column3_Profit_Average
    FROM volatile_table2 c
    GROUP BY c.column2
)
SELECT 
    a.*,
    b.column1, b.column2, b.column3, b.column4, b.column5, b.column6,
    c.column3, c.Total_Column3_Profit, c.Column3_Profit_Average,
    CASE WHEN b.column5 < a.column6 THEN 1 ELSE 0 END AS column_open_flag,
    CASE 
        WHEN a.column4 < 580 THEN 1
        WHEN a.column4 BETWEEN 580 AND 619 THEN 2
        WHEN a.column4 BETWEEN 620 AND 639 THEN 3
        WHEN a.column4 BETWEEN 640 AND 659 THEN 4
        WHEN a.column4 BETWEEN 660 AND 679 THEN 5
        WHEN a.column4 BETWEEN 680 AND 699 THEN 6
        WHEN a.column4 BETWEEN 700 AND 739 THEN 7
        WHEN a.column4 >= 740 THEN 8
        ELSE 0 
    END AS column4_band
FROM volatile_table1 a
LEFT JOIN aggregated_b b ON a.column1 = b.column0
LEFT JOIN aggregated_c c ON a.column1 = c.column2;

2. Error 3807: Object 'a' does not exist

Looking at your first UNION attempt, the third SELECT block has a critical scope mistake:

SELECT c.column3, SUM(c.column3) AS Total_Column3_Profit, AVG(c.column3) AS Column3_Profit_Average 
FROM volatile_table2 c 
GROUP BY a.column1 ,a.column2 ,a.column3 ,a.column4 ,a.column5 ,a.column6 ,b.column1 ,b.column2 ,b.column3 ,b.column4 ,b.column5 ,b.column6 ,c.column3

You’re grouping by columns from a and b, but this SELECT only references table c—there’s no a or b in this query scope. That’s why Teradata throws the "Object 'a' does not exist" error.

On top of that, UNION requires all SELECT branches to have the same number of columns with matching data types, which your query doesn’t (first branch has 8 columns, second has 6, third has 3). That’s the root of the next error you encountered.

3. Error 3653: All select-lists do not contain the same number of expressions

Your second modified query has two major issues:

The UNION ALL branches have mismatched column counts: first has 6 columns, second has 6, third has 3 (c.column3, SUM, AVG). Teradata strictly requires every branch in UNION/UNION ALL to return the same number of columns with compatible data types.
The outer query references a.column1, b.column1, etc., but the subquery is aliased as d—the outer scope has no access to a, b, or c anymore.

Fix: Stick with LEFT JOIN (not UNION) for your use case

UNION is for combining rows vertically (adding more rows to the result), but your goal is to combine columns horizontally (append data from b and c to each row in a). LEFT JOIN is the right tool here—you just needed to fix the row explosion issue as explained in point 1.

If you absolutely need to use UNION for edge cases, every branch must return the same number of columns. For example:

-- This fixes column count, but doesn't achieve your original goal of merging a/b/c data per row
SELECT 
    a.column1, a.column2, a.column3, a.column4, a.column5, a.column6,
    CASE WHEN b.column5 < a.column6 THEN 1 ELSE 0 END AS column_open_flag,
    CASE 
        WHEN a.column4 < 580 THEN 1
        WHEN a.column4 BETWEEN 580 AND 619 THEN 2
        WHEN a.column4 BETWEEN 620 AND 639 THEN 3
        WHEN a.column4 BETWEEN 640 AND 659 THEN 4
        WHEN a.column4 BETWEEN 660 AND 679 THEN 5
        WHEN a.column4 BETWEEN 680 AND 699 THEN 6
        WHEN a.column4 BETWEEN 700 AND 739 THEN 7
        WHEN a.column4 >= 740 THEN 8
        ELSE 0 
    END AS column4_band,
    NULL AS Total_Column3_Profit, NULL AS Column3_Profit_Average
FROM volatile_table1 a
LEFT JOIN database_table1 b ON a.column1 = b.column0
UNION ALL
SELECT 
    b.column0, b.column1, b.column2, b.column3, b.column4, b.column6,
    NULL, NULL, NULL, NULL
FROM database_table1 b
UNION ALL
SELECT 
    c.column2, NULL, NULL, NULL, NULL, NULL,
    NULL, NULL, SUM(c.column3), AVG(c.column3)
FROM volatile_table2 c
GROUP BY c.column2;

But this will return disjoint rows (a’s rows with b data, then b’s rows alone, then c’s aggregates alone)—not the merged result you want. Stick with the pre-aggregated LEFT JOIN approach for your original requirement.

内容的提问来源于stack exchange，提问作者InvisibleInk