
How do I remove line breaks from a string? Help with tMap CSV field handling

Solution for tMap String Handling with Line Breaks, Quotes, and Spaces

Hey there! Let's work through this tMap string processing issue you're facing. The key problems here are handling multi-line content (line breaks), conflicting logic for quotes and spaces, and ensuring consistent results for cases like "Other vc_7days". Here's a step-by-step fix:


1. First: Remove All Line Breaks

Multi-line cells are caused by \n (Unix), \r\n (Windows), or \r (old Mac) line breaks. Start by cleaning these out, plus trimming any extra whitespace at the start/end of the string:

// Clean line breaks and trim whitespace
String cleaned = row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim();

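This regex covers the three common line break formats (\r\n, \n, and a lone \r). It does not match rarer Unicode separators such as U+2028; on Java 8 or later, the pattern "\\R" matches those as well.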
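As a quick sanity check, here is a minimal sketch that runs the same regex over all three line-ending styles. The class and method names (LineBreakDemo, stripLineBreaks) are made up for illustration and are not part of Talend or the original job.

```java
// Hypothetical helper for illustration only; not part of Talend.
class LineBreakDemo {
    // Same regex as above: matches \r\n, \n, or a lone \r, then trims the ends.
    static String stripLineBreaks(String s) {
        return s.replaceAll("\\r?\\n|\\r", "").trim();
    }

    public static void main(String[] args) {
        String unix = "Other\n vc_7days";      // Unix line break
        String windows = "Other\r\n vc_7days"; // Windows line break
        String oldMac = "Other\r vc_7days";    // old Mac line break

        // All three print the same single-line value.
        System.out.println(stripLineBreaks(unix));
        System.out.println(stripLineBreaks(windows));
        System.out.println(stripLineBreaks(oldMac));
    }
}
```

One thing to watch: the break is deleted, not replaced, so "Other\nvc_7days" becomes "Othervc_7days". If a line break is the only thing separating two words in your data, replacing it with " " instead of "" is probably what you want.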


2. Optimize Quote & Space Logic (Avoid Conflicts)

Your original code has two separate checks that can conflict (e.g., a string with both quotes and spaces). Let's prioritize and combine the logic to ensure predictable results:

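Option A: Use a Single Nested Expression (for the tMap Expression Field)

If you want to write this directly in tMap's expression field, use this nested approach (we'll handle quotes first, then spaces). It has to repeat the cleaning call everywhere because a single expression can't hold an intermediate variable, and it assumes row4.Ad_Set_Name is never null (a null value throws a NullPointerException):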

// Full expression for tMap
(
    // Step 1: Clean line breaks and trim
    row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().contains("\"") 
    ? 
        // Step 2: Handle quotes (only if paired)
        (row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().indexOf("\"") != row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().lastIndexOf("\"") 
        ? 
            row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().substring(row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().indexOf("\"")+1, row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().lastIndexOf("\"")) 
        : 
            "null") 
    : 
        row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim()
).contains(" ") 
? 
    // Step 3: Extract text before first space (after quote handling)
    (
        row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().contains("\"") 
        ? 
            (row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().indexOf("\"") != row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().lastIndexOf("\"") 
            ? 
                row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().substring(row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().indexOf("\"")+1, row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().lastIndexOf("\"")) 
            : 
                "null") 
        : 
            row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim()
    ).substring(0, (
        row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().contains("\"") 
        ? 
            (row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().indexOf("\"") != row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().lastIndexOf("\"") 
            ? 
                row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().substring(row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().indexOf("\"")+1, row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().lastIndexOf("\"")) 
            : 
                "null") 
        : 
            row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim()
    ).indexOf(" ")) 
: 
    // Keep the cleaned string if no space exists
    (
        row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().contains("\"") 
        ? 
            (row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().indexOf("\"") != row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().lastIndexOf("\"") 
            ? 
                row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().substring(row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().indexOf("\"")+1, row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim().lastIndexOf("\"")) 
            : 
                "null") 
        : 
            row4.Ad_Set_Name.replaceAll("\\r?\\n|\\r", "").trim()
    )

Option B: Use a Custom Java Method (Better Readability & Maintainability)

For cleaner code (especially if you need to reuse this logic), create a custom utility class in Talend:

// Create this as a custom routine in Talend (Repository > Code > Routines);
// Talend keeps routines in the "routines" package, so keep the generated package line
public class AdSetNameProcessor {
    public static String process(String input) {
        // Handle null input upfront
        if (input == null) return "null";
        
        // Step 1: Remove line breaks and trim whitespace
        String temp = input.replaceAll("\\r?\\n|\\r", "").trim();
        
        // Step 2: Process paired quotes
        if (temp.contains("\"")) {
            int firstQuote = temp.indexOf("\"");
            int lastQuote = temp.lastIndexOf("\"");
            // Only extract content if quotes are paired
            if (firstQuote != lastQuote) {
                temp = temp.substring(firstQuote + 1, lastQuote);
            } else {
                return "null";
            }
        }
        
        // Step 3: Extract text before first space
        if (temp.contains(" ")) {
            temp = temp.substring(0, temp.indexOf(" "));
        }
        
        // Return "null" if the final string is empty
        return temp.isEmpty() ? "null" : temp;
    }
}

Then call this method directly in your tMap expression:

AdSetNameProcessor.process(row4.Ad_Set_Name)

3. Test with Your Example

For input "Other vc_7days" (with the double quotes as part of the value):

  1. Line breaks are removed (none here), string becomes "Other vc_7days"
  2. Paired quotes are detected, extract content inside: Other vc_7days
  3. Space is detected, extract text before first space: Other

If you want to keep the full quoted content instead of trimming at the space, just remove the space-handling step from the logic!
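To make that last point concrete, here is a rough sketch of the same logic with the space-handling step made optional. The class name AdSetNameSketch and the keepFullQuoted flag are my own inventions for illustration; they are not in the method above, and you would adapt this to your own routine.

```java
// Hypothetical variant of AdSetNameProcessor, for illustration only.
class AdSetNameSketch {
    // keepFullQuoted = true skips the "cut at the first space" step.
    static String process(String input, boolean keepFullQuoted) {
        if (input == null) return "null";

        // Step 1: remove line breaks and trim whitespace
        String temp = input.replaceAll("\\r?\\n|\\r", "").trim();

        // Step 2: take the content between the first and last quote
        int first = temp.indexOf('"');
        int last = temp.lastIndexOf('"');
        if (first >= 0) {
            if (first == last) return "null"; // a single, unpaired quote
            temp = temp.substring(first + 1, last);
        }

        // Step 3 (optional): keep only the text before the first space
        if (!keepFullQuoted) {
            int space = temp.indexOf(' ');
            if (space >= 0) temp = temp.substring(0, space);
        }

        return temp.isEmpty() ? "null" : temp;
    }

    public static void main(String[] args) {
        String input = "\"Other vc_7days\""; // quotes are part of the value
        System.out.println(process(input, false)); // prints: Other
        System.out.println(process(input, true));  // prints: Other vc_7days
    }
}
```

In tMap you would then call something like AdSetNameSketch.process(row4.Ad_Set_Name, true), assuming you saved the class as a routine under that name.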


This question originally comes from Stack Exchange; asked by Mara M.
