
Help needed: Curl returns 302 when scraping Twitter search, but the browser and Postman return 200

Fix for Twitter Search 302 Redirect with Curl

First off, I spot a critical mistake in your code that's definitely contributing to the 302 issue: you're setting the HTTP header after calling curl_exec(). That line does nothing because the request has already been sent by the time you set CURLOPT_HTTPHEADER. Let's fix that first, then address Twitter's anti-scraping checks that are likely blocking your request.

Step 1: Fix the Header Order & Add Required Configs

Move the CURLOPT_HTTPHEADER line before curl_exec($ch), and add additional settings to mimic a real browser session. Here's the corrected code:

// $scrollCursor is assumed to be defined earlier in your script (the pagination cursor)
$param = "?f=tweets&q=+LAPOR1708&src=typd&max_position=".$scrollCursor;
$url = "https://twitter.com/i/search/timeline".$param;

$ch = curl_init();
curl_setopt($ch, CURLOPT_VERBOSE, true);
// Use a modern user-agent instead of the outdated Firefox 2.0 one
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36');
curl_setopt($ch, CURLOPT_URL, $url);
// Set headers BEFORE executing the request
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language: en-US,en;q=0.5",
    "Referer: https://twitter.com/search?q=LAPOR1708",
    "DNT: 1"
]);
// Follow 302 redirects (Twitter often uses these to set up sessions)
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// Persist cookies to maintain a valid session, like a browser does
curl_setopt($ch, CURLOPT_COOKIEJAR, 'twitter_cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'twitter_cookies.txt');

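// Return the body as a string instead of printing it straight to output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$result = curl_exec($ch);
if ($result === false) {
    // Surface transport-level failures (DNS, TLS, timeouts)
    echo 'Curl error: ' . curl_error($ch);
}
$info = curl_getinfo($ch);
curl_close($ch);
// dd() is the Laravel/Symfony dump-and-die helper; use var_dump($info) elsewhere.
// It is called last because it stops the script, so nothing after it would run.
dd($info);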
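
The URL in the code above is built by string concatenation, which leaves the query and cursor unencoded. As a minimal sketch, the same URL could be assembled with http_build_query() so that special characters are escaped. The helper name buildSearchUrl is hypothetical and not part of the original code; the endpoint and parameter names are copied from the snippet above.

```php
<?php
// Hypothetical helper (not from the original question): builds the search
// timeline URL and URL-encodes every parameter value.
function buildSearchUrl(string $query, ?string $cursor = null): string
{
    $params = [
        'f'   => 'tweets',
        'q'   => $query,
        'src' => 'typd',
    ];
    // Only send max_position once a pagination cursor is available
    if ($cursor !== null && $cursor !== '') {
        $params['max_position'] = $cursor;
    }
    // Pass '&' explicitly so the result does not depend on php.ini settings
    return 'https://twitter.com/i/search/timeline?' . http_build_query($params, '', '&');
}
```

The result can be passed to CURLOPT_URL in place of the concatenated $url, for example buildSearchUrl('LAPOR1708', $scrollCursor).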

Step 2: Why This Fixes the 302 Issue

Twitter's anti-scraping systems now check far more than just a user-agent. Here's what each change does:

  • Headers Set Before Execution: curl_setopt() only affects requests sent afterwards, so setting CURLOPT_HTTPHEADER before curl_exec() ensures Twitter actually receives the Accept, Accept-Language, and Referer headers.
  • Modern User-Agent: Your old Firefox 2.0 agent is an easy signal of automated traffic. A current browser agent blends in with regular traffic.
  • Referer Header: Tells Twitter your request originated from the search page, mimicking a user navigating normally.
  • Cookie Persistence: Twitter requires valid session cookies to serve timeline data. By storing and reusing cookies, you maintain a session just like a browser.
  • Follow Redirects: The 302 is likely a temporary redirect to set up a valid session; enabling this lets Curl follow that redirect instead of stopping at the 302 response.
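
Before relying on these changes, it can help to see where the 302 actually points. The sketch below sends the request once without following redirects and classifies the outcome from the array returned by curl_getinfo(). The function names and category strings are hypothetical, and matching "login" in the redirect target is only an assumption about what a blocked request might look like.

```php
<?php
// Hypothetical helper: classify a response from the curl_getinfo() array.
function classifyResponse(array $info): string
{
    $code   = (int) ($info['http_code'] ?? 0);
    $target = (string) ($info['redirect_url'] ?? '');

    if ($code >= 200 && $code < 300) {
        return 'ok';
    }
    if ($code >= 300 && $code < 400) {
        // Assumption: a redirect whose target mentions "login" means the
        // request was treated as unauthenticated
        return stripos($target, 'login') !== false ? 'login_redirect' : 'other_redirect';
    }
    return 'error';
}

// Hypothetical helper: perform one request WITHOUT following redirects,
// so the 302 and its Location target stay visible.
function probeWithoutFollowing(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    curl_exec($ch);
    $info = curl_getinfo($ch);
    curl_close($ch);
    return $info;
}
```

Calling classifyResponse(probeWithoutFollowing($url)) and printing the 'redirect_url' entry shows whether the redirect is a session setup step or a bounce to a login page; in the second case, following the redirect would only return the login HTML.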

Step 3: Extra Troubleshooting If You Still Get 302s

If the above doesn't work, try these steps:

  1. Manually Add Fresh Cookies: Log into Twitter in your browser, copy the cookies from dev tools (Application > Cookies > twitter.com), and add them directly to the headers:
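    // Setting CURLOPT_HTTPHEADER again replaces the earlier list,
    // so repeat the other headers here as well
    curl_setopt($ch, CURLOPT_HTTPHEADER, [
        // ... other headers
        "Cookie: auth_token=YOUR_AUTH_TOKEN; ct0=YOUR_CT0_COOKIE; ..."
    ]);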
    
    Note: Cookies expire, so you'll need to refresh them periodically.
  2. Check for JavaScript Rendering: Twitter might be serving content via JavaScript now. If the timeline data isn't in the raw HTML response, you may need to use a headless browser like Puppeteer instead of Curl.
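
As an alternative for the first step above, the cookies could be sent through CURLOPT_COOKIE, which leaves the existing CURLOPT_HTTPHEADER list untouched. This is a minimal sketch: the helper names are hypothetical, and the cookie names in the usage line are simply the ones mentioned in the step above, with placeholder values.

```php
<?php
// Hypothetical helper: turn a name => value map into the "a=1; b=2"
// format expected by CURLOPT_COOKIE.
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    // Multiple cookies are separated by a semicolon followed by a space
    return implode('; ', $pairs);
}

// Hypothetical helper: attach the cookies to an existing curl handle
// without overwriting any headers set via CURLOPT_HTTPHEADER.
function applyCookies($ch, array $cookies): void
{
    curl_setopt($ch, CURLOPT_COOKIE, buildCookieHeader($cookies));
}
```

Usage would look like applyCookies($ch, ['auth_token' => 'YOUR_AUTH_TOKEN', 'ct0' => 'YOUR_CT0_COOKIE']) before calling curl_exec($ch).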

Keep in mind that Twitter actively blocks scraping, so these fixes might stop working over time as they update their anti-scraping measures. Always ensure you're complying with Twitter's Terms of Service when scraping their data.

This question was originally asked on Stack Exchange by Rangga Rizky.
