Technical question about the Twitter Search API: fetching tweets in bulk and the meaning of the count parameter

Hey there! Let's tackle your two questions about the Twitter Search API clearly and practically.

Understanding "per page" in the Twitter Search API

The "per page" in the count parameter's description refers to the number of tweets returned by a single API request. Treat each API call as asking for one "page" of results; you can request up to 100 tweets per page, since 100 is the maximum value allowed for count.

In your current script, you haven't specified the count parameter, so the API falls back to its default, which is 15 for the v1.1 standard search endpoint. If you add count=100 to your raw query (like q=%23myHashtag&geocode=59.347937,18.07243...&count=100), each request returns up to 100 tweets per "page" of results; a page can come back shorter when fewer tweets match.
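As a rough illustration of the point above, here is a minimal sketch that builds such a query string with the standard library and keeps count inside the 1 to 100 range. The helper name build_raw_query is hypothetical and not part of python-twitter; the geocode part is left out on purpose, and any extra parameter can be passed as a keyword argument.

```python
from urllib.parse import urlencode

MAX_COUNT = 100  # maximum value the API accepts for count


def build_raw_query(hashtag: str, count: int = MAX_COUNT, **extra: str) -> str:
    """Return a URL-encoded query string, clamping count to 1..100.

    Hypothetical helper for illustration only.
    """
    count = max(1, min(count, MAX_COUNT))
    # "#" is percent-encoded to %23 by urlencode
    params = {"q": f"#{hashtag}", **extra, "count": count}
    return urlencode(params)


print(build_raw_query("myHashtag"))             # q=%23myHashtag&count=100
print(build_raw_query("myHashtag", count=500))  # clamped to count=100
```

The resulting string can be passed as raw_query in the same way as the hand-written one.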

How to fetch a large number of tweets via the API

To get more than just one page of tweets, you'll need to implement pagination using the max_id parameter. Here's how it works:

  • Make your first API request (with count=100 to get the most results per call).
  • Extract the ID of the oldest tweet in the returned results.
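  • Make a follow-up request, adding max_id=[oldest_tweet_id - 1] to your query. max_id is inclusive (the API returns tweets with IDs less than or equal to it), so subtracting 1 keeps the oldest tweet from the previous page from being returned again and gives you only older tweets.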
  • Repeat these steps until the API returns no more tweets (or you hit the API rate limits).
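If you want to check the max_id arithmetic before spending real requests, the steps above can be tried offline. This is only a sketch under stated assumptions: FAKE_TIMELINE and fake_fetch are made-up stand-ins for the API (plain integer IDs, newest first, with max_id treated as inclusive), and collect_all is a hypothetical name for the loop.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for the real API: tweet IDs 250..1, newest first.
FAKE_TIMELINE = list(range(250, 0, -1))


def fake_fetch(count: int, max_id: Optional[int]) -> List[int]:
    """Return up to `count` IDs that are <= max_id (inclusive), newest first."""
    eligible = [i for i in FAKE_TIMELINE if max_id is None or i <= max_id]
    return eligible[:count]


def collect_all(fetch: Callable[[int, Optional[int]], List[int]],
                count: int = 100) -> List[int]:
    """Page through results by moving max_id below the oldest ID seen."""
    collected: List[int] = []
    max_id: Optional[int] = None
    while True:
        page = fetch(count, max_id)
        if not page:
            break  # an empty page means there is nothing older left
        collected.extend(page)
        max_id = min(page) - 1  # step just below the oldest ID on this page
    return collected


ids = collect_all(fake_fetch)
print(len(ids), len(set(ids)))  # 250 250: every ID collected, no duplicates
```

Dropping the "- 1" in this sketch makes every page start with the last ID of the previous page, which is the duplicate the subtraction is there to avoid.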

Example modified script

Here's how you can adjust your code to paginate and collect more tweets:

import twitter
import time

# Initialize the API client
api = twitter.Api(
    consumer_key="mykey",
    consumer_secret="mysecret",
    access_token_key="myaccess",
    access_token_secret="myaccesssecret"
)

all_tweets = []
max_id = None
count = 100  # Max tweets per page

while True:
    # Build the query with max_id if available
    query_parts = ["q=%23myHashtag", "geocode=59.347937,18.07243...", f"count={count}"]
    if max_id is not None:
        query_parts.append(f"max_id={max_id}")
    raw_query = "&".join(query_parts)
    
    # Fetch the page of tweets
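    results = None
    # Bounded retries: 20 one-minute waits outlast a 15-minute rate-limit window,
    # but a permanent error (e.g. bad credentials) no longer loops forever
    for attempt in range(20):
        try:
            results = api.GetSearch(raw_query=raw_query)
            break
        except Exception as e:
            print(f"Request failed: {e}. Waiting 1 minute before retrying...")
            time.sleep(60)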
    
    if not results:
        break  # No more tweets to fetch
    
    # Add to our collection
    all_tweets.extend(results)
    
    # Update max_id to the oldest tweet's ID minus 1
    oldest_tweet_id = min(tweet.id for tweet in results)
    max_id = oldest_tweet_id - 1
    
    # Print progress
    print(f"Collected {len(all_tweets)} tweets so far...")
    # Pace requests: 180 requests per 15 minutes is one request every 5 seconds
    time.sleep(5)

print(f"Done! Total tweets collected: {len(all_tweets)}")

Important notes:

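  • Rate Limits: The Twitter Search API (v1.1) has a rate limit of 180 requests per 15-minute window with user authentication. To stay under it over a long run, space requests about 5 seconds apart (900 seconds / 180 requests). A 1-second delay only slows the burst down: you would still use up the 180 requests after about three minutes and then have to wait for the window to reset.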
  • Result Freshness: The standard Search API only covers recent tweets (roughly the past 7 days), and it is not guaranteed to return every matching tweet in that window. If you need access to older historical tweets, you'll need to use the Twitter Academic Research API (if you meet the eligibility criteria).
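To make the rate-limit arithmetic concrete, here is a small sketch that derives the pacing and the per-window ceiling from the 180-requests-per-15-minutes figure. Both function names are hypothetical, and the tweet ceiling is an upper bound only, since pages are not always full.

```python
def min_interval_seconds(max_requests: int, window_seconds: int = 15 * 60) -> float:
    """Smallest delay between requests that never exceeds the window's budget."""
    return window_seconds / max_requests


def max_tweets_per_window(max_requests: int, count: int = 100) -> int:
    """Upper bound on tweets per window if every page came back full."""
    return max_requests * count


print(min_interval_seconds(180))   # 5.0 seconds between requests
print(max_tweets_per_window(180))  # 18000 tweets per 15-minute window at most
```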

The question in this content comes from Stack Exchange; it was asked by Sahand.
