Technical question about the Twitter Search API: fetching tweets in bulk and the meaning of the count parameter
Hey there! Let's tackle your two questions about the Twitter Search API clearly and practically.
The count parameter's "per page" refers to the number of tweets returned in a single API request. Think of each API call as asking for one "page" of results: you can request up to 100 tweets per page, since 100 is the maximum allowed value for count.
In your current script, you haven't specified the count parameter, so the API falls back to its default, which is smaller (15 in v1.1 of the Search API). If you added count=100 to your raw query (like q=%23myHashtag&geocode=59.347937,18.07243...&count=100), each request would return up to 100 tweets per "page" of results.
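If you'd rather not concatenate the query string by hand, it can be built from a dictionary. This is only a sketch: build_raw_query is a hypothetical helper (not part of python-twitter), and it assumes the resulting string is passed to GetSearch(raw_query=...) the same way your script does it.

```python
from urllib.parse import urlencode


def build_raw_query(term, count=100, max_id=None, extra=None):
    """Build a raw query string for the search endpoint.

    Hypothetical helper for illustration only.
    `extra` is an optional dict of further parameters (e.g. geocode).
    """
    if not 1 <= count <= 100:
        raise ValueError("count must be between 1 and 100")
    params = {"q": term, "count": count}
    if extra:
        params.update(extra)
    if max_id is not None:
        params["max_id"] = max_id
    # safe="," leaves commas unescaped, e.g. in a geocode value
    return urlencode(params, safe=",")


print(build_raw_query("#myHashtag"))
print(build_raw_query("#myHashtag", max_id=500))
```

The "#" is percent-encoded to %23 automatically, so you can pass the hashtag as plain text.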
To get more than just one page of tweets, you'll need to implement pagination using the max_id parameter. Here's how it works:
- Make your first API request (with count=100 to get the most results per call).
- Extract the ID of the oldest tweet in the returned results.
- Make a follow-up request, adding max_id=[oldest_tweet_id - 1] to your query. This tells the API to return tweets older than the oldest tweet from your previous request.
- Repeat these steps until the API returns no more tweets (or you hit the API rate limits).
Example modified script
Here's how you can adjust your code to paginate and collect more tweets:
    import twitter
    import time

    # Initialize the API client
    api = twitter.Api(
        consumer_key="mykey",
        consumer_secret="mysecret",
        access_token_key="myaccess",
        access_token_secret="myaccesssecret"
    )

    all_tweets = []
    max_id = None
    count = 100  # Max tweets per page

    while True:
        # Build the query with max_id if available
        query_parts = ["q=%23myHashtag", "geocode=59.347937,18.07243...", f"count={count}"]
        if max_id:
            query_parts.append(f"max_id={max_id}")
        raw_query = "&".join(query_parts)

        # Fetch the page of tweets
        try:
            results = api.GetSearch(raw_query=raw_query)
        except Exception as e:
            print(f"Request failed: {e}. Waiting 1 minute before retrying...")
            time.sleep(60)
            continue

        if not results:
            break  # No more tweets to fetch

        # Add to our collection
        all_tweets.extend(results)

        # Update max_id to the oldest tweet's ID minus 1
        oldest_tweet_id = min(tweet.id for tweet in results)
        max_id = oldest_tweet_id - 1

        # Print progress
        print(f"Collected {len(all_tweets)} tweets so far...")

        # Small delay to avoid hitting rate limits
        time.sleep(1)

    print(f"Done! Total tweets collected: {len(all_tweets)}")
Important notes:
- Rate Limits: The Twitter Search API (v1.1) has a rate limit of 180 requests per 15-minute window. Adding a small delay between requests (like time.sleep(1)) keeps you from burning through the window too quickly, but it won't keep a long run under the limit on its own: 180 requests per 900 seconds works out to one request every 5 seconds.
- Result Freshness: The standard Search API only returns recent tweets (roughly the last 7 days). If you need access to older historical tweets, you'll need to use the Twitter Academic Research API (if you meet the eligibility criteria).
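If you want the delay to follow from the rate limit instead of being a guess, here is one way to sketch it. Pacer and min_interval are hypothetical names of my own, and the 180-per-15-minutes figure is just the one quoted above, so adjust it if your limit differs. The clock and sleep arguments are only there so the pacing can be tried out without actually waiting.

```python
import time

WINDOW_SECONDS = 15 * 60        # length of one rate-limit window
MAX_REQUESTS_PER_WINDOW = 180   # figure quoted above; an assumption here


def min_interval(max_requests=MAX_REQUESTS_PER_WINDOW,
                 window_seconds=WINDOW_SECONDS):
    """Smallest spacing (seconds) that stays within the window's budget."""
    return window_seconds / max_requests


class Pacer:
    """Hypothetical helper: keeps at least `interval` seconds between calls."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                # Sleep only for the part of the interval not yet elapsed
                self._sleep(remaining)
                now = self._clock()
        self._last = now


pacer = Pacer(min_interval())
```

In the pagination loop you would call pacer.wait() just before each GetSearch request, in place of the fixed time.sleep(1).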
The question comes from Stack Exchange; asked by Sahand.




