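R httr call to an API returns 403 Forbidden, while the same URL works in the browser
Fixing the Sofascore API 403 Forbidden error (R httr package)
Reproducing the problem
Opening the API endpoint https://www.sofascore.com/api/v1/sport/esports/scheduled-events/2025-03-14 in a browser returns JSON as expected, but a GET request sent from R with the httr package, using request headers copied from the browser, comes back with 403 Forbidden. The original code: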
    library(httr)

    url = "https://www.sofascore.com/api/v1/sport/esports/scheduled-events/2025-03-13"

    headers <- add_headers(
      `accept` = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
      `accept-encoding` = "gzip, deflate, br, zstd",
      `accept-language` = "en-US,en;q=0.9",
      `cache-control` = "max-age=0",
      `cookie` = "_ga=GA1.1.598020237.1738380109; _cc_id=fe871723732fbf4c2401642c107a26fe; panoramaId_expiry=1742486201782; panoramaId=daa5f27b6006082f849fd5198a3b16d539383b6b209451763912ab7650a957c1; panoramaIdType=panoIndiv; cto_bundle=d8efBl9kZGJhN3FiV1NUcU1sZnNpTkx6ZmFqRm9RdWtzWHlvcGJzbHlLR1Y0d1lpS211NFV0RlNjZVFwcmVrU0ZZeU12MlRjb0hIOExZNWVIZTBubThNV0N1WHJ2NmRwdHdTOUlaRTFLcnVLTkxjTHpaU25yMnlwODJ3eGRCNjd6RWMlMkZpd0pVcm53TUhBazRlSFZKaG5HMm5PUSUzRCUzRA; FCNEC=%5B%5B%22AKsRol-wu_0XC3FpfHpetGephwnn9tq3I6kd5brworJJEdPd-xbuokPGjGGDI8A3ClXx5gkbCZkQnxp3pFBxykXMh08rWsb91bSEJC2NHmP_GKKFEaKHGOYt2jwek5_EqpbmDRYEl8LWRgZtGge3p_CecJbmr2sumQ%3D%3D%22%5D%5D; _awl=2.1741897730.5-b2f593bc3c95e1889f553be2c7879f1b-6763652d75732d6561737431-3; _ga_HNQ9P9MGZR=GS1.1.1741896640.5.1.1741898812.60.0.0",
      `if-none-match` = "\"8192fa0e83\"",
      `priority` = "u=0, i",
      `sec-ch-ua` = "\"Chromium\";v=\"134\", \"Not:A-Brand\";v=\"24\", \"Google Chrome\";v=\"134\"",
      `sec-ch-ua-mobile` = "?0",
      `sec-ch-ua-platform` = "\"Windows\"",
      `sec-fetch-dest` = "document",
      `sec-fetch-mode` = "navigate",
      `sec-fetch-site` = "none",
      `sec-fetch-user` = "?1",
      `upgrade-insecure-requests` = "1",
      `user-agent` = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36"
    )

    response = GET(url, headers)
    response$status_code
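Solution
The core issue is that the headers copied from the browser do not match what the API endpoint expects: they describe an HTML page navigation, whereas an API call should ask for JSON. Change the request as follows:
- Fix the Accept header: change accept to the JSON type the API expects. The server uses this header to decide what kind of content to return, and a mismatched Accept can trigger the block.
- Remove unnecessary headers: drop the browser-specific navigation headers (sec-fetch-*, upgrade-insecure-requests, cache-control, and so on). Sent from a script, these headers can give the request away as automated rather than genuine browser traffic.
- Remove the stale cookie: cookies expire quickly, and in most cases Sofascore's public API does not require cookie validation.
- Add a Referer header (optional): some servers check where a request comes from, so adding the site's own Referer can lower the chance of being blocked.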
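The pruning steps above can also be applied programmatically to a set of headers copied from the browser. This is only a sketch: `clean_browser_headers()` is a hypothetical helper, not part of httr, and dropping `if-none-match` and `priority` is an assumption based on their being browser caching and scheduling hints rather than something the list states explicitly.

```r
# Hypothetical helper (not part of httr): prune a named character vector
# of browser-copied request headers according to the steps listed above.
clean_browser_headers <- function(headers) {
  # Header names are case-insensitive, so normalise before matching
  names(headers) <- tolower(names(headers))

  # Assumption: these browser-only headers are safe to drop for an API call
  drop_exact <- c("cookie", "cache-control", "if-none-match",
                  "priority", "upgrade-insecure-requests")
  keep <- !(names(headers) %in% drop_exact) &
    !grepl("^sec-fetch-", names(headers))
  cleaned <- headers[keep]

  # Ask for JSON and state the page the request claims to come from
  cleaned[["accept"]] <- "application/json, */*"
  cleaned[["referer"]] <- "https://www.sofascore.com/esports"
  cleaned
}

# Example input: a few of the copied headers
copied <- c(
  accept = "text/html,application/xhtml+xml",
  cookie = "_ga=GA1.1.598020237.1738380109",
  `sec-fetch-mode` = "navigate",
  `user-agent` = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
)
cleaned <- clean_browser_headers(copied)
```

The result can be passed to httr as `add_headers(.headers = cleaned)`.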
Modified code
    library(httr)

    url <- "https://www.sofascore.com/api/v1/sport/esports/scheduled-events/2025-03-13"

    headers <- add_headers(
      `accept` = "application/json, */*",  # key change: ask for JSON
      `accept-encoding` = "gzip, deflate, br",
      `accept-language` = "en-US,en;q=0.9",
      `sec-ch-ua` = "\"Chromium\";v=\"134\", \"Not:A-Brand\";v=\"24\", \"Google Chrome\";v=\"134\"",
      `sec-ch-ua-mobile` = "?0",
      `sec-ch-ua-platform` = "\"Windows\"",
      `user-agent` = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36",
      `referer` = "https://www.sofascore.com/esports"  # optional: the page the request claims to come from
    )

    response <- GET(url, headers)
    response$status_code  # should be 200 on success

    # Parse the JSON response body
    content(response, "parsed")
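Because a 403 response still has a body, it helps to check the status before parsing so that a blocked request fails loudly instead of being treated as data. A minimal sketch, assuming the httr and jsonlite packages are installed; `build_events_url()` and `fetch_events()` are hypothetical helper names, and the reduced header set is an illustration rather than a guaranteed way past the block.

```r
# Hypothetical helpers (not part of httr); need the httr and jsonlite packages.

# Build the scheduled-events URL for a date given as "YYYY-MM-DD".
build_events_url <- function(date, sport = "esports") {
  # as.Date() with an explicit format returns NA for malformed input
  parsed <- as.Date(date, format = "%Y-%m-%d")
  if (is.na(parsed)) stop("date must look like YYYY-MM-DD: ", date)
  sprintf(
    "https://www.sofascore.com/api/v1/sport/%s/scheduled-events/%s",
    sport, format(parsed, "%Y-%m-%d")
  )
}

# Send the request, stop on any HTTP error status, then parse the JSON body.
fetch_events <- function(date) {
  response <- httr::GET(
    build_events_url(date),
    httr::add_headers(
      accept = "application/json, */*",
      referer = "https://www.sofascore.com/esports"
    ),
    httr::user_agent(
      "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36"
    )
  )
  if (httr::http_error(response)) {
    stop("request failed with HTTP status ", httr::status_code(response))
  }
  body <- httr::content(response, as = "text", encoding = "UTF-8")
  jsonlite::fromJSON(body, simplifyVector = FALSE)
}
```

Calling `fetch_events("2025-03-13")` then either returns the parsed list or stops with the status code, which makes a persistent 403 easy to spot.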
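Additional notes
- If the request still returns 403, try replacing the user-agent with an up-to-date browser UA string.
- Avoid sending requests at a high frequency, so that you do not trigger an IP ban.
- Sofascore has not published official documentation for this API; follow the site's robots.txt rules when using it.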
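The advice about request frequency can be made concrete with a fixed pause between calls. A minimal sketch: `fetch_politely()` is a hypothetical helper, `fetch_one` stands for whatever function performs a single request, and the two-second default is an arbitrary assumption, not a documented Sofascore limit.

```r
# Hypothetical helper: call fetch_one() once per date, pausing between calls.
fetch_politely <- function(dates, fetch_one, pause_seconds = 2) {
  results <- vector("list", length(dates))
  names(results) <- dates
  for (i in seq_along(dates)) {
    # Pause before every request except the first
    if (i > 1) Sys.sleep(pause_seconds)
    # Keep going if one date fails: store the error object instead of data.
    # Assigning via list() keeps the slot even if fetch_one() returns NULL.
    results[i] <- list(
      tryCatch(fetch_one(dates[[i]]), error = function(e) e)
    )
  }
  results
}

# Usage with a stand-in fetcher that makes no network calls
demo <- fetch_politely(
  c("2025-03-13", "2025-03-14"),
  function(d) paste0("events for ", d),
  pause_seconds = 0
)
```

Storing the error rather than stopping means one blocked date does not discard the results already collected.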
The question originally comes from Stack Exchange, asked by Nick Amato.




