
R's httr package gets 403 Forbidden from an API that works fine in the browser

Fixing the Sofascore API 403 Forbidden error (R httr package)

Reproducing the problem

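Opening the API endpoint https://www.sofascore.com/api/v1/sport/esports/scheduled-events/2025-03-14 in a browser returns JSON as expected, but a GET request sent with R's httr package, carrying request headers copied from the browser, returns 403 Forbidden. The original code: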

library(httr)

url = "https://www.sofascore.com/api/v1/sport/esports/scheduled-events/2025-03-13"

headers <- add_headers(
  `accept` = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
  `accept-encoding` = "gzip, deflate, br, zstd",
  `accept-language` = "en-US,en;q=0.9",
  `cache-control` = "max-age=0",
  `cookie` = "_ga=GA1.1.598020237.1738380109; _cc_id=fe871723732fbf4c2401642c107a26fe; panoramaId_expiry=1742486201782; panoramaId=daa5f27b6006082f849fd5198a3b16d539383b6b209451763912ab7650a957c1; panoramaIdType=panoIndiv; cto_bundle=d8efBl9kZGJhN3FiV1NUcU1sZnNpTkx6ZmFqRm9RdWtzWHlvcGJzbHlLR1Y0d1lpS211NFV0RlNjZVFwcmVrU0ZZeU12MlRjb0hIOExZNWVIZTBubThNV0N1WHJ2NmRwdHdTOUlaRTFLcnVLTkxjTHpaU25yMnlwODJ3eGRCNjd6RWMlMkZpd0pVcm53TUhBazRlSFZKaG5HMm5PUSUzRCUzRA; FCNEC=%5B%5B%22AKsRol-wu_0XC3FpfHpetGephwnn9tq3I6kd5brworJJEdPd-xbuokPGjGGDI8A3ClXx5gkbCZkQnxp3pFBxykXMh08rWsb91bSEJC2NHmP_GKKFEaKHGOYt2jwek5_EqpbmDRYEl8LWRgZtGge3p_CecJbmr2sumQ%3D%3D%22%5D%5D; _awl=2.1741897730.5-b2f593bc3c95e1889f553be2c7879f1b-6763652d75732d6561737431-3; _ga_HNQ9P9MGZR=GS1.1.1741896640.5.1.1741898812.60.0.0",
  `if-none-match` = "\"8192fa0e83\"",
  `priority` = "u=0, i",
  `sec-ch-ua` = "\"Chromium\";v=\"134\", \"Not:A-Brand\";v=\"24\", \"Google Chrome\";v=\"134\"",
  `sec-ch-ua-mobile` = "?0",
  `sec-ch-ua-platform` = "\"Windows\"",
  `sec-fetch-dest` = "document",
  `sec-fetch-mode` = "navigate",
  `sec-fetch-site` = "none",
  `sec-fetch-user` = "?1",
  `upgrade-insecure-requests` = "1",
  `user-agent` = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36"
)

response = GET(url, headers)
response$status_code

Solution

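The core problem is that the headers copied from the browser do not match what the API endpoint expects: they describe an HTML page navigation, whereas the API should be asked for JSON. Change the request as follows: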

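  1. Fix the Accept header: set accept to the JSON type the API expects. The server can use this header to decide what kind of content to return, and an Accept header that asks for HTML may get the request blocked.
  2. Remove unnecessary headers: drop the browser's navigation-specific headers (sec-fetch-*, upgrade-insecure-requests, cache-control, and the like). Sent from a script, their values (for example sec-fetch-mode: navigate) do not fit an API call and can mark the request as automated rather than browser traffic.
  3. Remove the stale Cookie: cookies expire quickly, and Sofascore's public API mostly does not need cookie authentication.
  4. Add a Referer header (optional): some servers check where a request came from, so sending the site's own Referer can lower the chance of being blocked.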
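
Steps 1 to 3 can also be applied programmatically to a set of copied headers. The following is a minimal sketch in base R, not the answer's own method: `clean_browser_headers` is a hypothetical helper name, and the list of headers to drop is an assumption based on the steps above.

```r
# Hypothetical helper: turn headers copied from a browser navigation
# into a set better suited to a JSON API call.
clean_browser_headers <- function(headers) {
  names(headers) <- tolower(names(headers))
  # Assumed drop list: the cookie, cache validators, and navigation-only headers.
  navigation_only <- c("cookie", "cache-control", "if-none-match",
                       "priority", "upgrade-insecure-requests")
  drop <- names(headers) %in% navigation_only |
    startsWith(names(headers), "sec-fetch-")
  kept <- headers[!drop]
  # Ask for JSON instead of an HTML document.
  kept[["accept"]] <- "application/json, */*"
  kept
}

# Shortened example input standing in for the copied browser headers.
copied <- c(
  accept = "text/html,application/xhtml+xml",
  `sec-fetch-mode` = "navigate",
  `upgrade-insecure-requests` = "1",
  cookie = "session=stale",
  `user-agent` = "Mozilla/5.0"
)
cleaned <- clean_browser_headers(copied)
names(cleaned)
```

The cleaned vector can then be passed to httr through the `.headers` argument, for example `httr::GET(url, httr::add_headers(.headers = cleaned))`.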

Modified code

library(httr)

url <- "https://www.sofascore.com/api/v1/sport/esports/scheduled-events/2025-03-13"

headers <- add_headers(
  `accept` = "application/json, */*",  # Key change: ask for JSON
  `accept-encoding` = "gzip, deflate, br",
  `accept-language` = "en-US,en;q=0.9",
  `sec-ch-ua` = "\"Chromium\";v=\"134\", \"Not:A-Brand\";v=\"24\", \"Google Chrome\";v=\"134\"",
  `sec-ch-ua-mobile` = "?0",
  `sec-ch-ua-platform` = "\"Windows\"",
  `user-agent` = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36",
  `referer` = "https://www.sofascore.com/esports"  # Optional: the page the request claims to come from
)

response <- GET(url, headers)
response$status_code  # should be 200 on success

# Parse the JSON response body
content(response, "parsed")
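
Because a blocked request can come back as an error status or as an HTML page, it may help to check the response before parsing it. The sketch below wraps the request in a hypothetical function, `fetch_json` (not part of httr), and assumes the endpoint reports a JSON media type in its Content-Type header.

```r
# Hypothetical wrapper: request a URL and return parsed JSON,
# failing loudly when the server does not answer with JSON.
fetch_json <- function(url, config = list()) {
  response <- httr::GET(url, config)

  # http_error() is TRUE for status codes of 400 and above, e.g. 403.
  if (httr::http_error(response)) {
    stop(sprintf("Request failed with status %d",
                 httr::status_code(response)))
  }

  # Assumption: a successful API reply carries a JSON media type.
  media_type <- httr::http_type(response)
  if (!grepl("json", media_type, fixed = TRUE)) {
    stop(sprintf("Expected JSON but received '%s'", media_type))
  }

  httr::content(response, as = "parsed", type = "application/json")
}
```

With the `url` and `headers` objects defined above, the call would be `events <- fetch_json(url, headers)`.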

Additional notes

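  • If the request still returns 403, try replacing the user-agent with the UA string of a current browser release.
  • Avoid sending requests at a high rate, so that your IP address does not get banned.
  • Sofascore has not published official documentation for this API; follow the site's robots.txt rules when using it.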
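
One way to keep the request rate low when fetching several dates is to pause between calls and let `httr::RETRY` back off on transient failures. This is only a sketch: `scheduled_events_url` and `fetch_politely` are hypothetical helpers, and the two-second delay and retry settings are arbitrary assumptions, not limits published by Sofascore.

```r
# Hypothetical helper: build the endpoint URL for a given date,
# using the path shown in the question.
scheduled_events_url <- function(date) {
  paste0("https://www.sofascore.com/api/v1/sport/esports/scheduled-events/",
         format(as.Date(date), "%Y-%m-%d"))
}

# Hypothetical helper: fetch URLs one at a time with a pause in between.
fetch_politely <- function(urls, config = list(), delay_seconds = 2) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    # Retry up to 3 times with exponential backoff, but give up at once
    # on 403 or 404, which repeating the same request will not fix.
    results[[i]] <- httr::RETRY("GET", urls[[i]], config,
                                times = 3, pause_base = 2, pause_cap = 30,
                                terminate_on = c(403, 404))
    # Pause between requests, but not after the last one.
    if (i < length(urls)) Sys.sleep(delay_seconds)
  }
  results
}

urls <- vapply(c("2025-03-13", "2025-03-14"), scheduled_events_url,
               character(1), USE.NAMES = FALSE)
```

Calling `fetch_politely(urls, headers)` would then return a list of httr response objects, one per date.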

The question comes from Stack Exchange; it was asked by Nick Amato.
