Searching for a company on LinkedIn with Python Selenium and clicking the "Companies" button
Hey, here's a solid implementation using Python and Selenium to get LinkedIn company-specific results instead of profiles when searching for a business name like CalSTRS. I'll break it down step by step, assuming you've already got your helper functions set up (like handling driver initialization and login):
Step-by-Step Code Implementation
First, import the required Selenium modules (you'll need these alongside your existing helper functions):
```python
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
```
Then, the core function to handle the search and filtering:
```python
def search_linkedin_companies(target_company):
    # Initialize your driver using your pre-defined helper function.
    # Assumes init_driver() returns a logged-in WebDriver instance.
    driver = init_driver()

    try:
        # Navigate to LinkedIn's homepage
        driver.get("https://www.linkedin.com/")

        # Wait for the search bar to load, then input the company name
        search_bar = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "input[aria-label='Search']"))
        )
        search_bar.clear()
        search_bar.send_keys(target_company)

        # Trigger the search with Enter
        search_bar.send_keys(Keys.ENTER)

        # Wait for results page to load, then click the "Companies" tab.
        # Use XPath to target the button by its visible text.
        companies_tab = WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.XPATH, "//button[text()='Companies']"))
        )
        companies_tab.click()

        # Optional: extract and print sample company results (adjust selectors as needed).
        # Wait for company list to load.
        company_cards = WebDriverWait(driver, 10).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.entity-result__title-text a"))
        )

        print(f"Top matching companies for '{target_company}':")
        for i, card in enumerate(company_cards[:5], 1):
            print(f"{i}. {card.text.strip()}")

    except Exception as e:
        print(f"Oops, hit an error: {e}")

    finally:
        # Clean up the driver session
        driver.quit()


# Run the function with your target company
search_linkedin_companies("CalSTRS")
```
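The exact-text XPath in the function above can miss the tab if the button text carries surrounding whitespace or if your account uses a different interface language. Here's a minimal sketch of a locator builder that tolerates both. `xpath_literal` and `tab_locator` are hypothetical helper names, and the non-English label is only an illustration, so check the real button text in dev tools.

```python
def xpath_literal(text):
    """Quote text as an XPath 1.0 string literal (XPath has no escape character)."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Text contains both quote types: stitch the pieces together with concat().
    pieces = ", \"'\", ".join(f"'{part}'" for part in text.split("'"))
    return f"concat({pieces})"


def tab_locator(label="Companies"):
    """Build a (strategy, expression) locator for a results-page filter button."""
    # "xpath" is the string value of selenium's By.XPATH, so this tuple can be
    # passed straight to EC.element_to_be_clickable(...).
    return ("xpath", f"//button[normalize-space()={xpath_literal(label)}]")


print(tab_locator())
print(tab_locator("Entreprises"))  # illustrative label, not verified against LinkedIn
```

Inside the function you would then write `EC.element_to_be_clickable(tab_locator("Companies"))` in place of the hardcoded tuple.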
Key Details & Best Practices
- Smart Waiting: I used `WebDriverWait` instead of `time.sleep()` because it waits only until the element is ready, making the script more reliable (LinkedIn's pages load dynamically, so fixed delays can be too short and cause failures).
- Stable Locators:
  - The search bar is targeted by its `aria-label` attribute with a CSS selector, which is less likely to change than the random IDs LinkedIn assigns.
  - The "Companies" tab uses XPath to match the button's visible text. If the text match fails (for example, because of localization), swap it for an attribute selector such as `button[aria-label='Companies']`, using whatever value browser dev tools shows for that element in your interface language.
- Login Handling: Your `init_driver()` helper should handle logging into LinkedIn, either by entering credentials programmatically or by loading saved cookies to avoid repeated manual logins.
- Data Extraction: The example extracts company names, but you can expand this to pull descriptions, locations, or follower counts by updating the CSS selectors (use F12 in your browser to inspect the latest element structure).
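For the saved-cookies route mentioned under Login Handling, here's a rough sketch of what `init_driver()` could call. `save_cookies`, `load_cookies`, and the JSON file path are assumptions for illustration; the driver methods used (`get_cookies`, `add_cookie`, `get`, `refresh`) are standard WebDriver calls. Some driver versions may reject certain saved cookie fields, so treat this as a starting point.

```python
import json
from pathlib import Path


def save_cookies(driver, path):
    """Write the current session's cookies to a JSON file."""
    Path(path).write_text(json.dumps(driver.get_cookies()))


def load_cookies(driver, path, base_url="https://www.linkedin.com/"):
    """Restore cookies written by save_cookies; return how many were added."""
    cookie_file = Path(path)
    if not cookie_file.exists():
        return 0  # nothing saved yet, caller should log in normally

    # A browser only accepts cookies for the domain that is currently loaded,
    # so open the site before adding them.
    driver.get(base_url)
    cookies = json.loads(cookie_file.read_text())
    for cookie in cookies:
        driver.add_cookie(cookie)

    # Reload so the page picks up the restored session.
    driver.refresh()
    return len(cookies)
```

A typical flow would be: log in manually once, call `save_cookies(driver, "linkedin_cookies.json")`, and on later runs call `load_cookies(...)` first, falling back to a normal login when it returns 0 or the session has expired.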
Troubleshooting Tips
- If the "Companies" tab won't click, try forcing the click with JavaScript: `driver.execute_script("arguments[0].click();", companies_tab)`
- LinkedIn has anti-scraping measures, so add small random delays between actions (e.g., `time.sleep(random.uniform(1, 3))` after clicking the tab, with `import random` at the top) to reduce the chance of triggering captchas.
- Ensure your driver version matches your browser version (e.g., ChromeDriver for Chrome) to avoid compatibility errors.
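The first two tips combine naturally into one helper. This is a small sketch under a few assumptions: `human_pause` and `click_with_fallback` are hypothetical names, the 1 to 3 second range is arbitrary, and the broad `except` is only there to keep the snippet free of Selenium imports.

```python
import random
import time


def human_pause(min_seconds=1.0, max_seconds=3.0):
    """Sleep for a random duration in [min_seconds, max_seconds] and return it."""
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)
    return delay


def click_with_fallback(driver, element, pause=(1.0, 3.0)):
    """Try a normal click; fall back to a JavaScript click if it raises.

    Returns True when the JavaScript fallback was used.
    """
    try:
        element.click()
        used_fallback = False
    except Exception:
        # In a real script, narrow this to
        # selenium.common.exceptions.ElementClickInterceptedException.
        driver.execute_script("arguments[0].click();", element)
        used_fallback = True

    # Pause a random amount before the next action.
    human_pause(*pause)
    return used_fallback
```

In the main function, `companies_tab.click()` would become `click_with_fallback(driver, companies_tab)`.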
The question comes from Stack Exchange; it was asked by Merv Merzoug.




