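While working through the course "Python超强爬虫8天速成(完整版)爬取各种网站数据实战案例", Day7 - 06 (headless browser + evading detection), I ran into some problems with the code the instructor demonstrated. Here are the problems and how I solved them, shared for reference and feedback.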
    from selenium import webdriver
    from time import sleep
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver import ChromeOptions

    # non visual interface
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')

    # avoid detection risks
    option = ChromeOptions()
    option.add_experimental_option('excludeSwitches', ['enable-automation'])

    driver = webdriver.Chrome(executable_path='./chromedriver.exe', chrome_options=chrome_options, options=option)
    driver.get('https://www.baidu.com')

    # get page source
    print(driver.page_source)
    sleep(2)
    driver.quit()
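I was initially using selenium 3.7, and this code failed with TypeError: __init__() got an unexpected keyword argument 'options', because webdriver.Chrome in that version accepts chrome_options but has no options parameter. As a beginner I was puzzled and could not find a suitable fix online, so I tried upgrading selenium to version 4.1.0. The TypeError went away, but two warnings appeared: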
01: DeprecationWarning: executable_path has been deprecated, please pass in a Service object. It is raised by driver = webdriver.Chrome(executable_path='./chromedriver.exe')
Fix:
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Create a Service object that points to the ChromeDriver binary
    service = Service('./chromedriver.exe')
    # Initialize the Chrome WebDriver through the Service object
    driver = webdriver.Chrome(service=service)
02: DeprecationWarning: use options instead of chrome_options. It is raised by driver = webdriver.Chrome(service=service, chrome_options=chrome_options, options=option).
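The catch is that chrome_options and option would both have to be passed as options, and at first I did not know how to handle that. What worked in the end was putting both the headless settings and the anti-detection settings into a single Options object, as follows: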
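    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    chrome_options = Options()
    # headless settings
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')
    # anti-detection setting, added to the same Options object
    chrome_options.add_experimental_option('excludeSwitches', ['enable-automation'])

    service = Service('./chromedriver.exe')
    driver = webdriver.Chrome(service=service, options=chrome_options)

After testing, both headless operation and detection avoidance took effect.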
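If the two option sets are built in different places and cannot easily be written as one object, they could also be folded together with a small helper. The following is only a sketch: `merge_options` and `StubOptions` are names made up for illustration, and `StubOptions` is a stand-in so the example runs without selenium or a browser installed. It relies on the assumption that the real `Options` object exposes the same four members used here (`arguments`, `experimental_options`, `add_argument`, `add_experimental_option`).

```python
class StubOptions:
    """Hypothetical stand-in for selenium's Options (illustration only)."""

    def __init__(self):
        self.arguments = []
        self.experimental_options = {}

    def add_argument(self, argument):
        self.arguments.append(argument)

    def add_experimental_option(self, name, value):
        self.experimental_options[name] = value


def merge_options(target, extra):
    """Copy command-line arguments and experimental options from extra into target."""
    for argument in extra.arguments:
        # Skip arguments that target already has
        if argument not in target.arguments:
            target.add_argument(argument)
    for name, value in extra.experimental_options.items():
        # An option with the same name in extra overrides the one in target
        target.add_experimental_option(name, value)
    return target


# Headless settings, as in the original chrome_options object
headless = StubOptions()
headless.add_argument('--headless')
headless.add_argument('--disable-gpu')

# Anti-detection setting, as in the original option object
stealth = StubOptions()
stealth.add_experimental_option('excludeSwitches', ['enable-automation'])

merged = merge_options(headless, stealth)
print(merged.arguments)
print(merged.experimental_options)
```

With real selenium objects, the merged result would then be the single value passed as options= to webdriver.Chrome.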
Final code
    from selenium import webdriver
    from time import sleep
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    chrome_options = Options()
    # non visual interface
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')
    # avoid detection risks
    chrome_options.add_experimental_option('excludeSwitches', ['enable-automation'])

    # Create a Service object that points to the ChromeDriver binary
    service = Service('./chromedriver.exe')
    # Initialize the Chrome WebDriver through the Service object
    driver = webdriver.Chrome(service=service, options=chrome_options)
    driver.get('https://www.baidu.com')

    print(driver.page_source)
    sleep(2)
    driver.quit()
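Since the keyword names differ between the two selenium versions I tried, a tiny helper can make the difference explicit. This is a rough sketch, not part of the course code: `chrome_keyword_style` is a made-up name, and it assumes that only the major version decides which style applies (3.x takes executable_path and chrome_options, 4.x prefers service and options, matching the error and warnings described above).

```python
def chrome_keyword_style(selenium_version):
    """Return the keyword names webdriver.Chrome expects for a selenium version string.

    Assumption: only the major version number matters for this choice.
    """
    major = int(selenium_version.split('.')[0])
    if major >= 4:
        # Selenium 4 style: a Service object plus options=
        return {'driver_path': 'service', 'browser_options': 'options'}
    # Selenium 3 style: executable_path= plus chrome_options=
    return {'driver_path': 'executable_path', 'browser_options': 'chrome_options'}


print(chrome_keyword_style('3.7.0'))
print(chrome_keyword_style('4.1.0'))
```

The installed version string can be read from selenium.__version__ and passed to the helper.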
Feedback and corrections are welcome...