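A Detailed Guide to the selenium Module for Python Web Scraping
Advantages of selenium
Easily fetch data that a website loads dynamically
Easily implement a simulated login
selenium workflow:
1. Install the package: pip install selenium
2. Download a driver program for your browser (Google Chrome)
3. Instantiate a browser object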
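Step 3 creates a browser object that should eventually be closed with quit(), even if the scraping code raises an error partway through. The following is a minimal sketch of one way to guarantee that. The helper names managed_browser and chrome_factory are hypothetical (they are not part of selenium), and the Chrome start-up assumes the Selenium 4 Service class.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(factory):
    """Create a browser via factory() and always call quit() afterwards.

    `factory` is any zero-argument callable returning an object with quit().
    """
    bro = factory()
    try:
        yield bro
    finally:
        # Runs on normal exit and on exceptions alike
        bro.quit()


def chrome_factory(driver_path):
    """Hypothetical helper: return a callable that starts Chrome.

    selenium is imported lazily so that this module can be imported
    (and managed_browser reused) even where selenium is not installed.
    """
    def start():
        from selenium import webdriver
        from selenium.webdriver.chrome.service import Service
        return webdriver.Chrome(service=Service(driver_path))
    return start
```

With the driver path from step 2, usage would look like `with managed_browser(chrome_factory(path)) as bro: bro.get(url)`; the browser is closed when the with block ends.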
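Basic usage code
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from lxml import etree
from time import sleep

if __name__ == '__main__':
    # Selenium 4 style: the driver path is passed through a Service object
    bro = webdriver.Chrome(service=Service(r'E:\google\Chrome\Application\chromedriver.exe'))
    bro.get(url='http://scxk.nmpa.gov.cn:81/xk/')
    # page_source holds the HTML after dynamic content has been rendered
    page_text = bro.page_source
    tree = etree.HTML(page_text)
    li_list = tree.xpath('//*[@id="gzlist"]/li')
    for li in li_list:
        name = li.xpath('./dl/@title')[0]
        print(name)
    sleep(5)
    bro.quit()
Browser automation operations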
Code
# Writing browser automation code
- Send a request: get(url)
- Locate elements: the find_element family of methods
- Interact with an element: send_keys('xxx')
- Execute JavaScript: execute_script('jsCode')
- Go forward and back: back(), forward()
- Close the browser: quit()
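As an illustration of execute_script, the sketch below keeps scrolling until the page height stops growing, which is one common way to trigger dynamically loaded content. The function name scroll_to_bottom and its pause and max_rounds parameters are assumptions made for this example; only execute_script is a real selenium method, and the driver argument can be any object that provides it.

```python
from time import sleep


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down until document.body.scrollHeight stops changing.

    Returns the last observed page height. Stops after max_rounds
    scrolls even if the page keeps growing.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded
        last_height = new_height
    return last_height
```

The fixed pause is the simplest choice and matches the sleep() calls used elsewhere in this article; a page that loads slowly may need a longer pause.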
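Code
https://www.taobao.com/
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep

bro = webdriver.Chrome(service=Service(r'E:\google\Chrome\Application\chromedriver.exe'))
bro.get(url='https://www.taobao.com/')
# Locate the search box
search_input = bro.find_element(By.ID, 'q')
sleep(2)
# Run a piece of JavaScript that scrolls to the bottom of the page
bro.execute_script('window.scrollTo(0,document.body.scrollHeight)')
sleep(2)
# Interact with the element (the search term means "women's clothing")
search_input.send_keys('女裝')
button = bro.find_element(By.CLASS_NAME, 'btn-search')
button.click()
bro.get('https://www.baidu.com')
sleep(2)
# Go back to the previous page, then forward again
bro.back()
sleep(2)
bro.forward()
sleep(5)
bro.quit()
Handling iframes with selenium: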
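- If the element you want to locate is inside an iframe, you must first call switch_to.frame(id)
- Action chains (dragging): from selenium.webdriver import ActionChains
- Instantiate an action chain object: action = ActionChains(bro)
- click_and_hold(div): press and hold the mouse button on the element
- move_by_offset(x, y): move the mouse by the given offset
- perform(): execute the queued actions immediately
- action.release(): release the held mouse button (like the other actions, it only runs once perform() is called)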
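A drag is usually broken into several small moves rather than one jump. The sketch below shows one way to split a horizontal distance into integer steps and feed them to an action chain. The names split_offsets and drag_by are hypothetical helpers for illustration; the ActionChains calls are the ones listed above, and a fresh chain is created for each step so that the sketch does not depend on whether perform() clears previously queued actions.

```python
def split_offsets(total, steps):
    """Split a distance into `steps` integer moves that sum to `total`."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    base, remainder = divmod(total, steps)
    # Spread the remainder over the first few moves
    return [base + 1 if i < remainder else base for i in range(steps)]


def drag_by(driver, element, total, steps=5, pause=0.3):
    """Hypothetical helper: drag `element` horizontally by `total` pixels."""
    from time import sleep
    from selenium.webdriver import ActionChains  # lazy import, needs selenium

    ActionChains(driver).click_and_hold(element).perform()
    for dx in split_offsets(total, steps):
        ActionChains(driver).move_by_offset(dx, 0).perform()
        sleep(pause)
    # The mouse button stays held until release() is performed
    ActionChains(driver).release().perform()
```

For example, split_offsets(85, 5) yields five moves of 17 pixels each.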
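Code
https://www.runoob.com/try/try.php?filename=jqueryui-api-droppable
from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep

bro = webdriver.Chrome(service=Service(r'E:\google\Chrome\Application\chromedriver.exe'))
bro.get('https://www.runoob.com/try/try.php?filename=jqueryui-api-droppable')
# The draggable element lives inside an iframe, so switch into it first
bro.switch_to.frame('iframeResult')
div = bro.find_element(By.ID, 'draggable')
# Action chain
action = ActionChains(bro)
action.click_and_hold(div)
for i in range(5):
    # perform() executes the queued actions immediately
    action.move_by_offset(17, 0).perform()
    sleep(0.3)
# Release the mouse button; release() also needs perform() to take effect
action.release().perform()
bro.quit()
Simulating a Qzone (QQ空間) login with selenium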
Code
https://qzone.qq.com/
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep

bro = webdriver.Chrome(service=Service(r'E:\google\Chrome\Application\chromedriver.exe'))
bro.get('https://qzone.qq.com/')
# The login form is inside an iframe
bro.switch_to.frame('login_frame')
# Switch to the account-and-password login tab
switcher = bro.find_element(By.ID, 'switcher_plogin')
switcher.click()
user_tag = bro.find_element(By.ID, 'u')
password_tag = bro.find_element(By.ID, 'p')
# Placeholder credentials
user_tag.send_keys('1234455')
password_tag.send_keys('qwer123')
sleep(1)
but = bro.find_element(By.ID, 'login_button')
but.click()
Headless browser and evading detection
Code
from selenium import webdriver
from selenium.webdriver import ChromeOptions
from selenium.webdriver.chrome.service import Service
from time import sleep

# A single options object carries both groups of settings;
# passing two separate option objects would let one override the other
option = ChromeOptions()
# Run without a visible browser window
option.add_argument('--headless')
option.add_argument('--disable-gpu')
# Evade detection
option.add_experimental_option('excludeSwitches', ['enable-automation'])
bro = webdriver.Chrome(service=Service(r'E:\google\Chrome\Application\chromedriver.exe'), options=option)
bro.get('https://www.baidu.com')
print(bro.page_source)
sleep(2)
bro.quit()
