Collecting data with Python - 3) Crawling websites with selenium

kugancity 2022. 8. 11. 18:36

First, a simple usage example.

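from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Raw string, so the backslashes in the Windows path are not read as escapes.
# Selenium 4 takes the chromedriver path through a Service object.
driver = webdriver.Chrome(service=Service(r'D:\Dropbox\crawler\chromedriver'))
driver.implicitly_wait(3)  # retry element lookups for up to 3 seconds
# Open the URL.
driver.get('https://google.com')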
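
`implicitly_wait(3)` asks the driver to keep retrying element lookups for up to 3 seconds before giving up. The sketch below only illustrates that polling idea in plain Python. It is not Selenium's implementation, and `wait_until`, `find`, and the fake lookup are hypothetical names made up for this example.

```python
import time


def wait_until(find, timeout=3.0, interval=0.5):
    """Call find() repeatedly until it returns something truthy or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing found within {timeout} seconds")
        time.sleep(interval)


# Hypothetical stand-in for an element lookup: empty twice, then a match.
attempts = iter([[], [], ["<li>first item</li>"]])
found = wait_until(lambda: next(attempts), timeout=3.0, interval=0.01)
print(found)
```

With a real driver, `find` could be something like a lambda that wraps an element lookup, but the implicit wait already does this retrying for you.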

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep


options = webdriver.ChromeOptions()
options.add_argument("window-size=1920,1080")  # Chrome expects width,height
options.add_argument("disable-gpu")
options.add_argument("user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36")
options.add_argument("lang=ko_KR")  # Korean

# Specify the location of chromedriver (Selenium 4 style: Service + options).
driver = webdriver.Chrome(service=Service('C:/chromedriver_win32/chromedriver_103.exe'), options=options)
driver.implicitly_wait(3)

try:
    # Open the URL.
    url = "https://www.jobkorea.co.kr/starter/review/view?C_Idx=924&Ctgr_Code=5&FavorCo_Stat=0&schTxt=%EC%B9%B4%EC%B9%B4%EC%98%A4&G_ID=0&Page=1"
    driver.get(url)
    sleep(2)

    # Each <li> matched by this XPath is one entry in the list.
    items = driver.find_elements(By.XPATH, '//*[@id="container"]/div[2]/div[3]/ul/li')
    for fid in items:
        # print(fid.text)

        # The relative XPaths (starting with ./) are resolved against the current <li>.
        period = fid.find_element(By.XPATH, './div/span[1]/span[1]').text
        ctype = fid.find_element(By.XPATH, './div/span[1]/span[2]').text
        qtype = fid.find_element(By.XPATH, './div/span[2]').text
        question = fid.find_element(By.XPATH, './div/span[3]').text
        print(period + "\t" + ctype + "\t" + qtype + "\t" + question)
        sleep(1)
except Exception as e:
    print(e)
finally:
    # Close the Chrome window, even if an error occurred.
    driver.quit()
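
The crawler above reads a single URL whose query string ends in `Page=1` and carries a percent-encoded search term in `schTxt`. As a rough sketch using only the standard library, the snippet below decodes that search term and rewrites the `Page` parameter. `with_page` is a hypothetical helper name, and whether the site really serves further results through `Page` is an assumption I have not verified.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# URL taken from the crawler above.
URL = ("https://www.jobkorea.co.kr/starter/review/view?C_Idx=924&Ctgr_Code=5"
       "&FavorCo_Stat=0&schTxt=%EC%B9%B4%EC%B9%B4%EC%98%A4&G_ID=0&Page=1")


def with_page(url, page):
    """Return url with its Page query parameter set to the given page number."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))  # parse_qsl percent-decodes the values
    params["Page"] = str(page)
    # urlencode percent-encodes the values again, so schTxt stays URL-safe.
    return urlunsplit(parts._replace(query=urlencode(params)))


search_term = dict(parse_qsl(urlsplit(URL).query))["schTxt"]
print(search_term)         # the decoded search keyword: 카카오
print(with_page(URL, 2))   # same URL, but ending in Page=2
```

If paging does work this way, the result of `with_page(URL, n)` could be passed to `driver.get()` in a loop instead of the hard-coded URL.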

Earlier posts in this series (kugancity.tistory.com):

Collecting data with Python - 1) Installing PyCharm and related packages

Collecting data with Python - 2) Installing ChromeDriver

References: https://wkdtjsgur100.github.io/selenium-xpath/

https://pythondocs.net/selenium/%EC%85%80%EB%A0%88%EB%8B%88%EC%9B%80-%ED%81%AC%EB%A1%A4%EB%9F%AC-%EA%B8%B0%EB%B3%B8-%EC%82%AC%EC%9A%A9%EB%B2%95/

https://book.coalastudy.com/data_crawling/week6/untitled-2/answer
