
Getting "Couldn't find a tree builder with the features you requested: html5lib"? - Cupoy


2020/06/20 05:31 PM
Python Web Crawler Discussion Board
佳鈴
Views: 136
Answers: 2
Bookmarks: 0

import requests
from bs4 import BeautifulSoup

url = 'https://www.ettoday.net/news/news-list.htm'
r = requests.get(url)

soup = BeautifulSoup(r.text, "html5lib")

for d in soup.find(class_="part_list_2").find_all('h3'):
    print(d.find(class_="date").text, d.find_all('a')[-1].text)

The following error message appears:

FeatureNotFound                           Traceback (most recent call last)
<ipython-input-2-a7255b1cf22c> in <module>
     5 r = requests.get(url)
     6
----> 7 soup = BeautifulSoup(r.text, "html5lib")
     8
     9 for d in soup.find(class_="part_list_2").find_all('h3'):

/srv/conda/envs/notebook/lib/python3.7/site-packages/bs4/__init__.py in __init__(self, markup, features, builder, parse_only, from_encoding, exclude_encodings, element_classes, **kwargs)
   243                     "Couldn't find a tree builder with the features you "
   244                     "requested: %s. Do you need to install a parser library?"
--> 245                     % ",".join(features))
   246
   247         # At this point either we have a TreeBuilder instance in

FeatureNotFound: Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?


How can I fix this?

Answers

  • 2020/06/22 01:08 AM
    張維元 (WeiYuan)
    Upvotes: 0
    Downvotes: 0
    Comments: 0

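    Hi, this error means that html5lib is not installed in your environment by default. You can either install it manually or switch to a different parser.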


    ```
    !pip install html5lib
    ```



    ```
    # html.parser ships with Python, so no extra install is needed
    soup = BeautifulSoup(r.text, "html.parser")
    ```


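    If this answer helped, please click the "Helpful" button; you can also follow my GitHub account. If you still have questions, feel free to ask a follow-up or post a summary of what you have understood so far, and I will give you a review and feedback 😃😃😃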
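
As a rough sketch of the "install it or use another parser" advice above, the snippet below tries a list of parsers in order and keeps the first one that is installed. The helper name `make_soup` and the preference order are assumptions made for illustration and are not part of Beautiful Soup; `FeatureNotFound` is the exception shown in the traceback.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, parsers=("html5lib", "lxml", "html.parser")):
    """Return (soup, parser_name) using the first parser that is installed.

    Hypothetical helper: the name and the default order are illustrative.
    """
    for name in parsers:
        try:
            return BeautifulSoup(markup, name), name
        except FeatureNotFound:
            # This parser library is not installed; try the next one.
            continue
    raise FeatureNotFound(
        "none of the requested parsers are installed: " + ", ".join(parsers)
    )


# Sample markup standing in for r.text from the question.
soup, used = make_soup("<h3><a href='#'>title</a></h3>")
print(used, soup.find("a").text)
```

Because `html.parser` comes with Python itself, keeping it last in the list means the call should always find at least one usable parser.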

  • 2020/06/22 01:09 AM
    張維元 (WeiYuan)
    Upvotes: 0
    Downvotes: 0
    Comments: 0

    One more note, about parsers:


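    The three common parsers are lxml, html5lib, and html.parser. All of them do the same job: they tell BeautifulSoup how to parse the HTML. Strictly speaking, they differ in how strictly and accurately they handle markup and in how fast they run, but for everyday use I find little practical difference. If you are interested, see the documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser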


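    Generally speaking: lxml is the fastest but has to be installed separately; html.parser ships with Python's standard library, so it is always available; html5lib is the most lenient and parses pages the way a browser does, but it is slow and also has to be installed separately. If you do not name a parser, BeautifulSoup picks the best one that is installed.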


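
To see how the parsers differ in practice, a small sketch like the following feeds the same invalid snippet to each one and prints the tree it builds. The helper `compare_parsers` and the sample markup `<a></p>` are illustrative choices, not something from the thread; parsers that are not installed are reported instead of raising an error.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def compare_parsers(markup, parsers=("html.parser", "lxml", "html5lib")):
    """Map each parser name to the document it builds, or None if missing.

    Hypothetical helper written for this comparison only.
    """
    results = {}
    for name in parsers:
        try:
            results[name] = str(BeautifulSoup(markup, name))
        except FeatureNotFound:
            results[name] = None  # parser library is not installed
    return results


# "<a></p>" is deliberately invalid: the closing </p> has no opening tag.
for name, doc in compare_parsers("<a></p>").items():
    print(f"{name:12} {doc if doc is not None else '(not installed)'}")
```

For a complete, well-formed HTML document the parsers should build the same tree; the differences appear mostly with invalid or partial markup.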