Getting "Couldn't find a tree builder with the features you requested: html5lib"?

2020/06/20 05:31 PM

Python Web Scraping Discussion Board

佳鈴


import requests

from bs4 import BeautifulSoup

url = 'https://www.ettoday.net/news/news-list.htm'

r = requests.get(url)

soup = BeautifulSoup(r.text, "html5lib")

for d in soup.find(class_="part_list_2").find_all('h3'):
    print(d.find(class_="date").text, d.find_all('a')[-1].text)

The error message is as follows:

FeatureNotFound                           Traceback (most recent call last)
<ipython-input-2-a7255b1cf22c> in <module>
     5 r = requests.get(url)
     6
----> 7 soup = BeautifulSoup(r.text, "html5lib")
     8
     9 for d in soup.find(class_="part_list_2").find_all('h3'):

/srv/conda/envs/notebook/lib/python3.7/site-packages/bs4/__init__.py in __init__(self, markup, features, builder, parse_only, from_encoding, exclude_encodings, element_classes, **kwargs)
   243                     "Couldn't find a tree builder with the features you "
   244                     "requested: %s. Do you need to install a parser library?"
--> 245                     % ",".join(features))
   246
   247         # At this point either we have a TreeBuilder instance in

FeatureNotFound: Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?

How can I fix this?

Answers

2020/06/22 01:08 AM

張維元 (WeiYuan)


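Hi, the error means html5lib is not installed in this environment by default. You can either install it manually or switch to a different parser.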

```
!pip install html5lib
```

or

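```
# html.parser ships with Python, so nothing extra needs to be installed.
# Naming the parser explicitly also avoids the "no parser was explicitly specified" warning.
soup = BeautifulSoup(r.text, "html.parser")
```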
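If the same notebook has to run in environments where html5lib may or may not be installed, the two options above can be combined into a fallback. The following is only a minimal sketch: `make_soup` is a hypothetical helper name, and the sample markup is made up to resemble the structure the question's code expects (a `part_list_2` container with `h3` items), not copied from the real page.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred=("html5lib", "lxml", "html.parser")):
    """Return (soup, parser_name) using the first parser in `preferred` that is installed.

    make_soup is a hypothetical helper, not part of Beautiful Soup.
    """
    for parser in preferred:
        try:
            return BeautifulSoup(markup, parser), parser
        except FeatureNotFound:
            # This parser library is not installed; try the next one.
            continue
    raise RuntimeError("none of the requested parsers is available")


# Made-up sample markup shaped like the structure the question's loop walks over.
sample_html = (
    '<div class="part_list_2">'
    '<h3><span class="date">2020/06/20 17:31</span>'
    '<a href="#">Category</a><a href="#">Headline</a></h3>'
    "</div>"
)

soup, used_parser = make_soup(sample_html)
rows = [
    (d.find(class_="date").text, d.find_all("a")[-1].text)
    for d in soup.find(class_="part_list_2").find_all("h3")
]
print(used_parser, rows)
```

Because `html.parser` is part of the standard library, the loop always has something to fall back to; which parser actually gets used depends on what is installed.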

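If this answer helped, please click the "Helpful" button; you can also follow my GitHub account. If you still have questions, feel free to ask follow-ups or post a summary of what you've understood so far, and I'll give you a review and feedback 😃😃😃

2020/06/22 01:09 AM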

張維元 (WeiYuan)


One more note, about the parsers:

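The three common parsers are lxml, html5lib, and html.parser. All they do is tell BeautifulSoup how to parse the HTML, so they serve the same purpose. Strictly speaking, they differ in how they handle malformed markup and in speed, but for everyday use I don't find much practical difference. If you're interested, see the documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser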

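Generally speaking: lxml is the fastest of the three; html5lib is the slowest but parses pages the way a web browser does; html.parser ships with Python's standard library, so it is the only one that needs no installation. lxml and html5lib both have to be installed separately.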
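To see where the parsers differ, one option is to feed each of them the same piece of invalid markup and compare the results. This is a small sketch using the `<a></p>` snippet from the Beautiful Soup documentation; `parse_with` is a hypothetical helper, and any parser that isn't installed is simply reported as missing.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Invalid HTML: a closing </p> with no matching opening tag.
broken = "<a></p>"


def parse_with(parser):
    """Return the repaired markup as a string, or None if the parser is not installed."""
    try:
        return str(BeautifulSoup(broken, parser))
    except FeatureNotFound:
        return None


results = {name: parse_with(name) for name in ("html.parser", "lxml", "html5lib")}

for name, output in results.items():
    print(f"{name:12} -> {output if output is not None else '(not installed)'}")
```

On well-formed pages all three usually produce the same tree; the differences show up mainly when the HTML is broken, as it is here.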
