To extract keywords with the NLTK library, follow these steps:
# Install NLTK first (run this in a shell, not in Python): pip install nltk
import nltk
nltk.download('punkt')
nltk.download('punkt_tab')  # newer NLTK releases load the tokenizer data from 'punkt_tab'
nltk.download('stopwords')
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from collections import Counter
text = "Your text goes here."
# Tokenize the text
words = word_tokenize(text)
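# Remove stop words and punctuation, lowercasing so that "Text" and "text" count as one word
stop_words = set(stopwords.words('english'))
filtered_words = [word.lower() for word in words if word.isalpha() and word.lower() not in stop_words]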
# Count word frequencies
word_freq = Counter(filtered_words)
# Get the top N keywords (N is the number of keywords to return)
N = 10
top_keywords = word_freq.most_common(N)
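The code above first tokenizes the text, then removes stop words, counts word frequencies, and takes the N most frequent words as keywords. Adjust the parameters and logic, such as the value of N or the stop-word language, to suit your needs.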
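As a rough sketch, the same steps can be wrapped in a reusable function. The names `extract_keywords` and `simple_tokenize` are made up for this illustration and are not part of NLTK. The regex tokenizer is only a stand-in so the example runs without downloading NLTK data. If you leave out `tokenize` and `stop_words`, the function falls back to NLTK's `word_tokenize` and its English stop-word list, which assumes the downloads above have already been done.

```python
import re
from collections import Counter


def extract_keywords(text, top_n=5, stop_words=None, tokenize=None):
    """Return the top_n most frequent non-stop-word tokens as (word, count) pairs."""
    if tokenize is None:
        # Fall back to NLTK's tokenizer (needs the tokenizer data downloaded).
        from nltk.tokenize import word_tokenize as tokenize
    if stop_words is None:
        # Fall back to NLTK's English stop-word list (needs 'stopwords' downloaded).
        from nltk.corpus import stopwords
        stop_words = set(stopwords.words('english'))
    # Lowercase every token, then keep alphabetic tokens that are not stop words.
    tokens = (token.lower() for token in tokenize(text))
    kept = [token for token in tokens if token.isalpha() and token not in stop_words]
    return Counter(kept).most_common(top_n)


def simple_tokenize(text):
    # Hypothetical stand-in tokenizer: splits on anything that is not a letter.
    return re.findall(r"[A-Za-z]+", text)


sample = "The cat sat on the mat. The cat chased the mouse, and the mouse ran."
print(extract_keywords(sample, top_n=2,
                       stop_words={"the", "on", "and"},
                       tokenize=simple_tokenize))
```

Passing the tokenizer and the stop-word set as arguments makes it easy to swap in a different language or a custom stop-word list without touching the counting logic.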