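NLTK stands for Natural Language Toolkit, a Python-based set of tools for natural language processing.
After installing the NLTK module, installing nltk_data kept failing because of network problems, so the only option was to download the data package and install it by hand. Manual downloads often fail for the same reason, so you can fetch a ready-made copy from this network drive.
Download address: http://www.mhwhcb.com/down/110/107.html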
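Installing after the download:
Extract the archive: nltk_data-gh-pages.zip
Copy the packages folder to D:\python\packages, then rename it to nltk_data so that the data lives at D:\python\nltk_data
Then create a system environment variable (NLTK reads the NLTK_DATA variable, so presumably NLTK_DATA = D:\python\nltk_data)
Installation complete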
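The manual steps above can also be scripted. The following is only a minimal sketch using the Python standard library (Python 3.8+ for dirs_exist_ok). The helper names install_nltk_data and use_nltk_data are hypothetical, and the code assumes the archive contains a folder named packages somewhere inside it, as described above.

```python
import os
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_nltk_data(zip_path, target_dir):
    """Extract the archive and copy its 'packages' folder to target_dir."""
    target = Path(target_dir)
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        # Assumption: the archive holds a 'packages' folder, possibly
        # nested under a top-level directory such as nltk_data-gh-pages.
        candidates = [p for p in Path(tmp).rglob("packages") if p.is_dir()]
        if not candidates:
            raise FileNotFoundError(f"no 'packages' folder in {zip_path}")
        # Prefer the shallowest match if several folders share the name.
        packages = min(candidates, key=lambda p: len(p.parts))
        # Copying straight to the target replaces the copy-then-rename step.
        shutil.copytree(packages, target, dirs_exist_ok=True)
    return target


def use_nltk_data(target_dir):
    """Set NLTK_DATA for the current process only; call before `import nltk`."""
    os.environ["NLTK_DATA"] = str(target_dir)
```

A call such as install_nltk_data("nltk_data-gh-pages.zip", r"D:\python\nltk_data") would mirror the steps in this post. Note that use_nltk_data affects only the running process; a system environment variable still has to be created through the Windows settings.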
Test
Python code:
from nltk.book import *
Output:
D:\python\python.exe D:/phpstudy/WWW/spiderMasg/python/spider/nltkhandle.py
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
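This shows that the installation succeeded.
The installation succeeded, but nltk_data contains no Chinese corpus, so you need to use pip to install a Chinese word segmentation module called jieba.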
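A lighter check than loading the whole book module is to ask NLTK whether it can locate a single resource. This is a sketch only: the function name nltk_data_status is hypothetical, and the choice of corpora/gutenberg as the resource to probe is an assumption (it is one of the corpora the book examples rely on).

```python
def nltk_data_status(resource="corpora/gutenberg"):
    """Report whether NLTK can locate the given resource."""
    try:
        import nltk  # third-party: pip install nltk
    except ImportError:
        return "nltk is not installed"
    try:
        # nltk.data.find raises LookupError when the resource is missing
        # from every directory listed in nltk.data.path.
        nltk.data.find(resource)
    except LookupError:
        print("searched:", nltk.data.path)
        return "resource not found"
    return "resource found"


status = nltk_data_status()
print(status)
```

If the result is "resource not found", the printed search path shows where NLTK looked, which helps confirm whether the environment variable was picked up.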
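As a rough illustration of what jieba does once installed (pip install jieba), the sketch below segments one sentence. The wrapper function segment and the sample sentence are made up for this example; jieba.lcut is the library call that returns the tokens as a list.

```python
def segment(text):
    """Return jieba's token list for text, or None if jieba is missing."""
    try:
        import jieba  # third-party: pip install jieba
    except ImportError:
        print("jieba is not installed; run: pip install jieba")
        return None
    # lcut uses jieba's default (precise) mode and returns a list of strings.
    return jieba.lcut(text)


tokens = segment("我来到北京清华大学")
if tokens is not None:
    print(" / ".join(tokens))
```

The first call loads jieba's dictionary, so it may take a moment and print loading messages.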
Please credit the source when reposting: 谷谷點程序 » Manually downloading nltk_data; Chinese corpus mining with jieba