Installing Scrapy on Windows 7 x64
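Scrapy is a web-crawling framework written in Python. The installation guides I found online all made the process sound complicated, and many of them are years out of date, so I recorded my own installation steps here.
First install Python from https://www.python.org/downloads/release/python-2710/, choosing the 64-bit build (Windows x86-64 MSI installer) or the 32-bit build (Windows x86 MSI installer) to match your system.
Python 3.6 is now the mainstream version, so installing a Python 3 release is recommended.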
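Once Python is installed you can install Scrapy. Installing with pip is recommended: Scrapy depends on many additional libraries, and pip installs all of them for you so that you do not have to hunt them down yourself.
pip ships with the Python installer, so nothing extra needs to be installed. Following the method recommended on the Scrapy website, type pip install scrapy at the command prompt (Figure 1) and wait for it to finish.

Figure 1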
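The install step can also be driven from Python. The following is a minimal sketch, assuming pip is available for the running interpreter; the helper names pip_install_command and install are hypothetical and are not part of pip or Scrapy. It runs pip as a module of the current interpreter, so the package lands in the same Python you are actually using.

```python
import subprocess
import sys


def pip_install_command(package):
    """Return the argument list that installs `package` with this interpreter's pip."""
    # `python -m pip` ties pip to sys.executable rather than to
    # whichever `pip` happens to come first on PATH.
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run the install and return True if pip exited with status 0."""
    return subprocess.call(pip_install_command(package)) == 0


if __name__ == "__main__":
    print(" ".join(pip_install_command("scrapy")))
    # Uncomment to actually install (needs network access):
    # install("scrapy")
```

This can help on machines with several Python installations, where a bare pip command may belong to a different interpreter than the one you intend to run Scrapy with.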
After installation, run pip list to see the installed libraries (Figure 2). The list is long, which shows how much work pip did for you.

Figure 2
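A similar listing can be produced from inside Python. This is a rough sketch that assumes Python 3.8 or later (where importlib.metadata is in the standard library), not the Python 2.7 used in this walkthrough; the function name installed_packages is made up for illustration.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of (name, version) pairs for installed distributions."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata has no name
            found.add((name, dist.version))
    return sorted(found, key=lambda item: item[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(name, version)
```

After a successful Scrapy install, packages such as Twisted and lxml would be expected to appear in this list next to Scrapy itself.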
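Now try creating a crawler project. Following the documentation, type scrapy startproject tutorial; the project directory is created (Figure 3), so things look fine. To confirm that the installation really works, though, the project has to be run. When a project is first created, Scrapy suggests running an example spider (Figure 4). Following that prompt, type scrapy genspider example example.com to create a spider, then type scrapy crawl example to run it.

Figure 3

Figure 4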
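As a quick sanity check on what scrapy startproject produced, the directory can be inspected from Python. This is only a sketch: the function name looks_like_scrapy_project is hypothetical, and it checks just a few of the files a new project normally contains (scrapy.cfg, the package's settings.py, and the spiders directory).

```python
import os


def looks_like_scrapy_project(root, package):
    """Return True if `root` has the basic files that `scrapy startproject` creates."""
    expected_files = [
        os.path.join(root, "scrapy.cfg"),
        os.path.join(root, package, "__init__.py"),
        os.path.join(root, package, "settings.py"),
    ]
    spiders_dir = os.path.join(root, package, "spiders")
    files_ok = all(os.path.isfile(path) for path in expected_files)
    return files_ok and os.path.isdir(spiders_dir)


if __name__ == "__main__":
    # For `scrapy startproject tutorial`, both the project folder and
    # the inner Python package are named "tutorial".
    print(looks_like_scrapy_project("tutorial", "tutorial"))
```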
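Running the spider failed with an error (Figure 5) that appeared to be caused by a missing win32api module, so I downloaded pywin32 from http://sourceforge.net/projects/pywin32/files/pywin32/Build%20219/. When downloading, pick the installer that matches your Python version. With win32api installed I ran the spider again (Figure 6); this time it succeeded, so the installation appears to be working.

Figure 5

Figure 6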
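Whether win32api is present can be checked before running a spider. The sketch below assumes Python 3.4 or later for importlib.util.find_spec; the function name missing_pywin32 is invented for this example.

```python
import importlib.util
import sys


def missing_pywin32():
    """Return True when running on Windows and the win32api module cannot be found."""
    if not sys.platform.startswith("win"):
        return False  # pywin32 only exists for Windows
    return importlib.util.find_spec("win32api") is None


if __name__ == "__main__":
    if missing_pywin32():
        print("win32api not found; install the pywin32 build matching this Python")
    else:
        print("win32api available (or not needed on this platform)")
```

find_spec only locates the module without importing it, so the check is cheap and has no side effects.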
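To sum up: when I first searched online and read that this library and that library had to be installed beforehand, I was nervous, but the process turned out not to be complicated. pip solved most of the problems, and the rest were specific issues to investigate case by case; I did not run into anything I could not solve. One complaint: online tutorials copy from each other heavily. A search returns piles of them, but most are nearly identical and few are really valuable, which makes things hard when you have no expert to lean on.
One last note: when this walkthrough was written, Scrapy did not yet support Python 3.x, so I used Python 2.7. Later Scrapy releases do run on Python 3. If you hit an inexplicable problem, first check which Python version you are running.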
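The following is a supplementary article contributed by another user.
Environment
Windows 7 64-bit
Python 2.7.6 64-bit
Installing Python:
- Open http://www.python.org/getit/releases/2.7.6/, download Python-2.7.6.amd64.msi and install it. After installation the environment variables need to be configured; the original post referred to a separate article for that step.
- Test whether Python installed correctly. If Python is installed and the environment variables are configured, typing python in cmd prints detailed version information (including whether the build is 32-bit or 64-bit):
C:\Users\Administrator>python
Python 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)] on win32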
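The same information shown in the interpreter banner can be read programmatically, which is handy when choosing between win32 and win-amd64 installers. A small sketch, with interpreter_info as a made-up helper name:

```python
import struct
import sys


def interpreter_info():
    """Return (major, minor, bits) for the running Python interpreter."""
    # The size of a C pointer in bytes tells us whether this build
    # is 32-bit (4 bytes) or 64-bit (8 bytes).
    bits = struct.calcsize("P") * 8
    return sys.version_info[0], sys.version_info[1], bits


if __name__ == "__main__":
    major, minor, bits = interpreter_info()
    print("Python %d.%d, %d-bit" % (major, minor, bits))
```

Note that this reports the bitness of the Python build, not of Windows: a 32-bit Python on 64-bit Windows still needs the win32 installers.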
Installing easy_install
Save ez_setup.py locally, for example on drive D: (if the download no longer works, see http://www.dhdzp.com/article/151027.htm). Its contents are:
#!/usr/bin/env python
"""
Setuptools bootstrapping installer.
Maintained at https://github.com/pypa/setuptools/tree/bootstrap.
Run this script to install or upgrade setuptools.
This method is DEPRECATED. Check https://github.com/pypa/setuptools/issues/581 for more details.
"""
import os
import shutil
import sys
import tempfile
import zipfile
import optparse
import subprocess
import platform
import textwrap
import contextlib
from distutils import log

try:
    from urllib.request import urlopen
except ImportError:
    from urllib2 import urlopen

try:
    from site import USER_SITE
except ImportError:
    USER_SITE = None

# 33.1.1 is the last version that supports setuptools self upgrade/installation.
DEFAULT_VERSION = "33.1.1"
DEFAULT_URL = "https://pypi.io/packages/source/s/setuptools/"
DEFAULT_SAVE_DIR = os.curdir
DEFAULT_DEPRECATION_MESSAGE = "ez_setup.py is deprecated and when using it setuptools will be pinned to {0} since it's the last version that supports setuptools self upgrade/installation, check https://github.com/pypa/setuptools/issues/581 for more info; use pip to install setuptools"
MEANINGFUL_INVALID_ZIP_ERR_MSG = 'Maybe {0} is corrupted, delete it and try again.'

log.warn(DEFAULT_DEPRECATION_MESSAGE.format(DEFAULT_VERSION))


def _python_cmd(*args):
    """
    Execute a command.
    Return True if the command succeeded.
    """
    args = (sys.executable,) + args
    return subprocess.call(args) == 0


def _install(archive_filename, install_args=()):
    """Install Setuptools."""
    with archive_context(archive_filename):
        # installing
        log.warn('Installing Setuptools')
        if not _python_cmd('setup.py', 'install', *install_args):
            log.warn('Something went wrong during the installation.')
            log.warn('See the error message above.')
            # exitcode will be 2
            return 2


def _build_egg(egg, archive_filename, to_dir):
    """Build Setuptools egg."""
    with archive_context(archive_filename):
        # building an egg
        log.warn('Building a Setuptools egg in %s', to_dir)
        _python_cmd('setup.py', '-q', 'bdist_egg', '--dist-dir', to_dir)
    # returning the result
    log.warn(egg)
    if not os.path.exists(egg):
        raise IOError('Could not build the egg.')


class ContextualZipFile(zipfile.ZipFile):
    """Supplement ZipFile class to support context manager for Python 2.6."""

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.close()

    def __new__(cls, *args, **kwargs):
        """Construct a ZipFile or ContextualZipFile as appropriate."""
        if hasattr(zipfile.ZipFile, '__exit__'):
            return zipfile.ZipFile(*args, **kwargs)
        return super(ContextualZipFile, cls).__new__(cls)


@contextlib.contextmanager
def archive_context(filename):
    """
    Unzip filename to a temporary directory, set to the cwd.
    The unzipped target is cleaned up after.
    """
    tmpdir = tempfile.mkdtemp()
    log.warn('Extracting in %s', tmpdir)
    old_wd = os.getcwd()
    try:
        os.chdir(tmpdir)
        try:
            with ContextualZipFile(filename) as archive:
                archive.extractall()
        except zipfile.BadZipfile as err:
            if not err.args:
                err.args = ('', )
            err.args = err.args + (
                MEANINGFUL_INVALID_ZIP_ERR_MSG.format(filename),
            )
            raise
        # going in the directory
        subdir = os.path.join(tmpdir, os.listdir(tmpdir)[0])
        os.chdir(subdir)
        log.warn('Now working in %s', subdir)
        yield
    finally:
        os.chdir(old_wd)
        shutil.rmtree(tmpdir)


def _do_download(version, download_base, to_dir, download_delay):
    """Download Setuptools."""
    py_desig = 'py{sys.version_info[0]}.{sys.version_info[1]}'.format(sys=sys)
    tp = 'setuptools-{version}-{py_desig}.egg'
    egg = os.path.join(to_dir, tp.format(**locals()))
    if not os.path.exists(egg):
        archive = download_setuptools(version, download_base,
                                      to_dir, download_delay)
        _build_egg(egg, archive, to_dir)
    sys.path.insert(0, egg)
    # Remove previously-imported pkg_resources if present (see
    # https://bitbucket.org/pypa/setuptools/pull-request/7/ for details).
    if 'pkg_resources' in sys.modules:
        _unload_pkg_resources()
    import setuptools
    setuptools.bootstrap_install_from = egg


def use_setuptools(
        version=DEFAULT_VERSION, download_base=DEFAULT_URL,
        to_dir=DEFAULT_SAVE_DIR, download_delay=15):
    """
    Ensure that a setuptools version is installed.
    Return None. Raise SystemExit if the requested version
    or later cannot be installed.
    """
    to_dir = os.path.abspath(to_dir)
    # prior to importing, capture the module state for
    # representative modules.
    rep_modules = 'pkg_resources', 'setuptools'
    imported = set(sys.modules).intersection(rep_modules)
    try:
        import pkg_resources
        pkg_resources.require("setuptools>=" + version)
        # a suitable version is already installed
        return
    except ImportError:
        # pkg_resources not available; setuptools is not installed; download
        pass
    except pkg_resources.DistributionNotFound:
        # no version of setuptools was found; allow download
        pass
    except pkg_resources.VersionConflict as VC_err:
        if imported:
            _conflict_bail(VC_err, version)
        # otherwise, unload pkg_resources to allow the downloaded version to
        # take precedence.
        del pkg_resources
        _unload_pkg_resources()
    return _do_download(version, download_base, to_dir, download_delay)


def _conflict_bail(VC_err, version):
    """
    Setuptools was imported prior to invocation, so it is
    unsafe to unload it. Bail out.
    """
    conflict_tmpl = textwrap.dedent("""
        The required version of setuptools (>={version}) is not available,
        and can't be installed while this script is running. Please
        install a more recent version first, using
        'easy_install -U setuptools'.
        (Currently using {VC_err.args[0]!r})
        """)
    msg = conflict_tmpl.format(**locals())
    sys.stderr.write(msg)
    sys.exit(2)


def _unload_pkg_resources():
    sys.meta_path = [
        importer
        for importer in sys.meta_path
        if importer.__class__.__module__ != 'pkg_resources.extern'
    ]
    del_modules = [
        name for name in sys.modules
        if name.startswith('pkg_resources')
    ]
    for mod_name in del_modules:
        del sys.modules[mod_name]


def _clean_check(cmd, target):
    """
    Run the command to download target.
    If the command fails, clean up before re-raising the error.
    """
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError:
        if os.access(target, os.F_OK):
            os.unlink(target)
        raise


def download_file_powershell(url, target):
    """
    Download the file at url to target using Powershell.
    Powershell will validate trust.
    Raise an exception if the command cannot complete.
    """
    target = os.path.abspath(target)
    ps_cmd = (
        "[System.Net.WebRequest]::DefaultWebProxy.Credentials = "
        "[System.Net.CredentialCache]::DefaultCredentials; "
        '(new-object System.Net.WebClient).DownloadFile("%(url)s", "%(target)s")'
        % locals()
    )
    cmd = [
        'powershell',
        '-Command',
        ps_cmd,
    ]
    _clean_check(cmd, target)


def has_powershell():
    """Determine if Powershell is available."""
    if platform.system() != 'Windows':
        return False
    cmd = ['powershell', '-Command', 'echo test']
    with open(os.path.devnull, 'wb') as devnull:
        try:
            subprocess.check_call(cmd, stdout=devnull, stderr=devnull)
        except Exception:
            return False
    return True


download_file_powershell.viable = has_powershell


def download_file_curl(url, target):
    cmd = ['curl', url, '--location', '--silent', '--output', target]
    _clean_check(cmd, target)


def has_curl():
    cmd = ['curl', '--version']
    with open(os.path.devnull, 'wb') as devnull:
        try:
            subprocess.check_call(cmd, stdout=devnull, stderr=devnull)
        except Exception:
            return False
    return True


download_file_curl.viable = has_curl


def download_file_wget(url, target):
    cmd = ['wget', url, '--quiet', '--output-document', target]
    _clean_check(cmd, target)


def has_wget():
    cmd = ['wget', '--version']
    with open(os.path.devnull, 'wb') as devnull:
        try:
            subprocess.check_call(cmd, stdout=devnull, stderr=devnull)
        except Exception:
            return False
    return True


download_file_wget.viable = has_wget


def download_file_insecure(url, target):
    """Use Python to download the file, without connection authentication."""
    src = urlopen(url)
    try:
        # Read all the data in one block.
        data = src.read()
    finally:
        src.close()
    # Write all the data in one block to avoid creating a partial file.
    with open(target, "wb") as dst:
        dst.write(data)


download_file_insecure.viable = lambda: True


def get_best_downloader():
    downloaders = (
        download_file_powershell,
        download_file_curl,
        download_file_wget,
        download_file_insecure,
    )
    viable_downloaders = (dl for dl in downloaders if dl.viable())
    return next(viable_downloaders, None)


def download_setuptools(
        version=DEFAULT_VERSION, download_base=DEFAULT_URL,
        to_dir=DEFAULT_SAVE_DIR, delay=15,
        downloader_factory=get_best_downloader):
    """
    Download setuptools from a specified location and return its filename.
    `version` should be a valid setuptools version number that is available
    as an sdist for download under the `download_base` URL (which should end
    with a '/'). `to_dir` is the directory where the egg will be downloaded.
    `delay` is the number of seconds to pause before an actual download
    attempt.
    ``downloader_factory`` should be a function taking no arguments and
    returning a function for downloading a URL to a target.
    """
    # making sure we use the absolute path
    to_dir = os.path.abspath(to_dir)
    zip_name = "setuptools-%s.zip" % version
    url = download_base + zip_name
    saveto = os.path.join(to_dir, zip_name)
    if not os.path.exists(saveto):  # Avoid repeated downloads
        log.warn("Downloading %s", url)
        downloader = downloader_factory()
        downloader(url, saveto)
    return os.path.realpath(saveto)


def _build_install_args(options):
    """
    Build the arguments to 'python setup.py install' on the setuptools package.
    Returns list of command line arguments.
    """
    return ['--user'] if options.user_install else []


def _parse_args():
    """Parse the command line for options."""
    parser = optparse.OptionParser()
    parser.add_option(
        '--user', dest='user_install', action='store_true', default=False,
        help='install in user site package')
    parser.add_option(
        '--download-base', dest='download_base', metavar="URL",
        default=DEFAULT_URL,
        help='alternative URL from where to download the setuptools package')
    parser.add_option(
        '--insecure', dest='downloader_factory', action='store_const',
        const=lambda: download_file_insecure, default=get_best_downloader,
        help='Use internal, non-validating downloader'
    )
    parser.add_option(
        '--version', help="Specify which version to download",
        default=DEFAULT_VERSION,
    )
    parser.add_option(
        '--to-dir',
        help="Directory to save (and re-use) package",
        default=DEFAULT_SAVE_DIR,
    )
    options, args = parser.parse_args()
    # positional arguments are ignored
    return options


def _download_args(options):
    """Return args for download_setuptools function from cmdline args."""
    return dict(
        version=options.version,
        download_base=options.download_base,
        downloader_factory=options.downloader_factory,
        to_dir=options.to_dir,
    )


def main():
    """Install or upgrade setuptools and EasyInstall."""
    options = _parse_args()
    archive = download_setuptools(**_download_args(options))
    return _install(archive, _build_install_args(options))


if __name__ == '__main__':
    sys.exit(main())
Run the following in cmd to install setuptools:
d:\>python ez_setup.py
Running it produces the error "'ascii' codec can't decode byte 0xe8 in position 0: ordinal not in range(128)", which means the ASCII codec cannot decode the byte 0xe8.
Fix: find and open Lib/mimetypes.py under the Python root directory, and add the following code after import urllib:
reload(sys)
sys.setdefaultencoding('gbk')
This changes the default encoding to gbk (some posts online suggest utf8, but that does not work for this script; it has to be gbk). Run python ez_setup.py again; if installation messages scroll past, the installation succeeded. The Python directory now contains a Scripts folder, and easy_install is inside it.
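To illustrate why the codec choice matters, the sketch below encodes a short Chinese string as GBK and then tries to decode the bytes with several codecs. It is an illustration only: the sample text and the helper name try_decode are made up, the bytes differ from the 0xe8 in the error above, and the code is written so that it runs on Python 3 as well. The reload(sys) / sys.setdefaultencoding fix itself exists only on Python 2.

```python
TEXT = u"\u4e2d\u6587"  # two Chinese characters, written as escapes


def try_decode(raw, encoding):
    """Return the decoded text, or None if `raw` is not valid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    raw = TEXT.encode("gbk")  # bytes as a GBK-encoded Windows system might supply them
    for codec in ("ascii", "utf-8", "gbk"):
        decoded = try_decode(raw, codec)
        status = "ok" if decoded == TEXT else "failed"
        print("%-6s %s" % (codec, status))
```

Every byte of GBK-encoded Chinese text is 0x80 or above, so ASCII rejects it immediately, and these particular bytes are not valid UTF-8 either; only the gbk codec round-trips them. That is consistent with the observation that switching the default to utf8 did not help on this system.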
Installing Scrapy's dependencies
Scrapy's dependencies:
- Install lxml-3.2.4.win32-py2.7.exe (on 64-bit systems, install lxml-3.2.4.win-amd64-py2.7.exe)
- Install pywin32-218.win32-py2.7.exe (on 64-bit systems, install pywin32-218.win-amd64-py2.7.exe)
- Install Twisted-13.2.0.win32-py2.7.exe (on 64-bit systems, install Twisted-13.2.0.win-amd64-py2.7.exe)
- Install pyOpenSSL-0.13.1.win32-py2.7.exe (on 64-bit systems, install pyOpenSSL-0.13.1.win-amd64-py2.7.exe)
- Copy zope.interface-4.0.5-py2.7-win32.egg to the C:\Python27\Scripts directory and run easy_install.exe zope.interface-4.0.5-py2.7-win32.egg
To verify that the dependencies installed correctly:
- In cmd, run python to enter the Python console
- Run import lxml; if no error is reported, lxml is installed
- Run import twisted; if no error is reported, twisted is installed
- Run import OpenSSL; if no error is reported, OpenSSL is installed
- Run import zope.interface; if no error is reported, zope.interface is installed
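The manual import checks above can be gathered into one small script. This is a sketch rather than an official Scrapy tool; check_imports is a hypothetical helper name, and the module list simply mirrors the dependencies named in this article.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and return a {name: True/False} mapping."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    deps = ["lxml", "twisted", "OpenSSL", "zope.interface"]
    for name, ok in sorted(check_imports(deps).items()):
        print("%-15s %s" % (name, "OK" if ok else "MISSING"))
```

Any dependency reported as MISSING would need its installer re-run, using the build that matches the installed Python's version and bitness.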
Installing Scrapy
Method 1: at the console, type easy_install scrapy
Method 2: extract Scrapy-0.22.2.tar.gz and run python setup.py install in its directory to install Scrapy.
To check that Scrapy is installed, run scrapy in the cmd console; if no error is reported, the installation succeeded.
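The final check can also be scripted. The sketch below assumes Python 3.3 or later (for shutil.which) and relies on the scrapy version subcommand; tool_version is a name invented for this example.

```python
import shutil
import subprocess


def tool_version(command, version_arg="version"):
    """Return the version text printed by `command`, or None if it is missing or fails."""
    path = shutil.which(command)  # None when the command is not on PATH
    if path is None:
        return None
    try:
        out = subprocess.check_output([path, version_arg], stderr=subprocess.STDOUT)
    except (subprocess.CalledProcessError, OSError):
        return None
    return out.decode(errors="replace").strip()


if __name__ == "__main__":
    version = tool_version("scrapy")
    if version is None:
        print("scrapy was not found on PATH or failed to run")
    else:
        print(version)
```

If scrapy is installed but not found, the Scripts folder of the Python installation is probably missing from the PATH environment variable.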

