How to use Scrapy?

Environment setup

Go into the crawler-system directory: cd crawler-system
Use Scrapy's built-in tool to generate a spider template: scrapy genspider <spiderName> <targetUrl>
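
If you would rather script the two steps above (for example, to bootstrap several spiders in a row), a minimal sketch could look like the following. The helper names build_genspider_command and generate_spider are made up for this illustration and are not part of this repository or of Scrapy; the sketch also assumes the scrapy command is on your PATH.

```python
import subprocess


def build_genspider_command(spider_name, target_url):
    """Return the argv list for `scrapy genspider <spiderName> <targetUrl>`."""
    return ["scrapy", "genspider", spider_name, target_url]


def generate_spider(spider_name, target_url, project_dir="crawler-system"):
    """Run genspider from inside the project directory (same effect as cd + command)."""
    # check=True raises CalledProcessError if scrapy exits with a non-zero status.
    subprocess.run(
        build_genspider_command(spider_name, target_url),
        cwd=project_dir,
        check=True,
    )


# Hypothetical usage; uncomment to actually create the spider file.
# generate_spider("example", "https://example.com")
```

Nothing runs on import; the call at the bottom is left commented out so that the file can be loaded without creating a spider.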

Go into the directory; it contains a settings.py script file.
In it you can turn logging, the database, pipelines, middlewares, and other components on or off (ref: pttCrawlerSystem/setting.py).
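
As a rough illustration of what switching components on and off can look like, here is a sketch of a settings.py fragment. The setting names (LOG_ENABLED, LOG_LEVEL, ITEM_PIPELINES, DOWNLOADER_MIDDLEWARES) are standard Scrapy settings, but the myproject.* paths are hypothetical placeholders, not the actual classes of this repository. Scrapy has no built-in "database" switch, so the sketch assumes the database is written to by an item pipeline.

```python
# settings.py (fragment, illustrative only)

# Logging: set LOG_ENABLED to False to silence Scrapy's log output.
LOG_ENABLED = True
LOG_LEVEL = "INFO"

# Item pipelines: an entry is active while it is listed here.
# The integer sets the run order (lower runs first).
ITEM_PIPELINES = {
    # Hypothetical pipeline that writes items to a database.
    "myproject.pipelines.DatabasePipeline": 300,
    # Comment an entry out to switch it off:
    # "myproject.pipelines.JsonExportPipeline": 800,
}

# Downloader middlewares: a value of None disables a middleware,
# including the ones Scrapy enables by default.
DOWNLOADER_MIDDLEWARES = {
    # Hypothetical custom middleware, enabled.
    "myproject.middlewares.CustomDownloaderMiddleware": 543,
    # Built-in middleware, disabled.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
}
```

Check the real module paths in your own pipelines.py and middlewares.py before copying anything from this fragment.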

Open the main.py script file and add a new line with cmdline.execute("scrapy crawl <spiderName>".split()), then comment out the other cmdline.execute(...) lines so that only your spider runs while you test it.
For everything else, read the official Scrapy docs.
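
The main.py step above can be sketched like this. It is an assumption about how such a file might be laid out, not a copy of the main.py in this repository; build_crawl_argv, run_spider, and the spider names are hypothetical. Since cmdline.execute() normally exits the process once the command finishes, only one crawl can run per script invocation, which is why the other lines are commented out.

```python
def build_crawl_argv(spider_name):
    """Return the argv list for `scrapy crawl <spiderName>`."""
    return f"scrapy crawl {spider_name}".split()


def run_spider(spider_name):
    """Start a crawl as if `scrapy crawl <spiderName>` were typed in a shell."""
    # Imported here so the module can be loaded without starting Scrapy.
    from scrapy import cmdline

    cmdline.execute(build_crawl_argv(spider_name))


# Keep exactly one call active while testing (spider names are placeholders).
# run_spider("some_other_spider")
# run_spider("my_spider")
```

Run the script from the directory that contains scrapy.cfg, so that Scrapy can find the project and its spiders.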
