David Ascienzo 76df7378f3 Updated README.md
2018-08-19 15:53:45 -04:00

phpBB Forum Scraper

Python-based scraper for phpBB forums.

The code requires two Python libraries:

  1. Scrapy, for crawling and scraping.

  2. BeautifulSoup, for HTML parsing.

Scraper Output

The scraper extracts the following fields from each forum post:

1. Username

2. User post count

3. Post date & time

4. Post text

5. Quoted text

Before running the crawler, edit phpBB.py and specify:

  1. allowed_domains

  2. start_urls

  3. username & password (used when the forum requires a login)

  4. forum_login=True if the forum requires a login, otherwise forum_login=False
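As a rough illustration, the four settings might be filled in as below. This is a minimal sketch, not the contents of phpBB.py: the domain, URL, and credentials are placeholders, and check_settings is a hypothetical helper added here only to catch a common mistake. Scrapy expects allowed_domains to hold bare domain names (no scheme or path), while start_urls holds full URLs.

```python
from urllib.parse import urlparse

# Placeholder values: replace them with the forum you want to scrape.
allowed_domains = ["forum.example.com"]   # domain only, no scheme or path
start_urls = ["https://forum.example.com/viewforum.php?f=1"]
username = "my_user"       # only relevant when forum_login is True
password = "my_password"
forum_login = False        # set to True if the forum requires a login


def check_settings(allowed_domains, start_urls):
    """Hypothetical helper: return start URLs whose host is not covered
    by any entry in allowed_domains (exact match or subdomain)."""
    uncovered = []
    for url in start_urls:
        host = urlparse(url).hostname or ""
        covered = any(
            host == domain or host.endswith("." + domain)
            for domain in allowed_domains
        )
        if not covered:
            uncovered.append(url)
    return uncovered


if __name__ == "__main__":
    # An empty list means every start URL falls inside allowed_domains.
    print(check_settings(allowed_domains, start_urls))
```

If a start URL falls outside allowed_domains, Scrapy's offsite filtering will typically drop the follow-up requests, so a quick check like this can save a crawl that silently returns nothing.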

Instructions:

From within the /phpBB_scraper/ directory, run one of the following:

scrapy crawl phpBB
    launches the crawler.

scrapy crawl phpBB -o posts.csv
    launches the crawler and saves the results to posts.csv.
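Once a posts.csv export exists, it can be inspected with the Python standard library. The sketch below counts scraped posts per user. The column name "Username" is an assumption, since the README does not list the exact CSV headers; Scrapy's CSV export writes a header row by default, so check the first line of your file and pass the matching name.

```python
import csv
from collections import Counter


def posts_per_user(csv_path, user_column="Username"):
    """Count rows per user in a CSV export.

    user_column is an assumed header name; adjust it to match your file.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)  # the header row supplies the field names
        if user_column not in (reader.fieldnames or []):
            raise KeyError(
                f"column {user_column!r} not found in {reader.fieldnames}"
            )
        return Counter(row[user_column] for row in reader)
```

For example, posts_per_user("posts.csv").most_common(5) would list the five users with the most scraped posts.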
