Web scraping, also called web harvesting or web data extraction, is a form of data scraping that extracts data from websites by using their HTML structure. In this post, I will explain the fundamentals of web scraping with Python and walk through a live demonstration using two Python libraries, BeautifulSoup and requests.
What you will learn from this post:
- basic understanding of web scraping
- how to extract data from a website using classes and HTML tags
- how to use the requests module to get data
- how to use BeautifulSoup
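As a rough preview of the points above, here is a minimal sketch of fetching a page with requests and extracting data by tag and class with BeautifulSoup. The sample HTML, the class names (`title`, `product`), and the helper names `fetch_html` and `extract_products` are assumptions made up for illustration; they are not taken from the repository's code.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical sample page, used so the example runs without a network call.
SAMPLE_HTML = """
<html><body>
  <h1 class="title">Example Store</h1>
  <ul>
    <li class="product">Keyboard</li>
    <li class="product">Mouse</li>
    <li class="note">Free shipping</li>
  </ul>
</body></html>
"""


def fetch_html(url, timeout=10):
    """Download a page with requests and return its HTML as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    return response.text


def extract_products(html):
    """Return the page heading and the text of every <li class="product">."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h1")  # first <h1> tag, or None if absent
    title = heading.get_text(strip=True) if heading else None
    # class_ (with underscore) filters by CSS class, since class is a keyword
    items = soup.find_all("li", class_="product")
    products = [item.get_text(strip=True) for item in items]
    return title, products


if __name__ == "__main__":
    title, products = extract_products(SAMPLE_HTML)
    print(title)
    print(products)
```

Run as a script, this prints `Example Store` followed by `['Keyboard', 'Mouse']`. To try it on a live page, pass the result of `fetch_html(url)` to `extract_products` and adjust the tag and class names to match that page's HTML.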
Install required dependencies:
- clone or download the repository from here
- install the packages listed in the requirements.txt file:
pip install -r requirements.txt
How to run this code
- there are two source code files, one with a .py extension and one with an .ipynb extension
- to run the .py file, use this command in a terminal (the quotes are needed because the file name contains spaces): python3 "Web Scraping with BeautifulSoup.py"
- to run the Scraping with BeautifulSoup.ipynb file, open it in Jupyter Notebook
- Jupyter Notebook can be installed with this command: pip3 install jupyter
- a CLI scraping tool is under development; only a beta version is available for now