TechieYan Technologies

Web Scraping Bot Using Python

Abstract

In this article, we’ll find out how to build a Python web scraping bot.

Web Scraping is a web site data extraction process. A robot is a piece of code which will automate the task. As a result, A Web Scraping Bot is a program that will automatically scratch a website for data, depending on our requirements.

web scraping

Code Description & Execution

Programming Methodology

In this code, We Will import three modules one is time to update the content which the user is creating a web scraping bot. Now, from selenium module we will import web driver to get chrome driver details as we gave a path to it then according to the updates it will keep updating by it’s own. To get it the user should open google chrome and check out which version is installed in it. According to the version, download the chrome drivers and then go search for something and click on left button there you get an inspect option click on it as the user click on it a dialog box appears on the right side and now the cursor will be at X-Path so just click left button on your mouse and then copy the whole  X-path then Copy in the code or  simply run the code it will run and now by using for loop we run the scraping bot again and again if any exceptions occurs to avoid it we use try and except so  that it can easily be updated if any updates occurs in that website.

How to Execute?

Note: Make sure you have added path while installing the software’s.

http://techieyantechnologies.com/2022/07/how-to-install-anaconda/

http://techieyantechnologies.com/2022/06/get-started-with-creating-new-environment-in-anaconda-configuring-jupyter-notebook-and-installing-libraries-using-requirements-txt-2/

  1. Install the prerequisites/software’s required to execute the code from reading the above blog which is provided in the link above.
  2. Press windows key and type in anaconda prompt a terminal opens up.
  3. Go to the directory where your requirement.txt file is present, not just requirement.txt, if you want to execute any .py or .ipynb files, you need to go to that specific folder or path, where they are saved.
sending emails

4.<<directory of your file:>>. E.g., If my file is in d drive, then

5.Type d:

  1. cd d:\License-Plate-Recognition-main    #CHANGE PATH AS PER YOUR PROJECT, THIS IS JUST AN EXAMPLE

          7. If your project is in c drive, you can ignore step 4 and go with step 5

          8. Run pip install -r requirements.txt or conda install requirements.txt (Requirements.txt is a text file consisting of all the necessary libraries required for executing this python file. If it gives any error while installing libraries, you might need to install them individually.), example: pip install “module_name” i.e., pip install pandas

Binary Search using python execution

      9. If you would like to run .ipynb file, Please follow the link to setup and open jupyter notebook, You will be redirected to the local server there you can select which ever .ipynb file you’d like to run and click on it and execute each cell one by one by pressing shift+enter.

Note: There are 4 different files each seeves different purpose such as,

  • ipynb consists of all the data cleaning steps, which are necessary to build a clean and efficient model.
  • ipynb consist of major steps and exploratory data analysis which allow us to understand more about the data and behavior of it.
  • ipynb consists of data reduction/dimensionality reduction techniques such as Sequential feature selector method to reduce the dimensions in the data and compare the model scores before and after dimensionality reduction.
  • ipynb consists of combination of main.ipynb and variable_selection.ipynb to make it more clear and understable for the audience.

Please follow the above sequence if you would like to execute and the files require good system requirements to run.

Results

Timer

Issues Faced

  1. We might face an issue while installing specific libraries, in this case, you might need to install the libraires manually. Example: pip install “module_name/library” i.e., pip install wikipedia
  2. Make sure you have the latest or specific version of python, since sometimes it might cause version mismatch.
  3. Adding path to environment variables in order to run python files and anaconda environment in code editor, specifically in any code editor.
  4. Make sure to change the paths in the codeaccordingly where your music file and VS code is saved.

Refer to the Below links to get more details on installing python and anaconda and how to configure it.

http://techieyantechnologies.com/2022/07/how-to-install-anaconda/

http://techieyantechnologies.com/2022/06/get-started-with-creating-new-environment-in-anaconda-configuring-jupyter-notebook-and-installing-libraries-using-requirements-txt-2/

All the required data has been provided over here. Please feel free to contact me for model weights and if you face any issues.

http://techieyantechnologies.com/contact/

Yes, you now have more knowledge than yesterday, Keep Going.

Click Here To Download This Code And Associated File.

+91 7075575787