Handling URL Errors in Web Scraping

In our previous blog, we looked at how to handle HTTP errors by importing HTTPError. This post explains how to handle URL errors, which occur when a URL is mistyped or the server cannot be found.

We will start our program by importing URLError, which lives in the urllib.error module:

from urllib.error import URLError

We also need the urlopen function, which lives in the urllib.request module:

from urllib.request import urlopen

Now we’ll use try and except statements to see whether an error occurs when urlopen tries to access the (deliberately mistyped) URL "https://dos.python.org/3/tutoal/index.html":

try:
    page = urlopen("https://dos.python.org/3/tutoal/index.html")
except URLError as error:
    print(error)
else:
    print('No error detected')

If there’s no error, the output will be “No error detected.” If the urlopen function is unable to access that URL, it raises a URLError, which the except block prints:

<urlopen error [Errno 11001] getaddrinfo failed>
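The URLError exception stores the underlying cause of the failure in its reason attribute, which is what appears inside the angle brackets above. As a small sketch (the reason string here is constructed by hand for illustration, rather than produced by a real network failure):

```python
from urllib.error import URLError

# URLError wraps the underlying cause of the failure in its .reason attribute.
# We construct one directly here just to inspect its shape.
err = URLError("getaddrinfo failed")

print(err.reason)  # the underlying cause: getaddrinfo failed
print(err)         # the full message: <urlopen error getaddrinfo failed>
```

In a real run, reason may be a string or a wrapped socket exception, so it is safest to print it rather than compare it against an exact value.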

Complete Code

from urllib.request import urlopen
from urllib.error import URLError
try:
    page = urlopen("https://dos.python.org/3/tutoal/index.html")
except URLError as error:
    print(error)
else:
    print('No error detected')
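In practice you will often want to handle both the HTTP errors from the earlier post and the URL errors from this one. Because HTTPError is a subclass of URLError, the HTTPError branch must come first, or the URLError branch would catch everything. A minimal sketch, assuming a hypothetical fetch helper (the .invalid hostname is used only to guarantee a DNS failure for demonstration):

```python
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

def fetch(url):
    """Return the page body as bytes, or None if the URL cannot be fetched.

    HTTPError is a subclass of URLError, so it must be caught first;
    otherwise the URLError branch would swallow HTTP errors as well.
    """
    try:
        page = urlopen(url)
    except HTTPError as error:   # the server responded with an error status
        print("HTTP error:", error.code)
        return None
    except URLError as error:    # server not found, DNS failure, etc.
        print("URL error:", error.reason)
        return None
    return page.read()

# A .invalid hostname never resolves, so this takes the URLError branch.
fetch("https://nonexistent.invalid/")
```

Ordering the except clauses this way is the standard idiom in the urllib documentation: test for the more specific exception first.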

