- First, we import both libraries into our program.
- Then we fetch the HTML code of the desired webpage using the urllib2.urlopen() function (urllib.request.urlopen() in Python 3).
- Next, we parse the HTML code using the BeautifulSoup() function.
- Once we have the BeautifulSoup object ready, it is easy to fetch all the links. Every link is written inside an anchor tag, <a>.
- To get all the anchor tags, we use the function find_all('a'), which comes with BeautifulSoup. This gives us a list of all the anchor tags.
- But we don't want the whole anchor tag. We only need the URLs. For that we call get('href') on each tag, which is also part of BeautifulSoup. It returns the URL stored in that tag's href attribute, or None if the tag has no href.
- Finally, we collect all such URLs on the page into a list and we are done.
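
The steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation. It assumes Python 3, so it uses urllib.request.urlopen in place of urllib2.urlopen, and it assumes the built-in "html.parser" backend for BeautifulSoup. The names extract_links, fetch_links and sample_html are hypothetical and chosen only for this example.

```python
# Assumption: Python 3, where urllib2.urlopen lives at urllib.request.urlopen.
from urllib.request import urlopen

from bs4 import BeautifulSoup


def extract_links(html):
    """Return the href value of every anchor tag that has one."""
    # Assumption: the standard-library "html.parser" backend is good enough here.
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a"):
        href = anchor.get("href")  # None when the tag has no href attribute
        if href is not None:
            links.append(href)
    return links


def fetch_links(url):
    """Download a page and return the URLs found in its anchor tags."""
    with urlopen(url) as response:
        html = response.read()
    return extract_links(html)


# Small inline page so the parsing step can be tried without network access.
sample_html = (
    '<p><a href="https://example.com/a">A</a> '
    '<a name="top">no href</a> '
    '<a href="/b">B</a></p>'
)
print(extract_links(sample_html))  # ['https://example.com/a', '/b']
```

Parsing is kept separate from downloading so that the link extraction can be checked on a plain string. Note that relative links such as /b are returned exactly as written in the page; resolving them against the page URL would be an extra step.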