[Python Patterns] A very simple web scraper
Using BeautifulSoup4 to pull stories from Hacker News.
I use this script as a basis for building other web scrapers with BeautifulSoup4.
This script requests the Hacker News front page and prints the title and link of each story. I imagine you'd swap in a different URL and different selectors to adapt it for other sites.
# A very simple webscraper
import requests
from bs4 import BeautifulSoup as bs
url = "https://news.ycombinator.com/"
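response = requests.get(url, timeout=10)  # without a timeout, a stalled request can hang forever
response.raise_for_status()  # stop early on HTTP errors instead of parsing an error page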
html = response.text
soup = bs(html, 'html.parser')
articles = soup.find_all('tr', {'class': 'athing'})
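for article in articles:
    # The title anchor sits inside span.titleline in the current markup;
    # older Hacker News pages used a.storylink, so fall back to that.
    link = article.select_one('span.titleline > a') or article.find('a', {'class': 'storylink'})
    if link is None:
        continue  # skip rows that have no story link
    article_title = link.text
    print(article_title)
    print("--------")
    article_link = link.get('href')
    print(article_link)
    print("++++++++")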
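When adapting this script for other sites, it can help to put the parsing into a function that takes an HTML string, so the selectors can be tried against saved markup without hitting the network. Here is a minimal sketch of that idea. The names `extract_stories`, `SAMPLE_HTML`, and `base_url` are hypothetical, and the sample markup is an assumption that only loosely mirrors a Hacker News story row.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical sample markup, loosely modeled on a Hacker News story row.
SAMPLE_HTML = """
<table>
  <tr class="athing"><td class="title"><span class="titleline">
    <a href="https://example.com/a">First story</a></span></td></tr>
  <tr class="athing"><td class="title"><span class="titleline">
    <a href="item?id=1">Second story</a></span></td></tr>
  <tr class="athing"><td class="title">No link here</td></tr>
</table>
"""


def extract_stories(html, base_url="https://news.ycombinator.com/"):
    """Return a list of (title, absolute_link) tuples found in html."""
    soup = BeautifulSoup(html, "html.parser")
    stories = []
    for row in soup.find_all("tr", class_="athing"):
        link = row.select_one("span.titleline > a")
        if link is None or not link.get("href"):
            continue  # skip rows that do not contain a story link
        # Relative links such as "item?id=1" are resolved against base_url.
        stories.append((link.get_text(strip=True), urljoin(base_url, link["href"])))
    return stories


if __name__ == "__main__":
    for title, link in extract_stories(SAMPLE_HTML):
        print(title, link)
```

Returning tuples instead of printing inside the loop keeps the parsing separate from the output, so the same function could feed a CSV writer or a database insert later.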
My blog posts tagged with "Python Patterns" are designed to be a quick reference for Python code snippets I use a lot. They are written as starting points for future projects so I do not need to type as much.