Beautiful Soup
Searching
The find() method of a BeautifulSoup object returns the first element that matches the criteria passed as arguments. If no element matches, it returns None.
By ID
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.find(id = 'nm')
print (obj)
By Class
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.find(attrs={"class": "mainmenu"})
print (obj)
By Attributes
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.find(attrs={"type":'text'})
print (obj)
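The three find() lookups above can be tried without an index.html file by parsing an inline string. This is a minimal sketch: the markup below is a hypothetical stand-in for the page, and the variable names are chosen only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for index.html (an assumption, not the real page)
html = """
<ul class="mainmenu"><li>Home</li></ul>
<form>
  <input type="text" id="nm" name="name">
  <input type="text" id="age" name="age">
  <input type="text" id="marks" name="marks">
</form>
"""

soup = BeautifulSoup(html, "html.parser")

by_id = soup.find(id="nm")                    # first tag whose id is "nm"
by_class = soup.find(class_="mainmenu")       # class_ avoids the Python keyword "class"
by_attr = soup.find(attrs={"type": "text"})   # first tag with type="text"
missing = soup.find(id="does-not-exist")      # no match, so find() returns None

print(by_id["name"])    # name
print(by_class.name)    # ul
print(by_attr["id"])    # nm
print(missing)          # None
```

Because find() returns None when nothing matches, it is worth checking the result before indexing into it.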
The find_all() method accepts the same filter arguments as find(), but it returns a list of all matching elements rather than only the first. Because an id is normally unique within an HTML document, find() is usually preferable to find_all() when searching by id.
By ID
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.find_all(id = 'nm')
print (obj)
By Class
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.find_all(attrs={"class": "mainmenu"})
print (obj)
By Attributes
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.find_all(attrs={"type":'text'})
print (obj)
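The difference between find() and find_all() is easiest to see side by side. The sketch below again uses a hypothetical inline document in place of index.html; the tag ids are assumptions made for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for index.html
html = """
<form>
  <input type="text" id="nm" name="name">
  <input type="text" id="age" name="age">
  <input type="text" id="marks" name="marks">
</form>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() collects every match into a list-like result
text_inputs = soup.find_all(attrs={"type": "text"})
ids = [tag["id"] for tag in text_inputs]

# limit caps the number of matches returned
first_two = soup.find_all(attrs={"type": "text"}, limit=2)

# With no match, find_all() returns an empty list, not None
no_match = soup.find_all(id="does-not-exist")

# find() gives the same tag as the first item of find_all()
first = soup.find(attrs={"type": "text"})

print(ids)                      # ['nm', 'age', 'marks']
print(len(first_two))           # 2
print(len(no_match))            # 0
print(first == text_inputs[0])  # True
```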
The select() method of the BeautifulSoup class accepts a CSS selector as its argument. In CSS, the # symbol selects by id, so passing # followed by the required id value to select() matches the element with that id. Like find_all(), it returns a list of all matching elements.
By ID
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.select("#nm")
print (obj)
By Class
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.select(".heading")
print (obj)
By Attributes
The select() method can also match on attributes. The attribute name is written inside square brackets, as in a CSS attribute selector. It returns a list of all tags that have the given attribute.
In the following code, the select() method returns all the tags that have a type attribute.
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.select("[type]")
print (obj)
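The id, class, and attribute selectors shown for select() can be compared in one self-contained run. This is only a sketch: the inline markup is a hypothetical substitute for index.html, and the last selector (a tag name combined with an attribute value) is an extra illustration of standard CSS syntax.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for index.html
html = """
<h1 class="heading">Registration</h1>
<form>
  <input type="text" id="nm" name="name">
  <input type="text" id="age" name="age">
  <input type="submit" id="go" value="Send">
</form>
<h2 class="heading">Notes</h2>
"""

soup = BeautifulSoup(html, "html.parser")

by_id = soup.select("#nm")                      # id selector
by_class = soup.select(".heading")              # class selector
has_type = soup.select("[type]")                # any tag carrying a type attribute
text_only = soup.select("input[type='text']")   # attribute with a specific value

print(len(by_id))                       # 1
print([tag.name for tag in by_class])   # ['h1', 'h2']
print(len(has_type))                    # 3
print([tag["id"] for tag in text_only]) # ['nm', 'age']
```

Even when only one element matches, as with the id selector, select() still returns a list.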
Like the find_all() method, the select() method returns a list. There is also a select_one() method, which returns only the first tag that matches the given selector.
By ID
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.select_one("#nm")
print (obj)
By Class
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.select_one(".heading")
print (obj)
By Attributes
The select_one() method also accepts attribute selectors written in square brackets. It returns the first tag that matches, rather than a list.
In the following code, the select_one() method returns the first tag whose name attribute is marks.
from bs4 import BeautifulSoup
fp = open("index.html")
soup = BeautifulSoup(fp, 'html.parser')
obj = soup.select_one("[name='marks']")
print (obj)
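As a quick check of how select_one() relates to select(), the following sketch runs both against a hypothetical inline document (an assumption standing in for index.html).

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for index.html
html = """
<h1 class="heading">Registration</h1>
<form>
  <input type="text" id="nm" name="name">
  <input type="text" id="marks" name="marks">
</form>
<h2 class="heading">Notes</h2>
"""

soup = BeautifulSoup(html, "html.parser")

first_heading = soup.select_one(".heading")      # first of the two headings
marks_input = soup.select_one("[name='marks']")  # attribute selector with a value
missing = soup.select_one("#does-not-exist")     # no match, so the result is None

all_headings = soup.select(".heading")

print(first_heading.get_text())           # Registration
print(marks_input["id"])                  # marks
print(missing)                            # None
print(first_heading == all_headings[0])   # True
```

As with find(), a None result means nothing matched, so it should be checked before accessing attributes.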