Extended Python solution:
sort_html_by_date.py:
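from bs4 import BeautifulSoup
from datetime import datetime

with open('input.html') as html_doc:  # replace with your actual HTML file name
    soup = BeautifulSoup(html_doc, 'lxml')

# Map each date to the markup of its heading plus the list that follows it.
divs = {}
for div in soup.find_all('div', 'date'):
    # '%B' expects a full month name; use '%b' if the input abbreviates months.
    divs[datetime.strptime(div.string, '%a %B %d %Y')] = \
        str(div) + '\n' + div.find_next_sibling('ul').prettify()

# Rebuild the body with the newest date first.
soup.body.clear()
for el in sorted(divs, reverse=True):
    soup.body.append(divs[el])

# formatter=None keeps the appended markup strings unescaped on output.
print(soup.prettify(formatter=None))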
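The ordering relies on turning each heading into a datetime and sorting those keys in reverse. A minimal sketch of just that step, using only the standard library: the `headings` list and the `DATE_FORMAT` name are made up for illustration, and the weekday and month names are assumed to be in English (strptime is locale-dependent).

```python
from datetime import datetime

# Same format string as in the script: weekday, month name, day, year.
DATE_FORMAT = '%a %B %d %Y'

# Hypothetical heading texts, in arbitrary order.
headings = ['Wed May 23 2018', 'Fri May 25 2018', 'Thu May 24 2018']

# Key each heading by its parsed date, as the script does with its dict.
parsed = {datetime.strptime(text, DATE_FORMAT): text for text in headings}

# Iterating over the sorted keys in reverse yields the newest date first.
newest_first = [parsed[key] for key in sorted(parsed, reverse=True)]
print(newest_first)
```

Because the dates are dictionary keys, two headings with the same date would collide and only the last one would be kept.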
Usage:
python sort_html_by_date.py
Output:
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<div class="date">Fri May 25 2018</div>
<ul>
<li>
Modify the website according to GDPR
</li>
<li>
Watch YouTube
</li>
</ul>
<div class="date">Thu May 24 2018</div>
<ul>
<li>
Solve the world's hunger problem
<ul>
<li>
Don't tell anyone
</li>
</ul>
</li>
<li>
Get something to wear
</li>
</ul>
<div class="date">Wed May 23 2018</div>
<ul>
<li>
Do laundry
<ul>
<li>
Get coins
</li>
</ul>
</li>
<li>
Wash the dishes
</li>
</ul>
</body>
</html>
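A possible variation, sketched here under a few assumptions, moves the Tag objects themselves instead of appending their markup as strings, so `formatter=None` is not needed and headings that share a date are all kept. The `sort_by_date` function and the inline `HTML` sample are hypothetical, and the sketch uses the built-in `html.parser` rather than `lxml` so it runs without the extra dependency.

```python
from datetime import datetime
from bs4 import BeautifulSoup

# Hypothetical sample input; the real script reads input.html instead.
HTML = """<html><body>
<div class="date">Wed May 23 2018</div>
<ul><li>Do laundry</li></ul>
<div class="date">Fri May 25 2018</div>
<ul><li>Watch YouTube</li></ul>
</body></html>"""


def sort_by_date(html):
    """Return a soup whose date headings and lists are ordered newest first."""
    soup = BeautifulSoup(html, 'html.parser')
    pairs = []
    for div in soup.find_all('div', 'date'):
        when = datetime.strptime(div.get_text(strip=True), '%a %B %d %Y')
        # Look up the sibling list now, before the body is emptied.
        pairs.append((when, div, div.find_next_sibling('ul')))
    # clear() detaches the children; the collected tags remain usable.
    soup.body.clear()
    for _, div, ul in sorted(pairs, key=lambda pair: pair[0], reverse=True):
        soup.body.append(div)
        soup.body.append(ul)
    return soup


sorted_soup = sort_by_date(HTML)
dates = [d.get_text() for d in sorted_soup.find_all('div', 'date')]
items = [li.get_text() for li in sorted_soup.find_all('li')]
print(sorted_soup.prettify())
```

Sorting on the date alone (the `key` argument) avoids comparing Tag objects when two headings carry the same date.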
Modules used: