Jsoup HTML parser – Tutorial & examples

I had heard a lot about Jsoup, and I finally had the chance to use it on one of my projects. This is an introductory tutorial on the Jsoup HTML parser.

What is Jsoup?

Jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.

With Jsoup we are able to:

  • Scrape and parse HTML from a URL, file, or string
  • Find and extract data, using DOM traversal or CSS selectors
  • Manipulate the HTML elements, attributes and text
  • Clean user-submitted content against a safe white-list, to prevent XSS attacks
  • Output tidy HTML
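
To make the list above a bit more concrete, here is a minimal sketch that touches several of these capabilities on an in-memory HTML string. The class name `JsoupSketch`, its helper methods, the sample markup and the `intro` class name are all made up for illustration. The sketch also assumes a recent Jsoup release on the classpath, where the sanitizing rules live in `Safelist`; older releases call the same class `Whitelist`. Fetching from a URL is left out so the example runs offline.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.safety.Safelist;

import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the name and helpers are not part of Jsoup.
public class JsoupSketch {

    // Parse an HTML string and collect the href of every link, using a CSS selector.
    static List<String> extractLinks(String html) {
        Document doc = Jsoup.parse(html);
        List<String> hrefs = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            hrefs.add(link.attr("href"));
        }
        return hrefs;
    }

    // Manipulate the document: add a CSS class to every <p>, then return the body markup.
    static String addClassToParagraphs(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("p").addClass("intro"); // "intro" is an arbitrary example class
        return doc.body().html();
    }

    // Clean untrusted markup against a basic safe-list (assumes Jsoup 1.14.2 or later).
    static String sanitize(String untrusted) {
        return Jsoup.clean(untrusted, Safelist.basic());
    }

    public static void main(String[] args) {
        // Sample markup invented for this sketch.
        String html = "<html><head><title>Demo</title></head><body>"
                + "<p>Hello <a href=\"https://example.com\">link</a></p>"
                + "</body></html>";

        System.out.println(Jsoup.parse(html).title());
        System.out.println(extractLinks(html));
        System.out.println(addClassToParagraphs(html));
        System.out.println(sanitize("<p>Hi<script>alert(1)</script></p>"));
    }
}
```

For a live page, `Jsoup.connect(url).get()` returns the same kind of `Document`, so the helpers above would work unchanged once they are given a parsed document instead of a string.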