If you want to use an up-to-date version of this algorithm, check out this newer project:
https://github.com/andreskrey/readability.php
The PHP port of Readability.js by Arc90.
- PHP Version >= 5
- PHP built with the DOM (Document Object Model) extension
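
The two requirements above can be checked at runtime before loading the library. The following is a minimal sketch; the function name `readabilityMissingRequirements` is hypothetical and not part of this project, and it relies only on PHP's built-in `version_compare()` and `extension_loaded()`.

```php
<?php
// Hypothetical helper (not part of the library): returns a list of
// unmet requirements, or an empty array when everything is available.
function readabilityMissingRequirements()
{
    $missing = array();

    // Requirement 1: PHP version 5 or newer.
    if (version_compare(PHP_VERSION, '5.0.0', '<')) {
        $missing[] = 'PHP >= 5';
    }

    // Requirement 2: PHP built with the DOM extension.
    if (!extension_loaded('dom')) {
        $missing[] = 'DOM extension';
    }

    return $missing;
}

$missing = readabilityMissingRequirements();
if ($missing) {
    echo "Missing requirements: " . implode(', ', $missing) . "\n";
} else {
    echo "All requirements met.\n";
}
```

Running a check like this first gives a readable message instead of a fatal error if the DOM classes are unavailable.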
http://graceco.de/readability/
require 'lib/Readability.inc.php';
$Readability = new Readability($html, $html_input_charset); // default charset is utf-8
$ReadabilityData = $Readability->getContent(); // throws an exception when no suitable content is found
// To see the other returned fields, run var_dump($ReadabilityData);
echo "<h1>".$ReadabilityData['title']."</h1>";
echo $ReadabilityData['content'];
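
The usage above echoes the title directly into the page. If the title may contain characters such as `&` or `<`, it is probably safer to escape it first. The sketch below assumes that `title` is plain text and that `content` is already HTML; neither is confirmed by the library, and the helper name `renderReadabilityData` is hypothetical. It uses only the `title` and `content` keys shown above.

```php
<?php
// Hypothetical helper (not part of the library): builds the output fragment
// from the array returned by getContent().
function renderReadabilityData(array $data)
{
    // Assumption: the title is plain text, so escape it before embedding it.
    $title = htmlspecialchars($data['title'], ENT_QUOTES, 'UTF-8');

    // Assumption: the content is already HTML, so it is passed through unchanged.
    return "<h1>" . $title . "</h1>\n" . $data['content'];
}

// Stand-in data with the same two keys used in the usage above.
$sample = array('title' => 'Tips & Tricks', 'content' => '<p>Body text.</p>');
echo renderReadabilityData($sample), "\n";
```

Since `getContent()` throws an exception when no suitable content is found, real code would likely wrap that call in a `try`/`catch` block and pass the result to a helper like this one only on success.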
PS: For a Node.js port, you can check this.