HtmlSource - a new DBO driver for CakePHP
OK, OK, I’ve been slacking on this blog again, but I will save that for another post, where I will announce some major changes I have been thinking about lately. For today, I’d like to introduce a new DBO source driver: HtmlSource. It is fully functional but still lacks some of the features I have planned for it.
So what’s an HTML DBO driver, you ask?
Simply put, it’s a way to treat any HTML page like a database and retrieve (scrape) certain parts of it using an SQL-like command:
SELECT href, title FROM a WHERE class="submit"
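
To make the query above concrete, here is a rough sketch of what it asks for: take every `a` tag, keep the ones whose `class` is `submit`, and return their `href` and `title` attributes. This is not HtmlSource's code (the driver itself is PHP, for CakePHP). It is a small Python illustration using only the standard library, and the `TagSelector` class, the `select` helper and the sample markup are all made up for the example. The WHERE match is assumed to be a plain exact-string comparison.

```python
from html.parser import HTMLParser


class TagSelector(HTMLParser):
    """Collect chosen attributes from every tag that passes a simple filter."""

    def __init__(self, tag, fields, where):
        super().__init__()
        self.tag = tag        # the FROM part, e.g. "a"
        self.fields = fields  # the SELECT part, e.g. ["href", "title"]
        self.where = where    # the WHERE part, e.g. {"class": "submit"}
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        attributes = dict(attrs)
        # Assumption: every WHERE condition is an exact string match.
        if all(attributes.get(k) == v for k, v in self.where.items()):
            # Attributes missing from the tag come back as None.
            self.rows.append({f: attributes.get(f) for f in self.fields})


def select(html, tag, fields, where):
    """Hypothetical helper: run one 'query' against an HTML string."""
    parser = TagSelector(tag, fields, where)
    parser.feed(html)
    parser.close()
    return parser.rows


SAMPLE_HTML = """
<a class="submit" href="/save" title="Save">Save</a>
<a class="cancel" href="/back" title="Back">Back</a>
<a class="submit" href="/send">Send</a>
"""

# Equivalent of: SELECT href, title FROM a WHERE class="submit"
rows = select(SAMPLE_HTML, "a", ["href", "title"], {"class": "submit"})
print(rows)
```

Each matching tag becomes one "row", with the selected attributes as its "columns", which is the mental model behind treating a page like a database table.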

