Posts tagged ‘scraping’
Prevent scripts from being killed
Getting back to shared-server problems: I have some very time-consuming scripts running through cron, mostly web scraping jobs. They are not processing-intensive, but they run slowly because the websites they scrape are slow. All these jobs are really hard to divide into separate scripts (another article), so one script should have no limits to [...]
HTML filtering and XSS protection
If you have been programming websites long enough, you know that user input is the first thing to worry about when thinking about security. It’s really hard to decide what data is acceptable, especially when the user has permission to insert HTML content through a form.
For example, if you are developing a CMS you need to make sure [...]
Scraping login requiring websites with cURL
Scraping websites with XPath is very easy (read here), but how do you scrape a user’s friends list from a social website if it can be viewed only when the user is logged in?
What we need to do is implement an algorithm that posts the login and password fields to the website’s login form and uses the same PHPSESSID session ID for [...]
Web scraping with PHP and XPath
When I was writing about how I use web scraping, I still hadn’t tried using XPath (shame on me). The sssscripting blog responded to my article with a very good and rich post about all sorts of different scraping techniques (with Ruby examples), and after reading this post on Kore Nordmann’s blog I finally decided [...]