<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Juozas devBlog &#187; security</title>
	<atom:link href="http://dev.juokaz.com/tag/security/feed" rel="self" type="application/rss+xml" />
	<link>http://dev.juokaz.com</link>
	<description>Random ideas, scripts and facts</description>
	<lastBuildDate>Mon, 22 Mar 2010 10:48:42 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>HTML filtering and XSS protection</title>
		<link>http://dev.juokaz.com/php/html-filtering-and-xss-protection</link>
		<comments>http://dev.juokaz.com/php/html-filtering-and-xss-protection#comments</comments>
		<pubDate>Sat, 21 Mar 2009 20:40:24 +0000</pubDate>
		<dc:creator>Juozas</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[Websites]]></category>
		<category><![CDATA[autoloader]]></category>
		<category><![CDATA[cleanup]]></category>
		<category><![CDATA[cron]]></category>
		<category><![CDATA[dom]]></category>
		<category><![CDATA[filter]]></category>
		<category><![CDATA[how-to]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[htmlpurifier]]></category>
		<category><![CDATA[library]]></category>
		<category><![CDATA[review]]></category>
		<category><![CDATA[scraping]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[tidy]]></category>
		<category><![CDATA[tinymce]]></category>
		<category><![CDATA[validate]]></category>
		<category><![CDATA[web scraper]]></category>
		<category><![CDATA[xss]]></category>

		<guid isPermaLink="false">http://dev.juokaz.com/?p=396</guid>
		<description><![CDATA[If you have been programming websites long enough, you know that user input is the first thing to worry about when thinking about security. It&#8217;s really hard to decide what data is acceptable, especially when a user has permission to insert HTML content through a form.
For example, if you are developing a CMS you need to make sure [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://htmlpurifier.org/"><img class="size-thumbnail wp-image-402" style="float: left;" title="HTML Purifier" src="http://dev.juokaz.com/wp-content/uploads/2009/03/logo-large-150x150.png" alt="HTML Purifier" width="150" height="150" /></a>If you have been programming websites long enough, you know that user input is the first thing to worry about when thinking about security. It&#8217;s really hard to decide what data is acceptable, especially when a user has permission to insert HTML content through a form.</p>
<p>For example, if you are developing a CMS you need to make sure that user input doesn&#8217;t break the whole template. That&#8217;s not so easy, because you need very clever HTML validation: even one missing closing tag for a &lt;div&gt; or &lt;p&gt; can break a website&#8217;s layout completely. Editors like <a href="http://tinymce.moxiecode.com/">TinyMCE</a> can check for errors and try to fix them, but in my experience they sometimes create more of them.</p>
<p>However, the problem can be solved, and quite easily. Almost a year ago I was reading some random blog when I found out about <a href="http://htmlpurifier.org/">HTML Purifier</a>. Basically, it&#8217;s a library which can filter and fix <strong>any</strong> HTML. <a href="http://htmlpurifier.org/comparison.html">Compared</a> to other libraries, it looked very promising, but until today I hadn&#8217;t had a chance to test it &#8211; other libraries had been working fine.</p>
<p>Today I was working with a <a href="http://dev.juokaz.com/php/web-scraping-with-php-and-xpath">web scraper</a> again and ended up stuck because of very badly formatted HTML. When regular expressions are used, code validity isn&#8217;t (or shouldn&#8217;t be) an issue at all, but XPath fails immediately. I tried simplifying the queries and hard-coding fixes for the source, but all that required so much effort that I introduced a Purifier filter between fetching the source and constructing the <a href="http://en.wikipedia.org/wiki/Document_Object_Model">DOM</a>. It worked!</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #b1b100;">require_once</span> <span style="color: #0000ff;">'HTMLPurifier.includes.php'</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$config</span> <span style="color: #339933;">=</span> HTMLPurifier_Config<span style="color: #339933;">::</span><span style="color: #004000;">createDefault</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$config</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">set</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'HTML'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'Doctype'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'XHTML 1.0 Transitional'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$config</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">set</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'HTML'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'TidyLevel'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'heavy'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// Don't remove IDs (&lt;div id=&quot;first&quot; /&gt;)</span>
<span style="color: #000088;">$config</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">set</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'Attr'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'EnableID'</span><span style="color: #339933;">,</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$obj</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HTMLPurifier<span style="color: #009900;">&#40;</span><span style="color: #000088;">$config</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$clean_html</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$obj</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">purify</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$html</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>In this example I chose the worst way &#8211; including all files. The library uses a <a href="http://pear.php.net">PEAR</a>-like directory structure, so a simple auto-loader can include the required files in the background, but for simplicity it&#8217;s not used here. This sample code filters the <em>$html</em> variable using the XHTML 1.0 Transitional doctype and does heavy-level <a href="http://en.wikipedia.org/wiki/HTML_Tidy">tidying</a> (which is quite clear from the source code itself).</p>
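<p>To show where such a filter sits in a scraper, here is a minimal sketch of the &#8220;clean first, then build the DOM and query it&#8221; step. The <em>extract_text()</em> helper and its <em>$cleaner</em> argument are hypothetical names made up for this illustration, not part of HTML Purifier; only DOMDocument and DOMXPath are standard PHP classes.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Hypothetical helper: $cleaner is any callable that returns repaired HTML,
// for example array($obj, 'purify') for an HTMLPurifier instance.
function extract_text($html, $query, $cleaner)
{
    // Step 1: repair the markup before any DOM work.
    $clean = call_user_func($cleaner, $html);

    // Step 2: build the DOM; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc-&gt;loadHTML($clean);
    libxml_clear_errors();

    // Step 3: run the XPath query and gather the text of every match.
    $xpath = new DOMXPath($doc);
    $results = array();
    foreach ($xpath-&gt;query($query) as $node) {
        $results[] = trim($node-&gt;textContent);
    }
    return $results;
}</pre></div></div>

<p>With the purifier object from the example above, it could be called as <em>extract_text($html, '//div[@id=&quot;first&quot;]/p', array($obj, 'purify'))</em>. Passing the cleaner in as a callable keeps the sketch independent of how the library is loaded.</p>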
<p><a href="http://en.wikipedia.org/wiki/Cross-site_scripting">XSS</a>? Purifier protects against it as well &#8211; see the <a href="http://htmlpurifier.org/live/smoketests/xssAttacks.php">full list</a> of tests. The library is also highly customizable (<a href="http://htmlpurifier.org/live/configdoc/plain.html">configuration manual</a>), but the documentation is not very clear &#8211; I spent more than an hour trying to make it return HTML with the <em>&lt;head&gt;</em> part. I haven&#8217;t found any nice solution (maybe because the library is not made for such things).</p>
<p>HTML Purifier contains about 350 files, so it&#8217;s a relatively big library; however, it performs well and shouldn&#8217;t kill your web server. Today I purified more than 1000 pages and extracted information from them using XPath, and it was really stable &#8211; none of the results were filtered unexpectedly. I definitely recommend it for filtering HTML input because it just does a wonderful job &#8211; you can try it online <a href="http://htmlpurifier.org/demo.php">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.juokaz.com/php/html-filtering-and-xss-protection/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Scraping login requiring websites with cURL</title>
		<link>http://dev.juokaz.com/php/scraping-login-requiring-websites-with-curl</link>
		<comments>http://dev.juokaz.com/php/scraping-login-requiring-websites-with-curl#comments</comments>
		<pubDate>Mon, 23 Feb 2009 20:50:19 +0000</pubDate>
		<dc:creator>Juozas</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[Websites]]></category>
		<category><![CDATA[crawling]]></category>
		<category><![CDATA[curl]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[hack]]></category>
		<category><![CDATA[login]]></category>
		<category><![CDATA[post]]></category>
		<category><![CDATA[rest]]></category>
		<category><![CDATA[scraping]]></category>
		<category><![CDATA[secure]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[soap]]></category>
		<category><![CDATA[ssl]]></category>
		<category><![CDATA[xml-rpc]]></category>
		<category><![CDATA[xpath]]></category>

		<guid isPermaLink="false">http://dev.juokaz.com/?p=245</guid>
		<description><![CDATA[Scraping websites with XPath is very easy (read here), but how do you scrape a user&#8217;s friends list from a social website if it can be viewed only when the user is logged in?
What we need to do is implement an algorithm which posts the login and password fields to the website&#8217;s login form and uses the same PHPSESSID id for [...]]]></description>
			<content:encoded><![CDATA[<p>Scraping websites with XPath is very easy (read <a href="http://dev.juokaz.com/php/web-scraping-with-php-and-xpath">here</a>), but how do you scrape a user&#8217;s friends list from a social website if it can be viewed only when the user is logged in?</p>
<p>What we need to do is implement an algorithm which posts the login and password fields to the website&#8217;s login form and uses the same PHPSESSID session id for further calls. For example, if the login form is POSTed with session id 123, then all later requests sent with session id 123 can access users-only pages. This works because PHP (or another server-side language) stores something like loggedin=true in the session data for the given session id.</p>
<p>But how are you going to handle all this work with cookies and session ids? Luckily, PHP has the <a href="http://uk.php.net/curl">cURL extension</a>, which simplifies connecting to remote addresses, using cookies, staying in one session, POSTing data, etc. It&#8217;s a really powerful library, which basically gives you access to all HTTP header functionality.</p>
<p>For crawling secure pages, I&#8217;ve created a very simple <a href="http://dev.juokaz.com/examples/crawler/crawler.phps">Secure_Crawler</a> class, which works like this:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #b1b100;">include</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;crawler.php&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$crawler</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Secure_Crawler<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// Login to website</span>
<span style="color: #000088;">$crawler</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">login</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'my_username'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'secure_password'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// Get Content</span>
<span style="color: #000088;">$content</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$crawler</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'http://www.example.com/secure/profile.php'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// modifications...</span></pre></div></div>

<p>If you look at the class source code, you will see that it works as follows:</p>
<ol>
<li>When a Secure_Crawler instance is created, default cURL options are set</li>
<li>The login() method POSTs the given credentials to the login page (<em>hard-coded</em>)</li>
<li>The get(url) method loads the page at the given URL (the previous session data is used)</li>
</ol>
<p>The class itself is very easily extensible &#8211; as long as you pass the cookies file to the cURL handle, the login information will (should) be used and all users-only content will be available.</p>
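<p>As a rough illustration of those three points, here is a minimal sketch of such a class. It is <strong>not</strong> the Secure_Crawler source linked above: the class name <em>Sketch_Crawler</em>, the constructor arguments and the <em>username</em>/<em>password</em> field names are assumptions made for this example. Only the cURL functions and CURLOPT_* constants are real.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Hypothetical sketch, not the Secure_Crawler class linked above.
class Sketch_Crawler
{
    private $curl;
    private $loginUrl;

    public function __construct($loginUrl, $cookieFile = null)
    {
        $this-&gt;loginUrl = $loginUrl;
        if ($cookieFile === null) {
            $cookieFile = tempnam(sys_get_temp_dir(), 'cookies');
        }

        // Default options shared by every request made with this handle.
        $this-&gt;curl = curl_init();
        curl_setopt_array($this-&gt;curl, array(
            CURLOPT_RETURNTRANSFER =&gt; true,        // return the body instead of printing it
            CURLOPT_FOLLOWLOCATION =&gt; true,        // follow the redirect after login
            CURLOPT_COOKIEFILE     =&gt; $cookieFile, // enable cookies and load stored ones
            CURLOPT_COOKIEJAR      =&gt; $cookieFile, // cookies are saved here when the handle is closed
        ));
    }

    public function login($username, $password)
    {
        // Field names are assumptions; use the names from the real login form.
        $fields = array('username' =&gt; $username, 'password' =&gt; $password);
        curl_setopt_array($this-&gt;curl, array(
            CURLOPT_URL        =&gt; $this-&gt;loginUrl,
            CURLOPT_POST       =&gt; true,
            CURLOPT_POSTFIELDS =&gt; http_build_query($fields),
        ));
        return curl_exec($this-&gt;curl);
    }

    public function get($url)
    {
        // Same handle, so the session cookie set during login() is sent again.
        curl_setopt_array($this-&gt;curl, array(
            CURLOPT_URL     =&gt; $url,
            CURLOPT_HTTPGET =&gt; true, // switch back from POST to GET
        ));
        return curl_exec($this-&gt;curl);
    }
}</pre></div></div>

<p>Reusing one cURL handle for both calls is what keeps the session alive within a single run; the cookie file only matters if you want the login to survive between runs.</p>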
<p>Using a similar class, I&#8217;ve pseudo-reverse-engineered an API. I needed to enter information into another website by hand multiple times per day because they didn&#8217;t offer any remote services (like XML-RPC or REST), so I created a class which mimics API functionality. From the outside it looks like a normal API object, but inside the code actually POSTs everything to the actual website.</p>
<p>All websites work differently, so you need to spend some time analysing how login form submission is handled on that specific website. For example, maybe you need to use SSL or even your own certificate. It differs from site to site, but the basics are the same &#8211; it will work as long as you are calm enough to tweak its work-flow.</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.juokaz.com/php/scraping-login-requiring-websites-with-curl/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>PayPal payment with encryption</title>
		<link>http://dev.juokaz.com/php/paypal-payment-with-encryption</link>
		<comments>http://dev.juokaz.com/php/paypal-payment-with-encryption#comments</comments>
		<pubDate>Sun, 22 Feb 2009 14:34:40 +0000</pubDate>
		<dc:creator>Juozas</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[Websites]]></category>
		<category><![CDATA[encryption]]></category>
		<category><![CDATA[ivor durham]]></category>
		<category><![CDATA[library]]></category>
		<category><![CDATA[payment gateway]]></category>
		<category><![CDATA[paypal]]></category>
		<category><![CDATA[phpfour.com]]></category>
		<category><![CDATA[private key]]></category>
		<category><![CDATA[rsa]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://dev.juokaz.com/?p=224</guid>
		<description><![CDATA[Recently phpfour.com posted a very interesting library for payment gateways. In my situation, PayPal is only used to pay for orders &#8211; cart and order setup is done in our shop, so I do not want additional problems with users changing order numbers, the amount to be paid, etc. Today I&#8217;m going to show how [...]]]></description>
			<content:encoded><![CDATA[<p>Recently phpfour.com posted a very interesting <a href="http://www.phpfour.com/blog/2009/02/php-payment-gateway-library-for-paypal-authorizenet-and-2checkout/">library for payment gateways</a>. In my situation, PayPal is only used to pay for orders &#8211; cart and order setup is done in our shop, so I do not want additional problems with users changing order numbers, the amount to be paid, etc. Today I&#8217;m going to show how to encrypt PayPal transactions.</p>
<p>I chose to use <a href="https://www.paypal.com/us/cgi-bin/webscr?cmd=p/xcl/rec/ewp-intro-outside">Encrypted Website Payment</a>, which allows you to encrypt all form fields and send them as one encrypted parameter. It uses public-key encryption: the button data is signed with your private key and encrypted with PayPal&#8217;s public certificate, so only PayPal can decrypt it (you need to upload your own certificate to your PayPal account so that PayPal can verify the signature).</p>
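<p>To make &#8220;all form fields as one parameter&#8221; a bit more concrete, here is a minimal sketch of the step that happens before signing and encryption. It assumes the plaintext is a list of name=value lines with the id of your uploaded certificate added as <em>cert_id</em>. <em>build_ewp_plaintext()</em> is a hypothetical helper written for this illustration, not a function of any library mentioned here, and the signing and encryption themselves are left to the library.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">// Hypothetical helper, not part of any library: flattens the button fields
// into the newline-separated name=value text that is then signed and encrypted.
function build_ewp_plaintext(array $parameters, $certificateId)
{
    // Assumption: cert_id tells PayPal which uploaded certificate to check against.
    $parameters['cert_id'] = $certificateId;

    $lines = array();
    foreach ($parameters as $name =&gt; $value) {
        // A line break inside a value would smuggle in an extra field.
        $value = str_replace(array("\r", "\n"), ' ', (string) $value);
        $lines[] = $name . '=' . $value;
    }
    return implode("\n", $lines);
}</pre></div></div>

<p>Because the amount and order number end up inside the encrypted blob instead of in plain hidden inputs, a user can no longer edit them in the browser before submitting the form.</p>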
<p>My recommended PHP library for creating such buttons is written by Ivor Durham and is available online <a href="http://www.pdncommunity.com/pdn/attachments/pdn/ewp/87/1/paypalewp.php">here</a>. It&#8217;s not as flexible as the phpfour.com one and is probably old, but it does what it needs to do. I have been using it for over a year now and haven&#8217;t had any problems (over some hundreds of payments).</p>
<p>To create an encrypted button you need to write something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$paypal</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> PayPalEWP<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$paypal</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">setTempFileDirectory</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'/tmp'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// Certificate and private key</span>
<span style="color: #000088;">$paypal</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">setCertificate</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'mycompany_cert.pem'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">'mycompany_key.pem'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// Uploaded certificate id</span>
<span style="color: #000088;">$paypal</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">setCertificateID</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'ABCDEFGHIJKL'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// PayPal certificate</span>
<span style="color: #000088;">$paypal</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">setPayPalCertificate</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'paypal_cert_sandbox.pem'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$parameters</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;cmd&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;_xclick&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #0000ff;">&quot;business&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;sales@mycompany.com&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #0000ff;">&quot;item_name&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;Order #ID&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #0000ff;">&quot;amount&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;12.95&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #0000ff;">&quot;no_shipping&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;1&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #0000ff;">&quot;return&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;http://mycompany.com/paypal_ok.php&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #0000ff;">&quot;cancel_return&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;http://mycompany.com/paypal_cancel.php&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #0000ff;">&quot;no_note&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;1&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #0000ff;">&quot;currency_code&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;USD&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #0000ff;">&quot;bn&quot;</span> <span style="color: #339933;">=&gt;</span> <span style="color: #0000ff;">&quot;PP-BuyNowBF&quot;</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$encryptedButton</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$paypal</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">encryptButton</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$parameters</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #b1b100;">echo</span> <span style="color: #0000cc; font-style: italic;">&lt;&lt;&lt;END_HTML
&lt;form action=&quot;https://www.sandbox.paypal.com/cgi-bin/webscr&quot;
method=&quot;post&quot;&gt;
&nbsp;
&lt;input type=&quot;hidden&quot; name=&quot;cmd&quot; value=&quot;_s-xclick&quot;&gt;
&lt;input type=&quot;image&quot;
src=&quot;https://www.sandbox.paypal.com/en_US/i/btn/x-click-but23.gif&quot;
border=&quot;0&quot; name=&quot;submit&quot; alt=&quot;Make payments with PayPal&quot;&gt;
&lt;input type=&quot;hidden&quot; name=&quot;encrypted&quot; value=&quot;
-----BEGIN PKCS7-----
{$encryptedButton}
-----END PKCS7-----
&quot;&gt;
&lt;/form&gt;
END</span>_HTML<span style="color: #339933;">;</span></pre></div></div>

<p>I have customized it a little to be modular (I use over 5 different payment gateways), but the main concepts stayed the same. If you are just starting a PayPal integration, I recommend rewriting it to be more object-oriented and maybe integrating payment validation (which works the same as for normal payments).</p>
<p>PayPal has a <a href="https://cms.paypal.com/us/cgi-bin/?cmd=_render-content&amp;content_ID=developer/howto_testing_sandbox">sandbox</a> mode and a big manuals <a href="https://cms.paypal.com/us/cgi-bin/?cmd=_render-content&amp;content_ID=developer/e_howto_html_encryptedwebpayments">library</a> &#8211; testing the PayPal gateway is very easy and shouldn&#8217;t be a problem. I definitely recommend creating sandbox users (merchant and buyer) and playing with virtual money &#8211; it not only lets you test the gateway&#8217;s functionality, but it also feels very good to have an unlimited amount of money.</p>
<p>To finish, I recommend using encrypted PayPal buttons &#8211; additional security never hurts. How do you handle payments?</p>
]]></content:encoded>
			<wfw:commentRss>http://dev.juokaz.com/php/paypal-payment-with-encryption/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
