<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>shay.co blog</title>
	<atom:link href="http://blog.shay.co/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.shay.co</link>
	<description>Thoughts, ideas and general knowledge</description>
	<lastBuildDate>Sat, 02 Feb 2013 15:50:01 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>My Favorite Development Tools</title>
		<link>http://blog.shay.co/my-favorite-development-tools/</link>
		<comments>http://blog.shay.co/my-favorite-development-tools/#comments</comments>
		<pubDate>Thu, 31 May 2012 10:42:05 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=825</guid>
		<description><![CDATA[<p>This post is a part of a post chain started by a colleague and a friend of mine &#8211; <a href="http://www.idosius.com/posts/my-favorite-web-dev-things/" target="_blank">Ido Schacham</a>. The purpose is to share our favorite development tools.<br /> The rules are simple:</p> Copy-paste the rules. List your favorite tools, libraries, and services for web development. List obvious ones as well, they [...]]]></description>
				<content:encoded><![CDATA[<p>This post is a part of a post chain started by a colleague and a friend of mine &#8211; <a href="http://www.idosius.com/posts/my-favorite-web-dev-things/" target="_blank">Ido Schacham</a>. The purpose is to share our favorite development tools.<br />
The rules are simple:</p>
<ol>
<li>Copy-paste the rules.</li>
<li>List your favorite tools, libraries, and services for web development.</li>
<li>List obvious ones as well, they may not be obvious to others.</li>
<li>Tag other web dev bloggers and let them know.</li>
<li>Link back to the post that tagged you.</li>
</ol>
<h2>My Favorites</h2>
<p>Below are my favorite development tools:</p>
<ul>
<li><strong><a href="http://html5boilerplate.com/" target="_blank">H5BP</a></strong> (HTML5 Boilerplate) &#8211; An awesome front-end base template that provides the tools to build a web application with modern technologies.</li>
<li><strong><a href="http://twitter.github.com/bootstrap/" target="_blank">Twitter Bootstrap</a></strong> &#8211; An amazing set of HTML, CSS and JavaScript for rich UI, easily customizable for your project.</li>
<li><strong><a href="http://www.initializr.com/" target="_blank">Initializr</a></strong> &#8211; Allows you to easily create your favorite mixture of H5BP, Bootstrap and some other libraries and services.</li>
<li><strong><a href="http://jquery.com/" target="_blank">jQuery</a></strong> &#8211; This one is pretty obvious: it dramatically simplifies HTML and CSS manipulation, Ajax interactions, event handling, animations, etc.</li>
<li><strong><a href="http://lesscss.org/" target="_blank">LESS</a></strong> &#8211; An extension of CSS, which adds variables, mixins, operators and some more valuable tools. It compiles to normal CSS using a JavaScript compiler.</li>
<li><strong><a href="http://lithify.me/" target="_blank">Lithium</a></strong> &#8211; My favorite PHP framework, based on PHP 5.3, extensive yet lightweight and well tested.</li>
<li><strong><a href="https://developer.mozilla.org/en/" target="_blank">MDN</a></strong> (Mozilla Developer Network) &#8211; The best documentation available for web technologies, I mainly use it for JavaScript.</li>
<li><strong><a href="http://nodejs.org/" target="_blank">Node.js</a></strong> &#8211; JavaScript on the server side, extremely fast and scalable, just check it out!</li>
<li><strong><a href="https://github.com/ded/R2" target="_blank">R2</a></strong> &#8211; CSS LTR &lt;=&gt; RTL converter written in JavaScript. It actually works!</li>
<li><strong><a href="http://humanstxt.org/" target="_blank">Humans.txt</a></strong> &#8211; This isn&#8217;t really a development tool, but a nice way to leave your mark on your recent project.</li>
</ul>
<h3>Tags</h3>
<p><a href="http://yehudakatz.com/" target="_blank">Yehuda Katz</a>, <a href="http://blog.phpdeveloper.org/" target="_blank">Chris Cornutt</a>, <a href="http://gonzalo123.wordpress.com/" target="_blank">Gonzalo Ayuso</a>, <a href="http://www.phpied.com/" target="_blank">Stoyan Stefanov</a>, <a href="http://philsturgeon.co.uk/" target="_blank">Phil Sturgeon</a><br />
Over to you!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/my-favorite-development-tools/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hash Collision Probability</title>
		<link>http://blog.shay.co/hash-collision-probability/</link>
		<comments>http://blog.shay.co/hash-collision-probability/#comments</comments>
		<pubDate>Wed, 28 Mar 2012 16:39:01 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Cryptography]]></category>
		<category><![CDATA[Hash functions]]></category>
		<category><![CDATA[Math]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=644</guid>
		<description><![CDATA[<p>One of the problems of hash functions is collisions, they cause some security vulnerabilities which won&#8217;t be discussed in this article.<br /> In this article I would like to find some approximations concerning collisions.</p> Probability for a collision <p>Let \(\mathbf{M}\) be the size of the range (i.e. for md5 it is \(2^{128}\) since md5 returns 128 [...]]]></description>
				<content:encoded><![CDATA[<p>One of the problems with hash functions is collisions, which cause security vulnerabilities that won&#8217;t be discussed in this article.<br />
In this article I would like to find some approximations concerning collisions.</p>
<h2>Probability for a collision</h2>
<p>Let \(\mathbf{M}\) be the size of the range (e.g. for md5 it is \(2^{128}\), since md5 returns 128 bits).<br />
Let \(\mathbf{n}\) be the number of random values in this range.<br />
We shall calculate \(p\), the probability for at least one collision.</p>
<p>First, we calculate \(\overline{p}\), the probability for no collisions:<br />
The first value can be any one from \(\mathbf{M}\), the second value can be any one from \(\mathbf{M}-1\) different values, and so on.<br />
Therefore, \(\overline{p}=\frac{\mathbf{M}}{\mathbf{M}}\frac{\mathbf{M}-1}{\mathbf{M}}\cdots\frac{\mathbf{M}-(\mathbf{n}-1)}{\mathbf{M}}=\prod_{k=0}^{\mathbf{n}-1}\frac{\mathbf{M}-k}{\mathbf{M}}=\prod_{k=0}^{\mathbf{n}-1}(1-\frac{k}{\mathbf{M}})\).</p>
<p>Since \(p=1-\overline{p}\) it is clear that:<br />
\[(1) p=1-\prod_{k=0}^{\mathbf{n}-1}(1-\frac{k}{\mathbf{M}})\]</p>
<p>Recall that \(e^x=\sum_{k=0}^{\infty}\frac{x^k}{k!}\approx \frac{x^0}{0!}+\frac{x^1}{1!}=1+x\), substituting \(x=-\frac{k}{\mathbf{M}}\) gives \(1-\frac{k}{\mathbf{M}}\approx e^{-\frac{k}{\mathbf{M}}}\).<br />
Now, from \((1)\):<br />
\(p\approx 1-\prod_{k=0}^{\mathbf{n}-1}e^{-\frac{k}{\mathbf{M}}}=1-e^{\sum_{k=0}^{\mathbf{n}-1}{-\frac{k}{\mathbf{M}}}}=1-e^{-\frac{1}{\mathbf{M}} \sum_{k=0}^{\mathbf{n}-1}k}=1-e^{-\frac{\mathbf{n}(\mathbf{n}-1)}{2\mathbf{M}}}\)<br />
Thus:<br />
\[(2) p\approx 1-e^{-\frac{\mathbf{n}(\mathbf{n}-1)}{2\mathbf{M}}}\]</p>
<p>Under the condition \(1\ll \mathbf{n}^2\ll 2\mathbf{M}\) an even simpler formula emerges.<br />
Again, \(e^x\approx 1+x\), substituting \(x=-\frac{\mathbf{n}^2}{2\mathbf{M}}\) gives \(1-\frac{\mathbf{n}^2}{2\mathbf{M}}\approx e^{-\frac{\mathbf{n}^2}{2\mathbf{M}}}\).<br />
From \((2)\):<br />
\(p\approx 1-e^{-\frac{\mathbf{n}(\mathbf{n}-1)}{2\mathbf{M}}}\approx 1-e^{-\frac{\mathbf{n}^2}{2\mathbf{M}}}\approx 1-(1-\frac{\mathbf{n}^2}{2\mathbf{M}})=\frac{\mathbf{n}^2}{2\mathbf{M}}\)<br />
Therefore:<br />
\[(3) p\approx \frac{\mathbf{n}^2}{2\mathbf{M}}\]</p>
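<p>Formulas (1), (2) and (3) can be compared numerically. Below is a minimal Python sketch, assuming ordinary floating point precision is good enough; the function names are illustrative and are not taken from any library.</p>
<pre class="brush: python; title: ; notranslate">import math


def collision_exact(m, n):
    """Formula (1): exact probability of at least one collision."""
    no_collision = 1.0
    for k in range(n):
        no_collision *= 1.0 - k / m
    return 1.0 - no_collision


def collision_exp(m, n):
    """Formula (2): exponential approximation."""
    # -expm1(-x) equals 1 - e^(-x) and stays accurate for tiny x.
    return -math.expm1(-n * (n - 1) / (2.0 * m))


def collision_simple(m, n):
    """Formula (3): meant for n^2 much larger than 1 and much smaller than 2m."""
    return n * n / (2.0 * m)


if __name__ == "__main__":
    for name, formula in [("(1)", collision_exact),
                          ("(2)", collision_exp),
                          ("(3)", collision_simple)]:
        print(name, formula(365, 23))</pre>
<p>With \(\mathbf{M}=365\) and \(\mathbf{n}=23\) the first two functions agree to about two decimal places, while the third is noticeably off because \(\mathbf{n}^2\) is not small compared to \(2\mathbf{M}\).</p>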
<h2>Amount of values for a probable collision</h2>
<p>Let \(\mathbf{M}\) be the size of the range (e.g. for md5 it is \(2^{128}\), since md5 returns 128 bits).<br />
Let \(\mathbf{p}\) be the desired probability for at least one collision.<br />
We shall calculate \(n\), the number of values such that the probability for a collision is \(\mathbf{p}\).</p>
<p>From \((2)\):<br />
\(\mathbf{p}\approx 1-e^{-\frac{n(n-1)}{2\mathbf{M}}}<br />
\\<br />
1-\mathbf{p}\approx e^{-\frac{n(n-1)}{2\mathbf{M}}}<br />
\\<br />
\log(1-\mathbf{p})\approx-\frac{n(n-1)}{2\mathbf{M}}<br />
\\<br />
n(n-1)\approx-2\mathbf{M}\log(1-\mathbf{p})=2\mathbf{M}\log \frac{1}{1-\mathbf{p}}<br />
\\<br />
n^2-n-2\mathbf{M}\log \frac{1}{1-\mathbf{p}}\approx0<br />
\\<br />
n\approx\frac{1+\sqrt{1+8\mathbf{M}\log \frac{1}{1-\mathbf{p}}}}{2}=\frac{1}{2}+\sqrt{\frac{1}{4}+2\mathbf{M}\log \frac{1}{1-\mathbf{p}}}\)<br />
Hence:<br />
\[(4) n\approx \frac{1}{2}+\sqrt{\frac{1}{4}+2\mathbf{M}\log \frac{1}{1-\mathbf{p}}}\]</p>
<p>If \(\mathbf{M}\mathbf{p}\gg 1\) the constants are negligible:<br />
\[(5) n\approx\sqrt{2\mathbf{M}\log \frac{1}{1-\mathbf{p}}}\]</p>
<p>If in addition \(\mathbf{p}\ll 1\), formula \((3)\) implies (can be demonstrated using \((5)\) too):<br />
\[(6) n\approx \sqrt{2\mathbf{M}\mathbf{p}}\]</p>
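<p>The inverse formulas (4), (5) and (6) are just as easy to evaluate. The following Python sketch is one possible way to do it; the function names are made up for illustration, and \(\log\) is taken to be the natural logarithm, as in the derivation.</p>
<pre class="brush: python; title: ; notranslate">import math


def log_inverse(p):
    """log(1 / (1 - p)), computed with log1p for accuracy when p is small."""
    return -math.log1p(-p)


def values_needed(m, p):
    """Formula (4)."""
    return 0.5 + math.sqrt(0.25 + 2.0 * m * log_inverse(p))


def values_needed_large_range(m, p):
    """Formula (5): drops the constants, assuming m * p is much larger than 1."""
    return math.sqrt(2.0 * m * log_inverse(p))


def values_needed_small_probability(m, p):
    """Formula (6): additionally assumes p is much smaller than 1."""
    return math.sqrt(2.0 * m * p)


if __name__ == "__main__":
    print(values_needed(365, 0.5))
    print(values_needed_large_range(2 ** 128, 0.5))
    print(values_needed_small_probability(2 ** 128, 0.001))</pre>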
<h2>Examples</h2>
<h3>Birthday problem</h3>
<p>The birthday problem (often referred to as the birthday paradox) states that if \(23\) people are randomly selected, the probability that two of them share a birthday is higher than \(\frac{1}{2}\).<br />
In this case \(\mathbf{M}=365\), \(\mathbf{n}=23\) and \(\mathbf{p}=\frac{1}{2}\).</p>
<p>Using formula \( (1)\):<br />
\(p=1-\prod_{k=0}^{22}(1-\frac{k}{365})=0.507297\)</p>
<p>Luckily, formula \( (2)\) produces a similar result:<br />
\(p\approx 1-e^{-\frac{23\times 22}{2\times 365}}=1-e^{-\frac{253}{365}}=0.500002\)</p>
<p>The inverse can be calculated using formula \( (4)\):<br />
\(n\approx \frac{1}{2}+\sqrt{\frac{1}{4}+2\times 365\log \frac{1}{1-\frac{1}{2}}}=\frac{1}{2}+\sqrt{\frac{1}{4}+730\log 2}=22.9999\)</p>
<p>Formulas \( (3)\), \( (5)\) and \( (6)\) are inappropriate in this case (however their results are not too far from the actual results).</p>
<h3>md5 hash function</h3>
<p>In this case \(\mathbf{M}=2^{128}\), as mentioned at the beginning.</p>
<p>Using formula \((5)\) (\(\mathbf{M}\) is very large) we can find \(n\) given \(\mathbf{p}\):<br />
\(\mathbf{p}=\frac{1}{1000}\): \(n\approx\sqrt{2\times 2^{128} \log \frac{1}{1-\frac{1}{1000}}}=8.25170\times 10^{17}\)<br />
\(\mathbf{p}=\frac{1}{100}\): \(n\approx\sqrt{2\times 2^{128} \log \frac{1}{1-\frac{1}{100}}}=2.61532\times 10^{18}\)<br />
\(\mathbf{p}=\frac{1}{10}\): \(n\approx\sqrt{2\times 2^{128} \log \frac{1}{1-\frac{1}{10}}}=8.4678\times 10^{18}\)<br />
\(\mathbf{p}=\frac{1}{2}\): \(n\approx\sqrt{2\times 2^{128} \log \frac{1}{1-\frac{1}{2}}}=2.17194\times 10^{19}\)<br />
\(\mathbf{p}=\frac{3}{4}\): \(n\approx\sqrt{2\times 2^{128} \log \frac{1}{1-\frac{3}{4}}}=3.07158\times 10^{19}\)<br />
\(\mathbf{p}=\frac{99}{100}\): \(n\approx\sqrt{2\times 2^{128} \log \frac{1}{1-\frac{99}{100}}}=5.59832\times 10^{19}\)</p>
<p>For \(\mathbf{p}=\frac{1}{1000}, \frac{1}{100}, \frac{1}{10}\) formula \((6)\) can be used (higher values give bad results):<br />
\(\mathbf{p}=\frac{1}{1000}\): \(n\approx\sqrt{2\times 2^{128} \frac{1}{1000}}=8.24963\times 10^{17}\)<br />
\(\mathbf{p}=\frac{1}{100}\): \(n\approx\sqrt{2\times 2^{128} \frac{1}{100}}=2.60876\times 10^{18}\)<br />
\(\mathbf{p}=\frac{1}{10}\): \(n\approx\sqrt{2\times 2^{128} \frac{1}{10}}=8.24963\times 10^{18}\)<br />
These results are consistent with the previous results.</p>
<h2>Summary</h2>
<p>We have obtained some formulas that relate \(\mathbf{M}\), \(\mathbf{n}\) and \(\mathbf{p}\):<br />
\[(1) p=1-\prod_{k=0}^{\mathbf{n}-1}(1-\frac{k}{\mathbf{M}})<br />
\\<br />
(2) p\approx 1-e^{-\frac{\mathbf{n}(\mathbf{n}-1)}{2\mathbf{M}}}<br />
\\<br />
(3) p\approx \frac{\mathbf{n}^2}{2\mathbf{M}} (1\ll \mathbf{n}^2\ll 2\mathbf{M})<br />
\\<br />
(4) n\approx \frac{1}{2}+\sqrt{\frac{1}{4}+2\mathbf{M}\log \frac{1}{1-\mathbf{p}}}<br />
\\<br />
(5) n\approx\sqrt{2\mathbf{M}\log \frac{1}{1-\mathbf{p}}} (\mathbf{M}\mathbf{p}\gg 1)<br />
\\<br />
(6) n\approx \sqrt{2\mathbf{M}\mathbf{p}} (\mathbf{Mp}\gg 1, \mathbf{p}\ll 1)\]</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/hash-collision-probability/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Generalized Random Sub Array Algorithm</title>
		<link>http://blog.shay.co/generalized-random-sub-array-algorithm/</link>
		<comments>http://blog.shay.co/generalized-random-sub-array-algorithm/#comments</comments>
		<pubDate>Wed, 02 Nov 2011 20:52:29 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Arrays]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Random Sub Arrays]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=572</guid>
		<description><![CDATA[<p>After discussing with some friends on StartupSeeds, I came up with a generalized random sub array algorithm.<br /> This algorithm can be implemented with different data-structures, and its complexity is \(O(n)\) or \(O(k\log\space k)\) for 2 specific implementations.</p> <p>We need a data-structure that can contain integers.<br /> It must have 2 methods which are add [...]]]></description>
				<content:encoded><![CDATA[<p>After discussing with some friends on StartupSeeds, I came up with a generalized random sub array algorithm.<br />
This algorithm can be implemented with different data-structures, and its complexity is \(O(n)\) or \(O(k\log\space k)\) for 2 specific implementations.</p>
<p>We need a data-structure that can contain integers.<br />
It must have 2 methods: add (adds an integer) and contains (checks whether an integer is in it).<br />
We also need to be able to iterate over it; I will use a foreach loop for that.</p>
<h2>The Algorithm</h2>
<p>Let \(A\) be the original array, with elements of type \(T\).<br />
Let \(k\) be the number of elements we want in the sub array.<br />
Let \(n=length(A)\).<br />
Let \(D\) be the data-structure.</p>
<ol>
<li>Let \(B=T[k]\) (an array of type \(T\) with \(k\) elements)</li>
<li>Let \(flag=(k\le n / 2)\)</li>
<li>Let \(stop=(flag ? k : n-k)\)</li>
<li>\(for(i=0;i\lt stop;i=i+1)\)
<ol style="list-style-type: decimal;">
<li>\(while(true)\)
<ol style="list-style-type: decimal;">
<li>Let \(key=random(0,n-1)\)</li>
<li>\(if(!D.contains(key))\)
<ol style="list-style-type: decimal;">
<li>\(D.add(key)\)</li>
<li>\(break\)</li>
</ol>
</li>
</ol>
</li>
</ol>
</li>
<li>\(if(flag)\)
<ol style="list-style-type: decimal;">
<li>Let \(p=0\)</li>
<li>\(foreach(key\space in\space D)\)
<ol style="list-style-type: decimal;">
<li>\(B[p]=A[key]\)</li>
<li>\(p=p+1\)</li>
</ol>
</li>
</ol>
</li>
<li>\(else\)
<ol style="list-style-type: decimal;">
<li>\(keys=bool[n]\)</li>
<li>\(foreach(key\space in\space D)\)
<ol style="list-style-type: decimal;">
<li>\(keys[key]=true\)</li>
</ol>
</li>
<li>Let \(p=0\)</li>
<li>\(for(i=0;i\lt k;i=i+1)\)
<ol style="list-style-type: decimal;">
<li>\(while(keys[p])\)
<ol style="list-style-type: decimal;">
<li>\(p=p+1\)</li>
</ol>
</li>
<li>\(B[i]=A[p]\)</li>
<li>\(p=p+1\)</li>
</ol>
</li>
</ol>
</li>
<li>\(return\space B\)</li>
</ol>
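<p>The steps above can be sketched in Python. This is only one possible reading of the algorithm: it assumes Python&#8217;s built-in set as the data-structure \(D\), and since a set does not iterate in sorted order the sketch sorts the keys in stage 5 to keep the original order, which costs \(O(k\log\space k)\). The function name and the optional rng parameter are illustrative.</p>
<pre class="brush: python; title: ; notranslate">import random


def random_sub_array(a, k, rng=random):
    """Return k randomly chosen elements of a, in their original order."""
    n = len(a)
    if k not in range(n + 1):
        raise ValueError("k must be an integer between 0 and len(a)")

    stop = min(k, n - k)   # stage 3
    keep = (stop == k)     # stage 2: True when k is at most n / 2
    d = set()              # D: supports add, contains and iteration

    for _ in range(stop):  # stage 4: draw distinct random keys
        while True:
            key = rng.randrange(n)
            if key not in d:
                d.add(key)
                break

    if keep:
        # Stage 5: D holds the indices to keep (sorted, see the note above).
        return [a[key] for key in sorted(d)]
    # Stage 6: D holds the indices to drop.
    return [a[p] for p in range(n) if p not in d]


if __name__ == "__main__":
    print(random_sub_array(list(range(10)), 4))</pre>
<p>Passing a seeded random.Random instance as rng makes the result reproducible, which is handy for testing.</p>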
<h2>Complexity Analysis</h2>
<p>I will calculate each stage&#8217;s complexity separately, and then sum them up.<br />
But first:<br />
Let \(T_c(j)\) be the complexity of \(D.contains\) when there are \(j\) elements in it.<br />
Let \(T_a(j)\) be the complexity of \(D.add\) when there are \(j\) elements in it.<br />
Let \(T_i(j)\) be the complexity of iterating the data-structure&#8217;s elements when there are \(j\) elements in it.</p>
<p>Stage 1 is \(O(k)\) or \(O(1)\) depending on the language&#8217;s implementation for array declarations; we will use \(O(k)\).</p>
<p>Stage 2 and 3 are clearly \(O(1)\).</p>
<p>Stage 4 is more complex to calculate and we can only calculate it for the average case.<br />
In each of the iterations in the for-loop we are looking for a number between \(0\) and \(n-1\) that is not already in \(D\).<br />
Because there are already \(i\) elements in the data-structure, the distribution of the number of while-loop iterations is a geometric distribution with probability \(\frac{n-i}{n}\), therefore the mean number of iterations is \(\frac{n}{n-i}\).<br />
So this stage&#8217;s complexity is:<br />
\[\displaylines{O(\sum_{i=0}^{min(k,n-k)-1}(T_c(i)\frac{n}{n-i}+T_a(i)))<br />
\\<br />
=O(\sum_{i=0}^{min(k,n-k)-1}(T_c(i)\frac{n}{n/2}+T_a(i)))<br />
\\<br />
=O(\sum_{i=0}^{min(k,n-k)-1}(T_c(i)+T_a(i)))}\]</p>
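<p>The step that replaces \(\frac{n}{n-i}\) by \(\frac{n}{n/2}=2\) relies on \(i\) staying below \(n/2\). A small Python sketch, with an illustrative function name, adds up the mean number of draws over stage 4 so that this bound can be checked for concrete values:</p>
<pre class="brush: python; title: ; notranslate">def expected_draws(n, k):
    """Mean total number of random draws made by stage 4."""
    stop = min(k, n - k)
    # Iteration i needs n / (n - i) draws on average (geometric distribution).
    return sum(n / (n - i) for i in range(stop))


if __name__ == "__main__":
    n, k = 1000, 400
    print(expected_draws(n, k), "is at most", 2 * min(k, n - k))</pre>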
<p>Stage 5 is simply \(O(T_i(k))\).</p>
<p>Stage 6 is also quite simple.<br />
There is an array declaration (\(O(n)\) as mentioned before), an iteration over the data-structure (\(O(T_i(n-k))\)) and an iteration over the arrays (\(O(n)\)).<br />
Summing up to \(O(T_i(n-k)+n)\).<br />
But we know that \(k\ge n/2\), and therefore it is also \(O(T_i(n-k)+k)\).</p>
<p>Bear in mind that only one of stages 5 and 6 occurs, hence the total complexity of these stages is \(O(T_i(min(k,n-k))+k)\).</p>
<p>Stage 7 is also \(O(1)\).</p>
<p>Summing stages 1 to 7 gives us:<br />
\[O(\sum_{i=0}^{min(k,n-k)-1}(T_c(i)+T_a(i))+T_i(min(k,n-k))+k)\]</p>
<h2>Implementation #1 &#8211; Boolean Array</h2>
<p>This implementation was introduced in my previous article, <a href="http://blog.shay.co/generating-random-sub-array/" target="_blank">Generating Random Sub Array</a>.<br />
In this implementation the data-structure acts the same as the array \(keys\) that is used in stage 6.1.</p>
<p>In this case \(T_c(j)=1\), \(T_a(j)=1\) and \(T_i(j)=n\).<br />
With that in mind the algorithm&#8217;s complexity can be found easily: \(O(\sum_{i=0}^{min(k,n-k)-1}(1+1)+n+k)=O(n)\)</p>
<h2>Implementation #2 &#8211; Self-Balancing Binary Search Tree</h2>
<p>In this implementation our data-structure is a self-balancing binary search tree.<br />
This implementation is much better than the one mentioned above when \(k\ll n\).</p>
<p>In this case \(T_c(j)=\log\space j\), \(T_a(j)=\log\space j\) and \(T_i(j)=j\).<br />
We can again easily find the total time complexity.<br />
For simplicity I will assume that \(k\le n/2\).<br />
\(O(\sum_{i=0}^{k-1}(\log\space i+\log\space i)+k+k)=O(k\log\space k)\)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/generalized-random-sub-array-algorithm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Generating Random Sub Array</title>
		<link>http://blog.shay.co/generating-random-sub-array/</link>
		<comments>http://blog.shay.co/generating-random-sub-array/#comments</comments>
		<pubDate>Sun, 09 Oct 2011 09:52:22 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Arrays]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Random Sub Arrays]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=507</guid>
		<description><![CDATA[<p>Today I came across an algorithmic problem.<br /> I wanted to choose a random sub array of size \(k\) from an array of size \(n\), while maintaining its order.<br /> For example, if we have the array \(\{0,1,2,3,4,5,6,7,8,9\}\) a possible result for \(k=4\) is the array \(\{1,4,7,9\}\).</p> <p>I came up with an algorithm with average [...]]]></description>
				<content:encoded><![CDATA[<p>Today I came across an algorithmic problem.<br />
I wanted to choose a random sub array of size \(k\) from an array of size \(n\), while maintaining its order.<br />
For example, if we have the array \(\{0,1,2,3,4,5,6,7,8,9\}\) a possible result for \(k=4\) is the array \(\{1,4,7,9\}\).</p>
<p>I came up with an algorithm with average case time complexity \(O(n)\).</p>
<h2>The Algorithm</h2>
<p>Let \(A\) be the original array, with elements of type \(T\).<br />
Let \(k\) be the number of elements we want in the sub array.<br />
Let \(n=length(A)\).</p>
<ol>
<li>Let \(B=T[k]\) (an array of type \(T\) with \(k\) elements)</li>
<li>Let \(keys=bool[n]\) (an array with \(n\) boolean elements)</li>
<li>Let \(flag=(k\le n / 2)\)</li>
<li>Let \(stop=(flag ? k : n-k)\)</li>
<li>\(for(i=0;i\lt stop;i=i+1)\)
<ol style="list-style-type:decimal;">
<li>\(while(true)\)
<ol style="list-style-type:decimal;">
<li>Let \(key=random(0,n-1)\)</li>
<li>\(if(keys[key]=false)\)
<ol style="list-style-type:decimal;">
<li>\(keys[key]=true\)</li>
<li>\(break\)</li>
</ol>
</li>
</ol>
</li>
</ol>
</li>
<li>Let \(p=0\)</li>
<li>\(for(i=0;i\lt k;i=i+1)\)
<ol style="list-style-type:decimal;">
<li>\(while(keys[p] \oplus flag)\) (where \(\oplus\) is xor)
<ol style="list-style-type:decimal;">
<li>\(p=p+1\)</li>
</ol>
</li>
<li>\(B[i]=A[p]\)</li>
<li>\(p=p+1\)</li>
</ol>
</li>
<li>\(return\space B\)</li>
</ol>
<h2>Average Case Time Analysis</h2>
<p>Stages 1 and 2 depend on the language&#8217;s implementation of array declaration.<br />
However they are always \(O(n)\).</p>
<p>Stages 3, 4 and 6 are clearly \(O(1)\).</p>
<p>Let&#8217;s take a look at stage 7.<br />
Since \(p\) can never be bigger than \(n\), and \(i\) can never be bigger than \(k\), and each iteration is \(O(1)\), this stage is \(O(n+k)=O(n)\).</p>
<p>Now, let&#8217;s consider stage 5.<br />
Within each step of the loop we are trying to find a number between \(0\) and \(n-1\) which wasn&#8217;t picked earlier.<br />
The distribution of the number of guesses it takes to find such a number, is a geometric distribution with probability \(\frac{n-i}{n}\), hence the mean is \(\frac{n}{n-i}\).<br />
We know that \(i\) goes from \(0\) to \(min(k,n-k)-1\) which is less than \(n/2\).<br />
Each step in the while loop is \(O(1)\).<br />
Therefore the average-case complexity of this stage is at most \(O(\sum_{i=0}^{n/2-1}\frac{n}{n-i})=O(\sum_{i=0}^{n/2-1}\frac{n}{n/2})=O((n/2)\frac{n}{n/2})=O(n)\).</p>
<p>Summing all of the stages together, we can see that the algorithm has an average case time complexity of \(O(n)\).</p>
<h2>C# implementation</h2>
<p>I wrote a simple implementation of the algorithm in C#:</p>
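<pre class="brush: csharp; title: ; notranslate">// Requires "using System;" and an enclosing class.
public static T[] RandomSubArray&lt;T&gt;(T[] array, int k)
{
	int n = array.Length;
	if (k &lt; 0 || k &gt; n)
	{
		throw new ArgumentOutOfRangeException("k");
	}

	T[] ret = new T[k];
	bool[] keys = new bool[n];
	// On .NET Framework a new Random is seeded from the clock, so calls in
	// quick succession can repeat results; sharing one instance avoids this.
	Random rand = new Random();

	// If k is more than half of n, mark the n - k elements to skip instead.
	bool flag = k &lt;= n / 2;

	// Stage 5: mark distinct random indices.
	for (int i = 0, stop = flag ? k : n - k; i &lt; stop; i++)
	{
		while (true)
		{
			int key = rand.Next(0, n); // the upper bound is exclusive
			if (!keys[key])
			{
				keys[key] = true;
				break;
			}
		}
	}

	// Stage 7: copy the chosen elements in their original order.
	int p = 0;
	for (int i = 0; i &lt; k; i++)
	{
		// Skip unmarked indices when flag is true, marked ones otherwise.
		while (keys[p] ^ flag)
		{
			p++;
		}
		ret[i] = array[p++];
	}

	return ret;
}</pre>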
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/generating-random-sub-array/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Phrase Based Password Entropy</title>
		<link>http://blog.shay.co/phrase-based-password-entropy/</link>
		<comments>http://blog.shay.co/phrase-based-password-entropy/#comments</comments>
		<pubDate>Sun, 18 Sep 2011 19:06:02 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Cryptography]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Passwords]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=482</guid>
		<description><![CDATA[<p>Yesterday I published an article about <a href="http://blog.shay.co/password-entropy/" target="_blank">password entropy</a>.<br /> Today I would like to discuss the entropy of a phrase based password.</p> Phrase Based Password <p>A phrase based password is a password assembled from several easy to remember and spell words, delimited by spaces.<br /> As a result, such passwords are very [...]]]></description>
				<content:encoded><![CDATA[<p>Yesterday I published an article about <a href="http://blog.shay.co/password-entropy/" target="_blank">password entropy</a>.<br />
Today I would like to discuss the entropy of a phrase based password.</p>
<h2>Phrase Based Password</h2>
<p>A phrase based password is a password assembled from several easy to remember and spell words, delimited by spaces.<br />
As a result, such passwords are very easily remembered.</p>
<p>To produce such a password, one must have a dictionary of words.<br />
Each time a user asks for a password, the system randomly chooses a few words to generate the password.</p>
<h2>The Entropy of a Phrase Based Password</h2>
<p>Let \(N\) be the size of our dictionary, and let \(L\) be the number of words in the password. There are then \(T = {N \choose L}L!\) different possible passwords.<br />
Assuming \(N \gg L\) we can approximate this number by \(T \approx N^L\).<br />
As we know the entropy \(H\) is given by \(H = \log_2{T}\), thus \(H \approx \log_2{N^L} = L \log_2{N}\), which is identical to the entropy of a password of length \(L\) under an alphabet with \(N\) different symbols.</p>
<p>Let&#8217;s assume that \(N = 10,000\) (i.e. we have 10,000 unique words in our dictionary) and \(L = 5\). The entropy of a password under these conditions is \(H \approx 5 \log_2{10,000} \approx 66.44 \text{ bits}\).<br />
Unfortunately, when \(N = 3,000\) and \(L = 4\) the entropy is much lower: \(H \approx 4 \log_2{3,000} \approx 46.2 \text{ bits}\).</p>
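<p>To see how little the approximation \(T \approx N^L\) costs, here is a short Python sketch that computes both the exact and the approximate entropy. The function names are illustrative; the exact version uses \({N \choose L}L! = N(N-1)\cdots(N-L+1)\).</p>
<pre class="brush: python; title: ; notranslate">import math


def phrase_entropy_exact(dictionary_size, words):
    """log2 of N * (N - 1) * ... * (N - L + 1), i.e. L distinct words in order."""
    return sum(math.log2(dictionary_size - i) for i in range(words))


def phrase_entropy_approx(dictionary_size, words):
    """L * log2(N), assuming N is much larger than L."""
    return words * math.log2(dictionary_size)


if __name__ == "__main__":
    for n, l in [(10000, 5), (3000, 4)]:
        print(n, l, phrase_entropy_exact(n, l), phrase_entropy_approx(n, l))</pre>
<p>For both examples above the two values differ by less than a hundredth of a bit.</p>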
<h2>Conclusions</h2>
<p>Phrase based passwords are easy to remember; hence they are great in terms of ease of use.<br />
On the other hand, in order for this method to be reliable, the dictionary has to be quite large, and each password must contain at least 4 or 5 words.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/phrase-based-password-entropy/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Password Entropy</title>
		<link>http://blog.shay.co/password-entropy/</link>
		<comments>http://blog.shay.co/password-entropy/#comments</comments>
		<pubDate>Sat, 17 Sep 2011 19:19:24 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Cryptography]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Passwords]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=430</guid>
		<description><![CDATA[<p>A common and easy way to estimate the strength of a password is its entropy.<br /> The entropy is given by \(H = L \log_2{N}\) where \(L\) is the length of the password and \(N\) is the size of the alphabet, and it is usually measured in bits.<br /> The entropy measures the number of [...]]]></description>
				<content:encoded><![CDATA[<p>A common and easy way to estimate the strength of a password is its entropy.<br />
The entropy is given by \(H = L \log_2{N}\) where \(L\) is the length of the password and \(N\) is the size of the alphabet, and it is usually measured in bits.<br />
The entropy measures the number of bits it would take to represent every password of length \(L\) under an alphabet with \(N\) different symbols.</p>
<p>For example, a password of 7 lower-case characters (such as: <em>example</em>, <em>polmnni</em>, etc.) has an entropy of \(H = 7 \log_2{26} \approx 32.9 \text{ bits}\).<br />
A password of 10 alpha-numeric characters (such as: <em>P4ssw0Rd97</em>, <em>K5lb42eQa2</em>) has an entropy of \(H = 10 \log_2{62} \approx 59.54 \text{ bits}\).</p>
<p>Entropy makes it easy to compare password strengths: higher entropy means a stronger password (in terms of resistance to brute force attacks).</p>
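<p>The formula is a one-liner in code. The Python sketch below is only an illustration of \(H = L \log_2{N}\) with a made-up function name; it is not the calculator linked further down, and the alphabet size has to be supplied by the caller.</p>
<pre class="brush: python; title: ; notranslate">import math


def password_entropy(length, alphabet_size):
    """Entropy in bits of a password of the given length over the given alphabet."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(password_entropy(7, 26))   # 7 lower-case letters
    print(password_entropy(10, 62))  # 10 alpha-numeric characters</pre>
<p>These two calls reproduce the values of about 32.9 and 59.54 bits from the examples above.</p>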
<h2>Phrase based password</h2>
<p>An interesting fact is that a password that is usually considered strong, such as <em>f#Mo1e)*TjC8</em> (entropy \(H = 12 \log_2{72} \approx 74.04 \text{ bits}\)), usually has lower entropy than a password assembled from several words delimited by spaces, such as <em>carrot ways base split</em> (entropy \(H = 22 \log_2{27} \approx 104.61 \text{ bits}\)).<br />
This fact was demonstrated wonderfully by Randall Munroe in the following picture (although I believe his entropy calculation was different than mine):<br />
<a href="http://xkcd.com/936/" target="_blank"><img src="http://imgs.xkcd.com/comics/password_strength.png" alt="" width="740" height="601" /></a></p>
<h2>Entropy calculator</h2>
<p>I wrote a simple entropy calculator in JavaScript; you can use it online here:</p>
<p>Password:<br />
<input id="password" type="text" value="" /><br />
Entropy: <span id="entropy">0</span></p>
<p>Calculator source: <a href="http://blog.shay.co/files/entropy.js" target="_blank">http://blog.shay.co/files/entropy.js</a>.<br />
<script type="text/javascript" src="http://blog.shay.co/files/entropy.js"></script></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/password-entropy/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Quicksort Average Case Time Complexity</title>
		<link>http://blog.shay.co/quicksort-average-case-time-complexity/</link>
		<comments>http://blog.shay.co/quicksort-average-case-time-complexity/#comments</comments>
		<pubDate>Sun, 31 Jul 2011 14:05:27 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Arrays]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Quicksort]]></category>
		<category><![CDATA[Sorting]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=311</guid>
		<description><![CDATA[<p>In this article I would like to analyze the performance of one of the best sorting algorithms I know: <a href="http://en.wikipedia.org/wiki/Quicksort" target="_blank">Quicksort</a>.<br /> When I first saw the algorithm I was also told that on average it has \(O(n\log n)\) time complexity, however I never saw a proof until today.<br /> Today I read <a [...]]]></description>
				<content:encoded><![CDATA[<p>In this article I would like to analyze the performance of one of the best sorting algorithms I know: <a href="http://en.wikipedia.org/wiki/Quicksort" target="_blank">Quicksort</a>.<br />
When I first saw the algorithm I was also told that on average it has \(O(n\log n)\) time complexity, however I never saw a proof until today.<br />
Today I read <a href="http://www.im.pwr.wroc.pl/~cichon/Math/QSortAvg.pdf" target="_blank">this article</a> and found a way to simplify their proof.</p>
<p>Let \(a_n\) be the number of comparisons used on average in Quicksort, on an array with \(n\) elements. (\(a_0=a_1=0\))<br />
The pivot has to be compared \(n-1\) times, once with each of the other elements.<br />
Then, suppose the left part (the elements smaller than the pivot) has \(k\) elements, thus the right part (the elements bigger than the pivot) has \(n-1-k\) elements.<br />
We have to recursively sort them, which means another \(a_k+a_{n-1-k}\) comparisons.<br />
We are looking for the average of the different possibilities, where \(0\le k\le n-1\), which we can express by \(\frac{\sum_{k=0}^{n-1}(a_k+a_{n-1-k})}{n}\).<br />
Therefore:<br />
\(a_n=n-1+\frac{\sum_{k=0}^{n-1}(a_k+a_{n-1-k})}{n}\)</p>
<p>All we have to do is solve this equation:<br />
\(a_n=n-1+\frac{\sum_{k=0}^{n-1}(a_k+a_{n-1-k})}{n}<br />
\\=n-1+\frac{\sum_{k=0}^{n-1}a_k+\sum_{k=0}^{n-1}a_{n-1-k}}{n}<br />
\\=n-1+\frac{2\sum_{k=0}^{n-1}a_k}{n}<br />
\\\Rightarrow a_{n-1}=n-2+\frac{2\sum_{k=0}^{n-2}a_k}{n-1}\)<br />
From here we get:<br />
\(na_n=n(n-1)+2\sum_{k=0}^{n-1}a_k<br />
\\(n-1)a_{n-1}=(n-1)(n-2)+2\sum_{k=0}^{n-2}a_k\)<br />
Subtracting the two gives us:<br />
\(na_n-(n-1)a_{n-1}=n(n-1)-(n-1)(n-2)+2a_{n-1}<br />
\\\Rightarrow na_n=(n-(n-2))(n-1)+2a_{n-1}+(n-1)a_{n-1}<br />
\\\Rightarrow na_n=2(n-1)+(n+1)a_{n-1}<br />
\\\Rightarrow \frac{a_n}{n+1}=2\frac{n-1}{n(n+1)}+\frac{a_{n-1}}{n}<br />
\\=2\frac{2n-(n+1)}{n(n+1)}+\frac{a_{n-1}}{n}<br />
\\=2(\frac{2}{n+1}-\frac{1}{n})+\frac{a_{n-1}}{n}\)<br />
Let \(b_n=\frac{a_n}{n+1}\), hence:<br />
\(b_n=2(\frac{2}{n+1}-\frac{1}{n})+b_{n-1}<br />
\\=2(\frac{2}{n+1}-\frac{1}{n})+2(\frac{2}{n}-\frac{1}{n-1})+b_{n-2}<br />
\\=2(\frac{2}{n+1}-\frac{1}{n})+2(\frac{2}{n}-\frac{1}{n-1})+2(\frac{2}{n-1}-\frac{1}{n-2})+b_{n-3}<br />
\\=\ldots<br />
\\=2\sum_{k=1}^{n}(\frac{2}{k+1}-\frac{1}{k})<br />
\\=4\sum_{k=1}^{n}\frac{1}{k+1}-2\sum_{k=1}^{n}\frac{1}{k}\)</p>
<p>Up to this point my proof is the same as the original, but from here I take a different direction.<br />
I will try to simplify the proof as much as I can.<br />
\(b_n=4\sum_{k=1}^{n}\frac{1}{k+1}-2\sum_{k=1}^{n}\frac{1}{k}<br />
\\=4-4+4\sum_{k=2}^{n+1}\frac{1}{k}-2\sum_{k=1}^{n}\frac{1}{k}<br />
\\=-4+4\times\frac{1}{n+1}+4\sum_{k=1}^{n}\frac{1}{k}-2\sum_{k=1}^{n}\frac{1}{k}<br />
\\=\frac{4}{n+1}-4+2\sum_{k=1}^{n}\frac{1}{k}<br />
\\\approx2\sum_{k=1}^{n}\frac{1}{k}<br />
\\\approx2\int_1^n\frac{1}{k}\,\mathrm{d}k<br />
\\=2(\log n-\log 1)<br />
\\=2\log n\)</p>
<p>Now all we have to do is find \(a_n\):<br />
\(b_n=\frac{a_n}{n+1}<br />
\\\Rightarrow a_n=(n+1)b_n<br />
\\\approx2(n+1)\log n\)<br />
Or:<br />
\(a_n=O(n\log n)\)</p>
<p>This is an approximation rather than the exact value, but it is close enough to the real value to show that Quicksort makes \(O(n\log n)\) comparisons on average.</p>
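<p>For reference, keeping the terms that were dropped in the approximation step gives an expression in terms of the harmonic sum \(H_n=\sum_{k=1}^{n}\frac{1}{k}\) (the symbol \(H_n\) is introduced here only as shorthand):<br />
\(a_n=(n+1)b_n<br />
\\=(n+1)(\frac{4}{n+1}-4+2H_n)<br />
\\=2(n+1)H_n-4n\)<br />
For example, \(a_3=2\times4\times\frac{11}{6}-12=\frac{8}{3}\), which agrees with the recurrence: \(a_3=2+\frac{2(a_0+a_1+a_2)}{3}=2+\frac{2}{3}\).</p>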
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/quicksort-average-case-time-complexity/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Newton&#8217;s Method</title>
		<link>http://blog.shay.co/newtons-method/</link>
		<comments>http://blog.shay.co/newtons-method/#comments</comments>
		<pubDate>Wed, 29 Jun 2011 11:19:22 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Square Root]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=245</guid>
		<description><![CDATA[<p>Newton&#8217;s method is a method for finding approximations to the roots of a given function.</p> <p>The method itself is very simple, you give an initial guess of the root, \(x_0\), and calculate further values using the following equation: \(x_{n+1}=x_n-\frac{f(x_n)}{f&#8217;(x_n)}\).<br /> Generally, each term in the sequence is more accurate in comparison to the previous terms.</p> [...]]]></description>
				<content:encoded><![CDATA[<p>Newton&#8217;s method is a method for finding approximations to the roots of a given function.</p>
<p>The method itself is very simple: you give an initial guess of the root, \(x_0\), and calculate further values using the following equation: \(x_{n+1}=x_n-\frac{f(x_n)}{f&#8217;(x_n)}\).<br />
Generally, each term in the sequence is more accurate than the previous ones.</p>
<h2>How does it work</h2>
<p>Let \(f(x)\) be the function whose root we are looking for.<br />
Let \(x_0\) be our initial guess for the root.<br />
Let \(x_n\) be our latest guess for the root.</p>
<p>In each step we take our best guess, \(x_n\), and try to make it more accurate.<br />
To do so, we draw the tangent at the point \((x_n, f(x_n))\) and find its root, which we then call \(x_{n+1}\).<br />
\(x_{n+1}\) will be more accurate (usually).</p>
<p>A demonstration: (click on the image to enlarge it)<br />
<a href="http://blog.shay.co/files/cord.png" target="_blank"><img src="http://blog.shay.co/files/cord_small.png" alt="" /></a></p>
<p>The math behind this is not very difficult.<br />
The equation of the tangent line is: \(y-f(x_n)=f&#8217;(x_n)(x-x_n)\).<br />
We look for the point \(x=x_{n+1}\) where \(y=0\), hence<br />
\(0-f(x_n)=f&#8217;(x_n)(x_{n+1}-x_n)<br />
\\\Rightarrow -f(x_n)=x_{n+1}f&#8217;(x_n)-x_nf&#8217;(x_n)<br />
\\\Rightarrow x_{n+1}f&#8217;(x_n)=x_nf&#8217;(x_n)-f(x_n)<br />
\\\Rightarrow x_{n+1}=x_n-\frac{f(x_n)}{f&#8217;(x_n)}\)</p>
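<p>The update rule translates almost directly into code. Below is a minimal C# sketch of the general iteration. The names NewtonSolver and FindRoot are hypothetical, and the sketch assumes that the caller supplies the derivative and that a fixed number of steps is good enough.</p>
<pre class="brush: csharp; title: ; notranslate">using System;

public static class NewtonSolver
{
	// Applies x_{n+1} = x_n - f(x_n) / f'(x_n) a fixed number of times.
	public static double FindRoot(Func&lt;double, double&gt; f, Func&lt;double, double&gt; fPrime, double x0, int steps)
	{
		double x = x0;
		for (int i = 0; i &lt; steps; i++)
		{
			double slope = fPrime(x);
			// A horizontal tangent never crosses the x axis, so we cannot continue.
			if (slope == 0)
				throw new InvalidOperationException(&quot;The tangent is horizontal and has no root&quot;);
			x -= f(x) / slope;
		}
		return x;
	}
}</pre>
<p>For instance, calling FindRoot with x =&gt; x * x - 2 as the function, x =&gt; 2 * x as its derivative, an initial guess of 1 and 6 steps approximates \(\sqrt{2}\).</p>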
<h2>Example &#8211; Square Root</h2>
<p>Say we want to calculate the square root of \(a\).<br />
We need to find a value of \(x\) such that \(x=\sqrt{a}\Rightarrow x^2=a\Rightarrow x^2-a=0\).<br />
So, we are looking for the non-negative root (\(x=\sqrt{a}\ge0\)) of the function \(f(x)=x^2-a\).<br />
The derivative of this function is \(f&#8217;(x)=2x\).<br />
And therefore:<br />
\(x_{n+1}=x_n-\frac{x_n^2-a}{2x_n}=\frac{2x_n^2-x_n^2+a}{2x_n}=\frac{x_n^2+a}{2x_n}=\frac{x_n}{2}+\frac{a}{2x_n}\)</p>
<h3>Calculating \(\sqrt{2}\)</h3>
<p>Let&#8217;s try and find an approximation for \(\sqrt{2}\) with initial guess \(x_0=1\).<br />
\(x_1=\frac{1}{2}+\frac{2}{2\times1}=\frac{3}{2}<br />
\\ x_2=\frac{3}{2\times2}+\frac{2\times2}{2\times3}=\frac{17}{12}<br />
\\ x_3=\frac{17}{2\times12}+\frac{2\times12}{2\times17}=\frac{577}{408}<br />
\\ x_4=\frac{577}{2\times408}+\frac{2\times408}{2\times577}=\frac{665857}{470832}\approx1.414213562\)<br />
The error from the real value is lower than \(0.0000000000016\)!<br />
We have got this amazing result with only four steps!</p>
<p>Calculating further will produce very accurate massive fractions:<br />
\(|x_5-\sqrt{2}|<9\times10^{-25}<br />
\\ |x_6-\sqrt{2}|<2.9\times10^{-49}<br />
\\ |x_7-\sqrt{2}|<2.9\times10^{-98}\)</p>
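<p>The fractions above can be reproduced with exact integer arithmetic. This is a small C# sketch using System.Numerics.BigInteger; the class and method names are hypothetical, and it assumes \(a\) is a positive integer and \(x_0=1\). Writing \(x_n=\frac{p}{q}\), the square root formula becomes \(x_{n+1}=\frac{p^2+aq^2}{2pq}\).</p>
<pre class="brush: csharp; title: ; notranslate">using System.Numerics;

public static class ExactNewtonSqrt
{
	// Starts from x_0 = 1/1 and applies x_{n+1} = (p^2 + a q^2) / (2 p q)
	// to the fraction p/q, reducing it to lowest terms after every step.
	public static (BigInteger Numerator, BigInteger Denominator) Iterate(int a, int steps)
	{
		BigInteger p = 1, q = 1;
		for (int i = 0; i &lt; steps; i++)
		{
			BigInteger nextP = p * p + a * q * q;
			BigInteger nextQ = 2 * p * q;
			BigInteger g = BigInteger.GreatestCommonDivisor(nextP, nextQ);
			p = nextP / g;
			q = nextQ / g;
		}
		return (p, q);
	}
}</pre>
<p>With \(a=2\) and four steps this gives \(\frac{665857}{470832}\), the same value as \(x_4\).</p>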
<h3>Implementation</h3>
<p>The C# implementation is super easy.</p>
<p>The parameters are \(a\), which is the number whose square root we are looking for, and \(times\), which determines the number of calculations.</p>
<pre class="brush: csharp; title: ; notranslate">public double Sqrt(double a, int times)
{
	if (a &lt; 0)
		throw new Exception(&quot;Can not sqrt a negative number&quot;);
	double x = 1;
	while (times-- &gt; 0)
		x = x / 2 + a / (2 * x);
	return x;
}</pre>
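<p>Instead of fixing the number of calculations in advance, one possible variant stops when two successive guesses are close enough. The sketch below is only an illustration of that idea: the tolerance and maxSteps parameters and their default values are assumptions, not part of the implementation above.</p>
<pre class="brush: csharp; title: ; notranslate">using System;

public static class NewtonSqrt
{
	// Iterates x = x / 2 + a / (2 * x) until two successive guesses differ
	// by at most tolerance * x, or until maxSteps iterations have run.
	public static double Sqrt(double a, double tolerance = 1e-12, int maxSteps = 100)
	{
		if (a &lt; 0)
			throw new ArgumentOutOfRangeException(nameof(a), &quot;Can not sqrt a negative number&quot;);
		if (a == 0)
			return 0; // avoids guesses that shrink towards a division by zero
		double x = 1;
		for (int i = 0; i &lt; maxSteps; i++)
		{
			double next = x / 2 + a / (2 * x);
			if (Math.Abs(next - x) &lt;= tolerance * x)
				return next;
			x = next;
		}
		return x; // best guess if the tolerance was not reached
	}
}</pre>
<p>Since the initial guess is always \(1\), very large or very small inputs may need more than the default number of steps.</p>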
<h2>Conclusions</h2>
<p>Newton&#8217;s method is a very efficient way to find roots of functions, including square roots of numbers.<br />
It is much more efficient than the method I introduced in my article <a href="http://blog.shay.co/square-root-algorithm/" target="_blank">Square Root Algorithm</a>.</p>
<h3>Further Reading</h3>
<p>You can read more about <a href="http://en.wikipedia.org/wiki/Newton%27s_method" target="_blank">Newton&#8217;s method in Wikipedia</a>.<br />
There is more great information in <a href="http://mathworld.wolfram.com/NewtonsMethod.html" target="_blank">Math World</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/newtons-method/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>PDO Persistent Connection Analysis</title>
		<link>http://blog.shay.co/pdo-persistent-connection-analysis/</link>
		<comments>http://blog.shay.co/pdo-persistent-connection-analysis/#comments</comments>
		<pubDate>Mon, 27 Jun 2011 11:14:01 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[Databases]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Tips]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=220</guid>
		<description><![CDATA[<p>In this article, we will learn the benefits of persistent connection in PDO, in terms of memory usage and time of execution.<br /> The article will only observe connection to MySQL database.</p> Introduction <p><a href="http://php.net/manual/en/book.pdo.php" target="_blank">PDO</a> is an abstraction layer for database connections in PHP, and it became increasingly popular in the past few years.</p> [...]]]></description>
				<content:encoded><![CDATA[<p>In this article, we will learn the benefits of persistent connection in PDO, in terms of memory usage and time of execution.<br />
This article only looks at connections to a MySQL database.</p>
<h2>Introduction</h2>
<p><a href="http://php.net/manual/en/book.pdo.php" target="_blank">PDO</a> is an abstraction layer for database connections in PHP, and it has become increasingly popular in the past few years.</p>
<p>PDO gives us the option to use a persistent connection.<br />
If we don&#8217;t use this option, a new connection is created for each request.<br />
If we do use this option, the connection is not closed at the end of the script, and it is then re-used by other script requests.</p>
<p>Connecting to the database with this option is not very different from a regular connection:</p>
<pre class="brush: php; title: ; notranslate">$pdo = new PDO('mysql:dbname=database;host=127.0.0.1', 'username', 'password'); // Regular connection
$pdo = new PDO('mysql:dbname=database;host=127.0.0.1', 'username', 'password', array(PDO::ATTR_PERSISTENT =&gt; true)); // Persistent connection</pre>
<h2>Testing Environment</h2>
<p>All of the tests ran on my PC using Zend Server.<br />
CPU: Intel E8400<br />
RAM: 4GB DDR2 800MHz<br />
I used a MySQL database with root permissions, with an empty database named pdotest.</p>
<h2>Memory Usage</h2>
<p>In order to test the memory usage I wrote two simple scripts, which establish a connection and print the difference between the memory in use before the connection was created and the memory in use after it.<br />
The results indicate that the persistent connection consumed 18.5 times less memory than the non-persistent connection.<br />
The non-persistent connection used 6232 bytes while the persistent connection used 336 bytes.</p>
<h2>Requests per Second</h2>
<p>In order to test the speed of the connection I used <a href="http://httpd.apache.org/docs/2.0/programs/ab.html" target="_blank">Apache&#8217;s ab</a> tool.<br />
I tested both methods with 1, 2, 3, 5, 10, 50 and 100 concurrent users, for 10 seconds every time.<br />
Test results show that the persistent connection handled between 1.5 and 6.3 times more requests per second than the non-persistent version.</p>
<h2>Conclusions</h2>
<p>The results are clear: a persistent connection uses less memory and runs faster.<br />
You can <a href="http://blog.shay.co/files/pdo.rar" target="_blank">download the test files and the full results here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/pdo-persistent-connection-analysis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Square Root Algorithm</title>
		<link>http://blog.shay.co/square-root-algorithm/</link>
		<comments>http://blog.shay.co/square-root-algorithm/#comments</comments>
		<pubDate>Sun, 22 May 2011 20:54:34 +0000</pubDate>
		<dc:creator>Shay Ben Moshe</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Square Root]]></category>

		<guid isPermaLink="false">http://blog.shay.co/?p=88</guid>
		<description><![CDATA[<p>A few days ago I thought of an algorithm to calculate square roots. In this article I would like to introduce my algorithm.<br /> I should mention that this algorithm probably exists.</p> <p>First of all, my algorithm can only find roots between a specific range (for example \(0\) to \(1\)), thus we need to decrease [...]]]></description>
				<content:encoded><![CDATA[<p>A few days ago I thought of an algorithm to calculate square roots. In this article I would like to introduce my algorithm.<br />
I should mention that this algorithm probably already exists.</p>
<p>First of all, my algorithm can only find roots within a specific range (for example \(0\) to \(1\)), thus we need to reduce the number whose root we are looking for into our range.<br />
To do so we use a simple math identity: \(\sqrt{p^2x}=p\sqrt{x}\).<br />
All we have to do is pick a number \(p\), divide \(x\) by \(p^2\) until its square root is in our range, and remember the number of times we repeated it &#8211; \(q\). When we have the result we need to multiply it by \(p^q\).</p>
<p>For example, if the range is \(0\) to \(1\), \(x=289\) and \(p=2\), we have to divide \(289\) by \(p^2=4\) \(q=5\) times to get \(0.2822265625\). Then, we use the algorithm I will explain later to find the root which is \(0.53125\). Now we can return \(0.53125*2^5=17\).<br />
And indeed \(\sqrt{289}=17\).</p>
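<p>The reduction step on its own can be sketched in C# as follows. The names RangeReduction and Reduce are hypothetical, and the sketch assumes the range \(0\) to \(1\) and \(p\ge2\).</p>
<pre class="brush: csharp; title: ; notranslate">using System;

public static class RangeReduction
{
	// Divides x by p^2 until it is at most 1 and reports how many divisions
	// (q) were needed, so that sqrt(original x) = p^q * sqrt(reduced x).
	public static (double Reduced, int Q) Reduce(double x, int p)
	{
		if (x &lt; 0 || p &lt; 2)
			throw new ArgumentOutOfRangeException();
		int q = 0;
		double p2 = (double)p * p;
		while (x &gt; 1)
		{
			x /= p2;
			q++;
		}
		return (x, q);
	}
}</pre>
<p>For \(x=289\) and \(p=2\) this returns \(0.2822265625\) and \(q=5\), as in the example above.</p>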
<h3>The Algorithm</h3>
<p>The algorithm is very simple and uses the same concept as binary search; it works by narrowing the possible range of the square root.<br />
I use the algorithm in the range \(0\) to \(1\) but it can be used with any range.</p>
<p>We know that \(x\) is between \(0\) and \(1\), therefore its square root&#8217;s lower bound is \(a=0\) and upper bound is \(b=1\).<br />
The next step is calculating the average of the bounds \(t=(a+b)/2\).<br />
If \(t^2=x\) we return \(t\). If \(t^2&lt;x\), none of the numbers between \(a\) and \(t\) is the square root, hence we set \(a=t\). Similarly, if \(t^2&gt;x\) we lower our upper bound and set \(b=t\).<br />
We can repeat this step as many times as we want. Each iteration halves the range, which doubles the precision.<br />
If we didn&#8217;t find the specific square root when we finish, we should return \((a+b)/2\), as it is the closest we can get to the actual square root.</p>
<h3>The Code</h3>
<p>I will demonstrate the whole algorithm in C#. In the following implementation \(p\) and the number of iterations are parameters.</p>
<pre class="brush: csharp; title: ; notranslate">public double Sqrt(double x, int p, int iterations)
{
	if (x &lt; 0)
		throw new Exception(&quot;Can not sqrt a negative number&quot;);

	long multiplier = 1;
	int p2 = p * p;
	while (x &gt; 1)
	{
		multiplier *= p;
		x /= p2;
	}

	if (x == 1 || x == 0)
		return multiplier * x;

	double a = 0;
	double b = 1;

	for (int i = 0; i &lt; iterations; i++)
	{
		double t = (a + b) / 2;
		if (t * t == x)
			return multiplier * t;
		else if (t * t &lt; x)
			a = t;
		else
			b = t;
	}

	return multiplier * (a + b) / 2;
}</pre>
<h3>Complexity</h3>
<p>Rather than calculating the complexity of \(k\) iterations, I will calculate the complexity for precision of \(1\) to \(n\).</p>
<p>The first part consists of dividing \(x\) by \(p^2\) and multiplying the multiplier by \(p\), which is \(O(1)\) for each iteration.<br />
\(x\) is divided \(\log_{p^2} x\) times, hence this part&#8217;s complexity is \(O(\log_{p^2} x)\).<br />
Multiplying the result by the multiplier decreases the precision so we will take it into consideration later. Finding the value of the multiplier is simple: \(p^{\log_{p^2} x}=p^{(\log_p x)/(\log_p p^2)}=\sqrt{p^{\log_p x}}=\sqrt{x}\)</p>
<p>In the main part each iteration has a complexity of \(O(1)\) and it doubles the precision, as I explained before.<br />
Let \(i\) be the minimal number of iterations to get a precision of \(1\) to \(n\).<br />
That means that \(1/n=\sqrt{x}/2^i \Rightarrow 2^i=n \sqrt{x} \Rightarrow i=\log_2 (n \sqrt{x})\).<br />
Therefore the total complexity of the main part is \(O(\log_2 (n \sqrt{x}))\).</p>
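<p>As a worked example, take \(x=289\) and a target precision of \(1\) to \(n=1000\):<br />
\(i=\log_2 (1000\times\sqrt{289})=\log_2 17000\approx14.05\)<br />
so \(15\) iterations of the main part are enough.</p>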
<p>Summing these two, the total complexity is<br />
\(O(\log_{p^2} x + \log_2 (n \sqrt{x}))<br />
\\=O((\log_p x)/(\log_p p^2) + \log_2 n + \log_2 \sqrt{x})<br />
\\=O(0.5 \log_p x + \log_2 n + 0.5 \log_2 x)<br />
\\=O(\log_2 x + \log_p x + \log_2 n)\)</p>
<p>We know that \(p \ge 2 \Rightarrow \log_p x \le \log_2 x\), therefore the worst-case complexity is \(O(\log_2 x + \log_2 x + \log_2 n)=O(\log_2 x + \log_2 n)=O(\log_2 nx)\).</p>
<h3>Conclusions</h3>
<p>I don&#8217;t know many square root algorithms, which means that I can&#8217;t compare this algorithm&#8217;s complexity with others. That said, with logarithmic complexity my algorithm is still quite efficient.<br />
Moreover, this method seems pretty intuitive to me; it is easy to understand and easy to implement.<br />
It is also very easy to implement an algorithm that calculates square roots with a constant maximum error.</p>
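<p>One way the constant maximum error variant might look is sketched below in C#. It is an assumption about what such a variant could be, not code from the original article: instead of a fixed number of iterations, it keeps bisecting until half of the remaining range, scaled back by the multiplier, is no larger than a hypothetical maxError parameter.</p>
<pre class="brush: csharp; title: ; notranslate">using System;

public static class BoundedErrorSqrt
{
	public static double Sqrt(double x, int p, double maxError)
	{
		if (x &lt; 0 || p &lt; 2 || maxError &lt;= 0)
			throw new ArgumentOutOfRangeException();

		// Reduce x into the range 0 to 1, remembering the factor p^q.
		double multiplier = 1;
		double p2 = (double)p * p;
		while (x &gt; 1)
		{
			multiplier *= p;
			x /= p2;
		}

		double a = 0;
		double b = 1;
		// The returned midpoint is at most (b - a) / 2 away from the root.
		while (multiplier * (b - a) / 2 &gt; maxError)
		{
			double t = (a + b) / 2;
			if (t == a || t == b)
				break; // the range cannot shrink further in double precision
			if (t * t &lt; x)
				a = t;
			else
				b = t;
		}
		return multiplier * (a + b) / 2;
	}
}</pre>
<p>For example, with \(x=289\), \(p=2\) and a maximum error of \(10^{-6}\), the result is within \(10^{-6}\) of \(17\).</p>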
<p>I hope someone will find this article useful.</p>
<p><strong>Update 24/05/2011:</strong><br />
I read about <a href="http://en.wikipedia.org/wiki/Newton%27s_method" target="_blank">Newton&#8217;s method</a>, which is a method used to find the roots of functions. I found out that it is far easier to implement and much more efficient than my method.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.shay.co/square-root-algorithm/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
