Probability of increasing sequence

December 15, 2018

Question : Three points, say p_1, p_2, p_3, are drawn independently from uniform distribution U(0,1), what is the probability that p_1 < p_2 < p_3

dx is the probability that p_1 \in (x, x+dx)

Similarly, dy is the probability that p_2 \in (y, y+dy) and

dz is the probability that p_3 \in (z, z+dz)

Then required probability is \int_0^1 \int_x^1 \int_y^1 \,dz \,dy \,dx  = \frac{1}{6}



Expected value of minimum of n i.i.d U(0,1) random variables

January 23, 2017

What is the expected value of minimum of n i.i.d random variables that are drawn from U(0,1) distribution

Let u_1, u_2, ... , u_n are the n random variables.

Let X = min(u_1, u_2, ... , u_n)

Plan of Action

  1. Find CDF for X
  2. Find PDF for X using CDF
  3. Compute E[X] = \int_{0}^{\infty} x\, f_X(x) \,dx

\begin{array}{rcl} F_X(x)  &=& P(X \leq x) \\  &=& 1-P(X>x) \\  &=&  1- [ P(u_1>x) \wedge P(u_2>x) \wedge \dots \wedge P(u_n>x) ] \\  &=& 1-(1-x)^n  \end{array}

Note that F_X(x) = \int_{-\infty}^{x} f_X(t) \,dt

Thus f_X(x) = \frac{d}{dx}F_X(x)

so f_X(x) = n (1-x)^{n-1}

\begin{array}{rcl} E[X]  &=& \int_{0}^{\infty} x f_X(x) \,dx \\  &=&n \,\int_{0}^{1} x\,(1-x)^{n-1}\, dx \\  &=&\frac{1}{n+1}  \end{array}

Alternative way :

For non-negative random variables, some times it is easy to calculate expected value of random variable using following formula :

E[X] = \int_{0}^{\infty} P(X>x) \,dx

\begin{array}{rcl} P(X>x)  &=& P(u_1>x) \wedge P(u_2>x) \wedge \dots \wedge P(u_n>x) \\  &=&  (1-x)^n \end{array}

Since in our case X cannot take value greater than 1

E[X] = \int_{0}^{1} P(X>x) \, dx =\int_{0}^{1}(1-x)^n \,dx = \frac{1}{n+1}

Proof from THE BOOK via vectors

September 3, 2016

Theorem : no matter what quadrilateral you start with, you always get a parallelogram when you connect the midpoints.

Proof : quadrilateral-midpoints-vectors

Distance travelled on average between successive seeks

August 26, 2014

Assuming that next I/O request resides in any cylinder with equal probability, what is the distance travelled by head, on average, between successive seeks on disk with n cylinders.

D = random variable denoting the distance travelled between successive seeks

\begin{array}{rcl} E[D] &=& \sum_{d=0}^{n-1} d . pr(D=d) \\ &=& 0 . \frac{n}{n^2} + \sum_{d=1}^{n-1} d . \frac{2(n-d)}{n^2} \\ &=& \frac{n^2-1}{3n} \\ &\approx& \frac{n}{3} \end{array}

So, if all cylinders have equal probability of access, the average seek distance is 1/3 the maximum seek

Threaded Pins

December 21, 2009

For description of the problem see Threaded Pins

How many pieces of thread will be needed in general ? i.e; given number of pins N and clockwise gap G, Express the number of Threads needed T, as a function of N & G.

Source : ” Thinking Mathematically ” by John Mason

PS: This problem is related to “rotation of array” problem where you are asked to rotate the array by a given amount using only O(1) extra space. For example, rotating array containing 3, 5, 9, 14, 1, 2, 11 by 3 positions will yield 14, 1, 2, 11, 3, 5, 9.

PPS : I highly recommend “Thinking Mathematically”  for kids, it helps to develop good practices which aid in solving problems.

Calculating Probability via Generating Functions

December 8, 2009

You have coins C_1, C_2, . . . , C_n . For each k, coin C_k is biased so that, when tossed, it has probability \frac{1}{2k+1} of falling heads. If the n coins are tossed, what is the probability that number of heads is odd? Express the answer as a rational function of n ?

Let P_n denote the desired probability. Then P_1 =\frac{1}{3}, and for n>1,  By conditioning on n^{th} coin’s outcome, we can see that

\begin{array}{rcl} P_n &= &\frac{2n}{2n+1} P_{n-1} + \frac{1}{2n+1} (1-P_{n-1}) \\ &=& \frac{2n-1}{2n+1} P_{n-1} + \frac{1}{2n+1}\end{array}

Using above recurrence we get P_2= \frac{2}{5}, P_3=\frac{3}{7}. Now it is easy to guess that P_n=\frac{n}{2n+1}. It is easy to verify this guess using Mathematical Induction.

Now, let us see alternative way to derive the same result using Generating Functions.

If X is a discrete random variable taking values on some subset of the non-negative integers,{0,1,….,n}, then the probability-generating function of X is defined as :

G_{X}(z) = E( z^{X}) = \sum_{x=0}^{n} p(X=x) z^x

where symbol E denotes the Expectation of Random Variable.

If X_1, X_2, ..., X_n is a sequence of independent ( not necessarily identically distributed ) random variables, and X = \sum_{i=1}^{n} X_i, then the probability-generating function of X is given by

G_{X}(z) = E( z^{X}) = E(z^{\sum_{i=1}^{n}X_i})= G_{X_1}(z)G_{X_2}(z)....G_{X_n}(z).

The last equality follows from the fact that the expectation of  product of independent random variables is equal to the product of expectation of individual random variables.

Let the probability of heads on C_i be donted by p_i. So p_i=\frac{1}{2k+1}. Let q_i=1-p_i.

Let X_i denote the indicator random variable, which takes on value 1 if C_i falls on heads, takes value 0 otherwise

P.G.F of X_i is G_{X_i}(z) = q_i + p_iz,\forall i=1,2,...,n

Let X denote the number of heads in the n coin tosses. So X = \sum_{i=1}^{n} X_i

P.G.F of X is G_X(z) = Q_0 + Q_1z +Q_2z^2 ...+ Q_nz^n

The probability of getting exactly m heads is equal to the coefficient of z^m in the generating fucntion, that is Q_m.

Since the coin tosses are independent,

G_X(z) = Q_0 + Q_1z +Q_2z^2 ...+ Q_nz^n = \prod_{i=1}^{n}(q_i +p_iz)

Notice that

G_X(1) = Q_o + Q_1 + Q_2 + ... + Q_n = \prod_{i=1}^{n}(q_i +p_i)=1


G_X(-1) = Q_o - Q_1 + Q_2 - ... + (-1)^{n} Q_n =\prod_{i=1}^{n}(q_i-p_i)

The required probability is Q_1 + Q_3 + Q_5 + ...


\begin{array}{rcl} P_n &=& \frac{G_X(1) - G_X(-1)}{2} \\ &=& \frac{1}{2}( 1-\prod_{i=1}^{n}(1-2p_i)) \\ &=& \frac{1}{2}( 1-\prod_{i=1}^{n}\frac{2k-1}{2k+1})\\ &=& \frac{1}{2}(1-\frac{1}{2n+1})\\ &=& \frac{n}{2n+1} \end{array}

It is easy to interpret the result of substituting 1 for z in terms of probability : it is the probability of Sample Space , which is 1. The result of substituting -1 for z, is the Difference between  probability of getting Even number of heads and probability of getting Odd number of heads ( Generating functions actually helped in calculating this difference easily ). The difference is equal to \prod_{i=1}^{n}(q_i-p_i) . In retrospect, we can see why this is so. If we expand this product we see that each +ve term in the expansion correspond to the probability of some sample point which has even number of heads in it, and each -ve term in the expansion correspond to probability of some sample point which has Odd number of heads in it. Hence the sum of +ve terms is equal to probability of getting Even number of heads, and the sum of -ve terms is equal to probability of getting Odd number of heads.

Algorithmic Question : 1

November 30, 2009

String A can be chained with string B , if the last letter of  A matches with the first letter of  B. For example “GIRI” can be chained with “IDLE”. This can be generalized to a list of strings, string1 chains with string2 which in turn chains with string3 and so on.

Now, Given a list of n strings find if they can be chained together . Note that this is YES/NO question, i.e; you need not come up with the actual order of chaining.

Try to come up with O(n) solution.

Probability Question : 2

November 21, 2009

Using Random5 function ( which returns a integer k \in [1,5]  with probability \frac{1}{5} ) , design Random7 function ( which returns a integer k \in [1,7]  with probability \frac{1}{7} ).

Probability Question : 1

November 5, 2009

Consider the set { 1, 2 ,…, n} of n nodes that attach to each other in the following way. Node 2 attaches to node 1. Node 3 attaches to node 1 with probability \frac{1}{2} and to node 2 with probability \frac{1}{2} . In general, node k ( k = 2, 3 ,.., n) attaches to a uniformly selected node from the previous ones, independently of previous attachments. The process of n steps produces a random rooted tree in which each vertex is one of the nodes and such that j being attached to i means that j is a child of i. The resulting trees are called recursive trees. Let size denote the number of vertices in the recursive tree.

Let X_m be a random variable which denotes the size of the subtree of a randomly chosen vertex in recursive tree of size m. Show that

1) For m\geq{1} , P( X_m = m ) = \frac{1}{m}

and, for m\geq{ 2},

2) P( X_m = k ) = \frac{1}{ k(k+1) }, if 1 \leq{k} \leq{m-1}.

PS:: This result is proved in following paper : ” Breakage and Restoration in Recursive Trees ” . I am looking for different approach. Any Ideas ??