
Answer:

#1/12#

Explanation:

Since the events are independent,

#P(tails and 6) = P(tails) * P(6)#

#P(tails and 6) = 1/2 * 1/6#

#P(tails and 6) = 1/12#
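A quick way to check both the product rule and the answer is a short simulation (a Python sketch, not part of the original answer):

```python
import random

# Direct computation: the coin flip and the die roll are independent,
# so the joint probability is the product of the individual probabilities.
p_tails = 1 / 2
p_six = 1 / 6
p_both = p_tails * p_six  # 1/12 ≈ 0.0833

# Monte Carlo check (illustrative; the count fluctuates around 1/12)
random.seed(0)
trials = 100_000
hits = sum(
    1
    for _ in range(trials)
    if random.choice(["heads", "tails"]) == "tails" and random.randint(1, 6) == 6
)
print(p_both)         # 0.08333...
print(hits / trials)  # close to 0.0833
```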

Answer:

It refers to applying the maximum likelihood approach together with a log transformation of the likelihood function, which simplifies the equation.

Explanation:

For example, suppose I am given a data set #X in RR^n#, i.e. a collection of data points, and I want to determine the distribution's mean. I would then consider which value is most likely given what I know. If I assume the data come from the normal distribution #N(mu,sigma^2)#, with #mu# the mean and #sigma^2# the variance, then we have #f(X|mu,sigma^2) =prod_i^n 1/sqrt(2pi sigma ^2)e^(-1/(2sigma^2)(x_i-mu)^2)#.

If #mu# is not known, then I would estimate it by maximum likelihood. Using the equation above, I would state the likelihood

#l(mu|X,sigma^2)=prod_i^n 1/sqrt(2pi sigma ^2)e^(-1/(2sigma^2)(x_i-mu)^2)#

Here the equation is the same, but the parameter of interest is #mu#. To solve, we take the derivative, set it equal to 0, and solve for #mu#, so we have

#(partial)/(partialmu) prod_i^n 1/sqrt(2pi sigma ^2)e^(-1/(2sigma^2)(x_i-mu)^2) #

However, before doing so, I see that I can apply the natural log before taking the derivative to solve for #mu#, which simplifies the equation:

#ln(l(mu|X,sigma^2))= sum_i^n ln(1/sqrt(2pi sigma ^2)) -1/(2sigma^2)(x_i-mu)^2 #

#(partial)/(partialmu) sum_i^n [ln(1/sqrt(2pi sigma ^2)) -1/(2sigma^2)(x_i-mu)^2] = 1/(sigma^2)sum_i^n(x_i-mu) = 0#
#rArr 1/(sigma^2)sum_i^n x_i = 1/(sigma^2)sum_i^n mu#
#rArr sum_i^n x_i = n*mu#
#rArr 1/n sum_i^n x_i = mu#

So the maximum likelihood estimate of #mu# is the average of the data, #barx = 1/n sum_i^n x_i#.

Using MLE we can also find out what the estimated standard deviation is.
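As a sanity check, here's a short Python sketch (not part of the original answer; the simulated data, true mean 3, and known variance #sigma^2=4# are made-up assumptions) that maximises the log-likelihood over a grid and compares the result with the sample mean #barx#:

```python
import math
import random

# Simulate data from N(3, 4); assume the variance is known.
random.seed(1)
sigma2 = 4.0
data = [random.gauss(3.0, math.sqrt(sigma2)) for _ in range(500)]

def log_likelihood(mu, xs, sigma2):
    """ln l(mu | X, sigma^2) = sum_i [ln(1/sqrt(2 pi sigma^2)) - (x_i - mu)^2 / (2 sigma^2)]"""
    const = math.log(1.0 / math.sqrt(2 * math.pi * sigma2))
    return sum(const - (x - mu) ** 2 / (2 * sigma2) for x in xs)

# Coarse grid search over candidate means in [0, 6]
candidates = [i / 100 for i in range(0, 601)]
mu_hat = max(candidates, key=lambda mu: log_likelihood(mu, data, sigma2))

x_bar = sum(data) / len(data)  # the closed-form MLE derived above
print(mu_hat, x_bar)           # the two agree to within the grid spacing
```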

Answer:

We use our good friend Bayes' rule to help us deduce:
#Pr("took 2 rolls"|"score is 5")=2/15.#

Explanation:

Warning: Long answer ahead!

One form of Bayes' rule states the following:

#Pr(B|A)=(Pr(AB))/(Pr(A))=(Pr(A|B)*Pr(B))/(Pr(A))#

This allows us to write a conditional probability of "B given A" in terms of "A given B", which may be easier to calculate. In this question,

B = "die rolled twice", and
A = "score is 5".

Let's write this into the equation:

#Pr("2 rolls"|"score is 5")=(Pr("score is 5"|"2 rolls")*Pr("2 rolls"))/(Pr("score is 5"))#

The numerator on the RHS is easy to deduce; however, the denominator, in its current state, is not. But we can make it easier. We recall that, if #A# can be partitioned into #k# disjoint events #AnnC_1, AnnC_2, ..., AnnC_k# (where none of these intersected regions overlap, and together they all form #A#), then

#Pr(A)=sum_(i=1)^kPr(A|C_i)*Pr(C_i)#

We actually have that here; the #C_i#'s will be the number of possible rolls on a turn. What we're saying is that

#Pr("score is 5")=Pr("score is 5"|"1 roll")*Pr("1 roll")#
#color(white)"XXXXXXXXX"+Pr("score is 5"|"2 rolls")*Pr("2 rolls")#
#color(white)"XXXXXXXXX"+Pr("score is 5"|"3 rolls")*Pr("3 rolls")#

Kind of a "the whole is equal to the sum of its parts" thing.

From here on, let's use the shorthand #S_5# for "the score was 5" and #R_n# for "took #n# rolls". Putting this all together, we have

#Pr(R_2|S_5)=(Pr(S_5|R_2)Pr(R_2))/(Pr(S_5|R_1)Pr(R_1)+Pr(S_5|R_2)Pr(R_2)+Pr(S_5|R_3)Pr(R_3))#
#=[Pr(S_5nnR_2)]/[Pr(S_5nnR_1)+Pr(S_5nnR_2)+Pr(S_5nnR_3)]#

Now, we do the calculations!

#Pr(S_5nnR_1)=Pr("roll a 5")#
#=1/6#
#Pr(S_5nnR_2)=Pr("roll a 2, then a 3")=1/6*1/6#
#=1/36#
#Pr(S_5nnR_3)=Pr["roll a 1; then (1,3), (2,2), or (3,1)"]=1/6*3/36#
#=1/72#

Finally, we place these values back into the equation for #Pr(R_2|S_5):#

#Pr(R_2|S_5)=(1/36)/(1/6+1/36+1/72)color(blue)(*72/72)#

#color(white)(Pr(R_2|S_5))=(2)/(12+2+1)#

#color(white)(Pr(R_2|S_5))=2/15#

For completeness, #Pr(R_1|S_5)=12/15=4/5,# and #Pr(R_3|S_5)=1/15.# These three probabilities sum to 1, which is what we'd expect, since if the player's score was 5, it had to take either 1, 2, or 3 rolls.
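These fractions can be verified exactly with Python's `fractions` module (a sketch, not part of the original answer; the joint probabilities are taken directly from the calculation above):

```python
from fractions import Fraction

# Joint probabilities Pr(S_5 nn R_n) from the worked answer
joint = {
    1: Fraction(1, 6),    # Pr(score 5 and 1 roll)
    2: Fraction(1, 36),   # Pr(score 5 and 2 rolls)
    3: Fraction(1, 72),   # Pr(score 5 and 3 rolls)
}

# Law of total probability: Pr(S_5) is the sum over the partition
p_score_5 = sum(joint.values())

# Bayes' rule: Pr(R_n | S_5) = Pr(S_5 nn R_n) / Pr(S_5)
posterior = {n: p / p_score_5 for n, p in joint.items()}
print(posterior[2])             # 2/15
print(sum(posterior.values()))  # 1
```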

Bonus:

There's a great Numberphile video on YouTube discussing Bayes' rule.

Answer:

See below:

Explanation:

We'll start with burger patties on top; each patty count then splits into orders #color(green)("with cheese")# and #color(red)("no cheese")#. As percentages of all orders, the completed tree is:

#"1 patty: "77% = color(green)("42% cheese") + color(red)("35% no cheese")#

#"2 patties: "16% = color(green)("7% cheese") + color(red)("9% no cheese")#

#"3 patties: "7% = color(green)("5% cheese") + color(red)("2% no cheese")#

Ok - how to do this:

We know there are 3 types of burger options: 1 patty, 2 patties, 3 patties and these will add to 100%:

#77%+16%+"3 patties"=100%#

#"3 patties"=7%#

We know that #42%# of the orders are 1 patty with cheese. This means that the remaining percentage of single patty burgers are no cheese:

#42%+"1 patty no cheese"=77%#

#"1 patty no cheese"=35%#

We know that #2%# of all orders are 3 patties with no cheese, which means the remaining 3-patty orders have cheese:

#2%+"3 patties with cheese"=7%#

#"3 patties with cheese"=5%#

We know that 54% of all orders have cheese. We know the total orders for 1 and 3 patties, so we can solve for 2 patties:

#42%+"2 patties with cheese"+5%=54%#

#"2 patties with cheese"=7%#

And lastly we can figure out the percentage of 2-patty orders with no cheese:

#7%+"2 patties no cheese"=16%#

#"2 patties no cheese"=9%#

Answer:

A hypothesis is what informs an experiment or what is being tested/measured. It is often called an educated guess.

Explanation:

A hypothesis is what informs an experiment or what is being tested/measured. It is often called an educated guess.


Examples of hypotheses:

  • As the number of cigarettes a person smokes per day increases, the risk of lung cancer increases.
  • Black cats never get adopted from animal shelters.

A hypothesis can be further broken down into a null hypothesis and an alternative hypothesis. The null hypothesis states that there is no relation and the alternative hypothesis states that there is a relation.

Referring to the first example given above, the null hypothesis would be that there is no relationship between the number of cigarettes a person smokes per day and the risk of lung cancer. The alternative hypothesis is that the number of cigarettes a person smokes per day has an effect on the risk of lung cancer, OR that the larger the number of cigarettes a person smokes per day, the larger the risk of lung cancer.

A good hypothesis should not only be clear and informative, but it also needs to be measurable.

Hypotheses should be developed after studying the problem or issue as thoroughly as possible, building upon previous knowledge and observations.

Answer:

Mean: #mu=1.4#
Variance: #sigma^2=0.64#
Standard deviation: #sigma=0.8#

Explanation:

We are given that #X# could take on the values #{0,1,2,3}# with respective probabilities #{0.15, 0.35, 0.45, 0.05}#. Since #X# is discrete, we can imagine #X# as a 4-sided die that's been weighted so that it lands on "0" 15% of the time, "1" 35% of the time, etc.

The question is, when we roll this die once, what value should we expect to get? Or perhaps, if we roll the die a huge number of times, what should the average value of all those rolls be?

Well, of the 100% of the rolls, 15% should be "0", 35% should be "1", 45% should be "2", and 5% should be "3". If we add all these together, we'll have what's known as a weighted average.

In fact, if we placed these relative weights at their matching points on a number line, the point that would "balance the scale" is the mean that we seek.

This is a good way to interpret the mean of a discrete random variable. Mathematically, the mean #mu# is the sum of all the possible values, weighted by their probabilities. As a formula, this is:

#mu = E[X] = sum_("all " x)[x * P(X=x)]#

In our case, this works out to be:

#mu = [0*P(0)]+[1*P(1)]+[2*P(2)]+[3*P(3)]#
#color(white)mu=(0)(0.15)+(1)(0.35)+(2)(0.45)+(3)(0.05)#
#color(white)mu="       "0"       "+"    "0.35"    "+"     "0.9"     "+"    "0.15#
#color(white)mu=1.4#

So, over a large number of rolls, we would expect the average roll value to be #mu=1.4#.

The variance is a measure of the "spread" of #X#. Going back to our "balanced number line" idea, if we moved our weights out from our "centre of gravity" #mu# so that they are twice as far away, #mu# itself wouldn't change, but the variance would increase by a factor of 4.

That's because the variance #sigma^2# of a random variable is the average squared distance between each possible value and #mu#. (We square the distances so that they're all positive.) As a formula, this is:

#sigma^2="Var"(X)=E[(X-mu)^2]#

Using a bit of algebra and probability theory, this becomes

#sigma^2=E[X^2]-mu^2#
#color(white)(sigma^2)=sum_("all x")x^2P(X=x)" "-" "mu^2#

For this problem, we get

#sigma^2=[0^2*P(0)]+[1^2*P(1)]+[2^2*P(2)]#
#color(white)(sigma^2=)+[3^2*P(3)]" "-" "1.4^2#
#color(white)(sigma^2)=(0)(0.15)+(1)(0.35)+(4)(0.45)+(9)(0.05)#
#color(white)(sigma^2=)-1.96#
#color(white)(sigma^2)=0.64#

So the average squared distance between each possible #X# value and #mu# is #sigma^2=0.64#.

Standard deviation is easy: it's just the square root of the variance. But why bother with it if it's pretty much the same? Because the units of #sigma^2# are the square of the units of #X#. If #X# measures time, for example, its variance is in units of #"(time)"^2#, which really doesn't help us if we're trying to establish a "margin of error".

That's where standard deviation comes in. The standard deviation #sigma# of #X# is a measure of how far from #mu# we should expect #X# to be. It's simply

#sigma= sqrt (sigma^2)#

For this problem, that works out to be

#sigma = sqrt(0.64)=0.8#

So every time we pick an #X#, the expected distance between #mu# and that #X# is #sigma=0.8#. And since #sigma# is in the same "units" as #X#, it's much easier to use when constructing a margin of error. (See: confidence intervals.)
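All three quantities can be computed in a few lines of Python (a sketch, not part of the original answer):

```python
import math

# The given distribution of X: value -> probability
pmf = {0: 0.15, 1: 0.35, 2: 0.45, 3: 0.05}

mu = sum(x * p for x, p in pmf.items())       # E[X]
e_x2 = sum(x**2 * p for x, p in pmf.items())  # E[X^2]
variance = e_x2 - mu**2                       # Var(X) = E[X^2] - mu^2
sigma = math.sqrt(variance)

print(round(mu, 10), round(variance, 10), round(sigma, 10))  # 1.4 0.64 0.8
```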
