## In a survey of 375 dog and cat owners, there were 215 dog owners and 193 cat owners. How many in the survey own a dog and no cat?

BeeFree
Featured 10 months ago

Draw a Venn diagram of the two groups.

#### Explanation:

If you sum the dog and cat owners ...

$215 + 193 = 408$

$408$ is greater than $375$ because the intersection of the two categories (people who own both a dog and a cat) was counted twice. The size of the intersection is

$408 - 375 = 33$, so $33$ people own both a cat and a dog.

The count of owners of a dog and no cat is $215 - 33 = 182$.
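
The same counting can be checked in a few lines of Python. This is only a sketch: the variable names are illustrative, and it assumes, as the answer does, that every person surveyed owns at least one of the two animals.

```python
# Survey figures from the question.
total_owners = 375
dog_owners = 215
cat_owners = 193

# Assumption: everyone surveyed owns a dog, a cat, or both,
# so the union has total_owners members and inclusion-exclusion applies.
both = dog_owners + cat_owners - total_owners
dog_only = dog_owners - both
cat_only = cat_owners - both

print(both, dog_only, cat_only)  # 33 182 160
```

The three groups add back up to 375, which is a quick sanity check on the Venn diagram.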

Hope that helped.

## What to do if a problem has a non-tabled degrees of freedom?

David B.
Featured 10 months ago

Choose the closest value in the table.

#### Explanation:

When the number of degrees of freedom is very high, the value of the inverse function changes very slowly. This is the case for Student's t and chi-squared distributions, where we normally use the table to look up the value that corresponds to some cumulative probability.

These tests are most sensitive when the number of degrees of freedom is low - that's where all the action is. What they are telling us is that if we gather more data (more degrees of freedom) then our answer will get better. But there is a diminishing return as we gather more data, to the point that one more data point really doesn't change the test by a significant amount. This is reflected in our tables, where at some point they start skipping values and taking larger steps. A graph of the Student's t critical value against degrees of freedom for $\alpha = 0.95$ shows this flattening.

As the degrees of freedom gets very large the change becomes insignificant, so most tables jump from some high number, like 30, to $\infty$.

So the rule of thumb is to choose the table row closest to the degrees of freedom that you have. The error in doing so will be small, but you can, if you like, interpolate between the values.

If your degrees of freedom is larger than the largest integer entry in the table, use the value for $\infty$. If there is no entry for $\infty$, use the entry with the largest degrees of freedom.
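
As a rough illustration of the rule of thumb, the sketch below looks up a critical value from a small excerpt of a t-table. The excerpt (two-tailed 5% level, i.e. the 0.975 quantile) and the helper names `nearest_row` and `interpolate` are assumptions made for this example; your own table may be laid out differently.

```python
import math

# Excerpt of a standard Student's t table (0.975 quantile), keyed by degrees of freedom.
T_TABLE = [(30, 2.042), (40, 2.021), (60, 2.000), (120, 1.980), (math.inf, 1.960)]
FINITE_ROWS = [(d, t) for d, t in T_TABLE if d != math.inf]

def nearest_row(df):
    """Pick the table value whose degrees of freedom are closest to df."""
    largest = FINITE_ROWS[-1][0]
    if df > largest:
        return T_TABLE[-1][1]  # beyond the last integer row: use the infinity row
    return min(FINITE_ROWS, key=lambda row: abs(row[0] - df))[1]

def interpolate(df):
    """Linear interpolation between the two finite rows that bracket df."""
    for (d0, t0), (d1, t1) in zip(FINITE_ROWS, FINITE_ROWS[1:]):
        if d0 <= df <= d1:
            return t0 + (df - d0) / (d1 - d0) * (t1 - t0)
    raise ValueError("df is outside the finite rows of this excerpt")

print(nearest_row(45))   # closest row is df = 40
print(interpolate(45))   # a little lower than the df = 40 value
print(nearest_row(500))  # falls back to the infinity row
```

For 45 degrees of freedom the two approaches differ only in the third decimal place, which is why choosing the closest row is usually good enough.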

## If a wholesaler sells to 500 stores and one store shows a 50% uptick in sales, how can the wholesaler determine if this uptick is significant or if it is expected for a few stores to randomly see an uptick of 50%?

Zor Shekhtman
Featured 10 months ago

There is no single simple answer. It depends on additional parameters that are not given.
See explanation below.

#### Explanation:

Important parameters that are not given in this problem are the distribution of goods among the stores and the number of customers buying in these stores.

Let's try to address the problem generally, and then we will make certain reasonable assumptions.

The distribution of goods among stores is related to the probability that customers buy goods in each specific store.
Assume that the probability of a single item to be bought at store ${S}_{1}$ is ${p}_{1}$, at store ${S}_{2}$ is ${p}_{2}$, ... at store ${S}_{i}$ is ${p}_{i}$,... at store ${S}_{500}$ is ${p}_{500}$.

Assume further that the total number of items purchased is $n$.

Consider now a store ${S}_{i}$. Introduce a random variable ${\xi}_{i}$ that is equal to $1$ when an item is bought at store ${S}_{i}$ (with probability ${p}_{i}$) and is equal to $0$ otherwise (with probability $1 - {p}_{i}$).
This is a Bernoulli random variable.
Its mathematical expectation is
$E \left({\xi}_{i}\right) = 1 \cdot {p}_{i} + 0 \cdot \left(1 - {p}_{i}\right) = {p}_{i}$,
its variance is
$\mathrm{Var} \left({\xi}_{i}\right) = {\left(1 - {p}_{i}\right)}^{2} \cdot {p}_{i} + {\left(0 - {p}_{i}\right)}^{2} \cdot \left(1 - {p}_{i}\right) = {p}_{i} \left(1 - {p}_{i}\right)$,
its standard deviation is
$\sigma \left({\xi}_{i}\right) = \sqrt{{p}_{i} \left(1 - {p}_{i}\right)}$

The wholesaler has a certain number $n$ of items that he distributes among $500$ stores. It's reasonable to assume that $n$ is large enough to cover all stores and is significantly higher than the number of stores.
For instance, if we are talking about bottles of soda, it must be thousands per store.

Consider now $n$ random variables independent of each other and each distributed identically with ${\xi}_{i}$:
${\xi}_{i 1}$, ${\xi}_{i 2}$,...${\xi}_{i n}$
Here the random variable ${\xi}_{i j}$ indicates whether the $j$th item was bought at the $i$th store.
Obviously, the sum of the above random variables is a random variable equal to the number of items bought at the $i$th store:
${\eta}_{i} = {\xi}_{i 1} + {\xi}_{i 2} + \ldots + {\xi}_{i n}$

Let's analyse the distribution of probabilities of ${\eta}_{i}$.
First of all, according to the Central Limit Theorem, this distribution should be very close to Normal.
Since it's a sum of independent identically distributed random variables, its expectation is a sum of expectations of its components and its variance is a sum of variances:
$E \left({\eta}_{i}\right) = {p}_{i} \cdot n$
$\mathrm{Var} \left({\eta}_{i}\right) = {p}_{i} \cdot \left(1 - {p}_{i}\right) \cdot n$
$\sigma \left({\eta}_{i}\right) = \sqrt{{p}_{i} \cdot \left(1 - {p}_{i}\right) \cdot n}$
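
These formulas for the sum can be checked numerically against the exact binomial distribution. The sketch below is only a sanity check; the helper `exact_moments` and the small values of `n` and `p` are hypothetical, chosen to keep the computation fast.

```python
from math import comb, sqrt

def exact_moments(n, p):
    """Mean and variance of a sum of n independent Bernoulli(p) variables,
    computed directly from the exact probability mass function."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var

n, p = 50, 0.1  # hypothetical small values, not taken from the problem
mean, var = exact_moments(n, p)

print(mean, n * p)           # both close to 5
print(var, n * p * (1 - p))  # both close to 4.5
print(sqrt(var))             # the standard deviation
```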

It's time to make an additional assumption. To simplify the problem, let's assume that all stores have approximately the same number of customers. Then the probability that a single item is bought in store ${S}_{i}$ is the same for every store and equals $\frac{1}{500} = 0.002$.
That gives every ${\eta}_{i}$ the same, approximately Normal, distribution with expectation $E \left({\eta}_{i}\right) = 0.002 \cdot n$ and standard deviation $\sigma \left({\eta}_{i}\right) \cong 0.0447 \cdot \sqrt{n}$

Let's say we want to determine the probability that purchases in store ${S}_{1}$ (or any other fixed store, for that matter) stay within reasonable limits around the average when the total number of items distributed among all stores is $n = 10{,}000$.
In this case
$E \left({\eta}_{1}\right) = 0.002 \cdot 10000 = 20$,
$\sigma \left({\eta}_{1}\right) \cong 0.0447 \cdot \sqrt{10000} = 4.47$

According to the "rule of 2$\sigma$", with 95% certainty we can say that the deviation of our random variable ${\eta}_{1}$ from its mathematical expectation $E \left({\eta}_{1}\right)$ should not exceed $2 \cdot \sigma \left({\eta}_{1}\right) \cong 9$, which is slightly less than 50% of its average value $20$.
So, under the condition of equal purchase probabilities in all stores, ${p}_{i} = \frac{1}{500}$, and about $10{,}000$ items purchased in all stores combined, the probability that the number of items purchased in store ${S}_{1}$ (or any other fixed store) deviates from its average by no more than 50% is greater than 95%.
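
The single-store figure can be computed rather than read off the 2$\sigma$ rule. The sketch below uses the same Normal approximation and the same assumption ${p}_{i} = \frac{1}{500}$; the function name `prob_within` is made up for this example.

```python
from math import erf, sqrt

def prob_within(n, p, fraction):
    """Normal-approximation probability that a store's sales stay within
    +/- fraction of their expected value n * p."""
    mean = n * p
    sigma = sqrt(n * p * (1 - p))
    z = fraction * mean / sigma  # half-width of the interval, in standard deviations
    return erf(z / sqrt(2))      # P(|Z| <= z) for a standard Normal Z

p = 1 / 500
print(prob_within(10_000, p, 0.5))
```

With 10,000 items the half-width is a bit more than two standard deviations, so the printed probability comes out above 0.95, in line with the estimate above.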

The second part of this problem concerns the probability that NO store deviates from its average by more than 50%. To a reasonable degree of precision (treating the stores as independent), it can be calculated as the product of the corresponding probabilities for EACH store.
To be 95% certain that the number of purchases in every store stays within 50% of its average, we need the probability for each store to be
$0.95^{1/500} \cong 0.9999 = 99.99\%$

To achieve this probability for each store we need the number of purchases to be very high. The "rule of 3$\sigma$" states that a Normal random variable takes values no further than 3$\sigma$ from its average with probability 99.7%. To achieve 99.99% certainty we have to widen the interval around the average beyond 3$\sigma$; 6$\sigma$ is used here as a conservative choice.
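
For readers who want the numbers behind this step, the sketch below computes the per-store target and the corresponding interval half-width for a Normal variable. It relies on the same independence assumption as the answer; the variable names are illustrative.

```python
from statistics import NormalDist

stores = 500
overall = 0.95  # desired certainty that every store stays within its limits

# Per-store probability needed if the stores are treated as independent.
per_store = overall ** (1 / stores)

# Two-sided interval: the leftover probability is split between both tails.
z_needed = NormalDist().inv_cdf(1 - (1 - per_store) / 2)

print(per_store)  # very close to 0.9999
print(z_needed)   # half-width in standard deviations
```

The computed half-width falls between 3$\sigma$ and 6$\sigma$, so an interval of 6$\sigma$ errs on the safe side.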

Thus, with $n = 100{,}000$ we have
$E \left({\eta}_{1}\right) = 0.002 \cdot 100000 = 200$
$\sigma \left({\eta}_{1}\right) \cong 0.0447 \cdot \sqrt{100000} = 14.14$
$6 \sigma \left({\eta}_{1}\right) \cong 85$,
which is about 43% of the average. So 100,000 items are sufficient to make sure, with more than 95% certainty, that no store has more than 50% extra purchases.

If 100,000 items are distributed evenly among 500 roughly equivalent stores (in average number of purchases) and at least one store exceeds its expected sales by more than 50%, then something abnormal and unexpected has happened.
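
To compare the two scenarios side by side, here is a sketch that estimates the chance that at least one of the 500 stores ends up more than 50% away from its average (in either direction). It uses the Normal approximation and treats the stores as independent, both of which are simplifications; `prob_any_store_off` is a hypothetical helper name.

```python
from math import erfc, sqrt

def prob_any_store_off(n, stores=500, fraction=0.5):
    """Approximate probability that at least one store deviates from its
    expected sales by more than the given fraction."""
    p = 1 / stores
    mean = n * p
    sigma = sqrt(n * p * (1 - p))
    z = fraction * mean / sigma
    tail = erfc(z / sqrt(2))         # P(|Z| > z) for a single store
    return 1 - (1 - tail) ** stores  # at least one of the stores

for n in (10_000, 100_000):
    print(n, prob_any_store_off(n))
```

Under these assumptions a 50% swing somewhere among the stores is almost certain with 10,000 items in total, and extremely unlikely with 100,000.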

Please refer to Unizor for details on probabilities and statistics.

## You toss a coin and roll a number cube. What is P(tails and a six)?

Dayton K.
Featured 7 months ago

$\frac{1}{12}$

#### Explanation:

Since the events are independent,

$P \left(\text{tails and } 6\right) = P \left(\text{tails}\right) \cdot P \left(6\right)$

$P \left(\text{tails and } 6\right) = \frac{1}{2} \cdot \frac{1}{6}$

$P \left(\text{tails and } 6\right) = \frac{1}{12}$
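
The product rule can be double-checked by listing every outcome. This is a small sketch that assumes a fair coin and a fair six-sided cube; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

coin = ["heads", "tails"]
cube = [1, 2, 3, 4, 5, 6]

# Every (coin, cube) pair is equally likely when both are fair.
outcomes = list(product(coin, cube))
favourable = [o for o in outcomes if o == ("tails", 6)]

p_enumerated = Fraction(len(favourable), len(outcomes))
p_product = Fraction(1, 2) * Fraction(1, 6)

print(p_enumerated, p_product)  # 1/12 1/12
```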

## What assumptions does an F-test make?

Kate M.
Featured 3 months ago

An F-test assumes that data are normally distributed and that samples are independent from one another.

#### Explanation:

Data can differ from the normal distribution for a few reasons. The data could be skewed, or the sample could be too small for its distribution to look normal. Regardless of the reason, F-tests assume a normal distribution and will give inaccurate results if the data differ significantly from this distribution.

F-tests also assume that data points are independent of one another. For example, suppose you are studying a population of giraffes and want to know how body size and sex are related. You find that females are larger than males, but you didn't take into account that substantially more of the adults in the population are female than male. Thus, in your dataset, sex is not independent of age.

## A 12 member jury for a criminal case will be selected from a pool of 15 men and 15 women. What is the probability that the jury will have 6 men and 6 women?

Parzival S.
Featured 4 weeks ago

$\cong 0.2896$

#### Explanation:

To find this, we need to know the number of ways the jury can be picked overall, and then the number of ways 6 men and 6 women can be on the jury. The ratio of the second number to the first is the probability.

The total number of ways the jury can be picked is the combination of a pool of 30 people and choosing 12:

$C_{30,12} = \frac{30!}{(12!)(30-12)!} = \frac{30!}{(12!)(18!)}$

and now let's evaluate this:

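$\frac{30 \times 29 \times 28 \times 27 \times 26 \times 25 \times 24 \times 23 \times 22 \times 21 \times 20 \times 19 \times \cancel{18!}}{12 \times 11 \times 10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times \cancel{18!}}$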

$\frac{\textcolor{red}{\cancel{30}} \times 29 \times \textcolor{blue}{\cancel{28}} \times \textcolor{green}{\cancel{27}} \times {\textcolor{tan}{\cancel{26}}}^{13} \times 25 \times \textcolor{brown}{\cancel{24}} \times 23 \times {\textcolor{orange}{\cancel{22}}}^{\textcolor{tan}{\cancel{2}}} \times 21 \times {\textcolor{purple}{\cancel{20}}}^{\textcolor{tan}{\cancel{2}}} \times 19}{\textcolor{brown}{\cancel{12}} \times \textcolor{orange}{\cancel{11}} \times \textcolor{purple}{\cancel{10}} \times \textcolor{green}{\cancel{9}} \times \textcolor{tan}{\cancel{8}} \times \textcolor{blue}{\cancel{7}} \times \textcolor{red}{\cancel{6 \times 5}} \times \textcolor{blue}{\cancel{4}} \times \textcolor{green}{\cancel{3}} \times \textcolor{brown}{\cancel{2}}}$

$29 \times 13 \times 25 \times 23 \times 21 \times 19 = 86{,}493{,}225$

Ok - we know the number of ways the jury can be picked. Now how many ways can the jury consist of 6 men and 6 women?

For the men, there are 15 in the pool and we're picking 6:

$C_{15,6} = \frac{15!}{(6!)(15-6)!} = \frac{15!}{(6!)(9!)}$

and now let's evaluate that:

$\frac{15 \times 14 \times 13 \times 12 \times 11 \times 10 \times \cancel{9!}}{6 \times 5 \times 4 \times 3 \times 2 \times \cancel{9!}}$

$\frac{\textcolor{red}{\cancel{15}} \times {\textcolor{green}{\cancel{14}}}^{7} \times 13 \times \textcolor{blue}{\cancel{12}} \times 11 \times {\textcolor{green}{\cancel{10}}}^{5}}{\textcolor{blue}{\cancel{6}} \times \textcolor{red}{\cancel{5}} \times \textcolor{green}{\cancel{4}} \times \textcolor{red}{\cancel{3}} \times \textcolor{blue}{\cancel{2}}}$

$7 \times 13 \times 11 \times 5 = 5005$

The same number of choices of women are available, so we multiply the number of men's choices and women's choices:

$5005 \times 5005 = 25{,}050{,}025$

And now we can find the probability:

$\frac{25{,}050{,}025}{86{,}493{,}225} \cong 0.2896$
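
The hand cancellation is easy to get wrong, so it may be worth confirming the counts with Python's built-in `math.comb` (available from Python 3.8). The variable names below are just for illustration.

```python
from math import comb

total_juries = comb(30, 12)                  # any 12 people from the pool of 30
balanced_juries = comb(15, 6) * comb(15, 6)  # 6 of the 15 men and 6 of the 15 women

probability = balanced_juries / total_juries

print(total_juries, balanced_juries, round(probability, 4))  # 86493225 25050025 0.2896
```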
