Answer: When we are comparing two samples, whether means or proportions, pay close attention to the claim. Here, the claim is that the difference in the two mean salaries is more than $5000. That requires the claim to be the alternative hypothesis, since "more than" is the > operator, an inequality. So, Ha: μ_{1} – μ_{2} > $5000.

I know, you are saying that you don't see a claim. But when a problem asks a question, as this one does, that question is the claim to be tested unless you find a more definitive claim later in the problem.

Note that you need to follow the "standard" format, which is that we consider μ_{1} – μ_{2} and not the reverse, because that is the way the claim is stated in the problem: Region 1 is mentioned before Region 2. I think it might be less confusing if the problem had compared Alabama and Florida salaries, but the first entity (population) mentioned is logically μ_{1}.
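If you want to check a problem like this by hand, here is a minimal sketch of the standardized test statistic for Ha: μ_{1} – μ_{2} > $5000. The summary statistics below (means, standard deviations, sample sizes) are hypothetical; your problem's actual numbers would go in their place.

```python
# Hypothetical sketch of the two-sample z test for Ha: mu1 - mu2 > 5000.
from math import sqrt, erf

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """z for testing mu1 - mu2 against a claimed difference delta0."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # standard error of the difference
    return (xbar1 - xbar2 - delta0) / se

# Region 1 vs. Region 2 salaries (made-up summary data), claimed difference $5000
z = two_sample_z(xbar1=58900, xbar2=52500, sigma1=8975, sigma2=9250,
                 n1=45, n2=42, delta0=5000)
p_value = 1 - phi(z)      # right-tailed, matching the ">" in Ha
print(round(z, 2), round(p_value, 4))
```

Note that delta0 is the claimed difference from the problem; when no difference is claimed, it defaults to 0.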

Here is the StatCrunch solution:

At any rate, 8.1.11, the difference between two means hypothesis test: the standardized test statistic requires µ

Answer: James, for most of our mean-difference problems, we will not be given the assumed population means or the mean difference. If not, you use 0 for the mean difference. In your problem, you are told that μ_{1} = μ_{2}, so the mean difference, μ_{1} – μ_{2}, is 0.

Here is the StatCrunch solution with slightly different summary data:

Yes, this is a demanding course for most people. My strong sense is that it, like other statistics courses, should only be taught in the 15-week format. I say that knowing the strong preference among adult students for 8-week or shorter courses they can more quickly check off as "Completed" on their degree To Do list.

We need to remember that regionally accredited degree programs require courses to satisfy the Carnegie credit system, in which a credit-hour represents the equivalent of 3 student work hours per week for 15 weeks (Silva & White, 2015). Thus, this 3-credit-hour course must require 9 student work hours per week in the 15-week format, which equates to about 17 hours per week in the 8-week format.
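The Carnegie-unit arithmetic behind those weekly-hour figures can be checked in a couple of lines:

```python
# Carnegie-unit arithmetic for a 3-credit-hour course
credits = 3
hours_per_credit_per_week = 3       # student work hours per credit per week
standard_weeks = 15

total_hours = credits * hours_per_credit_per_week * standard_weeks
print(total_hours)                  # 135 total student work hours
print(total_hours / 15)             # 9 hours/week in a 15-week term
print(total_hours / 8)              # 16.875, about 17 hours/week in an 8-week term
```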

Again, my strong sense is that most adult students rationalize, consciously or subconsciously, that they can get the work done in less time. And that can lead to stress when the inevitable work/life issues occur which disrupt our plans. I believe that this type of added stress does not help people learn.

A second reason I believe this quant course should be taught in only 15-week terms is that stats is a subject in which time is needed to process and to really learn the concepts. There are two aspects of this:

First, most of us need time to reflect on what we have read and perhaps go back and re-read the material or read supporting material (you can also apply this concept to material you have watched). I'm guessing that all of us have had instances where we leave a discussion/argument with less-than-satisfactory results only to have the "perfect" response pop into our minds later, after we mull over the discussion. Similarly, I have no doubt that we all have had the experience of coming up with a solution to a problem after we "sleep" on it. That same thing happens to me a lot when I ponder how to solve a complex stats problem.

There is an analogy in sports/exercise. Recall the "burn" in muscles we all experience when we begin to learn a new sport/exercise which uses muscles differently than we are used to using them. We are told to space our exercise to allow our muscles time to recover (Bishop & Woods, 2008). We are well advised to space our exercise at least 48 hours apart, including a good night's sleep if at all possible. Same for studying stats, in my opinion (Kapur, 2014).

Second, there is good research showing that better results in math-like courses occur when students use spaced repetition, which is nothing more than having time between their work sessions on topics (Cepeda, Pashler, Vul, Wixted, & Rohrer, 2006). That is one reason I always recommend my students space their work over the course of the week, beginning early in the week, and not delay everything for a crash pace on the weekend.

Finally, there is the fact that this is an online course which limits the student-student and student-instructor interactions which I believe are important in most difficult topics. I took a few online courses during my doctorate, but they were not really parallel to this course because I could still see and talk to my classmates and instructor at school the following days to rehash what went on in the online course. I have tried holding Google Hangouts in my online courses but find that only a small portion of the class can participate each time I try to hold them. And some of my students complained that the Hangouts were unfair to them because their work/life did not allow them to attend regardless of when I scheduled the Hangouts. Viewing the video of the hangout did not satisfy their need for interaction the way actual attendance would. But my sense is that if all my courses were 15-week terms, it is more likely that every student would be able to attend some of the weekly/twice-weekly hangouts. And that would be materially beneficial.

My opinion based on my observations (admittedly anecdotal evidence) in teaching stats for seven years is that adult students with all their family and job responsibilities do better (learn more with less stress) in 15-week terms. Period.

Bishop, P., & Woods, A. (2008). Recovery from training: A brief review. *Journal of Strength and Conditioning Research*, 22(3), 1015–1024.

Cepeda, N., Pashler, H., Vul, E., Wixted, J., & Rohrer, D. (2006). Distributed practice in verbal recall tasks: A review and quantitative synthesis. *Psychological Bulletin*, 132(3), 354–380.

Kapur, M. (2014). Productive failure in learning math. *Cognitive Science*, 38(5), 1008–1022.

Silva, E., & White, T. (2015). *The Carnegie Unit: A Century-Old Standard in a Changing Education Landscape.* Stanford, CA: Carnegie Foundation for the Advancement of Teaching.


Two important points you bring up:

- Do not round intermediate values in a string of calculations. Wait until the very last step/answer to round to the number of decimals required by MyStatLab. On this problem and many others in this course, rounding too early can lead to wrong answers.

In the instructor view, I can cycle through the variations of that problem different students might see. On about half, I could round the z value to two decimal places and still get the correct answer. But for the one shown, I get the wrong answer. On all of them, if I round the standard error, sigma x-bar, to three decimal places, I always get the wrong answer. Rounding early is tempting if you are using a calculator and writing down the intermediate values instead of storing them in the calculator's memory, if your calculator allows that.
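Here is a small illustration of the early-rounding trap. All numbers below are made up for demonstration; the point is the mechanism, not any particular homework problem.

```python
# Hypothetical illustration of how rounding z early can flip the fourth decimal
# of a p-value -- exactly the kind of difference MyStatLab will mark wrong.
from math import sqrt, erf

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

xbar, mu0, sigma, n = 47.2, 48.0, 2.5, 40
se = sigma / sqrt(n)               # about 0.3953 -- keep full precision
z_full = (xbar - mu0) / se         # about -2.0239

p_full = round(phi(z_full), 4)             # p-value from the unrounded z
p_early = round(phi(round(z_full, 2)), 4)  # p-value after rounding z to -2.02

print(p_full, p_early)   # the fourth decimal differs
```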

- All tables today are created using technology, not the reverse. So do not delude yourself into thinking the tables are more accurate. If you enter the normal table on this problem with the rounded z of -2.02, you will get the wrong answer unless you interpolate between the table values. Note: on some problems where MyStatLab does not tell you to use the tables, it **may** accept the nearest table value (intersection of highlights). But if the problem says, "Use technology," the approximate table value will be counted wrong.

(Larson & Farber, 2015, p. A16)

Larson, R., & Farber, B. (2015). *Elementary Statistics: Picturing the World* (6th ed.). Boston: Pearson.

If you send an email to drdawn@thestatsfiles.com telling me you have subscribed to my YouTube channel, I will send you a copy of the workbook.

Section 5.5: Normal Approximation to the Binomial Distribution

The premise in our Larson textbook for using the normal approximation is that finding the binomial probability for a "less than x" or "greater than x" problem would be onerous: you would have to manually calculate each of the discrete probabilities and then sum them. They give an example of a doctor who performs a surgery that has an 85% success rate, and you want to know the probability that fewer than 100 of 150 attempted surgeries will be successful. And yes, finding each of the 100 discrete values (0 to 99) would be time-consuming if done manually using the equations in the book.

But, the reality today is that our software can easily do that – Excel, StatCrunch, and I think even the TI. Those software “technology” tools can find the cumulative probability from the left tail to the value of x we need. So, in the “real” world today, you would rarely, if ever, need to use the normal approximation to the binomial and its concomitant continuity correction.
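To see both approaches side by side, here is a sketch of the textbook's surgery example (n = 150, p = 0.85, P(fewer than 100 successes) = P(X ≤ 99)), comparing the exact binomial sum against the normal approximation with the continuity correction:

```python
# Exact binomial vs. normal approximation for P(X <= 99), n = 150, p = 0.85
from math import comb, sqrt, erf

n, p = 150, 0.85

# Exact: sum the 100 discrete binomial probabilities (what software does instantly)
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(100))

# Normal approximation with the continuity correction: use x = 99.5
mu = n * p                         # 127.5
sigma = sqrt(n * p * (1 - p))      # about 4.373
z = (99.5 - mu) / sigma            # about -6.40
approx = 0.5 * (1 + erf(z / sqrt(2)))   # standard normal CDF at z

print(exact, approx)
```

Both probabilities are vanishingly small, but they are not identical, which is the point: the normal result is an approximation of the true binomial probability.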

Remember, the normal is an __approximation__ and not as precise as using the true binomial distribution.

That said, for this academic course, you do need to know how to use the normal approximation to the binomial at least for the homework. I have a blog post on the details of doing this for a problem from Section 5.5 here https://www.drdawnwright.com/?p=16678

I also have an online Excel calculator that makes this easy to do here https://www.drdawnwright.com/?p=17877

A 95% confidence interval means that if you repeat exactly the same process/experiment many times, about 95% of the intervals you construct will contain the population parameter, e.g., the mean. On a practical level, we can say that we are 95% confident the population parameter is within that interval.

One metaphor I like to use to help understand why a confidence interval gets wider as we increase the level of confidence is that of a basketball hoop. The standard hoop is 18 inches in diameter. If a player hits 90% of her free throws using a standard 18” rim, wouldn’t you think her success rate would increase as the size of the hoop increased beyond 18” and her success rate would decrease as the hoop got smaller than 18”? With a 20-inch diameter hoop, her success rate might increase to 95%. With a 24-inch diameter, her free throw success rate might be 99%. If the hoop were only 15 inches, her success rate might fall to 80%.

So it is with confidence intervals. The more confident we want to be, the wider the interval must be.

This is true because of the formula for the margin of error, E = z_{c}·σ/sqrt(n). Assuming the standard deviation σ and sample size n stay constant, E increases as the critical value z_{c} increases, and z_{c} increases as we increase the desired confidence level, c.

Here is how to find the critical value of z using StatCrunch. Granted, you will soon memorize the critical values for standard confidence levels of 90%, 95% and 99%, but you may be asked to find different values on quizzes and exams, e.g. 98% or 88%.

Use the steps for using the normal calculator to find z-scores but select the “Between” side (step 4). Enter the confidence level, c, in decimal format, e.g. here c= 95% is 0.95, in the window at step 5, and click compute. The red area is the 95% and the remaining 5% is split between the two tails. You will have the critical values of z, here +1.96 and – 1.96.

On the left is the normal calculator set up to find the critical values for a confidence level of 88%. You should be able to see by comparing the two graphs that the 95% interval is wider than the 88% interval because z-critical is larger for 95%.

Hope this helps.

She said in her discussion post:

"My plan is to visit a haunted inn in Cobleskill, NY, and stop by the in-laws' house in Saratoga Springs, NY. I would also like to visit Salem, MA, for historical sightseeing, and to stop by a seafood restaurant in Mystic, CT."

She chose to use a variation of the classic Traveling Salesman optimization problem and implemented it in Excel.

The result is shown below. "Initially, the total distance was 606 miles. After using Solver, it was reduced to 586 miles. The order of cities was also rearranged. Solver suggests starting from Albany, Salem, Mystic, Cobleskill, and Saratoga. I was surprised that Cobleskill and Saratoga were the last cities to visit since they are so close to Albany. I believe optimization models such as the Traveling Salesman Problem are beneficial for finding the minimum or maximum solutions. Without analyzing, I would have gone to visit Cobleskill and Saratoga first."

Here is the sketch she created of her travel route.
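For anyone curious how the Traveling Salesman idea works outside of Excel Solver, here is a brute-force sketch. The distance matrix below is entirely made up, not her actual mileage data, and it assumes a round trip starting and ending in Albany; with only five cities, trying every ordering is feasible.

```python
# Brute-force Traveling Salesman sketch with a HYPOTHETICAL distance matrix
from itertools import permutations

cities = ["Albany", "Cobleskill", "Saratoga", "Salem", "Mystic"]
# dist[i][j] = miles between cities i and j (made-up, symmetric)
dist = [
    [0, 40, 35, 170, 120],
    [40, 0, 60, 200, 150],
    [35, 60, 0, 180, 160],
    [170, 200, 180, 0, 110],
    [120, 150, 160, 110, 0],
]

def route_length(order):
    """Total miles for a round trip visiting cities in the given order."""
    legs = zip(order, order[1:] + order[:1])   # consecutive pairs, plus return leg
    return sum(dist[i][j] for i, j in legs)

# Fix Albany (index 0) as the start and try every ordering of the rest
best = min(
    ([0] + list(p) for p in permutations(range(1, len(cities)))),
    key=route_length,
)
print([cities[i] for i in best], route_length(best))
```

Solver uses a smarter search than enumerating all routes, but for a handful of cities the brute-force answer is guaranteed optimal.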

Don’t you just love technology? And enterprising students?

The difference between data, information, knowledge and wisdom

Democrats Are Wrong About Republicans. Republicans Are Wrong About Democrats.
