# Statistics question

Discussion in 'Off Topic [BG]' started by Ziltoid, Feb 7, 2013.

1. ### Ziltoid

Apr 10, 2009
I know there's a fairly good number of Math and Stats guys out there, so let's try this.

I was asking myself if it was accurate to use the Central Limit Theorem (confidence interval) to calculate the margin of error (say in pre-electoral polls) when there's more than two "serious options" (say >2 candidates with chances of winning).

The margin of error on one percentage p, from a sample of size n, is calculated this way: ±2 × [p(1 − p)/n]^0.5. (That's the 19-times-out-of-20 confidence level, hence the 2, which rounds 1.96.)
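For anyone who wants to plug numbers into that formula, here is a minimal Python sketch. The function name `margin_of_error` and the example poll (40% support, 1000 respondents) are made up for illustration; the default multiplier of 1.96 is the more exact value behind the rounded 2.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a single proportion.

    p: observed share as a fraction (0.40 for 40%)
    n: sample size
    z: multiplier for the confidence level (1.96 for 19 times out of 20)
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll: 40% support among 1000 respondents.
moe = margin_of_error(0.40, 1000)
print(f"40% plus or minus {100 * moe:.1f} points")
```

Note that the result depends on both n and p: for a fixed sample size it is widest at p = 0.5 and shrinks as p moves toward 0 or 1.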

But does it really make sense to use a centered model (50-50 on each side of the curve)? Especially when there are more than two probable answers?

Wouldn't some other probability law, multidimensional and/or not centered, be more accurate?

I'm probably confused on some point, I'm not very good at this. One hint that makes me think I'm just wrong somewhere is that this way of calculating only gives the margin of error as a function of the sample size, not other stuff. (But this raises the question: are there better ways?)

Also, are there ways of calculating the margin of error that include the different percentages? I've seen it done with one and two percentages but not more.

My statistics and polls books/manuals are too introductory to help me on this, hence me asking.
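One standard way to bring two percentages from the same poll into the calculation is to treat the answers as multinomial, where the two estimated shares are negatively correlated (covariance −p1·p2/n). The sketch below computes the margin of error on the lead p1 − p2 under that assumption. The function name and the example numbers are hypothetical, and this is only the usual normal-approximation formula, not the only possible approach.

```python
import math

def lead_margin_of_error(p1, p2, n, z=1.96):
    """Margin of error on the lead (p1 - p2) between two options in one poll.

    Assumes simple random sampling with multinomial answers, so
    Var(p1_hat - p2_hat) = [p1(1 - p1) + p2(1 - p2) + 2*p1*p2] / n.
    """
    variance = (p1 * (1 - p1) + p2 * (1 - p2) + 2 * p1 * p2) / n
    return z * math.sqrt(variance)

# Hypothetical three-way race: 40% vs 35% (the rest elsewhere), n = 1000.
lead_moe = lead_margin_of_error(0.40, 0.35, 1000)
print(f"lead of 5 points, plus or minus {100 * lead_moe:.1f} points")
```

The margin on the lead comes out noticeably larger than the margin on either share alone, which is why a lead inside the "plus or minus 3 points" quoted for a single percentage can still be far from certain.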

Help?

2. ### LiquidMidnight

Dec 25, 2000
Would a confidence interval sufficiently solve your problem?

3. ### Ziltoid

Apr 10, 2009
Most likely, it is considered a "safe" way to calculate a margin of error. I was simply wondering if there were more precise ways and if my logic was flawed for the >2 part.

edit: I think I mixed up statistical error and margin of error for the >2 thing...

I'm still wondering why we always suppose it follows a Gaussian law... One of my manuals on polls says it "most often works" but doesn't add any details or sources to that statement. I'm curious.

4. ### MarkMgibson

Oct 24, 2012
Brisbane, Australia
Hing fow matug. Patuk ila midta farang.

Aug 26, 2009
New Zealand

7. ### MarkMgibson

Oct 24, 2012
Brisbane, Australia

Actually, I was speaking Klingon, but your interpretation was almost perfect.

8. ### jmattbassplaya

Jan 13, 2008
Atlanta, GA.
Talkbass, it's not for everybody!

9. ### MarkMgibson

Oct 24, 2012
Brisbane, Australia
Don't knock it, my German Shepherd (the valiant Sir Zachary) understands Klingon very well. I read it in one of those trekky books, and decided I'd teach him Klingon - people find it rather odd, but they never criticize lest I give him the Klingon word for "attack"! I should say though, when your carotid artery is being ripped out, language becomes completely irrelevant.

10. ### Downunderwonder

Aug 26, 2009
New Zealand
This is going to wind up very uninteresting.

11. ### MarkMgibson

Oct 24, 2012
Brisbane, Australia
Considering the way it started, that's no surprise. Statistical analysis - few real humans understand that stuff. For those that do understand it, I'd love to have a game of chess with you.

12. ### Ziltoid

Apr 10, 2009
I was mixed up, I managed to sort this out.

And who the hell speaks Klingon anyways, I'd rather speak numbers.

13. ### TOOL460002 (Supporting Member)

Nov 4, 2004
Santa Cruz CA
Isn't "farang" the Thai equivalent of "gringo?" Maybe I got that wrong... odds are I did.

14. ### MatticusMania (LANA! HE REMEMBERS ME!)

Sep 10, 2008
Pomona, SoCal
I left my house yesterday without a t-shirt, yadda yadda yadda, I woke up on a subway surrounded by Asian schoolgirls doing math homework.

15. ### 4dog

Aug 18, 2012
Whatdoyoumeanschoolgirls?

16. ### Downunderwonder

Aug 26, 2009
New Zealand
Don't be boring, use your imagination. How many schoolgirls does it take to fill a subway car, and could any of them answer the quiz?

17. ### LiquidMidnight

Dec 25, 2000
Statistical tests are built on assumptions, and many of them are built on the assumption that the sample follows a Gaussian curve. Of course, life isn't perfect, and contrary to what the stats books tell you, many sampling distributions don't fall into a nice normal curve. Luckily, you can sometimes smooth a non-Gaussian distribution with logarithms to a certain extent. It's a real handy technique when you're dealing with certain regression tests that require normal distributions. But you do have to be careful with interpretation then, because the data is no longer in raw form, and you have to transform the variables back for interpretation. On the other hand, some statistical tests, like robust regression, can deal fairly well with non-normal distributions.
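To make the logarithm trick concrete, here is a small sketch using only the Python standard library. The data are simulated from a lognormal distribution (an assumption chosen because the log of such data is exactly normal; real data will rarely be that tidy), and `skewness` is a helper written for this example.

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment over variance**1.5 (0 if symmetric)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(42)  # fixed seed so the run is repeatable

# Simulated right-skewed data (assumed lognormal for illustration).
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)        # strongly positive
skew_logged = skewness(logged)  # close to zero

# Back-transforming the mean of the logs gives the geometric mean of the
# raw data, not the arithmetic mean: this is the interpretation caveat.
back_transformed = math.exp(sum(logged) / len(logged))
arithmetic_mean = sum(raw) / len(raw)

print(f"skewness raw: {skew_raw:.2f}, after log: {skew_logged:.2f}")
print(f"back-transformed mean: {back_transformed:.2f}, raw mean: {arithmetic_mean:.2f}")
```

The last two numbers differ on purpose: a result computed on the log scale and exponentiated describes the typical (median-like) value, so it should not be reported as if it were the ordinary average.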

18. ### Ziltoid

Apr 10, 2009
I prefer regression, cloud cluster analysis and metaheuristics. The Gaussian curve is cheap but sadly needed for many regressions.

I intend to get good at stat modeling. And maybe in a life time or three I'll be able to create metaheuristics models.

19. ### fdeck (Supporting Member)

Mar 20, 2004
Disclosures:
HPF Technology: Protecting the Pocket since 2007
Besides making the math easy, Gaussian distributions are often a good assumption in the physical sciences because many observables are the result of averaging processes. For instance the voltage on the terminals of a resistor is the average of a bazillion electrons whanging back and forth, and follows Gaussian statistics. I trust a plain 100k resistor to function as a Gaussian noise source.

On the other hand, the characteristics of components such as tubes and JFETs are not Gaussian because they go through a sorting process and out-of-spec parts are thrown away.

Given that I'm not a statistician, I usually cheat and find confidence intervals by Monte Carlo simulation. If nothing else, it's a useful check for whether the formula that I'm using is reasonable.
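A rough sketch of that kind of Monte Carlo check, applied to the polling question from the start of the thread. Everything here is an assumption for illustration: the three "true" shares, the sample size, the number of simulated polls, and the helper names. It only uses the Python standard library.

```python
import math
import random

def simulate_poll_shares(true_shares, n, trials, seed=0):
    """Simulate `trials` polls of `n` respondents; return estimated shares per poll."""
    rng = random.Random(seed)
    candidates = list(range(len(true_shares)))
    results = []
    for _ in range(trials):
        votes = rng.choices(candidates, weights=true_shares, k=n)
        results.append([votes.count(c) / n for c in candidates])
    return results

def percentile_interval(values, level=0.95):
    """Empirical interval covering the central `level` fraction of the values."""
    ordered = sorted(values)
    lo_idx = int((1 - level) / 2 * (len(ordered) - 1))
    hi_idx = int((1 + level) / 2 * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]

# Assumed three-way race and poll size.
true_shares = [0.40, 0.35, 0.25]
n = 1000
polls = simulate_poll_shares(true_shares, n, trials=2000)

# Spread of the first candidate's estimated share across simulated polls.
low, high = percentile_interval([poll[0] for poll in polls])
simulated_half_width = (high - low) / 2

# Textbook normal-approximation margin of error for comparison.
formula_half_width = 1.96 * math.sqrt(true_shares[0] * (1 - true_shares[0]) / n)

print(f"simulated: +/- {100 * simulated_half_width:.2f} points")
print(f"formula:   +/- {100 * formula_half_width:.2f} points")
```

If the two numbers are close, the formula is a reasonable shortcut for that setup. The simulation earns its keep in cases where they diverge, such as very small samples or shares near 0% or 100%.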

20. ### LiquidMidnight

Dec 25, 2000
Haha, have fun with that.

My undergraduate training was in psychology, so I was raised on a steady diet of ANOVA, t-tests, chi-squares, and correlations, with the occasional OLS or logistic regression thrown in. My doctoral training has a heavy sociological and program evaluation slant to it, which means way more regression in order to get statistical control. But I like to say that you can take the psychologist out of psychology, but you can't take the psychology out of the psychologist.* The lengths I'll go to in order to design true experiments are almost comical due to my obsession with internal validity.

*I realize that makes absolutely no sense, but I thought it sounded cool.

P.S. And oh, if you are planning on doing the level of analysis that you're talking about, I don't know what statistical software you're using, but if it's SPSS, you're eventually going to find it limiting. SPSS is ANOVA-based. For high-level stats, Stata will take you a lot further. Or if you are really a super-duper genius, you can start using SAS.

Apr 10, 2009