Polling isn't easy and requires careful consideration
Authored by: roncross@cox.net on Feb 04, '06

I guess we are gaining a better appreciation for people who have to do this kind of thing full time. That's a good thing because it makes us all think about how we gather and use information in our daily lives.

There are several considerations in polling. The first is the distribution that you are dealing with. Most of the answers to the polls are attribute or countable data (1, 2, 3, etc.), which means that it is highly unlikely that you will ever have a normally distributed population. This type of data is usually skewed. So the intervals should be selected based on the distribution. However, this is difficult to do since the distribution changes as more people vote. Perhaps the thing to do is to readjust the intervals at various times during the polling to better reflect the results from the voters. Of course, you don't have this capability now, but if you are going to continue to poll, you might want to think about this for the future.
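One way to picture readjusting the intervals as votes come in is to recompute the cut points from the quantiles of the answers received so far, so that each interval holds roughly the same share of voters. This is only a rough sketch of that idea: the function name and the choice of four bins are my own assumptions, and it is not something the current poll software can do.

```python
import statistics

def quantile_edges(values, n_bins=4):
    """Return cut points that split the votes so far into roughly equal groups.

    Hypothetical helper: values is the list of raw answers received so far.
    Cut points are rounded to whole numbers because the data is countable,
    so heavily skewed data may yield fewer distinct edges than n_bins - 1.
    """
    cuts = statistics.quantiles(values, n=n_bins, method="inclusive")
    return sorted(set(round(c) for c in cuts))

# Recompute the edges at intervals during the poll, e.g. after every 100 votes.
votes_so_far = [1, 2, 2, 3, 5, 8, 13, 40, 120]
print(quantile_edges(votes_so_far))
```

With skewed data like this, the edges bunch up at the low end, which is where most of the voters are.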

The second consideration is sample size. How many people on average take your polls? I noticed that you have performed a lot of polls so you have some idea of the average number of people that will take a poll at any given time. Based on this information and the predicted distribution, it is easier to estimate intervals and provide better results from polling. With an average number of voters over 1000 and using countable data, you really need finer intervals or you need to readjust the intervals at various points in the polling.

For example, it is clear that values above 200 really don't matter. This means that you can break down the intervals from 0 to 200 in the following way.

0 - special case
1-13, 14-26, 27-39, 40-52, 53-65, 66-78, 79-91, 92-104, 105-117, 118-130, .... 196-208
> 208 - this is where it starts to not matter much.
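The breakdown above is just a width of 13 repeated from 1 up to 208, with 0 and everything over 208 handled separately. As a quick sketch, the labels could be generated like this (the function name and parameters are made up for illustration):

```python
def make_bin_labels(width=13, upper=208):
    """Build labels for: a special 0 bin, equal-width bins from 1 to upper,
    and a final catch-all bin for anything above upper.

    Assumes upper is a multiple of width so the last bin ends exactly at upper.
    """
    labels = ["0"]
    for start in range(1, upper + 1, width):
        labels.append(f"{start}-{start + width - 1}")
    labels.append(f">{upper}")
    return labels

for label in make_bin_labels():
    print(label)
```

Changing the width gives coarser or finer intervals without retyping the list.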

Unfortunately, given the way the polling is set up on this site and the limitations of the software, you are unable to do what I am recommending. You might want to reconsider the way in which your statistics are gathered. One idea would be to let people enter actual values and let the software bin the results based on those values, particularly if the data is countable or continuous.
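To show what I mean by letting the software do the binning, here is a minimal sketch that takes the raw numbers people type in and tallies them into intervals. The names, the width of 13, and the cutoff of 208 are assumptions taken from my example above, not anything the site's software provides.

```python
from collections import Counter

def bin_label(value, width=13, upper=208):
    """Map one raw answer to its interval label (hypothetical helper)."""
    if value < 0:
        raise ValueError("answers must be zero or greater")
    if value == 0:
        return "0"            # special case
    if value > upper:
        return f">{upper}"    # where it starts to not matter much
    start = ((value - 1) // width) * width + 1
    return f"{start}-{start + width - 1}"

def tally(values, width=13, upper=208):
    """Count how many raw answers fall into each interval."""
    return Counter(bin_label(v, width, upper) for v in values)

raw_answers = [0, 1, 13, 14, 300]
for label, count in tally(raw_answers).items():
    print(label, count)
```

Because the raw values are kept, the intervals can be changed later and the results re-tallied without asking anyone to vote again.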

The third consideration is the questions being asked. From what I see in terms of how the binning works on this site, questions that are categorical will yield better results than questions that involve countable data. In other words, the voting and statistics are more favorable for categorical responses such as {yes, no} or {imac, ibook, powerbook, emac, mini, etc...}.

So yes, you can please most people if the response is categorical, but you are less likely to please them if the data is countable. You will please no one if the data is continuous, such as the number of seconds a browser takes to load a page, measured to the nearest hundredth of a second.

Polling isn't easy and requires careful consideration
Authored by: robg on Feb 05, '06

Ron:

Mostly the polls here are just fun things that ask questions I think might be interesting. That's the sum total of the science that goes into it :).

I agree with your analysis, though, if we were going to get more serious about polling. I have prepared a few polls for Macworld in the past, and we used a complete standalone polling app, which is much better suited to the task than the simple "only multiple choice" polling feature built into Geeklog.

