
Sunday, January 31, 2010
Testing of hypothesis
1. The research hypothesis is: There are significant total mass changes in Nucella lamellosa based on location.
2. The statistical null hypothesis is: Regardless of where they grow, the species' total mass will not change.
3. The level of significance used is: P = 0.05. We accept a 5% probability that a difference this large is due to chance.
4. Is the t-statistic greater than or less than the t for that level of significance? The calculated value is 0.55. Read as a P-value, it means that if the two locations truly had the same mean, a difference at least this large would show up by chance about 55% of the time.
5. Do I accept or reject the statistical null hypothesis? 55% is larger than 5%, so I fail to reject the null hypothesis.
6. Do I accept or reject the research hypothesis? The research hypothesis is not supported.
7. Is this a one-tailed test or a two-tailed test? This is a two-tailed test with 59% falling in one direction.
8. Where was the data found? From a supplied database constructed from scientific observation.
9. How was the data collected? By taking weight measurements.
10. The sample being studied is “total mass” compared between two different locations.
11. What species? Nucella lamellosa
12. Where do they live? Cantilever North and False Bay.
13. What program was used to conduct the analysis? Excel.
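The decision in questions 3 to 5 comes down to comparing a P-value with the chosen level of significance. Here is a minimal Python sketch of that rule. The function name `decide` is only for illustration, and the sketch assumes the 0.55 reported above is a P-value.

```python
def decide(p_value, alpha=0.05):
    """Return the conclusion about the statistical null hypothesis."""
    # A P-value below alpha counts as statistically significant.
    if p_value < alpha:
        return "reject the null hypothesis"
    # We never "accept" the null hypothesis; we only fail to reject it.
    return "fail to reject the null hypothesis"


# Assumption: 0.55 is the P-value from the analysis above.
print(decide(0.55))
```

With alpha left at 0.05, a P-value of 0.55 ends up on the "fail to reject" side.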
Thursday, January 28, 2010
Data Analysis workshop/Jan 28

My research hypothesis: Emersion DOES NOT slow growth for Nucella lamellosa.
My statistical null hypothesis: Emersion DOES slow growth for Nucella lamellosa because snails need to be in the water to grow their shells.
The P-value is more than alpha, so I fail to reject the statistical null hypothesis.
1. What do the measures of central tendency tell us? They tell us where the center of a distribution is, the point around which the values cluster. They are the mean, median, and mode; under a symmetric bell curve all three are equal.
2. When is each appropriate?
a. The mean is the average. This is good to use when the data are roughly symmetric (not skewed).
b. The median is the number in the middle. Use it if there are extremely high or extremely low values in the data set.
c. The mode of a data set refers to the number that occurs most often. When you have categorical data, or data that appears as words instead of numbers, you need to use the mode.
3. What do the measures of variability (dispersion) tell us? They tell us how spread out the numbers are around the mean. Variance is the average squared deviation from the mean. The standard deviation (the square root of the variance) is the typical distance of the numbers from the mean, and it is useful for comparing two samples. The standard error (standard deviation divided by the square root of n) describes how precisely a single mean is estimated.
4. When is each appropriate? Use standard deviation when you don't have replicates. Use standard error when you have multiple replicates.
5. What does it mean to be one standard deviation away from the mean? About 68% of normally distributed data fall within one standard deviation of the mean.
6. What does it mean to be two standard deviations away from the mean? About 95% of normally distributed data fall within two standard deviations of the mean.
7. Why do we need to look at both central tendency and variability? Between the two you get a better picture.
What do error bars mean? They give a level of confidence in a certain range. If 95% CI error bars do not overlap, the difference is statistically significant (P < 0.05).
8. What is a level of significance (alpha)? It's where we choose our level of cut-off for distinguishing between a true signal and chance.
9. How do we choose it? This is an arbitrary choice. The standard is 0.05.
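10. What is a P-value? This is the probability of getting a result at least as extreme as the one observed if your statistical null hypothesis is true.
11. How do you interpret the P-value in plain English? You say, “If the null hypothesis were true, there is a XX% probability of seeing a difference this large due to chance.”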
12. What is the null hypothesis of the t-test? The null hypothesis of the t-test says that the two means are the same.
13. How can a research hypothesis and a statistical null hypothesis differ? The null hypothesis typically proposes a general or default position. Research hypothesis testing is used to decide whether the data contradicts the null hypothesis.
14. Are they always different? Yes
15. Why? They are opposites of each other.
16. What is the difference between a one- and two-tailed test? Both deal with directionality. A one-tailed test is directional. A two-tailed test does not specify a direction, so its rejection criterion uses both the upper and lower tails of the distribution.
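The percentages in questions 5 and 6 can be checked for a normal distribution. Below is a minimal Python sketch using only the standard library. The function name `fraction_within` is an illustrative choice, and the result only applies if the data really follow a bell curve.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    # For a normal distribution, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2):
    # Prints the percentage of data expected within k standard deviations.
    print(k, round(fraction_within(k) * 100, 1))
```

This gives roughly 68% for one standard deviation and roughly 95% for two.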
Wednesday, January 27, 2010
Data analysis/Jan. 27
What is the level of significance (alpha)? It's where we choose our level of cut-off for distinguishing between a true signal and chance. This is an arbitrary choice. How willing are we to make a mistake? For what we are doing, a one in twenty chance of being wrong is okay (so we are right 95% of the time). On a bell curve, this 5% sits out in the tails. Choose alpha before conducting the experiment.
The P-value is output from your statistical test. It's the probability of seeing a difference at least as large as the one observed if your statistical null hypothesis is true. If P-value = 0.06, "There is a 6% probability due to chance": if the mean of the control really equaled the mean of the experimental treatment, a difference this large would still show up 6% of the time. With alpha of 5%, "P" is greater than alpha, so we fail to reject the null hypothesis.
If P-value = 0.0001 (one in 10K), then there's a one in ten thousand chance that random fluctuations alone can account for the differences in mean that we observe. There is a very small chance that the experiment result is random. With alpha of 5%, "P" is much less than alpha ("P" < alpha), so we reject the null hypothesis.
The statistical null hypothesis asks: does the control equal the experiment? You can never accept a null hypothesis. It's the "t" value that we actually interpret. Say "fail to reject." :0
The value of the "T" statistic is hard to interpret on its own because it depends on the test and the sample size. That's why we go to the "P" value, which is something that we can compare across any statistical test, as long as you know what your null hypothesis is.
What's a research hypothesis: emersion slows growth because snails need to be in the water to grow their shells. We expect our control to grow more than our emersion treatment, so this is a one-tailed test. The statistical null hypothesis is that the control and the "emersion" means are the same; if "P" is more than alpha, we fail to reject it.
Excel shows scientific notation with an "E": the number after the "E" is the exponent of ten (26.4E-13 means 26.4 × 10^-13).
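To see what the "E" notation stands for, here is a minimal Python sketch. Python happens to read the same notation, and the variable names are only illustrative.

```python
import math

# Python parses the same "E" notation that Excel displays.
value = float("26.4E-13")

# The "E-13" part means "times ten to the power of -13".
expanded = 26.4 * 10 ** -13

# Compare with a tolerance, since floating-point arithmetic is not exact.
print(math.isclose(value, expanded))
```

So 26.4E-13 is a very small number, about 0.00000000000264.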
For error bars in Excel you have to set a custom value. Highlight the graph, click layout, then click error bar.
Directionality: in a one-tailed test, only one tail is covered.
Tuesday, January 26, 2010
Data Analysis Lab; Jan 25 continued...
Statistical null hypothesis: Always the same and assumes the averages of the two samples being compared are equal.
T-Test Null Hypothesis: A math model that assumes µa = µb, assumes the data follow a normal distribution, and is calculated from the means.
Research Hypothesis: What you think will happen
Two-tailed: The effect of the predicted variable can go in either direction
One-tailed: The effect is predicted in one direction (“either/or” ≠ one-tailed)
Central Tendency: Where most of your data falls in a distribution. Can be represented by mean, median, and mode, depending on the hypothesis.
Mean: average for normal distribution (mid-point)…it shifts in the direction of the tail when the data are skewed
Median: “middle” number of data and for skewed distribution
Mode: value with largest occurrence
Measures of Variation: Give a value that lets you know how different your samples have to be before you can call them different.
Standard Deviation: Used when there are no replicates. 1 std. dev. = where about 68% of your data is, 2 std. dev. = where about 95% of your data is (for a normal distribution)
Standard Error: Used when there are multiple replicates. SE = stdev / square root of n
Non-Parametric Test: similar to a t-test, but calculated from the median instead of the mean.
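The definitions above can be tried out with Python's standard library. This is a minimal sketch: the `replicates` list is made-up illustration data (not from the workshop), and `standard_error` is a hypothetical helper name.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample stdev divided by the square root of n."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical replicate measurements, made up for illustration only.
replicates = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print("mean:", statistics.mean(replicates))
print("median:", statistics.median(replicates))
print("mode:", statistics.mode(replicates))
print("standard error:", standard_error(replicates))
```

Note that `statistics.stdev` is the sample standard deviation (it divides by n - 1), which is an assumption about which version of the standard deviation the notes mean.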
Monday, January 25, 2010
Data analysis workshop/ 1.25.10 am
In a t-test, the Ho is always that the means in the two populations that we're comparing are equal.
The research hypothesis - depends on what we expect
We use the research hypothesis to determine whether our test is one-tailed or two-tailed. One-tailed predicts a direction, two-tailed means you aren't sure about the direction.
Central tendency: where most of the data are in a distribution - where the values cluster. Use mean when the distribution is symmetric (normal distribution). Use median when you have a highly skewed distribution.
Measures of variation: how different do your samples need to be, for you to conclude that they're really different? Standard deviation - great for comparing two samples when you don't have replicates. 1 stdev describes about 68% of the data; 2 stdev describe about 95% of the data (for a normal distribution). Standard error - use when you have multiple replicates (like we do in our data set!)
Treatment      mean           median   mode   St. Dev.       Variance       St. Error
Control        110.25         107.5    75     29.66812041    880.1973684    0.632526452
Emersion.5h    1.666666667    0        0      2.5            6.25           0.372677996
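One quick sanity check on the table above is that the variance should equal the standard deviation squared. A minimal Python sketch of that check follows. The numbers are copied from the table, and the helper name `variance_matches` and the tolerance are illustrative assumptions.

```python
import math

# (standard deviation, variance) pairs copied from the table above.
table = {
    "Control": (29.66812041, 880.1973684),
    "Emersion.5h": (2.5, 6.25),
}


def variance_matches(st_dev, variance, rel_tol=1e-6):
    """True if the variance equals the standard deviation squared, within rounding."""
    return math.isclose(st_dev ** 2, variance, rel_tol=rel_tol)


for name, (st_dev, variance) in table.items():
    print(name, variance_matches(st_dev, variance))
```

Both rows pass, so the St. Dev. and Variance columns agree with each other.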
Sunday, January 24, 2010
Hinton 1995
With what assumption does Hinton open the last paragraph on p. 78? Hinton is looking for size differences and an accurate way to test them. The assumption is that all sample means can be compared with the standard error of difference.
What is the null hypothesis? The null hypothesis, or H₀, is the hypothesis that the researcher tries to disprove or reject in favor of the research hypothesis, or H₁.
What statistical tool do we use to determine “what differences would we expect between two samples simply by chance alone”? Use the distribution of differences between two sample means. The standard deviation is a statistic that tells you how tightly all the various examples are clustered around the mean in a set of data. To get it, subtract the mean from each raw value, square the differences, total them, divide by the number of values (this is the variance), and take the square root.
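The standard deviation recipe in the answer above can be written out step by step. This is a minimal Python sketch: the function name is illustrative and the input list is made-up data. It divides by the number of values, as described, which gives the population standard deviation. Excel's STDEV divides by n - 1 instead, so its answer will be slightly larger.

```python
import math


def population_std_dev(values):
    """Standard deviation following the steps in the notes (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    # Raw data minus the mean, squared.
    squared_deviations = [(x - mean) ** 2 for x in values]
    # Totaled and divided by the quantity: this is the variance.
    variance = sum(squared_deviations) / n
    # Rooted: this is the standard deviation.
    return math.sqrt(variance)


# Made-up values for illustration only.
print(population_std_dev([2, 4, 4, 4, 5, 5, 7, 9]))  # prints 2.0
```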
What is a t-test? A t-test is a statistical test used to test a hypothesis about means. The method uses a small sample and analyzes the resulting mean.
What are the assumptions of the t-test? The t-test is only accurate if certain conditions are met, for example that the data come from a normal distribution.
Summarize the first example given:
1. What question is this teacher asking (predicting)? The teacher believed that her young students worked more effectively in the morning than in the afternoon.
2. What data does she collect to test her research hypothesis? Eight math questions.
3. Is this a one-tailed test or a two-tailed test? When using a two-tailed test, you are testing for the possibility of the relationship in both directions of the bell curve, and a one tailed test looks only at one direction. The test consists of a morning and afternoon sampling of all subjects, and is a one-tailed test as the prediction was that the children would perform better in the morning.
4. What is the statistical null hypothesis? That the mean score in the morning equals the mean score in the afternoon.
5. What level of significance did the teacher choose? P = 0.05.
6. Is the t-statistic greater than or less than the t for that level of significance? The calculated value of “T” is 2.65, which is greater than the table value of “T” = 1.895.
7. Does she accept or reject her statistical null hypothesis? She rejects the null hypothesis.
8. Does she accept or reject her research hypothesis? She accepted it.
Summarize the second example given:
- What is the research hypothesis? Will a new sleeping pill have a different effect on men and women?
- What data were collected to test the research hypothesis? The number of extra hours slept.
- Is this a one-tailed test or a two-tailed test? Two-tailed.
- What is the statistical null hypothesis? 12
- What level of significance was chosen? P=0.05
- Is the t-statistic greater than or less than the t for that level of significance? The calculated value of “T” is 1.82, which is less than the table value of “T” = 2.179.
- Should you accept or reject the statistical null hypothesis? Fail to reject.
- Should you accept or reject the research hypothesis? Reject.
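Both examples reach their conclusion by comparing the calculated t with the table value. A minimal Python sketch of that comparison is below, using the four numbers reported in the answers above. The function name `compare_to_critical` is only illustrative, and for the one-tailed example it assumes the effect is in the predicted direction.

```python
def compare_to_critical(t_calculated, t_critical):
    """Compare a calculated t with the table (critical) value."""
    # Assumption: for a one-tailed test the effect is in the predicted direction,
    # so only the size of t needs to be checked here.
    if abs(t_calculated) > t_critical:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# First example (teacher, one-tailed): calculated 2.65, table value 1.895.
print(compare_to_critical(2.65, 1.895))

# Second example (sleeping pill, two-tailed): calculated 1.82, table value 2.179.
print(compare_to_critical(1.82, 2.179))
```

The first call gives "reject" and the second gives "fail to reject," matching the two summaries.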
What question does the t-test answer, in general? The t-test compares two groups. It calculates the differences, and analyzes that list of differences based on the assumption that the differences in the entire population follow a bell curve distribution.
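The idea of calculating the differences and then analyzing that list can be sketched in Python. This is only an illustration of a paired t statistic: the scores are made up, and `paired_t_statistic` is a hypothetical helper name, not a function from Excel or from Hinton.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """t statistic for paired samples: mean difference over its standard error."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    # Standard error of the differences: sample stdev divided by sqrt(n).
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return mean_difference / standard_error


# Hypothetical scores for five subjects, made up for illustration only.
morning = [6, 7, 8, 5, 7]
afternoon = [5, 5, 5, 4, 6]

t_value = paired_t_statistic(morning, afternoon)
print(round(t_value, 2), "with", len(morning) - 1, "degrees of freedom")
```

The calculated t would then be compared with the table value for n - 1 degrees of freedom at the chosen level of significance.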
What question do you think you’ll use as part of the t-test to explore in your research project? Does total mass in Nucella lamellosa differ between specimens from False Bay and the ones at the Land Bank?