# Mann-Whitney U test

In statistics, the Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW) test, the Wilcoxon rank-sum test, or the Wilcoxon–Mann–Whitney test) is a non-parametric test for assessing whether one of two independent samples of observations tends to have larger values than the other. It is one of the best-known non-parametric significance tests. It was first proposed by Frank Wilcoxon in 1945 for equal sample sizes, and was extended to arbitrary sample sizes and in other ways by H. B. Mann and D. R. Whitney (1947). MWW is virtually identical to performing an ordinary parametric two-sample t test on the data after ranking over the combined samples.
## Assumptions and formal statement of hypotheses

Although Mann and Whitney (1947) developed the MWW test under the assumption of continuous responses, with the alternative hypothesis being that one distribution is stochastically greater than the other, there are many other ways to formulate the null and alternative hypotheses such that the MWW test remains valid.[1] A very general formulation is to assume that:

1. All the observations from both groups are independent of each other.
2. The responses are ordinal or continuous measurements (i.e. one can at least say, of any two observations, which is the greater).
3. Under the null hypothesis, the probability of an observation from one population (X) exceeding an observation from the second population (Y) equals the probability of an observation from Y exceeding an observation from X. That is, there is a symmetry between the populations with respect to the probability of randomly drawing a larger observation.
4. Under the alternative hypothesis, the probability of an observation from one population (X) exceeding an observation from the second population (Y) (after correcting for ties) is not equal to 0.5. The alternative may also be stated in terms of a one-sided test, for example: P(X > Y) + 0.5 P(X = Y) > 0.5.

If we add stricter assumptions than those above, such that the responses are assumed continuous and the alternative is a location shift (i.e. F1(x) = F2(x + δ)), then we can interpret a significant MWW test as showing a significant difference in medians.
Under this location shift assumption, we can also interpret the MWW test as assessing whether the Hodges–Lehmann estimate of the difference in central tendency between the two populations is zero. The Hodges–Lehmann estimate for this two-sample problem is the median of all possible differences between an observation in the first sample and an observation in the second sample.

The general null hypothesis of a symmetry between populations with respect to obtaining a larger observation is sometimes stated more narrowly as both populations having exactly the same distribution. However, such a specific formulation of the MWW test is not consistent with the original formulation of Mann and Whitney (1947). Furthermore, it leads to problems with the interpretation of test results when the two distributions have different variances: for example, the test has essentially no power to reject the null hypothesis when both populations are normally distributed with the same mean but different variances. In fact, if we formulate the null hypothesis as X and Y having the same distribution, the alternative hypothesis must be that the distributions of X and Y are the same except for a shift in location; otherwise the test may have little power (or no power at all) to reject the null hypothesis.

## Calculations

The test involves the calculation of a statistic, usually called U, whose distribution under the null hypothesis is known. In the case of small samples, the distribution is tabulated, but for sample sizes above ~20 there is a good approximation using the normal distribution. Some books tabulate statistics equivalent to U, such as the sum of ranks in one of the samples, rather than U itself.

The U test is included in most modern statistical packages. It is also easily calculated by hand, especially for small samples. There are two ways of doing this.

For small samples a direct method is recommended. It is very quick, and gives an insight into the meaning of the U statistic.
1. Choose the sample for which the ranks seem to be smaller (the only reason to do this is to make computation easier). Call this "sample 1", and call the other sample "sample 2".
2. Taking each observation in sample 1, count the number of observations in sample 2 that are smaller than it (count a half for any that are equal to it). The total of these counts is U.

For larger samples, a formula can be used:

1. Arrange all the observations into a single ranked series. That is, rank all the observations without regard to which sample they are in.
2. Add up the ranks for the observations which came from sample 1. The sum of ranks in sample 2 follows by calculation, since the sum of all the ranks equals N(N + 1)/2, where N is the total number of observations.
3. U is then given by:

   $$U_1 = R_1 - \frac{n_1(n_1 + 1)}{2}$$

   where n1 is the sample size for sample 1, and R1 is the sum of the ranks in sample 1.

Note that there is no specification as to which sample is considered sample 1. An equally valid formula for U is

$$U_2 = R_2 - \frac{n_2(n_2 + 1)}{2}$$

The smaller value of U1 and U2 is the one used when consulting significance tables. The sum of the two values is given by

$$U_1 + U_2 = R_1 - \frac{n_1(n_1 + 1)}{2} + R_2 - \frac{n_2(n_2 + 1)}{2}$$

Knowing that R1 + R2 = N(N + 1)/2 and N = n1 + n2, and doing some algebra, we find that the sum is

$$U_1 + U_2 = n_1 n_2$$

The maximum value of U is the product of the sample sizes for the two samples. In such a case, the "other" U would be 0.

The Mann–Whitney U is equivalent to the area under the receiver operating characteristic curve (AUC), which can be readily calculated from it:

$$\mathrm{AUC}_1 = \frac{U_1}{n_1 n_2}$$

## Examples

### Illustration of calculation methods

Suppose that Aesop is dissatisfied with his classic experiment in which one tortoise was found to beat one hare in a race, and decides to carry out a significance test to discover whether the results could be extended to tortoises and hares in general. He collects a sample of 6 tortoises and 6 hares, and makes them all run his race.
The order in which they reach the finishing post (their rank order, from first to last) is as follows, writing T for a tortoise and H for a hare:

T H H H H H T T T T T H

What is the value of U?

- Using the direct method, we take each tortoise in turn and count the number of hares it is beaten by (that is, hares with a lower rank), getting 0, 5, 5, 5, 5, 5, which means U = 25. Alternatively, we could take each hare in turn and count the number of tortoises it is beaten by. In this case, we get 1, 1, 1, 1, 1, 6, so U = 1 + 1 + 1 + 1 + 1 + 6 = 11. Note that the sum of these two values for U is 36, which is 6 × 6.
- Using the indirect method: the sum of the ranks achieved by the tortoises is 1 + 7 + 8 + 9 + 10 + 11 = 46. Therefore U = 46 − (6 × 7)/2 = 46 − 21 = 25. The sum of the ranks achieved by the hares is 2 + 3 + 4 + 5 + 6 + 12 = 32, leading to U = 32 − 21 = 11.

### Illustration of object of test

A second example illustrates the point that the Mann–Whitney test does not test for equality of medians. Consider another hare and tortoise race, with 19 participants of each species, in which the outcomes are as follows:

H H H H H H H H H T T T T T T T T T T H H H H H H H H H H T T T T T T T T T

The median tortoise here comes in at position 19, and thus actually beats the median hare, which comes in at position 20. However, the value of U (for hares) is 100 (using the quick method of calculation described above, we see that each of 10 hares is beaten by 10 tortoises, so U = 10 × 10). Consulting tables, or using the approximation below, shows that this U value gives significant evidence that hares tend to do better than tortoises (p […]

## Relation to other tests

### Different distributions

[…] X)), the Wilcoxon–Mann–Whitney test can be used even if the shapes of the distributions are different.
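Returning to the first race in the Examples section, the direct count and the rank-sum formula can be checked with a short Python sketch. The helper names `u_direct` and `u_from_ranks` are invented for this illustration and do not come from any library. Ties are handled with half counts and midranks, which this particular race does not need because every finishing position is distinct.

```python
def u_direct(sample1, sample2):
    """Direct method: for each value in sample1, count the values in
    sample2 that are smaller, scoring 0.5 for each tie."""
    return sum(
        1.0 if y < x else 0.5 if y == x else 0.0
        for x in sample1
        for y in sample2
    )


def u_from_ranks(sample1, sample2):
    """Rank-sum method: U1 = R1 - n1 * (n1 + 1) / 2, using midranks for ties."""
    pooled = sorted(sample1 + sample2)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy 1-based positions i+1 .. j; give each the average.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    r1 = sum(rank_of[x] for x in sample1)
    n1 = len(sample1)
    return r1 - n1 * (n1 + 1) / 2


# Finishing order of the first race (T = tortoise, H = hare).
order = "THHHHHTTTTTH"
tortoises = [pos for pos, animal in enumerate(order, start=1) if animal == "T"]
hares = [pos for pos, animal in enumerate(order, start=1) if animal == "H"]

print(u_direct(tortoises, hares), u_from_ranks(tortoises, hares))  # 25.0 25.0
print(u_direct(hares, tortoises), u_from_ranks(hares, tortoises))  # 11.0 11.0
```

Both methods agree, and the two U values add up to 6 × 6 = 36, matching the hand calculation.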
The concordance probability is exactly equal to the area under the receiver operating characteristic curve (AUC) that is often used in this context.[citation needed] If one desires a simple shift interpretation, the U test should not be used when the distributions of the two samples are very different, as it can give erroneously significant results.

#### Alternatives

In that situation, the unequal variances version of the t test is likely to give more reliable results, but only if normality holds. Alternatively, some authors (e.g. Conover) suggest transforming the data to ranks (if they are not already ranks) and then performing the t test on the transformed data, the version of the t test used depending on whether or not the population variances are suspected to be different. Rank transformations do not preserve variances, so it is difficult to see how this would help. The Brown–Forsythe test has been suggested as an appropriate non-parametric equivalent to the F test for equal variances.

### Kendall's τ

The U test is related to a number of other non-parametric statistical procedures. For example, it is equivalent to Kendall's τ correlation coefficient if one of the variables is binary (that is, it can only take two values).

### ρ statistic

A statistic called ρ that is linearly related to U and widely used in studies of categorization (discrimination learning involving concepts) is calculated by dividing U by its maximum value for the given sample sizes, which is simply n1 × n2. ρ is thus a non-parametric measure of the overlap between two distributions; it can take values between 0 and 1, and it is an estimate of P(Y > X) + 0.5 P(Y = X), where X and Y are randomly chosen observations from the two distributions. Both extreme values represent complete separation of the distributions, while a ρ of 0.5 represents complete overlap. This statistic was first proposed by Richard Herrnstein (see Herrnstein et al., 1976).
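As a rough illustration of this definition, the Python sketch below computes ρ by direct counting for the second hare-and-tortoise race from the Examples section. The function name `rho_statistic` and the encoding of the race as a string are assumptions made for this sketch rather than part of any library.

```python
def rho_statistic(x, y):
    """Estimate P(Y > X) + 0.5 * P(Y = X) as U / (n1 * n2) by direct counting."""
    u = sum(
        1.0 if b > a else 0.5 if b == a else 0.0
        for a in x
        for b in y
    )
    return u / (len(x) * len(y))


# Finishing order of the second race: 9 hares, 10 tortoises, 10 hares, 9 tortoises.
order = "H" * 9 + "T" * 10 + "H" * 10 + "T" * 9
hares = [pos for pos, animal in enumerate(order, start=1) if animal == "H"]
tortoises = [pos for pos, animal in enumerate(order, start=1) if animal == "T"]

# A larger finishing position means a slower animal, so this is the estimated
# probability that a randomly chosen tortoise finishes after a randomly chosen hare.
rho_hares = rho_statistic(hares, tortoises)
print(round(rho_hares, 3))  # 0.723

# The extremes and the midpoint behave as described in the text.
print(rho_statistic([1, 2, 3], [4, 5, 6]))  # 1.0 (complete separation)
print(rho_statistic([4, 5, 6], [1, 2, 3]))  # 0.0 (complete separation, reversed)
print(rho_statistic([1, 2], [1, 2]))        # 0.5 (complete overlap)
```

Here the count in the numerator is 261 of the 19 × 19 = 361 hare-tortoise pairs; the remaining 100 pairs are the U value for hares quoted in the example.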
The usefulness of the ρ statistic can be seen in the case of the odd example used above, where two distributions that were significantly different on a U test nonetheless had nearly identical medians: the ρ value in this case is approximately 0.723 in favour of the hares, correctly reflecting the fact that even though the median tortoise beat the median hare, the hares collectively did better than the tortoises collectively.

## Example statement of results

In reporting the results of a Mann–Whitney test, it is important to state:

- A measure of the central tendencies of the two groups (means or medians; since the Mann–Whitney is an ordinal test, medians are usually recommended)
- The value of U
- The sample sizes
- The significance level

In practice some of this information may already have been supplied, and common sense should be used in deciding whether to repeat it. A typical report might run, "Median latencies in groups E and C were 153 and 247 ms; the distributions in the two groups differed significantly (Mann–Whitney U = 10.5, n1 = n2 = 8, P […]
