Stat Help

Summary Statistics

Columns

Provides the following descriptive statistics in tabular format for the column(s) selected: sample size (n), mean, variance, standard deviation (Std. Dev.), Standard Error (Std. Err.), median, range, minimum, maximum, first quartile (Q1) and third quartile (Q3).

Select the columns for which summary statistics will be computed.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results. If a Group By column is selected, choose whether to display the output in separate tables for each column selected or in separate tables for each group.
Click the Next button to select the summary statistics to be computed. The statistics will be displayed in the order in which they are selected (from left to right). Additional percentiles may also be entered as a space or comma delimited list.
Check the Store output in data table option if the output is to be placed in the data table.
Click the Calculate button to view the results.
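
As a rough illustration of what the listed column statistics represent, the sketch below computes them for one column with Python's standard library. The function name summarize is hypothetical, and the "inclusive" quartile method is an assumption: quartile conventions differ between packages, so Q1 and Q3 may not match StatCrunch exactly.

```python
import math
import statistics

def summarize(values):
    """Return descriptive statistics for one column of numbers (hypothetical helper)."""
    n = len(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    # Assumption: "inclusive" quartiles; other conventions give slightly different values.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": n,
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),
        "std_dev": std_dev,
        "std_err": std_dev / math.sqrt(n),  # standard error of the mean
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "min": min(values),
        "max": max(values),
        "Q1": q1,
        "Q3": q3,
    }

stats = summarize([2, 4, 4, 5, 7, 9, 10, 12])
```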

Rows

Provides the following summary statistics for each row in the data table for the columns selected: count, sum, mean, variance, standard deviation, minimum, median and maximum.

Select the columns for which summary statistics will be computed.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Next button to select the summary statistics (by default, all are selected) to be computed. The statistics will be displayed in the order in which they are selected (from left to right).
Check the Store output in data table option if the output is to be placed in the data table.
Click the Calculate button to view the results.

Correlation

Computes the Pearson correlation between two columns or the corresponding correlation matrix if three or more columns are selected.

Select the columns to be used in the computation.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results. A separate result (matrix) will be computed for each distinct value of the Group By column.
Click Next to alter the format of the correlation matrix.
Select options under Display in the correlation matrix to add additional values to the correlation matrix.
By default, all selected columns will be displayed in the correlation matrix. To display only specific columns, choose the Selected option under Display columns and specify those to be included. The specified columns will be displayed in the order in which they are selected.
Under Sort rows by correlation with, a column for sorting the correlation matrix in either ascending or descending order may be specified. The column that is selected for sorting will be the first one displayed in the matrix.
Click the Calculate button to view the results.
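
For reference, the Pearson correlation and the corresponding matrix can be sketched as follows. The helper names pearson and correlation_matrix are illustrative only and are not part of StatCrunch.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length columns."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation of every pair of columns; columns maps a name to a list of values."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}

matrix = correlation_matrix({"a": [1, 2, 3, 4], "b": [2, 4, 6, 9], "c": [4, 3, 2, 1]})
```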

Covariance

Computes the covariance between two columns or the lower half of the covariance matrix if three or more columns are selected.

Select the columns to be used in the computation.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results. A separate result (matrix) will be computed for each distinct value of the Group By column.
Click the Calculate button to view the results.
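
The lower half of a covariance matrix can be sketched as below. This assumes the sample covariance (n - 1 divisor); the function names are hypothetical.

```python
def covariance(x, y):
    """Sample covariance between two equal-length columns (n - 1 divisor, an assumption)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def lower_covariance(columns):
    """Lower triangle (including the diagonal) of the covariance matrix."""
    names = list(columns)
    rows = {}
    for i, r in enumerate(names):
        # Row i keeps only the columns up to and including position i.
        rows[r] = {c: covariance(columns[r], columns[c]) for c in names[: i + 1]}
    return rows

lower = lower_covariance({"a": [1, 2, 3], "b": [2, 4, 6], "c": [3, 1, 2]})
```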

Grouped/Binned data

Provides descriptive statistics in tabular format for data in binned format consisting of bin values and associated counts.

Select the column containing the bins for which summary statistics will be computed. Bin values must use "to" or "-" as a delimiter, e.g. "10 to 20" or "10 - 20".
Enter an optional column for the counts associated with each bin. If this column is omitted, each bin will have a count of 1.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Next button to select the summary statistics to be computed. The statistics will be displayed in the order in which they are selected (from left to right). Additional percentiles may also be entered as a space or comma delimited list.
Check the Store output in data table option if the output is to be placed in the data table.
Click the Calculate button to view the results.
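
A minimal sketch of working with binned data follows. It parses bin labels in the two accepted delimiter styles and computes a mean, assuming each bin is represented by its midpoint; that midpoint convention is an assumption and may differ from how StatCrunch treats bins. The names parse_bin and binned_mean are hypothetical.

```python
import re

# Two numbers separated by "to" or "-", with optional surrounding spaces.
BIN_PATTERN = re.compile(r"\s*(-?\d+(?:\.\d+)?)\s*(?:to|-)\s*(-?\d+(?:\.\d+)?)\s*")

def parse_bin(label):
    """Split a label such as "10 to 20" or "10 - 20" into (lower, upper)."""
    match = BIN_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized bin label: {label!r}")
    return float(match.group(1)), float(match.group(2))

def binned_mean(bins, counts=None):
    """Mean of binned data using bin midpoints (assumed convention)."""
    if counts is None:
        counts = [1] * len(bins)  # an omitted count column gives each bin a count of 1
    midpoints = [(lo + hi) / 2 for lo, hi in map(parse_bin, bins)]
    return sum(m * c for m, c in zip(midpoints, counts)) / sum(counts)

mean = binned_mean(["0 to 10", "10 - 20"], [1, 3])
```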

Tables

Frequency

Provides the frequency and relative frequency for each unique value within selected columns.

Select the columns to be used in the computation.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Calculate button to view the results.
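
The frequency and relative frequency of each unique value can be sketched as follows; frequency_table is an illustrative name only.

```python
from collections import Counter

def frequency_table(values):
    """Map each unique value to (frequency, relative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: (count, count / total) for value, count in counts.items()}

table = frequency_table(["a", "b", "a", "c"])
```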

Contingency

Computes a two-way frequency table for distinct values in two separate columns and provides a test for independence. The table can be generated using raw data (default) or summary data. To use summary data click on the Use summary data! link at the top of the dialog page.

If the data set consists of raw measurements made on individual units, then follow the instructions below to construct a contingency table.
1. Select the column that contains the values to be categorized across the rows of the contingency table.
2. Select the column that contains the values to be categorized across the columns of the contingency table.
3. Enter an optional Where clause to specify the data rows to be included in the computation.
4. Select an optional Group By column to group results. A separate contingency table will be constructed for each distinct value of the Group By column.
5. Click the Next button to specify additional information (Row, Column and Total percents) to be displayed in each table cell.
6. Click the Next button again to specify the hypothesis testing and confidence interval results to be computed.
7. Click the Calculate button to view the results.
The following table shows how summary data must be entered in the StatCrunch data table to form a contingency table. The data are summary output relating back problems (Yes or No) to sex (Male or Female).

sex     No  Yes
Female  31  24
Male    37  8

If the data set contains summary counts as shown above, then follow the instructions below to construct a contingency table.
1. Select the columns that contain the summary counts. In the example above, these columns would be No and Yes.
2. Select the column that contains the row labels. In the example above, this column would be sex.
3. Enter a name for the column variable. In the example above, the name might be "back problems".
4. Click the Next button to specify additional information (Row, Column and Total percents) to be displayed in each table cell.
5. Click the Next button again to specify the hypothesis testing and confidence interval results to be computed.
6. Click the Calculate button to view the results.
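
To illustrate a test for independence on the back problems example above, the sketch below computes a Pearson chi-square statistic from the summary counts. Treating the test as a Pearson chi-square test is an assumption about the procedure, and the function names are hypothetical. The P-value helper applies only to 2 x 2 tables (1 degree of freedom).

```python
import math
from statistics import NormalDist

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def p_value_df1(chi2):
    """P-value for 1 degree of freedom, where chi-square is a squared standard normal."""
    return 2 * (1 - NormalDist().cdf(math.sqrt(chi2)))

# Rows: Female, Male. Columns: No, Yes.
chi2, df = chi_square_independence([[31, 24], [37, 8]])
p = p_value_df1(chi2)
```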

Outcome Table

Displays a table highlighting delimited outcomes in selected columns that contain unstructured lists of outcomes. An outcome table goes through a column row by row and finds the unique items across all of the lists in the column. The number of rows that match each individual outcome is tabulated and displayed both as a count and as a horizontal bar of scaled width so that the counts for individual items can be easily compared. See the StatCrunch Friend Data application at http://www.statcrunch.com/frienddata/ for numerous examples of outcome tables.

Select the columns containing the outcomes of interest.
Enter an optional Where clause to specify the data rows to be included.
Select an optional Group By column to group results. By default, if a Group By column is specified, a separate table will be produced for each unique value of the Group By column.
Specify the delimiter to be used to separate outcomes. A comma is the default delimiter.
Specify whether or not to limit an outcome to one occurrence per cell. This option can be used to avoid duplication of outcomes for experimental units.
Click the Next button to specify additional options.
Choose the items (counts and/or percentages) to be tabled for each outcome.
Specify the maximum bar width in pixels. The default value is 500.
Choose the ordering for the outcomes to be tabled. By default the outcomes will be tabled in descending order in terms of their frequencies.
Click the Next button to specify additional options.

Specify whether or not to remove common words and/or punctuation from the outcomes table. These options are not checked by default. They are most useful when the outcomes being tabled are individual words. StatCrunch has a long list of common words shown below and in the dialog window. Words can be deleted from the common words list by removing them from the text area, and additional words can be added to the list by entering them in the text area.

Common words list:

& a able about above abroad according accordingly across actually adj after afterwards again against ago ahead 
ain't all allow allows almost alone along alongside already also although always am amid amidst among amongst 
an and another any anybody anyhow anyone anything anyway anyways anywhere apart appear appreciate appropriate 
are aren't around as a's aside ask asking associated at available away awfully b back backward backwards be became 
because become becomes becoming been before beforehand begin behind being believe below beside besides best better 
between beyond both brief but by c came can cannot cant can't caption cause causes certain certainly changes 
clearly c'mon co co. com come comes concerning consequently consider considering contain containing contains 
corresponding could couldn't course c's currently d dare daren't definitely described despite did didn't different 
directly do does doesn't doing done don't down downwards during e each edu eg eight eighty either else elsewhere 
end ending enough entirely especially et etc even ever evermore every everybody everyone everything everywhere ex 
exactly example except f fairly far farther few fewer fifth first five followed following follows for forever former 
formerly forth forward found four from further furthermore g get gets getting given gives go goes going gone got gotten 
greetings h had hadn't half happens hardly has hasn't have haven't having he he'd he'll hello help hence her here hereafter 
hereby herein here's hereupon hers herself he's hi him himself his hither hopefully how howbeit however hundred i 
i'd ie if ignored i'll i'm immediate in inasmuch inc inc. indeed indicate indicated indicates inner inside insofar 
instead into inward is isn't it it'd it'll its it's itself i've j just k keep keeps kept know known knows l last lately 
later latter latterly least less lest let let's like liked likely likewise little look looking looks low lower ltd m 
made mainly make makes many may maybe mayn't me mean meantime meanwhile merely might mightn't mine minus miss more 
moreover most mostly mr mrs much must mustn't my myself n name namely nd near nearly necessary need needn't needs neither 
never neverless nevertheless new next nine ninety no nobody non none nonetheless noone no-one nor normally not 
nothing notwithstanding novel now nowhere o obviously of off often oh ok okay old on once one ones one's only onto 
opposite or other others otherwise ought oughtn't our ours ourselves out outside over overall own p particular particularly 
past per perhaps placed please plus possible presumably probably provided provides q que quite qv r rather rd re really 
reasonably recent recently regarding regardless regards relatively respectively right round s said same saw say saying 
says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious seriously seven several 
shall shan't she she'd she'll she's should shouldn't since six so some somebody someday somehow someone something sometime 
sometimes somewhat somewhere soon sorry specified specify specifying still sub such sup sure t take taken taking tell 
tends th than thank thanks thanx that that'll thats that's that've the their theirs them themselves then thence there 
thereafter thereby there'd therefore therein there'll there're theres there's thereupon there've these they they'd 
they'll they're they've thing things think third thirty this thorough thoroughly those though three through throughout 
thru thus till to together too took toward towards tried tries truly try trying t's twice two u un under underneath 
undoing unfortunately unless unlike unlikely until unto up upon upwards us use used useful uses using usually v value 
various versus very via viz vs w want wants was wasn't way we we'd welcome well we'll went were we're weren't we've what 
whatever what'll what's what've when whence whenever where whereafter whereas whereby wherein where's whereupon wherever 
whether which whichever while whilst whither who who'd whoever whole who'll whom whomever who's whose why will willing 
wish with within without wonder won't would wouldn't x y yes yet you you'd you'll your you're yours yourself yourselves 
you've z zero
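
The tabulation behind an outcome table can be sketched as below: each cell is split on the delimiter, common words are optionally dropped, and outcomes are counted in descending order of frequency. The function outcome_counts and its parameter names are hypothetical, and the default chosen here for once_per_cell is an assumption.

```python
from collections import Counter

def outcome_counts(cells, delimiter=",", once_per_cell=False, common_words=()):
    """Count delimited outcomes across all cells of a column."""
    skip = {word.lower() for word in common_words}
    counts = Counter()
    for cell in cells:
        outcomes = [item.strip() for item in cell.split(delimiter)]
        # Drop empty items and any outcome found in the common words list.
        outcomes = [o for o in outcomes if o and o.lower() not in skip]
        if once_per_cell:
            outcomes = set(outcomes)  # at most one occurrence per cell
        counts.update(outcomes)
    return counts.most_common()  # descending by frequency

rows = ["pizza, pasta", "pizza, pizza", "the, salad"]
all_occurrences = dict(outcome_counts(rows, common_words=["the"]))
per_cell = dict(outcome_counts(rows, once_per_cell=True, common_words=["the"]))
```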

Z Statistics

One Sample

Provides hypothesis tests and confidence intervals for a population mean based on a single sample when the population variance is known.

If your data set contains raw data values, then follow the instructions below.
1. Select the column containing the sample values for the calculation(s). If more than one column is selected, a separate test will be done for each column. Resulting P-values are not adjusted for multiple comparisons.
2. Enter an optional standard deviation value. If nothing is entered, the sample standard deviation will be used.
3. Enter an optional Where clause to specify the data rows to be included in the computation.
4. Select an optional Group By column to group results. A separate hypothesis test/confidence interval will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
5. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null mean value of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the population mean.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval:
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
6. Click the Calculate button to view the results. The output will consist of a table containing the number of observations (n), the sample mean, the standard error of the mean (Std. Err.), the Z statistic (Z-stat) and the P-value for the test.
If you have summary statistics for your sample, then follow the instructions below.
1. Enter the values for the sample mean, the standard deviation (known value or sample value) and the sample size.
2. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null mean value of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the population mean.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval:
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
3. Click the Calculate button to view the results. The output will consist of a table containing the number of observations (n), the sample mean, the standard error of the mean (Std. Err.), the Z statistic (Z-stat) and the P-value for the test.
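
The quantities in the output table can be sketched from summary statistics as follows. This is a textbook one-sample Z computation, not StatCrunch's own code; one_sample_z and its parameters are hypothetical names.

```python
import math
from statistics import NormalDist

def one_sample_z(mean, sd, n, null_mean=0.0, alternative="not equal", level=0.95):
    """Z statistic, P-value and confidence interval for a population mean."""
    std_err = sd / math.sqrt(n)
    z = (mean - null_mean) / std_err
    cdf = NormalDist().cdf(z)
    if alternative == "less than":
        p = cdf
    elif alternative == "greater than":
        p = 1 - cdf
    else:  # "not equal": two-sided P-value
        p = 2 * min(cdf, 1 - cdf)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    ci = (mean - z_crit * std_err, mean + z_crit * std_err)
    return {"std_err": std_err, "z": z, "p": p, "ci": ci}

result = one_sample_z(mean=52, sd=10, n=25, null_mean=50)
```
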
If you select power/sample size, then follow the instructions below for the calculator.
- The first tab on the top is for Hypothesis Test Power. This allows you to find the difference, power, or sample size given any two of the values as inputs.
  - Enter the required parameters of alpha (level of significance), the standard deviation, and the type of alternative hypothesis.
  - Enter two of the following three values: the difference between the true and null means, power, sample size.
  - Click Compute to calculate the value left blank while updating the power curve.
  - Click and drag the red dot on the graph to interactively investigate the power/difference relationship. Note how the values below for difference and power change accordingly.
- The second tab on the top is for Confidence Interval Width. This allows you to find the width or the sample size associated with a confidence interval.
  - Enter the required parameters of confidence level and standard deviation.
  - Enter either the width or the sample size of the confidence interval.
  - Click Compute to calculate the value left blank while updating the sample size/width curve.
  - Click and drag the red dot to interactively investigate the width/sample size relationship. Note how the values below for width and sample size change accordingly.
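
The relationships explored by the calculator can be sketched with the standard normal-theory formulas below. The function names are hypothetical, and "width" is assumed to mean the full interval width (upper limit minus lower limit), which may differ from the calculator's definition.

```python
import math
from statistics import NormalDist

def z_test_power(diff, sd, n, alpha=0.05, alternative="not equal"):
    """Power of a one-sample Z test when the true mean differs from the null by diff."""
    nd = NormalDist()
    shift = diff * math.sqrt(n) / sd  # mean of the Z statistic under the alternative
    if alternative == "greater than":
        return 1 - nd.cdf(nd.inv_cdf(1 - alpha) - shift)
    if alternative == "less than":
        return nd.cdf(nd.inv_cdf(alpha) - shift)
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Two-sided: probability of falling in either rejection region.
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)

def ci_width(sd, n, level=0.95):
    """Full width of the Z confidence interval for a given sample size."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z_crit * sd / math.sqrt(n)

def n_for_width(sd, width, level=0.95):
    """Smallest sample size whose interval is no wider than width."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil((2 * z_crit * sd / width) ** 2)

power = z_test_power(diff=0.5, sd=1, n=32)
```

When the difference is zero, the power reduces to alpha, which is a quick sanity check on any power curve.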

Two Sample

Provides hypothesis tests and confidence intervals for the difference (first sample minus second sample) in two means from independent samples.

If your data set contains raw data values, then follow the instructions below.
1. Select the column containing the first sample.
2. Enter an optional known standard deviation value for the first sample. If nothing is entered, the sample standard deviation of the first sample will be used.
3. Enter an optional WHERE clause to specify the data rows to be included in the first sample.
4. Select the column containing the second sample. This column can be the same column containing the first sample.
5. Enter an optional known standard deviation value for the second sample. If nothing is entered, the sample standard deviation of the second sample will be used.
6. Enter an optional WHERE clause to specify the data rows to be included in the second sample.
7. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null mean difference value of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the mean difference.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval:
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
8. Click the Calculate button to view the results. The output will consist of a table containing the number of observations from each sample (n₁ and n₂), the difference in sample means (Sample Mean), the standard error of the sample mean difference (Std. Err.), the Z statistic (Z-stat) and the P-value for the test.
If you have summary statistics for each of your samples, then follow the instructions below.
1. Enter the values for the sample mean, the standard deviation (known value or sample value) and the sample size for each sample.
2. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null mean difference value of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the mean difference.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval:
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
3. Click the Calculate button to view the results. The output will consist of a table containing the number of observations from each sample (n₁ and n₂), the difference in sample means (Sample Mean), the standard error of the sample mean difference (Std. Err.), the Z statistic (Z-stat) and the P-value for the test.
If you select power/sample size, then follow the instructions below for the calculator.
- The first tab on the top is for Hypothesis Test Power. This allows you to find the difference, power, or sample size given any two of the values as inputs.
  - Enter the required parameters of alpha (level of significance), the first standard deviation, the second standard deviation, and the type of alternative hypothesis.
  - Enter two of the following three values: the difference between the two means, power, and sample size per group.
  - Click Compute to calculate the value left blank while updating the power curve.
  - Click and drag the red dot to interactively investigate the power/difference relationship. Note how the values below for difference and power change accordingly.
- The second tab on the top is for Confidence Interval Width. This allows you to find the width or the sample size for an associated confidence interval.
  - Enter the required parameters of confidence level, first standard deviation, and second standard deviation.
  - Enter either the width or the sample size per group for the confidence interval.
  - Click Compute to calculate the value left blank while updating the sample size/width curve.
  - Click and drag the red dot to interactively investigate the sample size/width relationship. Note how the values below for width and sample size change accordingly.
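
For the hypothesis test itself, a minimal sketch of the two-sample Z computation from summary statistics is given below. It uses the usual unpooled standard error for independent samples; two_sample_z is a hypothetical name.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0, alternative="not equal"):
    """Z statistic and P-value for the difference in means (first minus second)."""
    diff = mean1 - mean2
    std_err = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (diff - null_diff) / std_err
    cdf = NormalDist().cdf(z)
    if alternative == "less than":
        p = cdf
    elif alternative == "greater than":
        p = 1 - cdf
    else:  # "not equal": two-sided P-value
        p = 2 * min(cdf, 1 - cdf)
    return {"diff": diff, "std_err": std_err, "z": z, "p": p}

result = two_sample_z(mean1=10, sd1=3, n1=9, mean2=8, sd2=4, n2=16)
```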

Proportions

One Sample

Provides hypothesis tests and confidence intervals for the proportion of successes in one sample of trials. The procedure used is a Z test using the normal approximation to the binomial. The procedure can be used with raw data (default) or summary data. To use summary data click on the Use summary data! link at the top of the dialog page.

If the data set consists of outcomes of individual trials, then follow the instructions below.
1. Select the column containing the outcomes for the sample.
2. Specify the outcome that denotes a success.
3. Enter an optional Where clause to specify the data rows to be included in the computation.
4. Select an optional Group By column to group results. A separate hypothesis test/confidence interval will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
5. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null success proportion of 0.5 and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the proportion of successes.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval:
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
6. Click the Calculate button to view the results. The output will consist of a table containing the number of success outcomes (Count), the total number of trials (Total), the sample proportion of successes (Sample prop.), the standard error of the sample proportion (Std. Err.), the Z statistic (Z-stat) and the P-value for the test.
If the data is only summary information on the number of successes in a specific number of trials, then follow the steps below.
1. Enter a value for the number of successes.
2. Enter a value for the number of trials.
3. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null success proportion of 0.5 and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the proportion of successes.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval:
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
4. Click the Calculate button to view the results. The output will consist of a table containing the number of success outcomes (Count), the total number of trials (Total), the sample proportion of successes (Sample prop.), the standard error of the sample proportion (Std. Err.), the Z statistic (Z-stat) and the P-value for the test.
If you select power/sample size, then follow the instructions below for the calculator.
- The first tab on the top is for Hypothesis Test Power. This allows you to find the difference, power, or sample size given any two of the values as inputs.
  - Enter the required parameters of alpha (level of significance), the null proportion, and the type of alternative hypothesis.
  - Enter two of the following three values: the population proportion, power, and sample size.
  - Click Compute to calculate the value left blank while updating the power curve.
  - Click and drag the red dot in the power curve to interactively investigate the power/difference relationship. Note how the values below for difference and power change accordingly.
- The second tab on the top is for Confidence Interval Width. This allows you to find the width or the sample size for an associated confidence interval.
  - Enter the required parameters of confidence level and the null proportion.
  - Enter either the width or the sample size for the confidence interval.
  - Click Compute to calculate the value left blank while updating the sample size/width curve.
  - Click and drag the red dot to interactively investigate the sample size/width relationship. Note how the values below for width and sample size change accordingly.
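
The one-sample proportion test and interval can be sketched from summary counts as follows. As assumptions, the test uses the standard error under the null proportion and the interval uses the standard error at the sample proportion; these are common textbook conventions and may not match StatCrunch's output exactly. The name one_prop_z is hypothetical.

```python
import math
from statistics import NormalDist

def one_prop_z(successes, trials, null_p=0.5, alternative="not equal", level=0.95):
    """Z test and confidence interval for a single proportion."""
    p_hat = successes / trials
    se_null = math.sqrt(null_p * (1 - null_p) / trials)  # standard error under the null
    z = (p_hat - null_p) / se_null
    cdf = NormalDist().cdf(z)
    if alternative == "less than":
        p = cdf
    elif alternative == "greater than":
        p = 1 - cdf
    else:  # "not equal": two-sided P-value
        p = 2 * min(cdf, 1 - cdf)
    se_hat = math.sqrt(p_hat * (1 - p_hat) / trials)  # standard error for the interval
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    ci = (p_hat - z_crit * se_hat, p_hat + z_crit * se_hat)
    return {"sample_prop": p_hat, "z": z, "p": p, "ci": ci}

result = one_prop_z(successes=60, trials=100)
```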

Two Sample

Provides confidence intervals and/or hypothesis tests for the difference (first sample minus second sample) in the proportion of successes using data from two independent samples. The procedure used is a Z test using the normal approximation to the binomial. The procedure can be used with raw data (default) or summary data. To use summary data click on the Use summary data! link at the top of the dialog page.

If the data set consists of outcomes of individual trials, then follow the instructions below.
1. Select the column containing the outcomes for the first sample.
2. Specify the outcome that denotes a success for the first sample.
3. Enter an optional WHERE clause to specify the data rows to be included in the first sample.
4. Select the column containing the outcomes for the second sample. This column can be the same column containing the first sample.
5. Specify the outcome that denotes a success for the second sample.
6. Enter an optional WHERE clause to specify the data rows to be included in the second sample.
7. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null difference in success proportions of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the difference in the proportion of successes.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval:
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
8. Click the Calculate button to view the results. The output will consist of a table containing the number of success outcomes in the first sample (Count1), the total number of trials in the first sample (Total1), the number of success outcomes in the second sample (Count2), the total number of trials in the second sample (Total2), the difference in the sample proportions of successes (Sample diff.), the standard error of the sample difference (Std. Err.), the Z statistic (Z-stat) and the P-value for the test.
If the data is only summary information on the number of successes in a specific number of trials, then follow the steps below.
1. Enter a value for the number of successes in the first sample.
2. Enter a value for the number of trials in the first sample.
3. Enter a value for the number of successes in the second sample.
4. Enter a value for the number of trials in the second sample.
5. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null difference in success proportions of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the difference in the proportion of successes.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval:
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
6. Click the Calculate button to view the results. The output will consist of a table containing the number of success outcomes in the first sample (Count1), the total number of trials in the first sample (Total1), the number of success outcomes in the second sample (Count2), the total number of trials in the second sample (Total2), the difference in the sample proportions of successes (Sample diff.), the standard error of the sample difference (Std. Err.), the Z statistic (Z-stat) and the P-value for the test.
If you select power/sample size, then follow the instructions below for the calculator.
- The first tab on the top is for Hypothesis Test Power. This allows you to find the difference, power, or sample size given any two of the values as inputs.
  - Enter the required parameters of alpha (level of significance), the second population proportion, and the type of alternative hypothesis.
  - Enter two of the following three values: the first population proportion, power, and sample size per group.
  - Click Compute to calculate the value left blank while updating the power curve.
  - Click and drag the red dot to interactively investigate the power/difference relationship. Note how the values below for difference and power change accordingly.
- The second tab on the top is for Confidence Interval Width. This allows you to find the width or the sample size for an associated confidence interval.
  - Enter the required parameters of confidence level and both the first and second population proportions.
  - Enter either the width or the sample size per group of the confidence interval.
  - Click Compute to calculate the value left blank while updating the sample size/width curve.
  - Click and drag the red dot to interactively investigate the sample size/width relationship. Notice how the values below for width and sample size change accordingly.
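
A minimal sketch of the two-sample proportion test for the default null difference of zero is shown below. It assumes a pooled standard error, which is the usual textbook choice when the null difference is zero; whether StatCrunch pools in the same way is not stated here. The name two_prop_z is hypothetical.

```python
import math
from statistics import NormalDist

def two_prop_z(count1, total1, count2, total2, alternative="not equal"):
    """Z test for a zero difference in proportions (first minus second)."""
    p1 = count1 / total1
    p2 = count2 / total2
    pooled = (count1 + count2) / (total1 + total2)  # common proportion under the null
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total1 + 1 / total2))
    z = (p1 - p2) / std_err
    cdf = NormalDist().cdf(z)
    if alternative == "less than":
        p = cdf
    elif alternative == "greater than":
        p = 1 - cdf
    else:  # "not equal": two-sided P-value
        p = 2 * min(cdf, 1 - cdf)
    return {"sample_diff": p1 - p2, "std_err": std_err, "z": z, "p": p}

result = two_prop_z(count1=60, total1=100, count2=40, total2=100)
```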

T Statistics

One Sample

Provides confidence intervals and/or hypothesis tests for a population mean based on a single sample when the population variance is not known.

If your data set contains raw data values, then follow the instructions below.
1. Select the column containing the values for the calculation(s). If more than one column is selected, a separate test will be done for each column. Resulting P-values are not adjusted for multiple comparisons.
2. Enter an optional Where clause to specify the data rows to be included in the computation.
3. Select an optional Group By column to group results. A separate hypothesis test/confidence interval will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
4. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null mean value of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the population mean.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval:
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
5. Click the Calculate button to view the results. The output will consist of a table containing the sample mean, the standard error of the mean (Std. Err.), the degrees of freedom (DF), the T statistic (T-stat) and the P-value for the test.
If you have summary statistics for your sample, then follow the instructions below.
1. Enter the values for the sample mean, the sample standard deviation and the sample size.
2. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null mean value of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the population mean.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
3. Click the Calculate button to view the results. The output will consist of a table containing the sample mean, the standard error of the mean (Std. Err.), the degrees of freedom (DF), the T statistic (T-stat) and the P-value for the test.
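The quantities in the output table can be reproduced by hand from summary statistics. The sketch below is illustrative only: the function names are hypothetical, and the P-value is obtained by numerically integrating the Student t density, which is a stand-in for whatever method the software uses internally.

```python
import math

def one_sample_t(mean, sd, n, null_mean=0.0):
    """Return (std_err, df, t_stat) from the sample mean, standard deviation and size."""
    std_err = sd / math.sqrt(n)           # standard error of the mean
    df = n - 1                            # degrees of freedom
    t_stat = (mean - null_mean) / std_err
    return std_err, df, t_stat

def t_density(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=2000):
    """P-value for the 'not equal' alternative via Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3          # area between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_area)

# Sample mean 52, sample standard deviation 10, sample size 25, null mean 50.
std_err, df, t_stat = one_sample_t(52, 10, 25, null_mean=50)
print(std_err, df, t_stat)                # 2.0 24 1.0
print(two_sided_p_value(t_stat, df))
```

For a less than or greater than alternative, the corresponding one-sided tail area would be used instead of the two-sided area.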
If you select power/sample size, then follow the instructions below for the calculator.
- The first tab on the top is for Hypothesis Test Power. This allows you to find the difference, power, or sample size given any two of the values as inputs.
  - Enter the required parameters of alpha (level of significance), the population standard deviation, and the type of alternative hypothesis.
  - Enter two of the following three values: the difference, power and sample size.
  - Click Compute to calculate the value left blank while updating the power curve.
  - Click and drag the red dot to interactively investigate the power/difference relationship. Note how the values below for difference and power change accordingly.
- The second tab on the top is for Confidence Interval Width. This allows you to find the width or the sample size for an associated confidence interval.
  - Enter the required parameters of confidence level and the population standard deviation.
  - Enter either the width or the sample size of the confidence interval.
  - Click Compute to calculate the value left blank while updating the sample size/width curve.
  - Click and drag the red dot to interactively investigate the sample size/width relationship. Note how the values below for width and sample size change accordingly.
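The sample size/width relationship can be explored outside the calculator as well. The sketch below assumes the common normal-quantile approximation, width = 2 * z * sigma / sqrt(n); the calculator itself may use a t quantile, so its answers can differ slightly. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def ci_width(sigma, n, level=0.95):
    """Approximate full width of a confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z * sigma / math.sqrt(n)

def sample_size_for_width(sigma, width, level=0.95):
    """Smallest sample size whose approximate interval width is at most `width`."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil((2 * z * sigma / width) ** 2)

print(ci_width(sigma=10, n=100))               # width for a given sample size
print(sample_size_for_width(sigma=10, width=4))  # sample size for a given width
```

Because the width shrinks with the square root of the sample size, halving the width requires roughly four times as many observations.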

Two Sample

Provides hypothesis tests and confidence intervals for the difference (first sample minus second sample) in two means from independent samples.

If your data set contains raw data values, then follow the instructions below.
1. Select the column containing the first sample.
2. Enter an optional WHERE clause to specify the data rows to be included in the first sample.
3. Select the column containing the second sample. This column can be the same column containing the first sample.
4. Enter an optional WHERE clause to specify the data rows to be included in the second sample.
5. Select the "Pool variance" option if desired. If selected (done by default), information from both samples will be pooled to compute an overall estimate of variance.
6. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null mean difference value of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the mean difference.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
7. Click the Calculate button to view the results. The output will consist of a table containing the sample mean difference (Sample Mean), the standard error of the difference in sample means (Std. Err.), the degrees of freedom (DF), the T statistic (T-stat) and the P-value for the test.
If you have summary statistics for each of your samples, then follow the instructions below.
1. Enter the values for the sample mean, the sample standard deviation and the sample size for each sample.
2. Select the "Pool variance" option if desired. If selected (done by default), information from both samples will be pooled to compute an overall estimate of variance.
3. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null mean difference value of zero and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the mean difference.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
4. Click the Calculate button to view the results. The output will consist of a table containing the sample mean difference (Sample Mean), the standard error of the difference in sample means (Std. Err.), the degrees of freedom (DF), the T statistic (T-stat) and the P-value for the test.
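The effect of the "Pool variance" option on the standard error and degrees of freedom can be sketched as follows. This assumes that the unpooled case uses the usual Welch-Satterthwaite approximation, which the help text does not state explicitly; the function names are hypothetical.

```python
import math

def two_sample_se_df(s1, n1, s2, n2, pool=True):
    """Standard error of the difference in means and its degrees of freedom."""
    if pool:
        df = n1 + n2 - 2
        pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
        std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    else:
        # Assumed: Welch-Satterthwaite approximation when variances are not pooled.
        v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
        std_err = math.sqrt(v1 + v2)
        df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return std_err, df

def two_sample_t(mean1, s1, n1, mean2, s2, n2, null_diff=0.0, pool=True):
    """T statistic for the difference (first sample minus second sample)."""
    std_err, df = two_sample_se_df(s1, n1, s2, n2, pool)
    return (mean1 - mean2 - null_diff) / std_err, df

print(two_sample_t(5, 2, 8, 3, 2, 8, pool=True))
print(two_sample_t(5, 2, 8, 3, 4, 8, pool=False))
```

With equal sample sizes and equal standard deviations the two approaches agree; they diverge as the sample variances or sizes become more unequal.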
If you select power/sample size, then follow the instructions below for the calculator.
- The first tab on the top is for Hypothesis Test Power. This allows you to find the difference, power, or sample size given any two of the values as inputs.
  - Enter the required parameters of alpha (level of significance), the pooled standard deviation, and the type of alternative hypothesis.
  - Enter two of the following three values: the difference, power and sample size per group.
  - Click Compute to calculate the value left blank while updating the power curve.
  - Click and drag the red dot to interactively investigate the power/difference relationship. Note how the values below for difference and power change accordingly.
- The second tab on the top is for Confidence Interval Width. This allows you to find the width or the sample size for an associated confidence interval.
  - Enter the required parameters of confidence level and the pooled standard deviation.
  - Enter either the width or the sample size per group for the confidence interval.
  - Click Compute to calculate the value left blank while updating the sample size/width curve.
  - Click and drag the red dot to interactively investigate the sample size/width relationship. Note how the values below for width and sample size change accordingly.

Paired

Provides hypothesis tests and confidence intervals for a difference in population means with paired data. Pairwise differences for values in the selected columns (first minus second) serve as the basis for the computation.

Select the column containing the first sample.
Select the column containing the second sample.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results. A separate hypothesis test/confidence interval will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
Check the "Save differences" option to save the differences to the data table.
Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null mean difference value of zero and a not equal alternative hypothesis.
For a hypothesis test:
- Enter a null value for the mean difference.
- Choose from the options of not equal, less than and greater than for the alternative hypothesis.
For a confidence interval
- Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
Click the Calculate button to view the results. The output will consist of a table containing the sample mean difference (Sample Mean), the standard error of the difference in sample means (Std. Err.), the degrees of freedom (DF), the T statistic (T-stat) and the P-value for the test.
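Since the paired procedure reduces to a one-sample analysis of the pairwise differences, the output values can be sketched in a few lines. The function name and the data are hypothetical and serve only as an illustration.

```python
import math
import statistics

def paired_t(first, second, null_diff=0.0):
    """Return (differences, mean difference, std_err, df, t_stat) for paired data."""
    diffs = [a - b for a, b in zip(first, second)]   # first minus second
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    t_stat = (mean_diff - null_diff) / std_err
    return diffs, mean_diff, std_err, n - 1, t_stat

diffs, mean_diff, std_err, df, t_stat = paired_t([10, 12, 9, 11], [8, 11, 7, 8])
print(diffs)                   # [2, 1, 2, 3]
print(mean_diff, df, t_stat)
```

The list of differences corresponds to what the "Save differences" option would place in the data table.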

Variance

One Sample

Provides hypothesis tests and confidence intervals for a population variance based on a single sample when the data come from a normal distribution.

If your data set contains raw data values, then follow the instructions below.
1. Select the column containing the sample values for the calculation(s). If more than one column is selected, a separate test will be done for each column. Resulting P-values are not adjusted for multiple comparisons.
2. Enter an optional Where clause to specify the data rows to be included in the computation.
3. Select an optional Group By column to group results. A separate hypothesis test/confidence interval will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
4. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null population variance of one and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the population variance.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
5. Click the Calculate button to view the results. The output will consist of a table containing the sample variance (Sample Var.), the degrees of freedom (DF), the Chi-Square statistic (Chi-Square stat) and the P-value for the test.
If you have summary statistics for your sample, then follow the instructions below.
1. Enter the values for the sample variance and the sample size.
2. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null population variance of one and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the population variance.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
3. Click the Calculate button to view the results. The output will consist of a table containing the sample variance (Sample Var.), the degrees of freedom (DF), the Chi-Square statistic (Chi-Square stat) and the P-value for the test.
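The Chi-Square statistic in the output follows the standard formula (n - 1) * s^2 / sigma0^2, where s^2 is the sample variance and sigma0^2 is the null variance. A minimal sketch with a hypothetical function name:

```python
def variance_chi_square(sample_var, n, null_var=1.0):
    """Return (df, chi-square statistic) for a one-sample variance test."""
    df = n - 1
    chi_square = df * sample_var / null_var
    return df, chi_square

# Sample variance 4 from 11 observations, tested against a null variance of 2.
print(variance_chi_square(4, 11, null_var=2))   # (10, 20.0)
```

The P-value is then a tail area of the chi-square distribution with n - 1 degrees of freedom, chosen according to the alternative hypothesis.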
If you select power/sample size, then follow the instructions below for the calculator.
- The first tab on the top is for Hypothesis Test Power. This allows you to find the true variance, power, or sample size given any two of the values as inputs.
  - Enter the required parameters of alpha (level of significance), the hypothesized variance, and the type of alternative hypothesis.
  - Enter two of the following three values: the true variance, power and sample size.
  - Click Compute to calculate the value left blank while updating the power curve.
  - Click and drag the red dot to interactively investigate the power/difference relationship. Note how the values below for difference and power change accordingly.
- The second tab on the top left is for Confidence Interval Width. This allows you to find the width or the sample size for an associated confidence interval.
  - Enter the required parameters of confidence level and the sample variance.
  - Enter either the width or the sample size of the confidence interval.
  - Click Compute to calculate the value left blank while updating the sample size/width curve.
  - Click and drag the red dot to interactively investigate the sample size/width relationship. Note how the values below for width and sample size change accordingly.

Two Sample

Provides confidence intervals and hypothesis tests for the ratio of two population variances (first sample / second sample) when the samples come from two independent normal distributions.

If your data set contains raw data values, then follow the instructions below.
1. Select the column containing the first sample.
2. Enter an optional WHERE clause to specify the data rows to be included in the first sample.
3. Select the column containing the second sample. This column can be the same column containing the first sample.
4. Enter an optional WHERE clause to specify the data rows to be included in the second sample.
5. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null ratio of one and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the ratio of population variances.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
6. Click the Calculate button to view the results. The output will consist of a table containing the number of observations in the first sample (n₁), the number of observations in the second sample (n₂), the ratio of the sample variances (Sample Ratio), the F statistic (F-stat) and the P-value for the test.
If you have summary statistics for each of your samples, then follow the instructions below.
1. Enter the values for the sample variance and the sample size for each sample.
2. Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null ratio of one and a not equal alternative hypothesis.
  For a hypothesis test:
  - Enter a null value for the ratio of population variances.
  - Choose from the options of not equal, less than and greater than for the alternative hypothesis.
  For a confidence interval
  - Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
3. Click the Calculate button to view the results. The output will consist of a table containing the number of observations in the first sample (n₁), the number of observations in the second sample (n₂), the ratio of the sample variances (Sample Ratio), the F statistic (F-stat) and the P-value for the test.
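As a rough illustration of the output, the F statistic can be formed by dividing the ratio of sample variances by the null ratio, with n₁ - 1 and n₂ - 1 degrees of freedom. This is the textbook construction and is assumed, not confirmed, to match the software; the function name is hypothetical.

```python
def variance_ratio_f(var1, n1, var2, n2, null_ratio=1.0):
    """Return (sample ratio, F statistic, numerator df, denominator df)."""
    sample_ratio = var1 / var2            # first sample / second sample
    f_stat = sample_ratio / null_ratio
    return sample_ratio, f_stat, n1 - 1, n2 - 1

print(variance_ratio_f(9.0, 16, 4.0, 21))                  # (2.25, 2.25, 15, 20)
print(variance_ratio_f(9.0, 16, 4.0, 21, null_ratio=1.5))  # (2.25, 1.5, 15, 20)
```

With the default null ratio of one, the F statistic and the sample ratio coincide.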

Regression

Simple Linear Regression

Provides routines for fitting the simple linear regression model.

Select the X variable (independent variable) for the regression.
Select the Y variable (dependent variable) for the regression.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results. A separate regression analysis will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
Click the Next button to select between a hypothesis test and a confidence interval computation for the regression parameters. By default, a two-sided hypothesis test with a null value of zero is performed for each parameter.
For hypothesis tests:
- Enter a null value for the value of the regression intercept and the regression slope.
- Choose from the options of not equal, less than and greater than for the alternative hypothesis for each regression parameter.
For confidence intervals:
- Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval. The specified level will be used to construct a confidence interval for both the regression intercept and the regression slope.
Click the Next button for the following options:
- Plot the fitted line - If selected, a scatter plot of the X values versus the Y values with the overlaid estimated regression line will be included in the output.
- Save residuals - If selected, the residuals from the regression fit will be saved to the data table. Diagnostics on the residuals may then be analyzed.
- Predict Y for X = - If selected, space delimited numerical values for the X variable must be included in the adjoining input field. Predicted values and 95% prediction intervals will then be included in the regression output for each X value.
Click the Next button again to select from a variety of diagnostic plots.
Click the Next button again to specify graph layout options.
Click the Calculate button to view the results.
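For reference, the least squares fit behind the simple linear regression model, together with the residuals and a predicted value, can be sketched as below. The function names and data are hypothetical, and the sketch omits the standard errors, tests and intervals that the full output includes.

```python
def fit_line(xs, ys):
    """Least squares estimates (intercept, slope) for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def residuals(xs, ys, intercept, slope):
    """Observed Y minus fitted Y for each data row."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

xs, ys = [1, 2, 3, 4], [2, 4, 5, 7]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
print(residuals(xs, ys, intercept, slope))
print(intercept + slope * 5)   # predicted Y for X = 5
```

The residuals are the values that the Save residuals option stores, and they sum to zero whenever the model contains an intercept.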

Polynomial Regression

Provides routines for fitting a regression model of any degree from one up to six.

Select the order of Polynomial Regression (default is 2).
Select the X variable (independent variable) for the regression.
Select the Y variable (dependent variable) for the regression.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results. A separate regression analysis will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
Select the optional No Intercept feature to compute all results without an intercept in the model.
Click the Next button to select between a hypothesis test and a confidence interval computation for the regression parameters. By default, a two-sided hypothesis test with a null value of zero is performed for each parameter.
For hypothesis tests:
- Enter a null value for the value of all of the regression parameters.
- Choose from the options of not equal, less than and greater than for the alternative hypothesis for all of the regression parameters.
For confidence intervals:
- Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval. The specified level will be used to construct a confidence interval for all of the regression parameters.
Click the Next button for the following options:
- Predict Y for X = If selected, comma delimited numerical values for the X variable must be included in the adjoining input field. Predicted values and prediction intervals at the entered percentage level will then be included in the regression output for each X value.
- Save residuals If selected, the residuals from the regression fit will be saved to the data table. Diagnostics on the residuals may then be analyzed.
- Save fitted values If selected, the predicted values for each observation from the regression fit will be saved to the data table.
- Save model estimates If selected, the estimates for coefficients, standard errors, correlation coefficient, model standard error estimate, and F statistic will be stored in the data table. All other output will be suppressed.
Click the Next button again to select from a variety of diagnostic plots.
- Plot the fitted line - If selected, a scatter plot of the X values versus the Y values with the overlaid estimated regression function will be included in the output.
  - Click the plotted polynomial regression line on the graph to display its formula.
  - Check the Add confidence interval box to add a two-sided 95% confidence interval to the plot.
  - Check the Add prediction interval box to add a two-sided 95% prediction interval to the plot.
- Click Residual index plot to display a line plot of the residuals against the observation index.
- Click Histogram of residuals to display a histogram of the residuals with a normal overlay, giving a visual check of how close to normal the residuals appear.
- Click QQ plot to display a comparison of the residuals to the normal quantiles.
- Click Y-values vs. residuals to plot the residuals against the Y values and see whether they appear random or follow a trend.
- Click Fitted values vs. residuals to plot the residuals against the fitted values from the polynomial fit.
Click the Next button again to specify graph layout options.
Click the Calculate button to view the results.

Multiple Linear Regression

Provides routines for fitting a multiple linear regression model.

In the first panel of options, specify the variables to be included in the regression model as discussed below. Note that only columns with at least one numeric value are available for selection in most input fields.
- Select the Y variable (dependent variable) for the regression.
- Select the X variables (independent variables) for the regression.
- Create interaction terms by selecting two or more variables and clicking the Add button. Interaction terms will then be shown in the area to the right. To delete an interaction term, select it and then click the Delete button. To center the variables in each interaction term, turn on the Center interaction terms option.
- Enter an optional Where clause to specify the data rows to be included in the computation.
- Select an optional Group By column to group results. A separate regression analysis will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
Click the Next button for the following variable selection options in the second dialog panel.
- None - No variable selection will be conducted, and all variables specified in the previous panel will be included in the model. This is the default option.
- All subsets - With this approach, all possible models for each subset of the variables specified are computed. The models are then ranked according to their adjusted R-squared values. The results for the model with the highest adjusted R-squared value are then displayed. Due to the potential computational intensity that can be encountered with this method, the number of variables (including interaction terms) is restricted to be 11 or less when using this method.
- Stepwise - This method starts with no variables in the model. Each step begins by adding the variable that is most significant (smallest P-value), provided its P-value is below the P-value to enter value. Next, all variables currently in the model are examined, and the variable that is least significant (highest P-value) is removed from the model if its P-value is larger than the P-value to leave value. The process continues until no more variables are added or deleted from the model. The P-value to enter value must be smaller than the P-value to leave value to avoid an infinite loop.
- Forward selection - This method is similar to the stepwise method described above, but no variables are ever deleted from the model. In each step, the variable that is most significant (smallest P-value) that is below the P-value to enter value is added to the model. The process continues until no more variables are added to the model.
- Backward elimination - With this method, the process begins with all variables included in the model. In each step, the variable that is least significant (highest P-value) is removed from the model if its P-value is larger than the P-value to leave value. No variable is ever added to the model. The process continues until no more variables are deleted from the model.
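To give a sense of why the All subsets option limits the number of variables, and how models are ranked, the sketch below enumerates candidate subsets and applies the usual adjusted R-squared formula, 1 - (1 - R^2)(n - 1)/(n - k - 1) for k predictors and n rows. The function names are hypothetical, and counting only non-empty subsets is an assumption about how the software enumerates models.

```python
from itertools import combinations

def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for a model with k predictors fitted to n rows."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

def all_subsets(variables):
    """Every non-empty subset of the candidate variables."""
    subsets = []
    for size in range(1, len(variables) + 1):
        subsets.extend(combinations(variables, size))
    return subsets

# Eleven candidate terms already give 2**11 - 1 = 2047 models to fit.
print(len(all_subsets(range(11))))           # 2047

# Adding predictors raises R-squared but is penalized by the adjustment.
print(adjusted_r_squared(0.90, n=21, k=4))
print(adjusted_r_squared(0.91, n=21, k=8))
```

Each additional candidate term doubles the number of models, which is the source of the computational cost mentioned above.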
Click the Next button for the following save options in the third dialog panel.
- Residuals - Residuals from the regression fit will be saved to the data table. Diagnostics on the residuals may then be analyzed.
- Studentized residuals - Studentized residuals from the regression fit will be saved to the data table. Diagnostics on the studentized residuals may then be analyzed.
- Predicted values - Predicted values will be saved for each row of the independent variables. Values can be added to these columns for the purposes of prediction without impacting the fit of the model as long as the corresponding value of the Y-variable is empty or nonnumeric.
- Standard error of the mean response - The standard error of the mean response will be saved for each row of the independent variables. This option is useful when computing a confidence interval for mean response at a level other than 95% as described below.
- 95% confidence interval for the mean response - The standard lower and upper limits of a 95% confidence interval for the mean response will be saved for each row of the independent variables.
- Standard error for individual prediction - The standard error for individual predictions will be saved for each row of the independent variables. This option is useful when computing a prediction interval at a level other than 95% as described below.
- 95% confidence interval for individual prediction - The standard lower and upper limits of a 95% prediction interval will be saved for each row of the independent variables.
- Cook's distance - The value of Cook's distance will be saved for each row of the independent variables. These values are useful when considering leverage/influence points for the regression model.
- DFFITS - The value of DFFITS will be saved for each row of the independent variables. These values are useful when considering leverage/influence points for the regression model.
Click the Calculate button to view the regression results.

Logistic Regression

Provides routines for fitting a regression model where the dependent variable is binary.

If you have raw data,
1. Select the X variables (independent variables) for the regression.
2. Select the Y variable (dependent variable) for the regression.
3. Specify the value of the Y variable that defines a success (the outcome of interest). The logistic model is for the probability of this outcome.
4. Enter an optional Where clause to specify the data rows to be included in the computation.
5. Select an optional Group By column to group results. A separate regression analysis will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
6. Click the Next button for the following options:
  - Save Pearson residuals - If selected, the Pearson residuals from the regression fit will be saved to the data table. Diagnostics on the residuals may then be analyzed.
7. Click the Calculate button to view the results.
If you have summary data containing the unique combinations of independent variables, the corresponding number of successes (outcomes of interest) and the corresponding number of samples,
1. Select the X variables (independent variables) for the regression.
2. Select the column containing the number of successes.
3. Select the column containing the number of samples (totals).
4. Click the Next button for the following options:
  - Save Pearson residuals - If selected, the Pearson residuals from the regression fit will be saved to the data table. Diagnostics on the residuals may then be analyzed.
5. Click the Calculate button to view the results.
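For summary data, the fitted success probability and the Pearson residual for one combination of independent variables can be sketched as follows. The coefficient values are hypothetical placeholders for a fitted model, and the function names are not part of the software.

```python
import math

def success_probability(intercept, slopes, xs):
    """Logistic model: probability of a success for one row of X values."""
    linear = intercept + sum(b * x for b, x in zip(slopes, xs))
    return 1 / (1 + math.exp(-linear))

def pearson_residual(successes, total, p):
    """(observed - expected) / sqrt(binomial variance) for one row of summary data."""
    return (successes - total * p) / math.sqrt(total * p * (1 - p))

# Hypothetical fitted coefficients: intercept -1.0 and a single slope of 0.5.
p = success_probability(-1.0, [0.5], [2.0])
print(p)                           # 0.5, since -1.0 + 0.5 * 2.0 = 0
print(pearson_residual(6, 10, p))  # 6 successes observed out of 10
```

For raw data each row has a total of one, so the same residual formula applies with successes equal to 0 or 1.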

ANOVA

One Way

Provides for testing the equality of several population means using independent samples from each population.

Depending on the format of the data, select one of the following options:
- If the samples are in separate columns, select the compare selected columns option and then select the columns containing the samples.
- If the samples are in a single column, select the compare values in a single column option, and then specify the column containing the samples (Responses In) and the column containing the population identifiers (Factors).
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Calculate button to view the results. The output will consist of a table of information about the sample means and the ANOVA table.
Select the Tukey HSD option and specify a confidence level if you wish to perform a pairwise means analysis. This option will compute confidence intervals for each mean difference adjusted for multiplicity.
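The entries of a one-way ANOVA table can be computed directly from the samples. The sketch below uses the standard between/within sums of squares decomposition; the function name and data are hypothetical.

```python
def one_way_anova(groups):
    """Return (df_between, df_within, ss_between, ss_within, f_stat)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, ss_between, ss_within, f_stat

print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

Each inner list plays the role of one selected column, or of one factor level when the responses are stored in a single column.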

Two Way

Provides for testing the equality of several population means where the populations are stratified across two factors (row and column). This procedure in StatCrunch is restricted to an equal number of samples for each factor pairing.

Select the column which contains the sample responses.
Select the column which contains the values of the row factor.
Select the column which contains the values of the column factor.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Next button for the following options:
- Plot interactions - If selected, interaction plots across each factor will be produced.
- Display means table - If selected, a matrix of mean values for each factor pairing will be produced.
- Fit additive model - If selected, interaction between factors is excluded from the model.
Click the Next button again to specify graph layout options.
Click the Calculate button to view the results. The output will consist of a table of information about the sample means and the ANOVA table.

Nonparametrics

Sign Test

Provides hypothesis tests and confidence intervals for a population median based on a single sample.

Select the column containing the sample values for the calculation(s). If more than one column is selected, a separate test will be done for each column. Resulting P-values are not adjusted for multiple comparisons.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results. A separate hypothesis test/confidence interval will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null median value of zero and a not equal alternative hypothesis.
For a hypothesis test:
- Enter a null value for the population median.
- Choose from the options of not equal, less than and greater than for the alternative hypothesis.
- Data values that are equal to the hypothesized median will be excluded from the analysis. The P-values reported are exact.
For a confidence interval
- Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
- An exact confidence interval is returned with an achieved confidence level of at least that requested by the user when possible.
Click the Calculate button to view the results. The hypothesis test output will consist of a table containing the number of observations (n), the number used for the test (n for test), the sample median, the number of values below the hypothesized median (Below), the number of values equal to the hypothesized median (Equal), the number of values above the hypothesized median (Above) and the P-value for the test. For a confidence interval, the output consists of the number of observations (n), the sample median, the achieved confidence level and the lower and upper limits on the interval.
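The counts in the hypothesis test output, and an exact P-value for the not equal alternative, can be sketched with the binomial distribution. Doubling the smaller tail is one common convention and is an assumption here; the software's exact two-sided rule is not stated in this help text. The function name and data are hypothetical.

```python
from math import comb

def sign_test(values, null_median=0.0):
    """Return (below, equal, above, two-sided exact P-value)."""
    below = sum(1 for v in values if v < null_median)
    above = sum(1 for v in values if v > null_median)
    equal = len(values) - below - above
    n_for_test = below + above            # values equal to the null median are excluded
    k = min(below, above)
    # P(X <= k) for X ~ Binomial(n_for_test, 0.5)
    tail = sum(comb(n_for_test, i) for i in range(k + 1)) / 2 ** n_for_test
    p_value = min(1.0, 2 * tail)
    return below, equal, above, p_value

print(sign_test([3, 5, 7, 8, 9, 10, 2, 6, 4], null_median=4))
# (2, 1, 6, 0.2890625)
```

Here nine observations are supplied, but only eight are used for the test because one value equals the hypothesized median.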

Wilcoxon Signed Ranks

Provides hypothesis tests and confidence intervals for a population median based on a single sample using signed ranks.

Select the column containing the sample values for the calculation(s). If more than one column is selected, a separate test will be done for each column. Resulting P-values are not adjusted for multiple comparisons.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results. A separate hypothesis test/confidence interval will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null median value of zero and a not equal alternative hypothesis.
For a hypothesis test:
- Enter a null value for the population median.
- Choose from the options of not equal, less than and greater than for the alternative hypothesis.
- Data values that are equal to the hypothesized median will be excluded from the analysis. For samples of size 40 or smaller with no ties in the absolute values of the ranks, the P-values reported are exact. Otherwise, the Normal approximation with a continuity correction is used. The method used to compute the P-value is included in the output.
For a confidence interval:
- Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
- For samples of size 40 or smaller, an exact confidence interval is returned with an achieved confidence level of at least that requested by the user when possible. For larger sample sizes, the Normal approximation is used. The method used to construct the confidence interval and the achieved level of confidence are included in the output.
Click the Calculate button to view the results. The hypothesis test output will consist of a table containing the number of observations (n), the number used for the test (n for test), the estimated median using Walsh averages, the Wilcoxon statistic, the P-value for the test and the method used to compute the P-value. For a confidence interval, the output consists of the number of observations (n), the estimated median using Walsh averages, the achieved confidence level and the lower and upper limits on the interval.
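As an illustration of the quantities named above, the following is a minimal Python sketch of the Walsh-average estimate of the median and of a signed rank statistic. It assumes the statistic is the sum of the ranks of the positive differences, which is one common convention; StatCrunch may report a different form. The function names are hypothetical and are not part of StatCrunch.

```python
from itertools import combinations_with_replacement
from statistics import median


def walsh_median(values):
    """Median of all pairwise averages (x_i + x_j) / 2 with i <= j."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(values, 2)]
    return median(walsh)


def signed_rank_statistic(values, null_median=0.0):
    """Return (sum of ranks of positive differences, n used for the test)."""
    # Values equal to the hypothesized median are excluded, as described above.
    diffs = [v - null_median for v in values if v != null_median]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences and give each the average rank.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    return w_plus, len(ordered)


sample = [1.5, -0.5, 2.0, 3.5, -1.0]
estimate = walsh_median(sample)
statistic, n_for_test = signed_rank_statistic(sample, null_median=0.0)
```

For this sample the absolute differences are all distinct, so the ranks run from 1 to 5 and the positive differences carry ranks 3, 4 and 5.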

Mann-Whitney

Provides hypothesis tests and confidence intervals for comparing two population medians using sample ranks.

Select the column containing the first sample.
Enter an optional Where clause to specify the data rows to be included in the first sample.
Select the column containing the second sample. This column can be the same column containing the first sample.
Enter an optional Where clause to specify the data rows to be included in the second sample.
Click the Next button to select between a hypothesis test and confidence interval computation. The hypothesis test option is selected by default using a null median difference of zero and a not equal alternative hypothesis.
For a hypothesis test:
- Enter a null value for the difference between population medians.
- Choose from the options of not equal, less than and greater than for the alternative hypothesis.
- If the data specified has no ties and both sample sizes are less than or equal to 50, the P-values reported are exact. Otherwise, the Normal approximation with a continuity correction is used. The method used to compute the P-value is included in the output.
For a confidence interval:
- Enter a value between 0 and 1 for the confidence level. A value of 0.95 provides a 95% confidence interval.
- If both sample sizes are less than or equal to 50, an exact confidence interval is returned with an achieved confidence level of at least that requested by the user when possible. For larger sample sizes, the Normal approximation is used. The method used to construct the confidence interval and the achieved level of confidence are included in the output.
Click the Calculate button to view the results. The output will consist of a table containing the number of observations in the first sample (n₁), the number of observations in the second sample (n₂), the estimated difference between the medians, the Mann-Whitney statistic (Test Stat.), the P-value for the test and the method used to compute the P-value.
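The two main quantities in this output can be sketched in Python as follows. This is an illustration under stated assumptions, not StatCrunch's implementation: the statistic is computed in its pair-counting (U) form, whereas StatCrunch may report the equivalent rank-sum form, and the difference between medians is estimated by the median of all pairwise differences. The function names are hypothetical.

```python
from statistics import median


def mann_whitney_u(sample1, sample2):
    """Count the pairs (x, y) with x > y, counting each tie as one half."""
    u = 0.0
    for x in sample1:
        for y in sample2:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def median_difference_estimate(sample1, sample2):
    """Median of all pairwise differences x - y."""
    return median(x - y for x in sample1 for y in sample2)


first = [3, 5, 8]
second = [1, 4, 6, 7]
u_statistic = mann_whitney_u(first, second)
difference = median_difference_estimate(first, second)
```

The rank-sum form for the first sample is obtained from U by adding n₁(n₁ + 1)/2.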

Kruskal-Wallis

Provides a hypothesis test for comparing two or more population medians using sample ranks.

Depending on the format of the data, select one of the following options:
- If the samples are in separate columns, select the compare selected columns option and then select the columns containing the samples.
- If the samples are in a single column, select the compare values in a single column option, and then specify the column containing the samples (Responses In) and the column containing the population identifiers (Factors).
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Calculate button to view the results. The output will consist of a table of information about the sample ranks and the results of the test of the hypothesis that all the medians are equal. Please note that this test is only valid for large samples.
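A minimal Python sketch of the rank computation behind this test is shown below. It assumes the usual H statistic without a tie correction; whether StatCrunch applies a tie correction is not stated here. The function name is hypothetical.

```python
def kruskal_wallis_h(samples):
    """Return (H statistic, rank sum of each sample) using average ranks for ties."""
    # Pool all values, remembering which sample each one came from.
    pooled = sorted((value, group) for group, s in enumerate(samples) for value in s)
    n_total = len(pooled)
    rank_sums = [0.0] * len(samples)
    i = 0
    while i < n_total:
        # Tied values share the average of the ranks they occupy.
        j = i
        while j + 1 < n_total and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += average_rank
        i = j + 1
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h, rank_sums


h_statistic, sums = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

H is compared with a chi-square distribution with (number of samples - 1) degrees of freedom. That comparison is a large-sample approximation, which is why the test is described as valid only for large samples.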

Chi-square goodness of fit test

Provides a chi-square goodness of fit test.

Select the column containing the observed values.
Select the column containing the expected values.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results. A separate hypothesis test will be computed for each distinct value of the Group By column. Resulting P-values are not adjusted for multiple comparisons.
Click the Calculate button to view the results.
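The test statistic can be sketched in Python as follows. This sketch assumes that both columns hold counts (not proportions) and that the degrees of freedom are the number of categories minus one. The function name is hypothetical.

```python
def chi_square_gof(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for observed vs. expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


statistic, df = chi_square_gof([18, 22, 20], [20, 20, 20])
```

Here each of the first two categories contributes (2 × 2) / 20 = 0.2 and the third contributes 0.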

Control Charts

X-bar

Displays an X-bar chart for monitoring the mean of a process using samples from the process.

Depending on the format of the data, select one of the following options:
- If each sample is spread across several columns of the same row, select the samples in selected columns option, and then select the columns containing the samples.
- If the data has samples in a single column, select the samples in a single column option, and then specify the column containing the samples and the column containing the numerical values that identify the samples.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Next button for the following options:
- Target Mean - an optional value for the target mean of the process. If specified, this value will be used to compute the center line and control limits of the resulting control chart.
- Save Means - If selected, the mean from each sample will be saved to the data table.
Click the Calculate button to view the results.
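One common way to obtain the center line and control limits is sketched below in Python. It assumes equal-sized samples and the textbook range-based limits, center ± A2 × (mean range). StatCrunch may estimate the process variability differently, so treat this as an illustration only. The function name and the A2 lookup table (standard tabulated constants for sample sizes 2 to 5) are supplied for the example.

```python
from statistics import mean

# Standard A2 control chart constants, indexed by sample size.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}


def xbar_limits(samples, target_mean=None):
    """Return (lower limit, center line, upper limit) for equal-sized samples."""
    size = len(samples[0])
    sample_means = [mean(s) for s in samples]
    sample_ranges = [max(s) - min(s) for s in samples]
    # Use the target mean for the center line when one is supplied.
    center = target_mean if target_mean is not None else mean(sample_means)
    half_width = A2[size] * mean(sample_ranges)
    return center - half_width, center, center + half_width


lower, center, upper = xbar_limits([[10, 12, 11], [9, 11, 10], [11, 13, 12]])
```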

R

Displays an R chart for monitoring the variability of a process using samples from the process.

Depending on the format of the data, select one of the following options:
- If each sample is spread across several columns of the same row, select the samples in selected columns option, and then select the columns containing the samples.
- If the data has samples in a single column, select the samples in a single column option, and then specify the column containing the samples and the column containing the numerical values that identify the samples.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Next button for the following options:
- Target Std. Dev. - an optional value for the target standard deviation of the process. If specified, this value will be used to compute the center line and control limits of the resulting control chart.
- Save Ranges - If selected, the range from each sample will be saved to the data table.
Click the Calculate button to view the results.
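A minimal Python sketch of textbook R chart limits follows. It assumes equal-sized samples and no target standard deviation, with limits D3 × (mean range) and D4 × (mean range). The constants are standard tabulated values for sample sizes 2 to 5 (the last digit varies slightly between published tables), and the function name is hypothetical.

```python
from statistics import mean

# Standard D3 and D4 control chart constants, indexed by sample size.
D3 = {2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}
D4 = {2: 3.267, 3: 2.574, 4: 2.282, 5: 2.114}


def r_chart_limits(samples):
    """Return (lower limit, center line, upper limit) for the sample ranges."""
    size = len(samples[0])
    mean_range = mean(max(s) - min(s) for s in samples)
    return D3[size] * mean_range, mean_range, D4[size] * mean_range


lower, center, upper = r_chart_limits([[1, 2, 3], [1, 2, 4], [1, 3, 5]])
```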

X-bar, R

Displays a stacked X-bar and R chart. See the X-bar and R sections above for help on these items.

np Chart

Displays an np chart for monitoring the number of defectives produced by a process using samples from the process.

Select the column containing the number of defectives in each sample.
Depending on the format of the data, select one of the following options:
- If each sample is of the same size, select the Constant option under Sample Size, and then enter an integer value for the sample size in the adjoining entry field.
- If sample sizes are in a specific column, select the In Column option under Sample Size, and then specify the column that contains the sample sizes.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Next button for the following options:
- Target Proportion - an optional value for the target proportion of defectives for the process. If specified, this value will be used to compute the center line and control limits of the resulting control chart.
- Save Proportions - If selected, the proportion of defectives from each sample will be saved to the data table.
Click the Calculate button to view the results.
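For a constant sample size, the usual three-sigma np chart limits can be sketched in Python as follows. The formulas are the textbook ones (center n × p, limits n × p ± 3 × sqrt(n × p × (1 - p))), and it is an assumption that StatCrunch uses them unchanged. The function name is hypothetical.

```python
from math import sqrt


def np_chart_limits(defectives, sample_size, target_proportion=None):
    """Return (lower limit, center line, upper limit) for counts of defectives."""
    if target_proportion is None:
        # Estimate the proportion of defectives from all samples combined.
        p = sum(defectives) / (sample_size * len(defectives))
    else:
        p = target_proportion
    center = sample_size * p
    spread = 3 * sqrt(sample_size * p * (1 - p))
    # A count cannot be negative, so the lower limit is floored at zero.
    return max(0.0, center - spread), center, center + spread


lower, center, upper = np_chart_limits([4, 6, 5, 5], sample_size=50)
```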

p Chart

Displays a p chart for monitoring the proportion of defectives produced by a process using samples from the process.

Select the column containing the number of defectives in each sample.
Depending on the format of the data, select one of the following options:
- If each sample is of the same size, select the Constant option under Sample Size, and then enter an integer value for the sample size in the adjoining entry field.
- If sample sizes are in a specific column, select the In Column option under Sample Size, and then specify the column that contains the sample sizes.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Next button for the following options:
- Target Proportion - an optional value for the target proportion of defectives for the process. If specified, this value will be used to compute the center line and control limits of the resulting control chart.
- Save Proportions - If selected, the proportion of defectives from each sample will be saved to the data table.
Click the Calculate button to view the results.
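When sample sizes differ, p chart limits are usually computed separately for each sample. The Python sketch below assumes the textbook three-sigma limits, p ± 3 × sqrt(p × (1 - p) / n), clipped to the range 0 to 1; StatCrunch's exact handling is not documented here. The function name is hypothetical.

```python
from math import sqrt


def p_chart_limits(defectives, sample_sizes, target_proportion=None):
    """Return (center line, list of (lower, upper) limits, one pair per sample)."""
    if target_proportion is None:
        p = sum(defectives) / sum(sample_sizes)
    else:
        p = target_proportion
    limits = []
    for n in sample_sizes:
        spread = 3 * sqrt(p * (1 - p) / n)
        # Proportions are confined to the interval from 0 to 1.
        limits.append((max(0.0, p - spread), min(1.0, p + spread)))
    return p, limits


center, limits = p_chart_limits([5, 10], [100, 200])
```

Larger samples give narrower limits, so the limit lines on the chart step in and out as the sample size changes.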

c Chart

Displays a c chart for monitoring the number of defects per sample.

Select the column containing the number of defects in each sample.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Next button for the following options:
- Target mean defects per sample - an optional value for the target mean number of defects per sample. If specified, this value will be used to compute the center line and control limits of the resulting control chart.
Click the Calculate button to view the results.
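A short Python sketch of the textbook c chart limits follows, assuming the counts behave like Poisson counts so that the limits are center ± 3 × sqrt(center). The function name is hypothetical.

```python
from math import sqrt
from statistics import mean


def c_chart_limits(counts, target_mean=None):
    """Return (lower limit, center line, upper limit) for counts per sample."""
    center = target_mean if target_mean is not None else mean(counts)
    spread = 3 * sqrt(center)
    # A count cannot be negative, so the lower limit is floored at zero.
    return max(0.0, center - spread), center, center + spread


lower, center, upper = c_chart_limits([4, 9, 2, 1])
```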

u Chart

Displays a u chart for monitoring the number of defects per unit.

Select the column containing the number of defects in each sample.
Depending on the format of the data, select one of the following options:
- If each sample is of the same size, select the Constant option under Sample Size, and then enter an integer value for the sample size in the adjoining entry field.
- If sample sizes are in a specific column, select the In Column option under Sample Size, and then specify the column that contains the sample sizes.
Enter an optional Where clause to specify the data rows to be included in the computation.
Click the Next button for the following options:
- Target mean defects per unit - an optional value for the target mean number of defects per unit. If specified, this value will be used to compute the center line and control limits of the resulting control chart.
- Save mean defects per unit - If selected, the mean number of defects per unit from each sample will be saved to the data table.
Click the Calculate button to view the results.
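The per-unit rates and limits can be sketched in Python as follows. This assumes the textbook limits, u ± 3 × sqrt(u / n), with u estimated as total count divided by total units; the function name is hypothetical.

```python
from math import sqrt


def u_chart_limits(counts, sample_sizes, target_mean=None):
    """Return (center line, per-sample rates, list of (lower, upper) limits)."""
    if target_mean is None:
        u = sum(counts) / sum(sample_sizes)
    else:
        u = target_mean
    rates = [c / n for c, n in zip(counts, sample_sizes)]
    limits = []
    for n in sample_sizes:
        spread = 3 * sqrt(u / n)
        limits.append((max(0.0, u - spread), u + spread))
    return u, rates, limits


center, rates, limits = u_chart_limits([4, 12], [4, 4])
```

The rates list corresponds to the values that the save option above would place in the data table.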

Calculators

StatCrunch has graphical calculators that can be used to compute probabilities for the distributions listed below:

Beta http://en.wikipedia.org/wiki/Beta_distribution
Binomial http://en.wikipedia.org/wiki/Binomial_distribution
Cauchy http://en.wikipedia.org/wiki/Cauchy_distribution located at 0 with scale 1
Chi-Square http://en.wikipedia.org/wiki/Chi-square_distribution
Exponential http://en.wikipedia.org/wiki/Exponential_distribution
F http://en.wikipedia.org/wiki/F_distribution
Gamma http://en.wikipedia.org/wiki/Gamma_distribution
Normal http://en.wikipedia.org/wiki/Normal_distribution
Poisson http://en.wikipedia.org/wiki/Poisson_distribution
T http://en.wikipedia.org/wiki/T_distribution
Weibull http://en.wikipedia.org/wiki/Weibull_distribution with a = k and b = λ^k
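For readers who want to check a calculator result by hand, the kind of probability being computed can be sketched with the Python standard library. This is an independent illustration, not how StatCrunch computes its values, and the function name is hypothetical.

```python
from math import comb
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """P(X <= k) for a Binomial(n, p) random variable."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# P(X <= 2) for 4 trials with success probability 0.5.
binomial_probability = binomial_cdf(2, 4, 0.5)

# P(Z <= 1.96) for a standard Normal random variable.
normal_probability = NormalDist(mu=0, sigma=1).cdf(1.96)
```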

Resample

Resampling capabilities have recently been added to StatCrunch. These capabilities can be used to perform bootstrap and permutation methods for confidence intervals and hypothesis tests.

Select the columns to resample.
Enter an expression for the Statistic to be computed for each resample. Examples of common forms for expressions in this case are mean("Age"), median("Age") and sum("Gender"="F")/10 where Age and Gender are columns to be resampled from the underlying data set. The expression may include columns that are being resampled along with those that are not.
Select the resampling method. To bootstrap, select the with replacement option. To shuffle or permute, select the without replacement option.
Select the type of resampling. A univariate type will resample each selected column independently, one at a time. A multivariate type will resample the selected columns together, so that the values in each resample come from the same rows.
Specify the number of resamples (by default 1000).
Click the Next button for the following options:
- By default, StatCrunch computes the 2.5th, 5th, 50th, 95th and 97.5th percentiles of the resampled statistics. Additional percentiles can be added as a comma-separated list.
- Resampled statistics can be stored for future analysis by checking the Store resampled statistics in data table option (off by default).
- Options for producing a histogram and QQ plot of the resampled statistics are checked by default.
Click Resample Statistic to collect the resamples and to produce summary output.
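The with replacement (bootstrap) procedure for a single column can be sketched in Python as follows. The function names, the seed argument and the linear-interpolation percentile rule are assumptions made for this illustration; StatCrunch's own percentile rule may differ.

```python
import random
from statistics import mean


def bootstrap_statistics(values, statistic, n_resamples=1000, seed=None):
    """Resample values with replacement and compute the statistic for each resample."""
    rng = random.Random(seed)
    return [
        statistic(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]


def percentile(stats, pct):
    """Percentile by linear interpolation between order statistics."""
    ordered = sorted(stats)
    position = (len(ordered) - 1) * pct / 100
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (position - low)


ages = [23, 35, 41, 29, 52, 38, 47, 31]  # made-up values for illustration
resampled_means = bootstrap_statistics(ages, mean, n_resamples=1000, seed=1)
interval = (percentile(resampled_means, 2.5), percentile(resampled_means, 97.5))
```

The 2.5th and 97.5th percentiles of the resampled statistics give a simple 95% percentile interval for the mean. To shuffle instead, replace the call to rng.choices with rng.sample(values, k=len(values)), which draws without replacement.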
