Graphics Help

Interacting with Graphics

Interactive graphics are a powerful tool for data analysis. Most of the graphics in StatCrunch 3.0 are interactive (the exceptions are the Means Plot and Chart Group Stats). To interact with a graphic, draw a rectangle around the desired objects in the graph by clicking and dragging the mouse. The objects will be highlighted in the graph, in all other interactive graphics, and in the data table. The method that each individual graphic uses to handle interaction is discussed below.

Graph Layout

The appearance of graphics may be customized by specifying graph layout options such as the number and format of graphs per page and the color scheme used in the graphics. These options are typically specified in the last dialog screen when producing graphics. By default, the number of rows and the number of columns per page is one, so one graph per page is produced. A page here is defined by the visible width and height of a browser window. When the window is resized, the graph resizes to fill the entire browser window. By changing the number of rows and number of columns, one can produce a matrix of plots. For example, if the number of rows is set to three and the number of columns is set to two, the resulting output will be formatted so that a three by two matrix of six plots per page will be visible. Color Schemes are discussed below in detail.
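
The rows-by-columns layout described above can be illustrated with a short Python sketch. It is not StatCrunch code: the function names are hypothetical, and filling each page row by row is an assumption made for the example.

```python
import math


def pages_needed(n_graphs, rows=1, cols=1):
    """Number of pages required when each page holds rows * cols graphs."""
    return math.ceil(n_graphs / (rows * cols))


def grid_position(index, rows, cols):
    """Return (page, row, col) for a 0-based graph index.

    Assumption: graphs fill a page row by row, left to right.
    """
    page, slot = divmod(index, rows * cols)
    row, col = divmod(slot, cols)
    return page, row, col
```

With three rows and two columns, six graphs fit on one page and a seventh graph starts a second page.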

Color Schemes

By adding or changing color scheme options under the Graphics menu, one can control the way that StatCrunch accesses colors when producing graphics. A color scheme is defined by an ordered sequence of colors that are accessed in succession when StatCrunch produces a graphic. One color from the sequence is defined as the background for the graphic, and one color is defined as the foreground, which is the default color of axes and other standard graphic elements. In a graphic that uses multiple colors, the background and foreground colors are not included as StatCrunch cycles through the list. The default color scheme consists of the sequence black, white, red, blue, green, yellow, cyan, orange, dark green, and gray, with black being the foreground and white being the background.
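
The cycling behavior can be sketched in Python as follows. This is only an illustration of the rule stated above; the function name is hypothetical, and wrapping back to the first usable color once the list is exhausted is an assumption.

```python
from itertools import cycle, islice

# The default sequence listed above.
DEFAULT_SCHEME = ["black", "white", "red", "blue", "green",
                  "yellow", "cyan", "orange", "dark green", "gray"]


def plot_colors(scheme, foreground, background, n):
    """Return n colors for plot elements, skipping foreground and background."""
    usable = [c for c in scheme if c not in (foreground, background)]
    # Assumption: once the usable colors run out, start again from the first.
    return list(islice(cycle(usable), n))
```

For the default scheme, the first three groups in a plot would be drawn in red, blue and green.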

StatCrunch also offers two color schemes which consist of a gradient of colors between two primary colors: Grayscale (white to black) and Red to Blue. These scale color schemes may be most useful when grouping by a binned numerical column. The number of colors in the sequence for a scale color scheme depends on the number of colors needed in a particular situation. If 28 colors are required, then StatCrunch defines a sequence of 28 colors between the two primary colors. The background and foreground for these color schemes are white and black, respectively, and these definitions may not be changed.
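
One way to build such a sequence is sketched below in Python. Linear interpolation of RGB components is an assumption made for illustration (the exact gradient StatCrunch uses is not stated here), and the function name is hypothetical.

```python
def scale_colors(start, end, n):
    """Return n RGB tuples evenly spaced from start to end (inclusive)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [tuple(start)]
    colors = []
    for i in range(n):
        t = i / (n - 1)  # 0.0 at the first color, 1.0 at the last
        colors.append(tuple(round(s + t * (e - s)) for s, e in zip(start, end)))
    return colors


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
grayscale_28 = scale_colors(WHITE, BLACK, 28)  # the 28-color case from the text
```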

StatCrunch allows users to edit existing color schemes and add new color schemes. One may do this by clicking on the Color Schemes link under the Graphics menu. To edit an existing color scheme, select the color scheme from the list of defined color schemes and click the OK button. To add a new color scheme, select the Add a new color scheme option and then specify a name for the color scheme. To construct a scale color scheme, select the Use scale option.

When one is finished editing the color scheme, click the Save button. The screen displaying the options for editing and adding color schemes will then reappear. Click the Cancel button when finished editing color schemes.


Bar Plot

Displays the frequency (or relative frequency) for all distinct values of selected columns.
  1. Select the column(s) to be displayed in the plot(s). A separate plot will be generated for each column selected.
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to group results. By default, if a Group by column is specified, the frequency (relative frequency) of each distinct value of each selected column will be displayed in a sequence of bars color-coded by the corresponding values of the Group by column. This is commonly referred to as a side-by-side bar plot and is denoted by the Split bars option in StatCrunch. There is also an option to stack bars when a Group by column is used. In this case, the color-coded bars corresponding to the different values of the grouping column are stacked one on top of the other. Stacked bars are a good choice when the emphasis is on the total count in each category rather than on differences between groups. To create a separate bar graph for each unique value of the Group by column, select the separate graph for each group option.
  4. Click the Next button to choose between plotting the frequency, relative frequency, percent, relative frequency (within category) or percent (within category) on the y-axis. For each type of plot, the distinct values (categories) of the columns selected will be shown on the x-axis. The within category plot types are used only when a group by column is specified. When these plot types are specified, relative frequencies and percents are calculated for each unique value of the group by column relative to the total number of observations within each category. Otherwise, relative frequencies and percents are calculated relative to the total number of observations across all categories.
  5. Click the Next button again to specify graph layout options.
  6. Click the Create Graph! button to create the plot(s).
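
The difference between the overall and the within category calculations in step 4 can be sketched in Python. The function name and the sample data are hypothetical; the denominators follow the description above.

```python
from collections import Counter


def relative_frequencies(values, groups, within_category=False):
    """Relative frequency of each (category, group) pair.

    within_category=False: divide by the total number of observations.
    within_category=True:  divide by the number of observations in that category.
    """
    pair_counts = Counter(zip(values, groups))
    category_totals = Counter(values)
    n = len(values)
    result = {}
    for (category, group), count in pair_counts.items():
        denominator = category_totals[category] if within_category else n
        result[(category, group)] = count / denominator
    return result


values = ["a", "a", "b", "b"]  # selected column (categories on the x-axis)
groups = ["x", "y", "x", "x"]  # Group by column
overall = relative_frequencies(values, groups)
within = relative_frequencies(values, groups, within_category=True)
```

Multiplying either result by 100 gives the corresponding percent.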

Pie Chart

Displays the relative frequency for all distinct values of selected columns.
  1. Select the column(s) for which a pie chart is to be constructed. A separate chart will be generated for each column selected.
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to construct a separate pie chart for each distinct value of this column.
  4. Click the Next button to choose what information (count and/or percent of total) to display in the label for each category.
  5. Click the Next button again to specify graph layout options.
  6. Click the Create Graph! button to create the plot(s).

Histogram

Displays the frequency, relative frequency or density for numerical data binned into classes.
  1. Select the column(s) to be displayed in the plot(s). A separate plot will be generated for each column selected.
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to construct a histogram for each distinct value of this column.
  4. Click the Next button to select either the Frequency, Relative Frequency or Density histogram. In addition, optional values for the starting point of the bins and the bin width may be specified. These parameters will apply to all of the histograms to be constructed.
  5. Click the Next button again to specify graph layout options.
  6. Click the Create Graph! button to create the plot(s).
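
The role of the bin starting point and bin width in step 4 can be sketched in Python. The function name is hypothetical, and treating each bin as closed on the left and open on the right is an assumption; the convention StatCrunch uses is not stated here.

```python
import math
from collections import Counter


def histogram(data, start, width, kind="frequency"):
    """Bar heights keyed by the left edge of each non-empty bin.

    Assumption: bin k covers [start + k*width, start + (k+1)*width).
    kind is "frequency", "relative" or "density".
    """
    if width <= 0 or start > min(data):
        raise ValueError("need width > 0 and start <= smallest value")
    counts = Counter(math.floor((x - start) / width) for x in data)
    n = len(data)
    scale = {"frequency": 1, "relative": 1 / n, "density": 1 / (n * width)}[kind]
    return {start + k * width: c * scale for k, c in sorted(counts.items())}


data = [1, 2, 2, 3, 7]
freq = histogram(data, start=0, width=2)
dens = histogram(data, start=0, width=2, kind="density")
```

Density heights are scaled so that the total area of the bars is one, which makes histograms with different bin widths comparable.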

Stem and Leaf

Displays a character based plot of a column that is similar to a histogram turned on its side. The actual (or approximate) data values are represented in the plot.
  1. Select the column(s) to be displayed in the plot(s). A separate plot will be generated for each column selected.
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to construct a separate stem and leaf plot for each distinct value of this column.
  4. Click the Next button to choose how to trim outliers. StatCrunch offers three trimming options: no trimming, mild and extreme outliers, and extreme only (the default). Trimming mild and/or extreme outliers will remove the appropriate data values from the plot and place these outliers on separate Low and/or High stems. Mild outliers are more than 1.5 times the interquartile range below (above) the first (third) quartile. Extreme outliers are more than 3 times the interquartile range below (above) the first (third) quartile.
  5. Click the Create Graph! button to create the plot(s).
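
The outlier definitions in step 4 can be sketched in Python. The function name is hypothetical, and the quartile method (the "inclusive" method of the standard statistics module) is an assumption; StatCrunch may compute quartiles differently. Values beyond the extreme threshold are reported as extreme only.

```python
import statistics


def classify_outliers(data):
    """Split data into (mild, extreme) outliers using the 1.5 and 3 IQR rules."""
    # Assumption: quartiles from the "inclusive" method of statistics.quantiles.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    mild, extreme = [], []
    for x in data:
        distance = max(q1 - x, x - q3)  # how far x lies outside the quartiles
        if distance > 3 * iqr:
            extreme.append(x)
        elif distance > 1.5 * iqr:
            mild.append(x)
    return mild, extreme


mild, extreme = classify_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 50])
```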

Boxplot

Displays a graphical representation of the 5-number summary for a set of numerical values, or optionally, a boxplot using inner and outer fences.
  1. Select the column(s) to be displayed in the plot(s). If multiple columns are selected, the plots will be stacked in the reverse order of selection in the same graphic.
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to construct boxplots for each distinct value of this column. If a Group by column is specified, select either to stack plots of each group for each column to be plotted or to stack plots of each column for each group.
  4. Click the Next button to choose to use fences when constructing the plots. By default, this option is not selected.
  5. Click the Next button again to specify graph layout options.
  6. Click the Create Graph! button to create the plot(s).
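
The quantities behind a boxplot can be sketched in Python. The function names are hypothetical. Two assumptions are made: quartiles come from the "inclusive" method of the standard statistics module, and inner and outer fences sit at the conventional 1.5 and 3 interquartile ranges beyond the quartiles.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


def fences(q1, q3):
    """Return ((inner low, inner high), (outer low, outer high))."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    return inner, outer


summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
inner, outer = fences(summary[1], summary[3])
```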

Dotplot

Displays a graphical representation of numerical values as points on a number line. Points with the same pixel representation are stacked on top of each other. If the number of points in a stack exceeds the height of the graphic, each point on the plot may represent more than one observation. If this occurs, the number of observations per point will be shown in the title of the graphic.
  1. Select the column(s) to be displayed in the plot(s). If multiple columns are selected, the plots will be stacked in the reverse order of selection in the same graphic.
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to construct dotplots for each distinct value of this column. If a Group by column is specified, select either to stack the plots of each group for each column or to stack plots of each column for each group.
  4. Click the Next button to specify graph layout options.
  5. Click the Create Graph! button to create the plot(s).
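
The stacking rule described at the top of this section can be sketched in Python. Everything here is illustrative: the function name, the way values are mapped to pixel columns, and the way the number of observations per point is chosen are assumptions, not StatCrunch's documented behavior.

```python
import math
from collections import Counter


def dotplot_stacks(data, width_px, max_stack):
    """Return ({pixel column: dots drawn}, observations per dot).

    Values that map to the same pixel column are stacked. If the tallest
    stack would exceed max_stack dots, each dot stands for several observations.
    """
    lo, hi = min(data), max(data)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    columns = Counter(round((x - lo) / span * (width_px - 1)) for x in data)
    obs_per_dot = math.ceil(max(columns.values()) / max_stack)
    stacks = {col: math.ceil(c / obs_per_dot) for col, c in sorted(columns.items())}
    return stacks, obs_per_dot


stacks, obs_per_dot = dotplot_stacks([1, 1, 1, 1, 2, 3], width_px=3, max_stack=2)
```

Here four observations share the first pixel column but only two dots fit, so each dot represents two observations.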

Means Plot

Displays the mean plus or minus two standard errors for a set of numerical values. This is not an interactive graphic.
  1. Select the column(s) to be displayed in the plot(s). If multiple columns are selected, the plots will be stacked in the reverse order of selection in the same graphic.
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to construct means plots for each distinct value of this column. If a Group by column is specified, select either to stack the plots of each group for each column or to stack plots of each column for each group.
  4. Click the Next button to specify graph layout options.
  5. Click the Create Graph! button to create the plot(s).
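
The interval drawn for each column or group can be sketched in Python. The function name is hypothetical, and the standard error is taken to be the sample standard deviation divided by the square root of the sample size, which is the usual definition but is not spelled out above.

```python
import math
import statistics


def mean_interval(data):
    """Return (mean - 2*SE, mean, mean + 2*SE) for a list of numbers."""
    mean = statistics.mean(data)
    # Assumption: SE = sample standard deviation / sqrt(n).
    se = statistics.stdev(data) / math.sqrt(len(data))
    return mean - 2 * se, mean, mean + 2 * se


low, mean, high = mean_interval([1, 2, 3, 4, 5])
```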

QQ Plot

Displays the sample quantiles of a variable versus the quantiles of a standard normal distribution.
  1. Select the column(s) to be displayed in the plot(s).
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to generate a separate QQ plot for each distinct value of the Group by column.
  4. Click the Next button to specify graph layout options.
  5. Click the Create Graph! button to create the plot(s).
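
The points of a QQ plot can be sketched in Python with the standard library. The function name is hypothetical, and the plotting positions (i - 0.5)/n are an assumption; other conventions exist and StatCrunch's choice is not stated here.

```python
from statistics import NormalDist


def qq_points(data):
    """Return (standard normal quantile, sample quantile) pairs."""
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (std_normal.inv_cdf((i - 0.5) / n), x)  # assumed plotting position
        for i, x in enumerate(sorted(data), start=1)
    ]


points = qq_points([3, 1, 2])
```

If the data are roughly normal, the points fall close to a straight line.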

Scatter Plot

Displays pairs of numerical values (points) on typical Cartesian (perpendicular) axes.
  1. Select the column containing the X-values of the points.
  2. Select the column containing the Y-values of the points.
  3. Enter an optional Where clause to specify the data rows to be included in the computation.
  4. Select an optional Group by column to group results. By default if a Group by column is specified, a single scatter plot will be generated where the points are color-coded according to the distinct values of the Group by column. To create a separate scatter plot for each unique value of the Group by column, check the separate graph for each group option.
  5. Click the Next button to specify graph layout options.
  6. Click the Create Graph! button to create the plot(s).

Multi Plot

Plots multiple pairs of points on the same graph or separate graphs. Pairs may be plotted as points, connected with lines or both plotted with points and connected with lines.
  1. Select the column containing the X-values of the points.
  2. Select the column containing the Y-values of the points.
  3. Choose the method for plotting the pairs (points, lines or both).
  4. Click on Add to add the pairing to the plot. The pairing will then be displayed in the selection box. To delete the pairing, select it and click on Delete.
  5. Repeat the above steps to select multiple pairings.
  6. Check the Separate graph for each variable combination option if you do not want the selected pairs plotted on the same graphic.
  7. Enter an optional Where clause to specify the data rows to be included in the plot.
  8. Select an optional Group by column to group results. If a Group by column is used, the Separate graph for each variable combination option is ignored. By default, if a Group by column is specified, a single plot will be generated where each pairing is color-coded according to the distinct values of the Group by column. To create a separate plot for each unique value of the Group by column, check the separate graph for each group option.
  9. Click the Next button to specify graph layout options.
  10. Click the Create Graph! button to create the plot(s).

Index/Time Plot

Displays the values of a column versus index values, time/date options or custom labels for the x-axis. Consecutive points in the plot are connected with lines.
  1. Select the column(s) to be displayed in the plot(s). By default if more than one column is selected, the values for each column are color-coded and displayed in a single plot. To display each column in a separate graph, check the separate graph for each column option.
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Click the Next button to specify the format for the x-axis.
  4. If the index option is chosen, the values are plotted against their index. Change the start index and increment to customize the x-axis.
  5. If the time option is chosen, there are a variety of time/date options available. Select the desired type of time or date, and enter the starting values along with the increment between values. The increment will always be in units of the smallest time/date unit. Note: if "Hour Day" or "Minute Hour Day" is selected, the day variable will be sequential and not based on a calendar.
  6. If the custom option is chosen, any column in the data may serve as the labels for the x-axis. Select which column will be used and enter both the starting row of the labels along with the spacing between labels displayed on the x-axis.
  7. Click the Next button to customize what is displayed on the graph (points, lines, etc.).
  8. Click the Next button again to specify graph layout options.
  9. Click the Create Graph! button to create the plot(s).

Chart Group Stats

Displays the selected statistics for values in a column grouped by another column.
  1. Select the statistics to chart.
  2. Select the column for Data In that contains the values for which the statistics are to be computed.
  3. Select the column for Groups In that contains the distinct values used to define groups.
  4. Enter an optional Where clause to specify the data rows to be included in the computation.
  5. By default all selected statistics will be color-coded and displayed on a single plot. To construct a separate graph for each statistic, check the separate graph for each statistic option.
  6. Click the Next button to choose what to plot on each graphic. The choices are:
  7. Click the Next button again to specify graph layout options.
  8. Click the Create Graph! button to create the plot(s).

Parallel Coordinates Plot

Displays data for two or more variables on parallel axes. The coordinates of a data value are connected with lines.
  1. Select the column(s) to be displayed in the plot(s).
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to group results. By default if a Group by column is specified, a single parallel coordinates plot will be generated where the lines connecting coordinates are color-coded according to the distinct values of the Group by column. To create a separate parallel coordinates plot for each unique value of the Group by column, check the separate graph for each group option.
  4. Click the Next button to specify graph layout options.
  5. Click the Create Graph! button to create the plot(s).

Pairs Plot

Displays a matrix of pairwise scatter plots for two or more selected columns.
  1. Select the column(s) to be displayed in the plot(s).
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Select an optional Group by column to group results. By default if a Group by column is specified, a single plot matrix will be generated where the points in each plot are color-coded according to the distinct values of the Group by column. To create a separate plot matrix for each unique value of the Group by column, check the separate graph for each group option.
  4. Click the Next button to specify graph layout options.
  5. Click the Create Graph! button to create the plot(s).

3D Rotating Plot

Displays a rotating XYZ scatter plot of three selected columns.
  1. Select the column containing the X-values of the points.
  2. Select the column containing the Y-values of the points.
  3. Select the column containing the Z-values of the points.
  4. Enter an optional Where clause to specify the data rows to be included in the computation.
  5. Select an optional Group by column to group results. By default if a Group by column is specified, a single 3D rotating plot will be generated where the points are color-coded according to the distinct values of the Group by column. To create a separate plot for each unique value of the Group by column, check the separate graph for each group option.
  6. Click the Next button to specify graph layout options.
  7. Click the Create Graph! button to create the plot(s).

Stars Plot

Displays a sequence of "stars" for each observation in a multivariate data set. The value of each variable is represented by a line segment drawn at a specific angle from the center of the star outward. The length of the segment represents the magnitude of the value relative to other values of the variable.
  1. Select the columns containing the multivariate data.
  2. Enter an optional Where clause to specify the data rows to be included in the computation.
  3. Specify an optional column which contains labels to be used for each star.
  4. Select the use full circle option to draw the stars around a complete circle. By default, the stars are drawn around the top half of a circle.
  5. Click the Next button to specify graph layout options.
  6. Click the Create Graph! button to create the stars.
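
The geometry of a single star can be sketched in Python. The function name is hypothetical, and two details are assumptions: segment lengths use min-max scaling within each variable, and angles are spaced evenly over the half or full circle.

```python
import math


def star_segments(row, mins, maxs, full_circle=False):
    """Return the (x, y) endpoint of each segment of one star, centered at (0, 0).

    row holds one observation; mins and maxs hold each variable's range.
    """
    k = len(row)
    segments = []
    for j, (value, lo, hi) in enumerate(zip(row, mins, maxs)):
        # Assumption: length is the value rescaled to [0, 1] within its variable.
        length = (value - lo) / (hi - lo) if hi > lo else 0.0
        if full_circle:
            angle = 2 * math.pi * j / k
        else:
            # Top half only: angles run from 0 to pi.
            angle = math.pi * j / (k - 1) if k > 1 else math.pi / 2
        segments.append((length * math.cos(angle), length * math.sin(angle)))
    return segments


segments = star_segments([5, 10], mins=[0, 0], maxs=[10, 10])
```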

Word Wall

Displays a graph highlighting the most common words in the selected columns. Each word is displayed within a bar with a width that is proportional to the number of times the word occurs. The bars are filled with different colors to better separate them visually. The bars are also stacked in a manner to fill the space available within the graphic. See http://www.statcrunch.com/twitter for an example word wall.
  1. Select the columns containing the words of interest.
  2. Enter an optional Where clause to specify the data rows to be included.
  3. Select an optional Group by column to group results. By default if a Group by column is specified, a separate graph will be produced for each unique value of the Group by column.
  4. Specify the delimiter to be used to separate words. A blank space is the default delimiter.
  5. Click the Next button to specify additional options.
  6. Specify whether or not to add an axis showing word frequency. This option is checked by default.
  7. Specify whether or not to remove common words and/or punctuation from the word wall. These options are checked by default. StatCrunch has a long list of common words shown below and in the dialog window. Words can be deleted from the common words list by removing them from the text area, and additional words can be added to the list by entering them in the text area.

    Common words list:
    & a able about above abroad according accordingly across actually adj after afterwards again against ago ahead 
    ain't all allow allows almost alone along alongside already also although always am amid amidst among amongst 
    an and another any anybody anyhow anyone anything anyway anyways anywhere apart appear appreciate appropriate 
    are aren't around as a's aside ask asking associated at available away awfully b back backward backwards be became 
    because become becomes becoming been before beforehand begin behind being believe below beside besides best better 
    between beyond both brief but by c came can cannot cant can't caption cause causes certain certainly changes 
    clearly c'mon co co. com come comes concerning consequently consider considering contain containing contains 
    corresponding could couldn't course c's currently d dare daren't definitely described despite did didn't different 
    directly do does doesn't doing done don't down downwards during e each edu eg eight eighty either else elsewhere 
    end ending enough entirely especially et etc even ever evermore every everybody everyone everything everywhere ex 
    exactly example except f fairly far farther few fewer fifth first five followed following follows for forever former 
    formerly forth forward found four from further furthermore g get gets getting given gives go goes going gone got gotten 
    greetings h had hadn't half happens hardly has hasn't have haven't having he he'd he'll hello help hence her here hereafter 
    hereby herein here's hereupon hers herself he's hi him himself his hither hopefully how howbeit however hundred i 
    i'd ie if ignored i'll i'm immediate in inasmuch inc inc. indeed indicate indicated indicates inner inside insofar 
    instead into inward is isn't it it'd it'll its it's itself i've j just k keep keeps kept know known knows l last lately 
    later latter latterly least less lest let let's like liked likely likewise little look looking looks low lower ltd m 
    made mainly make makes many may maybe mayn't me mean meantime meanwhile merely might mightn't mine minus miss more 
    moreover most mostly mr mrs much must mustn't my myself n name namely nd near nearly necessary need needn't needs neither 
    never neverless nevertheless new next nine ninety no nobody non none nonetheless noone no-one nor normally not 
    nothing notwithstanding novel now nowhere o obviously of off often oh ok okay old on once one ones one's only onto 
    opposite or other others otherwise ought oughtn't our ours ourselves out outside over overall own p particular particularly 
    past per perhaps placed please plus possible presumably probably provided provides q que quite qv r rather rd re really 
    reasonably recent recently regarding regardless regards relatively respectively right round s said same saw say saying 
    says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious seriously seven several 
    shall shan't she she'd she'll she's should shouldn't since six so some somebody someday somehow someone something sometime 
    sometimes somewhat somewhere soon sorry specified specify specifying still sub such sup sure t take taken taking tell 
    tends th than thank thanks thanx that that'll thats that's that've the their theirs them themselves then thence there 
    thereafter thereby there'd therefore therein there'll there're theres there's thereupon there've these they they'd 
    they'll they're they've thing things think third thirty this thorough thoroughly those though three through throughout 
    thru thus till to together too took toward towards tried tries truly try trying t's twice two u un under underneath 
    undoing unfortunately unless unlike unlikely until unto up upon upwards us use used useful uses using usually v value 
    various versus very via viz vs w want wants was wasn't way we we'd welcome well we'll went were we're weren't we've what 
    whatever what'll what's what've when whence whenever where whereafter whereas whereby wherein where's whereupon wherever 
    whether which whichever while whilst whither who who'd whoever whole who'll whom whomever who's whose why will willing 
    wish with within without wonder won't would wouldn't x y yes yet you you'd you'll your you're yours yourself yourselves 
    you've z zero
    
  8. Click the Next button two more times to specify graph layout options.
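
The word counting behind a word wall can be sketched in Python. The function name is hypothetical and COMMON_WORDS is only a small subset of the list above. Lowercasing words and stripping punctuation only from the start and end of each word are assumptions, not documented StatCrunch behavior.

```python
import string
from collections import Counter

# Small illustrative subset of the common words list shown above.
COMMON_WORDS = {"a", "and", "is", "of", "the", "to"}


def word_counts(cells, delimiter=" ", remove_common=True, remove_punctuation=True):
    """Count words across the cells of the selected columns."""
    counts = Counter()
    for cell in cells:
        for word in cell.split(delimiter):
            word = word.lower()  # assumption: counting ignores case
            if remove_punctuation:
                word = word.strip(string.punctuation)
            if not word:
                continue
            if remove_common and word in COMMON_WORDS:
                continue
            counts[word] += 1
    return counts


counts = word_counts(["The cat and the hat.", "A cat!"])
```

Each bar's width would then be proportional to the count for its word.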