STATISTICS: The technology of extracting meaning from data (in order to make inferences)
- Data Cycle:
- Collecting Data (sampling, tally charts);
- Representing Data (numbers, graphs);
- Analysing Data (comparisons, conclusions).
- Types of Data
Stem and Leaf Diagrams
Worked Example 1
- Easiest to draw the diagram unordered first, then redraw it with the leaves in order;
- Mark values when used to help avoid double counting;
- Do a “hash-check” after completing to ensure all values have been included;
- Can use back-to-back stem-and-leaf diagrams for comparisons;
- Must have a key.
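The steps above can be sketched in code. This is a minimal illustration only, assuming two-digit whole numbers (stem = tens digit, leaf = units digit); the data values and the function name `stem_and_leaf` are made up for the example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return an ordered stem-and-leaf diagram as a list of text lines.

    Assumes two-digit whole numbers: stem = tens digit, leaf = units digit.
    """
    rows = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)  # unordered pass: leaves in the order they appear

    # Check that every value has been placed exactly once
    assert sum(len(leaves) for leaves in rows.values()) == len(values)

    lines = []
    for stem in range(min(rows), max(rows) + 1):  # keep any empty stems too
        leaves = " ".join(str(leaf) for leaf in sorted(rows.get(stem, [])))
        lines.append(f"{stem} | {leaves}".rstrip())

    lines.append("Key: 2 | 3 means 23")  # a diagram must have a key
    return lines

# Hypothetical data, made up for illustration
data = [23, 31, 27, 45, 38, 31, 52, 29, 36, 40]
diagram = stem_and_leaf(data)
print("\n".join(diagram))
```

Sorting the leaves corresponds to the "ordered" redraw, and the count check plays the role of the final check that no value was missed or double counted.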
Worked Example 2
Large Sets of Data
Large sets of data are generally presented in classes (groups). Grouping loses some of the detail of the raw data. Typically, 5 to 10 is a “good” number of classes.
Labelling of tables for such data must be unambiguous.
What do you think about this? What kind of data is it? Is there an issue? Is there a better way to write it?
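One common way to label classes for continuous data so that every value belongs to exactly one class is to use inequalities of the form lower <= x < upper. The sketch below assumes that convention; the measurements, the class boundaries and the function name `group_into_classes` are all hypothetical.

```python
def group_into_classes(values, boundaries):
    """Count the values in each class lower <= x < upper."""
    table = []
    for lower, upper in zip(boundaries, boundaries[1:]):
        frequency = sum(1 for x in values if lower <= x < upper)
        table.append((f"{lower} <= x < {upper}", frequency))
    return table

# Hypothetical continuous measurements, made up for illustration
lengths = [1.2, 3.5, 4.0, 4.9, 6.0, 7.3, 8.8, 10.1, 11.6, 14.2]
boundaries = [0, 3, 6, 9, 12, 15]  # five classes of equal width

table = group_into_classes(lengths, boundaries)
for label, frequency in table:
    print(f"{label:>12}  {frequency}")

# Every value should land in exactly one class
assert sum(frequency for _, frequency in table) == len(lengths)
```

With these labels the value 6.0 clearly belongs to the class 6 <= x < 9, whereas labels such as "3-6" and "6-9" would leave that unclear.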
Histograms
- Used to represent continuous data
- y-axis shows frequency density = frequency / class width
- Why do we need to do this and not just put frequency on the y-axis?
- No spaces between bars
- Area of bar is proportional to frequency.
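A small calculation may help show why frequency density is used when class widths are unequal. The grouped data below are made up for illustration.

```python
# Hypothetical grouped data with unequal class widths, made up for illustration
classes = [(0, 10, 20), (10, 20, 20), (20, 50, 30)]  # (lower, upper, frequency)

densities = []
areas = []
for lower, upper, frequency in classes:
    width = upper - lower
    density = frequency / width        # frequency density = frequency / class width
    densities.append(density)
    areas.append(density * width)      # area of the bar recovers the frequency
    print(f"{lower}-{upper}: width {width}, frequency {frequency}, density {density}")
```

Here the widest class has the largest frequency (30) but the smallest frequency density (1.0), because its values are spread over a class three times as wide. Plotting frequency as the bar height would make that class look like the most concentrated region when it is in fact the least.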
Cumulative Frequency Graphs
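- The y-coordinate of each point is a running total of the frequencies
- The graph can be joined to zero at the lower boundary of the first class
- Points may be joined with straight lines (cumulative frequency polygon) or with a smooth curve (cumulative frequency curve)
- Values can be read off the graph, e.g. the number of values in the top 20%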
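As a rough sketch of how the running totals and a reading from the graph can be computed, the example below uses hypothetical grouped data, assumes the lowest class boundary is 0, and assumes the points are joined with straight lines (so reading off the graph becomes linear interpolation). The helper name `read_across` is made up for the example.

```python
from itertools import accumulate

# Hypothetical grouped data, made up for illustration
upper_boundaries = [10, 20, 30, 40, 50]
frequencies = [4, 10, 16, 7, 3]

# Running totals, plotted against the upper class boundaries
cumulative = list(accumulate(frequencies))  # [4, 14, 30, 37, 40]

# Join the graph to zero at the lowest class boundary (assumed to be 0 here)
points = [(0, 0)] + list(zip(upper_boundaries, cumulative))

def read_across(points, target):
    """Find the x-value at cumulative frequency `target` on a straight-line graph."""
    if target == points[0][1]:
        return points[0][0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < target <= y1:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target is outside the range of the graph")

total = cumulative[-1]
cutoff = read_across(points, 0.8 * total)  # 80% of the values lie below this
print(f"The top 20% ({total - 0.8 * total:.0f} values) lie above about {cutoff:.1f}")
```

A reading taken from a hand-drawn curve will usually differ slightly from this straight-line estimate.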
Worked Example: Heights of giraffes
Summary of Data Representation
Each of the methods of representing data that we learn has different pros and cons. For instance, grouping data in a frequency table makes it more concise and easier to read, but loses some of the detail of the raw data.
Our last exercise lets us contrast the different types of data representation and, as usual, is followed by mixed exercises on this topic.