### MathCAD Data Analysis (Descriptive Statistics) Preview

Posted: **Fri Sep 25, 2009 7:38 am**

Descriptive statistics are commonly used in medical research. They quantitatively describe the main features of a set of data, presenting large amounts of data in a clear and understandable way. Inferential statistics differ in that they are used to reach conclusions that generalise beyond the immediate data. MathCAD offers a variety of descriptive statistical functions for reducing large amounts of data to a much simpler summary. Descriptive statistics are generally presented alongside more formal analyses, to give the audience an overall sense of the data being analysed.

I plan to go through the following basic functions:

1. mean(A,B,C,...)

2. median(A,B,C,...)

3. mode(A,B,C,...)

4. percentile(v,p)

The first three functions take single or multiple scalars or arrays and return the mean, median, and mode, respectively. Together with percentile(v,p), they give measures of the location of a data point relative to the rest of the distribution. The best choice of location estimator depends on the general dispersion or distribution of your data.
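For readers without MathCAD to hand, the three location estimators can be sketched in Python. This is only an analogue using Python's standard `statistics` module, not MathCAD's own functions, and the sample values are made up for illustration.

```python
import statistics

# Hypothetical sample values, made up for illustration (not the data from the post).
data = [2, 3, 3, 5, 7, 10]

location = {
    "mean": statistics.mean(data),      # sum of the values divided by their count
    "median": statistics.median(data),  # middle of the sorted values (average of the two middle ones here)
    "mode": statistics.mode(data),      # most frequently occurring value
}

for name, value in location.items():
    print(name, value)
```

Percentiles are left out of this sketch because the interpolation convention differs between tools, so results may not match MathCAD's percentile(v,p) exactly.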

Mean

In statistics this is also referred to as the "arithmetic mean", and it is the most commonly used type of average. To calculate the mean of a set of numbers, simply add up all the numbers in the set and divide the sum by the number of items in the set. The other types of average, such as the median and the mode, will be discussed later on.

Example:

Consider the following numeric data:

The arithmetic mean or average of N values is given by the following formula:
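In standard notation, with $x_i$ denoting the $i$-th of the $N$ values (the symbol names are chosen here for illustration), the arithmetic mean can be written as:

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$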

The mean is sensitive to changes in values of one or more data points:

The mean is strongly affected by significant outliers, so it can be a poor description of the central location when outliers are present.

You may choose to trim the outliers and find the "trimmed mean" for a better estimate.

Consider the "trimmed" numeric data:

As you can see, I have chosen to leave out the value 46, which was a significant outlier in this set.

MathCAD automatically updates all the formulas that depend on the data and recalculates the mean for the new data set.
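The effect of dropping an outlier can also be sketched outside MathCAD. The Python snippet below is a rough illustration only: the value 46 is the outlier mentioned above, while the remaining values are hypothetical stand-ins, since they are assumptions rather than the data from the worksheet.

```python
import statistics

# Hypothetical values for illustration; only the outlier 46 comes from the post.
data = [4, 5, 6, 7, 8, 46]

# Mean of the full set, pulled upwards by the outlier.
full_mean = statistics.mean(data)

# Drop the outlier by hand, mirroring the manual trimming described above.
trimmed = [x for x in data if x != 46]
trimmed_mean = statistics.mean(trimmed)

print("mean with outlier:   ", full_mean)
print("mean without outlier:", trimmed_mean)
```

With these made-up numbers, a single outlier roughly doubles the mean. Note that a formal trimmed mean usually discards a fixed fraction of values from both ends of the sorted data; removing one value by hand, as done here, is the simpler approach the post describes.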
