SAS/Descriptive Statistics

List and describe your data

Describe your data:

proc contents lists all the variables in a dataset, together with their types.

 proc contents data= lib.data ;
 title "Describe the content of a database";
 run;

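List your data in the output:

proc print prints the data in the output window. The firstobs option gives the number of the first observation to print, and the obs option gives the number of the last observation to print (not how many observations to print). With firstobs=30 and obs=40, observations 30 through 40 are listed.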

 proc print data= lib.data (firstobs=30 obs=40);
 title "Partial Listing";
 run;
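
As a quick check of how firstobs and obs interact, here is a minimal sketch that uses sashelp.class, a sample dataset shipped with SAS. It is not part of the original example and is assumed to be available in your installation. Because obs gives the last observation rather than a count, the listing should contain observations 5 through 10, that is, six rows.

 /* sashelp.class is an assumed sample dataset, used only for illustration */
 proc print data=sashelp.class (firstobs=5 obs=10);
 title "Observations 5 to 10";
 run;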

Discrete Variables

 proc freq data=lib_name.data_name;
 tables x1 x2 ;
 title "Frequency table";
 run;
  • weight : names a variable whose values are used to weight the observations
 proc freq data=lib_name.data_name;
 weight extri;
 tables x1 / out=temp4 outexpect;
 run;

Contingency Tables

 proc freq data=lib_name.data_name;
 tables x1*x2 ;
 title "contingency table";
 run;
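
By default, each cell of the contingency table shows the frequency, the overall percentage, the row percentage and the column percentage. If only the counts are of interest, the other entries can be suppressed with table options. The sketch below assumes the same hypothetical dataset and variables as above.

 /* norow, nocol and nopercent hide the row, column and overall percentages */
 proc freq data=lib_name.data_name;
 tables x1*x2 / norow nocol nopercent;
 title "Contingency table (counts only)";
 run;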

Continuous Variables

proc means presents descriptive statistics for each variable listed in the var statement or for each numeric variable in the data set if there is no var statement. Here are some of the keywords that can be used to tell SAS which statistics you wish to see.

  • n : number of non-missing values
  • sum : sum of the values
  • range : largest value minus smallest value
  • mean : average
  • var : variance
  • stddev : standard deviation
 proc means data=lib_name.data_name n sum range mean var stddev;
 var x1 x2;
 run;

The class statement computes the statistics separately for each group defined by the categorical variable it names. The weight statement weights the observations.

 proc means data=lib_name.data_name;
 var x1 x2;
 class sexe;
 weight extri;
 run;
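
To try the class statement on real data, here is a minimal sketch that again relies on the sashelp.class sample dataset (an assumption, not part of the original example). That dataset contains the variables sex, height and weight. The output should show one block of statistics per value of sex.

 /* sashelp.class is an assumed sample dataset, used only for illustration */
 proc means data=sashelp.class n mean stddev min max;
 class sex;
 var height weight;
 title "Height and weight by sex";
 run;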

proc univariate offers more options and also returns the quantiles. Its histogram statement can also be useful.

 proc univariate data=lib_name.data_name;
 var x1;
 histogram / normal(color=red mu=0 sigma=0.045) kernel(color=blue);
 title "Proc Univariate";
 run;

Kernel and Histograms

To plot a histogram or a kernel density estimate, you can use proc univariate with the histogram statement, or proc capability.

 proc univariate data=lib_name.data_name;
 var x1;
 histogram / normal(color=red mu=0 sigma=0.045) kernel(color=blue);
 title "Proc Univariate";
 run;

With proc capability (part of SAS/QC):

 proc capability data=lib_name.data_name;
 histogram x1 / normal(color=red mu=0 sigma=0.045)
 kernel(color=blue);
 title "Proc Capability";
 run;


Correlations and scatterplots

 proc corr data=lib_name.data_name;
 var x1 x2 x3;
 weight extri;
 title3 "Linear correlation";
 run;
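
The correlation coefficients can be checked visually with a scatterplot. One possible way, sketched here under the assumption that proc sgplot is available (SAS 9.2 or later), is the scatter statement. The dataset and variable names are the same hypothetical ones as above.

 /* plot x2 against x1; one point per observation */
 proc sgplot data=lib_name.data_name;
 scatter x=x1 y=x2;
 title "Scatterplot of x2 against x1";
 run;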

T Test

The following code tests the null hypothesis that the mean of variable x in the dataset taille is 1.75, at the 5% significance level.

 proc ttest data=taille h0=1.75 alpha=0.05;
 var x;
 run;