MA177 SAS Fast Track

Lecture II
This lecture will cover an introduction to writing programs in SAS. The menu to the right provides links to lecture material as well as to review questions, a sample test, and the assignment page (if any) for this week.
Lecture Menu

INTRODUCTION

SOME BASIC RULES

ASSIGN SAS LIBRARIES USING THE LIBNAME STATEMENT

OPTIONS STATEMENT

DATA STEP

Review Questions & Answers

Sample Test

Assignment

I. INTRODUCTION

A basic, uncomplicated SAS job begins with a DATA statement and proceeds from there. If any external files are needed, however, or if data will be written out or saved during a session, the SAS System has to know where to find or send that data. In addition, it may be necessary to specify or change system options. These operations are performed by statements such as LIBNAME, FILENAME, and OPTIONS. These statements can usually be entered anywhere in a SAS job, but they are typically placed at the beginning of a session and remain in effect until changed or until the session ends.

II. SOME BASIC RULES
 

The SEMICOLON
All SAS statements end with a semicolon. This is how SAS knows where one statement ends and the next one begins. One of the most common SAS errors results from missing semicolons. While the user might intend a new SAS statement, without a semicolon SAS tries to read the next word(s) as continuation of the previous statement. When errors occur, always check to see that appropriate semicolons are present.
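As a minimal sketch of this error, the first DATA step below omits the semicolon after the INPUT statement, so SAS reads DATALINES as one more variable name and then tries to interpret the data rows as statements. The data set names bad and good are illustrative only, and the exact log messages depend on the SAS release.

```sas
/* Broken: no semicolon after "weight". SAS treats "datalines" as a
   variable in the INPUT statement and reports syntax errors on the rows. */
data bad;
   input height sex $ weight
   datalines;
77 m 220
;
run;

/* Fixed: every statement ends with a semicolon. */
data good;
   input height sex $ weight;
   datalines;
77 m 220
;
run;
```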

Free Format Style
SAS statements are free-format which means:

  • One or more blanks or special characters can be used to separate words.
  • They can begin and end in any column.
  • A single statement can span multiple lines.
  • Several statements can be on the same line.
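The two DATA steps below are a small illustration of these rules: they contain the same statements and should produce equivalent data sets, and only the layout differs. The data set names layout1 and layout2 are hypothetical. Note that the rows after DATALINES are data, not statements, so they are not free-format in this sense.

```sas
/* One statement per line, indented for readability. */
data layout1;
   input height sex $ weight;
   datalines;
77 m 220
;
run;

/* Same statements: several on one line, and one INPUT statement
   spread over three lines starting in different columns. */
data layout2; input height
   sex $
      weight; datalines;
77 m 220
;
run;
```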

The RUN Statement
Each DATA step or procedure in SAS should end with a RUN statement, which marks the end of the step and tells SAS to execute it.

III. ASSIGN SAS LIBRARIES USING THE LIBNAME STATEMENT

You can use the LIBNAME statement or function to assign librefs and engines to one or more folders, including the working folder. The examples in this section use the LIBNAME statement.

The LIBNAME statement has the following basic syntax:

LIBNAME libref <engine-name> 'SAS-data-library';

CAUTION:
The words CON, NUL, LPT1 - LPT9, COM1 - COM9, and PRN are reserved words under Windows. Do not use these reserved words as librefs.

Assigning a Libref to a Single Folder
If you have Version 8 SAS data sets stored in the C:\MYSASDIR folder, you can submit the following LIBNAME statement to assign the libref TEST to that folder:

libname test V8 'c:\mysasdir';

This statement indicates that Version 8 SAS files stored in the folder C:\MYSASDIR are to be accessed using the libref TEST. Remember that the engine specification is optional.

Assigning a Libref to the Working Folder
If you want to assign the libref MYCURR to your current SAS System working folder, use the following LIBNAME statement:

libname mycurr '.';

Assigning a Libref to Multiple Folders
If you have SAS files located in multiple folders, you can treat these folders as a single SAS data library by specifying a single libref and concatenating the folder locations, as in the following example:

libname income ('c:\revenue','d:\costs');

This statement indicates that the two folders, C:\REVENUE and D:\COSTS, are to be treated as a single SAS data library. When you concatenate SAS data libraries, the SAS System uses a protocol (a set of rules) for accessing the libraries, depending on whether you are accessing the libraries for read, write, or update.

Furthermore, you may concatenate multiple libraries by specifying only their librefs, as in the following example:

libname sales (income revenue);

This statement indicates that two libraries that are identified by librefs INCOME and REVENUE are treated as a single SAS data library whose libref is SALES.

Assigning Engines
If you want to use another access method, or engine, instead of the Version 8 engine, you can specify another engine name in the LIBNAME statement. For example, if you want to access Release 6.10 SAS data sets from your Version 8 SAS session, you can specify the V6 engine in the LIBNAME statement, as in the following example:

libname oldlib V6 'c:\sas610';

As another example, if you plan to share SAS files between Version 8 under Windows and Release 6 under Windows, you should use the V6 engine when assigning a libref to the SAS data library. Here is an example of specifying the V6 engine in a LIBNAME statement:

libname lib6 V6 'c:\sas6';

The V6 engine is particularly useful in your Version 8 SAS session if you are going to be accessing the same SAS files from a Release 6 SAS session. Remember that while Version 8 can read Release 6 SAS data sets, Release 6 cannot read Version 8 SAS data sets.

Using the LIBNAME Statement in SAS Autoexec Files
If you prefer, you can store LIBNAME statements in your SAS autoexec file so that the librefs are available as soon as the SAS System initializes. For example, your SAS autoexec file may contain the following statements:

libname test 'c:\mysasdir';
libname mylib ('c:\mydata','d:\tempdata');
libname oldlib V6 'c:\sas6';

Assigning Multiple Librefs and Engines to a Folder
If a folder contains SAS files created by several different engines, only those SAS files created with the engine assigned to the given libref can be accessed using that libref. You can assign multiple librefs with different engines to a folder. For example, the following statements are valid:

libname one V6 'c:\mydir';
libname two V8 'c:\mydir';

Data sets referenced by the libref ONE are created and accessed using the compatibility engine (V6), whereas data sets referenced by the libref TWO are created and accessed using the default engine (V8). You can also have multiple librefs (using the same engine) for the same SAS data library. For example, the following two LIBNAME statements assign the librefs MYLIB and INLIB (both using the V8 engine) to the same SAS data library:

libname mylib V8 'c:\mydir\datasets';
libname inlib V8 'c:\mydir\datasets';

Because the engine name and SAS data library specifications are the same, the librefs MYLIB and INLIB are identical and can be used interchangeably.


IV. OPTIONS STATEMENT
  • Use the OPTIONS statement to change SAS system options.
  • The change(s) will remain in effect for the rest of the job/session or until changed again.
  • The OPTIONS statement can appear at any place in a program, except within data or card lines.

Frequently Used Options...

  • LINESIZE=n and PAGESIZE=n
    • Affect the printer line width and number of lines printed per page for log and output files.
    • In windowing environment, check the print setup for printer maximums.
    • For special pf output, use:
      options linesize=88 pagesize=64; ** portrait **;
      options linesize=179 pagesize=65; ** landscape **;
  • PAGENO=n will start the output file at page number n.

    options pageno=1 linesize=90 pagesize=55;

Some Other Options...

  • NONUMBER removes the page numbers from the output file.
  • NODATE removes the date and time from the output window or lst file.
  • NOCENTER left justifies the output file.
  • SKIP=n will tell SAS to skip n lines before printing on a page.
  • MISSING='character' specifies the character to print for missing numeric values.
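A short sketch of how several of these options might be combined follows. The data set name demo and its values are made up for illustration, and the choice of '*' as the missing-value character is arbitrary.

```sas
/* Suppress page numbers and the date, left-justify output,
   and print missing numeric values as an asterisk. */
options nonumber nodate nocenter missing='*';

data demo;             /* hypothetical data set */
   input id score;
   datalines;
1 85
2 .
;
run;

/* The missing score for id 2 should print as * instead of a period. */
proc print data=demo;
run;
```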

V. DATA STEP

Overview

A SAS program is a collection of SAS statements that may include keywords, various names (e.g., of data sets and variables), special characters, and operators. A SAS statement may be used in a DATA step, in a PROC (procedure) step, or anywhere in a SAS program.

A SAS program consists of DATA steps and PROC (procedure) steps. DATA steps handle data sets, while PROC steps actually conduct analyses.

A DATA step is used to create or modify data sets by creating and modifying variables; checking and correcting errors in data sets; and writing programs (for simulations).

SAS has the following basic rules.

  • A statement can begin and end in any column.
  • A statement ends with a semicolon (;). A line can contain more than one statement.
  • SAS is not case-sensitive.
  • Arithmetic operators (+, -, *, and /) return a missing value when any operand is missing, while functions such as SUM and MEAN ignore missing values.
  • A comment is enclosed by /* and */.
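The difference between operators and functions with missing values can be seen in a small sketch. The variable names here are illustrative, and DATA _NULL_ is used so that no data set is created and the results are written only to the log.

```sas
data _null_;
   a = 5;
   b = .;                  /* a period is the numeric missing value */
   total_op = a + b;       /* operator: the result is missing */
   total_fn = sum(a, b);   /* function: the missing value is ignored */
   put total_op= total_fn=;   /* log shows: total_op=. total_fn=5 */
run;
```

SAS also writes a note to the log saying that missing values were generated by an operation on missing values, which is a useful hint when a computed variable is unexpectedly empty.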

Reading data instream with DATALINES statement.

Example 1:

data one;
input height sex $ weight;
datalines;
77 m 220
61 f 110
72 m 185
67 m 125
;
run;
proc print data=one;
run;

Note: The dollar sign indicates that sex is a character variable, while height and weight (both of which are not followed by dollar signs) are numerical variables.

Reading data from an external file with the INFILE statement.

If the data file is called sample.dat, and is located in the /home/econ312 directory, use the following infile and input statements to read in the data from external file sample.dat.

Example 2:

data one;
infile '/home/econ312/sample.dat';
input height sex $ weight;
run;
proc print data=one;
run;

A sample.dat file might look like this:

77 m 220
61 f 110
72 m 185
67 m 125

Creating Permanent SAS data sets

Naming data sets.

When you create a SAS data set, you designate its name in the DATA statement, for example DATA one;. A data set named one is actually called work.one by SAS. This is an example of the two-level naming system that SAS uses for data sets. The first-level name is the libref, which identifies the library where the data set is stored; the second-level name is the actual name of the data set. SAS stores data sets in groups called libraries (i.e., directories, in Unix terms). A library usually groups data sets that contain related information, but the data sets are not required to be related in any way. If you do not specify a first-level name, SAS assumes that work is the first level. The work library is a special library for temporary data sets. Work data sets are available only for the duration of your current SAS session. When you leave SAS, the work data sets are deleted.
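As a brief sketch of the two-level naming rule, the step below creates a data set with a one-level name and then refers to it with both levels. The variable x and its value are placeholders.

```sas
/* A one-level name is stored in the temporary WORK library. */
data one;
   x = 1;
run;

/* WORK.ONE refers to the same data set created above. */
proc print data=work.one;
run;
```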

Creating permanent SAS data sets.

To create a permanent SAS data set, allocate a library with the LIBNAME statement and use a two-level data set name in the DATA statement:

LIBNAME libref 'pathname';
DATA libref.datasetname;

Example 3:

LIBNAME mylib 'c:/temp';
DATA mylib.one;
infile '/home/econ312/sample.dat';
input height sex $ weight;
run;

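After you execute this SAS program, a permanent SAS data set named one is stored as a file in the directory 'c:/temp'. The file extension depends on the engine and host: the Version 8 engine writes one.sas7bdat, while Release 6 engines wrote files such as one.ssd01 (Unix) or one.sd2 (Windows).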

Using permanent SAS data sets in PROC step.

Once a SAS data set is stored permanently, you can use it directly in a PROC step without a DATA step.

Example 4:

LIBNAME mylib 'c:/temp';
proc contents data=mylib.one;
run;

This will give you the contents of SAS data set one that was created in Example 3.

Browse SAS data descriptor using PROC CONTENTS.

The descriptor portion of a SAS data set contains:
  • general information about the SAS data set (such as data set name and number of observations)
  • variable attributes (name, type, length, position, informat, format, label).

The CONTENTS procedure displays the descriptor portion of a SAS data set.

proc contents data=work.STAFF varnum;
run;

This will give you the contents of the SAS data set STAFF. By default, PROC CONTENTS lists variables in alphabetical order. The VARNUM option lists variables by their logical position in the data set.

Browse SAS data portion using PROC PRINT or PROC FSVIEW.

  • The PRINT or FSVIEW procedure displays the data portion of a SAS data set.
  • PROC FSVIEW displays SAS data without writing data to the output window. The display of data cannot be saved or printed.

proc fsview data=work.STAFF;
run;

  • PROC PRINT displays data in the output window. The display can be printed or saved as an output file.

proc print data=work.STAFF;
run;

Review Questions & Answers
Sample Test

1. Which of these are valid Windows file names?
a) states1.raw
b) main | memory
c) 1st Quarter Report
d) letter3.doc
e) Lift?.dat

2. Which of these are valid SAS variable names?
a) first name
b) Last_Name
c) 1stName
d) address_2
e) address#2

3. In the programming process, what activities may be performed after reviewing the results?

4. In this code, which variables are numeric?
data survey;
infile 'survey.dat';
input Initials $
Gender $
State $
Years
Profession $;
run;

5. In the above code, which variables hold character data?

Sample Test Answers

1. a, c, d
2. b, d
3. modify or debug
4. Years
5. Initials, Gender, State, Profession