Using SAS - A Case Study

Tips from a SAS/Graph expert - how I work...

Hopefully you've seen my hundreds of fancy SAS/Graph examples on robslink.com, and wondered "How does this guy do that?" In this paper, I describe "how I work" ... in other words, my 'programming environment', or the way I get things done. This paper's not about individual SAS/Graph tips & techniques, but rather the big-picture of how I use SAS.

Basically, I write my SAS jobs on Unix, using the "vi" editor. I run my SAS jobs from a DOS window, running them from the command line in 'batch' mode, and set up the SAS jobs to write their output to the current directory. I use "ods html" and device=gif, and I view the output in a web browser.

Which Version of SAS?

What version of SAS do I use? ... The latest one that's available! (which, as of October 2009, is version 9.2)

If you use older versions of SAS, you'll be missing out on lots of great new features & bug fixes. V9.2 supports millions of colors and shades in a single graph (whereas previous versions had a 256 color limit) - this feature alone is reason enough to switch! Also, v9.2 gets you nice smooth-edged fonts in graphs on Unix (yay!). There are *many* other great SAS/Graph improvements in v9.2, and you can read about them in this paper.

I cannot stress enough how strongly I recommend using the current/latest version of SAS/Graph!

Windows or Unix?

Should you run your SAS jobs on Windows or Unix? ... The answer is: Yes!

Thankfully this is pretty much a non-issue these days - use whichever is more convenient to you! I run my SAS jobs on Windows and Unix, interchangeably (I generally run on Windows because that's my desktop system.)

As of v9.2, both Windows and Unix support smooth-edged fonts (and also default to smooth-edged fonts), so the operating system where you run your SAS jobs is basically a non-issue now.

Also, modern versions of SAS use .sas7bdat data sets, which can be read by either Unix or Windows SAS via CEDA (Cross-Environment Data Access), as opposed to the older .ssd01 & .ssd02 data sets, which could only be read by one OS or the other.

One other Unix/Windows issue to consider ... pathnames. I generally put all the files I'm going to be reading in the current directory, and write all output there also, so I don't have to deal with pathnames. Occasionally I have some files in another directory, but in that case I reference them via a relative pathname, and use the (Unix) forward-slash, since modern versions of Windows can handle both forward- and back-slash in the pathnames. So, basically, if you use relative pathnames, and write path/filenames so that they will work on Unix (using forward-slash, and taking care to enter your filenames in the correct mixed-case), then they should run fine on both Unix and Windows. For example, here's the relative pathname to a directory that's up two directories, and then down into the 'My_data' directory ...

   libname foo '../../My_data';

DMS or Batch?

I seldom run DMS (Display Management System) SAS - I only use it when ... 1) I'm just experimenting, trying to figure out some syntax, or 2) I'm running SAS Help (from the "Help->SAS Help and Documentation" pulldown menu).

99% of the time, I run batch SAS from a DOS command line, rather than DMS SAS! You can bring up a DOS window on Windows using "Start->Programs->Accessories->Command Prompt".

[Screenshots: batch SAS run from a DOS command prompt, rather than the DMS SAS windows.]

Why so old-school, you might ask?...

Why not use EG, so it will help you generate the correct syntax? EG is a great help for beginners, but EG doesn't have a GUI interface for all of the SAS/Graph proc options, and in particular "annotate" (which I use in a *lot* of my fancy/custom graphs).

Why not use DMS SAS so you can see the color-coded SAS logs? Well, I don't make mistakes, so I don't need to see the SAS log! And on the rare occasions when there is an error in the log, I can bring up the log in the "vi" editor, and very quickly do a "/ERROR" to locate the error, rather than scrolling a DMS window. Oh, and batch SAS automatically clears/replaces the log with every run, which I think is much easier than manually clearing the SAS log window in DMS SAS. Also, having a separate .log file for each SAS job is kinda handy, if you need to go look at them later (to see how much cpu time they took, etc).
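
The same "/ERROR" search can also be done without opening an editor. The following is only a minimal sketch, assuming a Unix shell and assuming that problem lines in the log begin with the word ERROR or WARNING; the function name check_log is hypothetical and is not part of SAS.

```shell
# Hypothetical helper (not part of SAS): list the ERROR and WARNING
# lines in a batch log, each prefixed with its line number.
check_log() {
    # $1 = the .log file written by the batch run, e.g. foo.log
    # grep exits non-zero when nothing matches, i.e. the log looks clean
    grep -n -E '^(ERROR|WARNING)' "$1"
}
```

Running "check_log foo.log" prints each matching line with its line number, and prints nothing when no line matches.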

Why not use DMS SAS so you don't have to re-run all your code each time? It's a trade-off -- by re-running all the code each time, it might take a little longer to run, but I get a fresh start every time, and I know that the current sas job contains all the code needed to produce the output (whereas in DMS sas, some command you ran a while ago, which is no longer in your program editor window, might be producing some of what you're seeing in your graph). I don't have to worry about doing a "goptions reset" or deleting grsegs so I can re-use the names, etc.

How to run Batch SAS?

If you're going to run Batch SAS (from the command line), you definitely don't want to type the full path to the SAS executable every time you run SAS. Therefore, you'll want to add the path to the SAS executable to your search path, so you only have to type 'sas' to run it.

For example, on Unix, if the SAS executable was in ... /foo/software/SASFoundation/9.2/sas, you could either add that directory to your path (export PATH=$PATH:/foo/software/SASFoundation/9.2/) or you could make a symbolic link to the sas executable from some location that's already in your path (such as /usr/local/bin) - either of which is a one-time-thing ...


   cd /usr/local/bin/
   ln -s /foo/software/SASFoundation/9.2/sas sas
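
To confirm that the one-time setup worked, you can ask the shell where it finds the command. This is a small sketch assuming a POSIX-style Unix shell; the function name on_path is made up for illustration.

```shell
# Hypothetical helper: report whether a command can be found
# through the search path, and where.
on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 is not on the search path"
        return 1
    fi
}
```

For example, "on_path sas" should report the location of the executable (or of the symbolic link) once the path has been set up.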

Or, to add the SAS install path to your Windows search path (this might vary slightly, under different versions of Windows)...


   Start -> Settings -> Control Panel
   In the Control Panel "classic view", double-click the "System" icon.
   Within the "System Properties" GUI, select the "Advanced" tab.
   On the "Advanced" tab, click the "Environment Variables" button
    (it's near the bottom of the GUI).
   Under "System variables", scroll down and select "Path".
   And then click the "Edit" button.
   In the tiny "Variable value:" window, scroll to the right
   and add the following to the end of the line ...
   (if you have installed sas in a different location, then
   use the path to the directory containing your sas.exe instead)
   
      ;c:\program files\sas\sasfoundation\9.2\
   
   Be sure to include the semicolon (;) to separate this path from
   all the other paths.
   
   Click "Ok", "Ok", and "Ok".

Then you can run SAS jobs in batch (ie, from the command line) as follows.


   cd {to the directory containing the foo.sas job}
   sas foo.sas

Combine that with command-line recall, and you have become a lean-mean batch-SAS machine! (I'm way quicker re-submitting my SAS jobs this way, than in DMS SAS!)
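
If you run batch SAS from a Unix shell, the run-then-check-the-log cycle can be wrapped in a small function. This is a rough sketch under stated assumptions, not a definitive recipe: the names runsas and SAS_CMD are hypothetical, the job is assumed to be run from its own directory so that foo.log lands in the current directory, and error lines are assumed to start with the word ERROR.

```shell
# Hypothetical wrapper (not part of SAS): run one job in batch from the
# current directory, then say whether its log contains any ERROR lines.
# SAS_CMD is an assumed variable for overriding the command name.
runsas() {
    job="$1"                            # e.g. foo.sas
    log="$(basename "$job" .sas).log"   # foo.sas -> foo.log in the current directory
    rm -f "$log"                        # make sure an old log is not mistaken for a new one
    # Ignore the exit status here; the log is inspected instead.
    "${SAS_CMD:-sas}" "$job" || true
    if [ ! -f "$log" ]; then
        echo "no log written: $log"
        return 1
    fi
    if grep -q '^ERROR' "$log"; then
        echo "errors found in $log"
        return 1
    fi
    echo "$job ran clean"
}
```

With this defined, "runsas foo.sas" plus command-line recall gives a one-keystroke resubmit that also tells you whether the log needs a look.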

What Fileserver?

Working in a large computer-savvy company certainly has its advantages when it comes to infrastructure! I house all my SAS code on a fileserver in the data center, rather than on my local hard drive. More specifically, it's a NetApp (Network Appliance) fileserver. The NetApp is very fast, has redundant hot-swappable RAID drives, hot-swappable power supplies, does dynamic snapshot backups, and maybe most importantly supports multiple protocols so both Windows and Unix machines can access the files conveniently in their native format (CIFS for Windows, and NFS for Unix). [Note: this is not an 'endorsement' for NetApp fileservers - I am merely stating what I use, and describing the features that I find useful.]

This allows me to easily get to the same SAS code using the Unix path ~realliso/public_html/democd42/ and the Windows path U:\public_html\democd42\, for example. And there are also various other NFS and UNC pathnames I can use to get to my files, if the occasion calls for it. This provides a *very* flexible working environment!

Which Editor?

Which editor should you use to write & edit your SAS code? ... whichever one you're most productive with!

If you run your SAS jobs in batch mode, this frees you from being tied to the editor that's included in DMS SAS. You can leave a non-SAS editor open in one window, and submit your SAS jobs in batch in another window, instead of using DMS.

Which editor do I use? ... "vi" on Unix.

I run a Unix "xterm" and display it on my PC (using the Hummingbird eXceed X-server), and then use the "vi" editor in the xterm. When I save/write my files to the multi-protocol NetApp fileserver, I have immediate access to the updated sas file in my Windows DOS window (which is accessing the same folder & files via a CIFS path), and I then run the SAS job in batch. I leave the file open in the "vi" editor window, save/write it to disk, run it in batch from the DOS window, view the output, and repeat as needed. [Note: this is not an 'endorsement' for the Hummingbird eXceed X-server - I am merely stating what I use.]

"vi" is a bit old-school, and might not be for everyone. I like "vi" because I'm very good at using it - in particular, I'm very productive when combining the find command '/', the commands to change text such as 'cw' (change word), the 'n' command (to find the next occurrence), and the '.' command (to repeat the last 'cw', etc). As opposed to the GUI point-and-click editors, I like "vi" because I can keep my fingers on the keyboard 100% of the time.

My Code?

You could just about say that I've only written one SAS job in my (almost) 20 year SAS career! ... After writing that first program, all the other ones were just a "slight modification" of it :)

The following small example demonstrates how I set up the SAS code for all my web-based SAS examples these days. Note that I set up a 'name' at the top of the job, and then use that for the html file name (in the body=), and the gif name (in the name=). I also tell ods to write the output to the current directory (ie, '.').

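   %let name=foo;          /* base name shared by the html and gif files */
   filename odsout '.';    /* '.' = write output to the current directory */
   
   GOPTIONS DEVICE=gif;
   ODS LISTING CLOSE;
   /* the first '.' ends the macro variable name, the second is the literal dot */
   ODS HTML path=odsout body="&name..htm";
   
   /* simple map of the US; name= sets the name of the gif */
   proc gmap data=maps.us map=maps.us;
    id state;
    choro state / des='' name="&name";
   run;
   
   quit;
   ODS HTML CLOSE;
   ODS LISTING;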
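
After a batch run of a job like this, the current directory should contain an html file and a gif that share the job's 'name'. As a quick sanity check from a Unix shell, something like the following sketch could be used; the function name check_outputs is hypothetical, and it assumes the files are named foo.htm and foo.gif for name=foo.

```shell
# Hypothetical check (not part of SAS): confirm that a batch run left
# the expected web output files in the current directory.
check_outputs() {
    name="$1"      # the same value used in "%let name=...;"
    missing=0
    for f in "$name.htm" "$name.gif"; do
        if [ -f "$f" ]; then
            echo "ok: $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"   # non-zero if any file is missing
}
```

For example, "check_outputs foo" reports on foo.htm and foo.gif, and returns a non-zero status if either one is absent.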

Viewing Output?

With DMS SAS, you'd typically view your output in the DMS SAS Graphics window. But if you are running your SAS jobs in batch, you won't have a SAS Graphics window ... not to worry though! Just use a web browser instead!

Web browsers these days can pretty much view any type of SAS Graphics output - gif, png, java, activex, svg, pdf, etc. And, in particular, if you do all your development of java & activex SAS Graphs using the "internal browser" in DMS SAS, you're probably going to have some problems/differences when you try to deploy your graphs on a real web server/browser later ... might as well use a real web server & browser all along the way, so you'll know that your graphs work "from the get-go" :)

I almost always use "ods html" and create web-capable output (which supports html tags for drilldown and rollover text), and save the files on a NetApp, in a directory that just so happens to be accessible via a web server. Therefore the 'foo.gif' and 'foo.htm' files that are written to U:\public_html\ (or via Unix in ~/public_html/) are also viewable via the web URL http://sww.sas.com/~realliso/

In Summary

So, in a nutshell, I have all my files on a multi-protocol (NetApp) file server, so that I can get to them from Unix, Windows, and a web URL. I bring up a Unix xterm to edit my SAS jobs (using 'vi'), I run the sas jobs in batch mode from the DOS command line, and set up my jobs to write the output into the current directory (or a relative path to the current directory), and I view the output using the Internet Explorer web browser. These 3 windows basically provide my own "DMS SAS", in a way that I feel most productive.

Edit SAS jobs using your favorite editor
Submit your SAS jobs in batch from the command line
View your output in a web browser
(wash, rinse, repeat)