SAS Creating Tables
In this post, I’ll briefly detail my method for storing tables in SAS using datalines. This can keep assumptions/input close to the programs that use the information while simultaneously allow for version control.
Below is an example of a pending claims dataset for life insurance that I will reference in additional posts.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
data claims_pending;
input
@1 claim_number $
@9 date_pending :mmddyy10.
@23 date_death :mmddyy10.
@37 date_submitted :mmddyy10.
@51 name_insured_first $
@62 name_insured_last $
@73 name_beneficiary_first $
@84 name_beneficiary_last $
@97 claim_amount comma16.2
;
format
date_pending mmddyy10.
date_death mmddyy10.
date_submitted mmddyy10.
claim_amount dollar16.2
;
datalines;
0001 01/31/2021 12/31/2020 01/02/2021 John Smith Jane Smith 50,000.00
0002 01/31/2021 11/15/2020 01/05/2021 Tom Johnson Mary Johnson 100,000.00
0003 01/31/2021 12/05/2020 01/05/2021 Sarah Lee John Lee 75,000.00
0004 01/31/2021 12/29/2020 01/10/2021 David Kim Emily Kim 250,000.00
0005 01/31/2021 12/15/2020 01/15/2021 Mike Smith Susan Smith 150,000.00
0006 01/31/2021 12/20/2020 01/20/2021 Anne Lee Grace Lee 200,000.00
0007 01/31/2021 10/01/2020 01/25/2021 Ashley Park Jessica Park 1,000,000.00
0008 01/31/2021 11/17/2020 01/30/2021 James Kim Sarah Kim 30,000.00
0009 01/31/2021 12/25/2020 01/31/2021 Emily Lee John Lee 80,000.00
0010 01/31/2021 01/10/2021 01/31/2021 Robert Kim Chris Kim 120,000.00
0001 02/28/2021 12/31/2020 02/02/2021 John Smith Jane Smith 50,000.00
0002 02/28/2021 11/15/2020 02/05/2021 Tom Johnson Mary Johnson 100,000.00
0003 02/28/2021 12/05/2020 02/05/2021 Sarah Lee John Lee 75,000.00
0004 02/28/2021 12/29/2020 02/10/2021 David Kim Emily Kim 250,000.00
0005 02/28/2021 12/15/2020 02/15/2021 Mike Smith Susan Smith 150,000.00
0006 02/28/2021 12/20/2020 02/20/2021 Anne Lee Grace Lee 200,000.00
0011 02/28/2021 01/20/2021 02/02/2021 Rosa Lopez Juan Lopez 25,000.00
0012 02/28/2021 02/01/2021 02/05/2021 Oscar Gomez Maria Gomez 30,000.00
0013 02/28/2021 01/25/2021 02/05/2021 Isabella Rivera Carlos Rivera 80,000.00
0014 02/28/2021 02/10/2021 02/15/2021 Lucas Hernandez Ana Hernandez 120,000.00
;
run;
Input - This instructs SAS how to read in the values
- The “@” symbol indicates what position to begin reading the specified value. Using this eliminates the need for delimiters and allows us to explicitely reference where the data begins.
- All field names have common descriptions followed by increasing detail. This facilitates concise keep/drop dataset options (e.g. “drop name:” would drop all four name columns, or “keep claim:” would keep only claim number and amount)
- The last item of the input step indicates what type the data is; however, a colon is necessary for the date variables as discussed in the SAS paper That Mysterious Colon
@ - The position of characters can easily be obtained from Notepad after enabling the Status Bar (View->Status Bar). The position of the cursor will not be featured at the bottom of the window (e.g. For claim 0007 if you click right before the 1,000,000 amount it will say “Ln 1, Col 97”)
Format - This instructs SAS how to display the information
- SAS dates are numeric, here we tell SAS to display those as MM/DD/YYYY
- Claim_amount is formatted to include the currency
Datalines - This is the information that SAS reads in
This section is not white-space agnostic because it is loading based on character position; additionally, spaces and tabs are not interchangeable.
Below is a dataset of paid claims which will be referenced in future posts such as this introduction to PROC SQL.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
data claims_paid;
input
@1 claim_number $
@9 date_paid :mmddyy10.
@23 date_death :mmddyy10.
@37 date_submitted :mmddyy10.
@51 name_insured_first $
@62 name_insured_last $
@73 name_beneficiary_first $
@84 name_beneficiary_last $
@97 claim_amount comma16.2
;
format
date_pending mmddyy10.
date_death mmddyy10.
date_submitted mmddyy10.
claim_amount dollar16.2
;
datalines;
0008 02/04/2021 11/17/2020 01/30/2021 James Kim Sarah Kim 15,000.00
0008 02/09/2021 11/17/2020 01/30/2021 James Kim Sarah Kim 15,000.00
0009 02/04/2021 12/25/2020 01/31/2021 Emily Lee John Lee 50,000.00
0010 02/19/2021 01/10/2021 01/31/2021 Robert Kim Chris Kim 120,000.00
;
run;