Description
Below adaptable SAS code for generation of recency, frequency and monetary variables. Algorithm may be applied in a lot of contexts in which records contain dates (‘day of entry’, ‘day of exit’, ‘day of visit’) id number and maybe even revenue or cost.
Citation
Please cite this code as: Laier, G.H. (2016) Recency, Frequency and Monetary SAS programming script [computer software]. Denmark. Link: http://hellmund.blogspot.dk/2016/02/recency-frequency-monetary-rtf-variable.html
Thanks! Gunnar Hellmund Laier, PhD, MSc
Explanation
In this context we form variable for analyses of Danish National Register data and form variables containing information on contacts, hospitalizations and days in hospital 14, 30, 91 and 180 days before a hospitalization or contact.
Key variables: cpr (security number), pattype (patient type, in- or outpatient), inddto (day of entry), uddto (day of exit), ambdto (day of visit).
Data step program
data calc.RFMdata(drop=dto_hist ind_hist pat_hist ind k pdt pat i inddto_ uddto_); set stalist; retain dto_hist ind_hist pat_hist; by cpr;
prev=.; length dto_hist $4000; length ind_hist $4000; length pat_hist $4000;
timedif=0; *Number of hospitalizations within last 14 days; tnum14=0; *Number of hospitalizations within last 30 days; tnum30=0; *Number of hospitalizations within last 91 days; tnum91=0; *Number of hospitalizations within last 180 days; tnum180=0;
*Inpatient hospitalizations (contacts/days in bed); num14=0; seng14=0; *Within last 30 days; num30=0; seng30=0; *Within last 91 days; num91=0; seng91=0; *Within last 180 days; num180=0; seng180=0;
uddto_=uddto; inddto_=inddto; format inddto_; format uddto_;
if pattype EQ 2 then do; uddto_=ambdto; inddto_=ambdto; sengdage=0; end;
if missing(uddto_) then do; if (pattype in (0 1)) AND not(missing(sengdage)) then uddto_=inddto_+sengdage; end;
if first.cpr then do; dto_hist=strip(put(uddto_,8.)); ind_hist=strip(put(inddto_,8.)); pat_hist=strip(put(pattype,8.)); end; else do;
k=count(strip(dto_hist),’;’); do i = 1 to k+1;
pdt=input(scan(dto_hist,i,’;’),8.); ind=input(scan(ind_hist,i,’;’),8.); pat=input(scan(pat_hist,i,’;’),8.);
if pdt LT inddto_ then do;
if missing(prev) then prev=pdt;
timedif=inddto_-pdt; if missing(pdt) then do; put”ERROR Missing pdt value, obs number “; put _n_; end;
if (timedif LE 180) AND (timedif GT 91) then do; tnum180=tnum180+1; if pat in (0 1) then do; num180=num180+1; seng180=seng180+pdt-max(ind,inddto_-180); end; end; if (timedif LE 91) AND (timedif GT 30) then do; tnum180=tnum180+1; tnum91=tnum91+1; if pat in (0 1) then do; num91=num91+1; num180=num180+1; seng91=seng91+pdt-max(ind,inddto_-91); seng180=seng180+pdt-max(ind,inddto_-180); end; end; if (timedif LE 30) AND (timedif GT 14) then do; tnum180=tnum180+1; tnum91=tnum91+1; tnum30=tnum30+1; if pat in (0 1) then do; num30=num30+1; num91=num91+1; num180=num180+1; seng30=seng30+pdt-max(ind,inddto_-30); seng91=seng91+pdt-max(ind,inddto_-91); seng180=seng180+pdt-max(ind,inddto_-180); end; end; if timedif LE 14 then do; tnum14=tnum14+1; tnum30=tnum30+1; tnum91=tnum91+1; tnum180=tnum180+1; if pat in (0 1) then do; num14=num14+1; num30=num30+1; num91=num91+1; num180=num180+1; seng14=seng14+pdt-max(ind,inddto_-14); seng30=seng30+pdt-max(ind,inddto_-30); seng91=seng91+pdt-max(ind,inddto_-91); seng180=seng180+pdt-max(ind,inddto_-180); end; end; end;
end;
dto_hist=strip(put(uddto_,8.))||’;’||strip(dto_hist); ind_hist=strip(put(inddto_,8.))||’;’||strip(ind_hist); pat_hist=strip(put(pattype,8.))||’;’||strip(pat_hist); end;
run;
Comments