Recency, Frequency, Monetary (RFM) variable generation using Danish National Health Register data.

Description

Below is adaptable SAS code for generating recency, frequency and monetary variables. The algorithm can be applied in many contexts in which records contain dates ('day of entry', 'day of exit', 'day of visit'), an ID number and possibly revenue or cost.

Citation

Please cite this code as: Laier, G.H. (2016) Recency, Frequency and Monetary SAS programming script [computer software]. Denmark. Link: http://hellmund.blogspot.dk/2016/02/recency-frequency-monetary-rtf-variable.html

Thanks! Gunnar Hellmund Laier, PhD, MSc

Explanation

In this context we form variables for analyses of Danish National Register data. The variables contain information on contacts, hospitalizations and days in hospital within 14, 30, 91 and 180 days before a hospitalization or contact.
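As a minimal sketch of what "within 14, 30, 91 and 180 days" means for a single pair of dates: the two dates are hypothetical, and the flag names in14 to in180 are used for illustration only and do not appear in the program below.

```sas
data _null_;
  inddto = '01FEB2015'd;    /* hypothetical entry date of the current record */
  pdt    = '05JAN2015'd;    /* hypothetical exit date of an earlier record   */
  timedif = inddto - pdt;   /* days from the earlier exit to the current entry */

  /* Each flag is 1 when the earlier record falls inside that window. */
  in14  = (pdt < inddto) and (timedif <= 14);
  in30  = (pdt < inddto) and (timedif <= 30);
  in91  = (pdt < inddto) and (timedif <= 91);
  in180 = (pdt < inddto) and (timedif <= 180);

  /* Write the values to the log. */
  put timedif= in14= in30= in91= in180=;
run;
```

Here the earlier record ended 27 days before the current entry, so it counts towards the 30, 91 and 180 day windows but not the 14 day window.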

Key variables: cpr (personal identification number), pattype (patient type; the code below treats 0 and 1 as inpatient and 2 as outpatient), inddto (day of entry), uddto (day of exit), ambdto (day of visit), sengdage (days in bed).
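The program reads a data set named stalist and processes it with a BY statement on cpr, so the input must at least be sorted by cpr. The sketch below builds a toy input and sorts it. All values are made up, and the names stalist_raw and eventdto are hypothetical helpers. Sorting chronologically within each cpr is an assumption about how the input is meant to be ordered, not something the source states.

```sas
/* Toy input with hypothetical values; real register extracts will differ. */
data stalist_raw;
  input cpr $ pattype inddto :date9. uddto :date9. ambdto :date9. sengdage;
  /* Hypothetical helper: one event date per row, used only for sorting. */
  if pattype = 2 then eventdto = ambdto;
  else eventdto = inddto;
  format inddto uddto ambdto eventdto date9.;
  datalines;
A 0 01JAN2015 05JAN2015 . 4
A 2 . . 20JAN2015 .
A 0 01FEB2015 03FEB2015 . 2
B 2 . . 10MAR2015 .
;
run;

/* BY-group processing with first.cpr needs the input sorted by cpr. */
/* Sorting by event date within cpr is an assumption of this sketch. */
proc sort data=stalist_raw out=stalist;
  by cpr eventdto;
run;
```

Note that the helper eventdto is carried through to any data set built from stalist unless it is dropped.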

Data step program

data calc.RFMdata(drop=dto_hist ind_hist pat_hist ind k pdt pat i inddto_ uddto_);
  set stalist;
  by cpr;

  *Exit date of the most recent earlier record, filled in by the loop below;
  prev=.;

  *Histories of earlier records for the same cpr, newest first, separated by a semicolon character;
  length dto_hist $4000;
  length ind_hist $4000;
  length pat_hist $4000;
  retain dto_hist ind_hist pat_hist;

  timedif=0;
  *Number of contacts (any patient type) within last 14 days;
  tnum14=0;
  *Within last 30 days;
  tnum30=0;
  *Within last 91 days;
  tnum91=0;
  *Within last 180 days;
  tnum180=0;

  *Inpatient hospitalizations (contacts/days in bed) within last 14 days;
  num14=0; seng14=0;
  *Within last 30 days;
  num30=0; seng30=0;
  *Within last 91 days;
  num91=0; seng91=0;
  *Within last 180 days;
  num180=0; seng180=0;

  *Working copies of the exit and entry dates;
  uddto_=uddto;
  inddto_=inddto;
  format inddto_;
  format uddto_;

  *Outpatient visit: the visit date is both entry and exit, with no days in bed;
  if pattype EQ 2 then do;
    uddto_=ambdto;
    inddto_=ambdto;
    sengdage=0;
  end;

  *Inpatient record without exit date: use entry date plus days in bed;
  if missing(uddto_) then do;
    if (pattype in (0 1)) AND not(missing(sengdage)) then uddto_=inddto_+sengdage;
  end;

  if first.cpr then do;
    *First record for this cpr: start the histories;
    dto_hist=strip(put(uddto_,8.));
    ind_hist=strip(put(inddto_,8.));
    pat_hist=strip(put(pattype,8.));
  end;
  else do;
    *Loop over all earlier records for this cpr;
    k=count(strip(dto_hist),';');
    do i = 1 to k+1;
      pdt=input(scan(dto_hist,i,';'),8.);
      ind=input(scan(ind_hist,i,';'),8.);
      pat=input(scan(pat_hist,i,';'),8.);

      if pdt LT inddto_ then do;
        if missing(prev) then prev=pdt;
        timedif=inddto_-pdt;
        *A missing pdt compares as less than inddto_ and is reported here;
        if missing(pdt) then do;
          put "ERROR Missing pdt value, obs number ";
          put _n_;
        end;

        *Earlier record ended more than 91 and at most 180 days before entry;
        if (timedif LE 180) AND (timedif GT 91) then do;
          tnum180=tnum180+1;
          if pat in (0 1) then do;
            num180=num180+1;
            seng180=seng180+pdt-max(ind,inddto_-180);
          end;
        end;

        *More than 30 and at most 91 days before entry;
        if (timedif LE 91) AND (timedif GT 30) then do;
          tnum180=tnum180+1;
          tnum91=tnum91+1;
          if pat in (0 1) then do;
            num91=num91+1;
            num180=num180+1;
            seng91=seng91+pdt-max(ind,inddto_-91);
            seng180=seng180+pdt-max(ind,inddto_-180);
          end;
        end;

        *More than 14 and at most 30 days before entry;
        if (timedif LE 30) AND (timedif GT 14) then do;
          tnum180=tnum180+1;
          tnum91=tnum91+1;
          tnum30=tnum30+1;
          if pat in (0 1) then do;
            num30=num30+1;
            num91=num91+1;
            num180=num180+1;
            seng30=seng30+pdt-max(ind,inddto_-30);
            seng91=seng91+pdt-max(ind,inddto_-91);
            seng180=seng180+pdt-max(ind,inddto_-180);
          end;
        end;

        *At most 14 days before entry;
        if timedif LE 14 then do;
          tnum14=tnum14+1;
          tnum30=tnum30+1;
          tnum91=tnum91+1;
          tnum180=tnum180+1;
          if pat in (0 1) then do;
            num14=num14+1;
            num30=num30+1;
            num91=num91+1;
            num180=num180+1;
            seng14=seng14+pdt-max(ind,inddto_-14);
            seng30=seng30+pdt-max(ind,inddto_-30);
            seng91=seng91+pdt-max(ind,inddto_-91);
            seng180=seng180+pdt-max(ind,inddto_-180);
          end;
        end;
      end;
    end;

    *Prepend the current record to the histories;
    dto_hist=strip(put(uddto_,8.))||';'||strip(dto_hist);
    ind_hist=strip(put(inddto_,8.))||';'||strip(ind_hist);
    pat_hist=strip(put(pattype,8.))||';'||strip(pat_hist);
  end;

run;
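
Once the data step has run, the generated variables can be inspected. This is a minimal sketch: it assumes that the calc library is already assigned with a LIBNAME statement and that the input contained the key variables listed above. The nesting check is an illustrative sanity test and not part of the original program.

```sas
/* Print the generated variables for a quick visual check. */
/* prev holds a SAS date value, so it is shown with a date format. */
proc print data=calc.RFMdata;
  var cpr pattype inddto uddto ambdto prev timedif
      tnum14 tnum30 tnum91 tnum180
      num14 num30 num91 num180
      seng14 seng30 seng91 seng180;
  format prev date9.;
run;

/* Every record counted in a short window is also counted in the longer */
/* ones, so the contact counts should never decrease from 14 to 180 days. */
data _null_;
  set calc.RFMdata;
  if not (tnum14 <= tnum30 <= tnum91 <= tnum180) then
    put "WARNING: window counts are not nested, obs number " _n_;
run;
```

If the second step writes nothing to the log, the window counts are consistent for every record.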

©2020 by Danish Institute for Data Science.