BrCa RAM Readme

ReadMe.fil                                                                  12/14/12

Documentation and user guide for the SAS macro that projects absolute risk of breast
cancer based on the relative risk models for (white, Hispanic, other), Asian-American,
or African-American women. The 1-AR values, composite breast cancer incidences,
competing hazards, handling of missing covariate values, and covariate editing
procedures follow the NCI BrCa Risk Assessment Tool (NCI BCRAT).

In this release of the SAS macro, in addition to the absolute risk projection for the
woman under investigation, an associated race-specific absolute risk projection for
an "average" woman is also provided for each woman.  This quantity is included to
follow the NCI Breast Cancer Risk Assessment Tool, which provides an "average" woman
risk projection as well.

Lifetime risk for a woman can be obtained by setting her "projection age" to 90.
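
A minimal sketch of this, assuming the input data set and variable names used in the
example program (ExampleIn, ProjtnAge).  LifetimeIn is a hypothetical data set name
introduced only for illustration:

```sas
/* Sketch only: copy the input file and set every projection age to 90 */
data LifetimeIn;
    set ExampleIn;
    ProjtnAge = 90;      /* T2 = 90 requests a lifetime risk projection */
run;
```

LifetimeIn would then be named in the macro parameter In_File.  Records whose
initial age is not below 90 would still fail the T1 < T2 requirement.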

A simple 3-step example program  (BCRAM_example.sas)  illustrates the use of the SAS macro
(Br)east (Ca)ncer (R)isk (A)ssessment (M)acro  --  BrCa_RAM.

Step 1:  The included SAS program  BCRAM_example.sas  reads the supplied data file
        "Sample.in", which contains the Gail BrCa risk covariates and projection
        age interval for 26 hypothetical women.  It then saves a temporary SAS
        system file named "ExampleIn" to be used as input to the SAS macro
        BrCa_RAM:

        data    ExampleIn;             *** name of the sas system file which the macro parameter
                                          &In_File  should point to upon macro invocation;

            infile 'Sample.in'    firstobs=9;       *** "Sample.in"  is the RR covariate input file
                                                         firstobs=9  skips first 8 header
                                                         records on input file  "Sample.in";
                *** SAS variable names;

            input   IDD

                    InitalAge
                    ProjtnAge

                    NBiop
                    HP

                    AgeMenarchy
                    AgeFstLive
                    Num_Rels

                    Ethnicity;
        run;

Step 2:  The SAS program  BCRAM_example.sas  runs the SAS macro  BrCa_RAM:

        %include  "BrCa_RAM";                   *** include the sas MACRO BrCa_RAM;

        The SAS macro BrCa_RAM is then invoked to perform the BrCa projections.

        The temporary sas input  file is set to   "ExampleIn".
        The temporary sas output file is set to   "ExampleOut".

        The macro parameters  WID, T1, T2, N_Biop, HyperPlasia, AgeMen,
        Age1st, N_Rels, and Race point to their corresponding sas variables
        on the sas file  "ExampleIn",  namely
        IDD, InitalAge, ProjtnAge, NBiop, HP, AgeMenarchy, AgeFstLive,
        Num_Rels and Ethnicity respectively.

        The macro parameter AbsRsk points to the sas variable Absolute_Risk which
        will be added to the output sas file  "ExampleOut".  The output sas
        file will also contain all the variables on the input sas file.

                      Macro          pointing      SAS file name or
                      parameter      to            SAS variable name;

        %BrCa_RAM    (In_File        =             ExampleIn       ,
                      Out_File       =             ExampleOut      ,

                      WID            =             IDD             ,

                      T1             =             InitalAge       ,
                      T2             =             ProjtnAge       ,

                      N_Biop         =             NBiop           ,
                      HyperPlasia    =             HP              ,

                      AgeMen         =             AgeMenarchy     ,
                      Age1st         =             AgeFstLive      ,
                      N_Rels         =             Num_Rels        ,

                      Race           =             Ethnicity       ,
                      CharRace       =             CharRace        ,

                      RR_Star1       =             RR_Star1        ,
                      RR_Star2       =             RR_Star2        ,

                      AbsRsk         =             Absolute_Risk);

Step 3:   It then lists the contents of the temporary output SAS system file
         "ExampleOut", which contains the projected absolute risk as well as the
          relative risk covariate values.  Note that any further processing
          requiring the projected absolute risk must be performed on the output
          SAS system file  "ExampleOut"  named in this sample program:

          data  ExampleOut;                  *** output file from macro, defined by pointing the;
          set   ExampleOut;                  *** macro parameter  &Out_File  to  "ExampleOut";

                file print;

                if (_N_ eq 1) then do;
                   put "                                                            ";
                   put "                        #  Hypr    HP  Age  Age    #       "
                       "      RR      RR     Abs";
                   put "     ID    T1    T2  Biop  plas    RR  Men  1st  Rel   Race"
                       "  Age<50  Age>50    Risk(%)";
                   put " ";
                end;

                *** all variables below take on their SAS variable names, not their macro names;
                *** see SAS variable names defined in Step 1;

                if (_n_ le 100) then

                   put IDD                  7.0
                       InitalAge            6.1
                       ProjtnAge            6.1

                       NBiop                6.0
                       HP                   6.0
                       R_Hyp                6.2

                       AgeMenarchy          5.0
                       AgeFstLive           5.0
                       Num_Rels             5.0

                      "  "
                       Ethnicity            2.0
                      "="
                       CharRace        $char2.

                       RR_Star1             8.4
                       RR_Star2             8.4

                       Absolute_Risk       10.4;
          run;

Detailed description of the operation and output items from the SAS macro BrCa_RAM:

Input data:
----------
In_File= should "point" to a SAS data set containing all the required input data
items needed  to perform risk projections, such as initial age, projection age, BrCa
relative risk covariates and race.  See the paragraph "Input data items ... " below,
for a detailed description of all required data items.

Output data:
-----------
Out_File= should "point" to a SAS output data set which will contain the projected
absolute risk of BrCa as well as the original input data items.

Macro structure:
---------------
       Macro          Macro
       name           parameters               "points" to SAS names

%macro  BrCa_RAM      (In_File     =,            name of input  sas data set

                      Out_File    =,            name of output sas data set

                      WID         =,            ID #   1,2,3 ...  positive integers

                      T1          =,            initial age,    age at beginning of
                                                                projection interval

                      T2          =,            projection age, age at end of
                                                                projection interval

                      N_Biop      =,            # biopsies performed
                      HyperPlasia =,            did biopsy exhibit atypical hyperplasia?

                      AgeMen      =,            age at menarche
                      Age1st      =,            age at 1st live birth
                      N_Rels      =,            # 1st degree relatives with brca

                      Race        =,            race
                      CharRace    =,            2 character abbreviation for race

                      RR_Star1    =,            rr for ages lt 50
                      RR_Star2    =,            rr for ages ge 50

                      AbsRsk      =);           projected absolute risk of brca (%)

Appropriate SAS file and SAS variable names must be associated with all macro parameters
on the invocation of the SAS macro  "BrCa_RAM".

For example, coding  "In_File = AARPin"  tells the macro that the user-created
SAS file  "AARPin"  is to be used for input of variables.  Similarly, coding
"N_Biop = Num_Biops"  lets the macro know that the SAS variable "Num_Biops" in the
SAS input file  "AARPin"  contains the count of biopsies performed.

To invoke the SAS macro in your SAS program, an %include statement that points to the
SAS macro  "BrCa_RAM"  must be coded in your program.

For example:

the statement: %include "BrCa_RAM";                points to the sas macro  BrCa_RAM
                                                  stored in your current directory

the statement: %include "c:\sas.macro\BrCa_RAM";   points to the sas macro  BrCa_RAM
                                                  stored in the directory
                                                  c:\sas.macro

Input data items needed to project for BrCa absolute risk and consistency requirements:

Macro
parameter      Definition                    Valid values

WID            ID # for each woman           positive integers 1,2,3....

T1             Initial         age           all real numbers T1 in [20,90)
T2             BrCa projection age           all real numbers T2 such that T2 > T1

              CONSTRAINT on T1 and T2:      20 <= T1 < T2 <= 90

N_Biop         # of biopsies                 0,1,2 ...       99=unk (99 recoded to 0)

HyperPlasia    Did biopsy display            0=no,  1=yes,   99=unk or no biopsy
              atypical hyperplasia?

AgeMen         Age at menarche               positive integer age less than or equal to T1, 99=unk

Age1st         Age at first live birth       integer age greater than or equal to age at menarche
                                            and less than or equal to initial age.

                                            98=nulliparous (no live birth),
                                            99=unk

N_Rels         # 1st degree relatives        0,1,2 ...       99=unk
              with BrCa

Race           Race                          1=Wh   white 1983-87 SEER rates (rates used by NCI BrCa Risk Assessment Tool)
                                            2=AA   african-american,
                                            3=Hi   hispanic,
                                            4=NA   other                    (native americans and unknown race)
                                            5=Wo   white 1995-03 SEER rates (rates used for further research)

                                            6=Ch   chinese
                                            7=Ja   japanese
                                            8=Fi   filipino
                                            9=Hw   hawaiian
                                           10=oP   other pacific islander
                                           11=oA   other asian

                                            Note that risks for Hispanic women and for
                                            women of other ethnicity are based on white
                                            women's log relative risks.  Risks for Hispanic
                                            women use Hispanic SEER rates, while risks for
                                            other women use white women's SEER rates.

              NOTE: Even though it is allowed, as a matter of good data processing
                    practice it is recommended NOT to mix the two different rates for
                    white women during the same analysis.  If a comparison
                    of the change in absolute risk from using the two different
                    rates is desired, two analysis runs should be performed,
                    once with one rate (i.e. Race=1, 1983-87 SEER rates) and
                    once with the other rate (i.e. Race=5, 1995-2003 SEER rates).
                    The rates used by the NCI Breast Cancer Risk Assessment Tool are the
                    1983-1987 SEER rates.
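
The age constraint and the valid race codes above can be screened before the macro is
run.  The following is only a sketch, not part of BrCa_RAM: it assumes the data set
and variable names of the example program (ExampleIn, InitalAge, ProjtnAge,
Ethnicity); RangeCheck, Bad_Age and Bad_Race are hypothetical names.

```sas
/* Sketch only: flag records that break the T1/T2 constraint or the race coding */
data RangeCheck;
    set ExampleIn;

    /* 1 = violates 20 <= T1 < T2 <= 90, 0 = satisfies it */
    Bad_Age  = not (20 <= InitalAge and InitalAge < ProjtnAge and ProjtnAge <= 90);

    /* 1 = race code is not one of the valid values 1,2,...,11 */
    Bad_Race = not (Ethnicity in (1,2,3,4,5,6,7,8,9,10,11));
run;

/* Count how many records carry each flag */
proc freq data=RangeCheck;
    tables Bad_Age Bad_Race;
run;
```

A missing age compares as smaller than any number in SAS, so records with a missing
T1 or T2 are flagged as well.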

Recoding and checking of relative risk covariate values performed by  "BrCa_RAM":

                                    raw value                       recoded to

N_Biop: # biopsies                   0 or 99                                  0
                                    1                                        1
                                    2,3,4 ... and not 99                     2

AgeMen: age at menarche              14,15,16 ... 99                          0
                                    12,13                                    1
                                    11 and younger                           2

Age1st: age at 1st live birth        19 and younger  or  99                   0
                                    20,21,22,23,24                           1
                                    25,26,27,28,29  or  98=(nulliparous)     2
                                    30,31,32 ...    and not 98 and not 99    3

N_Rels: # 1st degree rel with BrCa   0 or 99                                  0
                                    1                                        1
                                    2,3,4 ... and not 99                     2
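
The recoding table above can be written as a data step.  This is a sketch under
stated assumptions, not the macro's own code: it uses the variable names of the
example program, the category names NB_Cat, AM_Cat, AF_Cat and NR_Cat and the data
set name Recoded are hypothetical, and it applies neither the consistency checks nor
the adjustments for African-American women described in this document.

```sas
/* Sketch only: recode the raw covariates into the categories tabulated above */
data Recoded;
    set ExampleIn;

    /* N_Biop: 0 or 99 -> 0, 1 -> 1, 2 or more -> 2 */
    if NBiop in (0, 99)  then NB_Cat = 0;
    else if NBiop = 1    then NB_Cat = 1;
    else if NBiop >= 2   then NB_Cat = 2;

    /* AgeMen: 14 and older (including 99) -> 0, 12-13 -> 1, 11 and younger -> 2 */
    if AgeMenarchy >= 14               then AM_Cat = 0;
    else if AgeMenarchy in (12, 13)    then AM_Cat = 1;
    else if (0 < AgeMenarchy <= 11)    then AM_Cat = 2;

    /* Age1st: 19 and younger or 99 -> 0, 20-24 -> 1, 25-29 or 98 -> 2, 30+ -> 3 */
    if (AgeFstLive = 99) or (0 < AgeFstLive <= 19)        then AF_Cat = 0;
    else if (20 <= AgeFstLive <= 24)                      then AF_Cat = 1;
    else if (AgeFstLive = 98) or (25 <= AgeFstLive <= 29) then AF_Cat = 2;
    else if AgeFstLive >= 30                              then AF_Cat = 3;

    /* N_Rels: 0 or 99 -> 0, 1 -> 1, 2 or more -> 2 */
    if Num_Rels in (0, 99)  then NR_Cat = 0;
    else if Num_Rels = 1    then NR_Cat = 1;
    else if Num_Rels >= 2   then NR_Cat = 2;
run;
```

The special codes 98 and 99 are tested before the open-ended ranges so that they are
not mistaken for real ages or counts.  Values outside every listed range leave the
category at the SAS missing value.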

Consistency patterns for  # of Biopsies and Hyperplasia:

Requirement: (A) N_Biop = 0  or   99  then  HyperPlasia  MUST = 99 (not applicable)
             (B) N_Biop > 0 and < 99  then  HyperPlasia       =  0, 1 or 99 (unk)

If either of the above 2 REQUIREMENTS is violated, the absolute risk will be set to the
SAS missing value ".".  The consequences for the relative risk (RR) under the above two
requirements are:

(A) # biopsies = 0 or   99  &  Hyperplasia  =99 (not applicable)  multiplies RR by  1.00

(B) # biopsies > 0 and <99  &  Hyperplasia  = 0 ( no hyperplasia) multiplies RR by  0.93
                                            = 1 (yes hyperplasia) multiplies RR by  1.82
                                            =99 (unk hyperplasia) multiplies RR by  1.00
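
The two requirements and their RR factors can be sketched as a data step.  This is an
illustration only, not the macro's own code: it assumes the variable names of the
example program (NBiop, HP); HypCheck and Hyp_Mult are hypothetical names.

```sas
/* Sketch only: RR factor for hyperplasia, missing when (A) or (B) is violated */
data HypCheck;
    set ExampleIn;

    if NBiop in (0, 99) then do;
        /* Requirement (A): no or unknown biopsies, hyperplasia must be 99 */
        if HP = 99 then Hyp_Mult = 1.00;
        else            Hyp_Mult = .;
    end;
    else if (0 < NBiop < 99) then do;
        /* Requirement (B): at least one biopsy, hyperplasia must be 0, 1 or 99 */
        if      HP = 0  then Hyp_Mult = 0.93;
        else if HP = 1  then Hyp_Mult = 1.82;
        else if HP = 99 then Hyp_Mult = 1.00;
        else                 Hyp_Mult = .;
    end;
    else Hyp_Mult = .;       /* biopsy count itself is invalid or missing */
run;
```

In the sample data, ID=8 (1 biopsy, hyperplasia = 1) would receive 1.82, and ID=2
(biopsies = 99, hyperplasia = 1) would receive the missing value.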

Edit checking for remaining relative risk covariates, AgeMen, Age1st and N_Rels:

AgeMen:  age at menarche must be a positive integer less than or equal to initial age T1

  NOTE  For African-American women  AgeMen <= 11 are grouped with AgeMen = 12 or 13

Age1st:  age at 1st live birth must be a positive integer greater than or equal to AgeMen and
        less than or equal to initial age T1

  NOTE  For African-American women Age1st is not included in the RR model and all values
        for this variable are recoded to 0

N_Rels:  # of 1st degree relatives with BrCa must be 0,1,2...
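
The two notes for African-American women can be sketched as follows.  This is an
illustration only, not the macro's own code: it assumes the variable names of the
example program and the race code 2=AA; AA_Recode, AM_Cat and AF_Cat are hypothetical
names, and the edit checks above are not applied.

```sas
/* Sketch only: recoding of AgeMen and Age1st for African-American women */
data AA_Recode;
    set ExampleIn;
    where Ethnicity = 2;                     /* 2 = AA, African-American */

    /* AgeMen <= 11 is grouped with 12 or 13, so only two categories remain */
    if AgeMenarchy >= 14             then AM_Cat = 0;
    else if (0 < AgeMenarchy <= 13)  then AM_Cat = 1;

    /* Age1st is not in the RR model: every value is recoded to 0 */
    AF_Cat = 0;
run;
```

For ID=3 in the sample data (AgeMen = 10, Age1st = 20) this gives the categories
1 and 0.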

Following is a listing of the sample raw input data set "Sample.in"
(column heading included for clarity):

                           Num    Hyp    Age    Age    Num
 IDD       T1       T2    Biop   Plas    Men    1st    Rel        Race

   1     45.2     53.3      99     99     10     20      1           0
   2     45.2     53.3      99      1     10     20      1           1
   3     45.2     53.3      99      0     10     20      1           2
   4     45.2     53.3       0     99     10     20      1           3
   5     45.2     53.3       1     99     10     20      1           4
   6     45.2     53.3       1     99     14     19      1           5
   7     45.2     53.3      99     99     99     19      1           6
   8     45.2     53.3       1      1     14     19      1           7
   9     45.2     53.3      99      1     14     99      1           8
  10     45.2     53.3       1      0     14     19      1           9
  11     45.2     53.3      99      0     99     99      1          10
  12     45.2     53.3       0      0     14     19      1          11
  13     45.2     53.3       0     99     10     20      1          12
  14     45.2     53.3       0      1     10     20      1           0
  15     45.2     53.3       0      0     10     20      1           1
  16     45.2     53.3       1      0     10     20      1           2
  17     35.0     40.0       4     99     11     25      0           3
  18     35.0     40.0       4     99     11     98      0           4
  19     35.0     40.0       4     99     11     10      0           5
  20     35.0     40.0       4     99     36     25      0           6
  21     27.0     90.0      99     99     13     22      0           7
  22     27.0     90.0      99     99     13     22     99           8
  23     18.0     26.0      99     99     13     22     99           9
  24     27.0     26.0      99     99     13     22     99          10
  25     85.0     91.0      99     99     13     22     99          11
  26     86.0     90.0      99     99     13     22     99          12

After the absolute risks have been generated, PROC MEANS is applied to the quantities
Error_Ind, AbsRsk, RR_Star1 and RR_Star2 to produce descriptive statistics.  When the
mean and standard deviation of the variable  "Error_Ind"  are both 0, no errors have
been found.  Otherwise, when the mean and std of  "Error_Ind"  are not 0, errors have
been found.  When errors are found, the # of records with errors is the count
associated with "AbsRsk" listed under NMiss (# of missing).  Furthermore, a listing
of the erroneous records follows the PROC MEANS output.  For example:

BrCa_RAM,  sas macro to project for BrCa absolute risk                 September 15, 2010
Quick check for errornous records on input file

IF MEAN OF  'Error_Ind'   EQUALS  0,   ERROR  FREE.    ERROR LISTING BELOW WILL BE EMPTY.
IF MEAN OF  'Error_Ind'   IS NOT  0,   ERRORS EXISTS.  CHECK ERROR LISTING BELOW.

(# of records with errors is the # listed under the NMiss column in the 'AbsRsk' line)
                                                                                       N
Variable         Label                                           Mean   Std Dev   N  Miss
-----------------------------------------------------------------------------------------
Error_Ind        If mean not 0, implies ERROR in file         0.57692   0.50383  26     0
Absolute_Risk    Abs risk(%) of BrCa in age interval [T1,T2)  3.76766   2.57844  11    15
RR_Star1         Relative risk age lt 50                      3.43948   1.92321  13    13
RR_Star2         Relative risk age ge 50                      2.86656   1.54840  13    13
-----------------------------------------------------------------------------------------

Since NMiss=15 for Absolute_Risk, the error listing below contains 15 records:

Error listing for the input file

 ID                 #  Hypr  Hypr  Age  Age    #             RR      RR              Pat
  #    T1    T2  Biop  plas    RR  Men  1st  Rel   Race  Age<50  Age>50   AbsRsk(%)    #

  1  45.2  53.3    99    99  1.00   10   20    1      0     .       .          .      29
     45.2  53.3     0    99  1.00    2    1    1     ??

  2  45.2  53.3    99     1   .     10   20    1      1     .       .          .       .
     45.2  53.3     A     A   A      2    1    1     Wh

  3  45.2  53.3    99     0   .     10   20    1      2     .       .          .       .
     45.2  53.3     A     A   A      1    0    1     AA

  9  45.2  53.3    99     1   .     14   99    1      8     .       .          .       .
     45.2  53.3     A     A   A      0    0    1     Fi

 11  45.2  53.3    99     0   .     99   99    1     10     .       .          .       .
     45.2  53.3     A     A   A      0    0    1     oP

 12  45.2  53.3     0     0   .     14   19    1     11     .       .          .       .
     45.2  53.3     A     A   A      0    0    1     oA

 13  45.2  53.3     0    99  1.00   10   20    1     12     .       .          .      29
     45.2  53.3     0    99  1.00    2    1    1     ??

 14  45.2  53.3     0     1   .     10   20    1      0     .       .          .       .
     45.2  53.3     A     A   A      2    1    1     ??

 15  45.2  53.3     0     0   .     10   20    1      1     .       .          .       .
     45.2  53.3     A     A   A      2    1    1     Wh

 19  35.0  40.0     4    99  1.00   11   10    0      5     .       .          .       .
     35.0  40.0     2    99  1.00    2    .    0     Wo

 20  35.0  40.0     4    99  1.00   36   25    0      6     .       .          .       .
     35.0  40.0     2    99  1.00    .    .    0     Ch

 23  18.0  26.0    99    99  1.00   13   22   99      9     .       .          .       .
       .   26.0     0    99  1.00    1    .    0     Hw

 24  27.0  26.0    99    99  1.00   13   22   99     10    1.42    1.42        .      16
       .     .      0    99  1.00    1    1    0     oP

 25  85.0  91.0    99    99  1.00   13   22   99     11    1.42    1.42        .      16
     85.0    .      0    99  1.00    1    1    0     oA

 26  86.0  90.0    99    99  1.00   13   22   99     12     .       .          .      16
     86.0  90.0     0    99  1.00    1    1    0     ??

Each record with an error is listed, followed by a line which gives some indication
as to where the error occurred.  For example, the record with ID=2 has an "A" listed
under the 3 variables associated with biopsy, i.e. N_Biop, Hyperplasia and Hypr_RR.
This means that ID=2 has violated the consistency defined by Requirement (A).
IDs 3, 9, 11, 12, 14 and 15 likewise display violations of Requirement (A).  For
IDs 19 and 20, violations of AgeMen and/or Age1st consistency are seen.  Note the SAS
missing value "." listed under AgeMen and/or Age1st.  For IDs 23, 24 and 25,
violations of the T1 and/or T2 consistency requirements are seen.  Again, note the "."
listed under T1 and/or T2.  IDs 1, 13 and 26 show "??" under Race because their race
codes (0 and 12) are outside the valid values 1 to 11.  This small sample data set
"Sample.in" in no way exhausts all the possible ways in which the data can be in error,
but it should serve as a guide on how to check and correct errors when they do occur.
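
Since records with errors carry the SAS missing value for the absolute risk, the
output file can be split before any further analysis.  The sketch below assumes the
names used in the example program (ExampleOut, Absolute_Risk, RR_Star1, RR_Star2);
Valid and Errors are hypothetical data set names.

```sas
/* Sketch only: separate error-free records from records with errors */
data Valid Errors;
    set ExampleOut;
    if missing(Absolute_Risk) then output Errors;
    else                           output Valid;
run;

/* Summarize the projections for the error-free records only */
proc means data=Valid n mean std min max;
    var Absolute_Risk RR_Star1 RR_Star2;
run;
```

With the sample data this would be expected to place 15 records in Errors and 11 in
Valid, in line with the N and NMiss counts shown for Absolute_Risk above.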

Finally, the listing from Step 3:

Listing of the first 100 records in temporary output sas system file  ExampleOut
Further analysis depending on the projected abs risk must be performed using the
output sas system file which is invoked by the sas macro parameter  'Out_File'

                       #  Hypr    HP  Age  Age    #               RR        RR   AbsRisk   AbsRisk
    ID    T1    T2  Biop  plas    RR  Men  1st  Rel   Race    Age<50   Age>=50        (%)  AvgWm(%)

     1  45.2  53.3    99    99  1.00   10   20    1   0=??     .         .         .         .
     2  45.2  53.3    99     1   .     10   20    1   1=Wh     .         .         .         .
     3  45.2  53.3    99     0   .     10   20    1   2=AA     .         .         .         .
     4  45.2  53.3     0    99  1.00   10   20    1   3=Hi    3.2354    3.2354    2.1081    1.1313
     5  45.2  53.3     1    99  1.00   10   20    1   4=NA    5.4926    4.1180    4.4413    1.7673
     6  45.2  53.3     1    99  1.00   14   19    1   5=Wo    4.4263    3.3185    3.9762    1.7673
     7  45.2  53.3    99    99  1.00   99   19    1   6=Ch    2.2075    2.2075    1.2496    1.1644
     8  45.2  53.3     1     1  1.82   14   19    1   7=Ja    6.9820    6.9820    5.7757    1.7279
     9  45.2  53.3    99     1   .     14   99    1   8=Fi     .         .         .         .
    10  45.2  53.3     1     0  0.93   14   19    1   9=Hw    3.5677    3.5677    3.9061    2.2614
    11  45.2  53.3    99     0   .     99   99    1  10=oP     .         .         .         .
    12  45.2  53.3     0     0   .     14   19    1  11=oA     .         .         .         .
    13  45.2  53.3     0    99  1.00   10   20    1  12=??     .         .         .         .
    14  45.2  53.3     0     1   .     10   20    1   0=??     .         .         .         .
    15  45.2  53.3     0     0   .     10   20    1   1=Wh     .         .         .         .
    16  45.2  53.3     1     0  0.93   10   20    1   2=AA    2.3458    2.0974    2.6899    1.6479
    17  35.0  40.0     4    99  1.00   11   25    0   3=Hi    5.3860    3.0274    0.6789    0.2183
    18  35.0  40.0     4    99  1.00   11   98    0   4=NA    5.3860    3.0274    1.0230    0.2814
    19  35.0  40.0     4    99  1.00   11   10    0   5=Wo     .         .         .         .
    20  35.0  40.0     4    99  1.00   36   25    0   6=Ch     .         .         .         .
    21  27.0  90.0    99    99  1.00   13   22    0   7=Ja    1.4210    1.4210    8.8277   12.2076
    22  27.0  90.0    99    99  1.00   13   22   99   8=Fi    1.4210    1.4210    6.7678    9.4245
    23  18.0  26.0    99    99  1.00   13   22   99   9=Hw     .         .         .         .
    24  27.0  26.0    99    99  1.00   13   22   99  10=oP    1.4210    1.4210     .         .
    25  85.0  91.0    99    99  1.00   13   22   99  11=oA    1.4210    1.4210     .         .
    26  86.0  90.0    99    99  1.00   13   22   99  12=??     .         .         .         .

Statistical issues  should be directed to:   Dr. Mitchell Gail    gailm@exchange.nih.gov
Technical   details should be directed to:   Mr. David Pee        peed@imsweb.com
