ReadMe.fil                                                          7/17/09

Documentation and user guide for a SAS macro that projects the absolute
risk of breast cancer (BrCa) for white women, based on a relative risk (RR)
model which includes % mammographic density, body weight and a subset of
the standard breast cancer relative risk covariates.

The relative risk covariates, 1-AR, composite breast cancer incidences and
competing hazards are described in "Projecting absolute invasive breast
cancer risk in white women with a model that includes mammographic
density", JNCI 2006 98(17) 1215-26.

Handling of missing covariate values and covariate editing procedures
follow the NCI BrCa Risk Assessment Tool (NCI BCRAT).


A simple 3 step example on the use of the SAS macro "BrCa_MD_RAM":
(Br)east (Ca)ncer with (M)ammographic (D)ensity (R)isk (A)ssessment (M)acro

Step 1: The included sas program "BC_MD_example.sas" reads the supplied
        data file "Sample.in", which contains the BrCa risk covariates and
        projection age interval for 50 hypothetical women.  It then saves
        a SAS system file named "ExampleIn" to be used as input to the
        SAS macro "BrCa_MD_RAM":

   data ExampleIn;              *** name of the saved sas system file;
      infile 'Sample.in';       *** sample RR covariate input file;
                                *** SAS variable names selected by user;
      input InitalAge ProjtnAge PerCent_Den Num_Rels NBiop AgeFstLive Body_Weight;
      Rec_Num = _n_;
   run;

Step 2: The included sas program BCRAM_example.sas then runs the SAS
        macro "BrCa_MD_RAM":

   %include "BrCa_MD_RAM";      *** include the sas MACRO "BrCa_MD_RAM";

        The sas macro "BrCa_MD_RAM" is then invoked to perform the BrCa
        projections.
        The temporary sas input file is set to "ExampleIn".
        The temporary sas output file is set to "ExampleOut".
        The macro parameters T1, T2, PDensty, N_Rels, N_Biop, Age1st and
        BdyWght point to their corresponding sas variables on the sas file
        "ExampleIn", namely InitalAge, ProjtnAge, PerCent_Den, Num_Rels,
        NBiop, AgeFstLive and Body_Weight.
        The macro parameter AbsRsk points to the sas variable Absolute_Risk,
        which will be added to the output sas file "ExampleOut".
        The output sas file will also contain all the variables on the
        input sas file.

                  Macro                        SAS file name or
                  parameter    pointing to     SAS variable name;

   %BrCa_MD_RAM  (In_File   =   ExampleIn     ,
                  Out_File  =   ExampleOut    ,
                  T1        =   InitalAge     ,
                  T2        =   ProjtnAge     ,
                  PDensty   =   PerCent_Den   ,
                  N_Rels    =   Num_Rels      ,
                  N_Biop    =   NBiop         ,
                  Age1st    =   AgeFstLive    ,
                  BdyWght   =   Body_Weight   ,
                  AbsRsk    =   Absolute_Risk);

Step 3: It then lists the contents of the temporary output sas system
        file "ExampleOut", which contains the projected absolute risk as
        well as the relative risk covariate values.  Note that any further
        processing that requires the projected absolute risk must be
        performed on the output sas system file;

   data ExampleOut;
      set ExampleOut;
      file print;

      if (_N_ eq 1) then do;
         put " Record                  %    #     #  Age    Body"
             "  (1-ar)RR  (1-ar)RR              Pattrn";
         put "      #    T1    T2   Dens  Rel  Biop  1st    Wght"
             "    Age<50    Age>50      AbsRsk       #";
         put " ";
      end;

      *** all variables below take on their SAS variable names,
          not their macro names
          see SAS variable names defined in Step 1;

      if (Rec_Num le 100) then
         put Rec_Num        7.0
             InitalAge      6.1
             ProjtnAge      6.1
             PerCent_Den    7.1
             Num_Rels       5.0
             NBiop          6.0
             AgeFstLive     5.0
             Body_Weight    8.1
             One_AR_RR1    10.4
             One_AR_RR2    10.4
             Absolute_Risk 12.6
             PattrnNumber   8.0;
   run;


Detailed description of the operation and output items from the SAS macro
"BrCa_MD_RAM":

Input data:
----------

In_File=  should "point" to a SAS data set containing all the required
          input data items needed to perform risk projections, such as
          initial age, projection age and BrCa relative risk covariates.
          See the paragraph "Input data items ... " below for a detailed
          description of all required data items.

Output data:
-----------

Out_File= should "point" to a SAS output data set which will contain the
          projected absolute risk of BrCa as well as the original input
          data items.
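
Because the Out_File= data set holds both the projected absolute risk and
the original input items, further processing can read it directly.  The
following is a minimal sketch only, assuming the Step 1 to Step 3 example
has already been run so that "ExampleOut" exists.  The data set name
"Ranked" and the choice of printing five records are assumptions made for
this illustration; they are not part of the macro.

```sas
*** sketch only, Ranked is a hypothetical data set name;
proc sort data=ExampleOut out=Ranked;
   by descending Absolute_Risk;      *** highest projected risk first;
run;

*** list the five records with the highest projected absolute risk;
proc print data=Ranked (obs=5) noobs;
   var Rec_Num InitalAge ProjtnAge Absolute_Risk;
run;
```

Sorting into a separate data set leaves "ExampleOut" in its original record
order for any later steps.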

Macro structure:
---------------

                         Macro
   Macro name            parameters     "points" to SAS names

   %macro BrCa_MD_RAM   (In_File  =,    name of input  sas data set
                         Out_File =,    name of output sas data set
                         T1       =,    initial age, age at beginning of projection interval
                         T2       =,    projection age, age at end of projection interval
                         PDensty  =,    % mammographic density
                         N_Rels   =,    # 1st degree relatives with brca
                         N_Biop   =,    # biopsies performed
                         Age1st   =,    age at 1st live birth
                         BdyWght  =,    body weight in lbs
                         AbsRsk   =);   projected absolute risk of brca

Appropriate sas file/sas variable names must be associated with each macro
parameter on the invocation of the sas macro "BrCa_MD_RAM".  For example,
coding "In_File = AARPin" tells the macro that the user created sas file
"AARPin" is to be used for input of variables.  Similarly, coding
"N_Biop = Num_Biops" lets the macro know that the sas variable "Num_Biops"
in the sas input file "AARPin" contains the count of the # of biopsies
performed.

To invoke the sas macro in your sas program, a %include statement must be
coded in your sas program.  For example:

   the statement:  %include "BrCa_MD_RAM";
                   points to the sas macro BrCa_MD_RAM stored in your
                   current directory

   the statement:  %include "c:\sas.macro\BrCa_MD_RAM";
                   points to the sas macro BrCa_MD_RAM stored in the
                   directory c:\sas.macro


Input data items needed to project for BrCa absolute risk and consistency
requirements:

Macro
parameter   Definition                  Valid values

T1          Initial age                 any real number T1 in [20,85]
T2          BrCa projection age         any real number T2 such that T2 > T1
PDensty     % mammographic density      any real number in [0,100]
N_Rels      # 1st degree relatives      0,1,2 ...  99=unk
            with BrCa
N_Biop      # of biopsies               0,1,2 ...  99=unk
Age1st      Age at first live birth     less than or equal to initial age.
                                        98=nulliparous, 99=unk
BdyWght     Body weight in lbs          any positive real number


Recoding and checking of relative risk covariate values performed by the
SAS macro:

                                        raw value                           recoded to

PDensty: % mammographic density           0                                     0
                                        ( 0, 25)                                1
                                        [25, 50)                                2
                                        [50, 75)                                3
                                        [75,100]                                4

N_Rels : # 1st degree rel with BrCa     0 or 99=unk                             0
                                        1                                       1
                                        2,3,4 ... and not 99=unk                2

N_Biop : # biopsies                     0 or 99=unk                             0
                                        1                                       1
                                        2,3,4 ... and not 99=unk                2

Age1st : age at 1st live birth          19 and younger or 99=unk                0
                                        20,21,22,23,24                          1
                                        25,26,27,28,29 or 98(nulliparous)       2
                                        30,31,32 ... and not 98 and not 99      3

BdyWght: body weight in lbs             (  0,100]                               0
                                        (100,125]                               1
                                        (125,150]                               2
                                        (150,175]                               3
                                        (175,200]                               4
                                        (200, +inf)                             5

In the above, "(" and ")" mark open interval limits (the end value is
excluded), while "[" and "]" mark closed interval limits (the end value is
included).


Edit checking for relative risk covariates PDensty, N_Rels, N_Biop, Age1st
& BdyWght:

PDensty: % mammographic density must be between 0 and 100 inclusive.
         no accommodation for unknown % density.
         unknown % density results in an unknown absolute risk projection.
N_Rels : # of 1st degree relatives with BrCa must be 0,1,2...  unk=99
N_Biop : # biopsies must be 0,1,2...  unk=99
Age1st : age at first live birth must be less than or equal to initial
         age T1.  nulliparous=98, unk=99
BdyWght: body weight must be a positive real number.
         no accommodation for unknown body weight.
         unknown weight results in an unknown absolute risk.
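
The recoding rules above can be sketched as a SAS data step.  This is an
illustration only: it mirrors the recoding table, it is not taken from the
macro source, and the macro may implement the rules differently.  The data
set name "Recoded" and the variables PD_Cat, Rels_Cat, Biop_Cat, Age1_Cat
and Wght_Cat are hypothetical names chosen for this sketch.  Values outside
the valid ranges are left missing here, and ages between listed integer
values are assumed to fall in the lower category.

```sas
*** sketch only, mirrors the recoding table and is not the macro source;
*** Recoded and the _Cat variables are hypothetical names;
data Recoded;
   set ExampleIn;

   *** percent mammographic density;
   if      PerCent_Den = 0           then PD_Cat = 0;
   else if 0  <  PerCent_Den <  25   then PD_Cat = 1;
   else if 25 <= PerCent_Den <  50   then PD_Cat = 2;
   else if 50 <= PerCent_Den <  75   then PD_Cat = 3;
   else if 75 <= PerCent_Den <= 100  then PD_Cat = 4;
   else                                   PD_Cat = .;

   *** number of 1st degree relatives with BrCa, 99 is unknown;
   if      Num_Rels in (0, 99)       then Rels_Cat = 0;
   else if Num_Rels = 1              then Rels_Cat = 1;
   else if Num_Rels >= 2             then Rels_Cat = 2;
   else                                   Rels_Cat = .;

   *** number of biopsies, 99 is unknown;
   if      NBiop in (0, 99)          then Biop_Cat = 0;
   else if NBiop = 1                 then Biop_Cat = 1;
   else if NBiop >= 2                then Biop_Cat = 2;
   else                                   Biop_Cat = .;

   *** age at 1st live birth, 98 is nulliparous and 99 is unknown;
   if      (AgeFstLive = 99) or (0  <= AgeFstLive < 20) then Age1_Cat = 0;
   else if (20 <= AgeFstLive < 25)                       then Age1_Cat = 1;
   else if (AgeFstLive = 98) or (25 <= AgeFstLive < 30)  then Age1_Cat = 2;
   else if AgeFstLive >= 30                              then Age1_Cat = 3;
   else                                                       Age1_Cat = .;

   *** body weight in lbs;
   if      0   < Body_Weight <= 100  then Wght_Cat = 0;
   else if 100 < Body_Weight <= 125  then Wght_Cat = 1;
   else if 125 < Body_Weight <= 150  then Wght_Cat = 2;
   else if 150 < Body_Weight <= 175  then Wght_Cat = 3;
   else if 175 < Body_Weight <= 200  then Wght_Cat = 4;
   else if Body_Weight > 200         then Wght_Cat = 5;
   else                                   Wght_Cat = .;
run;
```

For record 1 of "Sample.in" (% density 96, 0 relatives, 2 biopsies, age at
first live birth 25, weight 175 lbs) the sketch gives PD_Cat=4, Rels_Cat=0,
Biop_Cat=2, Age1_Cat=2 and Wght_Cat=3.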

Following is a listing of the sample raw input data set "Sample.in"
(column headings included for clarity):

                                                      Age
          Inital  Projtn  PerCent_                    Fst   Body_
 Rec_Num     Age     Age       Den  Num_Rels  NBiop  Live  Weight

       1      67      72        96         0      2    25     175
       2      44      49        24         0      1    30     125
       3      38      43         0         2      2    19     125
       4      54      59        72         0      0    19     200
       5      54      59        72         2      0    25     100
       6      44      49        24         1      2    19     150
       7      44      49        24         2      2    25     125
       8      44      49        24         0      2    25     150
       9      54      59        72         1      0    19     200
      10      44      49        24         2      0    30     150
      11      67      72        96         1      2    19     100
      12      58      63        48         0      2    30     125
      13      67      72        96         0      1    25     100
      14      67      72        96         1      0    19     125
      15      54      59        72         0      2    30     200
      16      67      72        96         0      0    19     200
      17      67      72        96         0      2    19     200
      18      44      49        24         2      0    19     150
      19      44      49        24         0      2    19     125
      20      58      63        48         1      1    19     200
      21      44      49        24         1      1    20     125
      22      44      49        24         1      2    25     175
      23      44      49        24         1      0    20     100
      24      58      63        48         0      1    20     175
      25      38      43         0         0      1    25     100
      26      54      59        72         0      1    30     100
      27      67      72        96         0      0    30     150
      28      54      59        72         1      0    25     125
      29      54      59        72         2      0    25     200
      30      67      72        96         2      1    25     125
      31      58      63        48         2      2    30     100
      32      58      63        48         1      1    20     175
      33      58      63        48         2      1    19     250
      34      58      63        48         1      1    19     250
      35      58      63        48         2      0    25     125
      36      44      49        24         2      0    25     175
      37      67      72        96         0      2    30     200
      38      54      59        72         0      0    25     200
      39      58      63        48         0      1    25     100
      40      38      43         0         0      2    25     200
      41      54      59        72         1      2    19     250
      42      67      72        96         1      2    19     125
      43      58      63        48         1      0    25     200
      44      38      43         0         0      2    19     200
      45      67      72        96         2      2    19     250
      46      67      72        96         0      0    20     150
      47      44      49        24         0      2    25     125
      48      67      72        96         0      0    19     175
      49      54      59        72         1      1    19     200
      50      67      72        96         2      1    20     250


Following is a listing of the results of applying the macro to the input
file.  Note that the raw values for the RR covariates are listed, not the
recoded values of 0,1,2,3,4 or 5.

 Record                  %    #     #  Age    Body  (1-ar)RR  (1-ar)RR              Pattrn
      #    T1    T2   Dens  Rel  Biop  1st    Wght    Age<50    Age>50      AbsRsk       #

      1  67.0  72.0   96.0    0     2   25   175.0    3.6546    4.1787    0.092145     928
      2  44.0  49.0   24.0    0     1   30   125.0    0.7936    0.9074    0.007421     260
      3  38.0  43.0    0.0    2     2   19   125.0    1.1110    1.2703    0.005499     194
      4  54.0  59.0   72.0    0     0   19   200.0    1.4190    1.6225    0.026740     653
      5  54.0  59.0   72.0    2     0   25   100.0    1.9960    2.2822    0.037406     805
      6  44.0  49.0   24.0    1     2   19   150.0    1.2335    1.4104    0.011511     339
      7  44.0  49.0   24.0    2     2   25   125.0    2.1257    2.4305    0.019753     422
      8  44.0  49.0   24.0    0     2   25   150.0    1.0861    1.2418    0.010142     279
      9  54.0  59.0   72.0    1     0   19   200.0    2.2110    2.5281    0.041351     725
     10  44.0  49.0   24.0    2     0   30   150.0    1.8498    2.1150    0.017211     381
     11  67.0  72.0   96.0    1     2   19   100.0    2.1743    2.4861    0.055917     985
     12  58.0  63.0   48.0    0     2   30   125.0    1.4302    1.6353    0.030560     500
     13  67.0  72.0   96.0    0     1   25   100.0    1.4815    1.6940    0.038458     901
     14  67.0  72.0   96.0    1     0   19   125.0    1.6153    1.8469    0.041855     938
     15  54.0  59.0   72.0    0     2   30   200.0    3.8077    4.3537    0.070136     719
     16  67.0  72.0   96.0    0     0   19   200.0    1.9789    2.2627    0.051028     869
     17  67.0  72.0   96.0    0     2   19   200.0    3.3044    3.7783    0.083705     917
     18  44.0  49.0   24.0    2     0   19   150.0    1.1511    1.3161    0.010745     363
     19  44.0  49.0   24.0    0     2   19   125.0    0.6381    0.7297    0.005972     266
     20  58.0  63.0   48.0    1     1   19   200.0    2.0486    2.3424    0.043481     533
     21  44.0  49.0   24.0    1     1   20   125.0    0.9013    1.0306    0.008424     320
     22  44.0  49.0   24.0    1     2   25   175.0    2.0994    2.4004    0.019511     352
     23  44.0  49.0   24.0    1     0   20   100.0    0.5623    0.6429    0.005264     295
     24  58.0  63.0   48.0    0     1   20   175.0    1.2414    1.4194    0.026581     466
     25  38.0  43.0    0.0    0     1   25   100.0    0.3916    0.4478    0.001942      37
     26  54.0  59.0   72.0    0     1   30   100.0    1.2443    1.4227    0.023487     691
     27  67.0  72.0   96.0    0     0   30   150.0    2.0666    2.3630    0.053225     885
     28  54.0  59.0   72.0    1     0   25   125.0    1.5891    1.8169    0.029896     734
     29  54.0  59.0   72.0    2     0   25   200.0    4.7268    5.4046    0.086309     809
     30  67.0  72.0   96.0    2     1   25   125.0    4.4622    5.1021    0.111303    1046
     31  58.0  63.0   48.0    2     2   30   100.0    2.7992    3.2006    0.058928     643
     32  58.0  63.0   48.0    1     1   20   175.0    1.9344    2.2118    0.041106     538
     33  58.0  63.0   48.0    2     1   19   250.0    3.9599    4.5278    0.082322     606
     34  58.0  63.0   48.0    1     1   19   250.0    2.5414    2.9058    0.053650     534
     35  58.0  63.0   48.0    2     0   25   125.0    1.7754    2.0300    0.037794     590
     36  44.0  49.0   24.0    2     0   25   175.0    1.9590    2.2400    0.018219     376
     37  67.0  72.0   96.0    0     2   30   200.0    5.3103    6.0718    0.130975     935
     38  54.0  59.0   72.0    0     0   25   200.0    1.9468    2.2260    0.036501     665
     39  58.0  63.0   48.0    0     1   25   100.0    0.7617    0.8709    0.016395     469
     40  38.0  43.0    0.0    0     2   25   200.0    1.1984    1.3703    0.005930      65
     41  54.0  59.0   72.0    1     2   19   250.0    4.5799    5.2367    0.083744     774
     42  67.0  72.0   96.0    1     2   19   125.0    2.6972    3.0840    0.068881     986
     43  58.0  63.0   48.0    1     0   25   200.0    2.1751    2.4870    0.046102     521
     44  38.0  43.0    0.0    0     2   19   200.0    0.8735    0.9988    0.004326      53
     45  67.0  72.0   96.0    2     2   19   250.0    9.9526   11.3798    0.231001    1062
     46  67.0  72.0   96.0    0     0   20   150.0    1.5063    1.7223    0.039088     873
     47  44.0  49.0   24.0    0     2   25   125.0    0.8755    1.0011    0.008184     278
     48  67.0  72.0   96.0    0     0   19   175.0    1.5953    1.8240    0.041347     868
     49  54.0  59.0   72.0    1     1   19   200.0    2.8571    3.2668    0.053105     749
     50  67.0  72.0   96.0    2     1   20   250.0    9.0215   10.3152    0.211935    1044


After the absolute risks have been generated, the macro applies PROC MEANS
to produce descriptive statistics for the quantities Error_Ind, AbsRsk,
One_AR_RR1 and One_AR_RR2.  A mean and standard deviation of 0 for the
variable "Error_Ind" imply that no errors have been found.  A mean and std
for "Error_Ind" that are not 0 imply that errors have been found.  When
errors are found, the # of records with errors is the count associated with
"AbsRsk" listed under N Miss (# of missing).  Furthermore, a listing of the
erroneous records follows the PROC MEANS output.  For example:

10:52 Wednesday, July 15, 2009

BrCa_MD_RAM, sas macro to project for BrCa absolute risk
Quick check for errornous records on input file

IF MEAN OF 'Error_Ind' EQUALS 0, ERROR FREE.  ERROR LISTING BELOW WILL BE EMPTY.
IF MEAN OF 'Error_Ind' IS NOT 0, ERRORS EXISTS.  CHECK ERROR LISTING BELOW.
(# of records with errors is the # listed under the NMiss column in the 'AbsRsk' line)

The MEANS Procedure

                                                                                             N
Variable       Label                                               Mean     Std Dev     N Miss
--------------------------------------------------------------------------------------------------
Error_Ind      If mean not 0, implies ERROR in file ----->      0.00000     0.00000    50    0
Absolute_Risk  Abs risk of BrCa in age interval [T1,T2)         0.04673     0.04672    50    0
One_AR_RR1     (1-AR)*RelRsk age lt 50                          2.33432     1.87141    50    0
One_AR_RR2     (1-AR)*RelRsk age ge 50                          2.66907     2.13977    50    0
--------------------------------------------------------------------------------------------------

BC_MD_example.sas, example sas program using sas macro BrCa_MD_RAM        4
10:52 Wednesday, July 15, 2009

BrCa_MD_RAM, sas macro to project for BrCa absolute risk
Error listing for the input file

 Record                  %    #     #  Age    Body  (1-ar)RR  (1-ar)RR               Patrn
      #    T1    T2   Dens  Rel  Biop  1st    Wght    Age<50    Age>50      AbsRsk       #

No errors detected for initial age, projection age and relative risk covaraites
-------------------------------------------------------------------------------


Statistical issues should be addressed to:
   Dr. Mitchell Gail      gailm@exchange.nih.gov

Technical details should be addressed to:
   Mr. David Pee          peed@imsweb.com