simriscparams(7)

simrisc configuration file organization
(simrisc.15.05.00)

2020-2024

NAME

simriscparams - The description of the configuration files

DESCRIPTION

This page describes the organization of the simrisc configuration files. These files are formatted like standard Unix configuration files. Lines are interpreted after removing initial white-space (blanks and tabs) and after removing all characters from the first # character to the end of the line: such text is considered a comment and is ignored. If a line (not containing a # character) ends in a backslash (\), then the next line (initial white-space removed) is appended to the current line.
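The line-handling rules above can be sketched in Python. This is an illustrative sketch, not simrisc's own parser: the function name logical_lines is hypothetical, and it assumes that the trailing backslash itself is dropped when lines are joined and that lines which end up empty are skipped.

```python
def logical_lines(text):
    """Yield logical configuration lines (hypothetical helper)."""
    pending = ""                          # text collected from continued lines
    for raw in text.splitlines():
        line = raw.lstrip(" \t")          # remove initial blanks and tabs
        has_comment = "#" in line
        if has_comment:
            line = line[:line.index("#")]  # drop everything from the first '#'
        if not has_comment and line.endswith("\\"):
            pending += line[:-1]          # assumption: the backslash is dropped
            continue
        line = (pending + line).rstrip()
        pending = ""
        if line:                          # assumption: empty lines are skipped
            yield line
    if pending.strip():                   # continuation on the very last line
        yield pending.rstrip()


if __name__ == "__main__":
    sample = "ageGroup: 0 - 40 \\\n    0.05 0.30 0.48 0.17  # bi-rad\nCosts:\n"
    for logical in logical_lines(sample):
        print(logical)
```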

Note that all parameter identifiers are interpreted case sensitively. E.g., Costs: is a different parameter than costs:. The numeric values used in this man-page are for illustration purposes only. Some restrictions apply though: standard deviations cannot be negative; proportions and probabilities must lie in the range 0..1; multiple probabilities (like the ones used for breast densities) must add up to 1; etc. If restrictions apply then they are mentioned at the various parameter descriptions below.

DEFAULT CONFIGURATION FILE

The configuration file provided in the simrisc distribution is
/usr/share/doc/simrisc/simrisc.gz.

Usually this file is unzipped by the user to the user's ~/.config directory:


    gunzip < /usr/share/doc/simrisc/simrisc.gz > ~/.config/simrisc
   
whereafter ~/.config/simrisc can be edited to contain local modifications.

Various parameters specify probability distributions. Usually the Normal distribution is specified. The program also recognizes the LogNormal and Uniform distributions, and uses the Beta distribution when handling parameter variations of the beta parameter used for lung cancer simulations (note that the similarity of the names beta (the parameter) and Beta (the distribution) is purely coincidental).

Parameter specifications start with keywords, followed by a colon. All keywords are covered below. The format of the specifications is fixed, but empty lines and white space may be used to improve the readability of the specifications.

Parameter specifications starting with uppercase letters (like Scenario:) specify (sub)sections and contain no additional specifications. Specifications starting with lowercase letters (like ageGroup:) are followed by actual parameter values.

The configuration file must define all parameters of all configuration sections, but configuration parameters can be modified using a separate analysis file or they can be modified by command-line parameters.

Section `Scenario:'

This section starts with a line containing Scenario: and it defines some general parameters that are used during the simulation process. The default configuration file contains the following specifications:

Section `Costs:'

This section starts with a line containing Costs: and it defines several parameters used for cost-calculations. Modality-specific cost parameters are specified at Section Modalities: (see below). The default configuration file specifies:

Section `BreastDensities:'

This section starts with a line containing BreastDensities: and it is used with breast-cancer simulations. It defines breast density values for various age groups, covering ages 0 through the maximum age for simulated cases. The default configuration file contains the following specifications:


    #                  bi-rad:  a       b       c       d
    ageGroup:  0  - 40         0.05    0.30    0.48    0.17
    ageGroup:  40 - 50         0.06    0.34    0.47    0.13
    ageGroup:  50 - 60         0.08    0.50    0.37    0.05
    ageGroup:  60 - 70         0.15    0.53    0.29    0.03
    ageGroup:  70 - *          0.18    0.54    0.26    0.02
       
Age groups are half-open ranges: they start at their first ages, and end at (not including) their second ages. The first age of each subsequent age group must be equal to the second age of the previous age group. For the last age group the specification * can be used, indicating that all ages at or above the last age group's begin age are handled by that group.
For each age group the probabilities of the four bi-rad classifications must sum to 1.0.
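The age-group rules can be checked mechanically. The following Python sketch is illustrative only: the names check_age_groups and find_group are hypothetical, the end age * is represented by None, and the tolerance used when comparing the probability sum to 1 is an assumption (the tolerance simrisc itself applies is not documented here).

```python
import math

# (begin, end, [a, b, c, d]); end is None for the '*' specification
DEFAULT_DENSITIES = [
    (0,  40,   [0.05, 0.30, 0.48, 0.17]),
    (40, 50,   [0.06, 0.34, 0.47, 0.13]),
    (50, 60,   [0.08, 0.50, 0.37, 0.05]),
    (60, 70,   [0.15, 0.53, 0.29, 0.03]),
    (70, None, [0.18, 0.54, 0.26, 0.02]),
]


def check_age_groups(groups, tolerance=1e-9):
    """Raise ValueError if the groups violate the rules described above."""
    previous_end = None
    for index, (begin, end, probs) in enumerate(groups):
        if index > 0 and begin != previous_end:
            raise ValueError(f"group {index} must start at {previous_end}")
        if end is None and index != len(groups) - 1:
            raise ValueError("only the last age group may end in *")
        if end is not None and end <= begin:
            raise ValueError(f"group {index}: end age must exceed begin age")
        if any(not 0 <= p <= 1 for p in probs):
            raise ValueError(f"group {index}: probabilities must be in 0..1")
        if not math.isclose(sum(probs), 1.0, abs_tol=tolerance):
            raise ValueError(f"group {index}: probabilities must sum to 1")
        previous_end = end


def find_group(groups, age):
    """Return the probabilities of the half-open group containing age."""
    for begin, end, probs in groups:
        if begin <= age and (end is None or age < end):
            return probs
    raise ValueError(f"no age group covers age {age}")


if __name__ == "__main__":
    check_age_groups(DEFAULT_DENSITIES)
    print(find_group(DEFAULT_DENSITIES, 40))   # age 40 belongs to group 40 - 50
```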

Section `Modalities:'

This section starts with a line containing Modalities: and it specifies cancer-scanning modalities. Currently four modalities are supported: Mammo, Tomo, and MRI (which are used with breast-cancer simulations), and CT (which is used with lung-cancer simulations).

Some modalities specify age groups, which are (like the age ranges used for BreastDensities:) half-open ranges: they start at their first ages, and end at (not including) their second ages, while subsequent age ranges must connect. Also, the last age group may use the end-age specification *.

The default configuration file contains (below the line Modalities:) the following specifications (if modalities aren't used their specifications are optional):

Section `Screening:'

This section starts with a line containing Screening: and it defines the ages at which screenings are performed, the screening modality or modalities used at each of the screening ages, and the screening attendance probability. If no screening rounds should be used then specify a single round-specification line:


    round: none
        
Otherwise, each screening round is defined by the keyword round: followed by an age, which in turn is followed by a space-delimited list of one or more modality specifications. Currently Mammo, Tomo, MRI and CT are available. Mammo, Tomo, and MRI can be specified when performing breast-cancer simulations, CT can be specified when performing lung-cancer simulations. The default configuration file contains the Screening: specifications also used in versions before 15.00.00:

    round:     50  Mammo
    round:     52  Mammo
    round:     54  Mammo
    round:     56  Mammo
    round:     58  Mammo
    round:     60  Mammo
    round:     62  Mammo
    round:     64  Mammo
    round:     66  Mammo
    round:     68  Mammo
    round:     70  Mammo
    round:     72  Mammo
    round:     74  Mammo
        
For lung cancer simulations the CT modality must be specified, either in an analysis specification or by altering the configuration file. E.g.,

Screening:
    round:     50  CT
    round:     52  CT
    round:     54  CT
    round:     56  CT
    round:     58  CT
    round:     60  CT
    round:     62  CT
    round:     64  CT
    round:     66  CT
    round:     68  CT
    round:     70  CT
    round:     72  CT
    round:     74  CT

    #                   probability:
    attendanceRate:     .8
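As an illustration of the round: format, the Python sketch below parses a single round specification and checks its modalities against the simulation type. The function name parse_round and the simulation labels "breast" and "lung" are hypothetical, and treating the age as a floating-point number is an assumption.

```python
# Modalities allowed per simulation type, as described in this section
ALLOWED = {
    "breast": {"Mammo", "Tomo", "MRI"},
    "lung": {"CT"},
}


def parse_round(line, simulation):
    """Return (age, modalities), or None for 'round: none'."""
    keyword, *fields = line.split()
    if keyword != "round:":
        raise ValueError(f"not a round specification: {line!r}")
    if fields == ["none"]:
        return None                       # no screening rounds are used
    if len(fields) < 2:
        raise ValueError("a round needs an age and at least one modality")
    age = float(fields[0])                # assumption: ages may be fractional
    modalities = fields[1:]
    unsupported = [m for m in modalities if m not in ALLOWED[simulation]]
    if unsupported:
        raise ValueError(f"{unsupported} not usable with {simulation} simulations")
    return age, modalities


if __name__ == "__main__":
    print(parse_round("round:     50  Mammo", "breast"))
    print(parse_round("round:     50  CT", "lung"))
```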
    

Section `Tumor:'

This section starts with a line containing Tumor: and it defines the parameters specifying tumor characteristics. Several of the parameters in this section can be varied by specifying spread: true in the section Scenario:, in which case statistical variations are applied to these parameters.

Supported distributions are Normal, Uniform, LogNormal, and (for the lung-cancer Beir7 parameters) the Beta distribution. If value is the specified parameter value, and spread is the specified spread parameter, then the values that are actually used during the simulations are:

The spread parameters may not be negative. If spread values are configured then their distributions must also be specified. If spread is not specified, then the value parameter won't vary if spread: true is specified in the Scenario section. The same holds true for the Beta distribution: if no spreading should be applied, even though spread: true was specified, then the Beta distribution's specifications should be omitted.

The Tumor: section has five subsections: Beir7:, Growth:, Incidence:, Survival:, and S3:. They contain the following parameter specifications:

Beir7:

BEIR (tumor induction) parameters: only tumor induction type 7 (i.e., beir7) is used. The default configuration file contains specifications for breast cancer simulations and for male and female lung cancer simulations:


        #    eta   beta  spread   dist.
    breast:  -2.0   0.51   0.32  Normal

    #                            Beta-distribution parameters:
    # LC:    eta   beta  dist    constant factor   aParam    bParam
    male:    -1.4   .32  Beta    .234091  1.72727  2.664237  5.184883
    female:  -1.4  1.40  Beta    .744828   .818966 3.366115  4.813548
       
If spread: true is specified then the actually used beta parameters are drawn from their respective distributions.

See also National Research Council. 2006. Health Risks from Exposure to Low Levels of Ionizing Radiation: BEIR VII Phase 2. Washington, DC: The National Academies Press (https://doi.org/10.17226/11340).

Growth:

Tumor growth specifications consist of three elements: start diameters, self-detect parameters and doubling time specifications.

The start parameters define the start diameters of emerging tumors used with, respectively, breast and lung cancer simulations. The default configuration file specifies


        #  breast   lung
    start:    5       3
        

The default configuration file contains these specifications of the self-detect parameters for breast and lung cancer simulations:


    #selfDetect:      # stdev       mean    spread  dist
    breast:             .70         2.92    .084    Normal
    lung:               .014        3.037   .61     Normal
        

Four parameters are used to determine the diameter at which self-detection is possible. These parameters are:

The actually used self-detect diameter is computed using:


    diameter = L(mean, stdev)
   

Finally, the Growth: subsection also defines tumor doubling times for various age groups when using breast cancer simulations and for all ages when using lung cancer simulations.

Doubling times are computed like the self-detect diameters, i.e., using lognormal distributions. Thus, for each age group and for the lung cancer simulation four parameters are specified (of which the final two are optional): the standard deviation of the lognormal distribution, the mean value of the lognormal distribution, and the spread and name of the distribution that is used when spread: true was specified.
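Drawing a self-detect diameter or a doubling time can be sketched as follows. The sketch assumes that L(mean, stdev) denotes a lognormal distribution whose underlying normal distribution has the given mean and standard deviation; the function name draw_lognormal is hypothetical, and the variation applied when spread: true is specified is not modelled.

```python
import math
import random


def draw_lognormal(mean, stdev, rng=random):
    """Draw one value from L(mean, stdev) (assumed log-scale parameters)."""
    return rng.lognormvariate(mean, stdev)


if __name__ == "__main__":
    rng = random.Random(1)                # fixed seed, for repeatability only
    # default self-detect parameters for breast cancer simulations
    diameter = draw_lognormal(2.92, 0.70, rng)
    # default doubling time parameters for age group 50 - 70
    doubling_time = draw_lognormal(5.06, 0.26, rng)
    print(diameter, doubling_time)
    # under the stated assumption the median of L(mean, stdev) is exp(mean)
    print(math.exp(2.92), math.exp(5.06))
```

Under this assumption the drawn values are always positive and are centered (in the sense of their median) around exp(mean).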

The age groups (used with breast cancer simulations) must cover ages 0 through the maximum age for simulated cases, and are specified as described at section BreastDensities:. The default configuration file contains the following specifications:


    DoublingTime:
        #                   stdev   mean  spread  dist.
        ageGroup:  1 - 50    .61    4.38   .43    Normal
        ageGroup: 50 - 70    .26    5.06   .17    Normal
        ageGroup: 70 - *     .45    5.24   .23    Normal

        #     all ages      stdev   mean  spread  dist.
        lung:                .21    4.59   .74    Normal
   

Incidence:

For breast cancer simulations three carrier types are supported: Normal, BRCA1 and BRCA2, each having a probability of occurrence. The probabilities of these carriers must add up to 1. In the default configuration file BRCA1 and BRCA2 are specified, but their probabilities are set to 0, in which case their specifications can also be removed from configuration files.

Each carrier is identified by name (i.e., when performing breast cancer simulations: Breast:, BRCA1:, and BRCA2:; when performing lung cancer simulations: Male: and Female:) followed by their parameter specifications:

The lifetime risk, mean age and standard deviation parameters may optionally be followed by the standard deviation (spread) and distribution used to vary the parameter when spread: true is specified.

The default configuration file contains these specifications:


Incidence:
    Male:
        #                   value   spread  distr.
        lifetimeRisk:         .22   .005    Normal
        meanAge:            72.48  1.08     Normal
        stdDev:              9.28  1.62     Normal

    Female:
        #                   value   spread  distr.
        lifetimeRisk:         .20   .004    Normal
        meanAge:            69.62  1.49     Normal
        stdDev:              9.73  1.83     Normal

    Breast:
        probability:    1
        #                   value   spread  distr.
        lifetimeRisk:         .226  .0053   Normal
        meanAge:            72.9    .552    Normal
        stdDev:             21.1

    BRCA1:
        probability:    0
        #                   value   spread  distr.
        lifetimeRisk:         .96
        meanAge:            53.9
        stdDev:             16.51

    BRCA2:
        probability:    0
        #                   value   spread  distr.
        lifetimeRisk:         .96
        meanAge:            53.9
        stdDev:             16.51
   

Survival:

For breast cancer simulations four types of survival parameters must be specified. Each type (a..d) specifies a mean, and (optionally) a spread and distribution (which are used when spread: true has been specified). The default configuration file specifies:


Survival:
    #              value        spread      dist:
    type:  a        .00004475   .000004392  Normal
    type:  b       1.85867      .0420       Normal
    type:  c       -.271        .0101       Normal
    type:  d       2.0167       .0366       Normal
        

For lung cancer simulations table S4 is used to determine the a..d parameters. Table S4 contains four categories (lung0..lung3) defining these a..d parameters, where (for a known cancer's diameter) the category is randomly determined using table S3 (see below).

Table S4 is appended to the breast cancer specifications. The default configuration file contains the following specifications of table S4:


   # table S4: 4 columns per a..d parameter
   # lungX: X is table S4's column index
   lung0: a        .00143      .00095      Normal
   lung0: b       1.84559      .33748      Normal
   lung0: c       -.22794      .07823      Normal
   lung0: d       1.06799      .16226      Normal

   lung1: a        .01530      .00381      Normal
   lung1: b       1.69434      .10979      Normal
   lung1: c       -.19358      .02105      Normal
   lung1: d        .66690      .03869      Normal

   lung2: a        .78600      .29815      Normal
   lung2: b        .69791      .05425      Normal
   lung2: c        .0          .0          Normal
   lung2: d        .0          .0          Normal

   lung3: a       1.25148      .32305      Normal
   lung3: b        .77852      .34149      Normal
   lung3: c        .0          .0          Normal
   lung3: d        .0          .0          Normal
        

S3:

With lung cancer simulations tables S3 and S4 are used to determine the survival parameters. The tumor's diameter determines the row of table S3, and then its column is randomly determined using the probabilities listed in that row. For each row the probabilities must sum to 1. Once the S3 column has been determined, its column index determines which of the lungX: specifications is used. The row and column indices are 0-based. E.g., if a tumor diameter is 24, then row 2 (diameter <= 30) is selected. Then, if the random value is .630, column 1 is used (column N1-3,M0). Whenever a tumor is present these pairs of indices are reported in the comma-separated data file in the column marked as TNM, using an entry like 2,1.

The default configuration file contains the following table S3:


    S3:
       #      diameter (mm)
       # T-row    <=       N0,M0   N1-3,M0   N1-3,M1a-b   N0-3M1c
       prob:      10:       .756    .157      .048        .039       # T1a,b
       prob:      20:       .703    .197      .055        .045       # T1b
       prob:      30:       .559    .267      .095        .078       # T1c
       prob:      50:       .345    .351      .167        .137       # T2a,b
       prob:      70:       .196    .408      .218        .178       # T3
       prob:       *:       .187    .347      .256        .210       # T4
        
(cf. https://www.sciencedirect.com/science/article/pii/S2667005421000491 and Table 1 of its appendix: https://ars.els-cdn.com/content/image/1-s2.0-S2667005421000491-mmc1.pdf)
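The row and column selection can be sketched in Python using the default table S3. The function names s3_row, s3_column and lung_tnm are hypothetical. Selecting the column by comparing a uniform random value to the cumulative row probabilities is an assumption about how the random determination is carried out, although it does reproduce the example given above.

```python
import random

# (upper diameter limit in mm, column probabilities); None represents '*'
S3 = [
    (10,   [.756, .157, .048, .039]),
    (20,   [.703, .197, .055, .045]),
    (30,   [.559, .267, .095, .078]),
    (50,   [.345, .351, .167, .137]),
    (70,   [.196, .408, .218, .178]),
    (None, [.187, .347, .256, .210]),
]


def s3_row(diameter):
    """Return the 0-based index of the first row whose limit is >= diameter."""
    for row, (limit, _) in enumerate(S3):
        if limit is None or diameter <= limit:
            return row


def s3_column(row, value):
    """Return the 0-based column selected by a random value in 0..1."""
    probabilities = S3[row][1]
    cumulative = 0.0
    for column, probability in enumerate(probabilities):
        cumulative += probability
        if value < cumulative:
            return column
    return len(probabilities) - 1         # guards against rounding in the sums


def lung_tnm(diameter, rng=random):
    """Return the (row, column) pair for a tumor of the given diameter."""
    row = s3_row(diameter)
    return row, s3_column(row, rng.random())


if __name__ == "__main__":
    row = s3_row(24)
    print(row, s3_column(row, .630))      # the example from the text: 2 1
```

The selected column index then identifies the lungX: category of table S4 (e.g., column 1 selects the lung1: specifications).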

bc:

When performing breast cancer simulations TNM indices (cf. the description of the S3 table) are also determined. Here the second TNM value is always 0, and the first TNM value is, as with table S3, determined by the tumor's diameter. The default configuration file contains the following bc: specification:


    #  BC TNM categories thru (<=) diameters (mm):
    bc:        20  50   *
    #  TNM:    T1  T2  T3
        
(cf. https://www.cancerresearchuk.org/about-cancer/breast-cancer/stages-types-grades/tnm-staging).

(see also option --tnm).
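For breast cancer simulations the lookup reduces to a threshold search on the diameter. A minimal Python sketch, assuming the default bc: limits (the function name breast_tnm is hypothetical):

```python
from bisect import bisect_left

BC_LIMITS = [20, 50]      # upper limits (mm) of T1 and T2; '*' covers the rest


def breast_tnm(diameter):
    """Return the (first, second) TNM indices; the second is always 0."""
    # bisect_left counts the limits that are smaller than the diameter, so a
    # diameter equal to a limit still belongs to that limit's category
    return bisect_left(BC_LIMITS, diameter), 0


if __name__ == "__main__":
    for diameter in (5, 20, 21, 50, 80):
        print(diameter, breast_tnm(diameter))
```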

BETA DISTRIBUTIONS

Values generated from Beta distributions range between 0 and 1 (cf. https://en.wikipedia.org/wiki/Beta_distribution). The Beta distribution is computed using two Gamma distributions (cf. https://www.fmrib.ox.ac.uk/datasets/techrep/tr03tb1/tr03tb1/node24.html, https://stats.stackexchange.com/questions/502146/how-does-numpy-generate-samples-from-a-beta-distribution):


    gamma1 = Gamma(aParam, 1)
    Beta(aParam, bParam) = gamma1 / (gamma1 + Gamma(bParam, 1))
        
With lung cancer simulations the 95% confidence interval (CI) of the beta parameter ranges from .15 to .70 for male cases, with a mean value of .32, and from .94 to 2.10 for female cases, with a mean value of 1.40.

The male and female CI ranges are transformed to .025 to .975 ranges using linear transformations. To transform values x from the male range to the .025 to .975 range the transformation y = 1.72727 * x - .234091 is used, and to transform back the transformation (y + .234091) / 1.72727 is used. For the female CI the transformations are y = .818966 * x - .744828 and (y + .744828) / .818966.

The aParam and bParam values were determined by first generating 1000 values so that their CI spans the range .025 to .975, with a mean value of 0.318635 (for male cases) and 0.401724 (for female cases). Next the parameters of the corresponding Beta distribution were estimated using maximum likelihood fitting, resulting in aParam = 2.664237 and bParam = 5.184883 for the distribution used with male lung cancer simulations, and aParam = 3.366115 and bParam = 4.813548 for the distribution used with female lung cancer simulations.

The default configuration file shows these values at the Beir7 beta parameters.
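The computations described in this section can be sketched in Python. The function names are hypothetical, and the assumption is made that a value drawn from the Beta distribution is transformed back to the scale of the beta parameter using the inverse transformation given above; how simrisc combines these steps internally is not specified here.

```python
import random


def draw_beta(a_param, b_param, rng=random):
    """Draw from Beta(aParam, bParam) using two Gamma distributions."""
    gamma1 = rng.gammavariate(a_param, 1.0)
    gamma2 = rng.gammavariate(b_param, 1.0)
    return gamma1 / (gamma1 + gamma2)


def to_unit_range(x, constant, factor):
    """Transform a value from the CI range: y = factor * x - constant."""
    return factor * x - constant


def from_unit_range(y, constant, factor):
    """Transform back to the beta parameter's scale: (y + constant) / factor."""
    return (y + constant) / factor


def draw_lung_beta(constant, factor, a_param, b_param, rng=random):
    """Assumed use: draw in 0..1, then transform back to the CI scale."""
    return from_unit_range(draw_beta(a_param, b_param, rng), constant, factor)


if __name__ == "__main__":
    # male: constant .234091, factor 1.72727, aParam 2.664237, bParam 5.184883
    print(to_unit_range(.15, .234091, 1.72727))   # approximately .025
    print(to_unit_range(.70, .234091, 1.72727))   # approximately .975
    print(draw_lung_beta(.234091, 1.72727, 2.664237, 5.184883))
```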

PARAMETER RESPECIFICATION

Parameters can be respecified by defining a separate parameter configuration file, by providing alternative parameter specifications in Analysis: sections of the program's input file, or by providing alternative parameter specifications as command-line arguments (cf. the simrisc(1) man-page).

FILES

SEE ALSO

simrisc(1)

BUGS

Versions before version 15.03.00 should not be used for lung cancer simulations. The bug invalidating lung cancer simulations was repaired in version 15.03.00.

COPYRIGHT

This is free software, distributed under the terms of the GNU General Public License (GPL).

AUTHOR

Frank B. Brokken (f.b.brokken@rug.nl).