diff --git a/doc/_static/petab_schema.yaml b/doc/_static/petab_schema.yaml index 107e54fd..95316be0 100644 --- a/doc/_static/petab_schema.yaml +++ b/doc/_static/petab_schema.yaml @@ -38,13 +38,26 @@ properties: files and optional visualization files. properties: - sbml_files: - type: array - description: List of PEtab SBML files. - - items: - type: string - description: PEtab SBML file name or URL. + model_files: + type: object + description: One or multiple models + + # the model ID + patternProperties: + "^[a-zA-Z_]\\w*$": + type: object + properties: + location: + type: string + description: Model file name or URL + language: + type: string + description: | + Model language, e.g., 'sbml', 'cellml', 'bngl', 'pysb' + required: + - location + - language + additionalProperties: false measurement_files: type: array @@ -78,8 +91,16 @@ properties: type: string description: PEtab visualization file name or URL. + mapping_files: + type: array + description: List of PEtab mapping files. + + items: + type: string + description: PEtab mapping file name or URL. + required: - - sbml_files + - model_files - observable_files - measurement_files - condition_files diff --git a/doc/documentation_data_format.rst b/doc/documentation_data_format.rst index f4d4272c..79e32368 100644 --- a/doc/documentation_data_format.rst +++ b/doc/documentation_data_format.rst @@ -2,7 +2,7 @@ PEtab data format specification =============================== -Format version: 1 +Format version: 2.0.0 This document explains the PEtab data format. @@ -41,12 +41,11 @@ Overview --------- The PEtab data format specifies a parameter estimation problem using a number -of text-based files (`Systems Biology Markup Language (SBML) `_ -and +of text-based files ( `Tab-Separated Values (TSV) `_) (Figure 2), i.e. -- An SBML model [SBML] +- A model - A measurement file to fit the model to [TSV] @@ -67,6 +66,9 @@ and - (optional) A visualization file, which contains specifications how the data and/or simulations should be plotted by the visualization routines [TSV] +- (optional) A mapping file, which allows mapping PEtab entity IDs to entity + IDs in the model, which might not have valid PEtab IDs themselves [TSV] + .. figure:: gfx/petab_files.png :alt: Files constituting a PEtab problem @@ -91,11 +93,11 @@ problem as such. - Fields in "[]" are optional and may be left empty. -SBML model definition ---------------------- - -The model must be specified as valid SBML. There are no further restrictions. +Model definition +---------------- +PEtab 2.0.0 is agnostic of specific model formats. A model file is referenced +in the PEtab problem description (YAML) via its file name or a URL. Condition table --------------- @@ -107,7 +109,7 @@ different experimental conditions). This is specified as a tab-separated value file in the following way: +--------------+------------------+------------------------------------+-----+---------------------------------------+ -| conditionId | [conditionName] | parameterOrSpeciesOrCompartmentId1 | ... | parameterOrSpeciesOrCompartmentId${n} | +| conditionId | [conditionName] | modelEntityId1 | ... | modelEntityId${n} | +==============+==================+====================================+=====+=======================================+ | STRING | [STRING] | NUMERIC\|STRING | ... | NUMERIC\|STRING | +--------------+------------------+------------------------------------+-----+---------------------------------------+ @@ -140,32 +142,44 @@ Detailed field description Condition names are arbitrary strings to describe the given condition. They may be used for reporting or visualization. -- ``${parameterOrSpeciesOrCompartmentId1}`` - - Further columns may be global parameter IDs, IDs of species or compartments - as defined in the SBML model. Only one column is allowed per ID. - Values for these condition parameters may be provided either as numeric - values, or as IDs defined in the SBML model, the parameter table or both. - - - ``${parameterId}`` - - The values will override any parameter values specified in the model. - - - ``${speciesId}`` - - If a species ID is provided, it is interpreted as the initial - condition of that species (as amount if `hasOnlySubstanceUnits` is set to `True` - for the respective species, as concentration otherwise) and will override the - initial condition given in the SBML model or given by a preequilibration - condition. If no value is provided for a condition, the result of the - preequilibration (or initial condition from the SBML model, if - no preequilibration is defined) is used. - - - ``${compartmentId}`` - - If a compartment ID is provided, it is interpreted as the initial - compartment size. - +- ``${modelEntityId}`` + + Further columns may be the IDs of model entities that have globally unique + IDs, such as parameters, species or compartments defined in the model to set + condition-specific values. Only one column is allowed per ID. + Values for these entities may be provided either as numeric values, or as IDs + of globally unique entity IDs as defined in the model, the mapping table or + the parameter table. + + Any non-``NaN`` value will override the original values of the model, or if + preequilibration was used, they will override the value obtained from + preequilibration. A ``NaN`` value indicates that the original value of the + model is to be used (when used in the preequilibration condition, or in the + simulation condition if no preequilibration is used) or that the result of + preequilibration is to be used (when used in the simulation condition after + preequilibration). + + The value in the condition table either replaces the initial value or the + value at all timepoints based on whether the model entity has a rate law + assigned or not: + + * For model entities that have constant algebraic assignments + (but not necessarily constant values), i.e, that do not have a rate of + change with respect to time assigned and that are not subject to event + assignments, the algebraic assignment is replaced statically at all + timepoints. Examples for such model entities are the targets of SBML + `AssignmentRules`. + + * For all other entities, e.g., those that are assigned by SBML `RateRules`, + only the initial value can be assigned in the condition table. If an + assignment of the rate of change with respect to time or event assignment + is desired, the values of model entities that are used to define rate of + change or event assignments must be assigned in the condition table. + If no such model entities exist, assignment is not possible. + + If the model has a concept of species and a species ID is provided, its + value is interpreted as amount or concentration in the same way as anywhere + else in the model. Measurement table ----------------- @@ -705,6 +719,49 @@ Detailed field description legend and which defaults to the value in ``datasetId``. +Mapping table +------------- + +Mapping PEtab entity IDs to entity IDs in the model. This optional file may be +used to reference model entities in PEtab files where the ID in the model would +not be a valid identifier in PEtab (e.g., due to inclusion of blanks, dots, or +other special characters). + +The TSV file has two mandatory columns, ``petabEntityId`` and +``modelEntityId``. Additional columns are allowed. + ++---------------+---------------+ +| petabEntityId | modelEntityId | ++===============+===============+ +| STRING | STRING | ++---------------+---------------+ +| reaction1_k1 | reaction1.k1 | ++---------------+---------------+ + + +Detailed field description +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- ``petabEntityId`` [STRING, NOT NULL] + + A valid PEtab identifier that is not defined in any other part of the PEtab + problem. This identifier may be referenced in condition, measurement, + parameter and observable tables, but cannot be referenced in the model + itself. + +- ``modelEntityId`` [STRING, NOT NULL] + + A globally unique identifier defined in the model, + *that is not a valid PEtab ID* (see :ref:`identifiers`). + + For example, in SBML, local parameters may be referenced as + ``$reactionId.$localParameterId``, which are not valid PEtab IDs as they + contain a ``.`` character. Similarly, this table may be used to reference + specific species in a BNGL model that may contain many unsupported + characters such as ``,``, ``(`` or ``.``. However, please note that IDs must + exactly match the species names in the BNGL-generated network file, and no + pattern matching will be performed. + Extensions ~~~~~~~~~~ @@ -743,7 +800,7 @@ Parameter estimation problems combining multiple models ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Parameter estimation problems can comprise multiple models. For now, PEtab -allows to specify multiple SBML models with corresponding condition and +allows one to specify multiple models with corresponding condition and measurement tables, and one joint parameter table. This means that the parameter namespace is global. Therefore, parameters with the same ID in different models will be considered identical. @@ -1070,6 +1127,8 @@ float values are demoted to boolean values. For example, in ``1 + true``, the expression is interpreted as ``true && true = true``. +.. _identifiers: + Identifiers -----------