Standardization of DSM2 Studies

Ralph Finch

Min Yu

October 25, 2005


The use of standards is common throughout society and industry.  For instance, automobiles have adopted standard placement of the accelerator, brake, and clutch foot pedals, regardless of make or model.  The benefit of this standardization—less confusion when using different cars and therefore much less risk of accident and injury—far outweighs any possible advantage that different pedal placement could offer.  In construction, states have adopted Uniform Building Codes that codify safe and efficient building practices, while still allowing architects freedom to design working and living spaces according to their clients’ needs and aesthetics.


The authors believe that studies involving DSM2 could also benefit by adopting standard processes and inputs.  By definition, however, “standards” must be adopted by almost all users to be effective.  Therefore this paper seeks to start a serious dialogue among the DSM2 modeling community that will result in a set of useful and practical standards.


Types of Studies

A DSM2 study consists of a series of model runs and involves several steps: assumptions about the proposed Delta configuration, input data and its description, and transformation of that data. Standards will be helpful for all steps of a DSM2 study.


DSM2 studies generally fall into one of two basic categories: historical-based or planning.  Planning studies can use either temporary or permanent barriers.


Historical-based studies, as the name implies, are based on historical (observed) inputs and Delta configuration.  The base run would usually be a complete historical run, perhaps identical to a validation run.  Selected inputs and/or components in the configuration that are to be studied would then be altered from historical values to serve as the comparison runs for the study.  Forecasting studies may be considered a variation of historical-based studies; the model is warmed up with recent historical data, then run with projected near-term future conditions.


Planning studies use greatly modified flows from CALSIM studies or similar sources.  They correspond only loosely to historical conditions, and may be better thought of as a synthetic hydrology.  To examine different flow regimes in the current Delta configuration, temporary barrier (gate) operations in the South Delta are calculated as a pre-run process and executed during the run.  To simulate expected future long-term configurations, permanent barriers are used in place of temporary barriers.  Their placement and operation are significantly different from those of temporary barriers.

Text vs. DB Version

The current production version of DSM2 receives non-time-varying input from text files.  This offers maximum flexibility, but it is very difficult to control file versions and to see how versions differ, resulting in a hodge-podge of subtly different files for different studies that are copied among users.


The database version of DSM2 eliminates much of the confusion and ambiguity of the text version, among other new features, by consolidating almost all non-time-varying data into a relational database (RDB).  Access to the RDB is through a graphical user interface.  The combination results in a system that is still flexible, but in which differences in component versions can be readily identified and shared in their exact form between users.  Because of this important addition, as well as other features, this paper assumes the use of the new, database version of DSM2.

Areas of Proposed Standards for DSM2

Delta Components (Input Data)

It is helpful to think of the Delta as consisting of many physical components (features) which can be selected to assemble a particular Delta configuration for a study.  The historical Delta serves as a starting point for all other configurations.  By altering some components (e.g. dredged channels, pumping capacity, flows), adding others (permanent barriers, a Through-Delta Facility [TDF]), and removing still others (temporary barriers), the final configuration can change significantly from the historical one.
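The idea of assembling a configuration as "the historical Delta plus and minus components" can be sketched in code.  The set-based representation and the component names below are illustrative assumptions for this paper, not part of DSM2 itself:

```python
# Sketch: a Delta configuration as a base set of components, modified by
# adding and removing features.  Component names are illustrative only.
HISTORICAL = {"temporary_barriers", "existing_channels", "current_pumping"}

def assemble(base, add=(), remove=()):
    """Build a study configuration from a base set of Delta components."""
    return (set(base) | set(add)) - set(remove)

# A hypothetical planning configuration: permanent barriers and a
# Through-Delta Facility replace the temporary barriers.
planning_config = assemble(
    HISTORICAL,
    add={"permanent_barriers", "through_delta_facility"},
    remove={"temporary_barriers"},
)
```

Under such a scheme, two studies that accept the same standard component definitions differ only in their add/remove lists, which makes their configurations directly comparable.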


Accepting standard parameters and placement of components not directly under study can result in substantial benefit to the DSM2 community.  For instance, in the past, studies were done of different configurations of Franks Tract and of the TDF, but not in coordination with each other.  Later, it was desirable to compare the two independently performed studies to develop a sense of the comparative water quality benefits of the two potential facilities.  But a comparison was not practical, since the studies used different input components: flows, exports, and Delta physical configuration.

Metadata (Study Documentation)

Metadata is a description of data.  In this discussion, metadata will be a description of the assumptions and input data comprising a study: the important CALSIM study descriptions (Level of Development, water quality standards, etc.), gate operations, explanation of barriers and operations, and so forth.  Metadata is important for documenting the characteristics of a study so that interested parties in the future know what went into it.

Scripts, Functions, Conversion Equations

Some inputs must be prepared before a planning DSM2 run, such as smoothing monthly CALSIM flows, calculating the Martinez EC boundary, and setting gate positions.  These inputs are prepared using a variety of scripts in Python and Excel, and different people have developed different methods to accomplish the same objective.  For instance, an internal review in Delta Modeling revealed four distinct scripts for calculating the Delta Cross Channel gate position.  Standard scripts should be developed for each common function, well documented, and placed in a shared common area.  These can also be made available via the Informix relational database system.
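As an illustration of what one shared, well-documented standard script might look like, here is a minimal sketch that interpolates monthly-average flows to a daily series.  The function name and the choice of linear interpolation between month midpoints are assumptions made for illustration only; a production standard script would, for example, also need to preserve monthly volumes:

```python
from datetime import date, timedelta

def smooth_monthly_to_daily(monthly):
    """Illustrative smoothing of monthly-average flows to a daily series
    by linear interpolation between month midpoints.

    monthly: list of ((year, month), flow_cfs) in chronological order.
    Returns a list of (date, flow_cfs) from the first to the last midpoint.
    """
    def midpoint(y, m):
        # Midpoint of the month, used as the interpolation anchor.
        first = date(y, m, 1)
        nxt = date(y + m // 12, m % 12 + 1, 1)
        return first + (nxt - first) / 2

    anchors = [(midpoint(y, m), q) for (y, m), q in monthly]
    daily = []
    for (d0, q0), (d1, q1) in zip(anchors, anchors[1:]):
        span = (d1 - d0).days
        for i in range(span):
            t = i / span  # fraction of the way to the next midpoint
            daily.append((d0 + timedelta(days=i), q0 + t * (q1 - q0)))
    daily.append(anchors[-1])
    return daily
```

Agreeing on one such script (and one interpolation method) would remove a source of unexplained differences between otherwise identical studies.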


In this category we also include empirical equations to convert between water quality parameters (for instance, EC, Cl, TDS).  The conversions are quite dependent on assumptions made to account for the salt source (ocean or agriculture), and standardizing on accepted equations would remove another variable in the effort to make studies directly comparable.
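A standard conversion might be published as a single shared function with agreed, source-dependent coefficients.  The sketch below shows the shape of such a standard: the linear form is typical of empirical water quality fits, but the coefficient values here are placeholders for illustration, not adopted numbers:

```python
# PLACEHOLDER coefficient sets keyed by assumed salt source.  A DSM2
# standard would publish one agreed set per source; the numbers below
# are illustrative only, not adopted values.
EC_TO_CL_COEFFS = {
    "ocean": {"slope": 0.15, "intercept": -12.0},
    "agricultural": {"slope": 0.12, "intercept": -5.0},
}

def ec_to_cl(ec_umhos_cm, source="ocean"):
    """Convert EC (umhos/cm) to chloride (mg/L) with a linear fit,
    Cl = a * EC + b, using the coefficient set for the given salt source."""
    c = EC_TO_CL_COEFFS[source]
    return c["slope"] * ec_umhos_cm + c["intercept"]
```

Fixing the coefficients (and the source-classification rules) in one place would remove another variable in the effort to make studies directly comparable.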

Output Locations and Analyses

Even the best model studies are of little benefit if their output and its reduction are not carefully designed.  While individual studies will certainly have their own output and reporting requirements, some general output locations and types of data, and standard reduction and reporting, will be important for comparing results between studies.

Input Variation/Time Scales

In the real Delta, hydrologic and tidal variations happen every instant.  In the model world such detail is not possible, so simplifying assumptions are made.  Currently DSM2 uses a mix of variations: CALSIM hydrology is usually produced with a monthly average variation; the adjusted astronomical tide at Martinez is an improvement over the previously used repeating tide; and a 16-year period is assumed to represent the larger 73-year period, which in turn is assumed to represent future hydrology.


How good are these assumptions?  Why not just routinely use a 73-year run with adjusted astronomical tide and daily hydrology?  We are close to being able to do the latter, but running time and post-run analyses may still be too long to do so routinely.  If so, we need information on what kind of simplifications we can safely make without compromising the results.
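The kind of check described above can be automated.  The sketch below measures the detail lost by monthly averaging as the largest deviation of a daily series from its month-mean version; the fixed 30-day "months" are a simplifying assumption for illustration (a real check would use calendar months):

```python
from statistics import mean

def monthly_average_error(daily, days_per_month=30):
    """Quantify the detail lost by monthly averaging: replace each
    month's daily values with the month's mean, then report the largest
    absolute deviation from the true daily series.

    daily: a flat list of daily values, chunked into fixed-length
    'months' here for simplicity.
    """
    worst = 0.0
    for i in range(0, len(daily), days_per_month):
        month = daily[i:i + days_per_month]
        m = mean(month)
        worst = max(worst, max(abs(v - m) for v in month))
    return worst
```

Running a statistic like this at each standard output location would show where monthly averaging is safe and where it loses important variation.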

Status of Establishing Standards

Input Data

From discussions in email and meetings we have agreed that the CBDA-BDPAC Common Assumptions Work Team effort will largely satisfy DSM2 input data standards.  Those DSM2 or Delta components not addressed by Common Assumptions will be developed using the same discussion methods, as will details of implementation (for instance, how to specify a gate or operating rule within DSM2).



Metadata

Some discussion has occurred with interested members of the Users Group, but detailed implementation of this item will wait until the new version of DSM2 is more widely distributed.  The new version’s use of a relational database could make it easier to create, store, and retrieve metadata for studies.  HDF5 and DSS could also be used for general metadata storage.


Some items that could be metadata:


·      Program name and version producing data (e.g. DSM2-Hydro 7.3.1; HEC-DSSVue 4.3)

·      Agency/Division name producing study (e.g. DWR/DMS)

·      Person doing run

·      Date and time of run

·      Study name (e.g. Franks Tract)

·      Study description (e.g. "A series of studies to examine the configuration of Franks Tract for best water quality.")

·      Run name (e.g. FT-Base, FT-Alt1, SDPGAlt-NoGates)

·      Run description (e.g. "Base--standard Delta configuration”; “Alt1--Franks Tract hydraulically closed.")

·      Output location in standard notation (whatever the standard output notation is...)

·      Output location type (e.g. channel, reservoir, node)

·      Output type (e.g. flow, stage, EC)

·      Output units (e.g. cfs, feet, umhos/cm)

·      Output time interval (e.g. 1Hour, 1Day)

·      Output measurement type (e.g. instantaneous, average, running average)

·      Other?
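To make the list concrete, a run's metadata could be captured as a simple key-value structure and serialized for storage alongside the output.  The keys, the JSON format, and the operator name below are illustrative assumptions only; the same items could equally be held in the RDB, in HDF5, or in DSS:

```python
import json
from datetime import datetime

# Sketch of per-run metadata using the items listed above.  Key names,
# the JSON serialization, and the operator name are illustrative.
run_metadata = {
    "program": "DSM2-Hydro 7.3.1",
    "agency": "DWR/DMS",
    "operator": "myu",  # hypothetical user name
    "run_datetime": datetime(2005, 10, 25, 9, 30).isoformat(),
    "study_name": "Franks Tract",
    "study_description": "A series of studies to examine the "
                         "configuration of Franks Tract for best "
                         "water quality.",
    "run_name": "FT-Alt1",
    "run_description": "Franks Tract hydraulically closed.",
    "output": {
        "location_type": "channel",
        "type": "EC",
        "units": "umhos/cm",
        "interval": "1Hour",
        "measurement": "average",
    },
}

# Serialize for storage next to the run's output files.
meta_json = json.dumps(run_metadata, indent=2)
```

Because every field is machine-readable, a future user could query a collection of such records to find, say, all hourly EC runs from a given study.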


Marianne Guerin (CCWD) provided the following table of metadata definitions:




Modeling project – A carefully planned and organized effort to accomplish a specific (and usually one-time) goal. It has a well-defined beginning and end. Composed of one or more components. (e.g. SDIP)

Modeling study – A well-defined component of a project, composed of tasks and activities whose end product is one or more mathematical, conceptual and/or numerical models; a series of related modeling simulations in a modeling project. (e.g. 2003DEIR, 2020Revfish)

Model simulation – A well-defined set of modeling assumptions realized as a group of input files to one or more numerical models. (e.g. 2020Revfish_3barrier)

Model run – The output from ‘running’ a numerical model using a single model simulation.

Base case – The set of model assumptions for a single simulation among a group of simulations that is used as the standard to which all other simulations are compared. (e.g. CALSIM 2020Base)

Alternative case – The set of model assumptions for a single simulation among a group of simulations that is considered as an alternative to a ‘base case’.

Single case – The ‘base case’ when there are no alternative cases proposed.

Model assumptions – A well-defined set of conceptual or numerical conditions, such as initial or bounding conditions, used as a basis for preparing input to a numerical model.

Numerical model – (definition not given)

Conceptual model – (definition not given)

Mathematical model – (definition not given)

Input files – The set of files used as instructions to run a numerical model.

Output files – The set of files produced in running a numerical model.

Model documentation – Written or graphical instructions, summaries and explanations used to guide the implementation of a numerical model. Can include comments within input files to or output files from a numerical model.



Scripts and Conversions

Little progress has been made; but see Output Locations and Analyses.


Output Locations and Analyses

A tentative list of standard output locations and types of data was developed by interacting with interested users; these will be stored in the new DSM2 Output Time Series, and refined when use of the new version is more widespread.


A short but interesting discussion occurred over analyzing output.  In question are two items: what kind of basic yet informative analyses should be standardized, and how should they be done?  Jones & Stokes, CH2M Hill, DWR, and others have developed their own analyses using HEC’s DSSMATH batching scripts or HEC-DSSVue, Excel/VBA, and Vscript/Vplotter routines.
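As a strawman for that discussion, a standardized "basic yet informative" reduction could be as simple as a fixed set of summary statistics computed identically by everyone, regardless of tool.  The particular statistics below are illustrative, not an agreed standard:

```python
def standard_summary(values):
    """One candidate standard reduction for a run's output series:
    min, max, mean, and the 90th-percentile value (an exceedance-style
    statistic).  The choice of statistics is illustrative only."""
    s = sorted(values)
    # Nearest-rank 90th percentile.
    p90 = s[min(len(s) - 1, int(0.9 * len(s)))]
    return {
        "min": s[0],
        "max": s[-1],
        "mean": sum(s) / len(s),
        "p90": p90,
    }
```

If every group produced this same summary at the standard output locations, results from DSSMATH, Excel/VBA, or Vscript workflows could be compared at a glance.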


Input Variation/Time Scales

Some progress has occurred with this item.  A preliminary analysis, done by comparing the historical-data run with a run using monthly-averaged historical data, shows that in some locations at some times the monthly averages lose desirable detail, but at other times monthly averaging is not important.  The next test will be to compare a CALSIM monthly-timestep run with a CALSIM daily-timestep run using the new version of DSM2.  This should reveal greater differences: although the two CALSIM runs use the same input boundary hydrology, their intra-monthly operations are somewhat different, and with active operating rules in DSM2, daily variations in flows will be more important than with the text version of DSM2.


DWR has implemented Condor, a distributed high-throughput computing system that takes advantage of unused machines on a network.  With this system, testing 73-year DSM2 runs will become more feasible.