Transcript: An Objective Test of Riverine Water Quality Models

Fri Sep 19 2008


[Ed: A video version of this presentation is on the way, complete with slides and narration. The transcript here is provided as an alternative.]

Hi, my name is Geoff Parker and this is an [online transcript] of the talk entitled ‘An Objective Test of Riverine Water Quality Models’, given on September 9th at the 2008 IWA World Water Congress in Vienna, Austria.

So, why are we interested in water quality models enough to bother with this? Well, here are two sources that describe the types of cases in which they might be useful, and in particular the problem is one of spatial and temporal changes of constituents in river systems. So what are the constituents of concern? Dissolved oxygen (DO), nutrients like N and P, toxics like atrazine and other pesticides – these are the types of things we’re generally talking about. And the processes we’re interested in are the ones that drive the spatial and temporal changes – and they range from physical processes like sedimentation to chemical processes like redox reactions to biological processes like nutrient uptake. Where are we interested in such things? Well, in a cartoon view of a watershed, we’re talking mostly about what’s going on in the riverine system itself, which we’ll generally model as a 1-D system.

Having covered the why, the what, and the where – now we move on to the crux of the problem, the how – which is also really the when, as we’ll see. So the way we usually see these problems approached is through what we call ‘deterministic’ paradigms. Many different flavours of coded models exist for these types of problems, but fundamentally they rely on variations of a handful of conceptual models. So we can pose a general formulation of these models, based on three major terms: an advection term, a dispersion term, and a conversion-kinetics term. In IOGA terms, the first two are the input and output drivers while the last is the generation component. And for the most part, this is the formulation of models as they have been used since Streeter-Phelps.
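For readers following along without the slides, one common textbook way of writing a three-term formulation of this kind is given below. It assumes a 1-D reach with constant velocity and dispersion coefficient; the symbols are illustrative and are not taken from the talk.

```latex
\frac{\partial C}{\partial t}
  = \underbrace{-\,u\,\frac{\partial C}{\partial x}}_{\text{advection}}
  \;+\; \underbrace{D\,\frac{\partial^{2} C}{\partial x^{2}}}_{\text{dispersion}}
  \;+\; \underbrace{R(C)}_{\text{conversion kinetics}}
```

Here $C$ is the constituent concentration, $u$ the mean flow velocity, $D$ the longitudinal dispersion coefficient and $R(C)$ the net rate of conversion, for example $R(C) = -kC$ for simple first-order decay.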

The problem, however, is that stochastic issues – those dealing with model uncertainty – are really the issue of the day in modeling, either expressly or, probably even more often, in an implied manner seen in the use, misuse and/or mistrust of model results. So let me give you an example now of what I mean when I talk about deterministic approaches. Here we have a schematic of the major parts of the model construction process.

If we’re going to use this approach, the first step usually is to identify the data points against which everything is ultimately measured. With that done, the process of adjusting parameters to fit the model results begins. So we adjust one way… and the model fits well in some places but maybe not others, so we adjust the other way… and the model responds, or maybe over-responds, until eventually… we reach a compromise that gives what we deem to be an acceptable fit. And maybe then someone comes along and helpfully suggests: “Wait a second – what about the uncertainty here? Or the model sensitivity? Have you considered those?” So what is usually done then to placate such concerns is to perturb the parameter set within a given range and observe the range of the outputs around the calibrated model. In this context, one of the questions we’re asking today is whether this parameter-related uncertainty is really the only issue we should be concerned with. Can we maybe identify other sources of uncertainty, for example structural sources based on conceptual and/or discretization differences? So our proposed approach takes a bit of a different tack on things.
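To make the conventional perturbation step concrete, here is a minimal sketch. It assumes a toy first-order decay model standing in for a real water quality model; the function names, parameter values and the ±20% range are all hypothetical and chosen only for illustration.

```python
import numpy as np

def toy_model(k, u, x, c0=10.0):
    """Toy stand-in for a water quality model (assumption, not from the talk):
    first-order decay of a constituent along a 1-D reach."""
    return c0 * np.exp(-k * x / u)

def perturbation_band(calibrated, x, rel_range=0.2, n_samples=500, seed=0):
    """Perturb each calibrated parameter uniformly within +/- rel_range and
    return the minimum and maximum model output at each location."""
    rng = np.random.default_rng(seed)
    k0, u0 = calibrated
    k = k0 * rng.uniform(1 - rel_range, 1 + rel_range, n_samples)
    u = u0 * rng.uniform(1 - rel_range, 1 + rel_range, n_samples)
    # Broadcast to an array of shape (n_samples, n_locations).
    runs = toy_model(k[:, None], u[:, None], x[None, :])
    return runs.min(axis=0), runs.max(axis=0)

x = np.linspace(0.0, 50.0, 6)   # distance downstream (km), illustrative
calibrated = (0.3, 20.0)        # decay rate (1/day), velocity (km/day), illustrative
baseline = toy_model(*calibrated, x)
low, high = perturbation_band(calibrated, x)
```

Note that every sample here is drawn from the neighbourhood of one calibrated parameter set, so the resulting band only describes variation around that single solution.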

The first thing we stress is the need to discard the notion of a correct calibration. Instead, we recognize that calibration is simply the process of identifying one of potentially many representative sets of model parameters that match a given data set under given calibration criteria, whether formal or not. If we recognize this, then we can further posit that any representative set meeting those criteria is just as valid as any other that does. And together, these sets of representative parameters (and their associated outputs) form what we call a model response surface. We can further argue that the response surface and the representative parameter sets themselves are likely to be dependent on the particular conceptual model structure used.
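As a rough illustration of the idea, the sketch below samples parameter sets widely, keeps every set that meets a calibration criterion, and treats the envelope of the retained outputs as the response surface. The toy model, the RMSE criterion, the sampling ranges and the observation values are all assumptions made for this example and are not the ones used in the study.

```python
import numpy as np

def toy_model(k, u, x, c0=10.0):
    """Toy stand-in for a water quality model (assumption): first-order decay."""
    return c0 * np.exp(-k * x / u)

def representative_sets(obs_x, obs_c, tol, n_samples=5000, seed=1):
    """Sample (k, u) uniformly over wide ranges and keep every set whose RMSE
    against the observations is within tol (hypothetical calibration criterion)."""
    rng = np.random.default_rng(seed)
    k = rng.uniform(0.05, 1.0, n_samples)
    u = rng.uniform(5.0, 50.0, n_samples)
    sim = toy_model(k[:, None], u[:, None], obs_x[None, :])
    rmse = np.sqrt(np.mean((sim - obs_c) ** 2, axis=1))
    keep = rmse <= tol
    return k[keep], u[keep]

obs_x = np.array([10.0, 40.0])   # observation locations (km), made up
obs_c = np.array([8.6, 5.5])     # observed concentrations, made up
k_rep, u_rep = representative_sets(obs_x, obs_c, tol=0.3)

# Outputs of all representative sets along the reach, including beyond the data.
x = np.linspace(0.0, 80.0, 9)
surface = toy_model(k_rep[:, None], u_rep[:, None], x[None, :])
lower, upper = surface.min(axis=0), surface.max(axis=0)
mean = surface.mean(axis=0)
```

In this toy case quite different (k, u) pairs fit the two observations equally well, because only their ratio matters; none of the retained sets is treated as the single correct calibration.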

Now, if we revisit the previous framework, then we move from this view of things to a view more akin to this, where we formally admit the existence of many acceptable model sets, and the inherent ‘backward’ flow of output data through the system. If we can somehow now identify those sets of models and parameters that the data supports, we can also identify and consider the set, or surface, of model predictions supported by the calibration criteria. The important thing to recognize here is that we have the opportunity in this case to identify different ‘families’ of representative sets, whereas the previous approach will tend to oversample one or at best a few variations around the same family of solutions.

With that I can now move on to some of the specifics of our case study – which, again, involved the comparison of surface responses for two different conceptual models. Qual2e (or k, depending on who you talk to) is based around the conceptual structure seen here, with N and P cycles and BOD and DO states considered, as well as chlorophyll a. If we adapt this chart now to the other investigated model, one of the Mike11 ones, we see that it is simpler in some respects – the lack of a P cycle, the simplified N cycle and the absence of chlorophyll a are notable – but it also models BOD in separate compartments, with a dissolved phase, a settleable phase and a settled phase.

The site we chose for our study looks like this, and is called the Potomac River basin, located in the Chesapeake Bay region in the eastern United States. The particular area we focused on is this area here, which is the non-tidal flow portion of the watershed. Schematically, we can highlight the major portions of the dendritic river system examined, and in particular two major branches flowing from sources in the north and south east to the outlet in the west. For our purpose here, we identify these branches as the primary and secondary branches. One reason this area was selected for this work was the relative wealth of flow-stage and water quality data for the 1992-1995 time period, when it was one of the first basins for which a USGS National Water-Quality Assessment (NAWQA) study was undertaken. These red squares indicate the location of what NAWQA calls the major integrator stations for water quality data. So as you can see, even in this watershed, over a period of intense study in an area for which information is relatively abundant, we’re clearly still talking about a sparse data situation.

Going back to the major parts of our approach, I’ve talked a little already about the conceptual models we’re using, so let’s move on now to the inverse modeling toolset we developed, dubbed Qual-IT, and then we’ll look at some examples of what we mean by surface plots and some results. So here are listed some key features of the toolset we developed. I’ll spare you much of the nitty-gritty, but basically the approach is to use a series of processing steps that are based on modular concepts and that can, in particular, take advantage of parallel processing schemes when possible, which is an important consideration in commodity computing.

The major components are the inverse model proper, which calls the selected water quality model and maps a parameter set and condition to a given run number, and the model fitness tester, which decides whether the behaviour of a model can be considered representative for the given data and calibration criteria, and discards those that are not. The third component listed is really the second again, but is listed again because we decided that in many cases a sequential processing strategy can also be advantageous in terms of accelerating the computation as it progresses.
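The two components and the sequential strategy can be sketched roughly as follows. This is not the Qual-IT code: the function names, the decay expression standing in for a call to the external water quality model, and all numbers are hypothetical and serve only to show the structure.

```python
import math
from itertools import product

def enumerate_runs(param_grid, conditions):
    """Map every (parameter set, condition) pair to a run number."""
    names = sorted(param_grid)
    combos = product(*(param_grid[n] for n in names), conditions)
    runs = {}
    for run_no, combo in enumerate(combos):
        *values, condition = combo
        runs[run_no] = (dict(zip(names, values)), condition)
    return runs

def make_station_test(travel_time, observed, tol):
    """Build a fitness test for one station (hypothetical criterion).
    The decay expression stands in for a call to the water quality model."""
    def test(params, condition):
        simulated = condition["c0"] * math.exp(-params["k"] * travel_time)
        return abs(simulated - observed) <= tol
    return test

def sequential_filter(runs, tests):
    """Apply the tests in order, discarding failed runs before the next test
    so that later (possibly costlier) tests see fewer candidates."""
    surviving = dict(runs)
    for test in tests:
        surviving = {n: r for n, r in surviving.items() if test(*r)}
    return surviving

runs = enumerate_runs({"k": [0.1, 0.2, 0.3, 0.4]}, [{"c0": 10.0}, {"c0": 12.0}])
tests = [
    make_station_test(travel_time=1.0, observed=8.2, tol=0.5),
    make_station_test(travel_time=3.0, observed=5.5, tol=0.4),
]
after_first = sequential_filter(runs, tests[:1])
representative = sequential_filter(runs, tests)
```

In this toy setup two runs pass the first station test and only one of them also passes the second. Because each run is evaluated independently within a stage, the per-run work is also a natural candidate for distribution across parallel workers.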

Let’s look at some results now from an annual-based calibration strategy and hopefully you’ll get a better idea of what I’m talking about. So, here we have the ammonia species concentration data for the basin outlet over the course of a year, and if we look at the representative model set derived for the Mike11 model, we get a surface response that looks something like this. Note that the dark orange line denotes the mean of the representative values, and not a parent deterministic model in particular. So OK, that’s Mike11; now if we do the same for the Qual2e response, we get a surface like that shown in green. And we can see there is quite a bit of overlap between the two responses, but there are also some areas with greater disaccord than others.

Another constituent of concern is dissolved oxygen, so here we can do the same, and the story is a little different this time, with the Mike11 model tending to track reasonably consistently with the monthly data after an annual calibration, whereas the Qual2e response ‘strategy’ is a little more focused on staying near those mean values throughout the year.

Well, how do we formally measure and describe these types of things in a reasonable way for comparison and assessment? Well, we should recognize that the issue here is really measuring the dispersion of models as we move away from calibration and observed data points. This is what cosmologists would call a time-space uncertainty cone, which increases with both spatial and temporal distance away from the observation events. So if we look at a response surface in the watershed, such as this one, which disperses as it flows away from the data point to the right of the plot, then what we’re really trying to measure is the distances here, and how they evolve in the dendritic system. To do that, we’ve used an aggregate coefficient of variation (ACV) measure, which is based on the individual coefficients of variation of the ammonia, BOD and DO components. Looking at the base calibration set, the aggregate Mike11 model response looks like this, with the darker line indicating the ACV in the secondary branch. In comparison, the Qual2e model response looks like this. We also examined a higher-fitness scenario, based on more stringent calibration criteria for each model, and in this case, the Mike11 response looks like this, with the Qual2e model behaving this way.
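A minimal sketch of such a measure is given below. The talk does not say how the individual coefficients of variation are combined, so this example simply assumes an unweighted mean of the three; the ensemble values are made up, with all members agreeing at the first location (the data point) and spreading out at the others.

```python
import numpy as np

def coefficient_of_variation(ensemble):
    """CV (standard deviation / mean) across the representative model set at
    each location. `ensemble` has shape (n_members, n_locations)."""
    ensemble = np.asarray(ensemble, dtype=float)
    return ensemble.std(axis=0) / ensemble.mean(axis=0)

def aggregate_cv(ensembles):
    """Combine the per-constituent CVs at each location.
    ASSUMPTION: an unweighted mean; the paper may define this differently."""
    cvs = [coefficient_of_variation(e) for e in ensembles.values()]
    return np.mean(cvs, axis=0)

# Made-up outputs of three representative models at three locations.
ensembles = {
    "ammonia": [[0.5, 0.4, 0.3], [0.5, 0.5, 0.5], [0.5, 0.6, 0.7]],
    "BOD":     [[4.0, 3.0, 2.0], [4.0, 4.0, 4.0], [4.0, 5.0, 6.0]],
    "DO":      [[8.0, 7.5, 7.0], [8.0, 8.0, 8.0], [8.0, 8.5, 9.0]],
}
acv = aggregate_cv(ensembles)
```

With these numbers the ACV is zero at the data point and grows with distance from it, which is the cone-like dispersion described above.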

The bulk of the results are provided in the paper itself, but generally speaking here are some of the trends we observed between the model responses. The Mike11 model generally performs a little better at the outlet, with an apparently greater dependence on the availability of data at specific locations in the watershed. Interestingly, the Qual2e model doesn’t really seem to care where in the system the data points are, and tends to optimize itself everywhere based on the data everywhere. Equally, as the monthly to annual calibration results showed, the Qual2e model displays more model inertia in the temporal calibration process than the Mike11 model.

Moving forward, where does this work fit in to other problems? Well, the kind of insight that can be gained with the methodology proposed here has potentially many applications, ranging from issues of model equifinality or identifiability to sampling strategies, calibration criteria selection and model coupling issues. And those last two in particular are some of the areas I’ve been focusing on in some of my publications and work the last few years.

Thank you so much for your attention, and I hope you found it interesting. If you think any of this may be of interest to you or your team, or you’d just like some more information, please feel free to contact me.


