Archive

Archive for the ‘Modeling Tips’ Category

What is Delta Time (DT)?

August 3rd, 2010

Nat Pierson Modeling Tips

After reading Karim Chichakly’s recent post on Integration Methods and DT, I was reminded that delta time (DT) has always been a tricky modeling concept for me to grasp.   Beginning modelers don’t usually need to think about changing DT since STELLA and iThink set it to a useful default value of 0.25.   But once you progress with your modeling skills, you might consider the advantages and risks of playing with DT.

The DT setting is found in the Run Specs menu.

By definition, system dynamics models run over time and DT controls how frequently calculations are applied each unit of time.  Think of it this way, if your model was a movie, then DT would indicate the time interval between still frames in the strip of movie film.  For a simulation over a period of 12 hours, a DT of 1/4 (0.25) would give you a single frame every 15 minutes.  Lowering the DT to 1/60 would give a frame every minute.   The smaller the DT is, the higher the calculation frequency (1/DT).

Beware of the Extremes

A common tendency for modelers is to set the calculation frequency too high.  Without really thinking too hard about it, more data seems to imply a higher quality model – just like more frames in movie film make for smoother motion.  If your model calculates more data for every time unit, its behavior will begin to resemble the behavior of a smoothly continuous system.  But a higher frequency of calculations can greatly slow down your model’s run performance and more data does not directly translate to a better simulation.

Beware of Discrete Event Models

Another situation where DT can often lead to unexpected behavior is with models that depend on discrete events.   My eyes were opened to this when I attended one of isee’s workshops taught by Corey Peck and Steve Peterson of Lexidyne LLC.

One of the workshop exercises involved a simple model where the DT is set to the default 0.25, the inflow is set to a constant 10, and the outflow is set to flush out the stock’s contents as soon as it reaches 50.   This is how the model’s structure and equations looked:

Discrete Model

Stock = 0

inflow = 10

outflow = IF Stock >= 50 THEN 50 ELSE 0

I would have expected the value of the stock to plunge to zero after it reached or exceeded 50, but this graph shows the resulting odd saw-tooth pattern.

Sawtooth Model Behavior

The model ends up behaving like a skipping scratched record, in a perpetual state of never progressing far enough to reach the goal of zero.  (Click here to download the model.)

What is happening in the model?  In the first DT after the stock’s value reaches exactly 50, the outflow sets itself to 50 in order to remove the contents from the stock. So far so good, but now the DT gotcha begins to occur.   Since the outflow works over time, its value is always per time.  To get the quantity of material that actually flowed, you must multiply the outflow value (or rate) by how long the material was flowing.  When DT is set to 0.25,  the material flows 0.25 time units each DT.  Hence, the quantity of material removed from the stock is 50*0.25 = 12.50.

Suddenly we are in a situation where only 12.50 has been removed from the stock but the stock’s value is now less than 50.  Since the stock is no longer greater than or equal to 50, the outflow sets itself back to 0 and never actually flushes out the full contents of the stock. 

So what do we do?  One solution to this problem would be to use the PULSE built-in to remove the full value from the stock.   Here’s what the equation for the outflow would look like:

outflow = IF Stock >= 50 THEN PULSE(Stock) ELSE 0

(Note: This option will only work using Euler’s integration method.)

Further Reading

STELLA and iThink have great help documentation on DT.  The general introduction provides a good explanation of how DT works. The more advanced DT Situations Requiring Special Care section focuses more on artifactual delays and the discrete model issues mentioned in this post.  Delta time and resulting model behaviors are reminders that system dynamics models run over time, but they achieve this by applying numerous discrete calculations in order to simulate the smooth behavior of actual systems.

,

Integration Methods and DT

July 14th, 2010

Karim Chichakly Modeling Tips

The simulation engine underlying STELLA® and iThink® uses numerical integration.  Numerical integration differs from the integration you may have learned in Calculus in that it uses algorithms that approximate the solution to the integration.  The two approximations currently available are known as Euler’s method and the Runge-Kutta method.  All algorithms require a finite value for DT, the integration step-size, rather than the infinitesimally small value used in Calculus.  On the surface, it may seem that the smaller DT is, the more accurate the results, but this turns out not to be true.

Compound Interest:  Euler’s Method over Runge-Kutta

To introduce Euler’s method, let’s take a look at the simple problem of compound interest.  If we have $100 that we invest at 10% (or 0.1) compounded annually, we can calculate the interest after N years by adding in the interest each year and recalculating:

1st year:  interest = $1000 × 0.1 = $100; Balance = 1000 + 100 = $1100
2nd year: interest = $1100 × 0.1 = $110; Balance = 1100 + 110 = $1210
3rd year:  interest = $1210 × 0.1 = $121; Balance = 1210 + 121 = $1331

And so on up to year N.  We have just seen the essence of how Euler’s method works.  It calculates the new change in the stock for this DT (in this case, interest) and then adds that to the previous value of the stock (Balance) to get the new value of the stock.  In this example, DT = 1 year.

By noticing we always add the existing balance in, we can instead just multiply the previous year’s balance by 1 + rate = 1 + 0.1 = 1.1:

1st year:  Balance = $1000 × 1.1 = $1100
2nd year: Balance = $1100 × 1.1 = $1210
3rd year:  Balance = $1210 × 1.1 = $1331

And so on up to year N. We can further generalize by noticing we are multiplying by 1.1 N times and thus arrive at the compound interest formula:

Balance = Initial_Balance*(1 + rate)^N

Checking this, we find our Balance at the end of year 3 is 1000*1.1^3 = $1331.  In the general case of the formula, rate is the fractional interest rate per compounding period and N is the number of compounding periods (an integer).  In our example, the compounding period is one year, so rate is the annual fractional interest rate and N is the number of years.  However, if interest is compounded quarterly (four times a year), the interest rate has to be adjusted to a per quarter rate by dividing by 4 (so rate = 0.1/4 = 0.025) and N must be expressed as the number of quarters (N = number of years*4 = 3*4 = 12 for the end of year 3).  We can use this formula in our model to test the accuracy of Euler’s method.  Note that for quarterly compounding, we would set DT = 1/4 = 0.25 years.

To explore the differences between Euler’s and Runge-Kutta, the following structure will be used for all of the examples in this post.  This structure models the compound interest problem outlined above.

image

The equations change for each example and can be seen in the individual model files (accessed by clicking here).  For this example, the actual value is calculated using the compound interest formula, Initial_Balance*(1 + rate)^TIME.  The approximated value is calculated by integrating rate*Approx_Balance (into Approx_Balance).

In addition to the actual and approximate values, three errors are also calculated across the model run:  the maximum absolute error, the maximum relative error, and the root-mean-squared error (RMSE).  The absolute error is:

ABS(Actual_BalanceApprox_Balance)

The relative error is:

absolute_error/ABS(Actual_Balance)

and is usually expressed as a percentage.  The RMSE is found by averaging the values of the absolute error squared, and then taking the square root of that average.

Read more…

, , , ,

Steady-State Initialization of Conveyors

May 25th, 2010

Karim Chichakly Modeling Tips

Conveyors are useful model elements for representing pipelines or processes that take a certain amount of time to complete.  However, adding a leakage flow to a conveyor can make it difficult to initialize a model in steady-state.  The following discussion will explain how to initialize conveyors with leakage in steady-state.  Please refer to the model structure below while reading this discussion.

image

These additional variables will be also used:

transit_time = TRANSTIME(conveyor)
conveyor_length = transit_time/DT
leakage_fraction = the user-specified leakage fraction

Linear Leakage

The default leakage is linear in behavior.  The total amount that leaks across the length of the conveyor is directly proportional to the inflowing amount.  The leakage fraction is the constant of proportionality.  Thus, the fraction of inflowing material that makes it to the conveyor’s outflow is exactly

1 – leakage_fraction

Given the sample model structure above, to achieve equilibrium, conveyor_outflow must equal outflow.  For this to happen, we need to set the inflow as follows:

inflow = outflow/(1 – leakage_fraction)

The conveyor’s steady-state value is then:

conveyor = transit_time*inflow – (conveyor_length – 1)*leakage*DT/2

where the initial value of leakage is:

leakage = leakage_fraction*inflow

This must be calculated outside the program and entered as a constant into the conveyor as conveyors cannot be given equations (they can, however, be set to a the value of a single converter, but you must be careful how you calculate this to avoid circularity).

Exponential Leakage

Optionally, leakage can be made exponential.  The amount that leaks each DT is proportional to the amount remaining in the conveyor.  In this case, the leakage fraction is the fraction that leaks each unit of time so, for long conveyors, a lot of material can leak away.  Given the transit time, the fraction of inflowing material that makes it to the conveyor’s outflow is approximately

1 – (1 – leakage_fraction)^transit_time

Given the sample model structure above, to achieve equilibrium, conveyor_outflow must equal outflow.  For this to happen, we need to set the inflow as follows:

per_dt_no_leak = 1 – leakage_fraction*DT
inflow = outflow/(per_dt_no_leak^conveyor_length)

For steady-state, the conveyor itself must then be set as follows:

conveyor = (inflow*DT)*(1 – per_dt_no_leak^conveyor_length)/(1 – per_dt_no_leak)

, ,

Converting a Sector-based Model to Modules

March 17th, 2010

Karim Chichakly Modeling Tips

I generally do not use modules to build very small models (only a couple of stocks and flows), which may then lead me to use sectors as the model grows because they are very convenient.  By the time I have three sectors, though, it starts to become clear that I should have used modules.  I will then need to convert my sector-based model into a module-based model.  Historically, I also have a number of sector-based models that are crying to be module-based.

Converting from sectors to modules is not very difficult:

  1. Make sure there are no connections or flows between sectors.  Replace any of these with ghosts in the target sector.
  2. In a new model, create one module for every sector.
  3. Copy and paste the structure from each sector into its corresponding module.
  4. Connect the modules:  At this point, the model structure has been rearranged into modules, but none of the modules are connected.  The ghosts that were in the sectors became real entities when they were pasted into the modules.  Go back to identify all of these connections and reconnect them in the module-based model.

Stepping Through a Sample Model

Let’s walk through an example.  A small sector-based model is shown below (and is available by clicking here).

image

This model violates what I would call good sector etiquette:  there are connectors that run between the sectors.  This is often useful in a small model such as this because it makes the feedback loops visible.  However, in a larger model, this can lead to problems such as crossed connections and difficulty in maintaining the model because sectors cannot be easily moved.

Read more…

, , ,

Modeling Customers Switching Between Brands – The General Case

October 23rd, 2009

Karim Chichakly Modeling Tips

This is the last installment of a four-part series.  The first three parts can be accessed by clicking on the links below.
Methods for Using Arrays Effectively

Modeling a Watershed with Arrays
Modeling Customers Switching Between Brands

Generalizing the Model

When I showed Steve Peterson (at Lexidyne) my brand switching model, he told me there is a more general version that separates the customer loss fraction from the fraction won by another competitor.  This has been presented in Pharmaceutical Product Strategy by Mark Paich, Corey Peck, and Jason Valant.

In my original formulation, the switching probability matrix was the product of these two variables.  However, in many practical cases, the data available comes from two different places and reflects these two separate components.  The revised model structure is shown below.

image

Instead of one composite switching probability, this model uses a switching out probability that is distinct from the switching in probability.  The switching out probability is a one-dimensional array that, for each product, contains the fraction of customers lost to rivals every time unit (in our case, month).  A sample for the five brands A, B, C, D, and E appears below.

Brand Fraction Lost
A 0.091
B 0.170
C 0.046
D 0.026
E 0.071

switching out probability

We can see from this table that Brand B is losing 17% of its customers to rivals each and every month!  Whoever is managing that product had better do something quickly.

The other side of the story has to do with which brand the customers are switching to.  The switching in probability matrix contains, for each brand, the fraction of lost customers that migrate to a rival brand.  Thus, each row of this matrix must add up to one (100% of lost customers).  A sample appears below.

From\To A B C D E
A 0.00 0.11 0.33 0.55 0.01
B 0.18 0.00 0.29 0.41 0.12
C 0.22 0.02 0.00 0.44 0.32
D 0.04 0.00 0.77 0.00 0.19
E 0.02 0.07 0.28 0.63 0.00

switching in probability

Note the diagonal will always be zero.

We can determine a lot of things from this table.  For example, brand B offers no competition to brand D, brand D is the biggest rival of all the other brands, and brand C is brand D’s biggest rival.

Read more…

, , ,

Running Mean and Standard Deviation

October 22nd, 2009

Karim Chichakly Modeling Tips

This is an update to post published on August 31, 2009.  The attached model was updated to find negative means and an alternate method was included at the end.

I am frequently asked which built-in function gives either the running mean or running standard deviation of a model variable.  Unfortunately, there is no such built-in at this time (no, that is not what MEAN() does).

Luckily, however, we can replicate the behavior we desire from built-in functions by creating a reusable module.  I can create a module that calculates a running average and a running standard deviation from any model variable.

When building a reusable module component, it is important to carefully define what the input to the module will be (i.e., what are the parameters to the built-in function) and what the output of the module will be (i.e., what is the result or return value of the built-in function).  In this particular case, the input will be the variable whose running average or running standard deviation we wish to find.  There are two outputs:  the running average and the running standard deviation.  Note we do not have to use both outputs all the time.

Thus, our new module can be used as shown below:

image

Note the name of the module was chosen to give a meaningful context to the running mean and standard deviation variables, which have fixed names defined inside the reusable module.  As in this example, it is always a good idea to give the module outputs general names that make sense when qualified by a context (the module name).

The reusable module itself was built and tested in iThink, and can also be used in STELLA.  The input parameter was given an equation to allow the model to be completely tested and debugged before being reused.  The model appears below and can be downloaded by clicking here.

image

Note the input to the module is named value.  After importing the module, this will need to be assigned to the variable in question, Cash in the above example.  This can be done from outside the module by right-clicking on Cash and choosing “Module->Assign to”, or right-clicking on value and choosing “Module->Assign Input to”.  The outputs can be assigned in a similar way, or the Ghost tool can be used.

This method, while relatively easy to understand, does accurately compute the standard deviation when the mean of the running sum of squares is close in magnitude to the running mean squared.  An alternate method that does not suffer this problem was developed by Welford in 1962 and is implemented in the model that can be downloaded by clicking here.

Finally, I am including a simple reusable module that finds the maximum value of a model variable across the entire run of a simulation.  It can be downloaded by clicking here.  It uses a stock to hold the maximum value seen so far, and takes advantage of the fact that uniflows cannot be negative.  It is used the same way as the running mean and standard deviation module, but only has one output called maximum.

For more information about modules, consult the iThink and STELLA help files.  These on-line resources are also available:

Using Modules Webinar

Module FAQs

, , ,

Modeling Customers Switching Between Brands

September 30th, 2009

Karim Chichakly Modeling Tips

This is the third installment of a four-part series.  The other three parts can be accessed by clicking on the links below.
Methods for Using Arrays Effectively

Modeling a Watershed with Arrays
Modeling Customers Switching Between Brands – The General Case

In the second post of this series, I showed how to selectively pull information from an array in order to route water through a watershed.  In this post, I will use the exact same technique to move customers between different product brands.

Switching Customers between Different Products

Business models often need to model gaining customers from, and losing customers to, competing products in a relatively mature market (what Kim Warren, in his excellent book Strategy Management Dynamics, calls “Type 2 Rivalry”).  These are often driven with statistical models developed through market research.  For this application, we need a matrix describing the probability of switching from product A to product B each time unit.  A sample appears in the table below.

From\To A B C D E
A 0.000 0.010 0.030 0.050 0.001
B 0.030 0.000 0.050 0.070 0.020
C 0.010 0.001 0.000 0.020 0.015
D 0.001 0.000 0.020 0.000 0.005
E 0.001 0.005 0.020 0.045 0.000

switching probability (units: dimensionless)

To read this table, locate the product the customer is presently using in the left column (say, B).  Read across that row (the second row, in this case) until you find the product the customer is switching to (say, C).  The number in that cell (in this case, 0.05 or 5%) is the probability the customer will switch from the first product to the second (from B to C) in this time unit.  If the model is running in months, as ours is, this table indicates that 5% of customers using product B switch to product C every month.

Of course, the values in the table do not need to be constant.  Often each cell will contain a regression equation based on various product characteristics – including market share, marketing effort, product features, and product quality – that evolve over the course of the simulation.

Note the diagonal is zero.  This means customers do not switch from one product to the same product.

Note also that the sum in any row cannot exceed 1.0, which represents 100% of the customers using that product.  It is quite normal for it to be below 1.0 because we do not include people who are not switching.  Some modelers find it easier to always have each row add up to 1.0.  If you desire to do this, fill the diagonal with the difference between 1.0 and the sum of the other columns.  For example, to do this for product A, replace the top left cell with 1.0 – (0.01 + 0.03 + 0.05 + 0.001) = 0.909 [for you Beatles fans].

Read more…

, , ,

Modeling a Watershed with Arrays

September 15th, 2009

Karim Chichakly Modeling Tips

This is the second installment of a four-part series.  The other three parts can be accessed by clicking on the links below.
Methods for Using Arrays Effectively

Modeling Customers Switching Between Brands
Modeling Customers Switching Between Brands – The General Case

This is the second installment of a multipart series.  The first part can be found by clicking here. Part 3 is available here.

In the first post of this series, I showed how to conditionally pull information from an array.  In this post, I will extend this concept to show how to route information through an arrayed model.  This is especially useful in spatial modeling applications.

Routing Water Through a Watershed

A common ecology application is the modeling of a watershed.  Part of such a model will necessarily involve a network of stream or river segments – called reaches – which feed each other.  It is desirable to implement this in a way that makes it easy to modify the reach network.  Using an explicit stock-flow network makes this very difficult.  However, it is relatively straightforward to use arrays of stocks and flows to build an easily configurable network.

Imagine a small watershed broken down into reaches as shown below:

clip_image001[6]

For our purposes, a new reach will need to be created at every junction point.  Therefore, in this example and from a topological point-of-view, it is not strictly necessary to treat reach 4 separately from reach 2 nor reach 5 separately from reach 3, but reaches 2 and 3 must be separate from reach 1.  There are, of course, other reasons to separate reach 4 from 2 and reach 5 from 3, for example, slope, channel width, length, etc.

Every reach flows into exactly one other reach at its head, but many reaches can flow into the head of the same reach.  This requires a many-to-one representation of the reach network.  This is accomplished quite easily with a routing map which, for each reach, contains the number of the reach that this reach flows into.  We also need someway to signify the outlet.  Since reach numbers start at one, we can use zero to signify the outlet.  Using these rules, the above network is completely represented in the following routing map:

Reach Flows into
1 0
2 1
3 1
4 2
5 3

The nice thing about this representation is that it fits nicely into a one-dimensional array where the array index is the reach number and the reach it flows into is the value stored in that array element.

The model itself uses one stock to represent each reach.   That stock has one inflow for water entering the reach and one outflow for water leaving the reach: (Download the zipped STELLA model here)

image

Read more…

, , ,