How to explore a model with parameter sweep
This guide shows you how to use the parameter sweep tool to explore the effect of changing model parameters or decision variables within your WaterTAP model. This might be useful, for example, if you have an existing model of a multi-stage treatment train and you’d like to see the effect of varying Pump 1 pressure and Pump 2 pressure independently (where all possible combinations of Pump 1 and Pump 2 pressure will be explicitly tested). The type and quantity of parameters to be varied are easily changed following steps like the ones below.
Begin by importing or explicitly programming any functions relating to the following steps:
Building the Pyomo flowsheet model
Simulating the model using an initial condition
Setting up the optimization (e.g., unfixing and setting bounds)
Performing the optimization
For example, the code below imports from an existing flowsheet module, RO with energy recovery. In general you would import your own flowsheet module.
# replace this with your own flowsheet module, e.g.
# import my_flowsheet_module as mfm
import watertap.examples.flowsheets.RO_with_energy_recovery.RO_with_energy_recovery as RO_flowsheet
Once this is done, import the parameter sweep tool:
from watertap.tools.parameter_sweep import parameter_sweep, LinearSample
Conceptually, regardless of the number of iterations necessary to test each possible combination of variables, it is only necessary to build, simulate, and set up the model once. Thus, these steps are left to the user and handled outside the parameter sweep function. Depending on how the functions you’ve defined work, this could be as straightforward as
# replace these function calls with
# those in your own flowsheet module

# set up system
m = RO_flowsheet.build()
RO_flowsheet.set_operating_conditions(m)
RO_flowsheet.initialize_system(m)

# simulate
RO_flowsheet.solve(m)

# set up the model for optimization
RO_flowsheet.optimize_set_up(m)
Here, m is the flowsheet model returned by the initial “build” step; all subsequent operations are performed on that same object.
Once this sequence of setup steps is performed, the parameters to be varied should be identified with a dictionary:
sweep_params = dict()
sweep_params['Feed Mass NaCl'] = LinearSample(m.fs.feed.flow_mass_phase_comp[0, 'Liq', 'NaCl'], 0.005, 0.155, 4)
sweep_params['Water Recovery'] = LinearSample(m.fs.RO.recovery_mass_phase_comp[0, 'Liq', 'H2O'], 0.3, 0.7, 4)
where the basic pattern is
dict_name['Short/Pretty-print Name'] = LinearSample(m.path.to.model.variable, lower_limit, upper_limit, num_samples).
For example, “Feed Mass NaCl” (the feed mass flow rate of NaCl), which is accessed through the model variable m.fs.feed.flow_mass_phase_comp[0, 'Liq', 'NaCl'], is to be varied between 0.005 and 0.155 with 4 equally spaced values, i.e., [0.005, 0.055, 0.105, 0.155].
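As a rough illustration of what “equally spaced” means here, the sketch below reproduces those four values in plain Python. The helper name linear_values is hypothetical and is not part of WaterTAP; it only mirrors the lower_limit, upper_limit, num_samples pattern shown above.

```python
def linear_values(lower_limit, upper_limit, num_samples):
    """Return num_samples equally spaced values from lower_limit to upper_limit (inclusive)."""
    if num_samples < 2:
        return [lower_limit]
    step = (upper_limit - lower_limit) / (num_samples - 1)
    return [lower_limit + i * step for i in range(num_samples)]


feed_values = linear_values(0.005, 0.155, 4)
# Round for display so floating-point noise does not obscure the spacing
print([round(v, 3) for v in feed_values])
```

Both end points are included, so 4 samples produce 3 equal intervals between the limits.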
In this example, the 2 parameters will each be varied across 4 values for a total of 16 combinations.
It is also possible to perform random sampling (uniform or normal) with the parameter sweep tool, or to specify your own sampling method.
Note that there is no limit on the number of sweep variables specified or their resolution besides the practical limit of how long it will take to optimize using each combination of parameters (e.g., if 5 different variables are provided and each one is individually represented with 20 discrete values, the total number of combinations is 20^5 = 3.2 million!).
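To make the growth in the number of runs concrete, the following sketch enumerates the full grid for the two-parameter example using only the Python standard library. It is an illustration of the counting argument, not the way the parameter sweep tool builds its grid internally; the variable names are chosen for this example.

```python
from itertools import product

# Values each sweep parameter takes (4 equally spaced values apiece)
feed_mass_nacl = [0.005, 0.055, 0.105, 0.155]
water_recovery = [0.3 + i * (0.7 - 0.3) / 3 for i in range(4)]

# Full-factorial grid: every feed value paired with every recovery value
combinations = list(product(feed_mass_nacl, water_recovery))
print(len(combinations))  # 4 * 4 = 16

# The grid size is the product of the per-parameter sample counts
print(20 ** 5)  # 3200000
```

Because the count is a product, adding one more swept variable multiplies the number of optimizations by that variable's number of samples.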
After specifying the input parameters, specify the output values on the flowsheet that will be reported in the summary CSV file, using a dictionary with a format similar to that of the sweep parameters. For this RO flowsheet we’ll report the levelized cost of water, the optimized RO area, and the output pressure of pump 1:
outputs = dict()
outputs['RO membrane area'] = m.fs.RO.area
outputs['Pump 1 pressure'] = m.fs.P1.control_volume.properties_out.pressure
outputs['Levelized Cost of Water'] = m.fs.costing.LCOW
Once the problem is set up and the parameters are identified, the parameter_sweep function can finally be invoked. It adjusts and optimizes the model for each combination of variables specified above (utilizing the solve method defined in our flowsheet module) and saves the results to outputs_results.csv. If specified, the full results from each run (the value of every variable and expression) will be reported in full_results.h5, along with a companion text file, full_results.txt, containing the metadata of the H5 file.
parameter_sweep(m, sweep_params, outputs, csv_results_file='outputs_results.csv', h5_results_file='full_results.h5')
Note that there are additional keyword arguments that can be passed to this function if you desire more control or debugging outputs, especially with regard to the restart logic used after a previous optimization attempt has failed or with managing local outputs computed on parallel hardware. For more information, consult the technical reference for the parameter sweep tool.
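Conceptually, the sweep re-solves a single, already-built model once per combination and records one row of inputs and outputs per run. The sketch below shows that idea with a toy stand-in: the model is a plain dictionary, and toy_sweep and toy_optimize are hypothetical names invented for this illustration. It is not the WaterTAP implementation, and the numbers are placeholders rather than flowsheet results.

```python
from itertools import product


def toy_sweep(model, sweep_values, output_names, optimize_function):
    """Re-solve one already-built model for every combination of sweep values."""
    names = list(sweep_values)
    rows = []
    for combination in product(*(sweep_values[name] for name in names)):
        # Adjust the existing model in place; it is never rebuilt
        for name, value in zip(names, combination):
            model[name] = value
        optimize_function(model)
        # One row per run: the N swept inputs first, then the requested outputs
        rows.append(list(combination) + [model[name] for name in output_names])
    return rows


def toy_optimize(model):
    # Placeholder "solve": a real function would call a solver on a Pyomo model
    model["total"] = model["x"] + model["y"]


toy_model = {"x": 0, "y": 0, "total": 0}
rows = toy_sweep(toy_model, {"x": [1, 2], "y": [10, 20]}, ["total"], toy_optimize)
for row in rows:
    print(row)
```

Each printed row lists the swept inputs followed by the monitored output, which is the same column ordering the summary results use.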
- watertap.tools.parameter_sweep.parameter_sweep(model, sweep_params, outputs=None, csv_results_file=None, h5_results_file=None, optimize_function=<function _default_optimize>, optimize_kwargs=None, reinitialize_function=None, reinitialize_kwargs=None, reinitialize_before_sweep=False, mpi_comm=None, debugging_data_dir=None, interpolate_nan_outputs=False, num_samples=None, seed=None)
This function offers a general way to perform repeated optimizations of a model for the purposes of exploring a parameter space while monitoring multiple outputs. If provided, writes a single CSV file to csv_results_file with all inputs and resulting outputs.
model – A Pyomo ConcreteModel containing a WaterTAP flowsheet. For best results, it should be initialized before being passed to this function.
sweep_params – A dictionary containing the values to vary with the format sweep_params['Short/Pretty-print Name'] = (model.fs.variable_or_param[index], lower_limit, upper_limit, num_samples). A uniform number of samples, num_samples, will be taken between lower_limit and upper_limit.
outputs – An optional dictionary containing “short names” as keys and, as values, the Pyomo objects on model whose values are to be reported, e.g., outputs['Short/Pretty-print Name'] = model.fs.variable_or_expression_to_report. If not provided, i.e., outputs = None, the default behavior is to save all model variables, parameters, and expressions, which provides very thorough results at the cost of large file sizes.
csv_results_file (optional) – The path and file name where the results are to be saved; subdirectories will be created as needed.
h5_results_file (optional) – The file name, without the extension, where the results are to be saved; the path is taken from the csv_results_file argument. This file name is used when creating the H5 file and the companion text file, which lists the variable names contained within the H5 file.
optimize_function (optional) – A user-defined function that performs the optimization of the flowsheet model and loads the results back into model. The first argument of this function is model. The default uses the default IDAES solver, raising an exception if the termination condition is not optimal.
optimize_kwargs (optional) – Dictionary of kwargs to pass into every call to optimize_function. The first arg will always be model, i.e., optimize_function(model, **optimize_kwargs). The default uses no kwargs.
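The calling convention described for optimize_function and optimize_kwargs can be sketched as follows. Everything here is a stand-in chosen for illustration: my_optimize, its tolerance and max_iterations arguments, and the SimpleNamespace model are hypothetical and do not come from WaterTAP. A real function would run a solver on the Pyomo model.

```python
from types import SimpleNamespace


def my_optimize(model, tolerance=1e-6, max_iterations=100):
    """Hypothetical stand-in for a user-defined optimize function.

    A real function would call a solver on the Pyomo model; this one only
    records the settings it was called with.
    """
    model.last_call = {"tolerance": tolerance, "max_iterations": max_iterations}
    return model.last_call


# Stand-in for a flowsheet model (a real one would be a Pyomo ConcreteModel)
model = SimpleNamespace()
optimize_kwargs = {"tolerance": 1e-8}

# The model is always the first positional argument; the kwargs are unpacked after it
results = my_optimize(model, **optimize_kwargs)
print(results)
```

Keyword arguments left out of the dictionary fall back to the function's own defaults, so the dictionary only needs the settings you want to override.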
reinitialize_function (optional) – A user-defined function to re-initialize the flowsheet model if the first call to optimize_function fails for any reason. After reinitialize_function, the parameter sweep tool will immediately call optimize_function again.
reinitialize_kwargs (optional) – Dictionary of kwargs to pass into every call to reinitialize_function. The first arg will always be model, i.e., reinitialize_function(model, **reinitialize_kwargs). The default uses no kwargs.
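The restart behavior described for reinitialize_function can be pictured with the sketch below. It assumes that the call made after re-initialization is a second attempt at optimize_function, and it re-raises the error when no reinitialize function is given; both are assumptions of this illustration, not statements about the library's code. The name solve_with_retry and the toy functions are hypothetical.

```python
def solve_with_retry(model, optimize_function, optimize_kwargs=None,
                     reinitialize_function=None, reinitialize_kwargs=None):
    """Sketch: try to optimize; on failure, reinitialize once and try again."""
    optimize_kwargs = optimize_kwargs or {}
    reinitialize_kwargs = reinitialize_kwargs or {}
    try:
        return optimize_function(model, **optimize_kwargs)
    except Exception:
        if reinitialize_function is None:
            raise  # assumption of this sketch: nothing to fall back on
        reinitialize_function(model, **reinitialize_kwargs)
        return optimize_function(model, **optimize_kwargs)


# Toy stand-ins: the "model" is a dict, and the optimizer fails until it is reinitialized
def toy_optimize(model):
    if not model["initialized"]:
        raise RuntimeError("solve failed")
    return "optimal"


def toy_reinitialize(model):
    model["initialized"] = True


toy_model = {"initialized": False}
status = solve_with_retry(toy_model, toy_optimize, reinitialize_function=toy_reinitialize)
print(status)
```

How a run that still fails after the retry is recorded in the results is outside the scope of this sketch.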
reinitialize_before_sweep (optional) – Boolean option to reinitialize the flowsheet model before every parameter sweep realization. The default is False. Note that the parameter sweep tool will try to reinitialize the model regardless of this option if a run fails.
mpi_comm (optional) – User-provided MPI communicator for parallel parameter sweeps. If None, COMM_WORLD will be used. The default is sufficient for most users.
debugging_data_dir (optional) – Save results on a per-process basis for parallel debugging purposes. If None, no debugging data will be saved.
interpolate_nan_outputs (optional) – When the parameter sweep has finished, interior values of np.nan will be replaced with a value obtained via a linear interpolation of their surrounding valid neighbors. If true, a second output file with the extension “_clean” will be saved alongside the raw (un-interpolated) values.
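The idea of replacing interior NaN values by linear interpolation between valid neighbors can be shown in one dimension. This is a simplified sketch using only the standard library: the function name interpolate_interior_nans is hypothetical, the input values are placeholders, and the actual tool may treat multi-dimensional sweeps differently.

```python
import math


def interpolate_interior_nans(values):
    """Linearly interpolate NaN entries that have valid neighbors on both sides."""
    cleaned = list(values)
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    for left, right in zip(valid, valid[1:]):
        # Fill any gap strictly between two consecutive valid entries
        for i in range(left + 1, right):
            fraction = (i - left) / (right - left)
            cleaned[i] = values[left] + fraction * (values[right] - values[left])
    return cleaned


raw = [math.nan, 1.0, math.nan, 3.0, math.nan, math.nan, 6.0, math.nan]
clean = interpolate_interior_nans(raw)
print(clean)
```

NaN values at either end of the list are left untouched because they have a valid neighbor on only one side, which matches the restriction to interior values.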
num_samples (optional) – If you are using a sampling technique rather than a linear grid of values, set the number of samples here.
seed (optional) – If you are using a random sampling technique, this sets the seed.
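The role of num_samples and seed in random sampling can be illustrated with Python's standard random module. This sketch does not use WaterTAP's own sampling classes; uniform_samples is a hypothetical helper, and it only demonstrates why fixing the seed makes a random sweep repeatable.

```python
import random


def uniform_samples(lower_limit, upper_limit, num_samples, seed=None):
    """Draw num_samples values uniformly at random from [lower_limit, upper_limit]."""
    rng = random.Random(seed)  # a dedicated generator so the seed has only local effect
    return [rng.uniform(lower_limit, upper_limit) for _ in range(num_samples)]


first = uniform_samples(0.3, 0.7, num_samples=5, seed=42)
second = uniform_samples(0.3, 0.7, num_samples=5, seed=42)
print(first == second)  # True: the same seed gives the same samples
```

Unlike a linear grid, the number of runs here is set directly by num_samples rather than by the product of per-parameter sample counts.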
- A list where the first N columns are the values of the parameters passed in sweep_params and the remaining columns are the values of the simulation identified by the outputs argument.
- Return type