Saving to file simulations in a suspended state and resuming #86
Comments
Sadly, this is a very thorny problem without any straightforward solutions. Out-of-the-box pickle (or any of the other libraries) are If you want to be able to do something today, it's going to be very laborious, but you can set up a VM and keep rerunning from a given snapshot. We'll give it some thought. |
Thanks very much for this. Ok, I don't think it's soooo urgent that we need to do something today with a VM... and my guess is that this would be more cumbersome and painful than just waiting for the simulation to repeat itself many times. [@ihawryluk - what do you think?] |
Absolutely don't need anything done today, and it's OK, I can repeat the simulation. Worst case, I'll just test fewer scenarios, but that's fine. |
Having said that, I think I found the source of the recursion in our code and fixed it. Need to test it more (and on bigger simulations) but...fingers crossed!
import pickle

def test_pickling(obj):
    # Round-trip obj through a pickle file and return the restored copy.
    filename = '/Users/tamuri/Desktop/testpick.pk'
    with open(filename, 'wb') as f:
        pickle.dump(obj, f)
    with open(filename, 'rb') as f:
        return pickle.load(f)

restored = test_pickling(sim)
|
Wow. That would be fantastic! |
A possible alternative which avoids the need to make the simulation pickleable may be to use |
Revisiting this (at least the most obvious solution - pickling). Out-of-the-box, pickling doesn't work. However, dill seems to do the right thing. Need to do plenty more checks, but an avenue to explore. A small, one month, 25k pop sim:
from pathlib import Path
import pandas as pd
from tlo import Date, Simulation, logging
from tlo.analysis.utils import parse_log_file
from tlo.methods.fullmodel import fullmodel
from tlo.util import hash_dataframe
start_date = Date(2010, 1, 1)
end_date = start_date + pd.DateOffset(years=0, months=1)
resourcefilepath = Path("./resources")
sim = Simulation(start_date=start_date, seed=1)
sim.register(
    *fullmodel(
        resourcefilepath=resourcefilepath,
        use_simplified_births=False,
        module_kwargs={
            "HealthSystem": {
                "disable": True,
                "mode_appt_constraints": 2,
                "capabilities_coefficient": None,
                "hsi_event_count_log_period": None
            },
            "SymptomManager": {"spurious_symptoms": False},
        }
    )
)
sim.make_initial_population(n=25000)
sim.simulate(end_date=end_date)
Pickling it errors:
import pickle
pickle.dump(sim, open('pickle-sim.pkl', 'wb'))
# ---------------------------------------------------------------------------
# AttributeError Traceback (most recent call last)
# Input In [5], in <cell line: 1>()
# ----> 1 pickle.dump(sim, open('pickle-sim.pkl', 'wb'))
#
# AttributeError: Can't pickle local object 'Models.make_lm_prob_becomes_stunted.<locals>.<lambda>'
"Dilling" it works:
import dill
dill.dump(sim, open('dill-sim.pkl', 'wb'))
Look at some key data structures:
In a new Python session:
|
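The difference between pickle and dill reported above can be reproduced outside TLOmodel. The following is only a minimal sketch: `make_model` is a hypothetical stand-in for a method such as `Models.make_lm_prob_becomes_stunted` that returns a lambda defined inside a function. It assumes nothing about the real model code, and the dill part only runs if the third-party `dill` package is installed.

```python
import pickle


def make_model(threshold):
    # Hypothetical stand-in: a lambda created inside a function is a
    # "local object", so pickle cannot find it by a module-level name.
    return lambda x: x > threshold


model = make_model(0.5)

try:
    pickle.dumps(model)
    pickle_failed = False
except (AttributeError, pickle.PicklingError) as exc:
    # The exception type differs between Python versions, so catch both.
    pickle_failed = True
    print(f"pickle failed: {exc}")

# dill is optional here; it serialises the function and its closure by value.
try:
    import dill
except ImportError:
    dill = None

if dill is not None:
    restored = dill.loads(dill.dumps(model))
    print(restored(0.7), restored(0.3))
```

A round trip through dill succeeding does not by itself show that the restored simulation behaves identically, which is why the further checks mentioned above are still needed.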
Adding my notes from the programming meeting yesterday.
A challenging bit, in my opinion, was how to trigger/apply the intervention in, say, 2023.
First step is to check whether using pickle/dill to save the state works reliably. Suggestion to do some quick tests: run a full simulation, checkpoint in the middle. Use the checkpoint in a new run to see if we get the same result. |
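The quick test suggested in the notes above (checkpoint in the middle, resume, compare with an uninterrupted run) could look roughly like this. It is a sketch on a toy model, not TLOmodel: `new_state` and `run` are hypothetical helpers, and the state is kept in plain built-in types plus a seeded `random.Random` so that the standard-library pickle can handle it.

```python
import pickle
import random


def new_state(seed):
    # Hypothetical stand-in for a simulation: a day counter, a population
    # total and a seeded random number generator.
    return {"day": 0, "population": 1000, "rng": random.Random(seed)}


def run(state, until_day):
    # Advance the toy simulation one day at a time up to until_day.
    while state["day"] < until_day:
        births = state["rng"].randint(0, 5)
        deaths = state["rng"].randint(0, 3)
        state["population"] += births - deaths
        state["day"] += 1
    return state


# Reference: one uninterrupted run of 60 days.
reference = run(new_state(seed=1), until_day=60)

# Checkpointed: run 30 days, serialise, restore, then run the remaining 30.
first_half = run(new_state(seed=1), until_day=30)
checkpoint = pickle.dumps(first_half)
resumed = run(pickle.loads(checkpoint), until_day=60)

# The runs agree only if the random number generator state was restored too.
same_result = (
    resumed["day"] == reference["day"]
    and resumed["population"] == reference["population"]
    and resumed["rng"].getstate() == reference["rng"].getstate()
)
print("checkpointed run matches uninterrupted run:", same_result)
```

For the real simulation the comparison would need to cover the population dataframe, the event queue and every module's random state, not just a single total.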
Hi @tamuri, when do you think point 2 ("change existing scenario code so each numbered run within different draws has the same simulation seed") could be implemented? This would benefit us right away without even getting to the checkpointing part |
Should be reasonably quick - I'll try to get it in today. |
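One way to picture the seeding change being discussed is a seed that depends only on the run number, so that run N uses the same seed in every draw. The function name and the `base_seed + run_number` rule below are assumptions made for illustration and are not taken from the TLOmodel scenario code.

```python
def simulation_seed(base_seed, draw_number, run_number):
    # draw_number is deliberately ignored: run N gets the same seed in
    # every draw, so draws differ only in their parameters.
    return base_seed + run_number


# Seeds for 3 draws with 2 runs each, keyed by (draw, run).
seeds = {
    (draw, run): simulation_seed(0, draw, run)
    for draw in range(3)
    for run in range(2)
}

for (draw, run), seed in sorted(seeds.items()):
    print(f"draw {draw}, run {run}: seed {seed}")
```

With seeds shared across draws, the part of each run before any intervention starts is identical between draws, which is what would later make a shared checkpoint reusable.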
Thinking about how this would work in light of Matt's work on #1227.
User story: As an epidemiologist using TLOmodel, I want to run simulations testing a number of interventions without having to repeatedly run the first part of the simulation where there are no interventions, to reduce costs.
Steps:
|
We have a common use case as follows:
It would seem that a solution would be to be able to save the simulation at a certain point to a file, then load the file and resume the simulation under the same or different parametric conditions.
I thought this might be relatively straightforward using pickle (i.e. pickle the sim, which contains the sim.population.props and the event_queue, and all the modules and their internal contents). Then, unpickle the sim, manipulate any parameters in the modules, and restart the sim using sim.simulate(end_date=end_of_part_two_date). (see script below)
However, I tried this and the unpickling failed with a RecursionError. Stack Overflow suggested this is a common error when pickling complex classes and suggested increasing the recursion limit, but this led to the console crashing for me.
Do you have any thoughts on this?
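The thread does not say what caused the RecursionError in this case. As a general illustration only, the sketch below shows how pickle can hit the interpreter's recursion limit on a deeply nested object; the nested list and the depth of 200,000 are arbitrary choices for the demonstration and have nothing to do with the simulation's actual structure.

```python
import pickle

# Build a deeply nested list, [[[[...]]]], far deeper than pickle can recurse.
depth = 200_000
nested = []
for _ in range(depth):
    nested = [nested]

try:
    pickle.dumps(nested)
    hit_recursion_limit = False
except RecursionError as exc:
    # pickle descends one level per nested object and gives up at the limit.
    hit_recursion_limit = True
    print(f"RecursionError: {exc}")
```

Raising the limit with `sys.setrecursionlimit` only moves the point of failure and can exhaust the C stack, which is consistent with the crash described above.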
Short-term:
Medium-term: