Resource file path from simulation #1410

Open
wants to merge 72 commits into
base: master
from


jkumwenda
Collaborator

Created a resource file path function and called it from the different modules.

@jkumwenda jkumwenda linked an issue Jul 2, 2024 that may be closed by this pull request
5 tasks
@@ -44,7 +44,7 @@ class Simulation:
"""

def __init__(self, *, start_date: Date, seed: int = None, log_config: dict = None,
- show_progress_bar=False):
+ show_progress_bar=False, resourcefilepath = None):
Collaborator

Add str type hint to resourcefilepath.

@@ -80,6 +81,7 @@ def __init__(self, *, start_date: Date, seed: int = None, log_config: dict = Non
data=f'Simulation RNG {seed_from} entropy = {self._seed_seq.entropy}'
)
self.rng = np.random.RandomState(np.random.MT19937(self._seed_seq))
+ self.resourcefilepath = resourcefilepath
Collaborator

Here, we could convert the value to a Path, store it as a Path, and check that the path exists.
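
A minimal sketch of this suggestion, assuming a hypothetical helper called `normalise_resourcefilepath` (the name and the choice of `FileNotFoundError` are illustrative and not taken from the PR):

```python
from pathlib import Path
from typing import Optional, Union


def normalise_resourcefilepath(
    resourcefilepath: Optional[Union[str, Path]],
) -> Optional[Path]:
    """Hypothetical helper: convert the argument to a Path and check it exists."""
    if resourcefilepath is None:
        # Nothing supplied: leave the decision about a default to the caller
        return None
    path = Path(resourcefilepath)
    if not path.is_dir():
        # Fail early, at construction time, rather than when a module reads a file
        raise FileNotFoundError(f"Resource file path does not exist: {path}")
    return path
```

In `Simulation.__init__` the assignment could then become something like `self.resourcefilepath = normalise_resourcefilepath(resourcefilepath)`, so every module receives a validated `Path`.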

thewati and others added 20 commits July 30, 2024 15:50
…tate cancer, fixed test equipment and dxmanager
…t.py and breast_cancer.py method updated for resource file path from simulation.py
…t.py and breast_cancer.py method updated for resource file path from simulation.py
…t.py isort the import to fix incorrectly sorted error
@tbhallett
Collaborator

@jkumwenda -- please merge in master and resolve conflicts

@tamuri -- please can we merge this when done, so this doesn't keep getting out of date?

@jkumwenda
Collaborator Author

Will fix that and merge

@tbhallett
Collaborator

@jkumwenda --- please could you update this so we can merge it in? As we wait, it becomes out-of-date and requires more updates.

@jkumwenda
Collaborator Author

jkumwenda commented Jan 8, 2025 via email

thewati and others added 4 commits January 9, 2025 17:01
# Conflicts:
#	src/scripts/epilepsy_analyses/analysis_epilepsy.py
#	src/tlo/methods/care_of_women_during_pregnancy.py
#	src/tlo/methods/contraception.py
#	src/tlo/methods/epi.py
#	src/tlo/methods/hiv.py
#	src/tlo/methods/hiv_tb_calibration.py
#	src/tlo/methods/labour.py
#	src/tlo/methods/malaria.py
#	src/tlo/methods/measles.py
#	src/tlo/methods/newborn_outcomes.py
#	src/tlo/methods/postnatal_supervisor.py
#	src/tlo/methods/pregnancy_supervisor.py
#	src/tlo/methods/schisto.py
#	src/tlo/methods/simplified_births.py
#	src/tlo/methods/tb.py
Collaborator

@mnjowe mnjowe left a comment

Thank you, Wati and Joel, for the excellent work on this PR. Below are my comments, which may seem many but primarily revolve around the following key points:

  1. I suggest initialising resourcefilepath as a Path object in both the analysis scripts and test files. This will eliminate the repetitive code that currently creates a Path object from resourcefilepath in each disease module.
  2. I suggest reverting the changes made to the simulation end_date and population sizes in some of the analysis scripts.
  3. I suggest removing the str option for the resourcefilepath argument in the Simulation object.
  4. I suggest making the resourcefilepath argument in the read_parameters section optional to improve readability.
  5. I suggest removing the condition that checks whether resourcefilepath is empty in utils. There may be a more efficient way to handle this check.

For changing resourcefilepath from str to path, I couldn't provide a comment on every affected line. However, if you agree that it should be declared as a path object (rather than a string), you can apply this change consistently across all affected areas. Similarly, if you agree with making resourcefilepath in read_parameters section optional, you can apply this adjustment to all relevant sections.

Once again, thank you for the great work on this PR!

Comment on lines 40 to 41
end_date = Date(2011, 12, 31)
popsize = 5000
Collaborator

Did you make these changes just to make the script run faster? If so, could you please revert them now?

Collaborator

Yes. Will revert the changes

@@ -348,7 +348,7 @@ def plot_modal_gbd_deaths_by_age_group(self):
start_date = Date(2010, 1, 1)
end_date = Date(2030, 1, 1)

- resourcefilepath = Path("./resources") # Path to resource files
+ resourcefilepath = './resources' # Path to resource files
Collaborator

Any reason why you're changing from Path to string here?

Comment on lines 41 to 42
end_date = Date(2011, 7, 1)
pop_size = 1000
Collaborator

Same as above: if the intention was to make this run faster, please revert the changes.


# Path to the resource files used by the disease and intervention methods
- resources = "./resources"
+ resourcefilepath = "./resources"
Collaborator

Can we standardise the path here, i.e. set resourcefilepath = Path("./resources") and read it in read_parameters as read_csv_files(resourcefilepath / resourcefile_folder_name)? I think this would be better because the path is initialised once, rather than in each module.
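
A rough sketch of that pattern, under stated assumptions: `read_csv_folder` is a hypothetical stand-in for `read_csv_files` (whose real signature is not shown in this thread), and `ExampleModule` and its folder name `ResourceFile_Example` are invented for illustration.

```python
import csv
from pathlib import Path
from typing import Dict, List


def read_csv_folder(folder: Path) -> Dict[str, List[dict]]:
    """Hypothetical stand-in for read_csv_files: rows of each CSV, keyed by file stem."""
    tables = {}
    for csv_file in sorted(folder.glob("*.csv")):
        with csv_file.open(newline="") as handle:
            tables[csv_file.stem] = list(csv.DictReader(handle))
    return tables


class ExampleModule:
    """Hypothetical disease module, not a real TLO class."""

    RESOURCE_FOLDER = "ResourceFile_Example"  # assumed folder name

    def read_parameters(self, resourcefilepath: Path) -> None:
        # resourcefilepath already is a Path, so no Path(...) wrapping is needed here
        self.parameters = read_csv_folder(resourcefilepath / self.RESOURCE_FOLDER)


# In the analysis script or test file: the Path is created once and passed on
resourcefilepath = Path("./resources")
```

With this arrangement the `/` operator works directly in every module, and the `Path(...)` call disappears from each `read_parameters`.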

@@ -25,7 +25,7 @@
# %%


- resourcefilepath = Path("./resources")
+ resourcefilepath = './resources'
Collaborator

Same here: why the change? I think it would be better to initialise the path here rather than in the module.

Collaborator

We overlooked this. Thanks for catching it.

@@ -192,12 +191,12 @@ def __init__(self, name=None, resourcefilepath=None):
)
}

- def read_parameters(self, data_folder):
+ def read_parameters(self, resourcefilepath=None):
Collaborator

make it optional?

"""Setup parameters used by the module, now including disability weights"""

# Update parameters from the resourcefile
self.load_parameters_from_dataframe(
- pd.read_excel(Path(self.resourcefilepath) / "ResourceFile_Breast_Cancer.xlsx",
+ pd.read_excel(Path(resourcefilepath) / "ResourceFile_Breast_Cancer.xlsx",
Collaborator

It would be better to initialise resourcefilepath as a Path object up front and avoid creating one here.

@@ -256,7 +255,7 @@ def __init__(self, name=None, resourcefilepath=None, do_log_df: bool = False, do
self.lms_event_death = dict()
self.lms_event_symptoms = dict()

- def read_parameters(self, data_folder):
+ def read_parameters(self, resourcefilepath=None):
Collaborator

make it optional?

@@ -273,7 +272,7 @@ def read_parameters(self, data_folder):
ResourceFile_cmd_events_hsi.xlsx = HSI parameters for events

"""
- cmd_path = Path(self.resourcefilepath) / "cmd"
+ cmd_path = Path(resourcefilepath) / "cmd"
Collaborator

resourcefilepath could already be passed in as a Path object, which avoids creating one here.

def __init__(
self,
*,
start_date: Date,
seed: Optional[int] = None,
log_config: Optional[dict] = None,
show_progress_bar: bool = False,
- resourcefilepath: Optional[Path] = None,
+ resourcefilepath: Optional[str | Path] = None,
Collaborator

This should only take Path. I think the string option should be removed.
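
One way a Path-only argument might look, as a sketch rather than the actual `Simulation` code: `SimulationSketch` is a hypothetical, stripped-down stand-in, and the explicit `isinstance` check is an assumption added here because type hints alone are not enforced at runtime.

```python
from pathlib import Path
from typing import Optional


class SimulationSketch:
    """Hypothetical, stripped-down stand-in for Simulation."""

    def __init__(self, *, resourcefilepath: Optional[Path] = None):
        # The annotation documents the contract; this check enforces it
        if resourcefilepath is not None and not isinstance(resourcefilepath, Path):
            raise TypeError(
                "resourcefilepath must be a pathlib.Path, "
                f"got {type(resourcefilepath).__name__}"
            )
        self.resourcefilepath = resourcefilepath
```

Callers would then write `SimulationSketch(resourcefilepath=Path("./resources"))`, and passing `"./resources"` as a plain string fails immediately with a clear message.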

@jkumwenda
Collaborator Author

Thanks for these comments, we will review and provide feedback line by line.

…th("./resources") in scripts files and updated methods read parameters to def read_parameters(self, resourcefilepath: Optional[Path] = None): helps with single initialisation across the methods
# Conflicts:
#	src/tlo/methods/alri.py
#	src/tlo/methods/depression.py
#	src/tlo/methods/diarrhoea.py
#	src/tlo/methods/epilepsy.py
#	src/tlo/methods/rti.py
…rent activ resource file in the HIV resource folder
Comment on lines +56 to +57
data_hiv_mphia_inc = xls["MPHIA_incidence2020"]
data_hiv_mphia_prev = xls["MPHIA_prevalence_art2020"]
Collaborator

@tdm32, can you confirm this change is necessary? I agree with Joel: we don't have MPHIA_incidence2015 and MPHIA_prevalence_art2015 in the HIV resourcefiles folder.

Collaborator

Yes, this script originally used the MPHIA 2015 estimates, but would now use the 2020 estimates. The worksheets were renamed, so MPHIA_incidence2015 and MPHIA_prevalence_art2015 no longer exist. Thank you.

Collaborator

Thanks Tara

…s(self, resourcefilepath: Optional[Path] = None):

        parameter_dataframe = read_csv_files(resourcefilepath /
…sourcefilepath is None:

        resourcefilepath = get_root_path() / 'resources' from utils.py.
…sourcefilepath is None:

        resourcefilepath = get_root_path() / 'resources' from utils.py.
…sourcefilepath is None:

        resourcefilepath = get_root_path() / 'resources' from utils.py.
…cefilepath is None:

        resourcefilepath = get_root_path() / 'resources' from utils.py.
…cefilepath is None:

        resourcefilepath = get_root_path() / 'resources' from utils.py.
…to jkumwenda/resource_file_path

# Conflicts:
#	src/tlo/methods/scenario_switcher.py
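
The fallback named in the commit messages above (`if resourcefilepath is None: resourcefilepath = get_root_path() / 'resources'`) can be sketched as follows. The real `get_root_path` in utils.py is not shown in this thread, so the version here is a hypothetical stand-in that walks up the directory tree; `resolve_resourcefilepath`, `start` and `marker` are likewise illustrative names.

```python
from pathlib import Path
from typing import Optional


def get_root_path(start: Optional[Path] = None, marker: str = "resources") -> Path:
    """Hypothetical stand-in: walk upwards until a directory containing `marker` is found."""
    if start is None:
        start = Path.cwd()
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"No directory containing '{marker}' found above {start}")


def resolve_resourcefilepath(
    resourcefilepath: Optional[Path] = None,
    start: Optional[Path] = None,
) -> Path:
    """Use the given Path if there is one, otherwise fall back to <root>/resources."""
    if resourcefilepath is None:
        resourcefilepath = get_root_path(start) / "resources"
    return resourcefilepath
```

Keeping the `None` check in a single helper like this would mean each module's `read_parameters` does not have to repeat it.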
Labels
None yet
Projects
Status: Ready for EM review
Development

Successfully merging this pull request may close these issues.

Get resource file path from simulation
6 participants