viewclust_vis package

Submodules

viewclust_vis.cumu_plot module

viewclust_vis.cumu_plot.cumu_plot(clust_info, cores_queued, cores_running, resample_str='', fig_out='', y_label='Usage', fig_title='', query_bounds=True, running=[], queued=[], submit_run=[], submit_req=[], user_run=[], plot_queued=False)[source]

Cumulative usage plot.

Parameters:
  • clust_info (DataFrame) – Frame which represents the cluster state at given time intervals. See job_use from viewclust.
  • cores_queued (array_like of DataFrame) – Series displaying queued resources at a particular time. See job_use from viewclust.
  • cores_running (array_like of DataFrame) – Series displaying running resources at a particular time. See job_use from viewclust.
  • resample_str (pandas freq str, optional) – Resampling frequency, e.g. '1D'. Defaults to empty, meaning no resampling. The value is not sanity checked and is applied only as in this example: cores_queued = cores_queued.resample('1D').sum()
  • fig_out (str, optional) – Writes the generated figure to file as the given name. If empty, skips writing. Defaults to empty.
  • y_label (str, optional) – Makes the passed string the y-axis label.
  • fig_title (str, optional) – Appends the given string to the title.
  • query_bounds (bool, optional) – Draws red lines on the figure to mark where the query is valid. Defaults to True.
  • submit_run (DataFrame, optional) – Draws a red line representing what usage would have looked like if jobs had started instantly and run for their elapsed duration. Allows for easier interpretation of the queued series. Defaults to not plotting.
  • submit_req (DataFrame, optional) – Draws an orange line representing what usage would have looked like if jobs had started instantly and ran for their requested duration. Allows for easier interpretation of the queued series. Defaults to not plotting.

See also

job_use()
Generates the input frames for this function.
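
The resample_str behaviour described above can be illustrated in isolation. This is a minimal sketch, not part of viewclust_vis: the hourly series is made-up data standing in for the cores_queued series that job_use would return.

```python
from datetime import datetime, timedelta

import pandas as pd

# Hypothetical hourly series: 2 queued cores at every hour over two days.
start = datetime(2020, 1, 1)
index = pd.DatetimeIndex([start + timedelta(hours=i) for i in range(48)])
cores_queued = pd.Series(2, index=index)

# Same call as in the resample_str description: sum the samples in each day.
daily = cores_queued.resample("1D").sum()
print(daily)
```

Because the samples are summed rather than averaged, the resampled values grow with the number of samples per bin, which is worth keeping in mind when choosing y_label.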

viewclust_vis.delta_plot module

viewclust_vis.delta_plot.delta_plot(account_list, dist_list, fig_out='')[source]

Takes a list of distance-from-target frames and generates the delta plot.

Parameters:
  • account_list (array_like of DataFrame) – Strided to match dist_list for labels in the legend.
  • dist_list (array_like of DataFrame) – A list of distance-from-target series generated by the job_use function.
  • fig_out (str, optional) – Writes the generated figure to file as the given name. If empty, skips writing. Defaults to empty.

See also

job_use()
Generates the input frame for this function.
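
Since account_list only supplies legend labels, its entries need to line up by position with dist_list. The sketch below shows that pairing with a hypothetical helper, label_series, and placeholder lists in place of real series; it is not code from the package.

```python
def label_series(account_list, dist_list):
    """Pair each legend label with the series at the same position."""
    if len(account_list) != len(dist_list):
        raise ValueError("account_list and dist_list must have the same length")
    return list(zip(account_list, dist_list))


# Hypothetical account names and placeholder distance-from-target values.
pairs = label_series(
    ["def-alice_cpu", "def-bob_cpu"],
    [[-5, 0, 3], [2, 2, -1]],
)
for label, series in pairs:
    print(label, series)
```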

viewclust_vis.insta_plot module

viewclust_vis.insta_plot.insta_plot(clust_info, cores_queued, cores_running, resample_str='', fig_out='', y_label='Usage', fig_title='', query_bounds=True, running=[], queued=[], submit_run=[], submit_req=[], user_run=[], plot_queued=True)[source]

Instantaneous usage plot.

Parameters:
  • clust_info (DataFrame) – Frame which represents the cluster state at given time intervals. See job_use from viewclust.
  • cores_queued (array_like of DataFrame) – Series displaying queued resources at a particular time. See job_use from viewclust.
  • cores_running (array_like of DataFrame) – Series displaying running resources at a particular time. See job_use from viewclust.
  • resample_str (pandas freq str, optional) – Resampling frequency, e.g. '1D'. Defaults to empty, meaning no resampling. The value is not sanity checked and is applied only as in this example: cores_queued = cores_queued.resample('1D').sum()
  • fig_out (str, optional) – Writes the generated figure to file as the given name. If empty, skips writing. Defaults to empty.
  • y_label (str, optional) – Makes the passed string the y-axis label.
  • fig_title (str, optional) – Appends the given string to the title.
  • query_bounds (bool, optional) – Draws red lines on the figure to mark where the query is valid. Defaults to True.
  • running (DataFrame, optional) – Draws a green line representing the usage of jobs currently in RUNNING state if they run for the requested duration.
  • queued (DataFrame, optional) – Draws a gray line representing the usage of jobs currently in PENDING state if they were to start at query time and run for their requested duration.
  • submit_run (DataFrame, optional) – Draws a red line representing what usage would have looked like if jobs had started instantly and run for their elapsed duration. Allows for easier interpretation of the queued series. Defaults to not plotting.
  • submit_req (DataFrame, optional) – Draws an orange line representing what usage would have looked like if jobs had started instantly and ran for their requested duration. Allows for easier interpretation of the queued series. Defaults to not plotting.

See also

job_use()
Generates the input frames for this function.
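
The difference between the submit_run and submit_req series can be made concrete with a small calculation. This is a sketch under stated assumptions: the job tuples and the helper instant_start_usage are hypothetical, and the real series come from viewclust rather than from this code.

```python
from datetime import datetime, timedelta

# Hypothetical job records: (submit time, elapsed, requested walltime, cores).
jobs = [
    (datetime(2020, 1, 1, 0), timedelta(hours=2), timedelta(hours=4), 8),
    (datetime(2020, 1, 1, 1), timedelta(hours=1), timedelta(hours=3), 4),
]


def instant_start_usage(jobs, times, use_requested=False):
    """Cores in use at each time if every job had started at submit time."""
    usage = []
    for t in times:
        total = 0
        for submit, elapsed, requested, cores in jobs:
            duration = requested if use_requested else elapsed
            # A job counts while submit <= t < submit + duration.
            if submit <= t < submit + duration:
                total += cores
        usage.append(total)
    return usage


times = [datetime(2020, 1, 1, hour) for hour in range(5)]
submit_run = instant_start_usage(jobs, times)                      # elapsed duration
submit_req = instant_start_usage(jobs, times, use_requested=True)  # requested duration
print(submit_run)  # [8, 12, 0, 0, 0]
print(submit_req)  # [8, 12, 12, 12, 0]
```

The gap between the two lists shows how much of the requested walltime went unused by these made-up jobs.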

viewclust_vis.job_scatter module

viewclust_vis.job_scatter.job_scatter(account, target, d_from, d_to='', d_from_drop='', out_name='', out_path='', plot_jobstack=True, plot_insta=True, plot_cumu=True, plot_mem_delta=False, plot_start_wait=False)[source]

Accepts an account name and query period to generate job usage summary figures.

Parameters:
  • account (string) – Name of account for which to query job records (note that Compute Canada systems expect a _cpu or _gpu suffix).
  • target (int-like) – The target share value for the account on the system (typically expressed as “cores” or “core-equivalents”).
  • d_from (date str) – Beginning of the query period, e.g. ‘2019-04-01T00:00:00’.
  • d_to (date str, optional) – End of the query period, e.g. ‘2020-01-01T00:00:00’. Defaults to now if empty.
  • d_from_drop (date str, optional) – Time prior to which to ignore jobs of any state, e.g. ‘2019-12-01T00:00:00’.
  • out_path (str, optional) – Path in which to place the output figure files. Defaults to the current path.
  • plot_jobstack (boolean, optional) – If True plot the jobstack figure. Note that for large job record data frames the jobstack figure can take some time to produce. The jobstack figure is a representation of the time periods and resource size of each job in a job record query. Defaults to True.
  • plot_insta (boolean, optional) – If True plot the insta_plot figure. The insta_plot is a display of the job record usage measurement at each time point over the query period. Defaults to True.
  • plot_cumu (boolean, optional) – If True plot the cumu_plot figure. The cumu_plot is a display of the cumulative job record usage measurement at each time point over the query period. Defaults to True.
  • plot_mem_delta (boolean, optional) – If True plot the mem_delta figure. The mem_delta figure displays the memory requested (allocated) for each job alongside its peak polled memory (MaxRSS). Defaults to False.
  • plot_start_wait (boolean, optional) – If True create the start-time by wait-hours scatter plot figure. Defaults to False.

Output:
  • Requested job usage figures located in the out_path directory.
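
The date parameters above follow the ISO 8601 form shown in the examples. The sketch below shows one way such strings could be resolved into a query period, including the "defaults to now" rule for d_to. The helper resolve_query_period is hypothetical and only illustrates the documented defaults; the package may handle the strings differently.

```python
from datetime import datetime


def resolve_query_period(d_from, d_to="", d_from_drop=""):
    """Turn the documented date strings into datetime objects."""
    start = datetime.fromisoformat(d_from)
    # An empty d_to means "now", as documented.
    end = datetime.fromisoformat(d_to) if d_to else datetime.now()
    # An empty d_from_drop means no jobs are dropped.
    drop_before = datetime.fromisoformat(d_from_drop) if d_from_drop else None
    if end <= start:
        raise ValueError("d_to must be later than d_from")
    return start, end, drop_before


start, end, drop_before = resolve_query_period(
    "2019-04-01T00:00:00", "2020-01-01T00:00:00", "2019-12-01T00:00:00"
)
print(start, end, drop_before)
```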

viewclust_vis.job_stack module

viewclust_vis.job_stack.job_stack(jobs, use_unit='cpu', fig_out='', plot_title='', query_bounds=True)[source]

Create job stack figure based on a given DataFrame and specified use unit.

Each job is a rectangle, where the height of the rectangle is based on the amount of resources that are being examined. Further, all three job states are displayed: queued, running, end time.

Parameters:
  • jobs (DataFrame) – Job DataFrame typically generated by the ccmnt package.
  • use_unit (str, optional) – Usage unit to examine. One of: {‘cpu’, ‘cpu-eqv’, ‘gpu’, ‘gpu-eqv’}. Defaults to ‘cpu’.
  • fig_out (str, optional) – Writes the generated figure to file as specified. If empty, skips writing. Defaults to empty.
  • plot_title (str, optional) – Title information passed to the figure object.
  • query_bounds (bool, optional) – Draws red lines on the figure to mark where the query is valid. Defaults to True.
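
The rectangle-per-job idea can be sketched without any plotting. The helper job_rectangles and the dictionary keys below are hypothetical, and stacking the rectangles vertically in input order is an assumption made for illustration, not a description of how job_stack lays out its figure.

```python
from datetime import datetime

# Hypothetical job records with submit, start and end times plus a CPU count.
jobs = [
    {"submit": datetime(2020, 1, 1, 0), "start": datetime(2020, 1, 1, 1),
     "end": datetime(2020, 1, 1, 3), "cpu": 4},
    {"submit": datetime(2020, 1, 1, 0), "start": datetime(2020, 1, 1, 2),
     "end": datetime(2020, 1, 1, 5), "cpu": 8},
]


def job_rectangles(jobs, use_unit="cpu"):
    """Compute one rectangle per job: height from resources, spans from times."""
    rects = []
    y = 0
    for job in jobs:
        height = job[use_unit]
        rects.append({
            "y": y,                                   # bottom edge of this job
            "height": height,                         # resources being examined
            "queued": (job["submit"], job["start"]),  # waiting period
            "running": (job["start"], job["end"]),    # running period up to end time
        })
        y += height  # assumed layout: next job sits on top of this one
    return rects


rects = job_rectangles(jobs)
for rect in rects:
    print(rect["y"], rect["height"], rect["running"])
```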

viewclust_vis.show_job_use module

viewclust_vis.show_job_use.show_job_use(account, target, d_from, d_to='', d_from_drop='', out_path='', use_unit='', plot_jobstack=True, plot_insta=True, plot_cumu=True, plot_mem_delta=False, plot_start_wait=False, plot_wait_viol=False, plot_start_runtime=False, plot_runtime_viol=False, override_frame=[])[source]

Accepts an account name and query period to generate job usage summary figures.

Parameters:
  • account (string) – Name of account for which to query job records (note that Compute Canada systems expect a _cpu or _gpu suffix).
  • target (int-like) – The target share value for the account on the system (typically expressed as “cores” or “core-equivalents”).
  • d_from (date str) – Beginning of the query period, e.g. ‘2019-04-01T00:00:00’.
  • d_to (date str, optional) – End of the query period, e.g. ‘2020-01-01T00:00:00’. Defaults to now if empty.
  • d_from_drop (date str, optional) – Time prior to which to ignore jobs of any state, e.g. ‘2019-12-01T00:00:00’.
  • out_path (str, optional) – Path in which to place the output figure files. Defaults to the current path.
  • use_unit (str, optional) – Usage unit to examine. One of: {‘cpu’, ‘cpu-eqv’, ‘gpu’, ‘gpu-eqv’}. Defaults to ‘cpu’, or is determined from the account suffix (if *_cpu, use_unit=‘cpu-eqv’; if *_gpu, use_unit=‘gpu-eqv’; else use_unit=‘cpu’).
  • plot_jobstack (boolean, optional) – If True plot the jobstack figure. Note that for large job record data frames the jobstack figure can take some time to produce. The jobstack figure is a representation of the time periods and resource size of each job in a job record query. Defaults to True.
  • plot_insta (boolean, optional) – If True plot the insta_plot figure. The insta_plot is a display of the job record usage measurement at each time point over the query period. Defaults to True.
  • plot_cumu (boolean, optional) – If True plot the cumu_plot figure. The cumu_plot is a display of the cumulative job record usage measurement at each time point over the query period. Defaults to True.
  • plot_mem_delta (boolean, optional) – If True plot the mem_delta figure. The mem_delta figure displays the memory requested (allocated) for each job alongside its peak polled memory (MaxRSS). Defaults to False.
  • plot_start_wait (boolean, optional) – If True create the start-time by wait-hours scatter plot figure. Defaults to False.
  • override_frame (DataFrame, optional) – If non-empty, overrides the sacct call with the supplied DataFrame. Defaults to empty.

Output:
  • Requested job usage figures located in the out_path directory.
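
The use_unit fallback based on the account suffix can be written as a small lookup. This is a sketch only: infer_use_unit is a hypothetical helper, and the _gpu branch is an assumption inferred from the list of allowed units and the _cpu/_gpu suffixes mentioned for the account parameter.

```python
def infer_use_unit(account, use_unit=""):
    """Return the usage unit, falling back to the account suffix when unset."""
    if use_unit:
        # An explicit value always wins.
        return use_unit
    if account.endswith("_cpu"):
        return "cpu-eqv"
    if account.endswith("_gpu"):
        # Assumed counterpart of the _cpu rule.
        return "gpu-eqv"
    return "cpu"


# Hypothetical account names.
for name in ("def-alice_cpu", "def-alice_gpu", "def-alice"):
    print(name, infer_use_unit(name))
```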

viewclust_vis.summary_page module

viewclust_vis.summary_page.summary_page(folder_list, page_name)[source]

Builds an html page containing links to all html files in a list of folders.

There’s probably a library to do this properly but I just wanted something low level for now.

Parameters:
  • folder_list (list of str) – List of folders to check for html files.
  • page_name (str) – Output html page name.

See also

use_suite()
Generates multiple figures per account
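
The behaviour described for summary_page, a page of links to every html file found in a list of folders, can be approximated with the standard library. The function build_summary_page below is a hypothetical stand-in written for illustration; it is not the package's implementation and its markup is an assumption.

```python
import html
from pathlib import Path


def build_summary_page(folder_list, page_name):
    """Write an html page linking to every .html file in the given folders."""
    links = []
    for folder in folder_list:
        # Sort for a stable link order within each folder.
        for path in sorted(Path(folder).glob("*.html")):
            href = html.escape(path.as_posix(), quote=True)
            text = html.escape(path.name)
            links.append(f'<li><a href="{href}">{text}</a></li>')
    body = "\n".join(links)
    page = f"<html><body><ul>\n{body}\n</ul></body></html>\n"
    Path(page_name).write_text(page, encoding="utf-8")
    return len(links)


# Usage (hypothetical folder names, one per account):
# build_summary_page(["def-alice_cpu", "def-bob_cpu"], "summary.html")
```

Only files directly inside each folder are linked; nested folders would need a recursive pattern instead.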

viewclust_vis.use_suite module

viewclust_vis.use_suite.use_suite(clust_info, cores_queued, cores_running, folder, submit_run=[])[source]

Creates a folder of a given name and creates figures inside of it.

Function is intended to be called in a loop over a list of accounts. Folder parameter should most often be the current account name.

Parameters:
  • clust_info (DataFrame) – Frame which represents the cluster state at given time intervals. See job_use from viewclust.
  • cores_queued (array_like of DataFrame) – Series displaying queued resources at a particular time. See job_use from viewclust.
  • cores_running (array_like of DataFrame) – Series displaying running resources at a particular time. See job_use from viewclust.
  • folder (str) – Folder to place generated figures inside of. Typically an account name.
  • submit_run (DataFrame, optional) – Draws a red line representing what usage would have looked like if jobs had started instantly. Allows for easier interpretation of the queued series. Defaults to not plotting.

See also

job_use()
Generates the input frames for this function.
summary_page()
Can build an html page based on all figures generated
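
The intended calling pattern, one call per account with the account name as the folder, might look like the loop below. To keep the sketch runnable without a cluster, the data loader and the figure routine are passed in as callables; load_frames and make_figures are hypothetical stand-ins for a job_use wrapper and for use_suite, and the three-value return of load_frames is an assumption.

```python
def run_suite_for_accounts(accounts, load_frames, make_figures):
    """Call the figure routine once per account, using the account as folder."""
    folders = []
    for account in accounts:
        # load_frames stands in for whatever produces the three input frames.
        clust_info, cores_queued, cores_running = load_frames(account)
        # make_figures stands in for use_suite(clust_info, cores_queued,
        # cores_running, folder).
        make_figures(clust_info, cores_queued, cores_running, account)
        folders.append(account)
    return folders


# Stand-in callables so the loop can be exercised without real data.
calls = []


def fake_load(account):
    return ("info", "queued", "running")


def fake_figures(clust_info, cores_queued, cores_running, folder):
    calls.append(folder)


folders = run_suite_for_accounts(["def-alice_cpu", "def-bob_cpu"], fake_load, fake_figures)
print(folders)
```

The returned folder list has the shape that summary_page expects for its folder_list argument.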

viewclust_vis.viewclust_vis module

Main module.

viewclust_vis.viol_plot module

viewclust_vis.viol_plot.viol_plot(d_from, cores_queued, cores_running, target, d_to='', fig_out='')[source]

Violin distribution usage plot.

Parameters:
  • d_from (date str) – Beginning of the query period, e.g. ‘2019-04-01T00:00:00’.
  • cores_queued (array_like of DataFrame) – Series displaying queued resources at a particular time. See job_use from viewclust.
  • cores_running (array_like of DataFrame) – Series displaying running resources at a particular time. See job_use from viewclust.
  • target (int) – Target information for a given usage. Can be user specific or system-wide.
  • d_to (date str, optional) – End of the query period, e.g. ‘2020-01-01T00:00:00’. If empty, defaults to now.
  • fig_out (str, optional) – Writes the generated figure to file as the given name. If empty, skips writing. Defaults to empty.

See also

job_use()
Generates the input frames for this function.

Module contents

Top-level package for ViewClust-Vis.