``glass_box_umap.plotting``
===========================

.. py:module:: glass_box_umap.plotting

Overview
--------

.. list-table:: Classes
   :header-rows: 0
   :widths: auto
   :class: summarytable

   * - :py:obj:`LiveEmbeddingCallback`
     - PyTorch Lightning callback that serves a live-updating Bokeh scatter.

.. list-table:: Functions
   :header-rows: 0
   :widths: auto
   :class: summarytable

   * - :py:obj:`plot_embedding`\ (Z, contributions, \*, group_names, feature_names, feature_values, top_k_global, hover_images, hover_tooltips, hover_data, output_backend)
     - Interactive 2D embedding scatter linked to a feature-contribution bar chart.
   * - :py:obj:`plot_embedding_static`\ (Z, group_ids, group_names, cmap, marker_size)
     - Static (matplotlib) scatter plot of a 2D embedding, optionally colored by group.

Classes
-------

.. autoclass:: LiveEmbeddingCallback

   Bases: :py:obj:`pytorch_lightning.Callback`

   .. rubric:: Methods:

   .. py:method:: on_fit_start(trainer: pytorch_lightning.Trainer, pl_module: pytorch_lightning.LightningModule) -> None

      Called when fit begins.

   .. py:method:: on_train_epoch_start(trainer: pytorch_lightning.Trainer, pl_module: pytorch_lightning.LightningModule) -> None

      Called when the train epoch begins.

   .. py:method:: on_train_epoch_end(trainer: pytorch_lightning.Trainer, pl_module: pytorch_lightning.LightningModule) -> None

      Called when the train epoch ends.

      To access all batch outputs at the end of the epoch, you can cache step
      outputs as an attribute of the :class:`pytorch_lightning.core.LightningModule`
      and access them in this hook:

      .. code-block:: python

         import pytorch_lightning as L
         import torch


         class MyLightningModule(L.LightningModule):
             def __init__(self):
                 super().__init__()
                 self.training_step_outputs = []

             def training_step(self):
                 loss = ...
                 self.training_step_outputs.append(loss)
                 return loss


         class MyCallback(L.Callback):
             def on_train_epoch_end(self, trainer, pl_module):
                 # do something with all training_step outputs, for example:
                 epoch_mean = torch.stack(pl_module.training_step_outputs).mean()
                 pl_module.log("training_epoch_mean", epoch_mean)
                 # free up the memory
                 pl_module.training_step_outputs.clear()

   .. py:method:: on_train_end(trainer: pytorch_lightning.Trainer, pl_module: pytorch_lightning.LightningModule) -> None

      Called when training ends.

Functions
---------

.. py:function:: plot_embedding(Z: numpy.typing.NDArray[numpy.floating], contributions: numpy.typing.NDArray[numpy.floating], *, group_names: collections.abc.Sequence[Any] | numpy.typing.NDArray | None = None, feature_names: list[str] | None = None, feature_values: numpy.typing.NDArray[numpy.floating] | None = None, top_k_global: int = 200, hover_images: numpy.typing.NDArray[numpy.uint8] | None = None, hover_tooltips: str | None = None, hover_data: collections.abc.Mapping[str, collections.abc.Sequence[Any]] | None = None, output_backend: glass_box_umap.plotting.bokeh._scatter.OutputBackend = 'webgl') -> bokeh.models.layouts.LayoutDOM

   Interactive 2D embedding scatter linked to a feature-contribution bar chart.

   A single radio toggle above the scatter chooses how to color the points:

   - ``Group`` (only available when ``group_names`` is provided): categorical
     coloring by user-supplied labels.
   - ``Feature``: a Viridis gradient over the L2-reduced contribution of one
     feature, picked via an autocomplete input that appears below the toggle
     (substring match, case-insensitive).
   - ``Top feature``: each sample is colored by the kept feature with its
     largest L2-reduced contribution. A slider lets the user choose the top-N
     most-frequent top features to colorize; samples whose top feature isn't
     in that set are drawn in gray underneath the colored points.

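   As a rough illustration of the ``Feature`` and ``Top feature`` modes above,
   the sketch below derives the per-sample coloring quantities with NumPy. It
   assumes that "L2-reduced" means the Euclidean norm over the two embedding
   dimensions (axis 1 of a ``(n_samples, 2, n_features)`` array). The variable
   names, the random input, and the tie-breaking are illustrative only and are
   not taken from the library's internals.

   .. code-block:: python

      import numpy as np

      rng = np.random.default_rng(0)
      n_samples, n_features = 6, 4

      # Hypothetical contributions laid out as (n_samples, 2, n_features).
      contributions = rng.normal(size=(n_samples, 2, n_features))

      # Assumption: L2-reduce over the two embedding dimensions,
      # giving one non-negative value per (sample, feature).
      l2 = np.linalg.norm(contributions, axis=1)

      # "Feature" mode: one column of the reduced array drives the gradient.
      feature_idx = 1
      color_value = l2[:, feature_idx]

      # "Top feature" mode: index of the largest reduced contribution per sample.
      top_feature = l2.argmax(axis=1)

      # Keep the top-N most frequent top features; other samples would be gray.
      top_n = 2
      counts = np.bincount(top_feature, minlength=n_features)
      kept = np.argsort(-counts, kind="stable")[:top_n]
      is_colored = np.isin(top_feature, kept)

   Here ``is_colored`` marks the samples that would receive a categorical
   color, and the remaining samples correspond to the gray underlay.
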
   Lasso- or box-selecting points in the scatter updates the linked bar chart
   on the right (which has its own ``L2 | normed L2 | Dim 1 | Dim 2`` view
   toggle); with no selection the bars summarize all samples.

   :param Z: Embedding coordinates of shape ``(n_samples, 2)``.
   :param contributions: Per-feature contributions of shape
       ``(n_samples, 2, n_features)``. Typically the output of
       :meth:`~glass_box_umap.GlassBoxUMAP.compute_contributions` with
       ``reduction=None``.
   :param group_names: Group label per sample. Any sequence of length
       ``n_samples``; elements are stringified before use. When provided, the
       ``Group`` color mode is added to the radio and used as the default;
       when ``None`` (default), the radio shows only ``Feature`` /
       ``Top feature`` and starts in ``Feature`` mode.
   :param feature_names: Human-readable name per feature; length must equal
       ``contributions.shape[2]``. Defaults to ``"Feature {i}"`` (0-indexed).
   :param feature_values: Per-sample feature values of shape
       ``(n_samples, n_features)``. When provided, the default tooltip for
       ``Feature`` mode adds a ``value:`` entry (the picker-selected feature's
       value), and the default tooltip for ``Top feature`` mode adds a
       ``value:`` entry (the top feature's value). The tooltip displays the
       values exactly as passed, with no rescaling: pass raw values for
       human-readable tooltips, or the same standardized array fed to the
       embedder for consistency with contributions space. Ignored when
       ``hover_tooltips`` is set.
   :param top_k_global: How many features to ship to the browser, ranked by
       global L2 importance. Caps everything: the bar chart, the
       feature-picker autocomplete, and the candidate set for top-feature
       ranking.
   :param hover_images: Per-sample uint8 image array of shape
       ``(n_samples, H, W)`` or ``(n_samples, H, W, 3 | 4)``. When set, each
       tooltip shows the sample's image above the default index/group text.
       Mutually exclusive with ``hover_tooltips`` and ``hover_data``.
   :param hover_tooltips: Bokeh tooltip HTML template that fully replaces the
       default. May reference ``@index``, ``@group`` (when ``group_names`` is
       provided), and any keys from ``hover_data``.
   :param hover_data: Extra columns merged into the scatter
       ``ColumnDataSource`` for reference from ``hover_tooltips``. Each value
       must have length ``n_samples``. Keys must not collide with the
       reserved columns ``x``, ``y``, ``index``, ``group``, ``color_value``,
       ``top_feature_group``, ``top_feature_name``, ``top_data_value``,
       ``picker_data_value``, ``sample_rank``.
   :param output_backend: Bokeh rendering backend for the scatter. Defaults
       to ``"webgl"``, which offloads rendering to the GPU and stays smooth
       at high sample counts. Switch to ``"canvas"`` if the
       GPU/driver/browser combination renders the plot incorrectly (e.g.
       blank canvas, wrong-sized points, or color banding). Canvas is slower
       but uses CPU rasterization and works on any setup that supports Bokeh
       at all.

   :returns: A Bokeh layout with color-by controls and the scatter on the
             left, and the linked bar chart with its view toggle on the
             right. Pass it to :func:`bokeh.io.show` or :func:`bokeh.io.save`.

.. py:function:: plot_embedding_static(Z: numpy.typing.NDArray[numpy.floating], group_ids: numpy.typing.NDArray[numpy.integer] | None = None, group_names: list[str] | None = None, cmap: matplotlib.colors.ListedColormap | None = None, marker_size: float = 2.0) -> matplotlib.figure.Figure

   Static (matplotlib) scatter plot of a 2D embedding, optionally colored by group.

   :param Z: Embedding coordinates of shape ``(n_samples, 2)``.
   :param group_ids: Integer group ID per point, of shape ``(n_samples,)``.
       If ``None``, points are uncolored.
   :param group_names: Human-readable name for each group, indexed by group
       ID. If ``None`` and ``group_ids`` is provided, defaults to
       ``str(gid)`` for each group.
   :param cmap: Colormap for the scatter plot. If ``None`` and ``group_ids``
       is provided, a colormap is generated with one color per unique group.
   :param marker_size: Size of scatter plot markers.
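
The ``hover_data`` constraints documented for :py:func:`plot_embedding` (every
value has length ``n_samples``, and no key collides with a reserved column) can
be checked before building the plot. The helper below is a minimal sketch of
such a pre-flight check. ``check_hover_data`` and ``RESERVED_COLUMNS`` are
hypothetical names introduced here for illustration and are not part of
``glass_box_umap``, and the library may report these errors differently.

.. code-block:: python

   from collections.abc import Mapping, Sequence
   from typing import Any

   # Reserved scatter columns, copied from the ``hover_data`` description.
   RESERVED_COLUMNS = frozenset({
       "x", "y", "index", "group", "color_value", "top_feature_group",
       "top_feature_name", "top_data_value", "picker_data_value",
       "sample_rank",
   })


   def check_hover_data(
       hover_data: Mapping[str, Sequence[Any]], n_samples: int
   ) -> None:
       """Hypothetical helper: raise ValueError if hover_data breaks the rules."""
       # Keys must not shadow columns the scatter already uses.
       collisions = sorted(RESERVED_COLUMNS.intersection(hover_data))
       if collisions:
           raise ValueError(f"hover_data keys collide with reserved columns: {collisions}")
       # Every column needs exactly one entry per sample.
       bad = {key: len(values) for key, values in hover_data.items()
              if len(values) != n_samples}
       if bad:
           raise ValueError(f"hover_data lengths differ from n_samples={n_samples}: {bad}")


   # Passes silently: one extra column, one entry per sample.
   check_hover_data({"label": ["a", "b", "c"]}, n_samples=3)

Running the check on the caller's side keeps a malformed column from surfacing
later as a harder-to-read error in the browser or in Bokeh.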