.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/04_bars_one_per_doc/run-04-demo=topic_model_vb+merges-model=hdp_topic+mult.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_examples_04_bars_one_per_doc_run-04-demo=topic_model_vb+merges-model=hdp_topic+mult.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_04_bars_one_per_doc_run-04-demo=topic_model_vb+merges-model=hdp_topic+mult.py:


=================================================
04: Training HDP Topic Model with merge proposals
=================================================

.. GENERATED FROM PYTHON SOURCE LINES 8-20

.. code-block:: default

    import bnpy
    import numpy as np
    import os
    import sys

    from matplotlib import pylab
    import seaborn as sns

    FIG_SIZE = (3, 3)
    SMALL_FIG_SIZE = (1,1)
    pylab.rcParams['figure.figsize'] = FIG_SIZE


.. GENERATED FROM PYTHON SOURCE LINES 21-22

Read the dataset from file.

.. GENERATED FROM PYTHON SOURCE LINES 22-27

.. code-block:: default

    dataset_path = os.path.join(bnpy.DATASET_PATH, 'bars_one_per_doc')
    dataset = bnpy.data.BagOfWordsData.read_npz(
        os.path.join(dataset_path, 'dataset.npz'))


.. GENERATED FROM PYTHON SOURCE LINES 28-29

Set the algorithmic keyword arguments for the local step and for the merge proposals.

.. GENERATED FROM PYTHON SOURCE LINES 30-46

.. code-block:: default

    local_step_kwargs = dict(
        # Perform at most this many iterations at each document
        nCoordAscentItersLP=100,
        # Stop local iters early when max change in doc-topic counts < this thr
        convThrLP=0.001,
        restartLP=0,
        doMemoizeLocalParams=0,
        )

    merge_kwargs = dict(
        m_startLap=5,
        m_pair_ranking_procedure='total_size',
        m_pair_ranking_direction='descending',
        )


.. GENERATED FROM PYTHON SOURCE LINES 47-51

Run the VB+proposals algorithm with only merges and re-shuffling.

Initialization: 10 topics, using ``randomlikewang``.

.. GENERATED FROM PYTHON SOURCE LINES 52-65

.. code-block:: default

    trained_model, info_dict = bnpy.run(
        dataset, 'HDPTopicModel', 'Mult', 'memoVB',
        output_path=
            '/tmp/bars_one_per_doc/' +
            'trymoves-model=hdp+mult-K=10-moves=merge,shuffle/',
        nLap=50, convergeThr=0.001, nBatch=1,
        K=10, initname='randomlikewang',
        alpha=0.5, lam=0.1,
        moves='merge,shuffle',
        **dict(list(merge_kwargs.items()) + list(local_step_kwargs.items())))


.. GENERATED FROM PYTHON SOURCE LINES 68-95

.. code-block:: default

    def show_bars_over_time(
            task_output_path=None,
            query_laps=[0, 1, 2, 5, None],
            ncols=10):
        ''' Show square-image visualization of estimated topics over time.

        Post Condition
        --------------
        New matplotlib figure with visualization (one row per lap).
        '''
        nrows = len(query_laps)
        fig_handle, ax_handles_RC = pylab.subplots(
            figsize=(SMALL_FIG_SIZE[0] * ncols, SMALL_FIG_SIZE[1] * nrows),
            nrows=nrows, ncols=ncols, sharex=True, sharey=True)
        for row_id, lap_val in enumerate(query_laps):
            cur_model, lap_val = bnpy.load_model_at_lap(task_output_path, lap_val)
            cur_topics_KV = cur_model.obsModel.getTopics()
            # Plot the current model
            cur_ax_list = ax_handles_RC[row_id].flatten().tolist()
            bnpy.viz.BarsViz.show_square_images(
                cur_topics_KV,
                vmin=0.0, vmax=0.06,
                ax_list=cur_ax_list)
            cur_ax_list[0].set_ylabel("lap: %d" % lap_val)
        pylab.tight_layout()


.. GENERATED FROM PYTHON SOURCE LINES 96-97

Examine the bars over time.

.. GENERATED FROM PYTHON SOURCE LINES 98-100

.. code-block:: default

    show_bars_over_time(info_dict['task_output_path'])


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 0.000 seconds)


.. _sphx_glr_download_examples_04_bars_one_per_doc_run-04-demo=topic_model_vb+merges-model=hdp_topic+mult.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: run-04-demo=topic_model_vb+merges-model=hdp_topic+mult.py <run-04-demo=topic_model_vb+merges-model=hdp_topic+mult.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: run-04-demo=topic_model_vb+merges-model=hdp_topic+mult.ipynb <run-04-demo=topic_model_vb+merges-model=hdp_topic+mult.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_