
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/06_we8there/run-01-demo=mix_vb_single_run-model=mix+mult.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_examples_06_we8there_run-01-demo=mix_vb_single_run-model=mix+mult.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_06_we8there_run-01-demo=mix_vb_single_run-model=mix+mult.py:


=================================================
VB coordinate descent for Mixture of Multinomials
=================================================

.. GENERATED FROM PYTHON SOURCE LINES 8-24

.. code-block:: default

    import bnpy
    import numpy as np
    import os

    from matplotlib import pylab
    import seaborn as sns

    FIG_SIZE = (3, 3)
    SMALL_FIG_SIZE = (1, 1)
    pylab.rcParams['figure.figsize'] = FIG_SIZE

    top_word_kws = dict(
        wordSizeLimit=15,
        ncols=4,
        Ktop=10)


.. GENERATED FROM PYTHON SOURCE LINES 25-26

Read text dataset from file

.. GENERATED FROM PYTHON SOURCE LINES 26-37

.. code-block:: default

    dataset_path = os.path.join(bnpy.DATASET_PATH, 'we8there', 'raw')
    dataset = bnpy.data.BagOfWordsData.read_npz(
        os.path.join(dataset_path, 'dataset.npz'),
        vocabfile=os.path.join(dataset_path, 'x_csc_colnames.txt'))

    # Filter out documents with fewer than 20 words
    doc_ids = np.flatnonzero(
        dataset.getDocTypeCountMatrix().sum(axis=1) >= 20)
    dataset = dataset.make_subset(docMask=doc_ids, doTrackFullSize=False)


.. GENERATED FROM PYTHON SOURCE LINES 38-39

Make a simple plot of the raw data

.. GENERATED FROM PYTHON SOURCE LINES 40-46

.. code-block:: default

    bnpy.viz.PrintTopics.plotCompsFromWordCounts(
        dataset.getDocTypeCountMatrix()[:10],
        vocabList=dataset.vocabList,
        prefix='doc',
        **top_word_kws)


.. GENERATED FROM PYTHON SOURCE LINES 47-51

Train with K=1 cluster
----------------------

This is a simple baseline.

.. GENERATED FROM PYTHON SOURCE LINES 52-65

.. code-block:: default

    trained_model, info_dict = bnpy.run(
        dataset, 'DPMixtureModel', 'Mult', 'VB',
        output_path='/tmp/we8there/helloworld-model=dp_mix+mult-K=1/',
        nLap=1000, convergeThr=0.0001, nTask=1,
        K=1, initname='bregmankmeans+lam1+iter1',
        gamma0=50.0, lam=0.1)

    bnpy.viz.PrintTopics.plotCompsFromHModel(
        trained_model,
        vocabList=dataset.vocabList,
        **top_word_kws)


.. GENERATED FROM PYTHON SOURCE LINES 66-70

Train with K=3 clusters
-----------------------

Take the best of 10 initializations

.. GENERATED FROM PYTHON SOURCE LINES 71-85

.. code-block:: default

    trained_model, info_dict = bnpy.run(
        dataset, 'DPMixtureModel', 'Mult', 'VB',
        output_path='/tmp/we8there/helloworld-model=dp_mix+mult-K=3/',
        nLap=1000, convergeThr=0.0001, nTask=10,
        K=3, initname='bregmankmeans+lam1+iter1',
        gamma0=50.0, lam=0.1)

    bnpy.viz.PrintTopics.plotCompsFromHModel(
        trained_model,
        vocabList=dataset.vocabList,
        **top_word_kws)


.. GENERATED FROM PYTHON SOURCE LINES 86-90

Train with K=10 clusters
------------------------

Take the best of 10 initializations

.. GENERATED FROM PYTHON SOURCE LINES 91-104

.. code-block:: default

    trained_model, info_dict = bnpy.run(
        dataset, 'DPMixtureModel', 'Mult', 'VB',
        output_path='/tmp/we8there/helloworld-model=dp_mix+mult-K=10/',
        nLap=1000, convergeThr=0.0001, nTask=10,
        K=10, initname='bregmankmeans+lam1+iter1',
        gamma0=50.0, lam=0.1)

    bnpy.viz.PrintTopics.plotCompsFromHModel(
        trained_model,
        vocabList=dataset.vocabList,
        **top_word_kws)


.. GENERATED FROM PYTHON SOURCE LINES 105-109

Train with K=30 clusters
------------------------

Take the best of 10 initializations

.. GENERATED FROM PYTHON SOURCE LINES 110-122

.. code-block:: default

    trained_model, info_dict = bnpy.run(
        dataset, 'DPMixtureModel', 'Mult', 'VB',
        output_path='/tmp/we8there/helloworld-model=dp_mix+mult-K=30/',
        nLap=1000, convergeThr=0.0001, nTask=10,
        K=30, initname='bregmankmeans+lam1+iter1',
        gamma0=50.0, lam=0.1)

    bnpy.viz.PrintTopics.plotCompsFromHModel(
        trained_model,
        vocabList=dataset.vocabList,
        **top_word_kws)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.000 seconds)


.. _sphx_glr_download_examples_06_we8there_run-01-demo=mix_vb_single_run-model=mix+mult.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: run-01-demo=mix_vb_single_run-model=mix+mult.py <run-01-demo=mix_vb_single_run-model=mix+mult.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: run-01-demo=mix_vb_single_run-model=mix+mult.ipynb <run-01-demo=mix_vb_single_run-model=mix+mult.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
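
The four training runs in this example differ only in ``K``, ``nTask``, and
``output_path``. As a minimal sketch (not part of the original script), the
settings could be generated from a single helper. The names
``make_run_kwargs`` and ``run_kwargs_by_K`` are hypothetical and introduced
here for illustration only; the keyword values are copied from the calls in
this example. The block itself does not import bnpy, so it runs on its own;
the commented-out lines indicate how each setting might be passed to
``bnpy.run``.

.. code-block:: python

    def make_run_kwargs(K, base_path='/tmp/we8there'):
        """Build the keyword arguments used in this example for a given K.

        Hypothetical helper: mirrors the four bnpy.run calls in the example,
        which use nTask=1 for the K=1 baseline and nTask=10 otherwise.
        """
        return dict(
            output_path='%s/helloworld-model=dp_mix+mult-K=%d/' % (base_path, K),
            nLap=1000, convergeThr=0.0001,
            nTask=1 if K == 1 else 10,
            K=K, initname='bregmankmeans+lam1+iter1',
            gamma0=50.0, lam=0.1)

    # One settings dict per number of clusters tried in the example.
    run_kwargs_by_K = {K: make_run_kwargs(K) for K in (1, 3, 10, 30)}

    for K, kwargs in run_kwargs_by_K.items():
        print(K, kwargs['nTask'], kwargs['output_path'])
        # With bnpy installed and `dataset` loaded as in the example,
        # each setting could be run as:
        # trained_model, info_dict = bnpy.run(
        #     dataset, 'DPMixtureModel', 'Mult', 'VB', **kwargs)

Keeping the shared hyperparameters in one place makes it harder for the runs
to drift apart when a value such as ``gamma0`` or ``lam`` is changed.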