How delete moves can be more effective than merges.

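This example shows that merge moves alone may not be enough to reliably escape local optima, and that more flexible delete moves can succeed where merges fail. In the diagonal-covariance run below, merges shrink the model from K=25 to K=3 and then stall; a single accepted delete at lap 20 brings it down to K=2.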
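To make the difference between the two moves concrete, here is a toy sketch on a small matrix of soft assignments (rows are data points, columns are clusters). It is an illustration under simplifying assumptions, not bnpy's implementation: the helper names `merge_clusters` and `delete_cluster` are hypothetical, and the real delete move refines the reassignment over several steps rather than renormalizing once.

```python
import numpy as np


def merge_clusters(resp, k_a, k_b):
    """Merge: give all of cluster k_b's mass to cluster k_a, then drop k_b."""
    merged = resp.copy()
    merged[:, k_a] += merged[:, k_b]
    return np.delete(merged, k_b, axis=1)


def delete_cluster(resp, k):
    """Delete: drop cluster k and renormalize each row over the remaining clusters."""
    rest = np.delete(resp, k, axis=1)
    row_sums = rest.sum(axis=1, keepdims=True)
    # Rows owned entirely by the deleted cluster fall back to a uniform split.
    uniform = np.full_like(rest, 1.0 / rest.shape[1])
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    return np.where(row_sums > 0, rest / safe_sums, uniform)


# Three data points, three clusters; cluster 2 straddles the other two.
resp = np.array([
    [0.5, 0.1, 0.4],
    [0.1, 0.5, 0.4],
    [0.0, 0.2, 0.8],
])

merged = merge_clusters(resp, 0, 2)   # all of cluster 2 goes to cluster 0
deleted = delete_cluster(resp, 2)     # cluster 2 is shared out per data point

print(merged)
print(deleted)
```

A merge must send every point of the removed cluster to one partner, whereas a delete lets each point go to whichever remaining cluster suits it best. That extra freedom is why a delete can be accepted when every candidate merge is rejected.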

```
import bnpy
import numpy as np
import os
from matplotlib import pylab
import seaborn as sns
FIG_SIZE = (3, 3)
pylab.rcParams['figure.figsize'] = FIG_SIZE
```

Load dataset from file

```
dataset_path = os.path.join(bnpy.DATASET_PATH, 'faithful')
dataset = bnpy.data.XData.read_csv(
    os.path.join(dataset_path, 'faithful.csv'))
```

Make a simple plot of the raw data

```
pylab.plot(dataset.X[:, 0], dataset.X[:, 1], 'k.')
pylab.xlabel(dataset.column_names[0])
pylab.ylabel(dataset.column_names[1])
pylab.tight_layout()
data_ax_h = pylab.gca()
```

```
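# Merge moves are attempted only once lap >= m_startLap.
merge_kwargs = dict(
    m_startLap=10,
    m_pair_ranking_procedure='total_size',
    m_pair_ranking_direction='descending',
    )
# Delete moves are attempted only once lap >= d_startLap.
delete_kwargs = dict(
    d_startLap=20,
    d_nRefineSteps=50,
    )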
```

```
def show_clusters_over_time(
        task_output_path=None,
        query_laps=(0, 1, 2, 10, 20, None),
        nrows=2):
    ''' Show the clusters saved at each requested lap, one subplot per lap.
    '''
    # True division, so a partially filled last column still gets a subplot.
    ncols = int(np.ceil(len(query_laps) / float(nrows)))
    fig_handle, ax_handle_list = pylab.subplots(
        figsize=(FIG_SIZE[0] * ncols, FIG_SIZE[1] * nrows),
        nrows=nrows, ncols=ncols, sharex=True, sharey=True)
    for plot_id, lap_val in enumerate(query_laps):
        cur_model, lap_val = bnpy.load_model_at_lap(task_output_path, lap_val)
        cur_ax_handle = ax_handle_list.flatten()[plot_id]
        bnpy.viz.PlotComps.plotCompsFromHModel(
            cur_model, dataset=dataset, ax_handle=cur_ax_handle)
        cur_ax_handle.set_title("lap: %d" % lap_val)
        cur_ax_handle.set_xlabel(dataset.column_names[0])
        cur_ax_handle.set_ylabel(dataset.column_names[1])
        # Match the axis limits of the raw-data plot.
        cur_ax_handle.set_xlim(data_ax_h.get_xlim())
        cur_ax_handle.set_ylim(data_ax_h.get_ylim())
    pylab.tight_layout()
```

Start with too many clusters (K=25) and a diagonal-covariance Gaussian likelihood. Run without any moves as a baseline.

```
gamma = 5.0
sF = 5.0
K = 25
diag1_trained_model, diag1_info_dict = bnpy.run(
    dataset, 'DPMixtureModel', 'DiagGauss', 'memoVB',
    output_path=('/tmp/faithful/' +
        'trymoves-K=%d-gamma=%s-lik=DiagGauss-ECovMat=%s*eye-moves=none/' % (
            K, gamma, sF)),
    nLap=1000, nTask=1, nBatch=1, convergeThr=0.0001,
    gamma0=gamma, sF=sF, ECovMat='eye',
    K=K, initname='randexamplesbydist',
    )
show_clusters_over_time(diag1_info_dict['task_output_path'])
```

Out:

```
Dataset Summary:
X Data
total size: 272 units
batch size: 272 units
num. batches: 1
Allocation Model: DP mixture with K=0. Concentration gamma0= 5.00
Obs. Data Model: Gaussian with diagonal covariance.
Obs. Data Prior: independent Gauss-Wishart prior on each dimension
Wishart params
nu = 4
beta = [ 10 10]
Expectations
E[ mean[k]] =
[ 0 0]
E[ covar[k]] =
[[5. 0.]
[0. 5.]]
Initialization:
initname = randexamplesbydist
K = 25 (number of clusters)
seed = 1607680
elapsed_time: 0.0 sec
Learn Alg: memoVB | task 1/1 | alg. seed: 1607680 | data order seed: 8541952
task_output_path: /tmp/faithful/trymoves-K=25-gamma=5.0-lik=DiagGauss-ECovMat=5.0*eye-moves=none/1
1.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 3.062238769e+00 |
2.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 3.029485836e+00 | Ndiff 3.204
3.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.988929671e+00 | Ndiff 3.849
4.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.934114089e+00 | Ndiff 3.694
5.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.864320498e+00 | Ndiff 3.057
6.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.796068518e+00 | Ndiff 2.511
7.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.755200911e+00 | Ndiff 2.108
8.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.724352077e+00 | Ndiff 2.158
9.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.714818175e+00 | Ndiff 2.034
10.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.710015995e+00 | Ndiff 1.890
11.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.704686401e+00 | Ndiff 2.337
12.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.698284253e+00 | Ndiff 2.931
13.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.684190801e+00 | Ndiff 3.692
14.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.673906693e+00 | Ndiff 4.579
15.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.665323886e+00 | Ndiff 5.389
16.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.654124178e+00 | Ndiff 5.757
17.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.640063487e+00 | Ndiff 5.488
18.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.617231862e+00 | Ndiff 4.911
19.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.586775359e+00 | Ndiff 4.515
20.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.569671629e+00 | Ndiff 4.553
21.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.545014854e+00 | Ndiff 4.407
22.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.525424540e+00 | Ndiff 3.380
23.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.509251223e+00 | Ndiff 3.026
24.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.493302635e+00 | Ndiff 3.039
25.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.488984785e+00 | Ndiff 2.950
26.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.484950693e+00 | Ndiff 2.825
27.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.480608059e+00 | Ndiff 2.770
28.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.467505075e+00 | Ndiff 2.837
29.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.465130737e+00 | Ndiff 3.079
30.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.463555210e+00 | Ndiff 3.395
31.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.461714070e+00 | Ndiff 3.730
32.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.459545432e+00 | Ndiff 4.089
33.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.456958951e+00 | Ndiff 4.479
34.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.453823064e+00 | Ndiff 4.899
35.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.449961599e+00 | Ndiff 5.330
36.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.445166898e+00 | Ndiff 5.712
37.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.439248319e+00 | Ndiff 5.926
38.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.432174168e+00 | Ndiff 5.797
39.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.424387029e+00 | Ndiff 5.157
40.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.416911819e+00 | Ndiff 4.062
41.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.409264356e+00 | Ndiff 2.986
42.000/1000 after 0 sec. | 193.9 MiB | K 25 | loss 2.392900067e+00 | Ndiff 2.341
43.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.379189152e+00 | Ndiff 2.183
44.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.375964695e+00 | Ndiff 2.106
45.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.373574138e+00 | Ndiff 1.965
46.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.371010870e+00 | Ndiff 1.705
47.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.368474850e+00 | Ndiff 1.323
48.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.366360106e+00 | Ndiff 0.879
49.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.364898037e+00 | Ndiff 0.531
50.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.363200338e+00 | Ndiff 0.358
51.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.354385944e+00 | Ndiff 0.162
52.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192787e+00 | Ndiff 0.035
53.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192415e+00 | Ndiff 0.021
54.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192256e+00 | Ndiff 0.014
55.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192185e+00 | Ndiff 0.009
56.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192152e+00 | Ndiff 0.007
57.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192136e+00 | Ndiff 0.005
58.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192129e+00 | Ndiff 0.003
59.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192124e+00 | Ndiff 0.003
60.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192122e+00 | Ndiff 0.002
61.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192120e+00 | Ndiff 0.001
62.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192119e+00 | Ndiff 0.001
63.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192119e+00 | Ndiff 0.001
64.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192118e+00 | Ndiff 0.001
65.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192118e+00 | Ndiff 0.001
66.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192117e+00 | Ndiff 0.001
67.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192117e+00 | Ndiff 0.000
68.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192117e+00 | Ndiff 0.000
69.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192117e+00 | Ndiff 0.000
70.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192117e+00 | Ndiff 0.000
71.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192116e+00 | Ndiff 0.000
72.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192116e+00 | Ndiff 0.000
73.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192116e+00 | Ndiff 0.000
74.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192116e+00 | Ndiff 0.000
75.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192116e+00 | Ndiff 0.000
76.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192116e+00 | Ndiff 0.000
77.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192116e+00 | Ndiff 0.000
78.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192116e+00 | Ndiff 0.000
79.000/1000 after 1 sec. | 193.9 MiB | K 25 | loss 2.350192116e+00 | Ndiff 0.000
... done. converged.
SKIPPED 2 comps with size below 0.00
```

Start with too many clusters (K=25) and the same diagonal-covariance likelihood. Use merges and deletes to reduce to a better set.

```
gamma = 5.0
sF = 5.0
K = 25
diag_trained_model, diag_info_dict = bnpy.run(
    dataset, 'DPMixtureModel', 'DiagGauss', 'memoVB',
    output_path=('/tmp/faithful/' +
        'trymoves-K=%d-gamma=%s-lik=DiagGauss-ECovMat=%s*eye-moves=merge,delete,shuffle/' % (
            K, gamma, sF)),
    nLap=100, nTask=1, nBatch=1,
    gamma0=gamma, sF=sF, ECovMat='eye',
    K=K, initname='randexamplesbydist',
    moves='merge,delete,shuffle',
    # list(...) keeps the concatenation valid on Python 3 as well as Python 2.
    **dict(list(delete_kwargs.items()) + list(merge_kwargs.items())))
show_clusters_over_time(diag_info_dict['task_output_path'])
```

Out:

```
Dataset Summary:
X Data
total size: 272 units
batch size: 272 units
num. batches: 1
Allocation Model: DP mixture with K=0. Concentration gamma0= 5.00
Obs. Data Model: Gaussian with diagonal covariance.
Obs. Data Prior: independent Gauss-Wishart prior on each dimension
Wishart params
nu = 4
beta = [ 10 10]
Expectations
E[ mean[k]] =
[ 0 0]
E[ covar[k]] =
[[5. 0.]
[0. 5.]]
Initialization:
initname = randexamplesbydist
K = 25 (number of clusters)
seed = 1607680
elapsed_time: 0.0 sec
Learn Alg: memoVB | task 1/1 | alg. seed: 1607680 | data order seed: 8541952
task_output_path: /tmp/faithful/trymoves-K=25-gamma=5.0-lik=DiagGauss-ECovMat=5.0*eye-moves=merge,delete,shuffle/1
MERGE @ lap 1.00: Disabled. Cannot plan merge on first lap. Need valid SS that represent whole dataset.
DELETE @ lap 1.00: Disabled. Cannot delete before first complete lap, because SS that represents whole dataset is required.
1.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 3.051961291e+00 |
MERGE @ lap 2.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 2.00: Disabled. Waiting for lap >= 20 (--d_startLap).
2.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 3.022209522e+00 | Ndiff 2.772
MERGE @ lap 3.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 3.00: Disabled. Waiting for lap >= 20 (--d_startLap).
3.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 2.984471596e+00 | Ndiff 3.323
MERGE @ lap 4.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 4.00: Disabled. Waiting for lap >= 20 (--d_startLap).
4.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 2.926643058e+00 | Ndiff 2.598
MERGE @ lap 5.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 5.00: Disabled. Waiting for lap >= 20 (--d_startLap).
5.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 2.862469660e+00 | Ndiff 2.755
MERGE @ lap 6.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 6.00: Disabled. Waiting for lap >= 20 (--d_startLap).
6.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 2.779705996e+00 | Ndiff 3.164
MERGE @ lap 7.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 7.00: Disabled. Waiting for lap >= 20 (--d_startLap).
7.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 2.734025386e+00 | Ndiff 3.344
MERGE @ lap 8.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 8.00: Disabled. Waiting for lap >= 20 (--d_startLap).
8.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 2.705925412e+00 | Ndiff 3.188
MERGE @ lap 9.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 9.00: Disabled. Waiting for lap >= 20 (--d_startLap).
9.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 2.695114165e+00 | Ndiff 2.809
DELETE @ lap 10.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 10.00 : 11/21 accepted. Ndiff 96.55. 39 skipped.
10.000/100 after 0 sec. | 193.7 MiB | K 14 | loss 2.588387948e+00 | Ndiff 2.809
DELETE @ lap 11.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 11.00 : 6/10 accepted. Ndiff 57.31. 23 skipped.
11.000/100 after 0 sec. | 193.7 MiB | K 8 | loss 2.418907387e+00 | Ndiff 2.809
DELETE @ lap 12.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 12.00 : 4/5 accepted. Ndiff 24.39. 14 skipped.
12.000/100 after 0 sec. | 193.7 MiB | K 4 | loss 2.344906523e+00 | Ndiff 2.809
DELETE @ lap 13.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 13.00 : 1/4 accepted. Ndiff 0.00. 2 skipped.
13.000/100 after 0 sec. | 193.7 MiB | K 3 | loss 2.342439735e+00 | Ndiff 2.809
MERGE @ lap 14.00: No promising candidates, so no attempts.
DELETE @ lap 14.00: Disabled. Waiting for lap >= 20 (--d_startLap).
14.000/100 after 0 sec. | 193.7 MiB | K 3 | loss 2.341617749e+00 | Ndiff 1.311
DELETE @ lap 15.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 15.00 : 0/2 accepted. Ndiff 0.00. 0 skipped.
15.000/100 after 0 sec. | 193.7 MiB | K 3 | loss 2.341288596e+00 | Ndiff 0.753
DELETE @ lap 16.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 16.00 : 0/1 accepted. Ndiff 0.00. 0 skipped.
16.000/100 after 0 sec. | 193.7 MiB | K 3 | loss 2.341147153e+00 | Ndiff 0.467
DELETE @ lap 17.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 17.00 : 0/2 accepted. Ndiff 0.00. 0 skipped.
17.000/100 after 0 sec. | 193.7 MiB | K 3 | loss 2.341080073e+00 | Ndiff 0.312
MERGE @ lap 18.00: No promising candidates, so no attempts.
DELETE @ lap 18.00: Disabled. Waiting for lap >= 20 (--d_startLap).
18.000/100 after 0 sec. | 193.7 MiB | K 3 | loss 2.341044964e+00 | Ndiff 0.220
DELETE @ lap 19.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 19.00 : 0/2 accepted. Ndiff 0.00. 0 skipped.
19.000/100 after 1 sec. | 193.7 MiB | K 3 | loss 2.341025070e+00 | Ndiff 0.163
MERGE @ lap 20.00: No promising candidates, so no attempts.
DELETE @ lap 20.00: 1/1 accepted. Ndiff 169.12.
20.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925760e+00 | Ndiff 0.163
DELETE @ lap 21.00: 0/1 accepted. Ndiff 0.00.
MERGE @ lap 21.00 : 0/1 accepted. Ndiff 0.00. 0 skipped.
21.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925543e+00 | Ndiff 0.009
MERGE @ lap 22.00: No promising candidates, so no attempts.
DELETE @ lap 22.00: 0/1 accepted. Ndiff 0.00.
22.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925529e+00 | Ndiff 0.002
MERGE @ lap 23.00: No promising candidates, so no attempts.
DELETE @ lap 23.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
23.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925528e+00 | Ndiff 0.001
MERGE @ lap 24.00: No promising candidates, so no attempts.
DELETE @ lap 24.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
24.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925528e+00 | Ndiff 0.000
MERGE @ lap 25.00: No promising candidates, so no attempts.
DELETE @ lap 25.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
25.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925528e+00 | Ndiff 0.000
DELETE @ lap 26.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
MERGE @ lap 26.00 : 0/1 accepted. Ndiff 0.00. 0 skipped.
26.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925528e+00 | Ndiff 0.000
MERGE @ lap 27.00: No promising candidates, so no attempts.
DELETE @ lap 27.00: 0/1 accepted. Ndiff 0.00.
27.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925528e+00 | Ndiff 0.000
MERGE @ lap 28.00: No promising candidates, so no attempts.
DELETE @ lap 28.00: 0/1 accepted. Ndiff 0.00.
28.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925528e+00 | Ndiff 0.000
MERGE @ lap 29.00: No promising candidates, so no attempts.
DELETE @ lap 29.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
29.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925528e+00 | Ndiff 0.000
MERGE @ lap 30.00: No promising candidates, so no attempts.
DELETE @ lap 30.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
30.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925528e+00 | Ndiff 0.000
DELETE @ lap 31.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
MERGE @ lap 31.00 : 0/1 accepted. Ndiff 0.00. 0 skipped.
31.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925528e+00 | Ndiff 0.000
... done. converged.
```
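The log above is easier to follow once it is reduced to (lap, K, loss) triples. The sketch below assumes the progress-line format shown in the output; `parse_progress` is a hypothetical helper written for this page, not part of bnpy, and the sample text is a handful of lines copied from the log.

```python
import re

# Matches progress lines such as
# "10.000/100 after 0 sec. | 193.7 MiB | K 14 | loss 2.588387948e+00 | Ndiff 2.809"
PROGRESS_RE = re.compile(
    r'^\s*(?P<lap>\d+\.\d+)/\d+ after .*\| K\s+(?P<K>\d+) \| loss (?P<loss>\S+)')


def parse_progress(log_text):
    """Return a list of (lap, K, loss) tuples, one per progress line."""
    rows = []
    for line in log_text.splitlines():
        match = PROGRESS_RE.match(line)
        if match:
            rows.append((
                float(match.group('lap')),
                int(match.group('K')),
                float(match.group('loss'))))
    return rows


sample_log = """\
9.000/100 after 0 sec. | 193.7 MiB | K 25 | loss 2.695114165e+00 | Ndiff 2.809
MERGE @ lap 10.00 : 11/21 accepted. Ndiff 96.55. 39 skipped.
10.000/100 after 0 sec. | 193.7 MiB | K 14 | loss 2.588387948e+00 | Ndiff 2.809
DELETE @ lap 20.00: 1/1 accepted. Ndiff 169.12.
20.000/100 after 1 sec. | 193.7 MiB | K 2 | loss 2.330925760e+00 | Ndiff 0.163
"""

rows = parse_progress(sample_log)
for lap, K, loss in rows:
    print('lap %5.1f  K %2d  loss %.6f' % (lap, K, loss))
```

Applied to the full log, this makes the two stages visible at a glance: K falls from 25 to 3 during laps 10 to 13 through merges, stays at 3 until deletes are enabled, and drops to 2 at lap 20.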

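Start with too many clusters (K=25), this time with a full-covariance Gaussian likelihood. Use merges and deletes to reduce to a better set. Here merges alone reach K=2 by lap 13, and no delete is accepted afterwards.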

```
full_trained_model, full_info_dict = bnpy.run(
    dataset, 'DPMixtureModel', 'Gauss', 'memoVB',
    output_path=('/tmp/faithful/' +
        'trymoves-K=%d-gamma=%s-lik-Gauss-ECovMat=%s*eye-moves=merge,delete,shuffle/' % (
            K, gamma, sF)),
    nLap=100, nTask=1, nBatch=1,
    gamma0=gamma, sF=sF, ECovMat='eye',
    K=K, initname='randexamplesbydist',
    moves='merge,delete,shuffle',
    # list(...) keeps the concatenation valid on Python 3 as well as Python 2.
    **dict(list(delete_kwargs.items()) + list(merge_kwargs.items())))
show_clusters_over_time(full_info_dict['task_output_path'])
```

Out:

```
Dataset Summary:
X Data
total size: 272 units
batch size: 272 units
num. batches: 1
Allocation Model: DP mixture with K=0. Concentration gamma0= 5.00
Obs. Data Model: Gaussian with full covariance.
Obs. Data Prior: Gauss-Wishart on mean and covar of each cluster
E[ mean[k] ] =
[0. 0.]
E[ covar[k] ] =
[[5. 0.]
[0. 5.]]
Initialization:
initname = randexamplesbydist
K = 25 (number of clusters)
seed = 1607680
elapsed_time: 0.0 sec
Learn Alg: memoVB | task 1/1 | alg. seed: 1607680 | data order seed: 8541952
task_output_path: /tmp/faithful/trymoves-K=25-gamma=5.0-lik-Gauss-ECovMat=5.0*eye-moves=merge,delete,shuffle/1
MERGE @ lap 1.00: Disabled. Cannot plan merge on first lap. Need valid SS that represent whole dataset.
DELETE @ lap 1.00: Disabled. Cannot delete before first complete lap, because SS that represents whole dataset is required.
1.000/100 after 0 sec. | 193.4 MiB | K 25 | loss 2.959097158e+00 |
MERGE @ lap 2.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 2.00: Disabled. Waiting for lap >= 20 (--d_startLap).
2.000/100 after 0 sec. | 193.4 MiB | K 25 | loss 2.937002302e+00 | Ndiff 3.003
MERGE @ lap 3.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 3.00: Disabled. Waiting for lap >= 20 (--d_startLap).
3.000/100 after 0 sec. | 193.4 MiB | K 25 | loss 2.912622033e+00 | Ndiff 3.709
MERGE @ lap 4.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 4.00: Disabled. Waiting for lap >= 20 (--d_startLap).
4.000/100 after 0 sec. | 193.4 MiB | K 25 | loss 2.880654959e+00 | Ndiff 3.092
MERGE @ lap 5.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 5.00: Disabled. Waiting for lap >= 20 (--d_startLap).
5.000/100 after 0 sec. | 193.4 MiB | K 25 | loss 2.829532053e+00 | Ndiff 3.072
MERGE @ lap 6.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 6.00: Disabled. Waiting for lap >= 20 (--d_startLap).
6.000/100 after 0 sec. | 193.4 MiB | K 25 | loss 2.769025659e+00 | Ndiff 2.422
MERGE @ lap 7.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 7.00: Disabled. Waiting for lap >= 20 (--d_startLap).
7.000/100 after 1 sec. | 193.4 MiB | K 25 | loss 2.725273863e+00 | Ndiff 1.750
MERGE @ lap 8.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 8.00: Disabled. Waiting for lap >= 20 (--d_startLap).
8.000/100 after 1 sec. | 193.4 MiB | K 25 | loss 2.675946971e+00 | Ndiff 1.849
MERGE @ lap 9.00: Disabled. Waiting for lap >= 10 (--m_startLap).
DELETE @ lap 9.00: Disabled. Waiting for lap >= 20 (--d_startLap).
9.000/100 after 1 sec. | 193.4 MiB | K 25 | loss 2.647374823e+00 | Ndiff 1.998
DELETE @ lap 10.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 10.00 : 11/18 accepted. Ndiff 107.08. 41 skipped.
10.000/100 after 1 sec. | 193.4 MiB | K 14 | loss 2.556490112e+00 | Ndiff 1.998
DELETE @ lap 11.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 11.00 : 6/7 accepted. Ndiff 88.56. 27 skipped.
11.000/100 after 1 sec. | 193.4 MiB | K 8 | loss 2.396890018e+00 | Ndiff 1.998
DELETE @ lap 12.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 12.00 : 4/5 accepted. Ndiff 35.75. 11 skipped.
12.000/100 after 1 sec. | 193.4 MiB | K 4 | loss 2.315013331e+00 | Ndiff 1.998
DELETE @ lap 13.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 13.00 : 2/3 accepted. Ndiff 6.24. 3 skipped.
13.000/100 after 1 sec. | 193.4 MiB | K 2 | loss 2.282002142e+00 | Ndiff 1.998
MERGE @ lap 14.00: No promising candidates, so no attempts.
DELETE @ lap 14.00: Disabled. Waiting for lap >= 20 (--d_startLap).
14.000/100 after 1 sec. | 193.4 MiB | K 2 | loss 2.268874488e+00 | Ndiff 1.018
DELETE @ lap 15.00: Disabled. Waiting for lap >= 20 (--d_startLap).
MERGE @ lap 15.00 : 0/1 accepted. Ndiff 0.00. 0 skipped.
15.000/100 after 1 sec. | 193.4 MiB | K 2 | loss 2.268112275e+00 | Ndiff 0.390
MERGE @ lap 16.00: No promising candidates, so no attempts.
DELETE @ lap 16.00: Disabled. Waiting for lap >= 20 (--d_startLap).
16.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268077838e+00 | Ndiff 0.076
MERGE @ lap 17.00: No promising candidates, so no attempts.
DELETE @ lap 17.00: Disabled. Waiting for lap >= 20 (--d_startLap).
17.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076838e+00 | Ndiff 0.013
MERGE @ lap 18.00: No promising candidates, so no attempts.
DELETE @ lap 18.00: Disabled. Waiting for lap >= 20 (--d_startLap).
18.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076812e+00 | Ndiff 0.002
MERGE @ lap 19.00: No promising candidates, so no attempts.
DELETE @ lap 19.00: Disabled. Waiting for lap >= 20 (--d_startLap).
19.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
DELETE @ lap 20.00: 0/1 accepted. Ndiff 0.00.
MERGE @ lap 20.00 : 0/1 accepted. Ndiff 0.00. 0 skipped.
20.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
MERGE @ lap 21.00: No promising candidates, so no attempts.
DELETE @ lap 21.00: 0/1 accepted. Ndiff 0.00.
21.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
MERGE @ lap 22.00: No promising candidates, so no attempts.
DELETE @ lap 22.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
22.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
MERGE @ lap 23.00: No promising candidates, so no attempts.
DELETE @ lap 23.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
23.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
MERGE @ lap 24.00: No promising candidates, so no attempts.
DELETE @ lap 24.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
24.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
DELETE @ lap 25.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
MERGE @ lap 25.00 : 0/1 accepted. Ndiff 0.00. 0 skipped.
25.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
MERGE @ lap 26.00: No promising candidates, so no attempts.
DELETE @ lap 26.00: 0/1 accepted. Ndiff 0.00.
26.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
MERGE @ lap 27.00: No promising candidates, so no attempts.
DELETE @ lap 27.00: 0/1 accepted. Ndiff 0.00.
27.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
MERGE @ lap 28.00: No promising candidates, so no attempts.
DELETE @ lap 28.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
28.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
MERGE @ lap 29.00: No promising candidates, so no attempts.
DELETE @ lap 29.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
29.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
DELETE @ lap 30.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
MERGE @ lap 30.00 : 0/1 accepted. Ndiff 0.00. 0 skipped.
30.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
MERGE @ lap 31.00: No promising candidates, so no attempts.
DELETE @ lap 31.00: Empty plan. 0 UIDs eligible as delete target. 0 too busy with other moves. 0 too big. 2 have past failures.
31.000/100 after 2 sec. | 193.4 MiB | K 2 | loss 2.268076811e+00 | Ndiff 0.000
... done. converged.
```

```
pylab.figure()
pylab.plot(
    diag1_info_dict['lap_history'][2:],
    diag1_info_dict['loss_history'][2:], 'r.-',
    label='diag_covar fixed')
pylab.plot(
    diag_info_dict['lap_history'][2:],
    diag_info_dict['loss_history'][2:], 'k.-',
    label='diag_covar + moves')
pylab.plot(
    full_info_dict['lap_history'][2:],
    full_info_dict['loss_history'][2:], 'b.-',
    label='full_covar + moves')
pylab.legend(loc='upper right')
pylab.xlabel('num. laps')
pylab.ylabel('loss')
pylab.tight_layout()
```
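The same comparison can be read off numerically. In the sketch below the final loss values are copied by hand from the last progress line of each log above so that the snippet runs on its own. In a live session they would presumably come from `loss_history[-1]` of each info dict instead.

```python
# Final loss of each run, copied from the last progress line of its log.
final_loss = {
    'diag_covar fixed': 2.350192116,
    'diag_covar + moves': 2.330925528,
    'full_covar + moves': 2.268076811,
}

# Sort from lowest (best) to highest loss.
ranked = sorted(final_loss.items(), key=lambda item: item[1])
for label, loss in ranked:
    print('%-20s %.6f' % (label, loss))

best_label = ranked[0][0]
gain_from_moves = (
    final_loss['diag_covar fixed'] - final_loss['diag_covar + moves'])
print('loss reduction from moves (diag): %.6f' % gain_from_moves)
```

Both runs with moves finish below the fixed-K baseline, and the full-covariance run finishes lowest of the three.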

**Total running time of the script:** ( 0 minutes 7.234 seconds)