Configuration Parameters Reference

The following configuration parameters are supported by pebl. They can be set in a configuration file or via the config.set() function.
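
As a minimal sketch of the configuration-file route, the snippet below assumes the file uses INI-style sections, with a parameter named section.option stored as option under [section]. That layout is an assumption rather than something this reference states. The file contents and the helper flatten() are hypothetical, and the parsing uses Python's standard configparser module, not pebl itself.

```python
import configparser

# Hypothetical config text. The INI-style layout is an assumption.
CONFIG_TEXT = """
[data]
filename = mydata.txt
discretize = 3

[learner]
type = greedy.GreedyLearner
numtasks = 4

[result]
filename = result.pebl
format = html
"""


def flatten(text):
    """Return {'section.option': value} for every option in the INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        f"{section}.{option}": value
        for section in parser.sections()
        for option, value in parser.items(section)
    }


params = flatten(CONFIG_TEXT)
for name, value in sorted(params.items()):
    print(name, "=", value)
```

The dotted names produced here match the parameter names listed in this reference. All values come back as strings, so numeric parameters would still need converting.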

data

data.discretize
Number of bins used to discretize data. Specify 0 to indicate that data should not be discretized. default=0
data.filename
File to read data from. default=None
data.text
The text of a dataset included directly in the config file. default=

learner

learner.numtasks
Number of learner tasks to run. default=1
learner.type
Type of learner to use.
The following learners are included with pebl:
  • greedy.GreedyLearner
  • simanneal.SimulatedAnnealingLearner
  • exhaustive.ListLearner

default=greedy.GreedyLearner

greedy

greedy.max_iterations
Maximum number of iterations to run. default=1000
greedy.max_time
Maximum learner runtime in seconds. default=0
greedy.max_unimproved_iterations
Maximum number of iterations without score improvement before a restart. default=500
greedy.seed
Starting network for a greedy search. default=

simanneal

simanneal.delta_temp
Change in temp between steps. default=0.5
simanneal.max_iters_at_temp
Max iterations at any temperature. default=100
simanneal.seed
Starting network for a simulated annealing search. default=
simanneal.start_temp
Starting temperature for a run. default=100.0

listlearner

listlearner.networks
List of networks to score. default=

localscore_cache

localscore_cache.maxsize
Maximum number of local scores to cache. Specify -1 for unlimited size. default=-1
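
To illustrate what a size limit with -1 as "unlimited" can look like, here is a small sketch of a bounded cache. The class LocalScoreCache and its oldest-first eviction policy are assumptions made for this example. This reference does not say how pebl behaves once the cache is full.

```python
from collections import OrderedDict


class LocalScoreCache:
    """Hypothetical cache; maxsize=-1 means no limit on the number of entries."""

    def __init__(self, maxsize=-1):
        self.maxsize = maxsize
        self._scores = OrderedDict()

    def get(self, key):
        """Return the cached score, or None if the key is not cached."""
        return self._scores.get(key)

    def put(self, key, score):
        if key in self._scores:
            self._scores[key] = score
            return
        if self.maxsize == 0:
            return  # a zero-size cache stores nothing
        if self.maxsize > 0 and len(self._scores) >= self.maxsize:
            # Assumed policy: drop the oldest entry to make room.
            self._scores.popitem(last=False)
        self._scores[key] = score

    def __len__(self):
        return len(self._scores)


bounded = LocalScoreCache(maxsize=2)
unlimited = LocalScoreCache()  # maxsize=-1
for i in range(5):
    bounded.put(("node", i), -float(i))
    unlimited.put(("node", i), -float(i))
print(len(bounded), len(unlimited))
```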

result

result.filename
The name of the result output file. default=result.pebl
result.format
The format for the pebl result file (pickle or html). default=pickle
result.size
Number of top-scoring networks to save. Specify 0 to indicate that all scored networks should be saved. default=1000
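
The effect of result.size can be sketched as a top-N selection. The function keep_top() and the (score, network) pairs are hypothetical, and the example assumes that a higher score is better. It is not pebl's result-handling code.

```python
import heapq


def keep_top(scored_networks, size=1000):
    """Return the highest-scoring (score, network) pairs, best first.

    size == 0 keeps every scored network, mirroring result.size.
    """
    if size == 0:
        return sorted(scored_networks, key=lambda pair: pair[0], reverse=True)
    return heapq.nlargest(size, scored_networks, key=lambda pair: pair[0])


scored = [(-12.5, "net_a"), (-10.1, "net_b"), (-15.0, "net_c")]
top_two = keep_top(scored, size=2)
everything = keep_top(scored, size=0)
print(top_two)
print(everything)
```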

gibbs

gibbs.burnin
Burn-in period for the Gibbs sampler (specified as a multiple of the number of missing values). default=10
gibbs.max_iterations

Stopping criterion for the Gibbs sampler.

The number of Gibbs sampler iterations to run. Should be a valid Python expression using the variable n (number of missing values). Examples:

  • n**2 (for n-squared iterations)
  • 100 (for 100 iterations)

default=n**2
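
The sketch below shows how such an expression resolves to a concrete iteration count, and how a burn-in multiple translates into iterations. The helpers gibbs_iterations() and burnin_iterations() are hypothetical. They illustrate the arithmetic only and are not how pebl evaluates these settings internally.

```python
def gibbs_iterations(expression, n):
    """Evaluate an iteration-count expression with n = number of missing values."""
    # Builtins are emptied so only arithmetic on n is available.
    # This is an illustration, not a security boundary.
    return int(eval(expression, {"__builtins__": {}}, {"n": n}))


def burnin_iterations(burnin, n):
    """Burn-in length when the setting is a multiple of n."""
    return burnin * n


n_missing = 12
print(gibbs_iterations("n**2", n_missing))  # 144
print(gibbs_iterations("100", n_missing))   # 100
print(burnin_iterations(10, n_missing))     # 120
```

With 12 missing values, the default settings would therefore give 144 sampler iterations and a burn-in of 120.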

taskcontroller

taskcontroller.type
The task controller to use. default=serial

multiprocess

multiprocess.poolsize
Number of processes to run concurrently (0 means no limit). default=0

xgrid

xgrid.controller
Hostname or IP of the Xgrid controller. default=
xgrid.grid
Id of the grid to use at the Xgrid controller. default=0
xgrid.password
Password for the Xgrid controller. default=
xgrid.peblpath
Full path to the pebl script on Xgrid agents. default=pebl
xgrid.pollinterval
Time (in secs) to wait between polling the Xgrid controller. default=60.0

ipython1

ipython1.controller
Address of the IPython1 TaskController. default=127.0.0.1:10113

ec2

ec2.config
EC2 config file. This is kept separate from the pebl config because it contains authentication keys, etc. default=
ec2.max_count
Maximum number of EC2 instances to create. Specify 0 to use the same number as ec2.min_count. default=0
ec2.min_count
Minimum number of EC2 instances to create. default=1
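
The interaction between ec2.min_count and ec2.max_count can be sketched as follows. The function instance_range() is hypothetical, and its validation checks are assumptions added for the example. Only the rule that a maximum of 0 falls back to the minimum comes from this reference.

```python
def instance_range(min_count=1, max_count=0):
    """Return (minimum, maximum) EC2 instances to request.

    max_count == 0 means "same as min_count".
    """
    if min_count < 1:
        raise ValueError("min_count must be at least 1")  # assumed check
    effective_max = min_count if max_count == 0 else max_count
    if effective_max < min_count:
        raise ValueError("max_count must not be below min_count")  # assumed check
    return min_count, effective_max


print(instance_range())       # (1, 1)
print(instance_range(2, 0))   # (2, 2)
print(instance_range(2, 8))   # (2, 8)
```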