Configuration Parameters Reference
Here are the configuration parameters supported by pebl. These can be set in a
configuration file or via the config.set() function.
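As a minimal sketch of the configuration-file route, the example below builds a file with Python's standard configparser module. It assumes the file is INI-style and that a parameter named section.option is stored as option under [section]; that mapping is an assumption, not something stated in this reference, so check it against your pebl version. The values shown are illustrative only.

```python
import configparser
import io

# Illustrative values; the parameter names come from this reference.
params = {
    "data.filename": "mydata.txt",
    "learner.type": "greedy.GreedyLearner",
    "greedy.max_iterations": "5000",
    "result.filename": "result.pebl",
}

def build_config(params):
    """Group dotted 'section.option' names into INI sections.

    ASSUMPTION: the part before the first dot is the section name.
    """
    parser = configparser.ConfigParser()
    for name, value in params.items():
        section, option = name.split(".", 1)
        if not parser.has_section(section):
            parser.add_section(section)
        parser.set(section, option, value)
    return parser

parser = build_config(params)

# Render the file contents to a string instead of touching the disk.
buffer = io.StringIO()
parser.write(buffer)
config_text = buffer.getvalue()
print(config_text)
```

The same names would presumably be passed to config.set() when configuring at run time; its exact signature is not documented in this section.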
data
-
data.discretize
- Number of bins used to discretize data. Specify 0 to indicate that data should not be discretized.
default=0
-
data.filename
- File to read data from.
default=None
-
data.text
- The text of a dataset included in config file.
default=
learner
-
learner.numtasks
- Number of learner tasks to run.
default=1
-
learner.type
Type of learner to use.
- The following learners are included with pebl:
- greedy.GreedyLearner
- simanneal.SimulatedAnnealingLearner
- exhaustive.ListLearner
default=greedy.GreedyLearner
greedy
-
greedy.max_iterations
- Maximum number of iterations to run.
default=1000
-
greedy.max_time
- Maximum learner runtime in seconds.
default=0
-
greedy.max_unimproved_iterations
- Maximum number of iterations without score improvement before a restart.
default=500
-
greedy.seed
- Starting network for a greedy search.
default=
simanneal
-
simanneal.delta_temp
- Change in temperature between steps.
default=0.5
-
simanneal.max_iters_at_temp
- Maximum number of iterations at any temperature.
default=100
-
simanneal.seed
- Starting network for a simulated annealing search.
default=
-
simanneal.start_temp
- Starting temperature for a run.
default=100.0
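To show how simanneal.start_temp, simanneal.delta_temp, and simanneal.max_iters_at_temp might interact, here is a small sketch of a cooling schedule. It assumes that delta_temp is a multiplicative factor applied once max_iters_at_temp iterations have run at the current temperature. This reference only says "change in temperature between steps", so the multiplicative rule is an assumption, and the function name and total_iters argument are hypothetical.

```python
def temperature_schedule(start_temp=100.0, delta_temp=0.5,
                         max_iters_at_temp=100, total_iters=300):
    """Yield the temperature in effect at each iteration.

    ASSUMPTION: the temperature is multiplied by delta_temp after
    max_iters_at_temp iterations; pebl's actual rule may differ.
    """
    temp = start_temp
    iters_at_temp = 0
    for _ in range(total_iters):
        yield temp
        iters_at_temp += 1
        if iters_at_temp >= max_iters_at_temp:
            temp *= delta_temp      # cool down
            iters_at_temp = 0       # restart the per-temperature counter

temps = list(temperature_schedule())
distinct = sorted(set(temps), reverse=True)
print(distinct)  # [100.0, 50.0, 25.0]
```

Under this assumption the defaults halve the temperature every 100 iterations.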
listlearner
-
listlearner.networks
- List of networks to score.
default=
localscore_cache
-
localscore_cache.maxsize
- Maximum number of local scores to cache. The default of -1 means unlimited size.
default=-1
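The effect of localscore_cache.maxsize can be illustrated with a small size-bounded cache. This is a sketch, not pebl's implementation: the class name, the (node, parents) key, and the choice to evict the oldest entry when the cache is full are all assumptions made for illustration.

```python
from collections import OrderedDict

class BoundedScoreCache:
    """Hypothetical cache of local scores; maxsize=-1 means unlimited."""

    def __init__(self, maxsize=-1):
        self.maxsize = maxsize
        self._scores = OrderedDict()

    def get(self, key, compute):
        """Return the cached score for key, computing it on a miss."""
        if key in self._scores:
            return self._scores[key]
        score = compute(key)
        self._scores[key] = score
        # ASSUMPTION: when over the limit, drop the oldest inserted entry.
        if self.maxsize >= 0 and len(self._scores) > self.maxsize:
            self._scores.popitem(last=False)
        return score

    def __len__(self):
        return len(self._scores)

calls = []

def fake_score(key):
    """Stand-in for a real scoring function; records each computation."""
    calls.append(key)
    node, parents = key
    return -float(len(parents))

keys = [(0, ()), (1, (0,)), (2, (0, 1))]

unlimited = BoundedScoreCache()           # default maxsize=-1
for key in keys:
    unlimited.get(key, fake_score)
unlimited.get((0, ()), fake_score)        # cache hit, no recomputation

bounded = BoundedScoreCache(maxsize=2)
for key in keys:
    bounded.get(key, fake_score)
```

A bounded cache trades recomputation time for memory, which matters when the number of distinct parent sets is large.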
result
-
result.filename
- The name of the result output file.
default=result.pebl
-
result.format
- The format for the pebl result file (pickle or html).
default=pickle
-
result.size
- Number of top-scoring networks to save. Specify 0 to indicate that all scored networks should be saved.
default=1000
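The meaning of result.size, including the special value 0, can be sketched as follows. The function name and the (score, network) tuples are hypothetical and stand in for whatever pebl stores internally; the network strings are placeholders.

```python
import heapq

def keep_top(scored_networks, size=1000):
    """Return the highest-scoring (score, network) pairs, best first.

    size=0 keeps every scored network, mirroring result.size.
    """
    if size == 0:
        return sorted(scored_networks, key=lambda pair: pair[0], reverse=True)
    return heapq.nlargest(size, scored_networks, key=lambda pair: pair[0])

# Placeholder scores and network descriptions for illustration.
scored = [(-12.5, "A->B"), (-10.1, "B->A"), (-15.0, "A B")]

top_two = keep_top(scored, size=2)
everything = keep_top(scored, size=0)
print(top_two)
```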
gibbs
-
gibbs.burnin
- Burn-in period for the Gibbs sampler (specified as a multiple of the number of missing values).
default=10
-
gibbs.max_iterations
Stopping criterion for the Gibbs sampler.
The number of Gibbs sampler iterations to run. Should be a valid
Python expression using the variable n (number of missing values).
Examples:
- n**2 (for n-squared iterations)
- 100 (for 100 iterations)
default=n**2
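As a rough illustration of how the gibbs.max_iterations expression and the gibbs.burnin multiple turn into concrete counts, the sketch below evaluates an expression with n bound to the number of missing values. How pebl evaluates the expression internally is not stated here; the helper names are hypothetical, and the restricted eval call is an assumption, not a security sandbox.

```python
def gibbs_iterations(expression, n):
    """Evaluate a max_iterations expression with n = number of missing values.

    ASSUMPTION: only the variable n is in scope during evaluation.
    """
    value = eval(expression, {"__builtins__": {}}, {"n": n})
    return int(value)

def burnin_iterations(burnin, n):
    """gibbs.burnin is given as a multiple of the number of missing values."""
    return burnin * n

missing_values = 12  # illustrative dataset with 12 missing entries

default_iters = gibbs_iterations("n**2", missing_values)
fixed_iters = gibbs_iterations("100", missing_values)
burnin = burnin_iterations(10, missing_values)
print(default_iters, fixed_iters, burnin)  # 144 100 120
```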
taskcontroller
-
taskcontroller.type
- The task controller to use.
default=serial
multiprocess
-
multiprocess.poolsize
- Number of processes to run concurrently (0 means no limit).
default=0
xgrid
-
xgrid.controller
- Hostname or IP of the Xgrid controller.
default=
-
xgrid.grid
- ID of the grid to use at the Xgrid controller.
default=0
-
xgrid.password
- Password for the Xgrid controller.
default=
-
xgrid.peblpath
- Full path to the pebl script on Xgrid agents.
default=pebl
-
xgrid.pollinterval
- Time (in secs) to wait between polling the Xgrid controller.
default=60.0
ipython1
-
ipython1.controller
- IPython1 TaskController (default is 127.0.0.1:10113)
default=127.0.0.1:10113
ec2
-
ec2.config
- EC2 config file. This is kept separate from the pebl config because it contains
authentication keys, etc.
default=
-
ec2.max_count
- Maximum number of EC2 instances to create (default=0 means the same number as ec2.min_count).
default=0
-
ec2.min_count
- Minimum number of EC2 instances to create (default=1).
default=1
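The relationship between ec2.min_count and ec2.max_count can be written out as a short sketch. The function name is hypothetical, and rejecting a maximum below the minimum is an assumption added for illustration; this reference does not say how pebl handles that case.

```python
def resolve_instance_counts(min_count=1, max_count=0):
    """Return the (minimum, maximum) number of EC2 instances to request.

    max_count=0 means "same as min_count", as described for ec2.max_count.
    """
    if max_count == 0:
        max_count = min_count
    # ASSUMPTION: a maximum below the minimum is treated as a config error.
    if max_count < min_count:
        raise ValueError("ec2.max_count must be 0 or >= ec2.min_count")
    return min_count, max_count

defaults = resolve_instance_counts()
same_as_min = resolve_instance_counts(min_count=2)
explicit_range = resolve_instance_counts(min_count=2, max_count=5)
print(defaults, same_as_min, explicit_range)  # (1, 1) (2, 2) (2, 5)
```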