Michael Cho wrote:Some rough pseudocode of what I am implementing:
- Code: Select all
MCMC-ORDER-SEARCH(dataset d, max_parents k, max_time t):
[1. INITIALIZATION:]
od_0 = random_order( d )
s_0 = log_score( od_0, d, k )
results = { (od_0, s_0) }
end_time = system_time() + t
while ( system_time() < end_time )
[2. PROPOSAL:]
od*_{i+1} = random_swap( od_i )
s*_{i+1} = log_score( od*_{i+1}, d, k )
[3. ACCEPT-REJECT:]
if ( rand[0,1] < exp( s*_{i+1} - s_i ) )
od_{i+1} = od*_{i+1}
s_{i+1} = s*_{i+1}
else
od_{i+1} = od_i
s_{i+1} = s_i
end
results = { results, (od*_{i+1}, s*_{i+1}) }
end
The main detail I am not too sure about is the last step. In a sampling algorithm, we would store
od_{i+1} since that is the step we made. In this version, it seems that we would ignore duplicates (i.e. it wouldn't be useful to attempt to add
od_{i+1} when
od_{i+1}=od_i).
The prior idea involves in random_order() and random_swap(). User can specify their pror knowledge on the ordering search. For example, the user can specify any pairwise relationships using the following syntax for each line:
node1, node2, prior value
which represents that node1 should proceed node2 with prior probability of the given prior value. For example, if a user specifies the following lines (we ask users to specify probability that is always higher than 0.5, i.e., you can specify the first line as gene1, M_gene1, 0.0 as well):
M_gene1, gene1, 1.0
gene2, gene3, 0.7
Then whenever you generate a random_order() or random_swap(), you will make sure M_gene1 does not end up in a lower order than gene1. Also, when generating a random_order() or random_swap(), if gene2 is in higher order of gene3 then proceed, however, if gene2 become lower order of gene3, then only accept it when you toss a bias coin with getting a head of (1.0-0.7)/0.5 = 0.6 and when the head comes out. Please double check the math.