Start the discussion about BDe scoring module

This module implements Equation 35 of Heckerman's article in Theory & Concepts
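
To make the formula concrete, here is a minimal sketch. It assumes the equation referred to is the Bayesian Dirichlet marginal likelihood, in which each node contributes a product of Gamma-function ratios over its parent configurations and states. The function name `log_bd_family_score`, the uniform (BDeu-style) hyperparameters, and the `ess` parameter are illustrative assumptions, not the module's actual code.

```python
from collections import Counter
from math import lgamma

def log_bd_family_score(rows, child, parents, states, ess=1.0):
    """Log Bayesian Dirichlet score of `child` given `parents`.

    rows:   list of dicts mapping variable name -> observed state
    states: dict mapping variable name -> list of possible states
    ess:    equivalent sample size (assumed uniform, BDeu-style prior)
    """
    r = len(states[child])            # number of child states
    q = 1                             # number of parent configurations
    for p in parents:
        q *= len(states[p])
    alpha_ijk = ess / (r * q)         # prior count per (configuration, state)
    alpha_ij = ess / q                # prior count per configuration

    # Observed counts N_ij and N_ijk.
    n_ij = Counter(tuple(row[p] for p in parents) for row in rows)
    n_ijk = Counter((tuple(row[p] for p in parents), row[child]) for row in rows)

    # Unobserved configurations and states contribute zero to the log score,
    # so it is enough to loop over the observed counts.
    score = 0.0
    for n in n_ij.values():
        score += lgamma(alpha_ij) - lgamma(alpha_ij + n)
    for n in n_ijk.values():
        score += lgamma(alpha_ijk + n) - lgamma(alpha_ijk)
    return score
```

The log score of a whole structure would then be the sum of this quantity over all nodes. Working in log space with `lgamma` avoids overflow when the counts are large.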

Re: Start the discussion about BDe scoring module

Postby cwyoo » Mon Nov 28, 2016 4:14 pm

lsand039 wrote:Here are some dot file examples.


Also please post associated datasets as well.
cwyoo
Site Admin
 
Posts: 379
Joined: Sun Jun 22, 2014 2:38 pm

Re: Start the discussion about BDe scoring module

Postby lsand039 » Tue Nov 29, 2016 11:05 am

Here are the data files associated with the structures.
Attachments
11275.txt (11275 variables, 8.19 MiB)
training.txt (8 variables, 15.89 KiB)
lsand039
 
Posts: 237
Joined: Thu Jan 14, 2016 12:17 pm

Re: Start the discussion about BDe scoring module

Postby qja0428 » Tue Nov 29, 2016 1:50 pm

This week I implemented two functions in the graph class: one loads a DOT text file into a graph, and the other converts a graph to a DOT text file and saves it in a specified directory. I have tested both and they work fine. I have uploaded all the code files to our GitLab. Thanks!
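
For anyone who wants to try the round trip without the class, here is a rough sketch. It only handles simple `A -> B;` edge statements, not the full DOT grammar (no chained edges, attributes, or subgraphs), and the names `parse_dot_edges` and `edges_to_dot` are made up for illustration; they are not the functions in the uploaded code.

```python
import re

# Matches one simple directed edge such as `A -> B` or `"A" -> "B"`.
EDGE_RE = re.compile(r'"?([\w.]+)"?\s*->\s*"?([\w.]+)"?')

def parse_dot_edges(text):
    """Return the directed edges found in simple DOT text as (parent, child) pairs."""
    return [(m.group(1), m.group(2)) for m in EDGE_RE.finditer(text)]

def edges_to_dot(edges, name="G"):
    """Serialize (parent, child) pairs as a DOT digraph string."""
    lines = [f"digraph {name} {{"]
    lines += [f"    {a} -> {b};" for a, b in edges]
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Reading from or writing to a file is left to the caller, for example with `open(path).read()` and `open(path, "w").write(...)`.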
qja0428
 
Posts: 18
Joined: Wed Jan 21, 2015 4:13 pm

Re: Start the discussion about BDe scoring module

Postby qja0428 » Tue Nov 29, 2016 6:05 pm

lsand039 wrote:Here are the data files associated with the structures.

Hi Lauren, I have checked your data and found some problems. First, the data should be in a CSV file. Second, each column should be one variable and each row should be one observation. Third, my code assumes that the data set has a header row (variable names) and no row names. So I think you should clean your data first, and then the code will work.
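
A quick way to check a file against these requirements could look like the sketch below. The function name `read_observations` is hypothetical and this is not the loader in the scoring code; it only assumes comma-separated text with one header row.

```python
import csv
import io

def read_observations(csv_text):
    """Parse CSV text with a header row into (variable_names, rows).

    Raises ValueError if any observation has a different number of
    fields than the header, which is what happens when a file carries
    row names that the header does not account for.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [row for row in reader if row]   # skip blank lines
    for i, row in enumerate(rows, start=1):
        if len(row) != len(header):
            raise ValueError(
                f"observation {i} has {len(row)} fields, expected {len(header)}"
            )
    return header, rows
```

For example, `read_observations(open("training.csv").read())` would return the variable names and the observations as lists of strings, assuming the file has already been converted to CSV.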
qja0428
 
Posts: 18
Joined: Wed Jan 21, 2015 4:13 pm

Re: Start the discussion about BDe scoring module

Postby cwyoo » Wed Nov 30, 2016 1:58 pm

qja0428 wrote:
lsand039 wrote:Here are the data files associated with the structures.

Hi Lauren, I have checked your data and found some problems. First, the data should be in a CSV file. Second, each column should be one variable and each row should be one observation. Third, my code assumes that the data set has a header row (variable names) and no row names. So I think you should clean your data first, and then the code will work.


I suggest you commit your code to the SMLG Git server under the public domain and document how to run it at:

Board index ‹ Manuscripts & Documentation ‹ Implemented Classes/Modules/Programs Documentation ‹ How to run scripts/programs
cwyoo
Site Admin
 
Posts: 379
Joined: Sun Jun 22, 2014 2:38 pm

Re: Start the discussion about BDe scoring module

Postby qxin001 » Fri May 26, 2017 11:50 am

Concept learning for Bayesian networks:
1. Bayes' rule: P(A|B) = P(B|A)P(A)/P(B)
2. Likelihood: how probable is the evidence given that our hypothesis is true?
Prior: how probable was our hypothesis before observing the evidence?
Posterior: how probable is our hypothesis given the observed evidence?
Marginal: how probable is the new evidence under all possible hypotheses?
P(H|e) = P(e|H)P(H)/P(e)

What we want to know: p(s|x), the posterior.
What we need to know: p(s|x) ∝ p(x|s)p(s), i.e. likelihood × prior (the marginal p(x) only normalizes the result).
Background knowledge: p(x|s) is the likelihood, and p(s) is the prior, which we have to assume.
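
As a small worked example of the rule above, the sketch below normalizes likelihood × prior over a set of hypotheses. The numbers are invented purely for illustration, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, likelihood):
    """Posterior P(H|e) for each hypothesis H.

    prior:      dict mapping hypothesis -> P(H)
    likelihood: dict mapping hypothesis -> P(e|H)
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(joint.values())          # marginal P(e)
    return {h: joint[h] / evidence for h in joint}

# Illustrative numbers only.
prior = {"disease": 0.01, "healthy": 0.99}
likelihood = {"disease": 0.90, "healthy": 0.05}   # P(positive test | H)
post = posterior(prior, likelihood)
```

Even with a likelihood of 0.90, the posterior for "disease" stays fairly low here because the prior is so small, which is the point of weighting the likelihood by the prior.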

3. A Bayesian network consists of the following:
A set of variables and a set of directed edges between variables.
Each variable has a finite set of mutually exclusive states.
The variables and directed edges form a DAG (directed acyclic graph).
To each variable A with parents B1, B2, ..., Bn there is attached a conditional probability table P(A | B1, B2, ..., Bn). [Jensen, 1996]
Jensen, F. V., An Introduction to Bayesian Networks, Springer-Verlag, 1996.

4. Three types of connection in a Bayesian network structure:
Diverging connection (A <- C -> B): P(A,B,C) = P(B|C)P(A|C)P(C)
Serial connection (A -> C -> B): P(A,B,C) = P(B|C)P(C|A)P(A)
Converging connection (A -> C <- B): P(A,B,C) = P(C|A,B)P(B)P(A)
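
To see what such a factorization implies, the sketch below builds the joint distribution for the serial connection from made-up binary CPTs. The tables and the helper `prob` are illustrative assumptions. In a serial connection, once C is known, A tells us nothing more about B.

```python
from itertools import product

# Made-up CPTs for binary variables, for illustration only.
p_a = {0: 0.7, 1: 0.3}
p_c_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # p_c_given_a[a][c]
p_b_given_c = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}   # p_b_given_c[c][b]

# Serial connection: P(A,B,C) = P(B|C) P(C|A) P(A)
joint = {(a, b, c): p_b_given_c[c][b] * p_c_given_a[a][c] * p_a[a]
         for a, b, c in product((0, 1), repeat=3)}

def prob(**fixed):
    """Sum the joint over all entries matching the fixed values of a, b, c."""
    total = 0.0
    for (a, b, c), p in joint.items():
        vals = {"a": a, "b": b, "c": c}
        if all(vals[k] == v for k, v in fixed.items()):
            total += p
    return total

# P(B=1 | A=1, C=1) and P(B=1 | C=1) should agree.
p_b_given_ac = prob(a=1, b=1, c=1) / prob(a=1, c=1)
p_b_given_c_only = prob(b=1, c=1) / prob(c=1)
```

Swapping in the diverging or converging factorization only changes the line that builds `joint`.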

5. Bayesian belief network: for instance, if we already know the CPT of LungCancer, we can calculate the joint probability as P(Smoker) * P(FamilyHistory) * P(LungCancer | FamilyHistory, Smoker), which gives a more accurate result. If the conditional probabilities are unknown, we should estimate them with the EM algorithm or another method. If the network structure is unknown, we should use the K2 algorithm (a greedy search) to learn it from the data we already have.
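
The LungCancer calculation can be sketched as follows. All probabilities here are invented for illustration and do not come from any data set in this thread; the names `p_smoker`, `p_family`, `p_cancer`, and `joint` are hypothetical.

```python
from itertools import product

# Illustrative numbers only.
p_smoker = {True: 0.3, False: 0.7}
p_family = {True: 0.1, False: 0.9}
# P(LungCancer = True | FamilyHistory, Smoker), keyed by (family, smoker)
p_cancer = {(False, False): 0.01, (False, True): 0.10,
            (True, False): 0.05, (True, True): 0.30}

def joint(cancer, family, smoker):
    """P(LungCancer, FamilyHistory, Smoker) from the product of the CPT entries."""
    p_l = p_cancer[(family, smoker)]
    return p_smoker[smoker] * p_family[family] * (p_l if cancer else 1.0 - p_l)

# Marginal P(LungCancer = True): sum the joint over both parents.
p_cancer_true = sum(joint(True, f, s)
                    for f, s in product((True, False), repeat=2))
```

With these numbers, `joint(True, True, True)` is 0.3 * 0.1 * 0.3, and the marginal is obtained by adding the four parent combinations.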
qxin001
 
Posts: 2
Joined: Wed May 10, 2017 12:01 pm
