Implementation Hints

Module that implements conversion of dot format to Genie Bayesian Network format

Post by M Charles » Fri Jul 18, 2014 12:56 am

As previously mentioned, I will try to provide some hints on implementation. Before you begin, you should set up your own project in Git for this module (you can fork my project and delete any files you don't need). It will be easier for you to simply implement your own free function (i.e. a function not inside a class) rather than try to fit your function into my current interface.

This module requires two main tasks:
  1. Read in a Graphviz DOT file
  2. Print out a Genie XDSL file
There are various ways to implement this. I will address two extremes:

Do it yourself (The more difficult way)
Essentially, you will code everything from scratch.

Step 1 is the trickier part, and it may require some understanding of lexing/parsing. You need to read in the DOT file and identify the valid tokens (e.g. brackets, ->, etc.). These tokens should then be put into a representation that the rest of the code can work with. If you put some restrictions on the allowed DOT syntax, you can simplify this process.
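
To make the lexing step concrete, here is a minimal sketch of a tokenizer for a deliberately restricted DOT subset. The restrictions are my own assumption, not a requirement of the module: only unquoted identifiers, "->", braces and semicolons are accepted, and anything else (attributes in square brackets, quoted names, comments) is rejected. The names Token, tokenize and collectArcs are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Token kinds for the restricted DOT subset assumed by this sketch.
enum class TokenKind { Identifier, Arrow, LBrace, RBrace, Semicolon };

struct Token {
    TokenKind kind;
    std::string text;
};

// True for characters allowed in an unquoted identifier.
static bool isIdChar(char ch) {
    return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
}

// Split DOT text into tokens; throws on anything outside the subset.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (std::isspace(static_cast<unsigned char>(c))) {
            ++i;
        } else if (c == '{') {
            tokens.push_back({TokenKind::LBrace, "{"});
            ++i;
        } else if (c == '}') {
            tokens.push_back({TokenKind::RBrace, "}"});
            ++i;
        } else if (c == ';') {
            tokens.push_back({TokenKind::Semicolon, ";"});
            ++i;
        } else if (c == '-' && i + 1 < src.size() && src[i + 1] == '>') {
            tokens.push_back({TokenKind::Arrow, "->"});
            i += 2;
        } else if (isIdChar(c)) {
            const std::size_t start = i;
            while (i < src.size() && isIdChar(src[i])) ++i;
            tokens.push_back({TokenKind::Identifier, src.substr(start, i - start)});
        } else {
            throw std::runtime_error(
                std::string("unsupported character in DOT input: ") + c);
        }
    }
    return tokens;
}

struct Arc {
    std::string parent;
    std::string child;
};

// Collect every "A -> B" pattern; a chain "A -> B -> C" yields two arcs.
std::vector<Arc> collectArcs(const std::vector<Token>& tokens) {
    std::vector<Arc> arcs;
    for (std::size_t i = 0; i + 2 < tokens.size(); ++i) {
        if (tokens[i].kind == TokenKind::Identifier &&
            tokens[i + 1].kind == TokenKind::Arrow &&
            tokens[i + 2].kind == TokenKind::Identifier) {
            arcs.push_back({tokens[i].text, tokens[i + 2].text});
        }
    }
    return arcs;
}
```

Calling tokenize on "digraph G { A -> B; B -> C; }" and passing the result to collectArcs gives the arcs A -> B and B -> C. Rejecting unsupported characters up front keeps the sketch short; a real converter would probably need to handle at least quoted names and attribute lists.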

From the representation, you should have enough information to understand the arcs and nodes of the network. Then, you can output XDSL format by opening a file stream and outputting the necessary text. Try opening a Genie-generated XDSL file with a text editor (e.g. vim) to see the syntax (it is an XML file with particular tags). Whitespace is probably not needed, but it is useful for readability.
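
As a rough illustration of the output step, the sketch below writes nodes and their parents to a stream as XDSL-like XML. Several things here are assumptions: the element names (smile, nodes, cpt, state, parents, probabilities) are reproduced from memory of typical GeNIe output and should be compared against a real GeNIe-generated file; every node gets two placeholder states and a uniform probability table; nodes are assumed to be listed with parents before children; node ids are not XML-escaped; and the layout section that GeNIe normally adds is omitted. NodeSpec, writeXdsl and toXdslString are hypothetical names.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory description of one network node.
struct NodeSpec {
    std::string id;
    std::vector<std::string> parents;  // ids of parent nodes
};

// Write the network as XDSL-like XML. Tag names are an assumption to be
// checked against a GeNIe-generated file; ids are written unescaped.
void writeXdsl(std::ostream& out, const std::string& networkId,
               const std::vector<NodeSpec>& nodes) {
    out << "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n";
    out << "<smile version=\"1.0\" id=\"" << networkId << "\">\n";
    out << "\t<nodes>\n";
    for (const NodeSpec& n : nodes) {
        out << "\t\t<cpt id=\"" << n.id << "\">\n";
        // Two placeholder states per node (an assumption of this sketch).
        out << "\t\t\t<state id=\"State0\" />\n";
        out << "\t\t\t<state id=\"State1\" />\n";
        if (!n.parents.empty()) {
            out << "\t\t\t<parents>";
            for (std::size_t i = 0; i < n.parents.size(); ++i) {
                if (i > 0) out << ' ';
                out << n.parents[i];
            }
            out << "</parents>\n";
        }
        // Uniform table: 2 entries for each of the 2^(#parents) parent
        // configurations, i.e. 2 * 2^(#parents) values of 0.5.
        const std::size_t entries = std::size_t{2} << n.parents.size();
        out << "\t\t\t<probabilities>";
        for (std::size_t i = 0; i < entries; ++i) {
            if (i > 0) out << ' ';
            out << "0.5";
        }
        out << "</probabilities>\n";
        out << "\t\t</cpt>\n";
    }
    out << "\t</nodes>\n";
    out << "</smile>\n";
}

// Convenience wrapper that returns the XML as a string.
std::string toXdslString(const std::string& networkId,
                         const std::vector<NodeSpec>& nodes) {
    std::ostringstream os;
    writeXdsl(os, networkId, nodes);
    return os.str();
}
```

To write a file instead, pass a std::ofstream to writeXdsl. Whether GeNIe accepts a file without its usual layout section is something I have not verified, so treat the output as a starting point to diff against a file saved by GeNIe itself.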

Use the libraries! (The easier way)
Here, you can just let other libraries do your heavy lifting.

First, the Boost library is very powerful and has a module to form a Graph from a DOT file (http://www.boost.org/doc/libs/1_55_0/libs/graph/doc/table_of_contents.html).

With this, you can use the API to access nodes/edges and then create a DSL_network with the SMILE API (http://genie.sis.pitt.edu/wiki/SMILE_Documentation). Then, you can run the WriteFile() function to generate an XDSL file.

In the end...
Whichever way you want to do it is fine. You should be able to get better performance from the first (it skips the creation of graph-like objects internally), but the second will be a lot easier.

Re: Implementation Hints

Post by cwyoo » Mon Jul 28, 2014 10:39 am

Avik, please update your progress here and commit your code in Git every day, starting from today. Post your questions/comments here as well.

Re: Implementation Hints

Post by cwyoo » Tue Jul 29, 2014 5:49 pm

cwyoo wrote: Avik, please update your progress here and commit your code in Git every day, starting from today. Post your questions/comments here as well.


Also, this implementation is more or less the reverse of what Michael already wrote and checked into Git. Michael, for the record, could you let Avik know where your code is in Git? Did you happen to document the module?

Re: Implementation Hints

Post by lsand039 » Mon May 08, 2017 4:58 pm

I've found the Dot to XDSL program most helpful when I have a structure in BaNJO that I want represented in GeNIe. Just be aware that in this conversion, the variables can only hold two states. I have not had success in getting the GeNIe XDSL file to show more than two variable states. Here is the link to the Dot to XDSL program on the Git server: http://smlg.fiu.edu/gitlab/cwyoo/dot-to-xdsl.

Below are the files needed and the terminal commands to run the Dot to XDSL program.

Files:
Dataset.txt - The variable names should be columns and the samples' values should make up the rows. This is the same format needed in GeNIe.
InputFile.dot - Copy the dot file text from the output of the BaNJO results and paste it into this file.

Running the Dot to XDSL program:
1. In the terminal, change to the directory that contains the Dot to XDSL program.
2. Run "./dot-to-xdsl".
3. Open the resulting output.xdsl file in GeNIe.

A common problem I ran into was getting the message "Cannot read data file: [Dataset.txt] ... exiting." Make sure that there are no blank lines in the Dataset.txt file, especially the very last line.
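
If you want to check for this before running the converter, a small helper along these lines can report which lines are blank. This is only a sketch and not part of the Dot to XDSL program; the function names are hypothetical, and "blank" is taken to mean empty or containing only spaces, tabs or carriage returns.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Return the 1-based numbers of lines that are empty or whitespace-only.
// A single newline at the very end of the input does not count as a blank
// line, but an extra empty line after it does.
std::vector<std::size_t> findBlankLines(std::istream& in) {
    std::vector<std::size_t> blank;
    std::string line;
    std::size_t lineNumber = 0;
    while (std::getline(in, line)) {
        ++lineNumber;
        if (line.find_first_not_of(" \t\r") == std::string::npos) {
            blank.push_back(lineNumber);
        }
    }
    return blank;
}

// Convenience overload for text that is already in memory.
std::vector<std::size_t> findBlankLinesInText(const std::string& text) {
    std::istringstream in(text);
    return findBlankLines(in);
}
```

For a file on disk, open it with std::ifstream and pass the stream to findBlankLines; an empty result means no blank lines were found.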

Hope this helps!
