bernardoj wrote:Using the dataset that Olu posted, I transposed it and cleaned it up a bit. I filled in missing values and corrected a few others. I had to transpose it because otherwise the BN algorithm would treat the trials as variables, rather than the other way around. After that I went through and removed the problematic entries.
At the start, there were 618 total patients/observations. I removed 32 that did not have a gender specified, which left 586. Then I removed 13 that did not have an age specified, which left 573 observations.
From there I looked at how each glioblastoma case was graded. There were 16 cases where the grade was 0, 86 where the grade was 1, 4 where the grade was 3, 114 where the grade was 4 and 352 where the grade was 11. Since I'm not sure what 11 means, I decided to do two things. In one scenario, I turned anything that was not 0 into 1 to mark which people had glioblastoma and which did not. I called this dataset CaseControlGrade Binary. After a little research, I came to think that 11 might mean Grade II. Therefore, in the second scenario, I changed just the 11s to 2, so that dataset now has grades 0, 1, 2, 3 and 4.
Lastly, I ran the Max-Min Hill-Climbing (MMHC) algorithm as a trial, to see what I would get and how long it would take. It took about 4 hours to finish. You can see the results in the zip file. I will be rerunning MMHC to see what results I get now with a cleaner dataset.
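The cleaning steps described in the post above can be sketched roughly as follows. This is only an illustration in plain Python, not the code that was actually run: the toy table, the column names `patient`, `gender`, `age` and `grade`, and the helper function names are all assumptions made up for the example.

```python
def transpose(rows):
    """Flip a variables-by-patients table into patients-by-variables."""
    return [list(col) for col in zip(*rows)]


def drop_incomplete(records):
    """Keep only patients that have both a gender and an age recorded."""
    missing = (None, "")
    return [r for r in records
            if r["gender"] not in missing and r["age"] not in missing]


def recode_binary(grade):
    """Scenario 1: 0 stays 0 (control), every other grade becomes 1 (case)."""
    return 0 if grade == 0 else 1


def recode_ordinal(grade):
    """Scenario 2: treat 11 as Grade II (an assumption), keep the rest."""
    return 2 if grade == 11 else grade


# Hypothetical toy input: one row per variable, one column per patient.
raw = [
    ["patient", "P1", "P2", "P3", "P4", "P5"],
    ["gender", "F", "", "M", "M", "F"],
    ["age", 54, 61, None, 47, 39],
    ["grade", 11, 4, 1, 0, 3],
]

# After transposing, the first row holds the variable names.
header, *body = transpose(raw)
records = [dict(zip(header, row)) for row in body]

kept = drop_incomplete(records)
binary_grades = [recode_binary(r["grade"]) for r in kept]
ordinal_grades = [recode_ordinal(r["grade"]) for r in kept]

print([r["patient"] for r in kept], binary_grades, ordinal_grades)
```

A count check after each filtering step is cheap and worth adding: the grade counts quoted in the post (16, 86, 4, 114 and 352) sum to 572, one fewer than the 573 observations reported after filtering.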
Olu1 wrote:
Hey Bernardo, for each of your datasets, which GSEs did you combine? Zhengua needs to authenticate your datasets.