Probabilistic Graphical Models, Fall 2021

Class Projects from courses such as Probabilistic Graphical Network, Biostatistics II, etc.

Re: Probabilistic Graphical Models, Fall 2021

Postby rpazo001 » Fri Nov 26, 2021 12:54 pm

Data analysis is complete. I have run both approximate inference and exact inference on the models. I chose to use the model that had the best score. I will work on the paper this week.
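For readers comparing the two kinds of inference mentioned above, here is a minimal sketch of the difference. It is not taken from the attached R script: the two-node network A -> B and all of its probabilities are made up for illustration, and rejection sampling is used as one simple example of an approximate method.

```python
import random

# Hypothetical two-node network A -> B; all probabilities are made up.
P_A = {True: 0.3, False: 0.7}
P_B_GIVEN_A = {
    True: {True: 0.9, False: 0.1},
    False: {True: 0.2, False: 0.8},
}

def exact_posterior_a_given_b(b=True):
    """Exact inference by enumeration: P(A=True | B=b) via Bayes' rule."""
    joint = {a: P_A[a] * P_B_GIVEN_A[a][b] for a in (True, False)}
    return joint[True] / sum(joint.values())

def approx_posterior_a_given_b(b=True, n_samples=100_000, seed=0):
    """Approximate inference by rejection sampling: draw (A, B) from the
    network and keep only the draws that agree with the evidence B=b."""
    rng = random.Random(seed)
    kept = 0
    hits = 0
    for _ in range(n_samples):
        a = rng.random() < P_A[True]
        b_sample = rng.random() < P_B_GIVEN_A[a][True]
        if b_sample == b:
            kept += 1
            hits += a
    return hits / kept

exact = exact_posterior_a_given_b()
approx = approx_posterior_a_given_b()
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The exact answer comes from summing the joint distribution, which becomes expensive as networks grow. The sampled estimate only approaches it as the number of samples increases.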
Attachments
Class Project Script 11-12.R
(22.01 KiB) Downloaded 45 times
rpazo001
 
Posts: 58
Joined: Fri May 29, 2020 12:34 pm

Re: Probabilistic Graphical Models, Fall 2021

Postby mlwilcox » Tue Nov 30, 2021 7:02 am

Wilcox_2021.11.29.pptx
(155.09 KiB) Downloaded 45 times
Attached is an updated PowerPoint showing the progress on my class project. After my last discussion with Dr. Yoo at the SMGL meeting, I decided to scale back my project and perform a differential abundance analysis to identify operational taxonomic units (OTUs) that are associated with IBD diagnosis.

I completed the following steps:

1. Normalized counts within each sample to account for potential unequal quantities of starting RNA. I used the total-sum scaling normalization method, which essentially transforms the abundance count table into a relative abundance table.
2. Calculated log2 fold-changes for Crohn's Disease (CD) relative to control, and for ulcerative colitis (UC) relative to control.
3. Selected the 20 OTUs with the largest absolute log2 fold change for CD and for UC, resulting in ~40 OTUs.
4. Discretized the relative abundance of each selected OTU based on its mean value.
5. Used bnlearn in R to learn the Bayesian network. The model inputs were IBD diagnosis (CD, UC, or nonIBD) and the relative abundances of the ~40 identified OTUs. I used the hill-climbing algorithm with the log-likelihood score, allowing a maximum of 3 parents per node.
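Steps 1 to 4 above can be sketched roughly as follows. This is an illustration in plain Python, not the R code used for the project. The toy counts, the sample and OTU names, the pseudo-count, the use of group means for the fold change, and the two-level "high"/"low" split around the mean are all assumptions made for the example.

```python
import math

def total_sum_scale(counts):
    """counts: {sample: {otu: raw count}} -> relative abundance per sample."""
    scaled = {}
    for sample, otus in counts.items():
        total = sum(otus.values())
        scaled[sample] = {otu: c / total for otu, c in otus.items()}
    return scaled

def mean_abundance(rel, samples, otu):
    return sum(rel[s][otu] for s in samples) / len(samples)

def log2_fold_change(rel, case, control, otu, pseudo=1e-6):
    """log2(mean case / mean control); the pseudo-count (an assumption)
    avoids taking the log of zero."""
    return math.log2((mean_abundance(rel, case, otu) + pseudo)
                     / (mean_abundance(rel, control, otu) + pseudo))

def top_k_by_abs_lfc(lfc, k):
    """Names of the k OTUs with the largest absolute log2 fold change."""
    return sorted(lfc, key=lambda otu: abs(lfc[otu]), reverse=True)[:k]

def discretize_by_mean(rel, otu):
    """'high' if a sample is above the OTU's mean over all samples, else 'low'."""
    samples = list(rel)
    m = mean_abundance(rel, samples, otu)
    return {s: "high" if rel[s][otu] > m else "low" for s in samples}

# Hypothetical toy data: two CD samples and two controls, three OTUs.
counts = {
    "cd1": {"otu_a": 80, "otu_b": 10, "otu_c": 10},
    "cd2": {"otu_a": 60, "otu_b": 20, "otu_c": 20},
    "ctl1": {"otu_a": 20, "otu_b": 40, "otu_c": 40},
    "ctl2": {"otu_a": 10, "otu_b": 50, "otu_c": 40},
}
rel = total_sum_scale(counts)
lfc = {otu: log2_fold_change(rel, ["cd1", "cd2"], ["ctl1", "ctl2"], otu)
       for otu in ("otu_a", "otu_b", "otu_c")}
selected = top_k_by_abs_lfc(lfc, 2)
levels = discretize_by_mean(rel, "otu_a")
print(selected, levels)
```

The discretized columns, together with the diagnosis label, would then form the input table for structure learning.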

The structure of the learned BN is attached. The nodes that are part of the Markov blanket for IBD diagnosis are shown in bold.
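As a reminder of what the Markov blanket contains, a node's blanket in a DAG is its parents, its children, and the other parents of its children. A minimal sketch follows; the small DAG and its node names are hypothetical and are not the learned structure in the attachment.

```python
def markov_blanket(parents, node):
    """parents: {node: set of its parent nodes} describing a DAG.
    Returns parents, children, and the children's other parents of `node`."""
    children = {c for c, ps in parents.items() if node in ps}
    spouses = {p for c in children for p in parents[c]}
    return (set(parents.get(node, ())) | children | spouses) - {node}

# Hypothetical DAG: otu_1 -> diagnosis -> otu_2 <- otu_3 -> otu_4
dag = {
    "diagnosis": {"otu_1"},
    "otu_1": set(),
    "otu_2": {"diagnosis", "otu_3"},
    "otu_3": set(),
    "otu_4": {"otu_3"},
}
blanket = markov_blanket(dag, "diagnosis")
print(sorted(blanket))
```

Here otu_3 is in the blanket only because it shares the child otu_2 with the diagnosis node, while otu_4 is outside it.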

I have also started drafting the background and methods sections of my paper.
Attachments
DAG 11.29.21.jpg
DAG 11.29.21.jpg (286.09 KiB) Viewed 692 times
mlwilcox
 
Posts: 11
Joined: Thu Apr 29, 2021 7:51 pm

Re: Probabilistic Graphical Models, Fall 2021

Postby rpazo001 » Thu Dec 02, 2021 11:05 am

I decided to focus on a gene that is integral to the mTOR pathway and to use just one BN. I focused specifically on this gene and its relationship with a zinc finger protein in zinc-deficient rats. The paper is broken into two parts. Part 1 is the limma analysis, which identifies the DEGs that are used in the BN analysis. Part 2 is the BN analysis, but it focuses on only a small subset of the DEGs.
Attachments
Class Project Paper.docx
Paper
(547.59 KiB) Downloaded 48 times
bn2.xlsx
data for R w/ top 15 genes
(14.44 KiB) Downloaded 43 times
datas-cleaned25.xlsx
downloaded contrast
(1.62 MiB) Downloaded 43 times
ttstats.txt
topTable output
(13.14 KiB) Downloaded 49 times
mle method bn.txt
MLE output
(3.24 KiB) Downloaded 46 times
rpazo001
 
Posts: 58
Joined: Fri May 29, 2020 12:34 pm

Re: Probabilistic Graphical Models, Fall 2021

Postby mlwilcox » Sun Dec 05, 2021 10:47 pm

I've made some modifications to my analysis since my last post and the last SMGL meeting. I went back and normalized the raw counts using the DESeq method, which has been reported to be less biased than total-sum scaling. After re-normalizing the counts, I recomputed the log2 fold changes for Crohn's disease vs. control and for ulcerative colitis vs. control to select the OTUs to include in my models. I generated two structures: (1) a naive Bayes classifier and (2) a structure learned by hill-climbing with random restarts. I compared the goodness of fit and the predictive performance of the two structures (the latter using 3-fold cross-validation). Based on these assessments, I selected the structure learned by hill-climbing with random restarts and estimated its parameters with Bayesian parameter estimation.
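For context, the DESeq method normalizes by median-of-ratios size factors: each sample's counts are compared with the per-OTU geometric mean across samples, and the median ratio becomes that sample's scaling factor. The sketch below is a plain-Python illustration of that idea, not the R code attached to this post. The toy counts are made up, and skipping every OTU that has a zero count is a simplification that sparse OTU tables may not tolerate.

```python
import math
from statistics import median

def size_factors(counts):
    """counts: {sample: [count per OTU]}, same OTU order in every sample.
    Returns one median-of-ratios size factor per sample."""
    samples = list(counts)
    n_otus = len(counts[samples[0]])
    # Log geometric mean per OTU; OTUs with any zero count are skipped (None).
    log_geo = []
    for j in range(n_otus):
        col = [counts[s][j] for s in samples]
        if all(c > 0 for c in col):
            log_geo.append(sum(math.log(c) for c in col) / len(col))
        else:
            log_geo.append(None)
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][j]) - g
                      for j, g in enumerate(log_geo) if g is not None]
        factors[s] = math.exp(median(log_ratios))
    return factors

def normalize(counts, factors):
    """Divide each sample's counts by that sample's size factor."""
    return {s: [c / factors[s] for c in row] for s, row in counts.items()}

# Hypothetical toy data: sample s2 was sequenced twice as deeply as s1.
raw = {"s1": [10, 20, 30, 0], "s2": [20, 40, 60, 5]}
factors = size_factors(raw)
normalized = normalize(raw, factors)
print(factors, normalized)
```

Unlike total-sum scaling, the median ratio is not pulled around by a few very abundant OTUs, which is one reason this approach is often preferred.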

A draft of my paper is attached. I'm still working on revising the introduction to reflect the change in my research question, and working on the Discussion section. Also attached is the data used in the analysis (normalized counts) and my R code.
Attachments
HMP2_16S_Data for bnlearn 12.05.21 v2.csv
Analysis datafile
(154.79 KiB) Downloaded 41 times
Wilcox_Project Paper PHC6067 05Dec2021 v2.docx
Draft of paper
(252.15 KiB) Downloaded 47 times
bnlearn 12.05.21 v2.R
R code
(4.98 KiB) Downloaded 45 times
mlwilcox
 
Posts: 11
Joined: Thu Apr 29, 2021 7:51 pm
