Probabilistic Graphical Models, Fall 2025

Class Projects from courses such as Probabilistic Graphical Network, Biostatistics II, etc.

Probabilistic Graphical Models, Fall 2025

Postby cwyoo » Wed Sep 03, 2025 11:25 am

Class project correspondence for Probabilistic Graphical Models, Fall 2025. Please post your proposed weekly outcome here every Friday.
cwyoo
Site Admin
 
Posts: 389
Joined: Sun Jun 22, 2014 2:38 pm

Re: Probabilistic Graphical Models, Fall 2025

Postby ahern489 » Wed Oct 15, 2025 12:56 pm

Hi everyone,

Attached is my updated class project proposal, which incorporates the feedback provided by Dr. Yoo. Please let me know if you have any additional comments or suggestions.
Attachments
Class Project Proposal-Addressing Feedback.docx
(21.2 KiB) Downloaded 9 times
ahern489
 
Posts: 4
Joined: Tue Aug 12, 2025 1:06 pm

Re: Probabilistic Graphical Models, Fall 2025

Postby BriValdes » Fri Oct 17, 2025 2:43 pm

Good afternoon,

I’ve attached my updated proposal. Let me know if there’s anything else I should change or update.
Attachments
Updated Project Porposal - FCI.docx
(23.32 KiB) Downloaded 8 times
BriValdes
 
Posts: 3
Joined: Wed Aug 27, 2025 7:20 pm

Re: Probabilistic Graphical Models, Fall 2025

Postby avinc016@fiu.edu » Fri Oct 17, 2025 3:43 pm

Hi,

I've attached my updated project proposal based on the feedback provided by Dr. Yoo. Please let me know if you have any additional feedback or suggestions.
Attachments
PHC6067_ Updated Class Project.docx
(132.44 KiB) Downloaded 9 times
avinc016@fiu.edu
 
Posts: 3
Joined: Thu Aug 28, 2025 11:45 am

Re: Probabilistic Graphical Models, Fall 2025

Postby BriValdes » Sat Oct 25, 2025 5:00 pm

Here’s an update on what I’ve done so far:
I checked all the ALARM datasets and confirmed that the variables are discrete. I also verified the dataset sizes. I installed the causal-learn package and used it to run the FCI algorithm on both D1KC9v and D50C9v. I first ran the algorithm with the default parameters to see how it performs, and then ran it again using the Chi-square independence test since the data are discrete. I generated and attached the graph outputs from both runs. I’m interpreting the graphs based on the FCI documentation, focusing on what each edge type represents (such as direct, confounded, or possibly latent relationships). I plan to continue testing different parameter settings in the FCI function, including the independence test method (by trying the G-squared test), alpha, depth, and max path length, to see which configuration produces the most accurate results for this dataset. By Tuesday, I will have the precision calculations completed for those two datasets.
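For anyone following along, the parameter sweep described above could be organized roughly as below. This is only a sketch, not the code actually used for the project: the helper names `build_fci_grid` and `run_fci_grid` and the candidate values are hypothetical, and `data` is assumed to be a NumPy array of integer-coded discrete variables with shape (samples, variables). The `fci` call relies on causal-learn's `independence_test_method`, `alpha`, `depth`, and `max_path_length` arguments.

```python
from itertools import product


def build_fci_grid(tests=("chisq", "gsq"), alphas=(0.01, 0.05),
                   depths=(-1, 2), max_path_lengths=(-1,)):
    """Return one dict of FCI keyword arguments per configuration to try.

    The candidate values are illustrative assumptions, not recommendations.
    """
    keys = ("independence_test_method", "alpha", "depth", "max_path_length")
    return [dict(zip(keys, combo))
            for combo in product(tests, alphas, depths, max_path_lengths)]


def run_fci_grid(data, grid):
    """Run FCI once per configuration and collect (params, graph, edges).

    Requires the causal-learn package; it is imported here so that the grid
    can be built and inspected without it.
    """
    from causallearn.search.ConstraintBased.FCI import fci

    results = []
    for params in grid:
        graph, edges = fci(data, verbose=False, **params)
        results.append((params, graph, edges))
    return results


grid = build_fci_grid()
print(len(grid), "configurations to run")
```

Calling `run_fci_grid(data, grid)` on each dataset would then give one result per configuration, which can be scored for precision against the known ALARM structure.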
Attachments
fci_D50C9v_graph_labeled.png (88.64 KiB) Viewed 138 times
fci_D1KC9v_graph_labeled.png (93.04 KiB) Viewed 138 times

Re: Probabilistic Graphical Models, Fall 2025

Postby ahern489 » Mon Oct 27, 2025 2:48 pm

Apologies for the delay in posting. I ran into some memory and storage issues while processing the full dataset, so I needed a bit more time to get everything aligned correctly.
I worked on verifying and preparing the GSE222494 single-nucleus RNA-seq dataset, which includes samples from PSEN1-E280A autosomal dominant AD carriers, sporadic AD cases, controls, and one APOE3-Christchurch protective case.
I downloaded the two main files from GEO: the counts matrix (GSE222494_BrainFRONTAL_counts.csv.gz) and the metadata file (GSE222494_BrainFRONTAL_metadata.csv.gz) and made sure they matched in dimensions and cell IDs. Once they were aligned, I checked variable types and distributions and confirmed that all four AD subtypes were properly labeled in the metadata.
Using R (Seurat), I created a Seurat object, filtered out low-quality nuclei (fewer than 200 or more than 6,000 detected genes or >5% mitochondrial content), and applied log-normalization. I generated violin and UMAP plots to check that the data looked clean and consistent across samples. These checks confirmed that the dataset is ready for downstream analysis.
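To make the filtering thresholds concrete, here is a small plain-Python sketch of the same rule. The actual filtering was done in R with Seurat; the record layout and the field names `n_genes` and `pct_mito` below are hypothetical, and the four nuclei are made-up examples.

```python
def passes_qc(nucleus, min_genes=200, max_genes=6000, max_pct_mito=5.0):
    """Keep a nucleus only if its detected-gene count lies in
    [min_genes, max_genes] and its mitochondrial percentage is at most
    max_pct_mito."""
    return (min_genes <= nucleus["n_genes"] <= max_genes
            and nucleus["pct_mito"] <= max_pct_mito)


# Made-up nuclei, one per failure mode plus one that passes.
nuclei = [
    {"id": "n1", "n_genes": 150, "pct_mito": 1.0},   # too few genes
    {"id": "n2", "n_genes": 2500, "pct_mito": 2.0},  # passes
    {"id": "n3", "n_genes": 7000, "pct_mito": 1.5},  # too many genes
    {"id": "n4", "n_genes": 3000, "pct_mito": 8.0},  # too much mitochondrial content
]

kept = [n["id"] for n in nuclei if passes_qc(n)]
print(kept)
```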
I also completed the subtyping step, verifying that each nucleus was correctly assigned to its Alzheimer’s disease category (PSEN1-E280A, sporadic AD, control, or APOE3-Christchurch). I started clustering and preliminary cell-type annotation using known marker genes like MAP2 (neurons), GFAP (astrocytes), CD68 (microglia), and MBP (oligodendrocytes).

This week I will:
• Finalize the cell-type annotations and verify cluster identities.
• Create subtype-specific normalized expression matrices.
• Start initial Bayesian Network reconstruction to explore regulatory relationships among NRF1, NFE2L1, NFE2L2, and GABPA.

Re: Probabilistic Graphical Models, Fall 2025

Postby BriValdes » Sun Nov 09, 2025 12:53 am

Here is an update on what I have done so far:
In my last update, I used only the Chi-squared conditional independence test. This time, I repeated the analysis on the D50C9 dataset using the G-squared conditional independence test (I also tested it on D1KC9, but there was no change). With the Chi-squared test, the model identified five edges, as can be seen in the attached image fci_D50C9v_graph. With the G-squared test, the number of edges decreased to four, with the SaO2–VentLung edge no longer detected, as can be seen in the image fci_D50C9vg_graph. All other relationships remained consistent, and both models identified Catechol as the central collider variable.
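As a rough illustration of why the two tests can disagree, the sketch below computes both statistics on the same contingency table. It is a simplified, unconditional version (a conditional test would add up these statistics over the strata of the conditioning set), the function name is hypothetical, and the 2×2 table is made up rather than taken from the ALARM data. It assumes every row and column total is positive.

```python
import math


def independence_statistics(table):
    """Return (Pearson X^2, G^2, degrees of freedom) for a 2-D table of counts.

    Assumes every row and column total is positive so that all expected
    counts are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    x2 = 0.0
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
            if observed > 0:  # a zero cell contributes nothing to G^2
                g2 += 2.0 * observed * math.log(observed / expected)

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, g2, dof


# Made-up counts for two binary variables.
x2, g2, dof = independence_statistics([[10, 20], [20, 10]])
print(f"X^2 = {x2:.3f}, G^2 = {g2:.3f}, dof = {dof}")
```

Both statistics are compared against the same chi-square distribution, so when they land on opposite sides of the critical value for the chosen alpha, an edge can appear under one test and vanish under the other. This is more likely when cell counts are small.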

For the Chi-squared conditional independence test, I created a summary table showing how causal-learn describes the properties of each edge, specifically whether the edge is marked as no latent confounder (nl), definitely direct (dd), possibly latent confounder (pl), or possibly direct (pd). This table was generated for both the D1KC9v and D50C9v datasets.

I also added an E_label column based on the causal relationship categories that Meredith gave me:

E1: Unconfounded causal relationship with the second variable causing the first.
E2: Unconfounded causal relationship with the first variable causing the second.
E3: No confounding and no causal relation.
E4: Confounded causal relationship with the second variable causing the first.
E5: Confounded causal relationship with the first variable causing the second.
E6: Confounded but not causally related.
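The six categories above can be written as a small lookup keyed on two questions: is the pair confounded, and which way (if any) does causation run? This is only a sketch of the category definitions as listed. The function and direction names are hypothetical, and it deliberately does not decide how FCI edge marks map onto these two answers.

```python
# Direction values are illustrative names for the three cases in the list.
SECOND_CAUSES_FIRST = "second_causes_first"
FIRST_CAUSES_SECOND = "first_causes_second"
NO_CAUSAL_RELATION = "none"

E_LABELS = {
    (False, SECOND_CAUSES_FIRST): "E1",
    (False, FIRST_CAUSES_SECOND): "E2",
    (False, NO_CAUSAL_RELATION): "E3",
    (True, SECOND_CAUSES_FIRST): "E4",
    (True, FIRST_CAUSES_SECOND): "E5",
    (True, NO_CAUSAL_RELATION): "E6",
}


def e_label(confounded, direction):
    """Return the E-label for a variable pair given the two answers."""
    return E_LABELS[(confounded, direction)]


def candidate_e_labels(possible_states):
    """Return the sorted set of E-labels compatible with several possible
    (confounded, direction) states, for edges that stay ambiguous."""
    return sorted({e_label(c, d) for c, d in possible_states})


print(e_label(False, FIRST_CAUSES_SECOND))
print(candidate_e_labels([(False, FIRST_CAUSES_SECOND),
                          (True, FIRST_CAUSES_SECOND)]))
```

An ambiguous edge can then carry a list of candidate labels instead of a single one, which matches the idea of assigning two possible E_labels when unsure.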

My next goal is to show Meredith how I assigned the E_labels and confirm that I classified them correctly, especially for edges containing circle endpoints (o→ or o–o), to which I assigned two possible E_labels because I wasn't sure.
Attachments
D1KC9v_table.png (31.42 KiB) Viewed 82 times
D50C9v_table.png (17.93 KiB) Viewed 82 times
fci_D50C9v_graph.png (88.64 KiB) Viewed 82 times
fci_D50C9vg_graph.png (83.21 KiB) Viewed 82 times

Re: Probabilistic Graphical Models, Fall 2025

Postby avinc016@fiu.edu » Fri Nov 14, 2025 9:22 pm

Hi, I’ve attached an update summarizing the progress I’ve made so far.
Attachments
Class Project Progress (11_14).pdf
(245.52 KiB) Downloaded 10 times

Re: Probabilistic Graphical Models, Fall 2025

Postby ahern489 » Sat Nov 15, 2025 8:50 am

Here is an update on what I’ve done so far:

I began by preprocessing the GSE222494 frontal cortex single-nucleus dataset. Because the BN algorithm required subject-level observations rather than individual cells, I generated pseudobulk profiles by summing gene counts across all cells belonging to each subject. This produced a 24 × 33,525 subject-level expression matrix. I verified that all selected transcription factors (NRF1, NFE2L1, NFE2L2, and GABPA) were present in the pseudobulk dataset. Next, I normalized the expression values using a log₂(x+1) transform and then discretized the transcription factor expressions into three equal-frequency bins (Low/Medium/High). This step was necessary because I used discrete Bayesian Network structure learning, which requires categorical variables. For structure learning, I used the Hill-Climb Search (HC) algorithm implemented in pgmpy along with the BDeu score (equivalent sample size = 5). I first ran the structure learning using the four transcription factors to evaluate how network complexity influences the inferred edges. I generated and saved the BN graph outputs.
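The preprocessing steps above (pseudobulk summation, log₂(x+1) transform, equal-frequency binning) can be sketched in plain Python as follows. This is a minimal illustration rather than the pipeline actually used: the function names are hypothetical, the counts are made up, and ties in the binning step are broken by input order, which is an assumption.

```python
import math
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-cell gene counts within each subject.

    cells: iterable of (subject_id, {gene: count}) pairs.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for subject, counts in cells:
        for gene, count in counts.items():
            totals[subject][gene] += count
    return {subject: dict(genes) for subject, genes in totals.items()}


def log2p1(x):
    """log2(x + 1) transform, defined for zero counts."""
    return math.log2(x + 1)


def equal_frequency_bins(values, labels=("Low", "Medium", "High")):
    """Assign rank-based bins so each label covers as equal a share of the
    values as possible. Ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    n, k = len(values), len(labels)
    binned = [None] * n
    for rank, idx in enumerate(order):
        binned[idx] = labels[rank * k // n]
    return binned


# Made-up cells: two from subject s1, one from subject s2.
cells = [("s1", {"NRF1": 3, "GABPA": 1}),
         ("s1", {"NRF1": 4}),
         ("s2", {"NRF1": 1, "GABPA": 2})]
bulk = pseudobulk(cells)
print(bulk)
print(log2p1(bulk["s1"]["NRF1"]))

# Made-up expression values for six subjects.
bins = equal_frequency_bins([5, 1, 3, 2, 6, 4])
print(bins)
```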

The BN identified a consistent regulatory pattern, NFE2L2 → NRF1 → NFE2L1, with an additional edge NFE2L2 → GABPA. I also fitted maximum likelihood CPDs for each TF. These CPDs showed strong directional probability patterns. For example, P(NRF1 = High | NFE2L2 = High) = 0.875, suggesting meaningful regulatory dependencies.
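For reference, a maximum likelihood CPD entry is just a relative frequency: the count of subjects with a given parent and child state, divided by the count with that parent state. The sketch below uses made-up rows, not the study data. They were chosen only to show how a value like 0.875 arises (7 of 8), and the function name is hypothetical.

```python
from collections import Counter


def mle_cpd(rows, child, parent):
    """Estimate P(child | parent) by relative frequency.

    rows: list of dicts mapping variable name to its discrete state.
    Returns {(parent_state, child_state): probability}.
    """
    joint = Counter((r[parent], r[child]) for r in rows)
    parent_totals = Counter(r[parent] for r in rows)
    return {(p, c): n / parent_totals[p] for (p, c), n in joint.items()}


# Made-up subjects: 8 with NFE2L2 = High, of which 7 have NRF1 = High.
rows = ([{"NFE2L2": "High", "NRF1": "High"}] * 7
        + [{"NFE2L2": "High", "NRF1": "Medium"}]
        + [{"NFE2L2": "Low", "NRF1": "Low"}] * 8)

cpd = mle_cpd(rows, child="NRF1", parent="NFE2L2")
print(cpd[("High", "High")])
```

With only a handful of subjects per parent state, each entry moves in large steps (here 1/8), so estimates like this are worth reading alongside the underlying counts.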
As part of the evaluation, I am now interpreting the network using biological literature to determine whether the edges reflect known mitochondrial and oxidative stress pathways (NFE2L2 as an upstream antioxidant regulator and NRF1/NFE2L1 in mitochondrial biogenesis).

BN_TFs_only.png (50.14 KiB) Viewed 59 times

Next steps:
I plan to test model stability by altering the discretization scheme (2 bins vs. 3 bins) and by adjusting the BDeu equivalent sample size to see whether the learned edges remain consistent. I also plan to extend the BN to include downstream mitochondrial genes (such as TFAM, COX4I1, and ATP5F1B) to build a more complete regulatory network. If needed, I will generate additional CPDs and visualizations to compare how parameter changes affect the final structure.
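One simple way to summarize the planned stability checks is to compare the directed edge set from each alternative run against the baseline. The sketch below is an assumption about how that comparison might look: the function name is hypothetical, the baseline edges are the ones reported above, and the alternative edge set is invented purely for illustration.

```python
def edge_stability(baseline, alternative):
    """Compare two sets of directed edges given as (parent, child) tuples."""
    base, alt = set(baseline), set(alternative)
    union = base | alt
    return {
        "kept": base & alt,
        "lost": base - alt,
        "gained": alt - base,
        # Jaccard similarity; two empty graphs count as identical.
        "jaccard": len(base & alt) / len(union) if union else 1.0,
    }


# Baseline: the three edges reported for the 3-bin, ESS = 5 run.
baseline = [("NFE2L2", "NRF1"), ("NRF1", "NFE2L1"), ("NFE2L2", "GABPA")]

# Invented result for some alternative setting, for illustration only.
alternative = [("NFE2L2", "NRF1"), ("NRF1", "NFE2L1"), ("GABPA", "NRF1")]

report = edge_stability(baseline, alternative)
print(report)
```

Because edges are compared as ordered pairs, a reversed edge shows up as one lost and one gained edge, which is usually what you want to see when checking direction stability.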
Attachments
bn_cpds_TFs_only.txt
(1.95 KiB) Downloaded 4 times

Re: Probabilistic Graphical Models, Fall 2025

Postby avinc016@fiu.edu » Fri Nov 21, 2025 11:44 pm

Hi, I’ve attached a document summarizing the updates I made over the past week.
Attachments
Class Project Progress (11_21).pdf
(450.51 KiB) Downloaded 5 times
