GEO datasets

Re: GEO datasets

Postby lsand039 » Mon Jan 30, 2017 1:58 pm

Here are the lists of most recent & most cited articles I'll be looking at for my background paper. The ones in bold are the ones I'll be looking at first.
Attachments
Literature Search.xlsx
(165.24 KiB) Downloaded 160 times
lsand039
 
Posts: 237
Joined: Thu Jan 14, 2016 12:17 pm

Re: GEO datasets

Postby lsand039 » Tue Jan 31, 2017 4:40 pm

I removed GSE84890 since the submitter decided to process the GSE84890 data differently and wanted the existing dataset deleted. There are a total of 2337 samples (1235 AD samples).

Of the 364 samples from individuals under 65, 60 had AD.
Attachments
Full Data.xlsx
v3
(252.56 KiB) Downloaded 166 times

Re: GEO datasets

Postby lsand039 » Mon Feb 06, 2017 12:58 pm

Attached are data organized to analyze and compare PSEN1 mutants.

The only PSEN1 mutants were from GSE39420, and there were 7 in total, 6 of them male. There were 2 M139T mutants, 2 V89L mutants, and 1 E120G mutant, and all of these samples were taken from the posterior cingulate.

In the data file, I organized the samples into four groups: all AD cases (1235; 353 males, 700 females), all normal cases (1102; 670 males, 432 females), all samples from the cingulate cortex (114), and all samples from the posterior cingulate (78).
The samples taken from the cingulate cortex were evenly split between males and females (57 each) and included 66 AD cases (32 males, 34 females). This group includes samples obtained from both the anterior and posterior cingulate.
The samples specifically from the posterior cingulate comprised 44 males and 34 females; 46 of these samples were AD cases (25 males, 21 females).
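
Breakdowns like the ones above can be tallied programmatically rather than by hand. The following is a minimal Python sketch, assuming each sample is a record with `region`, `diagnosis`, and `sex` fields. Those field names and the example records are made up for illustration and are not taken from the attached spreadsheet.

```python
from collections import Counter

def count_by(samples, *fields):
    """Count samples for each combination of the given field values."""
    return Counter(tuple(sample[f] for f in fields) for sample in samples)

# Hypothetical records; real rows would come from the spreadsheet.
samples = [
    {"region": "posterior cingulate", "diagnosis": "AD", "sex": "M"},
    {"region": "posterior cingulate", "diagnosis": "AD", "sex": "F"},
    {"region": "anterior cingulate", "diagnosis": "normal", "sex": "M"},
    {"region": "posterior cingulate", "diagnosis": "AD", "sex": "M"},
]

by_diagnosis_and_sex = count_by(samples, "diagnosis", "sex")
posterior_only = [s for s in samples if s["region"] == "posterior cingulate"]

for group, n in sorted(by_diagnosis_and_sex.items()):
    print(group, n)
print("posterior cingulate samples:", len(posterior_only))
```

Passing a different set of field names to `count_by` gives any other division of groups without re-sorting the sheet.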

Please let me know if there are any other groupings I should make or any further descriptive statistics I should post.
Attachments
PSEN1data.xlsx
(217.26 KiB) Downloaded 162 times

Re: GEO datasets

Postby lsand039 » Mon Feb 06, 2017 1:17 pm

I've compared the expression levels of the PSEN1 mutants with the other AD cases, as discussed last Tuesday. I took the 7 mutant PSEN1 samples and found the average and SD of their discretized gene expression levels. I then organized 30 groups of 7 AD cases, randomized using random.org's list generator, and found the average and SD for each group. Finally, I averaged the 30 averages and the 30 SDs and compared the results to the values for the PSEN1 mutant group.

PSEN1 mutant group average/SD values:
PSEN2: 1±0
PSEN1: 1.14±0.69
APP: 1±0
APOE: 1.14±0.69

Regular AD cases average/SD values:
PSEN2: 0.93±0.49
PSEN1: 0.99±0.57
APP: 0.85±0.51
APOE: 1.11±0.45

This was the only comparison where I could get 30 groups of 7. Since 6 of the 7 mutant AD cases were males from the posterior cingulate, I could instead draw groups of male AD cases sampled from the cingulate cortex or posterior cingulate, but I would only be able to get about 3 groups of 6 or 7 samples. Please advise on how to proceed.
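
The random-group comparison described in this post could be scripted along the following lines. This is only a sketch under stated assumptions: the groups are taken to be non-overlapping chunks of a shuffled list, the SD is the sample SD (n - 1 denominator), and Python's `random` module stands in for random.org. The function names and the expression values are hypothetical.

```python
import random
import statistics

def summarize(values):
    """Return (mean, sample SD) of a list of expression levels."""
    return statistics.mean(values), statistics.stdev(values)

def random_group_summary(pool, group_size=7, n_groups=30, seed=None):
    """Shuffle the pool, cut it into non-overlapping groups, and
    average the per-group means and the per-group SDs."""
    if group_size * n_groups > len(pool):
        raise ValueError("pool too small for that many non-overlapping groups")
    shuffled = list(pool)
    random.Random(seed).shuffle(shuffled)
    groups = [shuffled[i * group_size:(i + 1) * group_size]
              for i in range(n_groups)]
    means = [statistics.mean(g) for g in groups]
    sds = [statistics.stdev(g) for g in groups]
    return statistics.mean(means), statistics.mean(sds)

# Made-up discretized levels (0, 1, 2), for illustration only.
mutant_levels = [0, 1, 1, 1, 1, 2, 2]
rng = random.Random(0)
other_ad_levels = [rng.choice([0, 1, 2]) for _ in range(300)]

print("mutants:  %.2f +/- %.2f" % summarize(mutant_levels))
print("other AD: %.2f +/- %.2f" % random_group_summary(other_ad_levels, seed=1))
```

For a smaller, more closely matched pool, the same function can be called with a lower `n_groups` (for example `n_groups=3`); it raises a `ValueError` when the pool cannot supply the requested number of groups.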
Attachments
PSEN1 comparisons AD.xlsx
(81.25 KiB) Downloaded 155 times

Re: GEO datasets

Postby lsand039 » Tue Feb 07, 2017 3:05 pm

I'm not great at brain anatomy, but after reading through some Wikipedia articles and searching Google Images, here are the categories and subcategories we could separate the available brain regions into:

Neocortex:
--Frontal lobe
-----Frontal Cortex
-----Frontal Pole
-----Dorsolateral Prefrontal Cortex
-----Inferior Frontal Gyrus
-----Precentral Gyrus
-----Prefrontal Cortex
-----Superior Frontal Gyrus
--Frontal temporal cortex
--Temporal cortex
-----Entorhinal Cortex
-----Inferior Temporal Gyrus
-----Middle Temporal Gyrus
-----Superior Temporal Gyrus
-----Temporal Pole
--Parietal lobe
-----parietal cortex
-----Post-Central Gyrus
-----Superior Parietal Lobule
--Occipital lobe
-----Occipital Visual Cortex
-----Primary Visual Cortex
-----Visual Cortex

Limbic system:
--Corpus Striatum
-----Amygdala
-----Caudate Nucleus
-----Putamen
--Cingulate cortex
-----Anterior Cingulate
-----Posterior Cingulate
-----posterior cingulate at thalamus level
-----Posterior cingulate cortex
--Hippocampus
--Nucleus Accumbens
--Parahippocampal Gyrus

Reptilian brain:
--Cerebellum

Misc. - could not categorize
-cortical tissue
-unknown cortex
-unknown region cortex

Please let me know:
-if I should change the categories at all
-if/ how we should analyze the samples
I'll be posting the dataset with these categories soon.

Re: GEO datasets

Postby lsand039 » Tue Feb 07, 2017 4:12 pm

Attached is the data we currently have available, organized by brain region.
Below are the brain region categories I've set up and the sample counts for each region:

Neocortex: (1404)
--Frontal lobe (570)
-----Frontal Cortex (79)
-----Frontal Pole (39)
-----Dorsolateral Prefrontal Cortex (33)
-----Inferior Frontal Gyrus (30)
-----Precentral Gyrus (24)
-----Prefrontal Cortex (262)
-----Superior Frontal Gyrus (103)
--Frontal temporal cortex (3)
--Temporal lobe (422)
-----Temporal cortex (164)
-----Entorhinal Cortex (95)
-----Inferior Temporal Gyrus (33)
-----Middle Temporal Gyrus (62)
-----Superior Temporal Gyrus (36)
-----Temporal Pole (32)
--Parietal lobe (105)
-----parietal cortex (11)
-----Post-Central Gyrus (68)
-----Superior Parietal Lobule (26)
--Occipital lobe (288)
-----Occipital Visual Cortex (27)
-----Primary Visual Cortex (31)
-----Visual Cortex (230)

--Unspecified Neocortex (16)

Limbic system: (527)
--Corpus Striatum (90)
-----Amygdala (32)
-----Caudate Nucleus (29)
-----Putamen (29)
--Cingulate cortex (114)
-----Anterior Cingulate (36)
-----Posterior Cingulate (13)
-----posterior cingulate at thalamus level (21)
-----Posterior cingulate cortex (44)
--Hippocampus (255)
--Nucleus Accumbens (30)
--Parahippocampal Gyrus (38)

Reptilian brain:
--Cerebellum (230)

Misc. - could not categorize (184)
-cortical tissue (176)
-unknown cortex (2)
-unknown region cortex (6)
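
The subtotals in the list above can be cross-checked with a few lines of Python. This is a small sketch that simply re-enters the counts from this post. The dictionary layout and the `mismatches` helper are illustrative choices, not part of the attached file.

```python
# Each entry: category -> (stated total, counts of its direct subcategories).
region_counts = {
    "Frontal lobe": (570, [79, 39, 33, 30, 24, 262, 103]),
    "Temporal lobe": (422, [164, 95, 33, 62, 36, 32]),
    "Parietal lobe": (105, [11, 68, 26]),
    "Occipital lobe": (288, [27, 31, 230]),
    "Neocortex": (1404, [570, 3, 422, 105, 288, 16]),
    "Corpus Striatum": (90, [32, 29, 29]),
    "Cingulate cortex": (114, [36, 13, 21, 44]),
    "Limbic system": (527, [90, 114, 255, 30, 38]),
    "Misc.": (184, [176, 2, 6]),
}

def mismatches(counts):
    """Return categories whose stated total differs from the sum of parts."""
    return {
        name: (total, sum(parts))
        for name, (total, parts) in counts.items()
        if total != sum(parts)
    }

# Neocortex + Limbic system + Cerebellum + Misc.
grand_total = 1404 + 527 + 230 + 184

print("mismatched categories:", mismatches(region_counts))
print("sum of top-level groups:", grand_total)
```

An empty dictionary from `mismatches` means every stated subtotal equals the sum of its subcategories; the top-level sum can then be compared against the overall sample count.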
Attachments
Data by Brain Region.xlsx
(106.2 KiB) Downloaded 150 times

Re: GEO datasets

Postby lsand039 » Mon Feb 13, 2017 10:28 am

Attached are the samples that contain APOE genotype information. There are only 47 samples, and only 40 of them do not have a PSEN1 mutation specified. A majority of the samples have the 3/3 APOE genotype, and there are only 9 females included in this group. The brain regions included are the hippocampus and posterior cingulate.
Attachments
APOE allele samples.xlsx
(10.54 KiB) Downloaded 159 times

Re: GEO datasets

Postby lsand039 » Mon Feb 13, 2017 1:20 pm

Attached are the samples from datasets that provided race or ethnicity. In total, 1128 samples were available, but some of these listed the race/ethnicity as "unknown", leaving 1049 samples with race/ethnicity data.
GSE15222's published study states that the samples came from individuals with a self-defined ethnicity of European descent. In the original dataset, all the controls are listed as Caucasian, but the AD cases do not have any race/ethnicity listed. I labeled these as European for now. Please let me know if I should include these Europeans in the Caucasian/White group.

Breakdown by group:
Asian (14)
Black (100)
Caucasian/White (730)
European (177)
Hispanic (28)
Unknown (79)
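
As a quick check on the breakdown, and to show what folding the European label into the Caucasian/White group would do to the counts, here is a short Python sketch. The counts are copied from this post; `merge_groups` is a hypothetical helper written for illustration.

```python
counts = {
    "Asian": 14,
    "Black": 100,
    "Caucasian/White": 730,
    "European": 177,
    "Hispanic": 28,
    "Unknown": 79,
}

def merge_groups(counts, source, target):
    """Return a copy of counts with `source` folded into `target`."""
    merged = dict(counts)
    merged[target] = merged.get(target, 0) + merged.pop(source)
    return merged

total = sum(counts.values())
known = total - counts["Unknown"]
merged = merge_groups(counts, "European", "Caucasian/White")

print(total, known, merged["Caucasian/White"])  # prints: 1128 1049 907
```

The original dictionary is left untouched, so both versions of the grouping stay available for comparison.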
Attachments
Race and Ethnicity.xlsx
(98.54 KiB) Downloaded 156 times

Re: GEO datasets

Postby lsand039 » Tue Feb 14, 2017 1:05 pm

Attached are the preliminary results I plan to use for the CURFIU poster.
Attachments
Results CURFIU.docx
(121.46 KiB) Downloaded 160 times

Re: GEO datasets

Postby lsand039 » Wed Feb 15, 2017 1:58 pm

Attached is the list of 8286 unique genes that are common to 13 datasets. There are 3 other datasets where the gene name/symbol requires a lot more work to isolate, since it's wrapped up in other identifiers, as seen in the second sheet of the attached Excel file. Among those three, it looks like there are 1576 unique genes.

13 datasets matched for common genes: GSE44768, GSE44770, GSE44771, GSE5281, GSE28146, GSE48350, GSE16759, GSE84422, GSE26927, GSE29378
3 datasets that require extra work to match: GSE39420, GSE37263, GSE36980
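
Finding the genes shared by every dataset amounts to a set intersection. The sketch below assumes each dataset's gene symbols have already been read into a Python set. The dataset names and symbols shown are placeholders, and upper-casing the symbols before matching is an assumption about how they should be normalized.

```python
def normalize(symbols):
    """Strip whitespace and upper-case symbols so matches are not missed."""
    return {s.strip().upper() for s in symbols if s and s.strip()}

def common_genes(gene_sets):
    """Return the gene symbols present in every dataset's gene set."""
    sets = [normalize(genes) for genes in gene_sets.values()]
    if not sets:
        return set()
    return set.intersection(*sets)

# Placeholder data; the real sets would be read from each dataset's gene list.
gene_sets = {
    "dataset_A": {"PSEN1", "PSEN2", "APP", "APOE"},
    "dataset_B": {"psen1", "APP ", "APOE", "MAPT"},
    "dataset_C": {"APOE", "APP", "PSEN1", "TREM2"},
}

print(sorted(common_genes(gene_sets)))  # prints: ['APOE', 'APP', 'PSEN1']
```

Once symbols have been isolated for the remaining datasets, adding them is a matter of adding more entries to `gene_sets`.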

If there is a tool I can use that would be faster than going through all 1500+ genes and deleting the unnecessary parts by hand, please let me know.
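
One option is a short script instead of manual editing. The sketch below is in Python and rests on an assumption about the layout, since the exact format in the second sheet is not shown here: it supposes each cell holds records separated by `///`, with fields inside a record separated by `//` and the gene symbol in the second field. The accession placeholders and the function name are hypothetical, and the splitting rules would need adjusting if the real cells look different.

```python
def extract_symbols(annotation):
    """Pull gene symbols out of a composite annotation string.

    Assumed layout: records separated by '///', fields within a record
    separated by '//', gene symbol in the second field.
    """
    symbols = set()
    for record in annotation.split("///"):
        fields = [field.strip() for field in record.split("//")]
        # Skip records with no symbol field or an empty placeholder.
        if len(fields) >= 2 and fields[1] not in ("", "---"):
            symbols.add(fields[1])
    return symbols

# Illustrative cell contents with placeholder accessions.
example = (
    "ACC_1 // PSEN1 // presenilin 1 /// "
    "ACC_2 // PSEN1 // presenilin 1"
)

print(extract_symbols(example))  # prints: {'PSEN1'}
```

Applying `extract_symbols` to every cell and taking the union of the results gives the unique symbols for a dataset, ready to compare against the common gene list.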
Attachments
needsmatching.xlsx
3 datasets where the gene name/symbols are difficult to obtain
(275.95 KiB) Downloaded 156 times
commongenes.txt
common genes from 13 datasets
(57.93 KiB) Downloaded 158 times
