GEO datasets

Posted:
Tue Jan 26, 2016 9:51 pm
by cwyoo
Please go to
http://www.ncbi.nlm.nih.gov/geo/ and download gene expression data for Alzheimer's disease in humans. Post the original data, the discretized data, and the analysis results from Banjo and ARACNE.
Re: GEO datasets

Posted:
Thu Jan 28, 2016 4:35 am
by shstyoo
I've gone ahead and downloaded a couple of the .soft files and summarized studies.
The attached output.txt file contains the descriptions for each of the GEO datasets I've found as well as the GDS# that will identify the file on GEO.
The file names that have the word "FULL" in them appear to not have a direct link to the study on GEO (but it can be found by searching the GDS# that precedes it).
The files with the word "FAMILY" have a related link (if given) that links to the study on GEO. Otherwise, they can be found by typing in their GSE# into GEO.
As I find more studies I'll update the descriptions.
edit: You may need to use Notepad++ or some other text editor (anything that isn't Windows' default Notepad) to view the file properly, since Notepad doesn't seem to recognize the newlines in the output.
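If the file really does use Unix-style LF line endings (an assumption; the post only says Notepad doesn't recognize the newlines), one workaround is to rewrite it with Windows CRLF endings. A minimal sketch, where `to_crlf` and `convert_file` are hypothetical helper names, not part of the original script:

```python
from pathlib import Path


def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending written as CRLF."""
    # Collapse any existing CRLF to LF first so it is not doubled to CR CR LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def convert_file(src: str, dst: str) -> None:
    """Copy src to dst, converting line endings to CRLF."""
    Path(dst).write_bytes(to_crlf(Path(src).read_bytes()))
```

For example, `convert_file("output.txt", "output_windows.txt")` would write a copy that older versions of Notepad display line by line. Working on bytes avoids having to guess the file's text encoding.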
Re: GEO datasets

Posted:
Thu Jan 28, 2016 7:51 am
by cwyoo
shstyoo wrote:I've gone ahead and downloaded a couple of the .soft files and summarized studies.
The attached output.txt file contains the descriptions for each of the GEO datasets I've found as well as the GDS# that will identify the file on GEO.
The file names that have the word "FULL" in them appear to not have a direct link to the study on GEO (but it can be found by searching the GDS# that precedes it).
The files with the word "FAMILY" have a related link (if given) that links to the study on GEO. Otherwise, they can be found by typing in their GSE# into GEO.
As I find more studies I'll update the descriptions.
edit: You may need to use Notepad++ or some other text editor (anything that isn't Windows' default Notepad) to view the file properly, since Notepad doesn't seem to recognize the newlines in the output.
Great work. Could you share the search terms and selection criteria that led to these results, so that others can try different search terms and/or new criteria to discover new data?
Re: GEO datasets

Posted:
Thu Jan 28, 2016 2:46 pm
by shstyoo
The script I wrote picked up any study that had any of the terms "Alzheimer", "AD", "Alzheimer's Disease", "neurofibrillary tangle", or "amyloid beta protein", and the species "homo sapien", in either its description or title.
It should ignore anything that isn't human or doesn't have those keywords. It also will not download datasets that are over 100 MB in size.
For some reason, it was only able to download from the first two pages of the GEO datasets, so I still need to work on the script. I'll post it later when I've finished it.
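Until the script itself is posted, here is a rough sketch of the selection rule as described above. This is not the actual script: the `Record` type, `should_download`, the choice to treat "AD" as a case-sensitive whole word, and reading 100 MB as 100 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Phrases matched case-insensitively. "Alzheimer's Disease" is already
# covered by the shorter "alzheimer".
PHRASES = ("alzheimer", "neurofibrillary tangle", "amyloid beta protein")
# Assumption: "AD" is matched as an uppercase whole word, so that words
# such as "had" or "adult" do not count as hits.
AD_WORD = re.compile(r"\bAD\b")
SPECIES = "homo sapien"          # also matches "Homo sapiens"
MAX_BYTES = 100 * 1024 * 1024    # assumption: 100 MB means 100 MiB


@dataclass
class Record:
    """Hypothetical summary of one GEO search result."""
    accession: str
    title: str
    description: str
    size_bytes: int


def should_download(rec: Record) -> bool:
    """Apply the keyword, species, and size rules to one record."""
    text = f"{rec.title} {rec.description}"
    lowered = text.lower()
    has_keyword = (
        any(phrase in lowered for phrase in PHRASES)
        or AD_WORD.search(text) is not None
    )
    is_human = SPECIES in lowered
    return has_keyword and is_human and rec.size_bytes <= MAX_BYTES
```

A record such as `Record("GDS0000", "Alzheimer hippocampus", "Homo sapiens samples", 10)` (a made-up accession) passes all three checks. Matching "AD" as a plain substring would pull in many unrelated studies, which is why the sketch uses a word boundary.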
Re: GEO datasets

Posted:
Wed Feb 03, 2016 2:55 pm
by lsand039
Attached are the results I found from the first four pages of searching "Alzheimer" on the GEO datasets. I followed the same format as Steve's output.txt file. I didn't include any results that didn't look relevant to human Alzheimer patients, but let me know if I should be more selective in choosing files.
Re: GEO datasets

Posted:
Wed Feb 03, 2016 3:40 pm
by cwyoo
lsand039 wrote:Attached are the results I found from the first four pages of searching "Alzheimer" on the GEO datasets. I followed the same format as Steve's output.txt file. I didn't include any results that didn't look relevant to human Alzheimer patients, but let me know if I should be more selective in choosing files.
Great. Let's create a table that identifies, for each study, how many females and/or males there were, their age ranges, and the country where subjects were recruited (e.g., U.S.A., Japan, etc.).
Re: GEO datasets

Posted:
Wed Feb 10, 2016 2:33 pm
by lsand039
Attached is a table with the information for my results. I was only able to find enough information on about 15 of the studies. Some of the subjects in the studies were taken from brain banks, so the country listed would be the location of the brain banks.
Re: GEO datasets

Posted:
Thu Feb 11, 2016 1:57 pm
by lsand039
Here's an updated table containing all the sources so far. I did my best to get information on subject demographics, but I couldn't find all the specific information from the published articles.
Re: GEO datasets

Posted:
Thu Feb 18, 2016 1:53 pm
by lsand039
Here's a table for GDS810 containing both the demographics of the subjects and their microarray data. More tables are coming soon!
Re: GEO datasets

Posted:
Fri Feb 19, 2016 12:44 pm
by cwyoo
lsand039 wrote:Here's a table for GDS810 containing both the demographics of the subjects and their microarray data. More tables are coming soon!
Great! Let's do research on the following questions (please feel free to add any other questions):
- For a given study, what are the different datasets that are uploaded to GEO?
- What is the difference between *.family.soft and *.soft files?
- How are the gene expression values calculated?