lsand039 wrote:cwyoo wrote:shstyoo wrote:Just a quick update: my computer is having trouble opening the actual GSE63060_family.soft file (found on the GEO website). It looks like the file is too large for OpenOffice to handle. Are you able to open it in Microsoft Excel?
Steve, could you create a script that reads in two text files (either comma-separated (.csv) or tab-separated (.txt)) and creates one text file in the same format? You may use the two files that Lauren posted here (Probe IDs and Gene Names.csv is already comma-separated; GSE63060.xlsx should be converted into comma-separated (.csv) or tab-separated (.txt) format).
So, your script should read in Probe IDs and Gene Names.csv and GSE63060.xlsx (converted into comma-separated or tab-separated format; let's call it GSE63060.csv) and produce a result text file that adds a column called GeneID to GSE63060.csv, where each GeneID value is the one that corresponds to that row's Probe ID in Probe IDs and Gene Names.csv.
Since Lauren has more files like GSE63060.xlsx, she is planning to use your script and produce the text files that are needed to do further analyses. Please let us know if you have any other questions/comments.
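To make the request concrete, here is a minimal sketch of the kind of merge described above, not the actual script. It assumes (and these are only assumptions) that both files have a single header row, that the probe ID is the first column in both files, and that the gene name is the second column of the mapping file. The function name add_gene_ids is made up for illustration.

```python
import csv


def add_gene_ids(mapping_path, dataset_path, output_path, delimiter=","):
    """Copy the dataset to output_path with an extra GeneID column.

    Assumptions (not confirmed by the real files): one header row in each
    file, probe ID in the first column of both files, gene name in the
    second column of the mapping file.
    Returns the number of data rows written.
    """
    # Build a probe ID -> gene name lookup from the mapping file.
    probe_to_gene = {}
    with open(mapping_path, newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) >= 2:
                probe_to_gene[row[0].strip()] = row[1].strip()

    written = 0
    with open(dataset_path, newline="") as src, \
         open(output_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return 0  # empty dataset file, nothing to write
        writer.writerow(header + ["GeneID"])
        for row in reader:
            if not row:
                continue  # skip blank lines
            # Probes missing from the mapping get an empty GeneID.
            writer.writerow(row + [probe_to_gene.get(row[0].strip(), "")])
            written += 1
    return written
```

It could be called as add_gene_ids("Probe IDs and Gene Names.csv", "GSE63060.csv", "GSE63060_with_genes.csv"), where the output file name is just a placeholder. For tab-separated input, pass delimiter="\t". The dataset is streamed row by row, so only the probe-to-gene lookup is held in memory.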
I wasn't able to open the actual GSE63060_family.soft file, but I did find another file, GSE63060_series_matrix.txt, that contained the probe ID and array information. This file was small enough to open in Excel, and I used it to create GSE63060.xlsx. The file was downloaded from http://www.ncbi.nlm.nih.gov/geo/query/a ... c=GSE63060.
Would you prefer I upload files like GSE63060.xlsx as *.csv files? Also, it would be great if you could let me know how to use Git. Thanks!
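If going through Excel gets tedious for the other datasets, a series matrix file can probably be turned into a .csv directly. The sketch below relies on the usual layout of GEO series matrix files: tab-separated text in which metadata lines start with "!" and the data table sits between the "!series_matrix_table_begin" and "!series_matrix_table_end" marker lines. The function name series_matrix_to_csv is made up, and I have not run this against the real GSE63060 file.

```python
import csv


def series_matrix_to_csv(matrix_path, csv_path):
    """Write the data table of a GEO series matrix file to a CSV file.

    Only the rows between the table begin/end markers are copied; the
    "!" metadata lines are dropped. Returns the number of rows written,
    including the header row.
    """
    in_table = False
    written = 0
    with open(matrix_path, newline="") as src, \
         open(csv_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            line = line.rstrip("\r\n")
            if line.startswith("!series_matrix_table_begin"):
                in_table = True
                continue
            if line.startswith("!series_matrix_table_end"):
                break
            if in_table and line:
                # Parse one tab-separated line, removing any quoting.
                fields = next(csv.reader([line], delimiter="\t"))
                writer.writerow(fields)
                written += 1
    return written
```

For example, series_matrix_to_csv("GSE63060_series_matrix.txt", "GSE63060.csv") would produce a file whose first column is the probe ID, which is the layout the merge script needs.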
shstyoo wrote:The finalized script is up and running. If you would like to download a local copy of the script, you can do so here:
https://github.com/shstyoo/alzheimer-prediction-model
I'm not sure where to push the script on the GitLab page. I could push it to Summer 2014 Chronic Disease Model or 2014 acheeti code. Let me know which one you want me to push it to.
In order to use the script, you will have to download it to a local folder.
Put the Probe & Gene ID CSV file in that folder (name it something easy to type on a command line), and put the dataset file in the same folder.
Double-click gene-probe_id.py to launch it.
Follow the prompts and type in the names of the files (names are case-sensitive, and you must include the file extension).
Let me know if there are any issues with the script. If I can find some of the probe IDs for GSE63060, or any similarly sized file, I'll try to test it out. Currently it seems to work for moderately sized files.
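Since mistyped file names are the most likely stumbling block at the prompt step, here is a rough sketch of how a prompt could re-ask until the name matches a file that exists. This is not necessarily how gene-probe_id.py handles it; prompt_for_file and its ask parameter are names I made up for illustration.

```python
import os


def prompt_for_file(label, ask=input):
    """Keep asking until the typed name is an existing file.

    `ask` defaults to the built-in input(); it is a parameter only so
    the function can be exercised without a keyboard.
    """
    while True:
        name = ask(f"{label} file name (with extension): ").strip()
        if os.path.isfile(name):
            return name
        print(f"Could not find {name!r}. Check the spelling, case and extension.")
```

A script could then call prompt_for_file("Probe & Gene ID") and prompt_for_file("Dataset") before doing any work, so a typo gets caught immediately instead of partway through the merge.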
lsand039 wrote:I got through fixing the correlation values and the genes of interest. Attached are the revised files.