GEO Microarray Data Cleaning

Post by stefmoore » Tue Feb 20, 2018 3:41 pm

To clean the GEO Dataset GSE84010, I did the following:

1) Go to https://www.ncbi.nlm.nih.gov/geo/ and search for the dataset GSE84010.
2) Download the Series Matrix File(s) (TXT Format).
3) Click on the dataset's corresponding platform, GPL22111.
4) Click "View full table...", copy and paste the text into Notepad, and save it as a TXT file.
5) Open RStudio and use Efrain's code posted on GitLab to clean the dataset.
   a. The most recent version of the code, RCleanDscret.R, as of 20Feb18 is attached.
   b. Be sure to install the necessary libraries before running the code (pryr, MASS, dplyr, tidyr, readr, stringr).
   c. Copy and paste the code into RStudio and run it.
   d. When prompted, select the downloaded Series Matrix File(s).
   e. Select the platform file saved in step 4.
   f. Check for errors.
6) When the code finishes running, it should produce 3 files in your working directory, all of which are attached:
   a. GSE84010aftexcel - Clean dataset
   b. GSE84010zscore - Normalized dataset
   c. GSE84010dscrt - Discretized dataset
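
For anyone who wants to see roughly what the three outputs represent, here is a minimal Python sketch of the same idea. It is not RCleanDscret.R and does not reproduce its behavior. The row-wise z-score, the cutoff of 0.5, the function names, and the tiny example table (including the placeholder accession GSE00000) are all assumptions made for illustration. The only part taken from GEO itself is the layout of a Series Matrix file: metadata lines start with "!", and the data table sits between "!series_matrix_table_begin" and "!series_matrix_table_end".

```python
import csv
import statistics


def read_series_matrix(text):
    """Return (sample_ids, {probe_id: [values]}) from Series Matrix text."""
    table_lines = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("!series_matrix_table_begin"):
            in_table = True
        elif line.startswith("!series_matrix_table_end"):
            break
        elif in_table and line.strip():
            table_lines.append(line)
    rows = list(csv.reader(table_lines, delimiter="\t"))
    if not rows:
        raise ValueError("no data table found")
    samples = rows[0][1:]  # first column is ID_REF, the rest are GSM ids
    values = {}
    for row in rows[1:]:
        try:
            values[row[0]] = [float(v) for v in row[1:]]
        except ValueError:
            continue  # assumption: drop probes with missing/non-numeric cells
    return samples, values


def zscore_rows(values):
    """Assumed normalization: z-score each probe across its samples."""
    out = {}
    for probe, xs in values.items():
        mean = statistics.fmean(xs)
        sd = statistics.stdev(xs) if len(xs) > 1 else 0.0
        out[probe] = [(x - mean) / sd if sd > 0 else 0.0 for x in xs]
    return out


def discretize(zscores, cutoff=1.0):
    """Assumed discretization: -1 / 0 / 1 around a symmetric cutoff."""
    return {
        probe: [1 if z > cutoff else -1 if z < -cutoff else 0 for z in zs]
        for probe, zs in zscores.items()
    }


# Hypothetical miniature Series Matrix file, made up for this example.
example = "\n".join([
    '!Series_geo_accession\t"GSE00000"',
    "!series_matrix_table_begin",
    '"ID_REF"\t"GSM1"\t"GSM2"\t"GSM3"',
    '"probe_a"\t1.0\t2.0\t3.0',
    '"probe_b"\t5.0\t5.0\t5.0',
    '"probe_c"\t2.0\tnull\t4.0',
    "!series_matrix_table_end",
])

samples, values = read_series_matrix(example)  # "clean" table
z = zscore_rows(values)                        # "normalized" table
d = discretize(z, cutoff=0.5)                  # "discretized" table
```

In this toy input, probe_c is dropped because of its "null" cell, and probe_b has no variation, so its z-scores are set to 0 rather than dividing by zero. The real script may handle both cases differently.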
Attachments
GSE84010dscrt.txt
(1.1 MiB) Downloaded 453 times
GSE84010zscore.txt
(4.78 MiB) Downloaded 477 times
GSE84010aftexcel.txt
(3.98 MiB) Downloaded 466 times
RCleanDscret.R.txt
(13.06 KiB) Downloaded 497 times
stefmoore
 
Posts: 6
Joined: Mon Aug 08, 2016 2:11 pm

Re: GEO Microarray Data Cleaning

Post by stefmoore » Tue Mar 13, 2018 2:52 pm

**UPDATE**

The attached code, RCleanDscret.R, is an updated version of Efrain's code as of 13Mar18. Running it in RStudio on the previous example dataset, GSE84010, should now produce 4 files in your working directory: the three produced before, plus one additional file, GSE84010avg.txt, which is also attached. This file contains gene averages.
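
One common reading of "gene averages" is that probes mapping to the same gene are collapsed into a single row by averaging them sample by sample. The Python sketch below shows that reading only; whether RCleanDscret.R does exactly this is an assumption. The column names "ID" and "GENE_SYMBOL" are hypothetical (platform tables name their gene column differently), and the small platform table and values are made up.

```python
import csv
from collections import defaultdict
from statistics import fmean


def read_platform_map(text, id_col="ID", gene_col="GENE_SYMBOL"):
    """Map probe id -> gene symbol from a tab-delimited platform table.

    id_col and gene_col are hypothetical names; check the header of the
    platform file you actually saved.
    """
    lines = [
        ln for ln in text.splitlines()
        if ln.strip() and not ln.startswith(("#", "!", "^"))
    ]
    reader = csv.DictReader(lines, delimiter="\t")
    return {row[id_col]: row[gene_col] for row in reader if row.get(gene_col)}


def average_by_gene(values, probe_to_gene):
    """Average, per sample, all probes that map to the same gene."""
    grouped = defaultdict(list)
    for probe, xs in values.items():
        gene = probe_to_gene.get(probe)
        if gene:  # probes without a gene symbol are dropped
            grouped[gene].append(xs)
    return {
        gene: [fmean(col) for col in zip(*rows)]
        for gene, rows in grouped.items()
    }


# Made-up platform table and expression values for illustration.
platform_text = (
    "#ID = probe identifier\n"
    "ID\tGENE_SYMBOL\n"
    "p1\tTP53\n"
    "p2\tTP53\n"
    "p3\t\n"
    "p4\tEGFR\n"
)
probe_values = {
    "p1": [1.0, 3.0],
    "p2": [3.0, 5.0],
    "p3": [9.0, 9.0],
    "p4": [2.0, 4.0],
}

probe_to_gene = read_platform_map(platform_text)
gene_avg = average_by_gene(probe_values, probe_to_gene)
```

Here p1 and p2 are merged into one TP53 row, and p3 is dropped because it has no gene symbol in the platform table.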
Attachments
GSE84010avg.txt
(3.26 MiB) Downloaded 489 times
RCleanDscret.R.txt
(14 KiB) Downloaded 466 times

Re: GEO Microarray Data Cleaning

Post by DanielTira » Mon Apr 02, 2018 11:45 pm

Files for GSE83130
Attachments
GSE83130zscore.txt
(25 Bytes) Downloaded 437 times
GSE83130dscrt.txt
(1.96 MiB) Downloaded 434 times
GSE83130avg.txt
(26 Bytes) Downloaded 432 times
GSE83130aftexcel.txt
(112.95 KiB) Downloaded 444 times
DanielTira
 
Posts: 18
Joined: Thu Feb 15, 2018 5:09 pm

Re: GEO Microarray Data Cleaning

Post by DanielTira » Tue Apr 03, 2018 12:27 am

Files for GSE77259
Attachments
GSE77259zscore.txt
(6.1 MiB) Downloaded 436 times
GSE77259dscrt.txt
(1.65 MiB) Downloaded 443 times
GSE77259avg.txt
(2.85 MiB) Downloaded 431 times
GSE77259aftexcel.txt
(4.9 MiB) Downloaded 429 times

Re: GEO Microarray Data Cleaning

Post by DanielTira » Tue Apr 03, 2018 12:35 am

Files for GSE7344
Attachments
GSE7344zscore.txt
(7.86 MiB) Downloaded 445 times
GSE7344dscrt.txt
(2.02 MiB) Downloaded 465 times
GSE7344avg.txt
(6.26 MiB) Downloaded 447 times
GSE7344aftexcel.txt
(11.88 MiB) Downloaded 442 times

Re: GEO Microarray Data Cleaning

Post by DanielTira » Thu Apr 05, 2018 12:59 pm

Files for GSE70678
Attachments
GSE70678zscore.txt
(27.57 MiB) Downloaded 444 times
GSE70678dscrt.txt
(6.36 MiB) Downloaded 434 times
GSE70678avg.txt
(9.01 MiB) Downloaded 428 times
GSE70678aftexcel.txt
(19.85 MiB) Downloaded 440 times

Re: GEO Microarray Data Cleaning

Post by DanielTira » Thu Apr 05, 2018 1:28 pm

Files for GSE70460

Files removed (GPL cleaning error)
Last edited by DanielTira on Thu Apr 05, 2018 3:10 pm, edited 1 time in total.

Re: GEO Microarray Data Cleaning

Post by DanielTira » Thu Apr 05, 2018 2:08 pm

Files for GSE67850
Attachments
GSE6367016686zscore.txt
(12.48 MiB) Downloaded 430 times
GSE67850dscrt.txt
(3.04 MiB) Downloaded 445 times
GSE67850avg.txt
(8.63 MiB) Downloaded 425 times
GSE67850aftexcel.txt
(16.64 MiB) Downloaded 463 times

Re: GEO Microarray Data Cleaning

Post by DanielTira » Thu Apr 05, 2018 2:10 pm

Part 1 of GSE63670

Files removed (GPL cleaning error)
Last edited by DanielTira on Thu Apr 05, 2018 3:10 pm, edited 1 time in total.

Re: GEO Microarray Data Cleaning

Post by DanielTira » Thu Apr 05, 2018 2:11 pm

Part 2 of GSE63670

Files removed (GPL cleaning error)
Last edited by DanielTira on Thu Apr 05, 2018 3:10 pm, edited 1 time in total.
