RNA-seq

Other projects are currently under way. Once they gain significant traction, we will create new forums for those analyses and results.

Paired-end RNA-seq

Postby efrain.gonzalez0 » Thu Nov 16, 2017 11:37 am

Hello,

Yesterday I got several warnings when running htseq-count on the alignment output produced by HISAT2. The warning appeared for almost every line and stated the following:
Read (read#) claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
After some searching I found that this issue occurs when using htseq-count on paired-end data. The solution is to sort the HISAT2 alignment output by read name with samtools sort before running htseq-count. This can be done with the following command:
Code: Select all
./samtools sort -n -O SAM ~/Desktop/HISAT2_HOME/DrHooi/R1.4_Hisat2.sam -o ~/Desktop/HISAT2_HOME/DrHooi/R1.4_sortHIS.sam

Here "~/Desktop/HISAT2_HOME/DrHooi/R1.4_Hisat2.sam" is the alignment file produced by HISAT2 and "~/Desktop/HISAT2_HOME/DrHooi/R1.4_sortHIS.sam" is the new sorted file that will be created.
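Before re-running htseq-count, it can help to confirm that the sorted file really keeps each read's records together. The following is a minimal sketch and not part of samtools or HTSeq: check_name_grouped is a hypothetical helper that uses awk to verify that every read name in a SAM file forms one contiguous block, which is what sorting by name should produce. It holds every read name in memory, so it is only meant for quick checks.

```shell
# check_name_grouped FILE (hypothetical helper, not a samtools command)
# Prints "grouped" if every read name (column 1) occupies one contiguous
# block of lines in the SAM file, otherwise prints "not grouped".
check_name_grouped() {
    awk -F '\t' '
        /^@/ { next }                          # skip SAM header lines
        $1 != prev {
            if ($1 in seen) { bad = 1; exit }  # name reappears after a gap
            seen[$1] = 1
            prev = $1
        }
        END { print (bad ? "not grouped" : "grouped"); exit bad }
    ' "$1"
}

# Example call (path taken from the command above):
# check_name_grouped ~/Desktop/HISAT2_HOME/DrHooi/R1.4_sortHIS.sam
```

The function also returns a non-zero exit status when the file is not grouped, so it can be used in an if statement ahead of the counting step.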
Instructions for downloading and installing samtools are available at this link: http://www.htslib.org/
You may be able to install samtools with the command
Code: Select all
sudo apt-get install samtools
but this may not be the latest version, so it may lack features you need. For example, in my case the version of samtools available from the Ubuntu repositories only accepted BAM files as input. Although converting between SAM and BAM is fairly simple, it is easier and less time-consuming to avoid the conversion altogether.
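To see which samtools you actually ended up with, a quick version check can be scripted. This is a rough sketch under stated assumptions: version_at_least is a hypothetical helper, the minimum version 1.3 is only a placeholder and not a documented requirement, and the comparison relies on a sort that supports -V (for example GNU coreutils).

```shell
# version_at_least MIN ACTUAL (hypothetical helper)
# Succeeds when ACTUAL >= MIN. Assumes "sort -V" (version sort) is available.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Placeholder threshold; replace it with the version your workflow needs.
min_version="1.3"

if command -v samtools >/dev/null 2>&1; then
    # samtools 1.x prints "samtools <version>" on the first line of --version.
    installed=$(samtools --version 2>/dev/null | head -n 1 | awk '{print $2}')
    if [ -n "$installed" ] && version_at_least "$min_version" "$installed"; then
        echo "samtools $installed looks recent enough"
    else
        echo "samtools is older than $min_version or did not report a version"
    fi
else
    echo "samtools not found on PATH"
fi
```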

Furthermore, paired-end data requires an up-to-date version of htseq-count. Although you may be able to update htseq-count in the following way
Code: Select all
sudo apt-get install python-htseq
this will not necessarily install the latest version of HTSeq, so I recommend using the following command instead:
Code: Select all
pip install HTSeq --user
This command installs the latest version of HTSeq. I needed the updated version of htseq-count because it includes the "-r" option.
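One way to confirm that the installed htseq-count is new enough is to look for the option in its help text. A minimal sketch: help_mentions is a hypothetical helper that reads help text on standard input and succeeds if the given option is mentioned. It assumes the help output lists the long form --order next to -r.

```shell
# help_mentions OPTION (hypothetical helper)
# Reads help text on stdin and succeeds if OPTION appears in it.
help_mentions() {
    grep -q -F -e "$1"
}

if command -v htseq-count >/dev/null 2>&1; then
    if htseq-count --help 2>&1 | help_mentions "--order"; then
        echo "htseq-count supports -r/--order"
    else
        echo "htseq-count does not list --order; consider upgrading HTSeq"
    fi
else
    echo "htseq-count not found on PATH"
fi
```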
Afterwards you can run htseq-count with the following command:
Code: Select all
htseq-count -m intersection-nonempty -s no -i gene_name -r name ~/Desktop/HISAT2_HOME/DrHooi/R1.4_sortHIS.sam ~/Desktop/Homo_sapiens.GRCh38.89.chr_patch_hapl_scaff.gtf -o R1.4_COUNT.sam > R1.4_Count.txt

When you run this command you should now see few or no warnings.
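Once the run finishes, a quick summary of the count table is a useful sanity check. The sketch below assumes the table written to R1.4_Count.txt has two tab-separated columns (feature name, read count) and that the special counters at the end, such as __no_feature and __ambiguous, start with two underscores as in recent HTSeq releases. summarize_counts is a hypothetical helper, not part of HTSeq.

```shell
# summarize_counts FILE (hypothetical helper)
# Assumes two tab-separated columns: feature name, read count.
# Lines whose name starts with "__" are treated as special counters.
summarize_counts() {
    awk -F '\t' '
        /^__/ { unassigned += $2; next }   # e.g. __no_feature, __ambiguous
        {
            assigned += $2
            if ($2 > 0) expressed++        # features with at least one read
        }
        END {
            printf "assigned_reads\t%d\n", assigned
            printf "features_with_reads\t%d\n", expressed
            printf "unassigned_reads\t%d\n", unassigned
        }
    ' "$1"
}

# Example call (file name taken from the command above):
# summarize_counts R1.4_Count.txt
```

If nearly all reads end up in the unassigned total, the sorting, strandedness (-s), or attribute (-i) settings are worth a second look.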
efrain.gonzalez0
 
Posts: 138
Joined: Tue May 02, 2017 12:29 pm

Re: RNA-seq

Postby cpere117 » Wed Mar 13, 2019 12:38 am

Results from mapping "exon proteins" only, for the RNA-Seq analysis of Dr. Felty's three samples. Btw, this is single-stranded RNA sequencing data, not paired-end sequencing data.
Attachments
Norm Count Data.xlsx
(1.04 MiB) Downloaded 171 times
Galaxy29-[DESeq2_plots_on_data_11,_data_16,_and_data_22].pdf
(681.89 KiB) Downloaded 175 times
28_ DESeq2 result files on data 11, data 16, and data 22.tgz
(2.27 MiB) Downloaded 162 times
cpere117
 
Posts: 38
Joined: Thu Aug 24, 2017 7:15 pm

Re: RNA-seq

Postby cpere117 » Mon Mar 18, 2019 4:05 pm

Note that these results primarily show exon proteins. I plan to gather the results for intron proteins and mRNA proteins once I've completed the analysis.
cpere117
 
Posts: 38
Joined: Thu Aug 24, 2017 7:15 pm

Re: RNA-seq

Postby cpere117 » Thu Apr 18, 2019 10:09 pm

Here is an updated list of annotated IDs and normalized counts produced for the three RNA-Seq samples of ID3.
Attachments
37_ annotateMyIDs on collection 28_ Annotated IDs.tgz
(873.81 KiB) Downloaded 154 times
38_ annotateMyIDs on collection 28_ Rscript.tgz
(967 Bytes) Downloaded 160 times
cpere117
 
Posts: 38
Joined: Thu Aug 24, 2017 7:15 pm
