The summary of tophat-cufflinks protocol is like that:
step1: generate a tophat_out folder with bam files
tophat -G genes.gtf <index> sample1_1.fq sample1_2.fq
tophat -G genes.gtf <index> sample2_1.fq sample2_2.fq
step2: generate new .gtf files (assemble isoform)
cufflinks sample1/accepted_hits.bam
cufflinks sample2/accepted_hits.bam
step3: prepare a text file named assemblies.txt with following gtf files
cat << EOF > assemblies.txt
>sample1/transcript.gtf
>sample2/transcript.gtf
>EOF
step4: run cuffmerge to generate merged.gtf
cuffmerge -g genes.gtf -s genome.fa assemblies.txt
step5: compare gene expressions of two samples
cuffdiff merged.gtf sample1/accepted_hits.bam sample2/accepted_hits.bam
The protocol specifically used for our data
step0: access to the data
Open the web serve at , the passwd is
The result can be downloaded and viewed in ***
in the shell, type: 'cd ~/new2/RNAseq/trim'
step1: generate a tophat_out folder with bam files, using only JU1421-1 as example
"-N 8 \ --read-gap-length 8 \ --read-edit-dist 8 \" are generally called mismatch, this means the mismatch for the mapping is 8. Using this parameter, we can only find 69% JU1421 reads are mapped.
tophat2 -p 15 -i 20 -I 5000 -g 10 \
-N 8 \
--read-gap-length 8 \
--read-edit-dist 8 \
-o ./tophat_out/JU1421-1 \
-G ../genome/GENES.gff3 \
../genome/cb4_ws242 \
JU1421-1_S1_L001_R1_001_trimpair.fastq.gz,JU1421-1_S1_L001_R2_001_trimpair.fastq.gz\
All reads should be mapped using the same parameters. For AF16, the example is:
tophat2 -p 15 -i 20 -I 5000 -g 10 \
-N 8 \
--read-gap-length 8 \
--read-edit-dist 8 \
-o ./tophat_out/AF16-1 \
-G ../genome/GENES.gff3 \
../genome/cb4_ws242 \
AF16-1_S1_L001_R1_001_trimpair.fastq.gz,AF16-1_S1_L001_R2_001_trimpair.fastq.gz\
step2: generate new .gtf files (assemble isoform)
cufflinks -p 8 -o ./tophat_out/JU1421-1 ./tophat_out/JU1421-1/accepted_hits.bam
cufflinks -p 8 -o ./tophat_out/JU1421-2 ./tophat_out/JU1421-2/accepted_hits.bam
cufflinks -p 8 -o ./tophat_out/JU1421-3 ./tophat_out/JU1421-3/accepted_hits.bam
cufflinks -p 8 -o ./tophat_out/AF16-1 ./tophat_out/AF16-1/accepted_hits.bam
cufflinks -p 8 -o ./tophat_out/AF16-2 ./tophat_out/AF16-2/accepted_hits.bam
cufflinks -p 8 -o ./tophat_out/AF16-3 ./tophat_out/AF16-3/accepted_hits.bam
step3: prepare a text file named assemblies.txt with following gtf files
cat << EOF > assemblies.txt
>JU1421-1/transcript.gtf
>JU1421-2/transcript.gtf
>JU1421-3/transcript.gtf
>AF16-1/transcript.gtf
>AF16-2/transcript.gtf
>AF16-3/transcript.gtf
>EOF
step4: run cuffmerge to generate merged.gtf
cuffmerge -g ../genome/GENES.gff3 -s ../genome/cb4_ws242.fa assemblies.txt
step5: compare gene expressions of two samples
cuffdiff -p 8 merged.gtf –L JU1421,AF16\
./JU1421-1/accepted_hits.bam,\
./JU1421-2/accepted_hits.bam,\
./JU1421-3/accepted_hits.bam \
./AF16-1/accepted_hits.bam,\
./AF16-2/accepted_hits.bam,\
./AF16-3/accepted_hits.bam \
Comments
comments powered by Disqus