Method and Process for Predicting Pseudouridine Modification Sites Based on a Convolutional Neural Network

Document No.: 13513278; Source: China National Intellectual Property Administration

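The present invention relates to techniques for predicting pseudouridine modification sites in RNA sequences produced during gene transcription, and specifically to a method for predicting pseudouridine modification sites based on a convolutional neural network.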



Background:

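During gene transcription, many RNAs undergo modification. To date, more than one hundred types of RNA modification have been identified. These covalent RNA modifications have been studied by chemical methods for about twelve years, and some of them occur in many parts of an organism. They affect the secondary and tertiary structure of RNA and the speed and accuracy of gene expression. They help maintain RNA stability, support correct decoding of RNA on the ribosome, and help prevent certain diseases, so they are important for the normal exercise of biological function.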

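Of these hundred-plus modifications, pseudouridine was the first to be discovered and remains the most abundant RNA modification known. Pseudouridine modification is well known to occur in non-coding RNAs such as tRNA, rRNA and sRNA; later, Thomas M. Carlile et al. used high-throughput sequencing to show that pseudouridine sites also occur in mRNA in human and yeast cells. Pseudouridine is an isomer of uridine, formed under certain conditions through the rearrangement of a covalent bond. In eukaryotes, for example, pseudouridylation is mainly catalyzed by box H/ACA RNPs. Each hairpin of a box H/ACA RNA has two bulges. The RNA recognizes a specific RNA sequence and base-pairs with it at the lower bulge; then, under catalysis by a specific enzyme, it acts on the uracil to the right of the unpaired position at the top of the bulge. The uracil ring is rotated 180° about the axis through positions 3 and 6, and C5 then rotates clockwise to the bottom position, so that the original C-N bond to the ribose becomes a C-C bond, forming pseudouridine.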

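Pseudouridine can alter RNA structure, increase base stacking, improve base pairing, and rigidify the ribose-phosphate backbone. It is directly or indirectly linked to neurological disorders such as Parkinson's disease and to the X-linked form of dyskeratosis congenita, a bone marrow failure syndrome. Because of its distinctive structure and chemical properties and its biological and medical significance, research on pseudouridine sites is attracting growing attention. To identify pseudouridine sites, a high-throughput sequencing technique called Ψ-seq was proposed (Carlile, T. M. et al. Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells. Nature 515, 143 (2014)), which mapped Ψ sites comprehensively and at high resolution in several species. However, this technique relies on sequencing of genomic sequences, which is costly and time-consuming, and sequencing becomes increasingly difficult as sequence length grows. Computational methods that extract information about pseudouridine sites and then predict those sites are therefore urgently needed.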

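At present, Li, Y. et al. (Li, Y. H., Zhang, G. & Cui, Q. PPUS: a web server to predict PUS-specific pseudouridine sites. Bioinformatics 31, 3362 (2015)) and Chen, W. et al. (Wei, C., Hua, T., Jing, Y., Hao, L. & Chou, K. C. iRNA-PseU: Identifying RNA pseudouridine sites. Molecular Therapy Nucleic Acids 5, e332 (2016)) extract segments from gene sequences and encode them; Chen, W. et al. additionally incorporate the physicochemical properties of nucleotides into the encoding. Both then use the LIBSVM algorithm for feature extraction and classification to identify pseudouridine sites. The accuracy of LIBSVM-based feature extraction and classification leaves room for improvement, however, and predicting pseudouridine sites more accurately requires a more effective algorithm for extracting sequence features.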



Summary of the Invention:

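The object of the present invention is to address the shortcomings of the prior art by providing a method for predicting pseudouridine modification sites based on a convolutional neural network. The method improves the accuracy of pseudouridine site prediction and makes such prediction easier to apply in practice.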

The technical solution that achieves the object of the invention is:

A method for predicting pseudouridine modification sites based on a convolutional neural network, comprising the following steps:

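1) Dataset preparation and conversion: select the datasets for three species, yeast, human and house mouse, from the paper Wei, C., Hua, T., Jing, Y., Hao, L. & Chou, K. C. iRNA-PseU: Identifying RNA pseudouridine sites. Molecular Therapy Nucleic Acids 5, e332 (2016), each consisting of positive samples that contain a pseudouridine site and negative samples that do not; encode these datasets, converting each sample in the human and house mouse datasets into a 20×20 matrix and each sample in the yeast dataset into a 20×30 matrix;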

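2) Model construction and training of the convolutional neural network model: build the structure of a convolutional neural network (CNN); use the positive and negative samples converted into matrices in step 1) as the input of the CNN while keeping the positive and negative samples balanced; adjust the number of CNN layers and the number and size of the convolution kernels; then use the adjusted CNN structure to extract features from the dataset sequences and train a model containing the feature vectors;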

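3) Extraction and encoding of the sequence to be predicted: arrange the whole sequence to be examined in FASTA format, i.e. the first character of the first line is '>', followed by a description of the sequence, and the next line holds the sequence to be predicted; extract segments from the sequence with a sliding window of the same length as the dataset samples of step 1), so that each extracted segment has the same form as a dataset sample; and convert each extracted segment into the matrix form of step 1);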

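4) Feature extraction and prediction: feed the conversion result of step 3) in as the prediction set; after feature extraction by the convolutional neural network, predict the input segment with the model trained in step 2); then slide the window toward the end of the sequence to be predicted and repeat the extraction and conversion of step 3) together with step 4) until the end of the whole sequence is reached, finally obtaining the predicted pseudouridine sites.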

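The encoding in step 1) is as follows. An RNA sequence contains four ribonucleotides, A, U, G and C. Taking any two in order as a group gives 16 possible combinations, which are encoded with a 16-dimensional shift (one-hot) code, so that each pair is encoded as a 16-dimensional column vector. For a sample sequence, the two leftmost adjacent nucleotides are encoded first; the position is then moved one nucleotide to the right and the next two adjacent nucleotides are encoded, and this is repeated up to the last nucleotide. In this way every pair of adjacent nucleotides is converted into a 16-dimensional column vector. This encoding alone is not sufficient; to represent the features more accurately, the chemical properties of the nucleotides, listed in Table 1, are added. Dimension 17 represents the ring structure of the first nucleotide of the pair: purine is denoted '1' and pyrimidine '0'. Dimension 18 represents the functional group of the first nucleotide of the pair: amino is denoted '1' and keto '0'. Dimension 19 represents the strength of the hydrogen bonding of the first nucleotide of the pair in complementary base pairing: strong is denoted '1' and weak '0'. Dimension 20 is the proportion of nucleotides of the same type as the first nucleotide of the pair within the sample after its last nucleotide has been removed. A sample sequence of l+r+1 nucleotides is thus converted by the encoding into a matrix of size 20×(l+r).

Table 1. Chemical properties of ribonucleotides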
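The encoding described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the applicant's implementation. The text does not state which of the 16 one-hot positions belongs to which dinucleotide, so the ordering in `DINUCLEOTIDES` is an assumption; the property bits in `CHEM` follow the stated rule (purine/amino/strong = 1) applied to standard nucleotide chemistry; and the names `encode_sample`, `DINUCLEOTIDES` and `CHEM` are hypothetical.

```python
from itertools import product

NUCLEOTIDES = "AUGC"
# Assumed ordering of the 16 dinucleotides: AA, AU, AG, AC, UA, ...
DINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]

# (ring, functional group, hydrogen bond) using the rule in the text:
# purine=1 / pyrimidine=0, amino=1 / keto=0, strong=1 / weak=0.
CHEM = {
    "A": (1, 1, 0),  # purine, amino, weak (A-U pair)
    "C": (0, 1, 1),  # pyrimidine, amino, strong (G-C pair)
    "G": (1, 0, 1),  # purine, keto, strong
    "U": (0, 0, 0),  # pyrimidine, keto, weak
}


def encode_sample(seq):
    """Encode an RNA sample of n nucleotides as a 20 x (n - 1) matrix (list of rows)."""
    seq = seq.upper()
    if len(seq) < 2 or set(seq) - set(NUCLEOTIDES):
        raise ValueError("expected an RNA string over A, U, G, C of length >= 2")
    body = seq[:-1]  # the sample with its last nucleotide removed
    columns = []
    for i in range(len(seq) - 1):
        first, pair = seq[i], seq[i:i + 2]
        one_hot = [0.0] * 16
        one_hot[DINUCLEOTIDES.index(pair)] = 1.0  # dimensions 1-16
        ring, group, bond = CHEM[first]           # dimensions 17-19
        freq = body.count(first) / len(body)      # dimension 20
        columns.append(one_hot + [float(ring), float(group), float(bond), freq])
    # Transpose so that each dinucleotide is one column of the matrix.
    return [list(row) for row in zip(*columns)]
```

With this sketch, a 21-nucleotide sample (l = r = 10) gives a 20×20 matrix and a 31-nucleotide sample (l = r = 15) gives a 20×30 matrix, matching the sizes stated in step 1).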

Use of a convolutional neural network for sequence feature extraction in pseudouridine site prediction.

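The method uses the convolutional neural network algorithm from deep learning to extract features from sequences and make predictions.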

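The beneficial effects of the method are as follows. Pseudouridine plays an important role in the normal exercise of biological function, so pseudouridine sites need to be predicted accurately. A convolutional neural network can automatically and deeply mine the latent features of data; compared with the support vector machine (SVM) algorithm used in the prior art, it extracts sequence features better and thereby improves the accuracy of pseudouridine site prediction.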
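To make the notion of convolutional feature extraction concrete, the following pure-Python sketch applies a single convolution kernel to an encoded sample matrix. It is only an illustration of the basic operation: the text does not give the kernel sizes, the number of kernels or the layer layout of the actual network, so the 3×3 kernel used in the usage note, the ReLU activation and the function name `conv2d_valid` are all assumptions, and a real model would be trained with a deep learning framework.

```python
def conv2d_valid(matrix, kernel):
    """Slide `kernel` over `matrix` (stride 1, no padding) and apply ReLU.

    For an H x W input and a kh x kw kernel the feature map has
    (H - kh + 1) x (W - kw + 1) entries.
    """
    rows, cols = len(matrix), len(matrix[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    feature_map = []
    for i in range(rows - k_rows + 1):
        out_row = []
        for j in range(cols - k_cols + 1):
            total = 0.0
            for a in range(k_rows):
                for b in range(k_cols):
                    total += matrix[i + a][j + b] * kernel[a][b]
            out_row.append(max(total, 0.0))  # ReLU keeps only positive responses
        feature_map.append(out_row)
    return feature_map
```

Applied to a 20×20 human or mouse sample with a hypothetical 3×3 kernel, this yields an 18×18 feature map; a 20×30 yeast sample yields 18×28. During training the kernel weights are learned from the positive and negative samples rather than fixed by hand.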

The method improves the accuracy of pseudouridine site prediction and makes such prediction easier to apply in practice.

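Brief Description of the Drawings

Fig. 1 is a schematic flow chart of the method of the embodiment;

Fig. 2 is a schematic diagram of the formation of a pseudouridine site in the embodiment;

Fig. 3 is a schematic diagram of the shift encoding of a sequence in the embodiment;

Fig. 4 is a schematic diagram of the CNN structure in the embodiment, taking the species human as an example.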

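Detailed Description

The invention is further described below with reference to the drawings and an embodiment, which do not limit the invention.

Embodiment: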

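Pseudouridine is an isomer of uridine. It is formed during RNA transcription under enzymatic catalysis, as shown in Fig. 2: the uracil ring is rotated 180° about the axis through positions 3 and 6, and C5 then rotates clockwise to the bottom position, so that the original C-N bond to the ribose becomes a C-C bond, forming pseudouridine.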

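Referring to Fig. 1, a method for predicting pseudouridine modification sites based on a convolutional neural network comprises the following steps: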

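1) Dataset preparation and conversion: select the datasets for three species, yeast, human and house mouse, from the paper Wei, C., Hua, T., Jing, Y., Hao, L. & Chou, K. C. iRNA-PseU: Identifying RNA pseudouridine sites. Molecular Therapy Nucleic Acids 5, e332 (2016), each consisting of positive samples that contain a pseudouridine site and negative samples that do not; encode these datasets, converting each sample in the human and house mouse datasets into a 20×20 matrix and each sample in the yeast dataset into a 20×30 matrix;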

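The specific encoding is as follows. Shift encoding is performed first, as shown in Fig. 3. An RNA sequence contains four ribonucleotides, A, U, G and C. Taking any two in order as a group gives 16 possible combinations, which are encoded with a 16-dimensional shift (one-hot) code, so that each pair is encoded as a 16-dimensional column vector. For a sample sequence, the two leftmost adjacent nucleotides are encoded first; the position is then moved one nucleotide to the right and the next two adjacent nucleotides are encoded, and this is repeated up to the last nucleotide. In this way every pair of adjacent nucleotides is converted into a 16-dimensional column vector. This encoding alone is not sufficient; to represent the features more accurately, the chemical properties of the nucleotides, listed in Table 1, are added. Dimension 17 represents the ring structure of the first nucleotide of the pair: purine is denoted '1' and pyrimidine '0'. Dimension 18 represents the functional group of the first nucleotide of the pair: amino is denoted '1' and keto '0'. Dimension 19 represents the strength of the hydrogen bonding of the first nucleotide of the pair in complementary base pairing: strong is denoted '1' and weak '0'. Dimension 20 is the proportion of nucleotides of the same type as the first nucleotide of the pair within the sample after its last nucleotide has been removed. For example, the encoding result R(AGAUCU) of the sequence 'AGAUCU' is given by formula (1):

Table 1. Chemical properties of ribonucleotides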

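2) Model construction and training of the convolutional neural network model: build the structure of the convolutional neural network; use the positive and negative samples converted into matrices in step 1) as the input of the CNN while keeping the positive and negative samples balanced; adjust the number of CNN layers and the number and size of the convolution kernels (Fig. 4 shows the adjusted network structure for the species human); then use the adjusted CNN structure to extract features from the dataset sequences and train a model containing the feature vectors;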

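3) Extraction and encoding of the sequence to be predicted: a sliding window is used to extract and encode segments of the whole sequence to be predicted. Arrange the whole sequence to be examined in FASTA format, i.e. the first character of the first line is '>', followed by a description of the sequence, and the next line holds the sequence to be predicted. Extract segments with a sliding window of the same length as the dataset samples of step 1), so that each extracted segment has the same form as a dataset sample. The extraction therefore follows formula (2): taking the site U to be predicted as the reference, l nucleotides are taken upstream and r nucleotides downstream, giving a segment of l+r+1 nucleotides,

$$S(U) = N_{-l}\,N_{-(l-1)}\,N_{-(l-2)} \cdots N_{-2}\,N_{-1}\;U\;N_{+1}\,N_{+2} \cdots N_{+(r-2)}\,N_{+(r-1)}\,N_{+r} \qquad (2)$$

According to the length of the dataset samples, l = r = 10 is used if the sequence to be predicted comes from human or house mouse, and l = r = 15 if it comes from yeast. Each extracted segment is then converted into the matrix form of step 1);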

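4) Feature extraction and prediction: feed the conversion result of step 3) in as the prediction set; after feature extraction by the convolutional neural network, predict the input segment with the model trained in step 2); then slide the window toward the end of the sequence to be predicted and repeat the extraction and conversion of step 3) together with step 4) until the end of the whole sequence is reached, finally obtaining the predicted pseudouridine sites over the whole sequence to be predicted.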
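The FASTA input and the window extraction of step 3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's code: the text does not say how sites closer than l or r nucleotides to either end of the sequence are handled, so this sketch simply skips them, and it extracts a window only where the central nucleotide is U. The function names `read_fasta` and `candidate_windows` are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (description, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def candidate_windows(seq, l, r):
    """Yield (1-based position, window) for each U with full flanks.

    Each window has l nucleotides upstream and r downstream of the U,
    i.e. l + r + 1 nucleotides in total. Sites too close to either end
    are skipped (an assumption; the text does not specify padding).
    """
    seq = seq.upper()
    for i in range(l, len(seq) - r):
        if seq[i] == "U":
            yield i + 1, seq[i - l:i + r + 1]
```

For human or house mouse sequences one would call `candidate_windows(seq, 10, 10)`, and for yeast `candidate_windows(seq, 15, 15)`, so that each window has the same length as the corresponding training samples.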

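Fig. 1 shows the steps of pseudouridine site prediction based on a convolutional neural network. First, the datasets are prepared and encoded, converting the sequence datasets into matrix form. Second, the convolutional neural network model is built and then trained on the matrices obtained from the datasets. Next, a sliding window is used to extract segments from the sequence to be predicted, and the extracted segments are encoded. Finally, after feature extraction by the convolutional neural network, the input segments are predicted with the trained model.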
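The final prediction pass over the extracted windows can be sketched as a simple loop. In this illustration the trained CNN is represented by a hypothetical callable `score` that maps a window to a value between 0 and 1; the decision threshold of 0.5 and the names `call_sites` and `toy_score` are assumptions, and `toy_score` is only a stand-in so the example runs, not a real predictor.

```python
def call_sites(windows, score, threshold=0.5):
    """Return the positions whose window scores at or above `threshold`.

    `windows` is an iterable of (position, window) pairs and `score`
    stands in for the trained model applied to the encoded window.
    """
    return [pos for pos, window in windows if score(window) >= threshold]


def toy_score(window):
    # Placeholder scorer: fraction of G/C in the window. NOT a trained CNN.
    return sum(ch in "GC" for ch in window) / len(window)


if __name__ == "__main__":
    demo = [(4, "GAUCU"), (6, "UCUAG"), (10, "GGCCU")]
    print(call_sites(demo, toy_score))
```

In an actual pipeline, `score` would encode the window into its 20×(l+r) matrix and pass it through the trained network; keeping the model behind a callable separates the scanning logic from the choice of framework.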

Experimental example:

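Three independent test sets, S(4), S(5) and S(6), are used to make predictions for the three species; S(4), S(5) and S(6) come from human, yeast and house mouse respectively. S(4) and S(5) are taken from the paper (Wei, C., Hua, T., Jing, Y., Hao, L. & Chou, K. C. iRNA-PseU: Identifying RNA pseudouridine sites. Molecular Therapy Nucleic Acids 5, e332 (2016)), and S(6) was constructed separately as required by the method of this embodiment. S(4), S(5) and S(6) each contain one hundred positive samples that contain a pseudouridine site and one hundred negative samples that do not. The prediction results are shown in Table 2: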

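Table 2: Comparison of the prediction results of the method of this embodiment with those of the only two existing predictors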

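As Table 2 shows, the predictions obtained with the method of this embodiment indicate that the CNN clearly outperforms PPUS and iRNA-PseU, currently the only two existing predictors, both of which are based on the support vector machine (SVM) algorithm.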

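Sequence Listing

<110>Guilin University of Electronic Technology

<120>Method for predicting pseudouridine modification sites based on a convolutional neural network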

<141>2017-10-20

<160>3

<170>siposequencelisting1.0

<210>2

<211>6200

<212>rna

<213>saccharomycescerevisiae

<400>2

cuaucaucgcugaucucccacucccugaucugaagaggucaucgguucgauuccgguugc60

guguaagaugcaagaguucgaaucucuuagcaagcgaaagauuagaaaucuuuugggcuu120

ugccgguuaaggcgaaagauuagaaaucuuuuggguuuaggaccgagcuuuuaguggaug180

ucaucaggacacuucugauguuucaaaagauauuccagguacuggacgagaaucgcagaa240

caauuugacguagauguuuguuguucacccacaacugaagaguugucgaguuuuuugagg300

uuaagaaugaaaggucgaaaaaguuucaggcaguuucucagcguugggcccccgguucga360

uuccgggcuugcugguaaaauccaacguugccaucguugggccuaagcgcaagugguuua420

gugguaaaauccaagguuaaggcgaaagauuagaaaucuuuuggggcgaaagauuagaaa480

ucuuuugggcuuugccggcuucauuaacauguacuucaacuacggaaguggagaucaucg540

guucaaauccgauuggaauuugguuuucaaguguaauaggcuacgugaucagugguucaa600

gacgucgccuuuacacggcguagugguuaucacuuucgguuuugauccggacacuuucgg660

uuuugauccggacaaccccgguaauugaucuauguuguagcugcgcuggcggcaacucca720

guucuuuaucuucuuucuccgcuggcgucugacuucuaaucagaagauuauggguucuuc780

cgugauaguuuaauggucagaaugggcagaaugggcgcuugucgcgugccagaucgggug840

ccagaucgggguucaauuccccgucgcgagaaaaagccaaugaugagauacaagccauua900

ucgacauaugcugguuacauggcaguagaagaauauacauucuauuaucgaaccuggcca960

ugaaacaagauuucuguagcauacucgcuucauacuuguuuucuuuuuugugccuuuguu1020

acguugcuuuguggaaguucgaaacuccaaaguaugagugauggaaguguaguuauccgg1080

agaucagggucaaaucuucguugaccgucaauuacaugcagcacaaauuuguagacaggc1140

ugguuugaggauuacuuggacauuaacgguucuccuauucaagacaaaaguguucuuuca1200

ucugcaguguuggcguacagauuguaguuguggcugcuaccuuuuuuaauguccguuucu1260

augauugggcuauuguucgaagguaaugccuugaucagaagacuguugguccuuaguucg1320

auccugagugcgagcagcagauugcaaaucuguugguccuuaguuuauccgauauagugu1380

aacggcuaucacauccguggagaccgggguucgacuccccguaucggguauguuauuuau1440

guaacggguaugcgaacauucuuuuuuugauguaauaggauaagcuugcuguucuuuuca1500

guguaacaacugaaaugacuguaguaucuguucuuuucaguguaacaacuguguaguauc1560

uguucuuuucaguguaacaacaaguguaguaucuguucuuuucaguguaacacaagugua1620

guaucuguucuuuucaguguaacaucaaguguaguaucuguucuuuucaguguaucauug1680

uucuuggauuucaaugggugcugucuaaauuucgccacuguagaugaagaagacgaaaaa1740

ugagaagaguguagauguauuauccuuccaagauagacuauguaaugguaaagaacauau1800

ggcggcgggugccuuuggagcagcaaucgaugguguggucacuguaagagauuggcccca1860

ccauggacgagccuguaguauacaacgguaaacaaaggucuuccuaugauuccggcguuc1920

gucuuucucauacccuguagaccagaccucucuagaauacuuugaagguuuaaccgagga1980

aaugcguggagaccgggguucgacuccccguaucguuauccgauauaguguaacggcuau2040

cacaucggacacuucugauguuucaaaagauauuccauaacugugggaauacucagguau2100

cguaagauguaagaugcaagaguucgaaucucuuagcaaacaauuuucacaguuuaaggc2160

caagaacaaggcccguuuacacauuuugauacaaccguagacgggaggucccggguucga2220

gucccggcucgcgauucucgcuuagggugcgggaggucccggggcgugcgacuguuaauc2280

gcaagaucgugagucgcaagaucgugaguucaacccucacuggggcguugggcccccggu2340

ucgauuccgggcuugcugguaaaauccaacguugccaucguugggcccuagcgcaagugg2400

uuuagugguaaaauccaaggcuguguucuucuuucuaaauucccuaucgggaaaaacccg2460

uugcuagaagcgcaacuggugaaaaaaguucagaauugcagaaaaguggugagugguuuc2520

cuaguguaucagccacuaucggcauaagguuagggguucgagcccccuacagggcaaucg2580

guagcgcguaugacucuuaaucauaaacaaaagaagcuguuccagagagcccaagccgga2640

caaccccgguucgaauccggguaggacacuuucgguuuugauccggacaaccccgguggu2700

uaucacuuucgguuuugauccggacaacuagugguuaucacuuucgguuuugauccggau2760

guugccgcuaaguguaaggaagucgguauccugguauauucuauauacucacuuauuacu2820

uuucugguauauucuauauacucacuuauuacaauggcucuuuuuguuauucgaaagcuu2880

acauaaaaaguucggcuaucucuugggcucugccucugcccgcgcugguucaaauccugc2940

uggugcauggaugauauuuguaguauggcggaaaacguggagaucaucgguucaaauccg3000

auuggaaauacuauucaguuucucagauauagguugcagcaauuggaaaaaucuauuaac3060

ccagaugaaccagugcgucuacuauuacucggccaaauauucguaauuugagaucucugc3120

aaaacaaugcaaaacaaugcaccuccuggcaaaaacaucaauaaacaucaaugucaauug3180

uuugaacgucaaugaacgucaauucuuguucguuguccgcaagcaauuaauauggcuugu3240

aauggaaacaagcaaaaacaagcaagaucuucccauaccguuucccaccguuuccccugc3300

auguagaaugcaacgauaucaauguuuaaucauaacagaucaaagagcaucaaagagcag3360

ugguacuacagaugcgucaagguacgcauaagcgugaaccccggucgacgccggucgacg3420

auacauacagagcuguuacaauauagcaaaggacaguagaaaccugaguaauccugagua3480

auggaucuuugaaugauauuaacugauauuaacgaaaaugaagagcuccaaaaugcucca3540

aaauuuccauagaaaaaucagcgaauuccccaaggaaaaauagcgaaaccagaaagguua3600

augcgggaagauuacauugccuugaaaugccuugaaacaaccuccaagcuugggagauga3660

ggagaucucgucguuuaagaaccaagucaaaccaagucauucgguaacaaguuccaagac3720

guuccaagacauuacugucgaaccucaaucccguagguaaaguguauuuagugagggaac3780

gccgcgacaagugaucauccauuuauugugacauauugugacacuguaucauuccuuuca3840

aaccaaacauauuacugcaucaaucuggucaugucuggucaugucaugcuuucugacuuu3900

gauuuaagauacaaaaauuuguucagauggauucagauggauucagaacuaauuccuuug3960

uugguacuuggcuguacuccauuuaaaggagauaauucacgucaaauuuccacaugauaa4020

ggaaguuucgggaaguuucgaagaauuguaaagaccugauacuucuucaaaaaaguucag4080

uggucguucuuaccccccucuaauaccugcauuaaaugauaacuccuuuuauauugucuu4140

gcaauaaacacccgaaacgaugaugaaauugaugaggcugauccaggcugauccauucca4200

ugauuuuaauucuaugcuacucugaaaauuauaccuacggaaaaauuaucuuaugauaaa4260

aauguaaaaaauuauuuaaaacgagaaagugaaugaaaaauauaauaucauauaauauca4320

uuuauugucugauaaugcuguacguaccauccgcaucaguggauauccaaugauauccaa4380

ugauaguaauuucgcgaguuuacgcgaguuuauccguugcuguuauauuaucauauauua4440

ucacuuuuuaauauucuuuucaaaggauuccuuccgcaauucuucugaaauacugcucgc4500

caguuuuuuguucuuccacguaauccccuuauuaacggagauuugauuucucccagcacc4560

gauucgagugaguacguuuucaaauauguucaaauaugcuuaaucugaucucuucugcgg4620

ccgaucugugccauuauaguaagcagugccaagcagugccacuugucuaauauaagauga4680

ugauuuuaccguuuucuggggacaucaugauacaucaugauaucauuugguacauaauga4740

acagauaauuggauuucuugcauuuuuugcgauuuuuugcgauuauggcuuguugaccau4800

ucacaaccauucacaaaaguuggucuaacauaauuuuaaguccuuguaauauucuagcuu4860

uugagucucugggagugguaaaucuacugaccaucuucuuuuauccaaucaucuggcaag4920

uccuuaauuuucaucucuaaaauuuagauauggacguuuguggacguuugagauauuuuc4980

guauuucugccuauuucugccaauucuuccuuuaacugugacuaacugugaccguacuga5040

uucgcuuucccuugcuuucccuuugaauuuuuauuauacccucucauuacuugcuuaucu5100

gaauuuuuuuccauuuuguuugccuauccuuccaucugaugacuugugaugacuugaaau5160

guucugacagguaagauucucaacauucuuaauccaaacgauguccuucuccugcuugug5220

uauuaaaggacaugaaauauuucgcuacauguaauggaagaucaucuguagauucguauc5280

uguaguuucucaucagcaagaucuuucaaaaacgcuugauuugcuggcaccucuuaauag5340

cgcuuguuucugcuugcucuacccucuaggugaacguuuaaucugacauccgggaaguuu5400

gaugugaaguauucugcuaaccguucagguccuucaccuguuuguccaagggaguaaggg5460

gacuuucuggcuuuuuuuuuuacgaaacucuuccucaucaucuucagccucaacauuuuc5520

caaccgcaacuucuuguucuugcuuaugcccugcuuauuguggguugucccgccauuauu5580

cgccauuauuguuaauagauucaacaaaauauuuaucauugaaaauucacgugaucgcaa5640

uagaucgcaauauuccgucaggagugauaaauaucgucauugcacaaauuaguuuauuau5700

ucacuauuauucacgacucuuaacaacgacaauuuuagacaggucguccguagauauuua5760

cauaaauacuacacagacuacuauuagaauuugcgaaaauuugcgaaggauuuaccgaag5820

aaaagcacagaaaagcacagaccuuauugagcuuuugaaucaauaaccaggaguuucaaa5880

aacaaacaggcacuuuucauugaucuauuugauaaaucugccacuagaguccaaucuacg5940

cgacuuauugcaacuuauugcauuccuuggaaggugaaagucuugcacgaugguccaguu6000

auugaugaauuuuuauuuggccauucaacuucauaaguggucgguaagguaccaggaaag6060

uuucugaaaccaucccaaucuuuacucuacuuauuauccauugcaucccccugaauaucu6120

uauuuuagcauuagucaacauuagucaaagaaaugaagcgguucguuuugguucguuuua6180

uugauagaaaacaggacagu6200

<210>3

<211>4200

<212>rna

<213>homosapiens

<400>3

gcuaaacagguacugcugggcuuauugagugucuacuguguggauaaacuguuacgcaua60

uauuugucgguguuaacaaaauggucgggccuaguucaaaccuuuuuuuuaaguauacag120

gggucuggccggucuguagcggaucacuagcuaucgcuucucggccuuugaaaguaacuu180

ugcccgagcacuauucuguuaaaaucaggagcagcugccuuuccaacagcccaaaaugac240

uuucguucuucuuucagauacuuacauaguuuuccgaaucaacuuugccguguugacuca300

aaguuacucuccuuccuacccaccuuucccagaaguggacaauauauuaaauggauugag360

gacaauauauuaaauggaguguaguaucuguucuuaucaaaguguaguaucuguucuuau420

ucaaguguaguaucuguucuuagaucaaguguaguaucuguucuucucggccuuuuggcu480

aagaguaaucgcuucucggccuuugaguaaucgcuucucggccuugauguauuguuugca540

cucuucaugauucuauuauaguauucuuguuuuuguauuguugcuccuuucuuuuuuuug600

gccuuucucgcuaaacagguacugcugggcccauuaucgcuucucggccuucauuaucgc660

uucucggccuuuuguaauauuuuaucccuggacuaguaucuguucuuaucaguuguagua720

ucuguucuuaucaguguguaguaucuguucuuaucaaaguguaguaucuguucuuauuca780

aguguaguaucuguucuuagaucaaguguaguaucuguuguauugagugucuacugugug840

uuuucaucacuauggcuuagcgcaucaaaacuucacuuuuugauuggugguauaguggug900

agcgauaaaaggcuaauauccagaggucccugguucgaucccgggagacugaagaucuaa960

agguccgggagagcguuagacugaagaugggagagcguuagacugaagaaauccuuucua1020

aauugcaugcauaaaaaguuuuuucuucagagaguauggauuccgauaugaaagacauga1080

auaagaacugaugacuuucaauuaucugugugagccuuuucuuuguuuguaacuagccau1140

cagguaagccaagaucuucucggccuuuuggcuaagauccaucgcuucucggccuuuggg1200

cccagggugcuguggagaauuguccuccuucugaagcccccuccuuuucugaggaaggug1260

auuggaacgauacagagaagaagacuauacuuucagggaucagcgccccaauuauuauga1320

cuguaaguuauuuugcucucacuggcaauuugguuccaccacaucacucaauacuuaccu1380

ggcaggcacucaauacuuaccuggcagcuggcugcuguaggucuuuucauuguugauauu1440

ugcccagcagggccucaguuagcucucaagucccaugguguaaugguuagcguuagcacu1500

cuggacuuugaaggacuuugaauccagcgauccgcgauccgaguucaaaucucgcgaucc1560

gaguucaaaucucggucauuuuauguauauuuaucaccuuuccaguuacuccuuauauaa1620

guuauuuugcucucacugucaaguguaguaucuguucuugguaggugaguuuaaagucuu1680

cucuuaccuguuaaaaucagggcaacagaguucaacuaucuccauuugcuguuacucugg1740

agaucaaguguaguaucuguucuuguaaaaggguuacucucauacuuuuauuauuuggau1800

gaauaucuucucggccuuuuggcuaagaacuaucgcuucucggccuuuaaacuaucgcuu1860

cucggccuucccuggagguuccaauccugcuucuccaugauucgugcaucucuaauuaug1920

cuggacuguuuuauuggaacgauacagagaagaauauuucucauuucuuuuaguuauacu1980

aaaauuggaacgauaauuggaacgauacagagaagaacacgcaaauucgugaagcguaag2040

uguaguaucuguucuuauucaaguguaguaucuguucuuucaaguguaguaucuguucuu2100

gugauauaacucaguggcagaggccuuggauuucauccccagggagagggagugggaaca2160

ggauuugcaagacuccuaguaccuuguguagcaaugguguccaggaguaacaaguucagg2220

uucaccgcaaagucacucuauucugaucccaaagguuuacuuaauguuuagguuccuguu2280

gcuugccaucuaagagguuuguuguccuauuggaagucuuuuccuuuaaagucucuuagc2340

aucagacacuuaagagagagaaugagaaucaucguggaaugaauagacuuaacugucagg2400

aggcugucuuacguacacaauugcauguggaagcugcaauaacucauuccuacagcccca2460

caaacgguuuaagcuugagucacaauaaucaucauuucauuccuucaaauaaaaaaaaau2520

cauuucugaauucagauguaucuaucauaguuggguuuaagaaucagaacauuggguaua2580

uuccaccauggugucugggagcacacauuaccccucccuucccgcaccaacgaucugcuu2640

gugaacagagcuuuaguccagagcaagcccccgccuuuuuuucuguuguaaauuuuguua2700

ugcaauuaauuuagaggaauagggaaaguggacgugucuguuguuucucaaggguccgga2760

cuguuugacacugaugaaugcuuucucaaaaguuuaaacaguuucauuuggaaguagggu2820

cgccuuaagucaacaucacagaugcuccagcaggcaaccauauguuuagaaauaaaacca2880

gccgcggugccagcaaagaacagacacauuacuugaacuuguucugaguucuacugucuu2940

acccaaaugcucggaaacucucuuaugacugugacuucagaaaaagaaggauuccaaaga3000

caaacucaaauucuuagaugaccaaggcagacaguaggaagaguaauggaaauccuuuug3060

uuuuguuguucuguuguugucaagugcaaaaauauaauuuguugaauaugugugcuucug3120

uccuacuacauuucuuccauuuuuaauuaaaaaguagagcuaggacccacucuuguuccu3180

guacucacuguaggaccccaccuaaaaguauaauccugagaguucacgcugagccuuuuc3240

ucucucuuccugaaaacugaaguguucccaaagcuauguguaaagguuugguucucaucu3300

cucucucucucucucucuuguagguggguaguaggugagcagcugggaguuaaauacucu3360

guggaaccucucuaguuaaaaguaaccagucugugggaaguaaaagcaacauucccugcu3420

ggaggcuccaggauccuaagggacgucuguacucuaaggggacauuuaaauugcaucucc3480

cucauuaaaugaugacugaugcuacuauguuuaaacauuggauuuaacguuuauuucauu3540

guuuuuauuucacugugggucugggcuuuaagacccucauuuuagcugccuagccuucag3600

augaagggggggucucugcuaauuauacaucuggaguucagccuucagaacuugucagcc3660

acccuacccuacuuggaccaugucuugaaaagacaagugguugacuuuggguuucuuaug3720

uguuuguuuguuuguuuguuuguuuugcuccugacaccaccacccucuuuuaaguagauu3780

gugaccagaauaguaacuaaaauguugaauuuauuugcuuaacaaauguggcucuaaauu3840

uuaaggaucauuaugaaagaugaauagcuccccuuucucugcuugugaacacguaugcca3900

auggacucugcucccguguuacagugugaccuaacuuuggauacuuuuuccucuauaguu3960

aaccacauuaauuucaaaauugcagagaaauggaucacuuugcaucaguagggcugguaa4020

auugaaauacuggaccaucacauauuuccuggugcuucuuuguuuauucauuuggcuauu4080

ccauuguuccuguaccaucaaucuuucucaguuugugaacaugagcucuugagauucauu4140

caggaggucucagaacacuaaggcuuuauugucuccuaaucuuaacucuuggggcuggua4200

<210>3

<211>4200

<212>rna

<213>musmusculus

<400>3

gacucugcuguuccaaaggacaacccagaauuauuauuucuuauucuugguuuuuuuuuu60

cucauguacuuuguagugguuuaucugccuuuguuugaucugagcuauucuuauauuugu120

uuuuuagcuucugggguuugugauucuucugcacugcugucccgagaccucgcugcuuuu180

cucaagcaaaugccccaccucuggacaaguggcccugcacuaugauauauguucucaggg240

uuagauccccauugccaguggcuucauugguggcuguucacuguauugggggaaaacaaa300

uccuuauucagccuccccaggagguuccaauccugcgggacccgacuuauuccuuagcgg360

ucagcccuccgugugcuuuuacagacaauuucaaagucaguuggugguauuaaagaagac420

guccucacuguacagugccaaaacaaagauguucuuuugucucauuuggauuugcauucc480

agcuacuaagacuuguugguagcccaccucuuccuuaagccugcugcauggaugcuaugc540

accccagaaguuuuuguacaggcagauaagaagcaaguauuaggaccacugguggcagug600

gaagcaccaccugcuacucuacccaccaaaagguaccgcuuuccucaaguggucuacaag660

cuacacgugguuccuuuuugaauuuguaaggacguaacaucuguauauuuaaucgaaggc720

acacuuucagccagcgucuuugaaauauuaguuucaucuuaacagauuagugccuuugga780

ucccaaguuccuggugaacgcugcugcuuuucaugguccacccagugacuaacaucugcc840

gcgcugucuuuuccgaucucguacauggagguuccucugggggggcugcggcuacuucug900

cacaucggcucuguagacacuuucuugcccagacuauauaauggcuuguuaaugauuuuu960

uuuuuucuuuuggcucuaugagcucuggacuccaaacuucaucauggcgcucagcuacua1020

caaccagagaguaauggguuagaaaccaucaguaauggguuagaaaccaucacacucugc1080

uuacggucagacucuggacuuuuacauccacgaccucuugucaucccuggaagcccucuu1140

gucaucccuggaagcccaggagcccuacacuucuguagaccgggguucaauuccuagaga1200

ccgggguucaauuccuagccucuuuccguaccauucuagacuaacucuguagaacagucu1260

uuaucuuguauucauuguaccgaugcugagguacugccucgcuccaccuuuuuaccaaaa1320

ugucugugaugucuacaaaguucuacuccaagaguaugcggcagcagaauuauauuauau1380

ggacaccuacuacuacuacuacuaacuugaagcuguuuauaguagacugauggccgacua1440

acuaaaaagccacgauguguaucgagccacaugcucacuuuauuauccaugggacuccuc1500

uuuccaucggaaaaguugagcauauguucucagaguuugacauuucgaaaucugugguca1560

guauuuuaauaauuagcuuuaaaguuauaaaaagcaaacagaugucuuuguacccagagc1620

ugcuauuuuuagauacuuagucaacuuuuaaaauaccaccauaggcaguuacauucugca1680

guucuuucuuucuuuuuuuuucggaccagcuuacagagucagcugcuacauuuacauaga1740

gugcaguuucuuuuuucagauuuuuuuguaucacuuuguagaccaacuaggaaucuacag1800

auuaagugaagcucuuuauauaguugagucuguaacauuccugaugaucuucauaaugua1860

ucccuuacaggguccuuccuacaagaaagaacuuuuaauauuaguagcagaauuuuuacu1920

aucuauccauuacagccaguccuguggcuugcuagccuggaguucuaaucuucagaucuu1980

gauuuaacagcagaggaaaaaggcauauagaaaauuugugacaguguagcugugauucag2040

ggcccgguucaugacccggcagucuucguuugucagucaaaaagaagccuuuagugugug2100

ucaacccaccugcucucuguagacaguuugcuauuggggugaauuuagauauucaucuag2160

caagguggggcaguaauaucuuacccauuuuucauagaugucuuuucauaggcaauguaa2220

gcuuuuacccagcaccuguagaacaggucuguuuuggugagaaucuguuuggugagaauc2280

uguuuugguggaacgggaauacccugcaugccucuauugcucauucugggccagcucauc2340

auguagauguuggaucuucuccuaaaggcuuugacagacccacuggucauagucacuccc2400

uggauuguauugcucgcaaaaucacuuaugcuucccuauaauuuucuggcccuucuacac2460

uguuaugucauuuuguuucuugacugagucuaucugugugaccauagguuucauacugug2520

ucuagugcacuagugacauuucccaauucaguuugauuuuuuugagguauuauaaugucc2580

aaugaagcaaggcuaacaaagccuagucaucaugucaaguggacauaugggaaaguaaaa2640

uucaccaaauucaccauaauacuagccaccauauguguaaggaauuauaacagggaguuu2700

guaaucgguuuucagagcauccauugucacucuacacaaguagcagucaucgccuuagua2760

auacaugaucuuuagaguauuauacucuacucauugauuugacuacauauuucuuaauga2820

aaacuaugcucaacuuuuugcacaaugggaagacuuaaccuguacauagcuguuuaauuu2880

cucugacucacgauaucccuguuuccaaugugaagacaggcaguguaaauagagaugaug2940

gcagaaugccugaauuuauggauugacuggcugggagccuucgcaaugaguauuaauuaa3000

uacagagagagaauaggauauggaguuagagacguggaaacaaugccauggagacuacag3060

aguugacagccugaacuuguacagcacauuaaucuacuggaaaguauaacugggaagauu3120

uccaggaacauaaaauguauuugacucuugcuccaaauaauaaauuaagggggccuuaca3180

cucacuggacagucacccccucugaugagcuguagaguuggacuauucuggugagcuuag3240

ucccugggccaugccgcuugguuaguguguuuugugcucuuuaaaguugagugauauacc3300

ucauauauacaacacaccgaagagacacucaggguccuauaucuuuugcuguugagggac3360

caaugcagguucaaggugacacacacuagguuuagagucauguguucugugaucagugga3420

ccaucuguauggcugagaugaaauuuguccuucaucucaccauuguguagcccgauccuc3480

uccucuugaugccuacucauuuuucaguucuuacuucugccagagucuccugcuaaguuu3540

cccauggaacauguacaucugaaucuuugcaaccaagcagugacugaccuuuaauuuggc3600

aucuuugaguuggaaaucccaguacagaguaaccauuagcugauaaaugagugagacaua3660

aagcucugagcaggcauugcaaugauaaaaugaauaaacacaggacuaacuuuacauacu3720

uaauuacuucauaacugcaaaauaggaauuauagaucucucauaaugcuuugcccagugc3780

uacugggugcuacuguacaauaagcagccgauauagguaguucccacccaaguaauucau3840

ucuccaguuuaccuugcaaaaccagaugcagagauaggccccaauaaagaggaaaugguc3900

auaauuuuauuuaauauuucccauauacaccucauuaucugcuguaccucauuauauaug3960

aucuauuuuuuagucuuccuuagcaggcccaacucucagcuuuaugacuccuuacgccag4020

cucucugaggccgcagauaauguauuuguguugaauaaaucccucacacacagauagcag4080

cacauagcagcucaucgggcuuucauaucacaucaccagccucucauuagauccuugauu4140

accacugugccuucuucacacagauguuugacacucaauuuuuacccccuucugauuuac4200
