In collaboration with the Iranian Medicinal Plants Society

Article type: Research article

Authors

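1 PhD student, College of Agriculture and Natural Resources, University of Tehran, Karaj.

2 Assistant Professor, College of Agriculture and Natural Resources, University of Tehran, Karaj.

3 Associate Professor, College of Agriculture and Natural Resources, University of Tehran, Karaj.

4 Assistant Professor, Medicinal Plants and Drugs Research Institute, Shahid Beheshti University, Tehran.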

Abstract

Saffron is a valuable medicinal and spice plant belonging to the iris family (Iridaceae) and is regarded worldwide as a rich source of apocarotenoids. Because of the large size and complexity of the saffron genome, sequencing it remains a challenge. With the emergence of next-generation sequencing techniques, RNA sequencing has developed into a rich resource for biological studies. Assembling transcriptomes from a very large number of short reads provides a rich resource for studying species for which no reference genome is available. However, assembling the reads and reaching a satisfactory result, particularly for polyploid plants, is always a major challenge. In this study, the performance of different assembly tools was compared using factors such as N50 length, total number of unigenes, and alignment percentage. The results showed that Bridger is the better tool for assembling saffron transcriptome reads, judged by the number of transcripts, N50 length, total assembly size, and the percentage of reads aligning to the transcriptome. Velvet/Oases showed the highest percentage of chimerism, which causes different members of a gene family that are highly similar to one another to be assembled into a single transcript. The results of this research can give researchers guidance for choosing an assembly tool and for improving existing tools.


Article title [English]

Comparative performance of transcriptome assembly programs for saffron (Crocus sativus L.)

Authors [English]

  • Maryam Vahedi 1
  • Seyed Alireza Salami 2
  • Majid Shokrpour 3
  • Hassan Rezadoost 4

1 PhD student, College of Agriculture and Natural Resources, University of Tehran, Karaj, Iran.

2 Assistant Professor, College of Agriculture and Natural Resources, University of Tehran, Karaj, Iran.

3 Associate Professor, College of Agriculture and Natural Resources, University of Tehran, Karaj, Iran.

4 Assistant Professor, Medicinal Plants and Drugs Research Institute, Shahid Beheshti University, Evin, Tehran, Iran.

Abstract [English]

Saffron (Crocus sativus L.), a member of the Iridaceae family and a source of apocarotenoids, is one of the most valuable spices and medicinal plants in the world. Because of the large size and high complexity of the saffron genome, its sequencing remains a challenge. The arrival of next-generation sequencing (NGS) technologies has allowed the rapid and efficient development of RNA sequencing. De novo assembly of a transcriptome from short-read RNA-Seq data provides a valuable resource for the study of species without a reference genome. De novo transcriptome assembly has some unique challenges, particularly in plants, which possess a large number of paralogs, orthologs, homoeologs and isoforms. In this research, we compared the performance of the de novo assembly tools BinPacker, Bridger, Oases-Velvet and Trinity using quality metrics such as N50 length, the total number of contigs and alignment scores. The results showed that assembly with Bridger had superior performance for the saffron transcriptome. Oases suffered from relatively high chimera rates and redundancy, which caused highly similar members of a gene family to be assembled into a single transcript. Trinity performed worse than Bridger, with an increase in false positives. Our comparison will assist researchers in selecting a well-suited assembler and offers essential information for the improvement of existing assemblers.
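
As an illustration of the N50 length and contig-count metrics named in the abstract, the following is a minimal Python sketch and not the authors' pipeline. It assumes the contig lengths have already been extracted from an assembly; the function names `n50` and `assembly_summary` and the example lengths are hypothetical. N50 is taken here in its usual sense: the length of the contig at which the running total, summed from longest to shortest, first reaches half of the total assembly size.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def assembly_summary(lengths):
    """Collect basic size metrics for one assembly (hypothetical helper)."""
    return {
        "contigs": len(lengths),
        "total_bases": sum(lengths),
        "n50": n50(lengths),
    }


# Hypothetical contig lengths, for illustration only (not data from the study).
example_lengths = [2000, 1500, 1000, 500, 300, 200]
summary = assembly_summary(example_lengths)
print(summary)
```

A longer N50 alone does not indicate a better assembly, since chimeric transcripts also inflate it; this is one reason such size metrics are usually read alongside the read alignment percentage.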

Keywords [English]

  • Saffron
  • Assembly
  • BinPacker
  • Bridger
  • Trinity
  • Velvet/Oases