The halophyte Suaeda aralocaspica performs complete C4 photosynthesis within individual cells (SCC4), which distinguishes it from typical C4 plants, which require the collaboration of 2 types of photosynthetic cells. However, although SCC4 plants have features that are valuable for engineering higher photosynthetic efficiencies in agriculturally important C3 species such as rice, no sequenced SCC4 plant genome has been reported, limiting our understanding of the mechanisms involved in, and evolution of, SCC4 photosynthesis. The final genome assembly was Mb, consisting of 4, scaffolds, with a scaffold N50 length of 1. We annotated 29, protein-coding genes using Evidence Modeler based on the gene information from ab initio predictions, homology levels with known genes, and RNA sequencing-based transcriptome evidence. A complete, circular, gap-free chloroplast genome of S. We have presented the genome sequence of the SCC4 plant S.
|Published (Last):||20 September 2015|
Knowledge of the genome of S. Carbon loss through photorespiration and water loss through transpiration are common in C3 plants, especially in warm or dry environments, and they result in significant decreases in growth, water use efficiency, and harvestable yields [1]. These problems are overcome in C4 and crassulacean acid metabolism (CAM) plant families [2], which perform evolved CO2-concentrating mechanisms (C4 cycle) and the Calvin cycle (C3 cycle) using spatial (Kranz structure) and temporal (day-to-night switch) separations, respectively.
Both C4 and CAM plants can outperform C3 plants, especially under photorespiratory conditions, and increase their water use efficiency [2], which has created considerable interest in implementing the C4 cycle in C3 crops such as rice to improve yields and stress tolerance [3–6].
Among eudicots, C4 photosynthesis most frequently occurs in the Amaranthaceae of Caryophyllales [7–9]. Suaeda contains species that use all of the C4, C3, and SCC4 mechanisms for CO2 fixation and thus represents a unique genus for studying the evolution of C4 photosynthesis [14]. Mechanistically, the spatially separated chloroplasts in SCC4 contain different sets of nuclear-encoded proteins that are related to specific functions in the C4 and C3 cycles, and they biochemically and functionally resemble the chloroplasts of mesophyll and bundle sheath cells in Kranz C4 plant species [10, 11, 15–18].
These findings indicate that the key enzymes in photosynthesis are conserved and that both C3 and C4 enzymes work in the same cells in SCC4 plants during the daytime, which is different from both C4 and CAM plants.
At present, most of the knowledge of SCC4 photosynthesis has come from studies of Bienertia sinuspersici, which has 2 types of chloroplasts distributed in the central and peripheral parts of the cell [16, 18–29]. Studies on Suaeda aralocaspica (NCBI:txid) have focused on the germination of dimorphic seeds [30–34]. This is analogous to the Kranz anatomy but lacks the intervening cell wall [35].
This cellular feature indicates that S. Therefore, it is important to sequence the genome of S. In the present study, we sequenced the genome of S. Using an integrated assembly strategy that combined shotgun Illumina sequencing and single-molecule real-time sequencing technology from Pacific Biosciences (PacBio), we generated a reference genome assembly of S.
To our knowledge, this is the first sequenced SCC4 genome. These genomic resources provide a platform for advancing basic biological research and gene discovery in SCC4 plants, as well as for engineering C4 functional modules into C3 crops to increase yields and to adapt to high-salt conditions. Seeds were first collected from a healthy specimen of S. The seeds were placed in 0. After seed germination, leaves were collected as tissue sources for whole-genome sequencing.
In addition, 6 other healthy S. All the samples were collected with permission from and under the supervision of the local forestry bureau. Genomic DNA isolated from S.
Then, the S. To reduce the effects of sequencing errors on the assembly, a series of stringent filtering steps were used during read generation. We cleaned Illumina reads using the following steps: (1) Cut off adaptors. In total, , raw subreads were produced by PacBio. This yielded , corrected PacBio reads. After the quality control and filtering steps, Gb clean Illumina reads and 6. The term k-mer refers to a sequence with a length of k bp, and each unique k-mer within a genome dataset can be used to determine the discrete probability distributions of all possible k-mers and their frequencies of occurrence.
Genome size can be calculated using the total length of sequencing reads divided by sequencing depth. To estimate the sequencing depth of the S. The peak value of the frequency curve represents the overall sequencing depth. G denotes the genome size, and D represents the overall depth estimated from the k-mer distribution.
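The relationship described above (genome size G as the total k-mer count divided by the peak depth D) can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names and the `min_depth` cutoff for discarding low-frequency (likely erroneous) k-mers are assumptions introduced here, and real datasets would use a dedicated k-mer counter rather than an in-memory dictionary.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (length-k substring) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(kmer_counts, min_depth=2):
    """Estimate G = (total k-mers) / D, where D is the peak k-mer depth.

    min_depth is a hypothetical cutoff: k-mers seen fewer times are treated
    as sequencing errors and left out of both the peak search and the total.
    """
    # Histogram: depth -> number of distinct k-mers observed at that depth.
    histogram = Counter(kmer_counts.values())
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # D: the depth at which the frequency curve peaks.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences after discarding low-frequency k-mers.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth, peak_depth
```

Calling `estimate_genome_size(count_kmers(reads, k))` returns the estimated size together with the depth used, so the peak can be checked against the plotted k-mer frequency curve.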
Based on this method, the estimated genome size of S. This resulted in a genome size of The assembly spanned RNA-seq was performed for genome annotation. Different tissues (mature leaf, stem, root, and fruit) of 6 S. Tissues were ground in liquid nitrogen. RNA integrity was further verified by 1. To further annotate transcriptional start and termination sites, we also sequenced cap analysis of gene expression and deep sequencing (CAGE) and polyadenylation site sequencing (PAS) data.
Polyadenylated mRNAs were purified and concentrated with oligo(dT)-conjugated magnetic beads (Invitrogen). Finally, Finally, 4. Different methods and data were used to check the completeness of the assembly. In total, We found that For unigenes longer than 1 kb, The genome of S. Nonredundant protein sequences of 7 sequenced plants (Arabidopsis thaliana, Oryza sativa, Beta vulgaris, Chenopodium quinoa, Glycine max, Spinacia oleracea, and Vitis vinifera) provided homology evidence. The S.
We predicted 29, protein-coding genes (PCGs), with an average transcript length of 4, bp, coding sequence size of 1, bp, and a mean of 4. Of the annotated PCGs, In addition, 1, long noncoding RNAs were predicted following a previously published method [52]. To annotate the repeat sequences of the S. The Repbase database [58] was used to identify transposable elements (TEs). Finally, we combined the de novo and homology-based predictions of repeat elements according to their coordinates in the genome, and detected As observed in other sequenced genomes [61], long terminal repeats [62] in S.
The longest proteins encoded by each gene in all species were selected as input for OrthoFinder with default parameters. Of the 29, annotated S. In total, 3, orthogroups , genes were shared among all the genomes analyzed. A total of 70 orthogroups genes were specific to the assembled S.
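Selecting the longest protein per gene before orthogroup inference can be sketched as follows. This is an illustrative helper under stated assumptions, not the authors' script: the function name and the `(gene_id, transcript_id, sequence)` input layout are hypothetical, and the mapping from transcripts to genes would in practice come from the annotation file.

```python
def longest_protein_per_gene(proteins):
    """Keep the longest protein for each gene.

    proteins: iterable of (gene_id, transcript_id, sequence) tuples.
    Returns {gene_id: (transcript_id, sequence)}.
    """
    longest = {}
    for gene_id, transcript_id, sequence in proteins:
        current = longest.get(gene_id)
        # Strict '>' keeps the first isoform seen when lengths tie.
        if current is None or len(sequence) > len(current[1]):
            longest[gene_id] = (transcript_id, sequence)
    return longest
```

Keeping one representative per gene prevents alternative isoforms of the same gene from being counted as separate members of an orthogroup.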
The aLRT method was used to perform 1, bootstrap analyses to test the robustness of each branch. Then, a timetree was inferred using the RelTime method [66, 67] and ordinary least-squares estimates of branch lengths. This analysis involved 18 amino acid sequences. There were 4, positions in the final dataset. The resulting phylogenetic tree showed that all 5 Amaranthaceae species were placed in the same clade, among which A.
Moreover, S. The results of our phylogenetic analyses were consistent with a previous study on the evolution of C. Within the Amaranthaceae, the close phylogenetic distance between S.
Outside of the Amaranthaceae, S. These findings do not fully support the existing model that S. Phylogenetic tree of S. Bootstrap values were obtained from 1, bootstrap replicates and are reported as percentages. Using the short insert size bp data, a complete, circular, gap-free chloroplast genome of S.
Gene map of the S. Genes shown outside the outer circle are transcribed clockwise, and those inside are transcribed counterclockwise. Genes belonging to different functional groups are color coded. The dashed area in the inner circle indicates guanine-cytosine content of the chloroplast genome. Using the Illumina and PacBio platforms, we successfully assembled the genome of S.
The final genome assembly was Mb in size and consisted of 4, scaffolds, with a scaffold N50 length of 1. The phylogenetic tree placed SCC4 in a clade more closely related to the C3 than the C4 plants, not fully supporting the hypothesis that SCC4 is a C3–C4 intermediate that independently evolved from the C3 ancestors. The available genome assembly, together with transcriptomic data of S.
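For readers unfamiliar with the scaffold N50 statistic reported for the assembly, the standard definition can be written as a short Python function. The function name is illustrative and the example lengths in the usage note are made up; they are not values from this assembly.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
```

For a toy set of scaffold lengths 2 through 10, the total is 54, and the three longest scaffolds (10, 9, 8) reach half of it, so `n50(range(2, 11))` would be evaluated on a list such as `[2, 3, 4, 5, 6, 7, 8, 9, 10]` and give 8.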
We anticipate that future studies of S.
Supplemental Figure 1: k-mer distribution of sequencing reads.
Supplemental Figure 3: Integrity comparison of genome assemblies of S. For S.
Supplemental Figure 4: Annotated genes supported by different evidence.
Supplemental Figure 5: Gene ontology distribution of S.
Supplemental Table 1: Summary of sequencing data obtained for genome assembly.
Supplemental Table 2: The assembly statistics of the S.
Supplemental Table 4: Mapping efficiency of short insert library reads.