A Gene-Phenotype Relationship Extraction Pipeline from the Biomedical Literature Using a Representation Learning Approach

Authors:Wenhui Xing, Junsheng Qi, Xiaohui Yuan, Lin Li, Xiaoyu Zhang, Yuhua Fu, Shengwu Xiong, Lun Hu and Jing Peng

Publish:Bioinformatics, 2018

Motivation: The fundamental challenge of modern genetic analysis is to establish gene-phenotype correlations, which are often reported across a large body of publications. Because the lexical features of genes are relatively regular in text, the main challenge of this relation extraction is phenotype recognition. Because phenotypic descriptions are often study- or author-specific, few lexicons can effectively identify the full range of phenotypic expressions in text, especially for plants.

Methods: We propose a pipeline for extracting phenotypes, genes and their relations from the biomedical literature. Combined with abbreviation revision and sentence template extraction, we improve the unsupervised word-embedding-to-sentence-embedding cascaded approach as representation learning to recognize the broad variety of phenotypic information in the literature. In addition, a dictionary- and rule-based method is applied for gene recognition. Finally, we integrate OLLIE, a well-known information extraction system, to identify gene-phenotype relations.

Results: To demonstrate the applicability of the pipeline, we established two types of comparison experiments using the model organism Arabidopsis thaliana. In the comparison with state-of-the-art baselines, our approach obtained the best performance (F1-Measure of 66.83%). We also applied the pipeline to 481 full-text articles from the manually curated TAIR gene-phenotype relationship dataset to demonstrate its validity. The results showed that our proposed pipeline can cover 70.94% of the original dataset and add 373 new relations to expand it.

2018-05-21 05:31:31
Gene-Phenotype Relationship,Representation Learning Approach
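
The word-embedding-to-sentence-embedding idea mentioned in the Methods of the entry above can be illustrated with a minimal sketch. It assumes the simplest possible cascade (averaging word vectors and comparing phrases by cosine similarity); the toy vectors, the function names, and the averaging step are all hypothetical and are not taken from the paper, whose actual method also involves abbreviation revision and sentence templates.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
# Real embeddings would be learned from a large text corpus.
WORD_VECTORS = {
    "late": [0.9, 0.1, 0.0],
    "delayed": [0.8, 0.2, 0.0],
    "flowering": [0.1, 0.9, 0.1],
    "short": [0.2, 0.0, 0.8],
    "root": [0.0, 0.1, 0.9],
}


def sentence_embedding(tokens):
    """Average the vectors of known tokens; return None if no token is known."""
    vectors = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, candidates):
    """Return the candidate phrase whose embedding is closest to the query."""
    q = sentence_embedding(query)
    if q is None:
        return None
    scored = []
    for phrase in candidates:
        e = sentence_embedding(phrase)
        if e is not None:
            scored.append((cosine(q, e), phrase))
    if not scored:
        return None
    return max(scored, key=lambda item: item[0])[1]


known_phenotypes = [["delayed", "flowering"], ["short", "root"]]
print(most_similar(["late", "flowering"], known_phenotypes))
```

With these toy vectors, the unseen phrase "late flowering" is matched to the known description "delayed flowering" rather than "short root", which is the kind of lexicon-free matching that representation learning is meant to support.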

Development and validation of InDel markers for identification of QTL underlying flowering time in soybean

Authors:Jialin Wang, Lingping Kong, Kanchao Yu, Fengge Zhang, Xinyi Shi, Yanping Wang, Haiyang Nan, Xiaohui Zhao, Sijia Lu, Dong Cao, Xiaoming Li, Chao Fang, Feifei Wang, Tong Su, Shichen Li, Xiaohui Yuan, Baohui Liu, Fanjiang Kong

Publish: The Crop Journal (2017)

Soybean [Glycine max (L.) Merrill] is a major plant source of protein and oil. An accurate and well-saturated molecular linkage map is a prerequisite for forward genetic studies of gene function and for modern breeding for many useful agronomic traits. Next-generation sequence data available in public databases provides valuable information and offers new insights for rapid and efficient development of molecular markers. In this study, we attempted to show the feasibility and facility of using genomic resequencing data as raw material for identifying putative InDel markers. First, we identified 17,613 InDel sites among 56 soybean accessions and obtained 12,619 primer pairs. Second, we constructed a genetic map with a random subset of 2841 primer pairs and aligned 300 polymorphic markers with the 20 consensus linkage groups (LG). The total genetic distance was 2347.3 cM and the number of mapped markers per LG ranged from 10 to 23 with an average of 15 markers. The largest and smallest genetic distances between adjacent markers were 52.3 cM and 0.1 cM, respectively. Finally, we validated the genetic map constructed by newly developed InDel markers by QTL analysis of days to flowering (DTF) under different environments. One major QTL (qDTF4) and four minor QTL (qDTF20, qDTF13, qDTF12, and qDTF11) on 5 LGs were detected. These results demonstrate the utility of the InDel markers developed in this work for map-based cloning and molecular breeding in soybean. 
2018-01-22 12:00:52
Soybean,Resequencing data, InDel markers ,Genetic map ,QTL analysis
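
The map statistics quoted in the entry above can be checked with a little arithmetic. The sketch below uses only the figures given in the abstract (300 mapped markers, 20 linkage groups, 2347.3 cM); the function names are hypothetical, and the mean-interval figure is a derived estimate that assumes every linkage group carries at least one marker, so that a group with k markers contributes k - 1 intervals.

```python
def markers_per_group(n_markers, n_groups):
    """Average number of mapped markers per linkage group."""
    return n_markers / n_groups


def mean_interval_cm(total_cm, n_markers, n_groups):
    """Average distance between adjacent markers on the same linkage group.

    A group with k markers has k - 1 intervals, so the whole map has
    n_markers - n_groups intervals (assuming no empty groups).
    """
    n_intervals = n_markers - n_groups
    if n_intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return total_cm / n_intervals


print(markers_per_group(300, 20))                    # 15.0
print(round(mean_interval_cm(2347.3, 300, 20), 2))   # 8.38
```

The first value reproduces the reported average of 15 markers per linkage group; the second is not stated in the abstract and is shown only as a derived illustration.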

Quantitative Trait Locus Mapping of Soybean Maturity Gene E6

Authors:Xiaoming Li, Chao Fang, Meilan Xu, Fengge Zhang, Sijia Lu, Haiyang Nan, Tong Su, Shichen Li, Xiaohui Zhao, Lingping Kong, Xiaohui Yuan, Baohui Liu,* Jun Abe,* Elroy R. Cober,* and Fanjiang Kong*

Publish:Crop Science, vol. 57, September–October 2017

Soybean [Glycine max (L.) Merr.] sensitivity to photoperiod determines adaptation to a specific range of latitudes for soybean cultivars. When temperate-adapted soybean cultivars are grown at low latitudes under short-day conditions, they flower early, resulting in low grain yield and consequently limiting their utility in tropical areas. Most cultivars adapted to low-latitude environments have the trait of delayed flowering under short-day conditions, and this trait is commonly called long juvenile (LJ). In this study, the E6 locus, the classical locus conditioning the LJ trait, was molecularly mapped on Gm04 near single-nucleotide polymorphism marker HRM101. Testcross, genetic mapping, and sequencing suggest that the E6 and J loci might be tightly linked. Genetic interaction evaluation between E6 and E1 suggests that E6 has a suppressive effect on E1 and that the function of E6 is dependent on E1. The tagging markers for E6 are very useful for molecular breeding for wide adaptation and stable productivity of soybean under low-latitude environments. Molecular identification and functional characterization of the E6 gene will greatly facilitate the understanding of the genetic and molecular mechanisms underlying the LJ trait.
2018-01-22 11:40:51
Soybean,Quantitative Trait ,Soybean Maturity Gene E6

GmILPA1, Encoding an APC8-like Protein, Controls Leaf Petiole Angle in Soybean

Authors:Gao J, Yang S, Cheng W, Fu Y, Leng J, Yuan X, Jiang N, Ma J, Feng X

Publish:Plant Physiol, 2017, 174(2):1167-1176

Leaf petiole angle (LPA) is an important plant architectural trait that affects canopy coverage, photosynthetic efficiency, and ultimately productivity in many legume crops. However, the genetic basis underlying this trait remains unclear. Here, we report the identification, isolation, and functional characterization of Glycine max Increased Leaf Petiole Angle1 (GmILPA1), a gene encoding an APC8-like protein, which is a subunit of the anaphase-promoting complex/cyclosome in soybean (Glycine max). A gamma ray-induced deletion of a fragment involving the fourth exon of GmILPA1 and its flanking sequences led to extension of the third exon and formation of, to our knowledge, a novel 3'UTR from intronic and intergenic sequences. Such changes are responsible for enlarged LPAs that are associated with reduced motor cell proliferation in the Gmilpa1 mutant. GmILPA1 is mainly expressed in the basal cells of leaf primordia and appears to function by promoting cell growth and division of the pulvinus that is critical for its establishment. GmILPA1 directly interacts with GmAPC13a as part of the putative anaphase-promoting complex. GmILPA1 exhibits variable expression levels among varieties with different degrees of LPAs, and expression levels are correlated with the degrees of the LPAs. Together, these observations revealed a genetic mechanism modulating the plant petiole angle that could pave the way for modifying soybean plant architecture with optimized petiole angles for enhanced yield potential.
2017-07-06 16:09:08
Leaf petiole angle, protein, Leaf Petiole Angle

InDel marker detection by integration of multiple softwares using machine learning techniques

Authors:Jianqiu Yang, Xinyi Shi, Lun Hu, Daipeng Luo, Jing Peng, Shengwu Xiong, Fanjiang Kong, Baohui Liu, and Xiaohui Yuan

Publish:BMC Bioinformatics, 2016

Background: In biological experiments on soybean, molecular markers are widely used to verify the soybean genome or to construct its genetic map. Among the various molecular markers, insertions and deletions (InDels) are preferred for their wide distribution and high density at the whole-genome level. Hence, detecting InDels from next-generation sequencing data is of great importance for the design of InDel markers. To tackle this problem, this paper integrated machine learning techniques with existing software and developed two algorithms for InDel detection: the best F-score method (BF-M) and the Support Vector Machine (SVM) method (SVM-M), which is based on the classical SVM model.
Results: The experimental results show that the performance of BF-M was promising as indicated by the high precision and recall scores, whereas SVM-M yielded the best performance in terms of recall and F-score. Moreover, based on the InDel markers detected by SVM-M from soybeans that were collected from 56 different regions, highly polymorphic loci were selected to construct an InDel marker database for soybean. 
Conclusions: Compared to existing software tools, the two algorithms proposed in this work produced substantially higher precision and recall scores, and remained stable in various types of genomic regions. Moreover, based on SVM-M, we have constructed a database for soybean InDel markers and published it for academic research.
2016-11-22 13:05:01
Insertions and deletions, InDel detection, Evaluation
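
The precision, recall and F-score figures referred to in the entry above follow the standard definitions, which the sketch below applies to InDel call sets. It illustrates the evaluation metrics only; it does not reproduce BF-M or SVM-M, and the tool names, the (chromosome, position, type) representation, and the example calls are all hypothetical.

```python
def precision_recall_f1(called, truth):
    """Compare a set of called InDels against a reference (truth) set."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def best_by_f1(calls_by_tool, truth):
    """Name of the tool whose call set has the highest F-score."""
    return max(
        calls_by_tool,
        key=lambda tool: precision_recall_f1(calls_by_tool[tool], truth)[2],
    )


# Hypothetical reference set and calls from two hypothetical tools.
truth = {("Gm01", 100, "DEL"), ("Gm01", 250, "INS"),
         ("Gm02", 40, "DEL"), ("Gm02", 900, "INS")}
calls = {
    "tool_a": {("Gm01", 100, "DEL"), ("Gm01", 250, "INS"), ("Gm03", 5, "DEL")},
    "tool_b": {("Gm01", 100, "DEL"), ("Gm01", 250, "INS"), ("Gm02", 40, "DEL")},
}

for name, called in calls.items():
    p, r, f = precision_recall_f1(called, truth)
    print(name, round(p, 3), round(r, 3), round(f, 3))
print("best:", best_by_f1(calls, truth))
```

Exact matching on position is a simplification; real comparisons of InDel calls usually need some positional tolerance because the same event can be reported at slightly different coordinates.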

Transcriptome Sequencing Identified Genes and Gene Ontologies Associated with Early Freezing Tolerance in Maize

Authors:Zhao Li, Guanghui Hu, Xiangfeng Liu, Yao Zhou, Yu Li, Xu Zhang, Xiaohui Yuan, Qian Zhang, Deguang Yang, Tianyu Wang and Zhiwu Zhang

Publish:Frontiers in Plant Science, 2016

Originating in a tropical climate, maize has faced great challenges as cultivation has expanded to the majority of the world's temperate zones. In these zones, frost and cold temperatures are major factors that prevent maize from reaching its full yield potential. Among 30 elite maize inbred lines adapted to northern China, we identified two lines of extreme, but opposite, freezing tolerance levels: highly tolerant and highly sensitive. During the seedling stage of these two lines, we used RNA-seq to measure changes in the maize whole-genome transcriptome before and after freezing treatment. In total, 19,794 genes were expressed, of which 4550 exhibited differential expression due to either treatment (before or after freezing) or line type (tolerant or sensitive). Of the 4550 differentially expressed genes, 948 exhibited differential expression due to treatment within line or lines under freezing condition. Analysis of gene ontology found that these 948 genes were significantly enriched for binding functions (DNA binding, ATP binding, and metal ion binding), protein kinase activity, and peptidase activity. Based on their enrichment, literature support, and significant levels of differential expression, 30 of these 948 genes were selected for quantitative real-time PCR (qRT-PCR) validation. The validation confirmed our RNA-Seq-based findings, with squared correlation coefficients of 80% and 50% in the tolerant and sensitive lines, respectively. This study provided valuable resources for further studies to enhance understanding of the molecular mechanisms underlying maize early freezing response and enable targeted breeding strategies for developing varieties with superior frost resistance to achieve yield potential.
2016-10-07 03:43:56
whole genome transcriptome, maize
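
The squared correlation coefficients reported in the entry above measure how well qRT-PCR agrees with RNA-seq. A minimal sketch of that calculation follows, assuming a plain Pearson correlation between paired expression changes; the function name and the example values are hypothetical and are not data from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient of two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired samples of length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical log2 fold changes for four genes (not values from the paper).
rna_seq = [1.0, 2.0, 3.0, 4.0]
qrt_pcr = [1.0, 3.0, 2.0, 4.0]
print(round(r_squared(rna_seq, qrt_pcr), 2))  # 0.64
```

A value of 0.64 would be reported as 64%, in the same form as the 80% and 50% given in the abstract.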

A Global Analysis of the Polygalacturonase Gene Family in Soybean (Glycine max)

Authors:Feifei Wang, Xia Sun, Xinyi Shi, Hong Zhai, Changen Tian, Fanjiang Kong, Baohui Liu, Xiaohui Yuan

Publish:PLoS ONE, 2016, 1(1):e0163012.

Polygalacturonase is one of the pectin hydrolytic enzymes involved in various developmental and physiological processes such as seed germination, organ abscission, pod and anther dehiscence, and xylem cell formation. To date, no systematic analysis of polygalacturonase incorporating genome organization, gene structure, and expression profiling has been conducted in soybean (Glycine max var. Williams 82). In this study, we identified 112 GmPG genes from the soybean Wm82.a2v1 genome. These genes were classified into three groups, group I (105 genes), group II (5 genes), and group III (2 genes). Fifty-four pairs of duplicate paralogous genes were preferentially identified from duplicated regions of the soybean genome, which implied that long segmental duplications significantly contributed to the expansion of the GmPG gene family. Moreover, GmPG transcripts were analyzed in various tissues using RNA-seq data. The results showed the differential expression of 64 GmPGs in the tissue and partially redundant expression of some duplicate genes, while others showed functional diversity. These findings suggested that the GmPGs were retained by substantial subfunctionalization during the soybean evolutionary processes. Finally, evolutionary analysis based on single nucleotide polymorphisms (SNPs) in wild and cultivated soybeans revealed that 107 GmPGs had selected site(s), which indicated that these genes may have undergone strong selection during soybean domestication. Among them, one non-synonymous SNP of GmPG031 affected floral development during selection, which was consistent with the results of RNA-seq and evolutionary analyses. Thus, our results contribute to the functional characterization of GmPG genes in soybean.
Soybean, Polygalacturonase

Identification of additional QTLs for flowering time by removing the effect of the maturity gene E1 in soybean

Authors:LU Si-jia, LI Ying, WANG Jia-lin, NAN Hai-yang, CAO Dong, LI Xiao-ming, SHI Dan-ning, FANG Chao, SHI Xin-yi, YUAN Xiao-hui, Jun Abe, LIU Bao-hui, KONG Fan-jiang

Publish:Journal of Integrative Agriculture, 2016, 15(1):60345-7.

The adaptability of soybean to a wide range of latitudes is attributed to natural variation in the major genes and quantitative trait loci (QTLs) that control flowering time and maturity. Thus, the identification of genes controlling flowering time and maturity and the understanding of their molecular basis are critical for improving soybean productivity. However, due to the great effect of the major maturity gene E1 on flowering time, it is difficult to detect other small-effect QTLs. In this study, aiming to reduce the effect of the QTL associated with the E1 gene on the detection of other QTLs, we divided a population of 96 recombinant inbred lines (RILs) into two sub-populations: one with the E1 allele and another with the e1nl allele. Compared with the results of using all 96 recombinant inbred lines, additional QTLs for flowering time were identified in the sub-populations: two (qFT-B1 and qFT-H) in the RILs with the E1 allele and one (qFT-J-2) in the RILs with the e1nl allele. The three QTLs, qFT-B1, qFT-H and qFT-J-2, were true QTLs and played an important role in the regulation of growth period. Our data provide valuable information for the genetic mapping and gene cloning of traits controlling flowering time and maturity and will contribute to a better understanding of the mechanism of photoperiod-regulated flowering and to molecular breeding in soybean.
multiple-QTL model (MQM), mixed linear-based composite interval mapping (MCIM), photoperiod, maturity

Sequence composition of BAC clones and SSR markers mapped to Upland cotton chromosomes 11 and 21 targeting resistance to soil-borne pathogens

Authors:Congli Wang, Mauricio Ulloa, Xinyi Shi, Xiaohui Yuan, Christopher Saski, John Z. Yu and Philip A. Roberts

Publish:Frontiers in Plant Science, 2015, 6:791.

Genetic and physical framework mapping in cotton (Gossypium spp.) were used to discover putative gene sequences involved in resistance to common soil-borne pathogens. Chromosome (Chr) 11 and its homoeologous Chr 21 of Upland cotton (G. hirsutum) are foci for discovery of resistance (R) or pathogen-induced R (PR) genes underlying QTLs involved in response to root-knot nematode (Meloidogyne incognita), reniform nematode (Rotylenchulus reniformis), Fusarium wilt (Fusarium oxysporum f.sp. vasinfectum), Verticillium wilt (Verticillium dahliae), and black root rot (Thielaviopsis basicola). Simple sequence repeat (SSR) markers and bacterial artificial chromosome (BAC) clones from a BAC library developed from the Upland cotton Acala Maxxa were mapped on Chr 11 and Chr 21. DNA sequence through Gene Ontology (GO) of 99 of 256 Chr 11 and 109 of 239 Chr 21 previously mapped SSRs revealed response elements to internal and external stimulus, stress, signaling process, and cell death. The reconciliation between genetic and physical mapping of gene annotations from new DNA sequences of 20 BAC clones revealed 467 (Chr 11) and 285 (Chr 21) G. hirsutum putative coding sequences, plus 146 (Chr 11) and 98 (Chr 21) predicted genes. GO functional profiling of Unigenes uncovered genes involved in different metabolic functions and stress response elements (SRE). Our results revealed that Chrs 11 and 21 harbor resistance gene rich genomic regions. Sequence comparisons with the ancestral diploid D5 (G. raimondii), A2 (G. arboreum) and domesticated tetraploid TM-1 AD1 (G. hirsutum) genomes revealed abundance of transposable elements and confirmed the richness of resistance gene motifs in these chromosomes. The sequence information of SSR markers and BAC clones and the genetic mapping of BAC clones provide enhanced genetic and physical frameworks of resistance gene-rich regions of the cotton genome, thereby aiding discovery of R and PR genes and breeding for resistance to cotton diseases.
Gossypium hirsutum, genetic and physical mapping, resistance-rich cluster, resistance stress element

GmCOL1a and GmCOL1b function as flowering repressor in soybean under long-day conditions

Authors:Dong Cao, Ying Li, Sijia Lu, Jialin Wang, Haiyang Nan, Xiaoming Li, Danning Shi, Chao Fang, Hong Zhai, Xiaohui Yuan, Toyoaki Anai, Zhengjun Xia, Baohui Liu, and Fanjiang Kong

Publish:Plant and Cell Physiology, 2015, 56(12):2409-22.

CONSTANS (CO) has a central role in the photoperiod response mechanism in Arabidopsis. However, the functions of legume CO genes in controlling flowering remain unknown. Here, we analyze the expression patterns of E1, E2 and GmCOL1a/1b using near-isogenic lines (NILs), and we further analyze flowering-related genes in gmcol1b mutants and GmCOL1a-overexpressing plants. Our data showed that both E3 and E4 up-regulate E1 expression, with the effect of E3 on E1 being greater than the effect of E4 on E1. E2 was up-regulated by E3 and E4 but down-regulated by E1. GmCOL1a/1b were up-regulated by E1, E2, E3 and E4. Although the spatial and temporal patterns of GmCOL1a/1b expression were more similar to those of AtCOL2 than to those of AtCO, gmcol1b mutants flowered earlier than wild-type plants under long-day (LD) conditions, and the overexpression of GmCOL1a caused late flowering under LD or natural conditions. In addition, GmFT2a/5a, E1 and E2 were down-regulated in GmCOL1a-overexpressing plants under LD conditions. Because E1/2 influences the expression of GmCOL1a, and vice versa, we conclude that these genes may function as part of a negative feedback loop, and GmCOL1a/b genes may serve as suppressors in photoperiodic flowering in soybean under LD conditions.
soybean [Glycine max (L.) Merr.], CONSTANS, flowering time, photoperiod

GmmiR156b overexpression delays flowering time in soybean

Authors:Dong Cao, Ying Li, Jialin Wang, Haiyang Nan, Youning Wang, Sijia Lu, Qiong Jiang, Xiaoming Li, Danning Shi, Chao Fang, Xiaohui Yuan, Xiaohui Zhao, Xia Li, Baohui Liu, Fanjiang Kong

Publish:Plant Molecular Biology, 2015, 89(4-5):353-63.

Soybean [Glycine max (L.) Merr.] is an important crop used for human consumption, animal feed and biodiesel fuel. Flowering time and maturity significantly affect soybean grain yield. In Arabidopsis thaliana, miR156 has been proposed to regulate the transition from the juvenile to the adult phase of shoot development, which is accompanied by changes in vegetative morphology and an increase in reproductive potential. However, the molecular mechanisms underlying miR156 function in soybean flowering remain unknown. Here, we report that the overexpression of GmmiR156b delays flowering time in soybean. GmmiR156b may target SPL orthologs and negatively regulate GmSPLs, thereby delaying flowering in soybean under LD and natural conditions. GmmiR156b down-regulates several known flowering time regulators in soybean, such as GmAP1 (a, b, c), GmLFY2, GmLFY2, GmFULs, GmSOC1s, GmFT5a, and GmmiR172. These data show that a similar miR156-SPL regulatory module was conserved in the soybean flowering pathway. However, GmFULs, GmSOC1a and GmSOC1b were significantly suppressed under LD conditions but not under SD conditions, which differs from Arabidopsis, where these genes were down-regulated irrespective of photoperiod. In addition, GmmiR156b was up-regulated by E1, E2 (GmGI), E3 and E4, which control flowering time and maturity in soybean, and suppressed E1 (E1-Like) and E2 (E2-Like) genes under LD conditions. These data indicated that the miR156-SPL regulatory module has also diverged to some degree in the soybean flowering pathway.
Soybean, microRNA, Flowering time, GmmiR156b

QTL mapping for flowering time in different latitude in soybean

Authors:Sijia Lu, Ying Li, Jialin Wang, Peerasak Srinives, Haiyang Nan, Dong Cao, Yanping Wang, Jinliang Li, Xiaoming Li, Chao Fang, Xinyi Shi, Xiaohui Yuan, Satoshi Watanabe, Xianzhong Feng, Baohui Liu, Jun Abe, Fanjiang Kong

Publish:Euphytica, 2015, 206(3):725

Flowering represents the transition from the vegetative to reproductive phase and plays an important role in many agronomic traits. For soybean, a short day (SD) induced and photoperiod-sensitive plant, delaying flowering time under SD environments is very important and has been used by breeders to increase yields and enhance plant adaptability at lower latitudes. The purpose of this study was to identify quantitative trait loci (QTLs) associated with flowering time, especially QTLs underlying the long juvenile (LJ) trait, which delays flowering time under SD environments. A population of 91 recombinant inbred lines derived from a cross between AGS292 and K3 was used for map construction and QTL analysis. The map covered 2546.7 cM and included 52 new promoter-specific indel and 9 new exon-specific indel markers. The phenotypic days-to-flowering data were examined in nine environments, including four short-day (SD, low latitude) and five long-day photoperiod (LD, high latitude) environments. For the SD environments, six QTLs were detected. Five of them were associated with the LJ trait. Among the five LJ QTLs, four QTLs may be attributed to the known flowering time genes, including qFT-J-1 for the FT5a locus, qFT-J-2 for the FT2a locus, qFT-O for the E2 locus and qFT-L for the E3 locus. This is the first report that the E2, E3, FT2a and FT5a loci may be ...
Sijia Lu, Ying Li, and Jialin Wang contributed equally to this work.
Additive effect, Epistatic effect, Flowering time, QTL

PopGeV: A Web-based Large-scale Population Genome Browser

Authors:Xinyi Shi, Jing Peng, Xiaohan Yu, Xiaohong Zhang, Dongye Li, Baohui Liu, Fanjiang Kong and Xiaohui Yuan

Publish:Bioinformatics, 2015, 31(18):3048-50

The development of high-throughput sequencing technology has made it possible for more and more researchers to use population sequencing data to mine genes associated with specific traits. However, the massive amounts of sequencing data have also brought new challenges to researchers. The question of how to browse population genomic data in an easy and intuitive manner must be addressed. Web-based genome browsers allow users to conveniently view the results of genomic analyses, but heavy usage can reduce the response speed of the webpage, which limits its usefulness in the display of large-scale genome data. IndexedDB technology is a good solution to this problem; it is supported by web browsers and allows them to create local databases. In this way, data can be read from local storage, achieving a smooth display of population genomic data. PopGeV has the following characteristics. First, it uses a new encoding method for compression of population SNP and INDEL data. IndexedDB technology is used to download the results to local storage so that users can browse the results smoothly even when the network traffic is heavy. Second, PopGeV identifies similar genomic regions between two individuals based on SNP data. Population diversity indexes are calculated when comparing two populations. Third, user-defined annotation information can be integrated for user-friendly mining of gene functions. Simulation shows that PopGeV can smoothly display analysis results for a population genome containing over 500 individuals with 2 million SNPs. PopGeV is available online.
Web-based genome, large-scale genome data
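
The abstract above does not describe PopGeV's encoding method, so the following is only a generic sketch of how population SNP genotypes can be stored compactly and compared between two individuals. The 2-bit codes, the function names, and the fixed-size window comparison are all assumptions made for illustration and should not be read as PopGeV's actual format or algorithm.

```python
# Hypothetical 2-bit genotype codes (an assumption, not PopGeV's format):
# 0 = homozygous reference, 1 = heterozygous,
# 2 = homozygous alternate, 3 = missing call.
MISSING = 3


def pack_genotypes(codes):
    """Pack a list of 2-bit genotype codes into bytes, four codes per byte."""
    packed = bytearray((len(codes) + 3) // 4)
    for i, code in enumerate(codes):
        if not 0 <= code <= 3:
            raise ValueError("genotype codes must be in the range 0..3")
        packed[i // 4] |= code << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, n_sites):
    """Recover the first n_sites genotype codes from packed bytes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n_sites)]


def window_identity(a, b, window):
    """Fraction of identical calls per window, ignoring missing calls.

    Returns one value per window of `window` consecutive sites, or None
    for a window in which no site is called in both individuals.
    """
    result = []
    for start in range(0, min(len(a), len(b)), window):
        pairs = [
            (x, y)
            for x, y in zip(a[start:start + window], b[start:start + window])
            if x != MISSING and y != MISSING
        ]
        if pairs:
            result.append(sum(x == y for x, y in pairs) / len(pairs))
        else:
            result.append(None)
    return result


individual_1 = [0, 0, 2, 2, 1, 3]
individual_2 = [0, 1, 2, 2, 1, 0]
packed = pack_genotypes(individual_1)
print(len(individual_1), "sites stored in", len(packed), "bytes")
print(window_identity(individual_1, individual_2, 2))
```

Packing four genotypes per byte is one simple way to shrink the payload a browser has to download and cache locally; windows with high identity mark candidate regions that the two individuals share.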

Dual functions of GmTOE4a in the regulation of photoperiod‑mediated flowering and plant morphology in soybean

Authors:Xiaohui Zhao, Dong Cao, Zhijun Huang, Jialin Wang, Sijia Lu, Yan Xu, Baohui Liu, Fanjiang Kong, Xiaohui Yuan

Publish:Plant Molecular Biology, 2015, 88(4):343-355

Flowering time, maturity, and plant morphology have considerable effects on the adaptation and grain yield of soybean (Glycine max). The identification of novel genes and an understanding of their molecular basis are critical to improve soybean productivity. In this study, we cloned a flowering time related APETALA2-like gene GmTOE4a and generated GmTOE4a-overexpressing lines in the cultivar Williams 82. The transgenic lines exhibited late flowering under both long-day and short-day conditions, and the flowering-related genes, including GmFT2a, GmFT5a, GmAP1, and GmLFY, were repressed, whereas the flowering repressors GmFT4 and miR156 were upregulated. Interestingly, GmTOE4a was also mediated by photoperiod via maturity genes E3 and E4, which encode photoreceptors in soybean. Further, miR172-mediated GmTOE4a, which regulates flowering in soybean, differs from its Arabidopsis counterpart in that it is dependent on the CONSTANS-like gene GmCOL1a. In addition to its effect on flowering time, GmTOE4a regulated plant morphology, increased stem thickness, and reduced plant height, internode length and leaf size, which are important agronomic traits that enhance the capacity to resist lodging and increase soybean yield. This is useful information to understand the molecular mechanism of flowering time and plant morphology in soybean and will greatly influence soybean yield improvement.
Glycine max, Photoperiod flowering, miR172, GmTOE4a, Plant morphology

A New Dominant Gene E9 Conditions Early Flowering and Maturity in Soybean

Authors:Fanjiang Kong, Haiyang Nan, Dong Cao, Ying Li, Fangfang Wu, Jialin Wang, Sijia Lu, Xiaohui Yuan, Elroy R. Cober, Jun Abe, and Baohui Liu

Publish:Crop Science, 2014, 54(6):2529-2535

Adaptability of soybean [Glycine max (L.) Merr.] to a wide range of latitudes is attributed to the natural variation in the major genes and quantitative trait loci (QTL) that control flowering time and maturity. Identification of novel genes and understanding their molecular basis is critical to improving soybean productivity. We identified a new locus conditioning days to flowering and maturity that was detected in hybrid progeny between cultivated and wild soybeans. A backcross was made between the recurrent parent Tokei 780 and two early-flowering recombinant inbred lines (RILs; from the cross Tokei 780 × Hidaka 4, a wild soybean accession, all of which possessed an identical genotype at the four major maturity loci, E1 to E4). The segregation patterns observed in the F2 and F3 progeny derived from the two crosses revealed that early flowering was controlled by a single dominant gene. The gene was fine-mapped to a 245-kb interval between markers M5 and M7 on Gm16. A tagging marker ID1 was significantly associated with the variation in days to flowering (0.82, p < 0.01) and maturity (0.76, p < 0.01) in the F2 population. The new early-flowering gene and its tagging marker are very useful for molecular breeding towards early maturity and stable productivity of soybean under high-latitude environments. The gene symbol E9e9 has been assigned. E9E9 results in early maturity and e9e9 results in late maturity.
soybean, flowering time, maturity

Genetic Variation in Soybean at the Maturity Locus E4 Is Involved in Adaptation to Long Days at High Latitudes

Authors:Yasutaka Tsubokura, Hisakazu Matsumura, Meilan Xu, Baohui Liu, Hiroko Nakashima, Toyoaki Anai, Fanjiang Kong, Xiaohui Yuan, Hiroyuki Kanamori, Yuichi Katayose, Ryoji Takahashi, Kyuya Harada and Jun Abe

Publish:Agronomy, 2013, 3(1):117-134

Soybean (Glycine max) cultivars adapted to high latitudes have a weakened or absent sensitivity to photoperiod. The purposes of this study were to determine the molecular basis for photoperiod insensitivity in various soybean accessions, focusing on the sequence diversity of the E4 (GmphyA2) gene, which encodes a phytochrome A (phyA) protein, and its homoeolog (GmphyA1), and to disclose the evolutionary consequences of two phyA homoeologs after gene duplication. We detected four new single-base deletions in the exons of E4, all of which result in prematurely truncated proteins. A survey of 191 cultivated accessions sourced from various regions of East Asia with allele-specific molecular markers reliably determined that the accessions with dysfunctional alleles were limited to small geographical regions, suggesting the alleles’ recent and independent origins from functional E4 alleles. Comparison of nucleotide diversity values revealed lower nucleotide diversity at non-synonymous sites in GmphyA1 than in E4, although both have accumulated mutations at almost the same rate in synonymous and non-coding regions. Natural mutations have repeatedly generated loss-of-function alleles at the E4 locus, and these have accumulated in local populations. The E4 locus is a key player in the adaptation of soybean to high-latitude environments under diverse cropping systems.
genetic diversity, soybean, flowering, phytochrome A, photoperiod insensitivity

Molecular identification of genes controlling flowering time, maturity, and photoperiod response in soybean

Authors:Zhengjun Xia, Hong Zhai, Baohui Liu, Fanjiang Kong, Xiaohui Yuan, Hongyan Wu, Elroy R. Cober, Kyuya Harada

Publish:Plant Systematics and Evolution, 2012, 298(7):1217-1227

Most plants activate the developmental transition from the vegetative to the reproductive phase in response to photoperiod length, temperature, and other environmental stimuli. Successful identification of major genes underlying flowering time and maturity in soybean is a prerequisite for understanding the regulation of flowering time. Recent progress has been made toward the molecular bases of soybean maturity loci by using both candidate gene and positional cloning approaches. In particular, successful identification of the molecular identity of the soybean maturity locus E1 is a remarkable achievement, because this gene is essential for understanding the regulation of flowering time and maturity in soybean. The E1 gene has a putative bipartite nuclear localization signal, and a domain distantly related to B3. Transcriptional profiling showed the E1 gene is under photoperiodic regulation. The E2 gene in soybean encodes GmGIa, a homolog of Arabidopsis GIGANTEA that has multiple functions involved in the circadian clock and flowering. Both the E3 and E4 genes encode copies of PHYTOCHROME A proteins, and both genes respond differentially to light with different red to far-red quantum (R:FR) ratios. In addition, two homologs (GmFT2a and GmFT5a) of FLOWERING LOCUS T coordinately promote photoperiodic flowering in soybean. Public availability of the soybean genome sequence to the research community will greatly facilitate fine mapping and cloning of more genes underlying flowering time and photoperiodic response. Further research on identified genes will help us to understand the exquisite regulatory network of parallel and intertwining pathways controlling flowering time and photoperiodic response in soybean.
Flowering time, Maturity, Soybean, Photoperiodic response, Positional cloning, E loci


Authors:Y Nishiura, T Teramoto, X Yuan

Publish:Communications on Pure and Applied Analysis, 2012, 11(1):307-338

Spatially localized patterns form a representative class of patterns in dissipative systems. We study how the dynamics of traveling spots in two-dimensional space change when heterogeneities are introduced in the media. The simplest but fundamental one is a line heterogeneity of jump type. When spots encounter the jump, they display various outputs including penetration, rebound, and trapping depending on the incident angle and its height. The system loses translational symmetry by the heterogeneity, but at the same time, it causes the emergence of various types of heterogeneity-induced-ordered-patterns (HIOPs) replacing the homogeneous constant state. We study these issues by using a three-component reaction-diffusion system with one activator and two inhibitors. The above outputs can be obtained through the interaction between the HIOPs and the traveling spots. The global bifurcation and eigenvalue behavior of HIOPs are the key to understanding the underlying mechanisms for the transitions among those dynamics. A reduction to a finite dimensional system is presented here to extract the model-independent nature of the dynamics. Selected numerical techniques for the bifurcation analysis are also provided.
Reaction-diffusion equations, bifurcation, heterogeneous media, traveling spot

Cloning and Expression Analysis of GmMYB Genes Induced by Abiotic Stresses

Authors:SUN Xia, LIU Jin-Yue, YUAN Xiao-Hui, PAN Xiang-Wen, DU Wei-Guang, REN Hai-Xiang, MA Yong-Bo, Jun ABE, QIU Li-Juan, and LIU Bao-Hui

Publish:ACTA AGRONOMICA SINICA, 2012, 38(2):360-362

Response to external environment is the outcome of stress-induced gene expression. In this paper, based on one stress-induced EST sequence, we cloned four R2R3-MYB genes from soybean cultivar Dongnong 42, whose genomic sequences consisted of three exons and two introns. Three of them, corresponding to Gm02g01300, Gm03g38040, and Gm10g01340, are consistent with the respective sequences of Williams 82. A mutation at the 375th nucleotide in the sequence of Gm19g40650 from Dongnong 42 caused a synonymous amino acid substitution (E125-D125). To test the relationship of the four MYB genes with stress resistance, we treated seedlings of cultivar Dongnong 42 with abiotic stresses including salt, alkali, drought and low temperature in an artificial climate chamber. Quantitative PCR analysis indicated that all four genes were transiently down-regulated or up-regulated when subjected to the stresses, but differed in expression time, level and tendency. Gm02g01300 was induced by drought stress while Gm03g38040 was strongly induced by multiple stresses, indicating that they play important roles in responding to external stresses. There were also differences in the expression of individual genes between cotyledons and embryos. These results under a variety of abiotic stress conditions suggest that the four R2R3-MYB genes differ not only in expression patterns, but also in regulation modes.
Abiotic Stresses, GmMYB, Expression analysis

Overview of Mollisols in the world: Distribution, land use and management

Authors:Xiaobing Liu, Charles Lee Burras, Yuri S. Kravchenko, Artigas Duran, Ted Huffman, Hector Morras, Guillermo Studdert, Xingyi Zhang, Richard M. Cruse, and Xiaohui Yuan

Publish:Canadian Journal of Soil Science, 2012, 92(3):383-402

Mollisols, also known as Black Soils or Prairie Soils, make up about 916 million ha, which is 7% of the world's ice-free land surface. Their distribution strongly correlates with native prairie ecosystems, but is not limited to them. They are most prevalent in the mid-latitudes of North America, Eurasia, and South America. In North America, they cover 200 million ha of the United States, more than 40 million ha of Canada and 50 million ha of Mexico. Across Eurasia they cover around 450 million ha, extending from the western 148 million ha in southern Russia and 34 million ha in Ukraine to the eastern 35 million ha in northeast China. They are common to South America's Argentina and Uruguay, covering about 89 million and 13 million ha, respectively. Mollisols are often recognized as inherently productive and fertile soils. They are extensively and intensively farmed, and increasingly dedicated to cereal production, which needs significant inputs of fertilizers and tillage. Mollisols are also important soils in pasture, range and forage systems. Thus, it is not surprising that these soils are prone to soil erosion and dehumification (loss of stable aggregates and organic matter), and are suffering from anthropogenic soil acidity. Therefore, soil scientists from all of the world's Mollisol regions are concerned about the sustainability of some current trends in land use and agricultural practices. These same scientists recommend increasing the acreage under minimum or restricted tillage, returning plant residues and adding organic amendments such as animal manure to maintain or increase soil organic matter content, and more systematic use of chemical amendments such as agricultural limestone to replenish soil calcium reserves.
Mollisols, Chernozems, classification, sustainability, soil quality, tillage systems

Onset of unidirectional pulse propagation in an excitable medium with asymmetric heterogeneity

Authors:Teramoto, T; Yuan, XH; Bar, M; Nishiura, Y

Publish:Phys Rev E Stat Nonlin Soft Matter Phys, 2009, 79(4 Pt 2):706-723

Heterogeneity is one of the most important and ubiquitous types of external perturbations in dissipative systems. Understanding the behavior of pulse waves in such media is closely related to studying the collision process between the pulse and the heterogeneity-induced-ordered pattern. In particular, we focus on unidirectional propagation of pulses in a medium with an asymmetric bump heterogeneity. This topic has attracted much interest recently with respect to potential computational aspects of chemical pulse propagation as well as with respect to pulse propagation in biological signal processing. We employ a three-component reaction-diffusion system with one activator and two inhibitor species to illustrate these issues. The reduced dynamics near a drift bifurcation describe the phenomena in the full partial differential equations by ordinary differential equations. Such reduced dynamics are able to capture unidirectional propagation properties of pulses near an asymmetric heterogeneity in a qualitatively correct way. A remarkable feature is that such unidirectional behavior is linked to the imperfection of the global bifurcation structure and the resulting asymmetric locations of critical points.
bifurcation, differential equations, nonlinear dynamical systems, reaction-diffusion systems

Dynamics of traveling pulses in heterogeneous media

Authors:Nishiura, Yasumasa; Teramoto, Takashi; Yuan, Xiaohui; Ueda, Kei-Ichi

Publish:Chaos, 2007, 17(3):607-618

One of the fundamental issues of pulse dynamics in dissipative systems is clarifying how the heterogeneity in the media influences the propagating manner. Heterogeneity is the most important and ubiquitous type of external perturbation. We focus on a class of one-dimensional traveling pulses, the associated parameters of which are close to drift and/or saddle-node bifurcations. The advantage in studying the dynamics in such a class is twofold: First, it gives us a perfect microcosm for the variety of outputs in a general setting when pulses encounter heterogeneities. Second, it allows us to reduce the original partial differential equation dynamics to a tractable finite-dimensional system. Such pulses are sensitive when they run into heterogeneities and show rich responses such as annihilation, pinning, splitting, rebound, as well as penetration. The reduced ordinary differential equations (ODEs) explain all these dynamics and the underlying bifurcational structure controlling the transitions among different dynamic regimes. It turns out that there are hidden ordered patterns associated with the critical points of ODEs that play a pivotal role in understanding the responses of the pulse; in fact, the depinning of pulses can be explained in terms of global bifurcations among those critical points. We focus mainly on a bump and periodic types of heterogeneity, however our approach is also applicable to general cases. It should be noted that there appears to be spatio-temporal chaos for a periodic type of heterogeneity when its period becomes comparable with the size of the pulse.
Bifurcations, Partial differential equations, Manifolds, Phase diagrams

Heterogeneity-induced defect bifurcation and pulse dynamics for a three-component reaction-diffusion system

Authors:Yuan, XH; Teramoto, T; Nishiura, Y

Publish:Phys Rev E Stat Nonlin Soft Matter Phys, 2007, 75(3 Pt 2):036220

We consider the dynamics when traveling pulses encounter heterogeneities in a three-component reaction-diffusion system of one-activator-two-inhibitor type, which typically arises as a qualitative model of a gas-discharge system. We focus on the case where one of the kinetic coefficients changes like a smoothed step function, which is basic for more general heterogeneity as in periodic or random media. Since the heterogeneity is introduced to the kinetic part in an additive way, it causes the system to produce various types of localized structures, called defects, that smooth the jump heterogeneity at the jump point, which makes a sharp contrast with the multiplicative heterogeneous case for the Gray-Scott model. The main issue is to study the collision dynamics between traveling pulses and defects, and to show that their global bifurcation structure plays a key role in clarifying the underlying mechanism. Five outputs are observed after collisions, including annihilation, rebound, and pinning. Unstable steady states are identified as separators between two different dynamic regimes, penetration and rebound, the role of which is very close to that of scattors arising in collision processes. An organizing center producing the traveling pulses, defects, and scattors via unfolding with respect to the parameters is also presented.
non-Newtonian fluids, Dirichlet boundary initial value problem, weak solutions, local pressure method