Annotation of yellow genes in Diaphorina citri, the vector for Huanglongbing disease

Huanglongbing (HLB), also known as citrus greening disease, is caused by the bacterium Candidatus Liberibacter asiaticus (CLas) and is a serious threat to global citrus production. This bacterium is transmitted by the Asian citrus psyllid, Diaphorina citri (Hemiptera). There are no effective in planta treatments for CLas, so one strategy is to manage the psyllid population. Manual annotation of the D. citri genome can identify and characterize gene families that could be novel targets for psyllid control. The yellow gene family is an excellent target because yellow genes, which have roles in melanization, are linked to development and immunity. Combined analysis of the genome with RNA-seq datasets, sequence homology, and phylogenetic trees was used to identify and annotate nine yellow genes in the D. citri genome. Manual curation of genes in D. citri provided an in-depth analysis of the yellow family among hemipteran insects and provides new targets for molecular control of this psyllid pest. Manual annotation was done as part of a collaborative Citrus Greening community annotation project.

immune effector response, in which melanin is synthesized and cross-linked with other molecules in injured areas, resulting in the death of invading pathogens and hardening of the wound clot [12]. Melanization is also essential for cuticle sclerotization or tanning, which leads to hardening of the insect exoskeleton [13], and the prevention of moisture loss [10]. To develop gene-targeting or gene-suppressing treatments for yellow proteins, accurate gene sequences need to be established, annotated, and basic expressional details provided. Here we describe the yellow genes of D. citri using a combination of genome annotation and expressional differences [14] based on previously conducted RNA-seq studies.

Context
The yellow genes are of ancient lineage, as evidenced by the presence of yellow-like genes in several bacterial species. However, there is no evidence that these genes exist in the complete genome sequences of the worm Caenorhabditis elegans or the yeast Saccharomyces cerevisiae. This suggests that they may have been lost from many lineages and may now be largely restricted to arthropods [15]. While functional assignments have not yet been made for every member of this family, research suggests that a role in melanization may be conserved for several yellow family members [11]. Duplications, as well as losses, are apparent in the yellow gene family, and phylogenetic analysis shows that yellow family expansion is associated with insect diversification [11]. Previous studies have shown that the yellow-y, -c, -d, -e, -f, -g, and -h genes were present prior to divergence of the hemimetabolous and holometabolous insects; however, some of these ancestral yellow genes are lost in specific insect lineages [11]. The most notable case of yellow lineage duplication is the entire major royal jelly protein (mrjp) family, which forms a distinct cluster within the yellow family phylogeny and seems to be restricted to certain species of bees (Figure 1) [15]. Here we describe the yellow genes of the Asian citrus psyllid, Diaphorina citri. Because of the multiplicity of yellow genes discovered in D. citri and the inconsistency of ortholog names, phylogenetic analysis was performed to properly classify these genes. This was followed by examining expression differences based on previously available RNA-seq datasets. Based on these results, we discuss possible functions of the yellow genes identified in D. citri.

METHODS
The D. citri genome was annotated as part of a community-driven manual curation project [9] with an undergraduate focus [16]. Protein sequences of the yellow family were collected from the National Center for Biotechnology Information (NCBI) protein database [17] and were used for a BLAST search of the D. citri MCOT (Maker (RRID:SCR_005309), Cufflinks (RRID:SCR_014597), Oases (RRID:SCR_011896), and Trinity (RRID:SCR_013048)) protein database [14]. MCOT protein sequences were used to search the D. citri genomes (version 2.0 and 3.0) [18]. Regions of high sequence identity were manually annotated in Apollo version 2.1.0 (RRID:SCR_001936), using de novo transcriptome and MCOT gene predictions, RNA-seq, Iso-seq, and ortholog data as evidence to determine and validate proper gene structure (Table 1). The gene models were compared with those from other hemipterans for accuracy and completeness. A neighbor-joining phylogenetic tree of D. citri yellow protein sequences, along with various related orthologs, was created in MEGA version 7 (RRID:SCR_000667) using the MUSCLE (RRID:SCR_011812) multiple sequence alignment with p-distance for determining branch length and 1000 bootstrap replicates [19].
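The p-distance used for branch lengths is the proportion of aligned sites at which two sequences differ. The following Python sketch illustrates that calculation on a toy alignment. It is not the MEGA implementation: the sequence names, the toy sequences, and the choice to skip gapped sites (pairwise deletion) are assumptions made for illustration only.

```python
from __future__ import annotations


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of compared aligned sites at which two sequences differ.

    Sites where either sequence has a gap are skipped (pairwise deletion).
    That gap treatment is an assumption of this sketch, not a reported setting.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip sites that cannot be compared
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return mismatches / compared


def distance_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of aligned sequences."""
    names = sorted(aligned)
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Hypothetical aligned protein fragments (not real yellow sequences).
toy_alignment = {
    "seqA": "MKT-LLVA",
    "seqB": "MKTALLIA",
    "seqC": "MRT-LIVA",
}
matrix = distance_matrix(toy_alignment)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A matrix of this kind is the input a neighbor-joining method works from; the tree building and bootstrap resampling themselves are left to dedicated software.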
A more detailed description of the annotation workflow is available (Figure 2) [20]. Accession numbers for the sequences used in this analysis can be found in Tables 1, 2, and 3. Comparative expression levels of yellow proteins throughout different life stages (egg, nymph, and adult) in Candidatus Liberibacter asiaticus (CLas)-exposed versus healthy D. citri insects were determined using RNA-seq data and the Citrus Greening Expression Network [14]. Gene expression levels were obtained from the Citrus Greening Expression Network [14] and visualized using Excel (RRID:SCR_016137) and the pheatmap package in R (RRID:SCR_016418) [21,22]. Expression values for all samples discussed in this manuscript and visualized in Figures 3-7 are reported in Table 4.
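Expression values in this work are reported in transcripts per million (TPM). As a reminder of what that unit represents, the sketch below applies the standard TPM definition (length-normalized read rates rescaled to sum to one million) to hypothetical read counts and transcript lengths. The gene names and numbers are invented for illustration; the values analyzed here were obtained from the Citrus Greening Expression Network [14], not from this code.

```python
from __future__ import annotations


def tpm(counts: dict[str, float], lengths_bp: dict[str, float]) -> dict[str, float]:
    """Convert read counts to transcripts per million (TPM).

    Each count is divided by transcript length to give a per-base read rate,
    and the rates are rescaled so that they sum to one million in the sample.
    """
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("all read rates are zero; TPM is undefined")
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical sample: gene names, counts, and lengths are illustrative only.
toy_counts = {"geneA": 100, "geneB": 300, "geneC": 600}
toy_lengths = {"geneA": 1000, "geneB": 1500, "geneC": 3000}

toy_tpm = tpm(toy_counts, toy_lengths)
for gene, value in toy_tpm.items():
    print(gene, round(value, 1))
```

Because TPM values sum to the same total in every sample, they can be compared across life stages, tissues, and infection states more directly than raw counts.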

Data validation and quality control
Manual gene annotation of the D. citri genome revealed the presence of nine yellow genes, each containing the major royal jelly protein (mrjp) domain conserved in this gene family. All nine of the genes were confirmed by at least four types of evidence, including RNA-seq and ortholog presence (Table 1). Phylogenetic analysis was conducted to determine the orthology of these yellow genes; the results coincide well with previous studies (Figure 1) [11,27]. Based on this analysis, the yellow genes in D. citri comprise two yellow-y genes, two yellow-d genes, two yellow-g genes, one yellow-h, and one yellow-c, as well as one yellow gene (yellow 9) that seems to be a duplication unique to hemipterans, but is closely related to known yellow-f orthologs (Figure 1, Table 5).

yellow-y
The gene yellow-y (also simply referred to as yellow) was the first example of a single gene mutation affecting behavior [13]. However, yellow was initially identified because of its role in pigmentation, and was named for the loss of black pigment that gave mutant flies a more yellow appearance [42]. Recent studies suggest that the Drosophila melanogaster yellow-y and ebony genes together determine the pattern and intensity of melanization [43], and that the yellow-y gene may regulate the expression of yellow-f or other enzymes involved in melanization [10]. Many studies have also noted a role for yellow-y in the behavior and mating ability of Drosophila, such as changes in the structures used during mating in yellow mutants [13]. The yellow-y gene is present in most insect species as a single copy; however, both yellow and yellow 2 in D. citri form a clade exclusively with known yellow-y orthologs, indicating a duplication event that seems to be unique to D. citri [...] with research that found yellow-y to be abundant in Drosophila pupae when melanin is deposited in the adult cuticle [44]. On the other hand, yellow 2 (y2) may be important in adult D. citri and should be studied further.

Figure caption (partial): Eggs and nymphs raised on Citrus macrophylla [8] and nymphs raised on Citrus spp. [25] are single replicate data. RNA-seq data were sourced from insects raised on C. sinensis and C. medica (NCBI BioProject PRJNA609978). Expression analysis was performed using the Citrus Greening Expression Network [14].

Figure caption (partial): [...] [24] and midgut [26] of infected and uninfected adult insects. Values are represented in transcripts per million (TPM). Expression data were obtained using the Citrus Greening Expression Network [14].

yellow-c, -f, -b, and 9
No function has been directly identified for the yellow genes yellow-b or yellow-c. However, phylogenetic analysis reveals a close relationship between yellow-b, yellow-c, yellow-f, and yellow 9 [11,27] (Figure 1). In Drosophila, yellow-f and yellow-f2 function as dopachrome conversion enzymes (DCE), which are important in melanin biosynthesis [43]. Interestingly, while most hemipterans seem to have one or more yellow-c genes, none form clades exclusively with known yellow-f or yellow-b orthologs (Figure 1). Instead, hemipteran genes are grouped into their own separate clade. A close relationship was observed, however, between yellow-f and yellow 9, which is supported by the presence of an Acyrthosiphon pisum sequence in the yellow 9 clade, previously reported as grouping with yellow-f (Figure 1) [11]. The addition of several other hemipteran sequences may have helped align A. pisum more closely to the yellow 9 orthologs. This distinctness of the hemipteran group is common among the other yellow genes in the tree; however, the association with a known ortholog is typically much clearer than is seen with yellow 9. More studies should be conducted to conclusively determine the identity of this hemipteran outlier.
Of all yellow genes in D. citri, yellow 8 (c) shows the greatest expression levels and is most highly expressed in the adult whole body of D. citri insects reared on Citrus medica (Figure 3). There was a significant increase in expression in nymphs reared on [...]

Table note: Accession numbers for non-hemipteran insects used in phylogenetic analysis (Figure 1). Surviving entry: NP_001019868.1. XP identifiers are computer-predicted models. † Yellow-fa and -fb are the names given specifically to Bombyx mori. ‡ Yellow-x is the name commonly used to refer to this group.

yellow-h
Transcripts of yellow-h show color-related expression patterns in some species, but the function of the encoded protein is poorly understood [11]. Phylogenetic analysis reveals that D. citri contains one yellow-h gene, yellow 4 (Table 5, Figure 1). Expression data from D. citri show the highest expression of this gene in the egg and nymph (Figure 3). This is consistent with previous research showing that mutations of yellow-h in larval Vanessa cardui led to death in pupal stages of development, suggesting that yellow-h could be important during insect development [27]. Furthermore, expression data revealed an 8.47-fold increase in yellow-h expression in D. citri nymphs reared on Citrus spp. and infected with CLas (52.52 TPM) versus uninfected nymphs (6.2 TPM) (Figure 5). This differential expression of yellow-h, coupled with the impact of mutations on pupal mortality, indicates that yellow-h could be a potential RNA interference (RNAi) target and warrants additional study in D. citri.
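The fold change reported for yellow-h follows directly from the two TPM values given in the text. A minimal Python sketch of that arithmetic is shown below; the function and variable names are illustrative and are not part of the Citrus Greening Expression Network.

```python
def fold_change(treated_tpm: float, control_tpm: float) -> float:
    """Ratio of expression in the treated sample to the control sample."""
    if control_tpm <= 0:
        raise ValueError("fold change is undefined for a non-positive baseline")
    return treated_tpm / control_tpm


# TPM values for yellow-h (yellow 4) in nymphs, as reported in the text.
infected_nymph_tpm = 52.52
uninfected_nymph_tpm = 6.2

yellow_h_fc = fold_change(infected_nymph_tpm, uninfected_nymph_tpm)
print(f"{yellow_h_fc:.2f}-fold")  # prints "8.47-fold"
```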

yellow-e3/d
Previous research has revealed that yellow-d shows red-specific expression in the butterfly V. cardui, and that the loss of yellow-d function not only affects melanin patterns, but also presumptive ommochrome patterns [27]. Phylogenetic analysis of the yellow genes annotated in the D. citri genome shows yellow 3 and yellow 5 in a clade with known yellow-e3/d orthologs (Figure 1). Expression of yellow 3 (d) was highest in the whole body of adult psyllids reared on Citrus medica, while those reared on C. macrophylla or Citrus spp. showed low expression. Similarly, expression was highest in nymphs raised on C. sinensis and C. medica, with almost no expression in psyllids reared on other citrus species (Figure 3). In insects reared on C. reticulata, expression of yellow 3 (d) was consistently close to zero TPM, while yellow 5 (d) showed relatively high expression in the adult antennae, head, and thorax (Figure 6).

Table note: Accession numbers for hemipterans used in phylogenetic analysis (Figure 1). XP identifiers are computer-predicted models.

yellow-g
The function of yellow-g is currently not well understood; however, it is present in duplicate in most insects (Figure 1, Table 5). D. citri contains two yellow-g genes, yellow 6 and yellow 7, both of which are expressed more highly in adult males than in females raised on C. reticulata (Figure 3). The expression of both genes is relatively similar throughout the stages and tissues that have been assayed. Neither gene is expressed in the egg, and both show low expression levels in the nymph, with higher expression in adults. There is a notable upregulation of yellow 6 (g), from undetectable in uninfected nymphs raised on Citrus spp. to 4.01 TPM in infected nymphs. There is also an upregulation of yellow 6 (g) by 8-fold and yellow 7 (g) by 2.64-fold in the midgut of infected versus uninfected adult psyllids reared on Citrus spp. (Figure 7). This effect may indicate an immune response and should be studied further as a possible RNAi target.
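When the baseline is undetectable, as for yellow 6 (g) in uninfected nymphs, a simple ratio cannot be computed. One common workaround is a log2 fold change with a small pseudocount added to both values. The sketch below illustrates this; the pseudocount of 1 TPM and the treatment of "undetectable" as 0 TPM are assumptions of this example, not settings used in the analysis described here.

```python
import math


def log2_fold_change(treated_tpm: float, control_tpm: float,
                     pseudocount: float = 1.0) -> float:
    """log2 of (treated + pseudocount) / (control + pseudocount).

    The pseudocount keeps the ratio finite when the control value is zero.
    Its size (1 TPM by default) is an assumption of this sketch.
    """
    if treated_tpm < 0 or control_tpm < 0:
        raise ValueError("TPM values cannot be negative")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    return math.log2((treated_tpm + pseudocount) / (control_tpm + pseudocount))


# yellow 6 (g) in nymphs: undetectable (taken as 0 TPM) versus 4.01 TPM.
yellow6_nymph_lfc = log2_fold_change(4.01, 0.0)
print(round(yellow6_nymph_lfc, 2))
```

A positive value indicates higher expression in the infected sample, and equal expression in both samples gives zero regardless of the pseudocount.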

Conclusion
The yellow gene family is a continuously evolving set of genes, with duplications and losses among insects [11]. Many of these genes are crucial in melanization, which is essential for insect survival in relation to development and immunity [10,12]. Though the functions of some yellow proteins remain poorly understood, identification of these proteins in the hemipteran D. citri provides a novel insect lineage for studies of insect evolution and biology. D. citri harbors a unique duplication of yellow-y, a gene that may affect cuticular hardening. Therefore, it could be a potential target for a D. citri-specific molecular control mechanism [13]. Expression data show an inverse relationship between the two yellow-y genes, suggesting independent roles for these proteins during juvenile and adult stages (Figure 3). The yellow 9 gene appears to be unique to hemipterans (Figure 1), and is a potential alternative to yellow-f in holometabolous insects, which encodes a dopachrome conversion enzyme (DCE) in Drosophila [43].

Figure caption: Comparative expression levels in transcripts per million (TPM) of the D. citri yellow-y proteins in infected versus uninfected D. citri insects grown on various citrus varieties. Expression data were obtained using the Citrus Greening Expression Network [14]. RNA-seq data for psyllids were obtained from NCBI BioProjects PRJNA609978 and PRJNA448935, in addition to published datasets [8,23-26]. Citrus hosts are abbreviated as C.sin (C. sinensis), C.med (C. medica), C.ret (C. reticulata) and C.mac (C. macrophylla).

Table caption: The yellow gene family ortholog numbers based on results of the phylogenetic analysis (Figure 1). Dc numbers represent the number of manually annotated genes in the D. citri v3.0 genome. † Copy number in Am includes a predicted sequence from NCBI that groups with mrjp in phylogenetic analysis.

REUSE POTENTIAL
The manually curated gene models based on the highly contiguous version 3 genome, compared with previous assemblies, provide a novel resource for understanding psyllid biology for the citrus greening community. To improve the accessibility and usability of these data, they will be included in the Psyllid Expression Network [14]. This visualization and analysis tool includes public transcriptomics data for Diaphorina citri from multiple tissues, life stages, infection states and citrus hosts in an expression cube for comparative analysis.

DATA AVAILABILITY
The data sets supporting this article are available in the GigaScience GigaDB repository [45].

EDITOR'S NOTE
This article is one of a series of Data Releases crediting the outputs of a student-focused and community-driven manual annotation project, curating gene models and, if required, correcting assembly anomalies, for the Diaphorina citri genome project [18].

ETHICAL APPROVAL
Not applicable.

CONSENT FOR PUBLICATION
Not applicable.