The stLFR sequencing data from the many-banded krait were assembled using Supernova (v2.1.1, RRID: SCR_016756) [12]. To improve assembly quality, GapCloser (v1.12-r6, RRID: SCR_015026) and redundans (v0.14a) [9] were used for gap filling and redundancy removal, respectively, incorporating the whole-genome sequencing data.
To identify known repeat elements in the genome of the many-banded krait, Tandem Repeats Finder [13], LTR_FINDER (RRID: SCR_015247) [11], and RepeatModeler (v2.0.1, RRID: SCR_015027) [14] were used. RepeatMasker (v3.3.0, RRID: SCR_012954) [15] and RepeatProteinMask (v3.3.0) [16] were used for repeat element annotation. Protein-coding genes were predicted using de novo, homology-based, and transcript-mapping approaches. De novo gene prediction was performed using Augustus (v3.0.3, RRID: SCR_008417) [17]. For RNA-seq-based prediction, RNA-seq data were filtered using Trimmomatic (v0.30, RRID: SCR_011848) [18], and transcripts were assembled from the clean reads using Trinity (v2.13.2, RRID: SCR_013048) [19]. PASA (v2.0.2) [20] was then used to align the transcripts against the many-banded krait genome to obtain gene structures. Homology-based prediction was performed by mapping protein sequences of Pseudonaja textilis, Crotalus tigris, Thamnophis elegans, and Notechis scutatus from the UniProt database (release 2020_05) to the B. multicinctus genome using Blastall (v2.2.26) [21]. Gene models were predicted from the alignment results using GeneWise (v2.4.1, RRID: SCR_015054) [22]. Finally, the MAKER pipeline (v3.01.03, RRID: SCR_005309) [23] was used to generate the final gene set, which integrated the RNA-seq-based, homology-based, and de novo predictions.
For functional annotation, BLAST searches (RRID: SCR_004870) were conducted against several databases, including SwissProt [24], TrEMBL [24], and KEGG [25], with an E-value cut-off of 1e-5. In addition, InterProScan (v5.52-86.0, RRID: SCR_005829) [26] was used to predict motifs, domains, and GO terms.
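As an illustration of how the E-value cut-off can be applied, the following minimal sketch filters BLAST hits given in the standard 12-column tabular format (E-value in column 11, bit score in column 12). The function name `best_hits`, the choice to keep hits at or below the cut-off, and the rule of keeping only the highest-scoring hit per query are assumptions made for the example; they are not taken from the pipeline described here.

```python
from typing import Dict, Iterable, Tuple


def best_hits(
    lines: Iterable[str], max_evalue: float = 1e-5
) -> Dict[str, Tuple[str, float, float]]:
    """Map each query to its (subject, evalue, bitscore) best hit.

    Assumes 12-column tab-separated BLAST output. Hits with an E-value
    above max_evalue are discarded; among the rest, the hit with the
    highest bit score is kept per query (an assumption of this sketch).
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # fails the E-value cut-off
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


if __name__ == "__main__":
    # Hypothetical hits, for illustration only.
    demo = [
        "gene1\tsp|P1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t200",
        "gene1\tsp|P2\t50.0\t100\t50\t0\t1\t100\t1\t100\t1e-8\t60",
        "gene2\tsp|P3\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20",
    ]
    print(best_hits(demo))
```

In this toy input, the only hit for `gene2` has an E-value of 0.5 and is discarded, so that gene would receive no annotation from this search.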
To reconstruct the phylogenetic tree, OrthoFinder (v2.3.7, RRID: SCR_017118) [28] was used to search for single-copy orthologs among the protein sequences of Anolis carolinensis (GCA_000090745.2), Chelonia mydas (GCA_015237465.2), Danio rerio (GCA_000002035.4), Deinagkistrodon acutus [29], Gallus gallus (GCA_016699485.1), Homo sapiens (GCA_000001405.29), Mus musculus (GCA_000001635.9), Ophiophagus hannah (GCA_000516915.1), Python bivittatus (GCA_000186305.2), Xenopus tropicalis (GCA_000004195.4), and Alligator mississippiensis (GCA_000281125.4). In total, 7,788 orthogroups were identified across all species.
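The single-copy criterion can be sketched as follows: an orthogroup qualifies only if every species contributes exactly one gene. The example assumes a tab-separated gene-count table with an `Orthogroup` column, one column per species, and a `Total` column, modeled on the per-orthogroup count table that OrthoFinder writes. The function name `single_copy_orthogroups` and the toy species labels are hypothetical, and OrthoFinder reports single-copy orthogroups itself, so this only illustrates the selection rule.

```python
import csv
import io
from typing import List


def single_copy_orthogroups(gene_count_tsv: str) -> List[str]:
    """Return IDs of orthogroups with exactly one gene in every species.

    Assumes a header row of the form: Orthogroup, <species...>, Total.
    """
    reader = csv.DictReader(io.StringIO(gene_count_tsv), delimiter="\t")
    species = [
        col for col in (reader.fieldnames or [])
        if col not in ("Orthogroup", "Total")
    ]
    if not species:
        return []  # no species columns, so nothing can qualify
    selected = []
    for row in reader:
        if all(int(row[sp]) == 1 for sp in species):
            selected.append(row["Orthogroup"])
    return selected


if __name__ == "__main__":
    # Hypothetical counts for three species, for illustration only.
    demo = (
        "Orthogroup\tspA\tspB\tspC\tTotal\n"
        "OG0000000\t1\t1\t1\t3\n"
        "OG0000001\t2\t1\t1\t4\n"
        "OG0000002\t1\t0\t1\t2\n"
    )
    print(single_copy_orthogroups(demo))
```

Here the second orthogroup is rejected because one species has two copies, and the third because one species has none.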