Getting Started
CassavaDB is a multi-omics database for cassava (Manihot esculenta) research. It integrates genomic, transcriptomic, variation, metabolomic, and breeding data to support cassava genomics, breeding, and functional studies.
Data Types Available:
- Genomic Data: Reference genome assemblies, gene annotations, and miRNA sequences
- Transcriptomic Data: RNA-Seq datasets, gene expression profiles, differential expression analyses, co-expression networks, and single-cell RNA-Seq data
- Variation Data: Whole-genome sequencing (WGS) projects, SNP/InDel variants, and variant density analyses
- Metabolomic Data: Metabolite profiles, metabolite GWAS, and related analyses
- Breeding Resources: Cultivar information, SSR markers, and research publications
Key Features: Interactive visualizations, analysis tools, bulk data downloads, and integration with external databases such as NCBI and PlantGDB.
CassavaDB is organized into logical sections based on data types. Here's a recommended navigation strategy:
Start with the Homepage
Get an overview of available resources, recent updates, and featured datasets.
Explore by Research Interest
Use the main navigation menu to access specific data types: Genomics, Transcriptomics, Variomics, Metabolomics, or Breeding.
Use Search Functions
Start with gene search or variant search, or browse expression data, to find your genes or regions of interest.
Try Analysis Tools
Use built-in tools like GO enrichment, KEGG pathway analysis, or BLAST search for functional analysis.
Pro Tip: Bookmark frequently used pages and use the breadcrumb navigation to keep track of your location in the database.
No registration required! CassavaDB is completely open-access and free to use. You can:
- Browse all data without creating an account
- Use all search and analysis tools
- Download datasets
- Access all visualization tools
Open Science Approach: We believe in making cassava research data freely accessible to accelerate scientific discovery and crop improvement efforts globally.
Data Search and Access
CassavaDB offers multiple ways to search for genes:
1. Basic Gene Search (Genomics → Gene Search)
- By Gene ID: Enter gene identifiers like "Manes.01G000100"
- By Gene Name: Search for gene names or symbols
2. Advanced Search Features
- Batch Search: Search multiple genes at once by entering multiple gene IDs
- Wildcard Search: Use asterisk (*) for partial matches
- Case Sensitivity: Gene IDs are case-sensitive, so ensure correct formatting
3. BLAST Search
Use nucleotide or protein sequences to find similar genes:
- Access via Genomics → BLAST
- Supports BLASTN, BLASTP, BLASTX, TBLASTN, and TBLASTX
- Adjustable parameters for sensitivity and specificity
Search Tips: Use wildcards (*) for partial matches, check gene ID formatting and capitalization (e.g., Manes.01G000100; gene IDs are case-sensitive), and try both gene IDs and gene names.
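The wildcard and case-sensitivity rules above can be mimicked locally, for example to pre-check a gene list before pasting it into the search box. This is a minimal sketch, not CassavaDB's own search code: the ID pattern is inferred only from the example "Manes.01G000100" and may not cover every valid ID, and the function names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Assumed ID layout, inferred only from the example "Manes.01G000100".
# Real CassavaDB IDs may take other forms that this pattern rejects.
GENE_ID_PATTERN = re.compile(r"Manes\.\d{2}G\d{6}")


def is_well_formed(gene_id: str) -> bool:
    """Return True if gene_id matches the assumed layout exactly (case-sensitive)."""
    return GENE_ID_PATTERN.fullmatch(gene_id) is not None


def wildcard_search(pattern: str, gene_ids: list[str]) -> list[str]:
    """Case-sensitive wildcard match where * stands for any run of characters.

    fnmatchcase also gives special meaning to ? and [], which the
    CassavaDB search may or may not support.
    """
    return [g for g in gene_ids if fnmatchcase(g, pattern)]


genes = ["Manes.01G000100", "Manes.01G000200", "Manes.02G000100"]
print(wildcard_search("Manes.01G*", genes))  # ['Manes.01G000100', 'Manes.01G000200']
print(is_well_formed("manes.01g000100"))     # False: wrong capitalization
```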
CassavaDB provides comprehensive gene expression data from multiple RNA-Seq studies:
Expression Data Sources
- RNA-Seq Projects: Multiple transcriptomic studies from different research groups
- Sample Diversity: Various tissues, developmental stages, and experimental conditions
- Data Processing: Standardized analysis pipelines for consistent results
- Expression Profiles: Gene-level expression data across different samples
How to Access Expression Data
Gene Expression Browser
Go to Transcriptomics → Gene Expression Browser, enter gene ID(s), and select datasets of interest.
RNA-Seq Projects
Browse available studies in Transcriptomics → RNA-Seq Projects to understand experimental designs.
Differential Expression
Use Transcriptomics → Differential Expression to find genes up/down-regulated in specific conditions.
Visualization Options: Heatmaps, bar charts, line plots, and box plots. Data can be downloaded in CSV/Excel formats.
CassavaDB includes comprehensive variant data from whole-genome sequencing of diverse cassava accessions:
Variant Types Available
Variant Type | Count | Description |
---|---|---|
SNPs | ~15 million | Single nucleotide polymorphisms |
InDels | ~2 million | Small insertions and deletions |
Search Methods
- Position-based: Search by chromosome and position range
- Gene-based: Find variants within or near specific genes
- Effect-based: Filter by predicted functional impact (high, moderate, low)
- Population-based: Filter by allele frequency in different populations
- GWAS results: Access significant associations with traits
Advanced Features: Variant density visualization, population frequency analysis, and functional effect prediction for SNPs and InDels.
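For readers working with downloaded variant tables, the position-, effect-, and frequency-based filters and the density view can be approximated offline. The sketch below assumes a simplified in-memory record; the field names, example values, and window size are illustrative and do not reflect CassavaDB's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int            # 1-based position
    impact: str         # "high", "moderate", or "low"
    allele_freq: float  # frequency in the population of interest


def filter_variants(variants, chrom, start, end, impacts=None, min_af=0.0):
    """Keep variants inside chrom:start-end (inclusive) that pass both filters."""
    kept = []
    for v in variants:
        if v.chrom != chrom or not (start <= v.pos <= end):
            continue
        if impacts is not None and v.impact not in impacts:
            continue
        if v.allele_freq < min_af:
            continue
        kept.append(v)
    return kept


def density(variants, chrom, window):
    """Count variants per fixed-size window, keyed by the window's 1-based start."""
    counts = {}
    for v in variants:
        if v.chrom == chrom:
            bin_start = (v.pos - 1) // window * window + 1
            counts[bin_start] = counts.get(bin_start, 0) + 1
    return counts


# Made-up records for illustration only.
variants = [
    Variant("Chr01", 1_000_500, "high", 0.12),
    Variant("Chr01", 1_200_000, "low", 0.40),
    Variant("Chr01", 2_500_000, "high", 0.02),
    Variant("Chr02", 1_000_500, "moderate", 0.30),
]
hits = filter_variants(variants, "Chr01", 1_000_000, 2_000_000,
                       impacts={"high", "moderate"}, min_af=0.05)
print([v.pos for v in hits])
print(density(variants, "Chr01", 1_000_000))
```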
Analysis Tools
CassavaDB provides built-in tools for functional enrichment analysis:
GO Enrichment Analysis
Prepare Gene List
Collect your genes of interest (e.g., from differential expression analysis)
Access Tool
Navigate to Genomics → GO Enrichment Analysis
Input Parameters
Enter gene IDs, select GO categories (BP/MF/CC), set p-value threshold
Interpret Results
Review enriched terms, p-values, and gene counts. Download results for further analysis
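To help interpret the p-values, here is a sketch of one common enrichment statistic, the one-sided hypergeometric test. This page does not state which test CassavaDB applies, so treat this as an assumption rather than a description of the tool; the sketch also omits multiple-testing correction, which real enrichment tools normally add.

```python
from math import comb


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): chance of seeing at least k annotated genes in the list.

    N = genes in the background, K = background genes carrying the GO term,
    n = genes in your list,      k = genes in your list carrying the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers: 20 background genes, 5 with the term, a list of 4 genes, 2 of them hits.
p = hypergeom_pvalue(k=2, n=4, K=5, N=20)
print(round(p, 4))
```

A small p-value means the term appears in the list more often than random sampling from the background would suggest.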
KEGG Pathway Analysis
- Pathway Mapping: Map your genes to KEGG pathways
- Enrichment Testing: Statistical testing for over-represented pathways
- Visual Pathway Maps: Interactive pathway diagrams with highlighted genes
- Cross-species Comparison: Compare with pathways from other plant species
Input Requirements: Use valid cassava gene IDs (e.g., Manes.01G000100). Check whether the enrichment analysis tools are available in your CassavaDB instance, as some advanced analysis features may require specific configuration.
⚠️ Error Result: If you click "Pick Primers" in the Primer3 tool without entering any DNA sequence, you will receive an error message:
PRIMER_ERROR=Missing SEQUENCE tag
Solution: Always enter a DNA sequence in the "Source Sequence" text area before clicking "Pick Primers".
How to Use Primer3 Correctly
Enter DNA Sequence
Paste your target DNA sequence (5' → 3' direction) in the "Source Sequence" field. FASTA format is supported.
Select Task Type
Choose appropriate task: "generic" for standard PCR, "pick_sequencing_primers" for sequencing, or "check_primers" to validate existing primers.
Configure Parameters
Adjust primer size, melting temperature, GC content, and product size according to your experimental needs.
Pick Primers
Click "Pick Primers" to generate optimized primer pairs with detailed quality information.
Sequence Requirements: Only the letters A, C, G, T, and N are recognized; any other letter is treated as N. Numbers and spaces are ignored automatically.
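To preview how pasted text will be read under the sequence rules stated above, the following sketch applies them locally. It is not Primer3's own parser. Three behaviors are assumptions made for illustration: FASTA header lines are dropped, output is upper-cased, and non-letter symbols other than digits and whitespace are discarded.

```python
def clean_sequence(raw: str) -> str:
    """Normalize pasted sequence text: skip FASTA headers, map unknown letters to N."""
    # Drop FASTA header lines (those starting with ">").
    body = "".join(ln for ln in raw.splitlines() if not ln.startswith(">"))
    out = []
    for ch in body:
        if not ch.isalpha():
            continue  # digits and whitespace are ignored; other symbols too (assumption)
        base = ch.upper()
        out.append(base if base in "ACGTN" else "N")
    return "".join(out)


print(clean_sequence(">seq1\n1 acgtr nacg 61\nTTXX"))  # ACGTNNACGTTNN
```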
Different PCR applications require different primer design parameters:
Parameter | qPCR | Standard PCR | Sequencing | Description |
---|---|---|---|---|
Product Size | 80-150 bp | 200-1000 bp | Variable | Target amplicon length |
Primer Length | 18-22 bp | 18-25 bp | 18-30 bp | Primer sequence length |
Tm Difference | ≤ 2°C | ≤ 5°C | ≤ 5°C | Maximum melting-temperature difference between the two primers |
GC Content | 45-55% | 40-60% | 40-60% | Percentage of G and C nucleotides |
Tm Range | 58-62°C | 55-65°C | 55-70°C | Melting temperature range |
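The qPCR and standard PCR columns of the table can be turned into a quick check for a candidate primer pair. This sketch takes the melting temperatures as inputs (for example, the values reported by Primer3) instead of computing them, because Tm depends on the chosen thermodynamic model. The dictionary layout, function names, and example primers are hypothetical.

```python
# Ranges copied from the table above; the dictionary layout is illustrative.
PROFILES = {
    "qpcr": {
        "length": (18, 22), "gc": (45.0, 55.0), "tm": (58.0, 62.0),
        "max_tm_diff": 2.0, "product": (80, 150),
    },
    "standard": {
        "length": (18, 25), "gc": (40.0, 60.0), "tm": (55.0, 65.0),
        "max_tm_diff": 5.0, "product": (200, 1000),
    },
}


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in a non-empty primer."""
    seq = primer.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def in_range(value, bounds) -> bool:
    low, high = bounds
    return low <= value <= high


def check_pair(fwd, rev, tm_fwd, tm_rev, product_size, profile="qpcr"):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    limits = PROFILES[profile]
    problems = []
    for name, seq, tm in (("forward", fwd, tm_fwd), ("reverse", rev, tm_rev)):
        if not in_range(len(seq), limits["length"]):
            problems.append(f"{name} primer length {len(seq)} outside {limits['length']}")
        gc = gc_percent(seq)
        if not in_range(gc, limits["gc"]):
            problems.append(f"{name} primer GC {gc:.1f}% outside {limits['gc']}")
        if not in_range(tm, limits["tm"]):
            problems.append(f"{name} primer Tm {tm:.1f} outside {limits['tm']}")
    if abs(tm_fwd - tm_rev) > limits["max_tm_diff"]:
        problems.append(f"Tm difference {abs(tm_fwd - tm_rev):.1f} exceeds {limits['max_tm_diff']}")
    if not in_range(product_size, limits["product"]):
        problems.append(f"product size {product_size} outside {limits['product']}")
    return problems


# Made-up primers for illustration only.
fwd = "ATGCATGCATGCATGCATGC"
rev = "GGCCGGCCGGCCGGCCGGCC"
for problem in check_pair(fwd, rev, 60.0, 64.0, 120, profile="qpcr"):
    print(problem)
```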
Advanced Settings for Difficult Templates
- High GC Content: Increase Tm range to 65-70°C, allow longer primers
- Repetitive Sequences: Use template masking, increase penalty for repeats
- Low Specificity: Use mispriming library, increase penalty for secondary structures
- Multiplex PCR: Ensure similar Tm values, check for primer-primer interactions
Pro Tip: Start with default parameters and adjust based on results. For challenging sequences, try template masking with cassava-specific repeat libraries.
JBrowse2 is our interactive genome browser for exploring cassava genomic features in their chromosomal context:
Getting Started with JBrowse2
Access the Browser
Navigate to Genomics → JBrowse2 to open the genome browser interface
Navigate to Region
Enter coordinates (e.g., "Chr01:1000000-2000000") or gene IDs in the location box
Add Data Tracks
Select tracks to display: genes, variants, RNA-Seq coverage, repeat elements, etc.
Customize View
Zoom in/out, adjust track heights, change color schemes, and configure display options
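When preparing coordinates in a script before pasting them into the location box, it can help to validate the "Chr01:1000000-2000000" form first. The sketch below handles only that one form, as shown in the example above; JBrowse2 itself may accept other notations, and the function name is hypothetical.

```python
import re

# Matches "name:start-end" with plain digits, as in the example above.
REGION = re.compile(r"(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)")


def parse_region(text: str) -> tuple[str, int, int]:
    """Split a location string into (chromosome, start, end), validating order."""
    match = REGION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end region: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in {text!r}")
    return match["chrom"], start, end


print(parse_region("Chr01:1000000-2000000"))  # ('Chr01', 1000000, 2000000)
```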
Available Data Tracks
- Gene Models: Protein-coding genes and gene annotations
- Variants: SNPs and InDels with functional annotations
- RNA-Seq: Expression data and coverage tracks from different studies
- Genome Features: Reference genome sequences and chromosomal regions
Keyboard Shortcuts: Use arrow keys to pan, +/- to zoom, and 'r' to reverse complement. Right-click features for detailed information and links to other tools.
Data Download
CassavaDB provides comprehensive datasets for offline analysis. All data is freely available without registration:
Available Datasets
Data Type | Format | Size | Description |
---|---|---|---|
Reference Genome | FASTA, GFF3 | ~750 MB | Complete genome assembly with annotations |
Gene Sequences | FASTA | ~50 MB | CDS, protein, and transcript sequences |
Variant Data | VCF, TSV | ~1-2 GB | SNPs and InDels with functional annotations |
Expression Data | CSV, TSV, H5 | ~500 MB | RNA-Seq count matrices and metadata |
Metabolite Data | CSV, Excel | ~10 MB | Metabolite profiles and GWAS results |
Download Methods
- Bulk Downloads: Complete datasets via the Download page
- Custom Downloads: Subset data based on your search criteria
- API Access: Programmatic access for automated downloads
- FTP Server: Direct access to all data files
Data Updates: Datasets are updated regularly. Check version numbers and release dates to ensure you have the latest data.
Most CassavaDB tools provide multiple export options for your results:
Export Formats Available
- CSV/TSV: Tabular data for analysis in Excel, R, or Python
- JSON: Structured data for programmatic processing
- FASTA: Sequence data for further bioinformatics analysis
- GFF3/BED: Genomic coordinates for genome browsers
- PNG/SVG: High-quality plots and visualizations
- PDF: Publication-ready figures and reports
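One practical point when moving between the GFF3 and BED exports: GFF3 coordinates are 1-based and inclusive, while BED coordinates are 0-based and half-open, so only the start value shifts by one. The sketch below converts a single GFF3 feature line to a six-column BED line. The example feature is made up, and the function name is hypothetical.

```python
def gff3_to_bed(gff_line: str) -> str:
    """Convert one GFF3 feature line to a BED6 line (chrom, start, end, name, score, strand)."""
    fields = gff_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GFF3 feature line has exactly 9 tab-separated columns")
    seqid, _source, _type, start, end, _score, strand, _phase, attributes = fields
    name = "."
    for pair in attributes.split(";"):
        key, _, value = pair.partition("=")
        if key == "ID":
            name = value
    # GFF3 start is 1-based inclusive; BED start is 0-based. The end value is unchanged.
    return "\t".join([seqid, str(int(start) - 1), end, name, "0", strand])


# Made-up feature line for illustration only.
line = "Chr01\texample\tgene\t1001\t2000\t.\t+\t.\tID=Manes.01G000100;Name=demo"
print(gff3_to_bed(line))
```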
Export Procedures
Complete Your Analysis
Perform search or analysis using any CassavaDB tool
Locate Export Button
Look for "Download", "Export", or "Save" buttons near results tables or plots
Choose Format
Select appropriate file format based on your downstream analysis needs
Save File
The file is saved to your browser's default download folder
File Size Limits: Large result sets may be split into multiple files or require bulk download. Check file sizes before exporting.
Technical Issues and Troubleshooting
CassavaDB is optimized for modern web browsers. For the best experience, use:
Recommended Browsers
Browser | Minimum Version | Recommended Version | Notes |
---|---|---|---|
Chrome | 90+ | Latest | Best performance, all features supported |
Firefox | 88+ | Latest | Excellent compatibility |
Safari | 14+ | Latest | Good on macOS/iOS |
Edge | 90+ | Latest | Chromium-based versions |
Common Display Issues and Solutions
- Slow Loading: Clear browser cache, disable ad blockers, check internet connection
- Missing Graphics: Enable JavaScript, update browser, check popup blockers
- Layout Problems: Try browser zoom reset (Ctrl+0), disable browser extensions
- Interactive Tools Not Working: Allow JavaScript execution, disable strict security settings
Mobile Access: CassavaDB is responsive and works on tablets and smartphones, though desktop browsers provide the best experience for complex analyses.
Large analyses can take time. Here are optimization strategies:
Performance Optimization
- Reduce Dataset Size: Filter by chromosome, gene set, or expression level before analysis
- Batch Processing: Split large gene lists into smaller chunks
- Use Appropriate Tools: Choose the right tool for your data size and complexity
- Off-peak Usage: Try analyses during less busy times (early morning, late evening)
Timeout Troubleshooting
Check Input Size
Reduce the number of genes or variants in your analysis
Simplify Parameters
Use stricter cutoffs to shrink the result set, and reduce the number of comparisons
Try Alternative Approach
Consider downloading data for local analysis
Contact Support
For persistent issues, contact our support team
Resource Limits: Some analyses have built-in limits (e.g., max 1000 genes for enrichment analysis) to ensure reasonable response times for all users.
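The batch-processing advice and the example limit of 1000 genes can be combined in a small helper that removes duplicates and splits a long gene list into batches for separate submission. This is a minimal sketch: the gene IDs are generated placeholders, not real CassavaDB genes, and the default of 1000 is taken from the example limit above, which may differ per tool.

```python
def chunked(items, size=1000):
    """Yield consecutive slices of items, each holding at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for i in range(0, len(items), size):
        yield items[i:i + size]


# Placeholder IDs for illustration; replace with your own gene list.
gene_ids = [f"Manes.01G{n:06d}" for n in range(1, 2501)]

# dict.fromkeys removes duplicates while keeping the original order.
unique_ids = list(dict.fromkeys(gene_ids))

batches = list(chunked(unique_ids, size=1000))
print([len(batch) for batch in batches])  # [1000, 1000, 500]
```

Each batch can then be submitted on its own and the result tables combined afterwards. Note that enrichment statistics computed per batch are not equivalent to a single run on the full list, so splitting suits lookups and exports better than enrichment tests.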
We appreciate bug reports and feedback to improve CassavaDB:
How to Report Issues
Document the Issue
Note what you were doing, expected vs. actual results, and error messages
Gather System Info
Include browser type/version, operating system, and screen resolution
Take Screenshots
Visual documentation helps us understand and reproduce the issue
Send Report
Email us at 23220951310021@hainanu.edu.cn with all details
Information to Include
- Page URL: Exact page where the issue occurred
- Steps to Reproduce: Detailed sequence of actions
- Input Data: Gene IDs, search terms, or parameters used
- Error Messages: Exact text of any error messages
- Expected Behavior: What you expected to happen
- Browser Console: Any JavaScript errors (F12 → Console)
Response Time: We typically respond to bug reports within 24-48 hours and aim to fix critical issues within a week.
Additional Resources
We provide several educational resources to help you use CassavaDB effectively:
Available Resources
- Video Tutorials: Step-by-step guides for common analyses (coming soon)
- Webinar Recordings: Monthly webinars on CassavaDB features and cassava research
- Example Workflows: Complete analysis pipelines for different research questions
- Best Practices Guide: Recommendations for data interpretation and analysis
- Publication Gallery: Studies that used CassavaDB data
Educational Content
Coming Soon: Interactive tutorials, example datasets, and guided analysis workflows to help new users get started quickly.
Community Resources
- User Forum: Discussion platform for questions and tips (planned)
- Mailing List: Updates on new features and data releases
- Social Media: Follow us for news and research highlights
Please cite CassavaDB when using our data or tools in your research:
Primary Citation
Database Citation:
Author et al. (2025). CassavaDB: A comprehensive multi-omics database for cassava genomics and breeding. Plant Biotechnology Journal, XX(X), XXX-XXX. DOI: 10.1111/xxxxx
Dataset-Specific Citations
When using specific datasets, please also cite the original data sources:
- Genome Assembly: Cite the original genome publication
- RNA-Seq Data: Cite the original expression studies
- Variant Data: Cite the population genomics studies
- Metabolomic Data: Cite the original metabolomics publications
Reference List: Complete citations for all datasets are available on individual data pages and in the Reference & Publication section.
Acknowledgment Template
You may also include this acknowledgment: