Human Metabolome Database (HMDB)

Introduction to the Human Metabolome Database (HMDB)

The Human Metabolome Database (HMDB) is a freely available, comprehensive electronic database that serves as a central repository for detailed information on small molecule metabolites found in the human body. Its primary purpose is to support research and applications in the fields of metabolomics, clinical chemistry, biomarker discovery, and general education. By integrating chemical, clinical, and biological data, the HMDB provides a foundational resource for scientists and clinicians studying human metabolism.

Please click the link to access the Human Metabolome Database (HMDB).

Data Content of HMDB

The HMDB contains 220,945 metabolite entries, encompassing both water-soluble and lipid-soluble compounds. Beyond the metabolites themselves, the database links these entries to 8,610 protein sequences, including the enzymes that produce or break down the metabolites and the transporters that move them. Each metabolite entry in the HMDB is called a MetaboCard, and it functions as a detailed digital passport for that molecule. A single MetaboCard contains up to 130 structured data fields, covering the compound's chemical structure, biological roles, associated diseases, concentrations in body fluids, and spectral data for identification. Structured data fields are predefined, labeled categories (such as "Chemical Formula" or "Disease Association") that organize information in a uniform, computer-readable format. In a MetaboCard, this structure allows each data field for a metabolite to be stored consistently and retrieved quickly, which enables database searches and automated analysis. Approximately two-thirds of this information is chemical and clinical data, while the remaining one-third covers enzymatic and biochemical data. Most fields are hyperlinked to other authoritative databases such as KEGG, PubChem, UniProt, and GenBank.
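To make the idea of structured data fields concrete, here is a minimal Python sketch. The class name MetaboCardSketch and its four field names are hypothetical and chosen only for illustration; they are not the HMDB's actual schema, and a real MetaboCard has far more fields. The values are the glucose details used later in this article.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class MetaboCardSketch:
    """A tiny, hypothetical subset of the labeled fields in one entry."""
    accession: str
    name: str
    chemical_formula: str
    disease_associations: list = field(default_factory=list)


card = MetaboCardSketch(
    accession="HMDB0000122",
    name="D-Glucose",
    chemical_formula="C6H12O6",
    disease_associations=["Diabetes mellitus"],
)

# Every record carries the same labeled fields, so a search is a plain filter
# over one field rather than a scan of free text.
records = [card]
hits = [r.accession for r in records if r.chemical_formula == "C6H12O6"]
print(hits)

# The same record converts directly to a machine-readable mapping.
print(asdict(card)["name"])
```

Because the labels are fixed in advance, the same filter works unchanged for every record, which is what makes field-by-field searching possible.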

The HMDB Suite of Integrated Databases

The HMDB is the flagship database within a larger, interconnected suite of resources, each with a specialized focus:

DrugBank: Provides equivalent information on approximately 2,832 drugs and 800 drug metabolites. It is a comprehensive, freely accessible online database that combines detailed drug data with extensive drug target information. Often described as a "pharmacopeia of the future," it serves as a bioinformatics resource for pharmacology, drug discovery, and precision medicine by linking chemical, pharmaceutical, genetic, and molecular data in one place.
T3DB (The Toxin and Toxin Target Database): Contains data on about 3,670 common toxins and environmental pollutants. It is a freely accessible bioinformatics resource dedicated to toxins and their biological targets. It covers toxic substances ranging from environmental chemicals and pesticides to venoms and microbial toxins, and it systematically documents how they interact with the human body at the molecular level.
SMPDB (The Small Molecule Pathway Database): Houses pathway diagrams for approximately 132,335 human metabolic, drug, and disease pathways, plus thousands more for other organisms. It is the visually oriented component of the HMDB suite. Its primary purpose is to map and illustrate the network of biochemical pathways within the human body, with a focus on the small molecules at the center of these processes.
FooDB: A comprehensive resource on roughly 70,000 food components and food additives. It is described as the world's largest freely available database dedicated to the chemistry and biology of food. It covers both nutrients and flavor compounds, cataloging the thousands of biochemicals found in natural and processed food products.

Navigation Tools in HMDB

The HMDB offers a variety of user-friendly tools to access its data:

Browse Function: Provides a tabular overview of the database, allowing users to casually scroll, sort, and click on any entry for full details. A key feature under this menu is the Biospecimen link, which lists normal and abnormal metabolite concentrations for 23 different biological sample types (e.g., blood, urine, cerebrospinal fluid).
Text and Chemical Search: Users can perform simple or advanced text queries. The ChemQuery tool allows searching by drawing a chemical structure or entering a SMILES string.
Sequence Search: Supports BLAST searches against the database's collection of over 8,000 gene and protein sequences.
Advanced Search: An easy-to-use yet powerful relational query tool that allows for complex searches across multiple data fields simultaneously.

Specialized Spectral Search Tools (for Experimental Data)

A defining strength of the HMDB is its direct support for experimental metabolomics data analysis:

MS Search: Users can submit mass spectral files (in MoverZ format) to search against HMDB's library of LC-MS/MS spectra. This is used to identify metabolites from complex mixtures.
NMR Search: Users can submit peak lists from 1D or 2D NMR spectra (such as 1H, 13C, TOCSY, or HSQC) to compare against HMDB's NMR spectral libraries for metabolite identification.

Other Resources and Information in HMDB

Downloads section: This section provides bulk access to sequence, image, and text files from the database.
Human Metabolome Library (HML): A linked service where researchers around the world can order, for a fee, physical samples of metabolite standards for use in their own experiments.
About Section: The About menu provides links to valuable metadata, including detailed database statistics and descriptions of the sources used to assemble the HMDB.

Methodology to use the Human Metabolome Database (HMDB)

Here is a step-by-step workflow for using HMDB effectively, illustrated with a practical example. This structured workflow turns HMDB from a simple lookup table into a discovery platform for metabolomics research.

Case Scenario/research question: Mass spectrometry of a patient's blood plasma sample detects an unknown compound with a mass of 180.0634 Da. Identify this sugar-related metabolite and its biological role.

Steps to navigate Human Metabolome Database (HMDB)

Step 1: Define Your Goal and Choose the Appropriate Search Tool

First, determine what you already know about your compound and select the HMDB tool designed for that type of data.

What You Know	Recommended HMDB Tool
Name or Descriptive Text (e.g., "glucose", "ketone body")	Text Query (Simple or Advanced)
Chemical Structure (drawn or as SMILES)	ChemQuery Structure Search
Mass Spectrometry (MS) Data (peak list or spectrum file)	MS Search
Nuclear Magnetic Resonance (NMR) Data (peak list)	1D or 2D NMR Search
Protein/Gene Sequence related to the metabolite	Sequence Search (BLAST)
Multiple Criteria (e.g., mass range + disease association)	Advanced Search

For the given example: We have an accurate mass from an MS experiment. The most direct and powerful approach is to use the MS Search. Alternatively, we could start with an Advanced Search using the mass value.

Step 2: Execute the Search and Filter Results

Navigate to the chosen search tool from the HMDB homepage and input your data.

Workflow A: Using MS Search (Direct Spectral Matching)

Click "Search" → "MS Search".
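Upload your mass spectrum file (in a supported format such as .mzXML or .mzML), or enter the mass value (180.0634) and indicate how it should be interpreted: as a neutral mass, or as an observed m/z with its adduct type (e.g., [M+H]+, [M+Na]+). In this scenario, 180.0634 Da is treated as the neutral monoisotopic mass.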
Set tolerance parameters (e.g., 5 ppm mass accuracy).
Run the search. The database compares your data against its library of experimental and theoretical spectra.
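The relationship between a neutral mass and the m/z of its common adducts is simple arithmetic, and it helps to check it before entering values. The sketch below is a stand-alone illustration, not part of any HMDB tool. The function name adduct_mz and the small adduct table are assumptions made for this example; it covers singly charged ions only.

```python
# Physical constants in daltons (Da), rounded to six decimals.
PROTON = 1.007276      # mass of a proton
ELECTRON = 0.000549    # mass of an electron
SODIUM = 22.989770     # monoisotopic mass of a neutral sodium atom

# Mass shift added to the neutral molecule M for each singly charged adduct.
ADDUCT_SHIFT = {
    "[M+H]+": PROTON,
    "[M+Na]+": SODIUM - ELECTRON,  # a sodium cation is a sodium atom minus one electron
    "[M-H]-": -PROTON,
}


def adduct_mz(neutral_mass, adduct):
    """Return the expected m/z of a singly charged adduct of a neutral mass."""
    return neutral_mass + ADDUCT_SHIFT[adduct]


neutral = 180.0634  # mass from the case scenario
for adduct in ADDUCT_SHIFT:
    print(adduct, round(adduct_mz(neutral, adduct), 4))
# [M+H]+ 181.0707
# [M+Na]+ 203.0526
# [M-H]- 179.0561
```

If the instrument reports 180.0634 as an ion m/z rather than a neutral mass, the corresponding neutral mass differs by the adduct shift, so the interpretation chosen in the search form changes which metabolites match.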

Workflow B: Using Advanced Search (Query by Properties)

Click "Search" → "Advanced Search".
In the query builder:

Select "Monoisotopic Molecular Weight".
Set the operator to "Between" and enter 180.0594 and 180.0674 (allowing ± 0.004 Da/~22 ppm tolerance).
(Optional) Add a second criterion: "Biospecimen" → "Contains" → "Blood".

Run the query.
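The mass window and its ppm equivalent can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only. The helper names ppm_to_da, da_to_ppm, and mass_window are made up for this example.

```python
def ppm_to_da(mass, ppm):
    """Convert a relative tolerance in ppm to an absolute tolerance in Da."""
    return mass * ppm / 1e6


def da_to_ppm(mass, delta_da):
    """Convert an absolute tolerance in Da to a relative tolerance in ppm."""
    return delta_da / mass * 1e6


def mass_window(mass, ppm):
    """Return the (low, high) search bounds for a mass and a ppm tolerance."""
    delta = ppm_to_da(mass, ppm)
    return mass - delta, mass + delta


mass = 180.0634

# The +/- 0.004 Da window used in Workflow B, expressed in ppm.
print(round(da_to_ppm(mass, 0.004), 1))   # 22.2

# The bounds for the 5 ppm tolerance mentioned in Workflow A.
low, high = mass_window(mass, 5)
print(round(low, 4), round(high, 4))      # 180.0625 180.0643
```

A tighter window returns fewer candidates, so it is worth matching the tolerance to the actual mass accuracy of the instrument.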

Example Result: Both searches should return a primary, high-confidence hit: HMDB0000122 - D-Glucose. You will also see isomers such as galactose, which have the same mass; if spectral matching is used, they are ranked lower.
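Isomers appear in the results because glucose and galactose share the formula C6H12O6, and a monoisotopic mass depends only on the formula. The sketch below recomputes that mass from the formula. The function monoisotopic_mass is a hypothetical helper written for this example, and it handles only simple formulas without parentheses or charges.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
}


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C6H12O6'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


# Glucose and galactose are both C6H12O6, so one calculation covers both.
print(round(monoisotopic_mass("C6H12O6"), 4))   # 180.0634
```

Since mass alone cannot separate compounds with the same formula, MS/MS or NMR spectra are needed to tell such isomers apart.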

Step 3: Analyze the MetaboCard Entry

Click on the HMDB ID (e.g., HMDB0000122) to open the full MetaboCard. This is the core information hub. Systematically review the following sections:

Chemical Data: Confirm structure, formula (C₆H₁₂O₆), and chemical properties.
Clinical Data: This is crucial for our example:

"Normal Concentrations": See that normal fasting blood glucose is ~4.0-5.5 mM.
"Abnormal Concentrations": Find direct links to hyperglycemia and diabetes mellitus. This confirms the clinical relevance of your finding.

Biological Data:

Pathways: Click the "Pathway Browser" link or see the "Pathway Name" field. You'll find glucose in central pathways like "Glycolysis", "Gluconeogenesis", and "Insulin signaling".
Associated Enzymes/Genes: See links to proteins like hexokinase (HK1, HK2) and glucose transporters (SLC2A1/GLUT1, SLC2A4/GLUT4).
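Concentrations given in mM often need to be compared with clinical reports that use mg/dL. The sketch below shows that unit conversion for the normal fasting range quoted above. The function name mm_to_mg_per_dl is an assumption for this example, and the molar mass used is the average molar mass of glucose (about 180.156 g/mol), not the monoisotopic mass used for MS matching.

```python
GLUCOSE_MOLAR_MASS = 180.156  # average molar mass of C6H12O6 in g/mol


def mm_to_mg_per_dl(conc_mm, molar_mass):
    """Convert a concentration from mmol/L (mM) to mg/dL.

    mmol/L times g/mol gives mg/L; dividing by 10 gives mg/dL.
    """
    return conc_mm * molar_mass / 10


for conc in (4.0, 5.5):
    print(conc, "mM =", round(mm_to_mg_per_dl(conc, GLUCOSE_MOLAR_MASS), 1), "mg/dL")
# 4.0 mM = 72.1 mg/dL
# 5.5 mM = 99.1 mg/dL
```

The same function works for any metabolite once its own molar mass is supplied.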

Step 4: Leverage Connected Databases (The HMDB Suite)

Use the hyperlinks and integrated databases to expand your investigation.

Navigate to SMPDB for Pathways: From the glucose MetaboCard, click the link to the "Glycolysis" pathway in SMPDB. This provides a visual, interactive diagram showing glucose's position, connected metabolites, and regulating enzymes—offering immediate biological context.
Check DrugBank for Therapeutics: A quick search in the integrated DrugBank (via the suite menu) for "glucose" will show it is not only a metabolite but also a drug (e.g., dextrose injection) used to treat hypoglycemia.
Explore FooDB for Dietary Sources: Link to FooDB to see the natural abundance of glucose in thousands of foods, completing the picture from diet to metabolism.

Step 5: Access Experimental Data and External Validation

Utilize HMDB's specialized data for experimental validation and further linking.

Spectral Evidence: In the MetaboCard, examine the "MS/MS Spectra" and "NMR Spectra" sections. You can download these to compare directly with your own lab data for definitive confirmation.
External Databases: Use the hyperlinks in the "General References" and "Database Links" sections (e.g., PubChem, KEGG, ChEBI) to cross-validate information and access even more literature and data.

Step 6: Download and Conclude

For Reporting: Use the "Download" options to obtain high-resolution images of the chemical structure, pathway maps, and spectral data for your publication or report.
Final Synthesis for Our Example: You have successfully:
1. Identified the unknown peak as D-Glucose.
2. Confirmed that elevated levels are clinically significant for diabetes.
3. Mapped its role in key metabolic pathways.
4. Accessed its reference MS/NMR spectra for future comparisons.
5. Explored its connections to drugs and diet.

Table: Summary of differences between METLIN and HMDB

Category	METLIN	HMDB (Human Metabolome Database)
Primary purpose	Identify unknown compounds from MS data, especially MS/MS “fingerprints”	Comprehensive reference for human metabolites (chemical + biological + clinical context)
Best for	Mass spectrometry–based metabolite identification	Understanding what a metabolite means in humans (pathways, diseases, concentrations)
Typical user question it answers	“My LC–MS peak is at m/z X. What molecule is this?”	“This metabolite is X. Where is it found in the body? What is its normal level? What diseases is it linked to?”
Main data types	Accurate mass, adduct handling, MS/MS fragmentation spectra, chemical info	Metabolite records (“MetaboCards”), normal/abnormal concentrations, disease associations, pathways, proteins/enzymes, MS/NMR spectra
Biological scope	Broad metabolite/chemical entity coverage (not limited to humans)	Human-focused (human biofluids, tissues, clinical interpretation)
Strongest feature	Fast annotation/ID using MS/MS matching	Deep annotation: biology + clinical chemistry + pathways + reference ranges
Works well when you have…	LC–MS / MS/MS data and want identification	A metabolite name/ID and want biological meaning, pathways, clinical relevance
Typical output	List of candidate metabolites that match mass/MS/MS	Detailed metabolite “passport”: structure, levels in blood/urine, associated diseases, enzymes, pathways
Role in a metabolomics workflow	Identification/annotation step after peak detection	Interpretation/biological context step after identification
Key limitation (practical)	Mass-only matches can produce many candidates; MS/MS required for confidence	Human-centered; for pure MS/MS identification, you may still rely on METLIN-like matching and then use HMDB for context