For research use only. Not for use in diagnostic procedures.
The Waters MarkerLynx Application Manager for MassLynx Software addresses the problem of identifying multiple interrelated markers by looking for patterns in related LC-MS data sets.
Metabonomic studies interrogate biological fluids for changes in an organism’s metabolic profile induced by genetic modifications, toxicological insults, or environmental stimuli. These changes are often subtle and can be hidden by other sources of metabolic variation such as age, diurnal effects, and gender. A system perturbation very rarely changes the concentration of just one metabolite; more often it triggers a cascade of metabolic changes that is difficult to visualize. The Waters MarkerLynx Application Manager for MassLynx Software addresses the problem of identifying multiple interrelated markers by looking for patterns in related LC-MS data sets. The samples are categorized based on the similarity of their LC-MS signals. Based on this classification, interrelated markers can be quickly identified. These markers of interest can then be studied further by MS/MS or other analyses of choice.
The MarkerLynx Application Manager is an integrated software package that provides automated peak detection followed by principal components analysis (PCA) with direct links to the original chromatograms and ion intensity trend plots. In brief, PCA is a mathematical process to reduce multi-dimensional data sets down to coordinates that can be readily displayed in a graphical view. PCA plots reveal inter-sample relationships via their spatial proximity to one another, the trajectory of metabolic changes occurring, and the compounds responsible for such changes. The advantage of using such a statistical tool is that similar chromatographic and/or mass spectrometric data are revealed in clusters so that the marker ions that contribute to the variance can be readily discerned.
As an example, Figure 1 illustrates the use of MarkerLynx to distinguish between samples from male and female white mice based on the metabolites present in urine.
A simple data set consisting of three sample mixtures was created to illustrate the processing features of MarkerLynx. The first two mixtures were either aqueous (Mixture 1) or rat urine-based (Mixture 2) and were spiked at varying concentrations (sample-to-sample) with theophylline, caffeine, hippuric acid, benzoic acid, chlorpheniramine, nortriptyline, tolbutamide, and reserpine. The final mixture (Mixture 3) contained a constant concentration of theophylline, caffeine, hippuric acid, nortriptyline, and 4-nitrobenzoic acid in water. According to the experimental design, MarkerLynx should classify the samples into three distinct group clusters and identify the marker ions responsible for the group separation. Inspection of the MarkerLynx browser below demonstrates that three sample group clusters were indeed produced. Nortriptyline (m/z 264), theophylline (m/z 181), and reserpine (m/z 609) were correctly identified as marker ions.
There are four distinct steps that MarkerLynx carries out to conduct a sample set analysis: (1) chromatographic peaks are detected, (2) ion intensities are identified, (3) ions from detected peaks are retained in a data matrix, and (4) PCA is performed on the resulting data.
Multivariate analysis techniques are highly reliant upon correctly aligned data from both chromatographic and mass spectrometric peaks from sample to sample. LC-MS data that is coarsely binned and aligned (a common practice) will produce some potential marker metabolites, though much of the fine detail produced by a high mass accuracy, high resolution LC-MS system will be lost. In order to maximize the information generated from such a system (as with the Waters Metabonomics MS System), MarkerLynx utilizes the patented ApexTrack algorithm to perform accurate peak detection. ApexTrack takes the second derivative of a chromatogram (Figures 3 and 4) and locates the inflection points, local minima, and peak apex. After the apex is found for each peak, a retention time is assigned, and the data is correctly aligned.
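The idea of locating a peak apex from the second derivative can be illustrated with a minimal Python sketch. This is not the ApexTrack algorithm itself, whose details are not given here; it is a generic curvature-based apex finder run on a synthetic chromatogram. The function names, the finite-difference scheme, and the min_curvature threshold are all assumptions made for illustration.

```python
import math

def second_derivative(y):
    """Central-difference second derivative (per scan); endpoints are set to 0."""
    d2 = [0.0] * len(y)
    for i in range(1, len(y) - 1):
        d2[i] = y[i - 1] - 2.0 * y[i] + y[i + 1]
    return d2

def find_apexes(times, intensities, min_curvature=0.01):
    """Return retention times at local minima of the second derivative.

    Curvature is most negative at a peak apex. min_curvature is a
    hypothetical threshold that rejects noise-level dips.
    """
    d2 = second_derivative(intensities)
    apexes = []
    for i in range(1, len(d2) - 1):
        is_local_min = d2[i] <= d2[i - 1] and d2[i] < d2[i + 1]
        if is_local_min and d2[i] < -min_curvature:
            apexes.append(times[i])
    return apexes

# Synthetic chromatogram (illustrative only): Gaussian peaks at 2.0 and 5.0 min.
times = [i * 0.05 for i in range(161)]  # 0 to 8 min in 0.05 min steps
signal = [
    100.0 * math.exp(-((t - 2.0) ** 2) / (2 * 0.15 ** 2))
    + 60.0 * math.exp(-((t - 5.0) ** 2) / (2 * 0.20 ** 2))
    for t in times
]
apexes = find_apexes(times, signal)
print(apexes)
```

On real data the trace would normally be smoothed before differentiation, because a second derivative amplifies noise; that step is omitted here.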
After the chromatographic peaks are located, the ions associated with these peaks are analyzed. When an ion’s maximum intensity is found under the chromatographic peak, its retention time, ion intensity, and exact mass are captured (Figure 5). This process is repeated for all samples in the data set.
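One possible reading of this step, sketched in Python: an ion is captured when the maximum of its intensity trace falls inside the bounds of a detected chromatographic peak. The data layout (a dictionary mapping each m/z to its intensity per scan), the function name, and the numeric values are assumptions for illustration, not the MarkerLynx implementation.

```python
def ions_under_peak(scan_times, ion_traces, peak_start, peak_end):
    """Return (rt, mz, intensity) for ions whose trace maximizes inside the peak.

    ion_traces maps an m/z value to that ion's intensity in each scan.
    """
    captured = []
    for mz, trace in sorted(ion_traces.items()):
        # Index of the scan where this ion is most intense.
        i_max = max(range(len(trace)), key=trace.__getitem__)
        if peak_start <= scan_times[i_max] <= peak_end:
            captured.append((scan_times[i_max], mz, trace[i_max]))
    return captured

# Illustrative values only.
scan_times = [1.8, 1.9, 2.0, 2.1, 2.2, 2.3]
ion_traces = {
    181.07: [10.0, 80.0, 150.0, 70.0, 12.0, 5.0],  # maximizes at 2.0 min
    264.17: [5.0, 6.0, 8.0, 9.0, 30.0, 90.0],      # still rising at 2.3 min
}
captured = ions_under_peak(scan_times, ion_traces, peak_start=1.9, peak_end=2.1)
print(captured)  # [(2.0, 181.07, 150.0)]
```

Only the first ion is kept: the second is present under the peak but reaches its maximum outside it, so it would belong to a later peak.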
The ions found to maximize under the chromatographic peaks are assembled into a single data matrix (Figure 6). In order to be retained in the data matrix, an ion has to be present in at least two samples.
The matrix is filled with m/z and intensity information from sample 1 through sample N. Whenever an ion is detected, the ion list is first interrogated to determine whether it already exists. If the ion already exists in the matrix, the ion intensity is added to the corresponding location. If the ion does not exist, it is appended to the matrix, and a zero is inserted for each earlier sample in which the ion was not detected, so that the matrix remains valid. A typical data matrix is shown in Figure 7.
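The matrix-filling procedure described above can be sketched in Python as follows. How two detections are judged to be the same ion is not stated here, so the m/z and retention time tolerances (mz_tol, rt_tol) are assumptions, as are the function name and the example values. The rule that an ion must appear in at least two samples is taken from the text.

```python
def build_matrix(samples, mz_tol=0.02, rt_tol=0.1, min_samples=2):
    """Assemble detected ions into a marker-by-sample intensity matrix.

    samples: one list per sample of (rt, mz, intensity) tuples.
    Returns (markers, matrix), where matrix[k][s] is the intensity of
    marker k in sample s, or 0.0 if it was not detected there.
    """
    markers = []  # (rt, mz) of the first detection of each ion
    rows = []     # one intensity row per marker
    for s, ions in enumerate(samples):
        for rt, mz, intensity in ions:
            for k, (m_rt, m_mz) in enumerate(markers):
                if abs(mz - m_mz) <= mz_tol and abs(rt - m_rt) <= rt_tol:
                    rows[k][s] += intensity  # ion already in the list
                    break
            else:
                # New ion: append it, with zeros for every other sample,
                # including the earlier ones where it was not detected.
                row = [0.0] * len(samples)
                row[s] = intensity
                markers.append((rt, mz))
                rows.append(row)
    # Retain only ions present in at least min_samples samples.
    keep = [k for k, row in enumerate(rows)
            if sum(1 for v in row if v > 0) >= min_samples]
    return [markers[k] for k in keep], [rows[k] for k in keep]

# Illustrative detections for three samples (values are made up).
samples = [
    [(3.20, 181.07, 500.0), (7.85, 264.17, 1200.0)],
    [(3.21, 181.07, 450.0), (9.10, 609.28, 300.0)],
    [(7.86, 264.18, 1100.0), (9.11, 609.28, 280.0), (1.50, 123.04, 50.0)],
]
markers, matrix = build_matrix(samples)
for marker, row in zip(markers, matrix):
    print(marker, row)
```

The ion at m/z 123.04 is seen in only one sample and is therefore dropped, while the other three ions each receive a zero for the sample in which they were absent.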
The resulting data matrix is subjected to PCA by MarkerLynx. During PCA processing, the data matrix is examined to find combinations of the ion intensities that best describe the maximum variance in the data. PCA produces two plots: a scores plot and a loadings plot. The scores plot displays the inter-sample relationships in multi-dimensional hyperspace, with similar samples clustering together and dissimilar samples separated. The loadings plot describes the relationship between the measured variables; those contributing most to the variance lie furthest from the origin of the plot. A typical LC-MS data matrix is shown below in Figure 7.
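To make the relationship between scores and loadings concrete, here is a small pure-Python PCA sketch. It assumes mean-centering only (no scaling) and extracts components by power iteration on the covariance matrix; the scaling and numerical method used by MarkerLynx are not stated here, so both choices are assumptions. Rows are samples and columns are ion intensities, and the data values are invented.

```python
import math

def pca(data, n_components=2, n_iter=200):
    """PCA by power iteration; rows are samples, columns are ion intensities.

    Returns (scores, loadings, variances).
    """
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    loadings, variances = [], []
    for _component in range(n_components):
        # Fixed, non-uniform start vector (an arbitrary choice).
        v = [float(j + 1) for j in range(p)]
        scale = math.sqrt(sum(c * c for c in v))
        v = [c / scale for c in v]
        for _step in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0.0:
                break
            v = [c / norm for c in w]
        # Rayleigh quotient: the variance explained along v.
        eigval = sum(v[a] * cov[a][b] * v[b]
                     for a in range(p) for b in range(p))
        loadings.append(v)
        variances.append(eigval)
        # Deflate so the next pass finds the next component.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(p)]
               for a in range(p)]
    scores = [[sum(r[j] * v[j] for j in range(p)) for v in loadings]
              for r in centered]
    return scores, loadings, variances

# Six samples in three groups, three ions (illustrative values).
data = [
    [2000.0, 50.0, 40.0], [1900.0, 60.0, 30.0],   # group A
    [100.0, 900.0, 50.0], [120.0, 950.0, 45.0],   # group B
    [90.0, 70.0, 400.0], [110.0, 50.0, 420.0],    # group C
]
scores, loadings, variances = pca(data)
for sample_scores in scores:
    print([round(s, 1) for s in sample_scores])
```

Plotting the two score columns against each other gives the scores plot, in which samples from the same group fall close together. Plotting the two loading vectors against each other gives the loadings plot, in which the ion with the largest loading on the first component (here the first column) lies furthest from the origin along that axis.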
The MarkerLynx Application Manager for MassLynx Software accurately detects ions of interest from an MS sample set by extracting the relevant data in four automated processing steps: (1) chromatographic peak detection, (2) ion intensity identification, (3) data matrix construction, and (4) PCA on the resulting data set. The seamless integration of the data reduction, integration, and statistical analysis of the LC-MS raw data allows for the rapid identification of biomarkers of disease, toxicity, or genetic modification. MarkerLynx eliminates the need for simplistic data binning and retains the information-rich nature of the data produced by LC-MS, without the need for significant (and often error-prone) time-consuming manual data processing steps.
720001056, November 2004