Quick Start

Installation

First clone the alpinac repository: https://gitlab.com/empa503/atmospheric-measurements/alpinac

Then install it with pip from the root of the cloned repository:

pip install .

This should install alpinac together with all of its requirements.

Usage

From Python

The main entry point of alpinac is the function alpinac.mode_identification.main.make_identification. It can be called directly from Python by passing the appropriate arguments.

The function make_identification must be imported from alpinac using:

from alpinac.mode_identification.main import make_identification

Command Line

If you have a file containing the fragments to analyze, you can use the command line interface of alpinac:

python -m alpinac.mode_identification <filename> [--target-elements <elements> --target-formula <formula> --verbose-knapsack]

For more details on the command line options, run:

python -m alpinac.mode_identification --help
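
If you want to launch the command line interface from a script, the command shown above can be assembled as an argument list. The following is only a minimal sketch: the helper build_alpinac_command and the file name fragments.txt are hypothetical and not part of alpinac, and only the options listed above are used. Check the output of --help for the exact format expected by each option.

```python
import sys


def build_alpinac_command(filename, target_elements=None,
                          target_formula=None, verbose_knapsack=False):
    """Assemble the argument list for the alpinac command line interface.

    Hypothetical helper: it only mirrors the usage line of this page.
    """
    cmd = [sys.executable, "-m", "alpinac.mode_identification", filename]
    if target_elements is not None:
        cmd += ["--target-elements", target_elements]
    if target_formula is not None:
        cmd += ["--target-formula", target_formula]
    if verbose_knapsack:
        cmd.append("--verbose-knapsack")
    return cmd


# "fragments.txt" is a placeholder for your own fragment file.
cmd = build_alpinac_command("fragments.txt", verbose_knapsack=True)
print(" ".join(cmd))

# To actually run the identification (requires alpinac to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.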

Input data

The only required input is a file containing the fragments to analyze.

Input files are text files with the following columns:

  1. RT: Retention time in seconds (unused)

  2. mass: Measured mass, m/z

  3. mass_u_ppm: Uncertainty of the measured mass (measurement noise), ppm, 1 sigma

  4. mass_cal_u_ppm: Uncertainty of the mass calibration, ppm, 1 sigma

  5. area: Measured signal intensity

  6. area_u: Uncertainty of the measured intensity (unused)

  7. peak_width: Width of the mass peak, 1 sigma

  8. peak_alpha: Fraction of Lorentzian peak shape; use zero if your peak is Gaussian.

  9. compound_bin: Index of the time bin; all co-eluting masses should have the same value.

  10. LOD: Limit of detection for the measured mass (unused)

  11. SN: Signal-to-noise ratio (unused)

  12. Ionisation: Ionisation type, can be EI or CI

  13. Adduct: The molecule used as adduct, required for CI ionisation

  14. Spectrum_id: ID of the spectrum the fragment belongs to (if more than one spectrum is analyzed)

A few columns are not used yet but may be taken into account in future developments.

Some example files can be found in the GitLab repository: https://gitlab.com/empa503/atmospheric-measurements/alpinac/-/tree/master/data/nontarget_screening/fragments
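
As a rough illustration of the column layout described above, the sketch below parses one data row into a dictionary. It rests on several assumptions that are not confirmed by this page: that the columns are separated by whitespace, that the row is a data row and not a header, and that a placeholder such as None fills the Adduct column for EI data. The function parse_fragment_row and all numeric values are invented for the example and are not part of alpinac; compare with the example files in the repository before relying on it.

```python
# Column names in the order listed above.
COLUMNS = [
    "RT", "mass", "mass_u_ppm", "mass_cal_u_ppm", "area", "area_u",
    "peak_width", "peak_alpha", "compound_bin", "LOD", "SN",
    "Ionisation", "Adduct", "Spectrum_id",
]

# Columns holding floating point numbers.
FLOAT_COLUMNS = [
    "RT", "mass", "mass_u_ppm", "mass_cal_u_ppm", "area", "area_u",
    "peak_width", "peak_alpha", "LOD", "SN",
]


def parse_fragment_row(line):
    """Parse one whitespace-separated data row (assumed layout)."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(
            f"expected {len(COLUMNS)} columns, got {len(values)}")
    row = dict(zip(COLUMNS, values))
    for name in FLOAT_COLUMNS:
        row[name] = float(row[name])
    # compound_bin is an index shared by all co-eluting masses.
    row["compound_bin"] = int(row["compound_bin"])
    if row["Ionisation"] not in ("EI", "CI"):
        raise ValueError(f"unknown ionisation type: {row['Ionisation']}")
    return row


# Invented values, for illustration only.
example_line = ("1630.5 84.9651 5.0 2.0 1200.0 35.0 0.004 0.0 "
                "0 10.0 120.0 EI None 0")
row = parse_fragment_row(example_line)
print(row["mass"], row["compound_bin"], row["Ionisation"])
```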

Reading the output

With the command line interface, the output is saved in a folder containing many files. With the Python interface, the output is returned as a dictionary of dedicated output classes. Only the output of the command line interface is described here; the Python objects are described in the API section.

The identification produces many results, and each compound from the input file is saved separately. The most important outputs are the following:

  • The most likely molecular ions, with assigned probabilities: most_likely_mol_ions.txt

  • The identification of each individual fragment, in the file results_file_mygu.txt, which contains the following columns:

  1. Percentage of signal for the chemical formula, compared to the sum of measured signal

  2. Retention time

  3. Percentage of signal compared to the largest peak (the largest peak has a value of 100)

  4. Measured mass, m/z

  5. Reconstructed exact mass, m/z

  6. Combined uncertainty of the measured mass as computed from Eq. (1) in the paper, ppm

  7. Mass difference between measured and exact masses, ppm

  8. Reconstructed chemical formula

  9. DBE (double bond equivalent)

  10. Absolute intensity assigned to this chemical formula

  11. Fraction of the measured peak assigned to this chemical formula

  12. Ionisation type
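
Several of the columns listed above are simple derived quantities. The sketch below shows the usual textbook definitions, to help with reading the numbers. It is not taken from the alpinac source code: the function names are hypothetical, the sign convention of the ppm difference (measured minus exact) is an assumption, and the DBE formula is the standard one for formulas containing C, H, N and halogens, with divalent atoms such as O and S not contributing.

```python
def ppm_difference(measured_mass, exact_mass):
    """Mass difference in ppm (column 7), assumed measured minus exact."""
    return (measured_mass - exact_mass) / exact_mass * 1e6


def percent_of_total(intensities):
    """Share of each intensity in the summed signal, in percent (column 1)."""
    total = sum(intensities)
    return [100.0 * value / total for value in intensities]


def percent_of_largest(intensities):
    """Intensity relative to the largest peak, which gets 100 (column 3)."""
    largest = max(intensities)
    return [100.0 * value / largest for value in intensities]


def double_bond_equivalent(c=0, h=0, n=0, halogens=0):
    """Standard DBE formula: C - (H + X) / 2 + N / 2 + 1 (column 9)."""
    return c - (h + halogens) / 2 + n / 2 + 1


# Invented intensities, for illustration only.
areas = [50.0, 200.0, 150.0]
print(percent_of_total(areas))    # [12.5, 50.0, 37.5]
print(percent_of_largest(areas))  # [25.0, 100.0, 75.0]
print(double_bond_equivalent(c=6, h=6))  # 4.0, e.g. for C6H6
```

Note that fragment ions can have half-integer DBE values with this formula, for example 0.5 for a formula with one carbon and three halogen atoms.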