Quick Start

Installation

First clone the alpinac repository: https://gitlab.com/empa503/atmospheric-measurements/alpinac

Then, from the root of the cloned repository, install with pip:

pip install .

This should install alpinac and all the requirements.

Usage

From Python

The main entry point of alpinac is the function alpinac.mode_identification.main.make_identification. It can be called directly from Python by passing the appropriate arguments.

The function make_identification must be imported from alpinac using:

from alpinac.mode_identification.main import make_identification

Command Line

If you have a file containing the fragments to analyze, you can use the command line interface of alpinac:

python -m alpinac.mode_identification <filename> [--target-elements <elements> --target-formula <formula>  --verbose-knapsack]

You can see more details on the command line usage using:

python -m alpinac.mode_identification --help
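
When several fragment files have to be processed, it can be convenient to assemble this command from a script. The following is a minimal sketch and not part of alpinac: the helper names build_alpinac_command and run_alpinac and the file name my_fragments.txt are hypothetical, and only the flags shown above are used. The expected value format for --target-elements and --target-formula is not described here, so the strings are passed through unchanged; the --help output is the reference for that.

```python
import subprocess
import sys


def build_alpinac_command(fragment_file, target_elements=None,
                          target_formula=None, verbose_knapsack=False):
    """Assemble the argument list for the alpinac command line interface.

    Hypothetical helper: it only mirrors the flags documented above.
    """
    cmd = [sys.executable, "-m", "alpinac.mode_identification", str(fragment_file)]
    if target_elements is not None:
        cmd += ["--target-elements", target_elements]
    if target_formula is not None:
        cmd += ["--target-formula", target_formula]
    if verbose_knapsack:
        cmd.append("--verbose-knapsack")
    return cmd


def run_alpinac(fragment_file, **options):
    """Run the identification; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_alpinac_command(fragment_file, **options), check=True)


# "my_fragments.txt" is a placeholder file name.
command = build_alpinac_command("my_fragments.txt", verbose_knapsack=True)
```

Using sys.executable ensures that the same Python environment in which alpinac was installed is used to run the module.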

Input data

The required data is a file containing the fragments to analyze.

Input data are text files with the following columns:

RT: Retention time in seconds (unused)
mass: Measured mass, m/z
mass_u_ppm: Uncertainty of the measured mass (measurement noise), ppm, 1 sigma
mass_cal_u_ppm: Uncertainty of the mass calibration, ppm, 1 sigma
area: Measured signal intensity
area_u: Uncertainty of the measured intensity (unused)
peak_width: Width of the mass peak, 1 sigma
peak_alpha: Fraction of Lorentzian peak shape; use zero if your peak is Gaussian
compound_bin: Index of the time bin; all co-eluting masses should have the same value
LOD: Limit of detection for the measured mass (unused)
SN: Signal-to-noise ratio (unused)
Ionisation: Ionisation type, either EI or CI
Adduct: Molecule used as the adduct; required for CI
Spectrum_id: ID of the spectrum the peak belongs to (needed if more than one spectrum is analyzed)

A few columns are not used yet but may be taken into account in future developments.

Example input files can be found in the GitLab repository: https://gitlab.com/empa503/atmospheric-measurements/alpinac/-/tree/master/data/nontarget_screening/fragments
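
The column layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper write_fragments is hypothetical, the tab delimiter and the header row are guesses (compare with the example files in the repository before relying on them), the empty Adduct value for EI is an assumption, and all numeric values are invented.

```python
import csv
import io

# Column names in the order listed above.
COLUMNS = [
    "RT", "mass", "mass_u_ppm", "mass_cal_u_ppm", "area", "area_u",
    "peak_width", "peak_alpha", "compound_bin", "LOD", "SN",
    "Ionisation", "Adduct", "Spectrum_id",
]


def write_fragments(rows, handle, delimiter="\t"):
    """Write fragment rows (dicts keyed by column name) as delimited text.

    Assumption: a tab-delimited file with a header row.
    """
    writer = csv.DictWriter(handle, fieldnames=COLUMNS,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Values shared by both peaks; every number here is made up.
common = {
    "RT": 1500.0, "mass_u_ppm": 5.0, "mass_cal_u_ppm": 2.0, "area_u": 0.0,
    "peak_width": 0.004, "peak_alpha": 0.0, "LOD": 0.0, "SN": 0.0,
    "Ionisation": "EI", "Adduct": "", "Spectrum_id": 0,
}

# Two co-eluting peaks, so both carry the same compound_bin.
example_rows = [
    {**common, "mass": 68.9947, "area": 12000.0, "compound_bin": 1},
    {**common, "mass": 49.9963, "area": 3500.0, "compound_bin": 1},
]

buffer = io.StringIO()
write_fragments(example_rows, buffer)
fragment_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch free of side effects; passing an open file handle instead writes the same text to disk.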

Reading the output

With the command line interface, the output is saved in a folder containing many files. With the Python interface, the output is returned as a dictionary of dedicated result classes. This section describes only the output of the command line interface; the Python objects are described in the API section.

The identification produces many results, and each compound from the input file is saved separately. The most important results are the following:

The most likely molecular ions, with assigned probabilities: most_likely_mol_ions.txt
The identification of each individual fragment in the file: results_file_mygu.txt, containing the following columns:

Percentage of signal for the chemical formula, compared to the sum of measured signal
Retention time
Percentage of signal compared to the largest peak - the largest peak has a value of 100.
Measured mass, m/z
Reconstructed exact mass, m/z
Combined uncertainty of the measured mass as computed from Eq. (1) in the paper, ppm
Mass difference between measured and exact masses, ppm
Reconstructed chemical formula
DBE (double bond equivalent)
Absolute intensity assigned to this chemical formula
Fraction of the measured peak assigned to this chemical formula
Ionisation type
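
Several of these columns are simple derived quantities. The sketch below shows how such values are commonly computed; it is not taken from the alpinac source. The function names are hypothetical, the sign convention of the ppm difference (measured minus exact) is an assumption, and the DBE uses the standard formula for formulas made of C, H, N and halogens, which may differ from what alpinac does for other elements.

```python
def ppm_difference(measured_mass, exact_mass):
    """Mass difference in ppm, taken as measured minus exact (assumed sign)."""
    return (measured_mass - exact_mass) / exact_mass * 1e6


def percent_of_total(intensities):
    """Each intensity as a percentage of the summed signal."""
    total = sum(intensities)
    return [100.0 * value / total for value in intensities]


def percent_of_largest(intensities):
    """Each intensity as a percentage of the largest peak (largest = 100)."""
    largest = max(intensities)
    return [100.0 * value / largest for value in intensities]


def double_bond_equivalent(c=0, h=0, n=0, halogens=0):
    """Standard DBE: C - (H + X) / 2 + N / 2 + 1.

    Fragments with an odd number of monovalent atoms give half-integer values.
    """
    return c - (h + halogens) / 2 + n / 2 + 1


# Invented intensities for three fragments of one compound.
areas = [50.0, 200.0, 100.0]
relative_to_largest = percent_of_largest(areas)
relative_to_total = percent_of_total(areas)

# Benzene (C6H6) as a familiar check of the DBE formula.
dbe_benzene = double_bond_equivalent(c=6, h=6)
```

Under these definitions the percentages relative to the total always sum to 100, while the percentages relative to the largest peak do not.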