TY - JOUR
T1 - Allergenicity prediction of novel and modified proteins
T2 - Not a mission impossible! Development of a Random Forest allergenicity prediction model
AU - Westerhout, Joost
AU - Krone, Tanja
AU - Snippe, Almar
AU - Babé, Lilia
AU - McClain, Scott USRE
AU - Ladics, Gregory S.
AU - Houben, Geert GF
AU - Verhoeckx, Kitty CM
PY - 2019/10
Y1 - 2019/10
N2 - Alternative and sustainable protein sources (e.g., algae, duckweed, insects) are required to produce (future) foods. However, introduction of new food sources to the market requires a thorough risk assessment of nutritional, microbial and toxicological risks and potential allergic responses. Yet, the risk assessment of allergenic potential of novel proteins is challenging. Currently, guidance for genetically modified proteins relies on a weight-of-evidence approach. Current Codex (2009) and EFSA (2010; 2017) guidance indicates that sequence identity to known allergens is acceptable for predicting the cross-reactive potential of novel proteins, and that resistance to pepsin digestion and glycosylation status are used for evaluating de novo allergenicity potential. Other physicochemical and biochemical protein properties, however, are not used in the current weight-of-evidence approach. In this study, we have used the Random Forest algorithm to develop an in silico model that predicts the allergenic potential of a protein based on its physicochemical and biochemical properties. The final model contains twenty-nine variables, which were all calculated from the protein sequence by means of the ProtParam software and the PSIPred Protein Sequence Analysis program. Proteins were classified as allergenic when present in the COMPARE database. Results show a robust model performance with sensitivity, specificity and accuracy each ≥85%. As the model only requires the protein sequence for calculations, it can be easily incorporated into the existing risk assessment approach. In conclusion, the model developed in this study improves the predictability of the allergenicity of new or modified food proteins, as demonstrated for insect proteins.
AB - Alternative and sustainable protein sources (e.g., algae, duckweed, insects) are required to produce (future) foods. However, introduction of new food sources to the market requires a thorough risk assessment of nutritional, microbial and toxicological risks and potential allergic responses. Yet, the risk assessment of allergenic potential of novel proteins is challenging. Currently, guidance for genetically modified proteins relies on a weight-of-evidence approach. Current Codex (2009) and EFSA (2010; 2017) guidance indicates that sequence identity to known allergens is acceptable for predicting the cross-reactive potential of novel proteins, and that resistance to pepsin digestion and glycosylation status are used for evaluating de novo allergenicity potential. Other physicochemical and biochemical protein properties, however, are not used in the current weight-of-evidence approach. In this study, we have used the Random Forest algorithm to develop an in silico model that predicts the allergenic potential of a protein based on its physicochemical and biochemical properties. The final model contains twenty-nine variables, which were all calculated from the protein sequence by means of the ProtParam software and the PSIPred Protein Sequence Analysis program. Proteins were classified as allergenic when present in the COMPARE database. Results show a robust model performance with sensitivity, specificity and accuracy each ≥85%. As the model only requires the protein sequence for calculations, it can be easily incorporated into the existing risk assessment approach. In conclusion, the model developed in this study improves the predictability of the allergenicity of new or modified food proteins, as demonstrated for insect proteins.
KW - Allergenicity assessment
KW - Allergenicity prediction
KW - Food allergy
KW - Novel and modified proteins
KW - Random forest
UR - http://www.scopus.com/inward/record.url?scp=85069578492&partnerID=8YFLogxK
U2 - 10.1016/j.yrtph.2019.104422
DO - 10.1016/j.yrtph.2019.104422
M3 - Article
C2 - 31310847
AN - SCOPUS:85069578492
SN - 0273-2300
VL - 107
JO - Regulatory Toxicology and Pharmacology
JF - Regulatory Toxicology and Pharmacology
M1 - 104422
ER -