BioContext is a text mining system for extracting information about molecular processes in biomedical articles.

The data extracted by BioContext give an overview of the range of biomolecular processes relating to a particular gene or anatomical location.

BioContext is the subject of the following papers:

The following PhD thesis also includes an extensive analysis of BioContext data, identifying instances where statements in different papers contrast or directly conflict with each other:

For questions, suggestions or bug reports, please contact Martin Gerner or Farzaneh Sarafraz.

BioContext dataset query interface

Event type: Gene expression, Protein catabolism, Positive regulation, Negative regulation


Data Downloads

All data were extracted from the complete MEDLINE (2011 baseline files) and the open-access subset of PubMed Central (as downloaded in May 2011). A row filled with NULL values indicates the absence of data: for example, a NULL row in the gene table for PMID 123456 means that no genes were found in that document.
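
When processing the downloaded tables, it can help to separate the NULL placeholder rows from rows that carry actual data. The following is a minimal sketch only. It assumes a tab-separated file in which the first column is the PMID and absent values are written as the literal string NULL; the column layout, the function name split_rows and the sample values are hypothetical and should be adapted to the actual files.

```python
import csv
import io

# Assumption: absent values appear as the literal string "NULL".
NULL_TOKEN = "NULL"


def split_rows(lines, delimiter="\t"):
    """Split rows into (data_rows, empty_pmids).

    A row counts as empty when every field after the PMID column is NULL,
    meaning nothing was found for that document.
    """
    data_rows = []
    empty_pmids = []
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        pmid, fields = row[0], row[1:]
        if fields and all(field == NULL_TOKEN for field in fields):
            empty_pmids.append(pmid)
        else:
            data_rows.append(row)
    return data_rows, empty_pmids


# Made-up sample in the assumed layout: PMID, gene identifier, mention text.
sample = io.StringIO(
    "123456\tNULL\tNULL\n"
    "234567\tgene_id_1\tmention text\n"
)
rows, empty = split_rows(sample)
```

Here empty holds the PMIDs of documents for which nothing was found, and rows holds the remaining records. For a real download, pass an open file handle in place of the sample.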

Source code and binary downloads



Last updated: February 12th, 2012.