From documents to datasets: challenges and solutions in the context of IDMP and pharmacology

Industry

To obtain authorization to bring a medicinal product to market, 200,000 pages of text need to be submitted. The upcoming implementation of the IDMP directive (EU) forces pharma companies to submit datasets instead. This has an enormous impact and poses manifold challenges, many of which Semantic Web technology is well positioned to address. This presentation focuses on one of them. When the authorization for an existing product has to be renewed, an IDMP-compliant dataset has to be compiled. Some 70 to 80 percent of the data points are described in the text and cannot be obtained from IT systems. Manual data entry is error-prone and does not scale, since the total number of data points is estimated to exceed 1700 in some cases for a single submission. Based on state-of-the-art entity extraction software, a solution has been developed that generates those parts of the dataset that can be obtained from the text. The presentation describes some of the major challenges that had to be overcome and details the solutions that were found. It also presents some results and describes the major business requirements that need to be met.

Speakers:

Jan Voskuil

CEO

Taxonic
http://www.taxonic.com/en/

After obtaining a PhD in theoretical linguistics, Jan worked for several start-ups in the field of artificial intelligence. He has also worked as a senior solution architect at a major systems integrator and was involved in several large-scale, high-profile innovation programs.
