Many of the criticisms voiced in this thread stem from a lack of expertise in biomedical Natural Language Processing and text mining.

Various annotated datasets and models already exist in the field; they can extract potentially useful information and be used in downstream tasks for targeted document and information retrieval. Biomedical text mining [1] is a wide subfield with plenty of open datasets and competitions, such as the bi-annual ACL BioNLP workshop [2].

- Biomedical Named Entity Recognition: extract names of proteins, drugs, diseases, symptoms, etc. and classify their biomedical category [3]. Extracting symptom terms is crucial for document discovery, modeling, and knowledge-base creation. Several open datasets can be found here [4].

- Biomedical relation and event extraction: Traditionally focused on extracting protein-protein interactions, which are crucial for virtually every process in a living cell. Information about these interactions provides the foundation for new therapeutic approaches. Recently, interest has shifted to the extraction of complex relations such as biomolecular events [2]. These methods can detect and classify the causal relations between the genes and proteins in a sentence like "TNF-alpha is a rapid activator of IL-8 gene expression by...".

- Document retrieval: Helping researchers and medical staff find relevant topic-specific papers by improving search with topic modeling, document similarity, named entities, etc.
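To make the first two tasks a bit more concrete, here is a toy sketch in Python. It is not how real systems work: actual biomedical NER and event extraction rely on models trained on annotated corpora, not lookups. The lexicon `LEXICON`, the trigger list `TRIGGERS`, the entity categories, and both function names are made up for illustration; only the example sentence is taken from above.

```python
import re

# Hypothetical mini-lexicon: surface form -> entity category.
LEXICON = {
    "TNF-alpha": "PROTEIN",
    "IL-8": "GENE",
}

# Hypothetical trigger words mapped to a relation label.
TRIGGERS = {
    "activator": "ACTIVATES",
    "inhibitor": "INHIBITS",
}


def find_entities(sentence):
    """Return (start, end, text, category) for each lexicon match, in text order."""
    hits = []
    for term, category in LEXICON.items():
        for m in re.finditer(re.escape(term), sentence):
            hits.append((m.start(), m.end(), term, category))
    return sorted(hits)


def find_relations(sentence):
    """Emit (source, label, target) when a trigger word sits between two adjacent entities."""
    entities = find_entities(sentence)
    relations = []
    for left, right in zip(entities, entities[1:]):
        between = sentence[left[1]:right[0]].lower()
        for trigger, label in TRIGGERS.items():
            if re.search(r"\b" + re.escape(trigger) + r"\b", between):
                relations.append((left[2], label, right[2]))
    return relations


sentence = "TNF-alpha is a rapid activator of IL-8 gene expression"
entities = find_entities(sentence)
relations = find_relations(sentence)
print(entities)
print(relations)
```

On that sentence the relation list comes out as `[('TNF-alpha', 'ACTIVATES', 'IL-8')]`. A dictionary lookup like this breaks on synonyms, abbreviations, and negation, which is exactly why the annotated datasets mentioned above matter.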

These are only some examples of common biomedical text mining tasks; there are plenty more. Of course, relying on previously annotated data is an issue, because the tagged categories might not be relevant to many of the issues related to COVID-19. However, even unsupervised modeling, such as using SciBERT to create topic models or clusters of related documents, can be helpful for scientific discovery.
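As a rough illustration of the unsupervised route, the sketch below ranks documents by similarity without any labels. Note the assumption: it uses plain TF-IDF vectors rather than SciBERT embeddings so that it runs with only the standard library. With a real encoder you would replace `tfidf_vectors` and keep the cosine ranking. The three "abstracts" and all function names are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Map each document to a sparse {term: weight} TF-IDF vector."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Number of documents each term appears in.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(docs, index):
    """Return indices of the other documents, most similar first."""
    vectors = tfidf_vectors(docs)
    others = [i for i in range(len(docs)) if i != index]
    return sorted(others, key=lambda i: cosine(vectors[index], vectors[i]), reverse=True)


# Invented toy abstracts, not real papers.
docs = [
    "coronavirus spike protein binds the ACE2 receptor",
    "spike protein mutations alter ACE2 receptor binding",
    "topic models for clustering scientific abstracts",
]
ranking = most_similar(docs, 0)
print(ranking)
```

Here the second abstract ranks above the third for the first one, since they share terms like "spike" and "ACE2" while the third shares none. The same pairwise similarities could feed a clustering step; the point is only that no annotated categories are needed.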

1. https://en.wikipedia.org/wiki/Biomedical_text_mining

2. https://aclweb.org/aclwiki/BioNLP_Workshop

3. https://www.hindawi.com/journals/cmmm/2015/571381/

4. http://gcancer.org/clstmdata/

5. https://bmcbioinformatics.biomedcentral.com/articles/10.1186...



