April 2, 2024, 7:53 p.m. | Mengzhou Hu, Sahar Alkhairy, Ingoo Lee, Rudolf T. Pillich, Dylan Fong, Kevin Smith, Robin Bachelder, Trey Ideker, Dexter Pratt

cs.CL updates on arXiv.org arxiv.org

arXiv:2309.04019v2 Announce Type: replace-cross
Abstract: Gene set analysis is a mainstay of functional genomics, but it relies on curated databases of gene functions that are incomplete. Here we evaluate five Large Language Models (LLMs) for their ability to discover the common biological functions represented by a gene set, substantiated by supporting rationale, citations and a confidence assessment. Benchmarking against canonical gene sets from the Gene Ontology, GPT-4 confidently recovered the curated name or a more general concept (73% of cases), …

abstract analysis arxiv cs.ai cs.cl databases discovery evaluation five function functional functions gene genomics language language models large language large language models llms q-bio.gn q-bio.mn set type

Data Scientist (m/f/x/d)

@ Symanto Research GmbH & Co. KG | Spain, Germany

Data Engineer

@ Paxos | Remote - United States

Data Analytics Specialist

@ Media.Monks | Kuala Lumpur

Software Engineer III- Pyspark

@ JPMorgan Chase & Co. | India

Engineering Manager, Data Infrastructure

@ Dropbox | Remote - Canada

Senior AI NLP Engineer

@ Hyro | Tel Aviv-Yafo, Tel Aviv District, Israel