Web: https://www.reddit.com/r/LanguageTechnology/comments/xhw7bg/a_simple_tool_to_check_the_rarity_of_words_or/

Sept. 19, 2022, 12:01 a.m. | /u/milliondollarhaircut

Natural Language Processing reddit.com

Inspired by [this post](https://www.reddit.com/r/LanguageTechnology/comments/xgy617/any_easy_tool_to_cherry_pick_rare_words_from_text/) from yesterday, I wrote a script that ranks the words in an input string by rarity or, alternatively, returns only the rare words it contains.

[Here's the repo](https://github.com/cmwxyz/word-rarity), which contains a more in-depth description of how it works.
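To give a rough idea of the two modes (ranking vs. rare-only), here is a minimal sketch. It is not the code from the repo: the names `rank_by_rarity`, `rare_words`, and `max_count` are hypothetical, and `TOY_COUNTS` holds invented numbers used purely for illustration. It assumes rarity is judged by a simple word-count lookup, with unknown words treated as rarest.

```python
import re
from typing import Dict, List, Tuple

# Invented counts for illustration only; a real run would load counts
# from an actual corpus or published frequency list.
TOY_COUNTS: Dict[str, int] = {
    "the": 1000,
    "on": 800,
    "cat": 50,
    "sat": 40,
    "mat": 20,
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def rank_by_rarity(text: str, counts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (word, count) pairs for each unique word, rarest first.

    Words missing from `counts` get a count of 0, so they sort to the top.
    Ties are broken alphabetically to keep the output deterministic.
    """
    unique = set(tokenize(text))
    pairs = ((word, counts.get(word, 0)) for word in unique)
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))


def rare_words(text: str, counts: Dict[str, int], max_count: int) -> List[str]:
    """Return only the words whose count is at most `max_count`, rarest first."""
    return [word for word, count in rank_by_rarity(text, counts) if count <= max_count]


if __name__ == "__main__":
    sentence = "The cat sat on the zyzzyva mat"
    print(rank_by_rarity(sentence, TOY_COUNTS))
    print(rare_words(sentence, TOY_COUNTS, max_count=40))
```

The cutoff `max_count` is an arbitrary knob in this sketch; what counts as "rare" depends entirely on the frequency source and the threshold you pick.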

I wanted to turn this into a module, but that process is more involved than I realized, so it may be a few more days before I figure all that …

