Welcome to Profoundly Mundane! This blog discusses engineering in the context of Natural Language Processing (NLP). While the resources and information available have grown greatly in the last few years, many 'simple' subjects remain under-discussed, which makes it hard to understand the trade-offs involved in selecting NLP methods, unlike in more established engineering fields. The goal of this blog is to explore the hidden depth within the seemingly mundane aspects of NLP and to make the different design choices more accessible.
Posts
- Deep dive: On the Theoretical Limitations of Embedding-Based Retrieval
- Temperature, Tokens, and Long Tales/Tails
- WIP: Using Landmarks to Extract Spans with Prompting
- Practical Tidbits: Taking a Magnifying Glass to (Text) Classifier Performance
- Work In Progress: LLMs for ETL
- Improving the NLP Tool Kit: Characterization
- Fun with Words: A Foray into Solving NYT Connections via Decomposition
- Fun With Words: NYT Connections
- Quick and Dirty Metric to Imperial Conversions (How to Entertain Yourself as an American Driving in a Metric Country)
- Negative Result: Improving Fixed Vocab Text Representations
- Practical Tidbits: To Pickle or Not to Pickle
- Practical Tidbits: Selecting MinHash Hyperparameters for Deduplication
- Practical Tidbits: ElasticSearch with custom Embeddings (Vectors) for Versions Greater than 7.6
- Original Work: “Nudging” Active Learning to Learn Minority Classes
- An In-depth Discussion of Textual Similarity: Taking a look at the toolkit
- A Dream: An Easy Way to Work with Documents and (implicitly) Structured Text
- An In-depth Discussion of Textual Similarity: Characteristics and When They Matter
- An In-depth Discussion of Textual Similarity: Starting the Conversation
Subscribe via RSS