Welcome to my blog, where I periodically post about data science and AI, do-it-yourself projects, and other tech-related topics.

Smith-Waterman Distance for feature extraction in NLP

I recently took part in a competition where the task was multi-label text classification. I started with a basic bag-of-words approach, which performed quite well. After analyzing the data a bit, I realized that some keywords came up in slightly different forms, which is unfavorable for bag of words because it only counts exact token matches. …
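To make the problem concrete, here is a minimal sketch of a Smith-Waterman local alignment score in plain Python. The function names, the scoring values (match = 2, mismatch = -1, gap = -1), and the normalization are assumptions chosen for illustration, not the setup used in the competition.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    The scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def keyword_similarity(keyword: str, text: str) -> float:
    """Hypothetical feature: score divided by the best score the keyword
    could reach (an exact occurrence in the text gives 1.0)."""
    if not keyword:
        return 0.0
    # 2 is the default match score used by smith_waterman_score
    return smith_waterman_score(keyword.lower(), text.lower()) / (2 * len(keyword))


print(keyword_similarity("email", "Please send an e-mail"))
```

An exact token match would treat "email" and "e-mail" as unrelated, while the alignment score only pays one gap penalty for the hyphen, so the similarity stays close to 1.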