
NLP Town blog

Just some things that have kept us busy recently.

25 May 2021

Adventures in Zero-Shot Text Classification

Transfer learning has had an enormous impact in Natural Language Processing. Thanks to models like BERT, it is now possible to train more accurate NLP models than before, and typically do so with less labeled data. Now that finetuning language models has become the standard procedure in NLP, it’s only natural to get curious and ask: do we need any task-specific labeled training items at all? In this article, we investigate two available models for zero-shot text classification and evaluate...
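A common recipe for zero-shot classification recasts the task as natural language inference: each candidate label becomes a hypothesis such as "This text is about politics.", and the label whose hypothesis the text most strongly entails wins. A minimal sketch of that recipe, with a toy word-overlap scorer standing in for a real NLI model:

```python
import string

def entailment_score(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI model: the fraction of hypothesis words
    that also appear in the premise. A real zero-shot classifier would
    use a transformer fine-tuned on an NLI dataset instead."""
    def clean(s):
        return set(s.lower().translate(
            str.maketrans("", "", string.punctuation)).split())
    premise_words, hypothesis_words = clean(premise), clean(hypothesis)
    return len(premise_words & hypothesis_words) / len(hypothesis_words)

def zero_shot_classify(text: str, labels: list[str]) -> str:
    """Turn each candidate label into a hypothesis and return the label
    whose hypothesis scores highest against the text."""
    return max(labels,
               key=lambda lb: entailment_score(text, f"This text is about {lb}."))

zero_shot_classify("Apple unveiled new technology at its developer conference",
                   ["technology", "politics", "sports"])
```

The appeal of the setup is that the label set lives entirely in the hypotheses, so no task-specific training data is needed at all.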

by Yves Peirsman

16 August 2019

Distilling BERT Models with spaCy

Transfer learning is one of the most impactful recent breakthroughs in Natural Language Processing. Less than a year after its release, Google's BERT and its offspring (RoBERTa, XLNet, etc.) dominate most of the NLP leaderboards. While it can be a headache to put these enormous models into production, various solutions exist to reduce their size considerably. At NLP Town we successfully applied model distillation to train spaCy's text classifier to perform almost as well as BERT on sentiment analysis of...
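The core idea of distillation is simple enough to write down: instead of (or in addition to) the hard gold labels, the small student model is trained to reproduce the teacher's soft probability distribution. A sketch of that objective in plain Python; the actual setup described here, with a BERT teacher and a spaCy student, involves details not shown:

```python
import math

def distillation_loss(teacher_probs, student_probs, hard_label=None, alpha=0.5):
    """Cross-entropy of the student against the teacher's soft labels,
    optionally mixed with the usual hard-label loss via `alpha`."""
    soft = -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
    if hard_label is None:
        return soft
    hard = -math.log(student_probs[hard_label])
    return alpha * soft + (1 - alpha) * hard

# The teacher is fairly confident in class 0; the student roughly mimics it.
loss = distillation_loss([0.9, 0.1], [0.8, 0.2], hard_label=0)
```

The soft targets carry more information per example than a one-hot label, which is one reason a much smaller student can get close to the teacher.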

by Yves Peirsman

2 May 2018

Comparing Sentence Similarity Methods

Word embeddings have become widespread in Natural Language Processing. They allow us to easily compute the semantic similarity between two words, or to find the words most similar to a target word. However, often we're more interested in the similarity between two sentences or short texts. In this blog post, we'll compare the most popular ways of computing sentence similarity and investigate how they perform. For people interested in the code, there's a companion Jupyter Notebook with all the details....
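One baseline any such comparison includes is to average a sentence's word vectors and compare the averages with cosine similarity. A minimal sketch with made-up 3-dimensional vectors (real systems would use pretrained embeddings with hundreds of dimensions):

```python
import math

# Made-up 3-d vectors for illustration only.
VECS = {
    "cat": [1.0, 0.2, 0.0], "dog": [0.9, 0.3, 0.1],
    "sat": [0.1, 1.0, 0.2], "ran": [0.2, 0.9, 0.3],
    "mat": [0.0, 0.1, 1.0], "park": [0.1, 0.2, 0.9],
}

def sentence_vector(sentence):
    """Average the vectors of the known words: the simplest sentence embedding."""
    vecs = [VECS[w] for w in sentence.lower().split() if w in VECS]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(3)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

sim = cosine(sentence_vector("the cat sat"), sentence_vector("a dog ran"))
```

Averaging throws away word order entirely, which is exactly the kind of weakness a comparison of sentence similarity methods has to probe.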

by Yves Peirsman

31 January 2018

Why computers don’t yet read better than us

Artificial Intelligence is on a roll these days. It feels like the media report a new breakthrough every day. In 2017, computer and board games were at the center of public attention, but this year things look different. In the early days of 2018, both Microsoft and Alibaba claimed to have developed software that can read as well as humans do. Sensational headlines followed suit. CNN wrote that “Computers are getting better than humans at reading”, while Newsweek feared “Computers...

by Yves Peirsman

12 September 2017

Named Entity Recognition and the Road to Deep Learning

Not so very long ago, Natural Language Processing looked very different. In sequence labelling tasks such as Named Entity Recognition, Conditional Random Fields were the go-to model. The main challenge for NLP engineers consisted in finding good features that captured their data well. Today, deep learning has replaced CRFs at the forefront of sequence labelling, and the focus has shifted from feature engineering to designing and implementing effective neural network architectures. Still, the old and the new-style NLP are not...
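To see what "finding good features" meant in practice, here is the kind of hand-written feature function a CRF-based NER tagger typically relied on (the exact feature set is illustrative, not taken from the post):

```python
def token_features(tokens, i):
    """Hand-crafted features for the token at position i -- capitalization,
    suffix, and context words -- of the kind a CRF consumed before neural
    networks made this engineering largely unnecessary."""
    word = tokens[i]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isupper": word.isupper(),
        "suffix3": word[-3:],
        "prev.lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next.lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

feats = token_features(["Yves", "Peirsman", "founded", "NLP", "Town"], 1)
```

Neural sequence labellers learn representations like these directly from the data, shifting the engineering effort to the architecture itself.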

by Yves Peirsman

5 July 2017

Perplexed by Game of Thrones

N-grams have long been part of the arsenal of every NLPer. These fixed-length word sequences are not only ubiquitous as features in NLP tasks such as text classification, but have also long formed the basis of the language models underlying machine translation and speech recognition. However, with the advent of recurrent neural networks and the diminishing role of feature selection in deep learning, their omnipresence could quickly become a thing of the past. Are we witnessing the end of n-grams in Natural...
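The "perplexed" in the title is literal: perplexity is the standard measure of how well an n-gram language model predicts text. A minimal bigram model with add-one smoothing, assuming a toy corpus:

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Collect unigram and bigram counts for an add-one-smoothed bigram model."""
    tokens = ["<s>"] + corpus.lower().split()
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = set(tokens)
    return unigrams, bigrams, vocab

def perplexity(sentence, unigrams, bigrams, vocab):
    """Per-word perplexity of a sentence under the smoothed bigram model:
    lower means the model finds the sentence less surprising."""
    tokens = ["<s>"] + sentence.lower().split()
    logp = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + len(vocab))
        logp += math.log(p)
    return math.exp(-logp / (len(tokens) - 1))

model = train_bigram("winter is coming winter is here")
```

A sentence whose bigrams the model has seen ("winter is coming") gets a lower perplexity than the same words in an unseen order.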

by Yves Peirsman

10 April 2017

Anything2Vec, or How Word2Vec Conquered NLP

Word embeddings are one of the main drivers behind the success of deep learning in Natural Language Processing. Even technical people outside of NLP have often heard of word2vec and its uncanny ability to model the semantic relationship between a noun and its gender or the names of countries and their capitals. But the success of word2vec extends far beyond the word level. Inspired by this list of word2vec-like models, I set out to explore embedding methods for a broad...
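That "uncanny ability" is plain vector arithmetic: vec(king) - vec(man) + vec(woman) lands near vec(queen). A sketch with made-up vectors chosen so the analogy holds (real word2vec vectors are learned from large corpora):

```python
import math

# Made-up embeddings chosen for illustration, not learned vectors.
E = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
    "paris": [0.8, 0.1, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(E[a], E[b], E[c])]
    candidates = [w for w in E if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(E[w], target))

analogy("king", "man", "woman")
```

The same offset trick is what the "anything2vec" family of models exploits for entities beyond single words.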

by Yves Peirsman

26 January 2017

Understanding Deep Learning Models in NLP

Deep learning methods have taken Artificial Intelligence by storm. As their dominance grows, one inconvenient truth is slowly emerging: we don’t actually understand these complex models very well. Our lack of understanding leads to some uncomfortable questions. Do we want to travel in self-driving cars whose inner workings no one really comprehends? Can we base important decisions in business or healthcare on models whose reasoning we don’t grasp? It’s problematic, to say the least. In line with the general evolutions...

by Yves Peirsman

10 December 2016

DIY methods for sentiment analysis

In my previous blog post, I explored the wide variety of off-the-shelf solutions that are available for sentiment analysis. The variation in accuracy, both within and between models, raised the question of whether you're better off building your own model instead of trusting a pre-trained solution. I'll address that dilemma in this blog post. Training your own sentiment analysis model obviously brings its own challenges. First, you need a large number of relevant texts that have been labelled...
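Once you have labelled texts, about the smallest possible "own model" is a multinomial Naive Bayes classifier with add-one smoothing. A self-contained sketch on a toy training set (the post does not prescribe this particular model):

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(text, word_counts, label_counts, vocab):
    """Pick the label with the highest smoothed log-probability."""
    def logp(label):
        total = sum(word_counts[label].values())
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for w in text.lower().split():
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        return score
    return max(label_counts, key=logp)

model = train_nb([
    ("great movie loved it", "pos"),
    ("what a great experience", "pos"),
    ("terrible film hated it", "neg"),
    ("an awful terrible mess", "neg"),
])
```

Even this baseline already makes the core trade-off visible: the model is only as good as the labelled data it sees.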

by Yves Peirsman

23 November 2016

Off-the-shelf methods for sentiment analysis

Sentiment analysis is one of the most popular applications of Natural Language Processing. Many companies run software that automatically classifies a text as positive, negative or neutral to monitor how their products are received online. Other people use sentiment analysis to conduct political analyses: they track the average sentiment in tweets that mention the US presidential candidates, or show that Donald Trump’s tweets are much more negative than those posted by his staff. There are even companies that rely on...
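The most basic off-the-shelf approach needs no training at all: look each word up in a sentiment lexicon and add up the scores. A crude sketch with a tiny hand-made lexicon and naive negation handling:

```python
# Tiny hand-made lexicon; real lexicon-based tools use thousands of entries.
LEXICON = {"great": 1, "good": 1, "love": 1,
           "terrible": -1, "bad": -1, "awful": -1}
NEGATORS = {"not", "never", "no"}

def sentiment(text):
    """Classify text as positive / negative / neutral by summing lexicon
    scores, flipping the score of a word right after a negator."""
    score, negate = 0, False
    for word in text.lower().split():
        if word in NEGATORS:
            negate = True
            continue
        if word in LEXICON:
            score += -LEXICON[word] if negate else LEXICON[word]
        negate = False
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

Baselines like this are easy to run at scale, which helps explain the wide accuracy gaps between off-the-shelf solutions.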

by Yves Peirsman

21 September 2016

Text Classification Made Simple

When you need to tackle an NLP task — say, text classification or sentiment analysis — the sheer number of available software options can be overwhelming. Task-specific packages, generic libraries and cloud APIs all claim to offer the best solution to your problem, and it can be hard to decide which one to use. In this blog post we’ll take a look at some of the available options for a text classification task, and discover their main advantages and disadvantages....

by Yves Peirsman

27 June 2016

NLP in the Cloud: Measuring the Quality of NLP APIs

Natural Language Processing seems to have become somewhat of a commodity in recent years. More than a few companies have sprung up that offer basic NLP capabilities through a cloud API. If you’d like to know whether a text carries a positive or negative message, or what people or companies it mentions, you can just send it to one of these black boxes, and receive the answer in less than a second. Superficially, all these NLP APIs look more or...
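Measuring such a black box boils down to running a labelled test set through it and comparing the returned labels against the gold standard. The metric itself is a one-liner:

```python
def accuracy(gold, predicted):
    """Fraction of test items where an API's label matches the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)

# Hypothetical outputs from one cloud API on a four-item test set.
accuracy(["pos", "neg", "pos", "neu"],
         ["pos", "neg", "neg", "neu"])  # → 0.75
```

Running the same test set through several APIs turns marketing claims into numbers you can actually compare.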

by Yves Peirsman

29 May 2016

NLP People: the 2016 NLP job market analysis

When the University of Leuven asked me to give a guest lecture in their Master of Artificial Intelligence earlier this year, one thing I set out to do was to give students an idea of the opportunities in the NLP job market. I contacted Alex and Maxim from NLP People, and they were kind enough to give me access to their database of job ads. My analysis of their data brought to light some interesting patterns, and was posted on...

by Yves Peirsman

9 August 2015

Generating Genre Fiction with Deep Learning

These days Deep Learning is everywhere. Neural networks are used for just about every task in Natural Language Processing — from named entity recognition to sentiment analysis and machine translation. A few months ago, Andrej Karpathy, PhD student at Stanford University, released a small software package for automatically generating texts with a recurrent neural network. I wanted to find out how it performs when it is asked to generate genre fiction, such as fantasy or chick lit. Andrej’s model is...

by Yves Peirsman

© by NLP Town