GPT-3, a Giant Step for Deep Learning and NLP

A few days ago, OpenAI announced a new successor to their Language Model (LM) - GPT-3. It is the largest model trained so far, with 175 billion parameters. While training such a large model has its merits, reading through the 72-page paper can be tiresome. In this blog post I’ll highlight the parts that I find interesting for people familiar with...

Wed Jun 3, 2020 13:46
The accessibility of GPT-2 - text generation and fine-tuning

Natural Language Generation (NLG) is a well-studied subject in the NLP community. With the rise of deep learning methods, NLG has gotten better and better. Recently, OpenAI pushed the limits with the release of GPT-2 - a Transformer-based model that predicts the next token at each time step. Nowadays it’s quite easy to use these models -... (a minimal generation sketch follows below)

Thu Nov 28, 2019 15:16
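
Generating text with a pre-trained GPT-2 really does take only a few lines these days. Here is a minimal sketch using the Hugging Face transformers library and the small "gpt2" checkpoint - an assumption for illustration, not necessarily the tooling the post relies on:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load a pre-trained GPT-2 checkpoint and its tokenizer (downloads on first use).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Encode a prompt and sample a continuation, one token at each time step.
input_ids = tokenizer.encode("Natural language generation has become", return_tensors="pt")
output_ids = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))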
Mixture of Variational Autoencoders - a Fusion Between MoE and VAE

The Variational Autoencoder (VAE) is a paragon of neural networks that try to learn the shape of the input space. Once trained, the model can be used to generate new samples from that space. If we have labels for our input data, it’s also possible to condition the generation process on the label. In the MNIST case, that means we can specify... (a minimal conditional-generation sketch follows below)

Tue Apr 2, 2019 09:19
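
Conditioning the generation on a label typically amounts to feeding the label to the decoder alongside the sampled latent code. A minimal PyTorch sketch - the layer sizes and names are hypothetical, and the decoder is assumed to have already been trained as part of a conditional VAE:

import torch
import torch.nn as nn

LATENT_DIM, NUM_CLASSES = 16, 10

# Hypothetical decoder mapping (latent code + one-hot label) to a flattened 28x28 image.
decoder = nn.Sequential(
    nn.Linear(LATENT_DIM + NUM_CLASSES, 256),
    nn.ReLU(),
    nn.Linear(256, 28 * 28),
    nn.Sigmoid(),
)

def generate(digit, n_samples=5):
    # Sample latent codes from the prior and attach the desired digit label.
    z = torch.randn(n_samples, LATENT_DIM)
    label = torch.zeros(n_samples, NUM_CLASSES)
    label[:, digit] = 1.0
    return decoder(torch.cat([z, label], dim=1)).view(n_samples, 28, 28)

samples = generate(digit=7)  # images that should look like a handwritten 7 once the decoder is trained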
TensorFlow — The Scope of Software Engineering

So you’ve finished training your model, and it’s time to get some insights into what it has learned. You decide which tensor is interesting, and go look for it in your code - to find out what its name is. Then it hits you: you forgot to give it a name. You also forgot to wrap the logical code block in a named scope. It means you’ll have... (a minimal naming-and-scoping sketch follows below)

Tue Feb 5, 2019 23:06
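
The remedy is cheap: name the tensors you care about and wrap each logical block in a name scope, so they can be fetched later by name. A minimal TF1-style sketch, written against the tf.compat.v1 API; all of the names here are made up for illustration:

import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

features = tf.placeholder(tf.float32, shape=[None, 64], name='features')

# Wrap the logical block in a named scope and name the interesting tensor.
with tf.name_scope('projection'):
    weights = tf.Variable(tf.random.normal([64, 10]), name='weights')
    logits = tf.matmul(features, weights, name='logits')

# Later, fetch the tensor by its full name instead of hunting for the
# Python variable that holds it.
graph = tf.get_default_graph()
same_logits = graph.get_tensor_by_name('projection/logits:0')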
Preparing for the Unexpected

Some of the problems we tackle using machine learning involve categorical features that represent real-world objects, such as words, items and categories. So what happens when, at inference time, we get new object values that have never been seen before? How can we prepare in advance so we can still make sense of the input? Unseen values,... (a minimal out-of-vocabulary handling sketch follows below)

Mon Jan 28, 2019 01:07
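
One common defensive pattern - not necessarily the one the post settles on - is to reserve a dedicated "unknown" bucket when encoding categorical values, so unseen values at inference time still map somewhere sensible. A minimal sketch with made-up category names:

# Build a vocabulary from training-time values, reserving index 0 for anything unseen.
train_values = ['shoes', 'books', 'electronics', 'books']
vocab = {value: idx for idx, value in enumerate(sorted(set(train_values)), start=1)}
UNK_INDEX = 0

def encode(value):
    # Unseen values fall back to the shared unknown bucket instead of breaking the model.
    return vocab.get(value, UNK_INDEX)

print(encode('books'))      # known category -> its index
print(encode('furniture'))  # never seen during training -> 0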
Think your Data Different

In the last couple of years deep learning (DL) has become a major enabler for applications in many domains, such as vision, NLP, audio and clickstream data. Recently, researchers have started to successfully apply deep learning methods to graph datasets in domains like social networks, recommender systems and biology, where data is inherently structured... (a minimal graph-convolution sketch follows below)

Mon Jan 21, 2019 23:02
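
For concreteness, one popular building block in this space is the graph convolution of Kipf and Welling, where each node mixes its features with its neighbours' before a learned linear map - shown here as a NumPy sketch on a made-up three-node graph, not necessarily the method the post covers:

import numpy as np

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)   # adjacency matrix of a tiny path graph
X = np.random.randn(3, 4)                # node features
W = np.random.randn(4, 2)                # layer weights (random stand-ins)

A_hat = A + np.eye(3)                                    # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))   # normalise by node degree
H = np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W)  # ReLU(normalised A · X · W)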
