Guus Bosman

software engineering director



internet

Generating Dutch language using Nescio's works and a GPT

I'm mostly done with the Natural Language Processing specialization and tonight I was playing around with a fun proof of concept written by Andrej Karpathy.

The proof of concept is a tiny transformer architecture, "GPT mini". Unlike its big siblings GPT-3 and GPT-4, which operate on sub-word tokens, this tiny model is character-based. The text I used consists of three short stories by the author Nescio, around 200k characters in total, which in fact constitute his complete works. I trained it for 20,000 iterations.
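To make "character-based" concrete, here is a minimal sketch of how such a model might turn text into integer ids and back. This is not Karpathy's code. The sample sentence and the names `stoi`, `itos`, `encode` and `decode` are assumptions chosen for illustration.

```python
# Hypothetical sample text; a real run would load the full training corpus.
text = "Jongens waren we, maar aardige jongens."

# The vocabulary is simply every distinct character in the text, sorted.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> character


def encode(s):
    """Map a string to a list of integer ids, one per character."""
    return [stoi[ch] for ch in s]


def decode(ids):
    """Map a list of integer ids back to a string."""
    return "".join(itos[i] for i in ids)


ids = encode("jongens")
print(len(chars), ids)
print(decode(ids))
```

With this scheme the vocabulary stays tiny (a few dozen symbols for a Dutch text), at the cost of much longer sequences than a sub-word tokenizer would produce.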

internet

Learning about Attention

I've completed the first three courses of the Natural Language Processing specialization at Coursera, and started the fourth one today. I really enjoyed the courses so far, but this fourth course is the best part: learning about Attention.

The basic concept was introduced in 2014; three years later the famous "Attention Is All You Need" paper came out.
