I'm mostly done with the Natural Language Processing specialization, and tonight I was playing around with a fun proof of concept written by Andrej Karpathy.
The proof of concept is a tiny transformer architecture, "GPT mini". Unlike its big GPT-3 and GPT-4 brothers, this tiny model is character-based, not word-based. The text I used consists of three short stories by the author Nescio -- which, in fact, constitute his complete works: around 200k characters. I trained the model for 20,000 iterations.
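To give a feel for what "character-based" means in practice, here is a minimal sketch of character-level tokenization: the vocabulary is simply the set of distinct characters in the training text, so it has tens of symbols rather than tens of thousands of words. This is an illustration, not Karpathy's exact code, and the filename is hypothetical.

```python
# Minimal sketch of character-level tokenization (illustrative only).
# "nescio.txt" is a hypothetical file holding the ~200k-character corpus.
text = open("nescio.txt", encoding="utf-8").read()

# The vocabulary is every distinct character that occurs in the text:
# letters, digits, punctuation, whitespace.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> char

def encode(s):
    """Turn a string into a list of integer token ids."""
    return [stoi[c] for c in s]

def decode(ids):
    """Turn a list of integer token ids back into a string."""
    return "".join(itos[i] for i in ids)

print(len(text))   # corpus size in characters (~200k here)
print(len(chars))  # vocabulary size: typically well under 100 symbols

# Round-tripping a snippet of the corpus should be lossless.
assert decode(encode(text[:50])) == text[:50]
```

The model then predicts the next character id given the previous ones, so it has to learn spelling, punctuation, and word boundaries from scratch rather than getting them for free from a word- or subword-level tokenizer.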