OpenAI organised a challenge to solve coding problems with the aid of an AI assistant. This is a review of the challenge and my first impressions of working with an AI pair-programmer.

OpenAI Codex

OpenAI is an AI research and development company. You might have heard some buzz about one of its products: GPT-3. GPT-3 is a language model that can generate human-like text. It can be used for chatting, text auto-completion, text summarisation, grammar correction, translation, etc.

GPT-3 demo.gif

Check out the OpenAI API to access the playground.

Codex is a descendant of GPT-3, trained on natural language data and publicly available source code (e.g. from public GitHub repos). Codex translates a natural language prompt into code. It is the very model that powers GitHub Copilot, an AI pair-programmer (check out the site for demos, it is fascinating).


Credits: OpenAI

OpenAI recently released an API to access Codex (in beta). The demos that accompanied the release caused quite a stir. Codex is proficient in a dozen programming languages. It can be used for code generation, refactoring, autocompletion, transpilation (translating source code between languages), code explanation, etc. To show off Codex, OpenAI organised a challenge.

The Challenge

The challenge was to solve a series of five programming puzzles in Python. The only twist: you could use Codex as a pair-programmer. Entries were ranked by completion time, with a cap on the total time allowed. Not surprisingly, Codex itself was a participant (not just a helper).


The problems were simple. ~830 "people" (Codex included) were able to solve all five of them. I had to solve the first two challenges manually because of OpenAI server issues. "Had to" because it was a race against time (and the top 500 won an OpenAI t-shirt). For the other three, however, I was able to call in the cavalry (it was pretty climactic).

The novel experience of watching an AI auto-generate code is amazing. Just type a docstring describing the procedure and watch the code develop. If you're an old-time programmer, you'll know the feeling once you experience it.


Here is one problem statement for which I used Codex to generate a solution.

PROBLEM Parse the given Python source code and return the list of fully-qualified paths for all imported symbols, sorted in ascending lexicographic order.

CONSTRAINTS The input will not contain any wildcard imports (from ... import *). Ignore aliases (renamings): from x import y as z should be represented as x.y.

LIBRARY SUGGESTION Consider using the ast module.

EXAMPLES

Input

    import os
    import concurrent.futures
    from os import path as renamed_path
    from typing import (
        List, Tuple
    )

Output ['concurrent.futures', 'os', 'os.path', 'typing.List', 'typing.Tuple']
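To see why the ast module is a good fit here, the following is a minimal sketch (my own illustration, not part of the challenge material) that lists what the parser exposes for each import in the example input. The helper name `describe_imports` is hypothetical. The point to notice is that every imported name arrives as an alias node carrying both `name` and `asname`, so ignoring renamings just means never reading `asname`.

```python
import ast

# The example input from the problem statement.
EXAMPLE = """\
import os
import concurrent.futures
from os import path as renamed_path
from typing import (
    List, Tuple
)
"""

def describe_imports(code):
    """Return (node type, module, name, alias) for every imported name."""
    rows = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            # Plain 'import x' statements carry no module attribute.
            for alias in node.names:
                rows.append(("Import", None, alias.name, alias.asname))
        elif isinstance(node, ast.ImportFrom):
            # 'from x import y' keeps 'x' in node.module and 'y' in alias.name.
            for alias in node.names:
                rows.append(("ImportFrom", node.module, alias.name, alias.asname))
    return rows

for row in describe_imports(EXAMPLE):
    print(row)
```

The third row comes out as `('ImportFrom', 'os', 'path', 'renamed_path')`: the fully-qualified path `os.path` is just `module + '.' + name`, and the alias sits in a separate field that can be left alone.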

Codex it!

I only wrote the docstring. From the docstring, the imported libraries and the function signature, Codex generated (almost) functional code:

Codex Prob-4 demo.gif

Pretty impressive. After just one or two manual bug fixes, the code passed all test cases! Final script:

import ast
from typing import List

def parse_imports(code: str) -> List[str]:
    """
    Parse all the imports in the code using ast module.
    Imports of the form 'from x import y' should be appended as 'x.y'.
    Ignore any alias. Append each import type to a list
    and return the sorted list.
    """
    symbols = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            # 'import a.b as c' -> 'a.b' (name.name never includes the alias)
            for name in node.names:
                symbols.append(name.name)
        elif isinstance(node, ast.ImportFrom):
            # 'from x import y as z' -> 'x.y'
            for name in node.names:
                symbols.append(node.module + '.' + name.name)
    print(code, symbols)  # debug output
    return sorted(symbols)

# Examples
print(parse_imports('import os'))
print(parse_imports('import os\nfrom typing import List'))
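As a sanity check, here is a self-contained sketch that runs the same idea against the full example from the problem statement. It is my own compact variant, not the code Codex produced: the name `imported_symbols` is hypothetical, and skipping relative imports (where `node.module` is `None`) is an assumption on my part, since the challenge does not mention them.

```python
import ast

def imported_symbols(code):
    """Sorted fully-qualified names of everything `code` imports, aliases ignored."""
    symbols = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            symbols.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for 'from . import x'. The challenge does not
            # cover that case, so it is skipped here (an assumption).
            if node.module is not None:
                symbols.extend(f"{node.module}.{alias.name}" for alias in node.names)
    return sorted(symbols)

# Input and expected output taken from the problem statement.
CHALLENGE_INPUT = """\
import os
import concurrent.futures
from os import path as renamed_path
from typing import (
    List, Tuple
)
"""
EXPECTED = ['concurrent.futures', 'os', 'os.path', 'typing.List', 'typing.Tuple']

result = imported_symbols(CHALLENGE_INPUT)
assert result == EXPECTED
print(result)
```

Because `ast.walk` visits every node in the tree, imports nested inside functions or `if` blocks are picked up too, which is probably what a grader for this problem would expect.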