Journalist/developer. Storytelling developer @ USA Today Network. Builder of @HomicideWatch. Sinophile for fun. Past: @frontlinepbs @WBUR, @NPR, @NewsHour.

Your job is to deliver code you have proven to work


In all of the debates about the value of AI-assistance in software development there's one depressing anecdote that I keep on seeing: the junior engineer, empowered by some class of LLM tool, who deposits giant, untested PRs on their coworkers - or open source maintainers - and expects the "code review" process to handle the rest.

This is rude, a waste of other people's time, and is honestly a dereliction of duty as a software developer.

Your job is to deliver code you have proven to work.

As software engineers we don't just crank out code - in fact these days you could argue that's what the LLMs are for. We need to deliver code that works - and we need to include proof that it works as well. Not doing that directly shifts the burden of the actual work to whoever is expected to review our code.

How to prove it works

There are two steps to proving a piece of code works. Neither is optional.

The first is manual testing. If you haven't seen the code do the right thing yourself, that code doesn't work. If it does turn out to work, that's honestly just pure chance.

Manual testing skills are genuine skills that you need to develop. You need to be able to get the system into an initial state that demonstrates your change, then exercise the change, then check and demonstrate that it has the desired effect.

If possible I like to reduce these steps to a sequence of terminal commands which I can paste, along with their output, into a comment in the code review. Here's a recent example.

Some changes are harder to demonstrate. It's still your job to demonstrate them! Record a screen capture video and add that to the PR. Show your reviewers that the change you made actually works.

Once you've tested the happy path where everything works you can start trying the edge cases. Manual testing is a skill, and finding the things that break is the next level of that skill that helps define a senior engineer.

The second step in proving a change works is automated testing. This is so much easier now that we have LLM tooling, which means there's no excuse at all for skipping this step.

Your contribution should bundle the change with an automated test that proves the change works. That test should fail if you revert the implementation.

The process for writing a test mirrors that of manual testing: get the system into an initial known state, exercise the change, assert that it worked correctly. Integrating a test harness to productively facilitate this is another key skill worth investing in.
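The three steps above can be sketched as a tiny test. This is a minimal illustration, not code from any real project: `applyDiscount` is a hypothetical change under test, and `assertEqual` is a hand-rolled stand-in for whatever assertion helper your test framework provides.

```javascript
// Hypothetical function under test: this is the "change" being delivered.
function applyDiscount(priceCents, percent) {
  if (percent < 0 || percent > 100) {
    throw new RangeError("percent must be between 0 and 100");
  }
  return Math.round(priceCents * (1 - percent / 100));
}

// Tiny stand-in for a test framework's assertion helper.
function assertEqual(actual, expected, label) {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

function testApplyDiscount() {
  // 1. Get the system into a known initial state.
  const priceCents = 2000;

  // 2. Exercise the change.
  const discounted = applyDiscount(priceCents, 25);

  // 3. Assert that it worked correctly.
  assertEqual(discounted, 1500, "25% off 2000 cents");

  // Edge case: invalid input should be rejected, not silently accepted.
  let rejected = false;
  try {
    applyDiscount(priceCents, 150);
  } catch (err) {
    rejected = err instanceof RangeError;
  }
  assertEqual(rejected, true, "rejects a percent above 100");
}

testApplyDiscount();
```

If `applyDiscount` were reverted to return the price unchanged, the first assertion would fail, which is the property a bundled test should have.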

Don't be tempted to skip the manual test because you think the automated test has you covered already! Almost every time I've done this myself I've quickly regretted it.

Make your coding agent prove it first

The most important trend in LLMs in 2025 has been the explosive growth of coding agents - tools like Claude Code and Codex CLI that can actively execute the code they are working on to check that it works and further iterate on any problems.

To master these tools you need to learn how to get them to prove their changes work as well.

This looks exactly the same as the process I described above: they need to be able to manually test their changes as they work, and they need to be able to build automated tests that guarantee the change will continue to work in the future.

Since they're robots, automated tests and manual tests are effectively the same thing.

They do feel a little different though. When I'm working on CLI tools I'll usually teach Claude Code how to run them itself so it can do one-off tests, even though the eventual automated tests will use a system like Click's CliRunner.

When working on CSS changes I'll often encourage my coding agent to take screenshots when it needs to check if the change it made had the desired effect.

The good news about automated tests is that coding agents need very little encouragement to write them. If your project has tests already most agents will extend that test suite without you even telling them to do so. They'll also reuse patterns from existing tests, so keeping your test code well organized and populated with patterns you like is a great way to help your agent build testing code to your taste.

Developing good taste in testing code is another of those skills that differentiates a senior engineer.

The human provides the accountability

A computer can never be held accountable. That's your job as the human in the loop.

Almost anyone can prompt an LLM to generate a thousand-line patch and submit it for code review. That's no longer valuable. What's valuable is contributing code that is proven to work.

Next time you submit a PR, make sure you've included your evidence that it works as it should.

You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options.


A Software Library with No Code


All You Need is Specs?

Today I’m releasing whenwords, a relative time formatting library that contains no code.

whenwords provides five functions that convert between timestamps and human-readable strings, like turning a UNIX timestamp into “3 hours ago”.

There are many libraries that perform similar functions. But none of them are language agnostic.

whenwords supports Ruby, Python, Rust, Elixir, Swift, PHP, and Bash. I’m sure it works in other languages, too. Those are just the languages I’ve tried and tested.

(I even implemented it as Excel formulas. Though that one requires a bit of work to install.)

But like I said: the whenwords library contains no code. Instead, whenwords contains specs and tests, specifically:

  • SPEC.md: A detailed description of how the library should behave and how it should be implemented.
  • tests.yaml: A list of language-agnostic test cases, defined as input/output pairs, that any implementation must pass.
  • INSTALL.md: Instructions for building whenwords, for you, the human.
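To make the spec-plus-tests idea concrete, here is a rough sketch of what one generated implementation and its test loop might look like. Everything in it is an assumption for illustration: the thresholds, the argument names, and the test cases are invented here, not taken from SPEC.md or tests.yaml.

```javascript
// Hypothetical thresholds; the real rules would come from SPEC.md.
const UNITS = [
  ["day", 86400],
  ["hour", 3600],
  ["minute", 60],
];

// Convert a UNIX timestamp into a relative string, measured against a
// reference timestamp (both in seconds).
function timeago(timestamp, reference) {
  // Simplification: future timestamps are clamped to "just now".
  const elapsed = Math.max(0, reference - timestamp);
  for (const [name, size] of UNITS) {
    if (elapsed >= size) {
      const count = Math.floor(elapsed / size);
      return `${count} ${name}${count === 1 ? "" : "s"} ago`;
    }
  }
  return "just now";
}

// Language-agnostic cases expressed as input/output pairs, in the spirit
// of tests.yaml. These particular cases are made up for this sketch.
const cases = [
  { input: { timestamp: 1000, reference: 1030 }, output: "just now" },
  { input: { timestamp: 0, reference: 3 * 3600 }, output: "3 hours ago" },
  { input: { timestamp: 0, reference: 86400 }, output: "1 day ago" },
];

for (const { input, output } of cases) {
  const actual = timeago(input.timestamp, input.reference);
  if (actual !== output) {
    throw new Error(`expected "${output}", got "${actual}"`);
  }
}
```

The point is the shape: because the cases are plain data, the same pairs can drive a generated test file in any language.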

The installation instructions are comically simple, just a prompt to paste into Claude, Codex, Cursor, whatever. It’s short enough to print here in its entirety:

Implement the whenwords library in [LANGUAGE].

1. Read SPEC.md for complete behavior specification
2. Parse tests.yaml and generate a test file
3. Implement all five functions: timeago, duration, parse_duration, 
   human_date, date_range
4. Run tests until all pass
5. Place implementation in [LOCATION]

All tests.yaml test cases must pass. See SPEC.md "Testing" section 
for test generation examples.

Pick your language, pick your location, copy, paste, and go.


Okay. This is silly. But the more I play with it, the more questions and thoughts I have.

Recent advancements in coding agents are stunning. Opus 4.5 coupled with Claude Code isn’t perfect, but its ability to implement tightly specified code is uncanny. Models and their harnesses crossed a threshold in Q4, and everyone I know using Opus 4.5 has felt it. There wasn’t a single language where Claude couldn’t implement whenwords in one shot. These capabilities are raising all sorts of questions, especially: “What does software engineering look like when coding is free?”

I’ve chewed on this question a bit, but this “software library without code” is a tangible thought experiment that helped firm up a few questions and thoughts. Specifically:

Do we still need 3rd party code libraries?

There are many utility libraries that aim to perform similar functions, but exist as language-specific implementations. Do we need them all? Or do we need one, tightly defined set of rules which we implement on demand, according to the specific conventions of a given language and project? For libraries that are simple utilities (as opposed to complex frameworks), I think the answer might be, “Yes.”

Now, whenwords is (purposely) a very simple utility. It’s five functions, doesn’t require many dependencies, and depends on a well-defined standard (Unix time). It’s not an expensive operation, a poor implementation probably won’t be a bottleneck, and the written spec is only ~500 lines.

But there’s no reason we couldn’t get more complex. Well defined standards (like those you’d need to implement a browser) can help you tackle complex bits of software relatively quickly. The question is: when does this model make sense and when doesn’t it?

Today, I see 5 reasons why you’d want libraries with code:

1. When Performance Matters

Let’s run with that browser example. There are well-defined, large specs for how to interpret HTML, JS, and CSS. One could push these further and deliver a spec-only browser.

But performance is going to be an issue. I want to open hundreds of tabs and not spring memory leaks. I want rendering to be quick, optimized to within an inch of what’s possible. I want a large group of users going out and encountering strange websites, buggy javascript, bad imports, and more. I want people finding these issues, fixing them, and memorializing them as code.

2. When Testing is Complicated

But Drew, you say, if we find performance issues in the spec-only browser we can just update the spec. That’s true, but testing updates gets complicated fast.

Let’s say you notice whenwords has a bug in its Elixir implementation. To fix the whenwords spec, you add a line to the SPEC.md file to prevent the Elixir bug. You submit a pull request and I’m able to verify it helps Claude build a working Elixir implementation.

But did the change screw up the other variants? Does whenwords still work for Ruby, Python, Bash, and Excel? Does it work for all of them when building with Claude and Codex? What about Qwen? Do we end up with a CI/CD pipeline that builds and tests our spec against 4 coding agents and 20 languages? Or do we just say, “Screw it,” and tell users they’re responsible for whatever code is produced?

This isn’t a huge deal for a library with the scope of whenwords, but for anything moderately complex, the amount of surface area we’d want to test grows quickly. whenwords has 125 tests. For comparison, SQLite has 51,445 tests. I’m not building on a spec-only implementation of a database.

3. When You Need to Provide Support & Bug Fixes

Chasing down bugs is harder with spec-only libraries because failures are inconsistent.

Let’s imagine a future where we’re shipping enterprise software as a Claude Skill, or some other similar prepared context that lets agents implement our software for our customers, depending on their environment. This is basically our “software library with no code” taken to an extreme. While there may be benefits here, there are also perils.

Replicating bugs is nearly impossible. If the customer gets stuck on an issue with their own generated codebase, how do we have a hope of finding the problem? Do we just iterate on our spec and add plenty of tests, toss it over to them, and ask them to rebuild the whole thing? Probably not. The models remain probabilistic and as our specs grow the likelihood of our implementations being significantly different grows.

4. When Updates Matter

A library I like is LiteLLM, an AI gateway that provides one interface to call many LLMs across multiple platforms. They add new models quickly, push updates to address connection issues with different platforms, and are generally very responsive.

Other foundational libraries (like nginx, Rails, Postgres) push essential security updates. These are dependencies I wish to maintain. Spec-only libraries, on the other hand, likely work best for implement-and-forget utilities and functions, where continual fixes, support, and security aren’t needed or aren’t valued.

5. When Community & Interoperability Matter

Running through all the points above is community. Lots of users mean more bugs are spotted. More contributors mean more bugs are fixed. Comprehensive testing means PRs are accepted faster. A big community increases the odds someone is available to help. Community support means code is kept up-to-date.

When you want these things, you want community. The code we rely on is not just an instantiation of a spec (a tightly defined set of concepts, aims, and requirements), but the product of people and culture that crystallize around a goal. It’s the magic of open source; why it works and why I love it.

For the job whenwords performs, we don’t need to belong to a club. But for foundations, the things we want to build on, the community is essential because it delivers the points above. Sure, there may be instances of spec-only libraries created and maintained by a vibrant community. But I imagine there will continually be a reference implementation that codifies and ties the spec to the ground.


But the above isn’t fully baked. Our models will get better, our agents more capable. And I’m sure the list above is not exhaustive. I’d enjoy hearing your thoughts on this one, so do reach out.



Useful patterns for building HTML tools


I've started using the term HTML tools to refer to HTML applications that I've been building which combine HTML, JavaScript, and CSS in a single file and use them to provide useful functionality. I have built over 150 of these in the past two years, almost all of them written by LLMs. This article presents a collection of useful patterns I've discovered along the way.

First, some examples to show the kind of thing I'm talking about:

  • svg-render renders SVG code to downloadable JPEGs or PNGs
  • pypi-changelog lets you generate (and copy to clipboard) diffs between different PyPI package releases.
  • bluesky-thread provides a nested view of a discussion thread on Bluesky.

These are some of my recent favorites. I have dozens more like this that I use on a regular basis.

You can explore my collection on tools.simonwillison.net - the by month view is useful for browsing the entire collection.

If you want to see the code and prompts, almost all of the examples in this post include a link in their footer to "view source" on GitHub. The GitHub commits usually contain either the prompt itself or a link to the transcript used to create the tool.

The anatomy of an HTML tool

These are the characteristics I have found to be most productive in building tools of this nature:

  1. A single file: inline JavaScript and CSS in a single HTML file means the least hassle in hosting or distributing them, and crucially means you can copy and paste them out of an LLM response.
  2. Avoid React, or anything with a build step. The problem with React is that JSX requires a build step, which makes everything massively less convenient. I prompt "no react" and skip that whole rabbit hole entirely.
  3. Load dependencies from a CDN. The fewer dependencies the better, but if there's a well known library that helps solve a problem I'm happy to load it from CDNjs or jsdelivr or similar.
  4. Keep them small. A few hundred lines means the maintainability of the code doesn't matter too much: any good LLM can read them and understand what they're doing, and rewriting them from scratch with help from an LLM takes just a few minutes.

The end result is a few hundred lines of code that can be cleanly copied and pasted into a GitHub repository.

Prototype with Artifacts or Canvas

The easiest way to build one of these tools is to start in ChatGPT or Claude or Gemini. All three have features where they can write a simple HTML+JavaScript application and show it to you directly.

Claude calls this "Artifacts", ChatGPT and Gemini both call it "Canvas". Claude has the feature enabled by default, ChatGPT and Gemini may require you to toggle it on in their "tools" menus.

Try this prompt in Gemini or ChatGPT:

Build a canvas that lets me paste in JSON and converts it to YAML. No React.

Or this prompt in Claude:

Build an artifact that lets me paste in JSON and converts it to YAML. No React.

I always add "No React" to these prompts, because otherwise they tend to build with React, resulting in a file that is harder to copy and paste out of the LLM and use elsewhere. I find that attempts which use React take longer to display (since they need to run a build step) and are more likely to contain crashing bugs for some reason, especially in ChatGPT.

All three tools have "share" links that provide a URL to the finished application. Examples:

Switch to a coding agent for more complex projects

Coding agents such as Claude Code and Codex CLI have the advantage that they can test the code themselves while they work on it using tools like Playwright. I often upgrade to one of those when I'm working on something more complicated, like my Bluesky thread viewer tool shown above.

I also frequently use asynchronous coding agents like Claude Code for web to make changes to existing tools. I shared a video about that in Building a tool to copy-paste share terminal sessions using Claude Code for web.

Claude Code for web and Codex Cloud run directly against my simonw/tools repo, which means they can publish or upgrade tools via Pull Requests (here are dozens of examples) without me needing to copy and paste anything myself.

Load dependencies from CDNs

Any time I use an additional JavaScript library as part of my tool I like to load it from a CDN.

The three major LLM platforms support specific CDNs as part of their Artifacts or Canvas features, so often if you tell them "Use PDF.js" or similar they'll be able to compose a URL to a CDN that's on their allow-list.

Sometimes you'll need to go and look up the URL on cdnjs or jsDelivr and paste it into the chat.

CDNs like these have been around for long enough that I've grown to trust them, especially for URLs that include the package version.
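Here is a small sketch of pinning a dependency to an exact version on jsDelivr and loading it with a script tag. The URL shape follows jsDelivr's npm pattern; the package name, version, and file path in the usage comment are placeholders, not a real package.

```javascript
// Build a version-pinned jsDelivr URL for a file inside an npm package.
function jsDelivrUrl(packageName, version, file) {
  if (!version) {
    // Refuse unpinned URLs so the tool does not change underneath you.
    throw new Error(`Pin an exact version for ${packageName}`);
  }
  return `https://cdn.jsdelivr.net/npm/${packageName}@${version}/${file}`;
}

// Browser-only helper: inject a <script> tag and wait for it to load.
function loadScript(src) {
  return new Promise((resolve, reject) => {
    const script = document.createElement("script");
    script.src = src;
    script.onload = resolve;
    script.onerror = () => reject(new Error(`Failed to load ${src}`));
    document.head.append(script);
  });
}

// Usage in a page (placeholder package, version, and path):
//   await loadScript(jsDelivrUrl("example-lib", "1.2.3", "dist/example.min.js"));
```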

The alternative to CDNs is to use npm and have a build step for your projects. I find this reduces my productivity at hacking on individual tools and makes it harder to self-host them.

Host them somewhere else

I don't like leaving my HTML tools hosted by the LLM platforms themselves for a couple of reasons. First, LLM platforms tend to run the tools inside a tight sandbox with a lot of restrictions. They're often unable to load data or images from external URLs, and sometimes even features like linking out to other sites are disabled.

The end-user experience often isn't great either. They show warning messages to new users, often take additional time to load and delight in showing promotions for the platform that was used to create the tool.

They're also not as reliable as other forms of static hosting. If ChatGPT or Claude are having an outage I'd like to still be able to access the tools I've created in the past.

Being able to easily self-host is the main reason I like insisting on "no React" and using CDNs for dependencies - the absence of a build step makes hosting tools elsewhere a simple case of copying and pasting them out to some other provider.

My preferred provider here is GitHub Pages because I can paste a block of HTML into a file on github.com and have it hosted on a permanent URL a few seconds later. Most of my tools end up in my simonw/tools repository which is configured to serve static files at tools.simonwillison.net.

Take advantage of copy and paste

One of the most useful input/output mechanisms for HTML tools comes in the form of copy and paste.

I frequently build tools that accept pasted content, transform it in some way and let the user copy it back to their clipboard to paste somewhere else.

Copy and paste on mobile phones is fiddly, so I frequently include "Copy to clipboard" buttons that populate the clipboard with a single touch.
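A "Copy to clipboard" button can be sketched with the asynchronous Clipboard API. The element ids `#copy` and `#output` are hypothetical, and the clipboard object is passed in as a parameter so the helper can be exercised outside a browser.

```javascript
// Write text to the clipboard; resolves to true on success, false on failure
// (for example when the page is not in a secure context).
async function copyText(text, clipboard) {
  try {
    await clipboard.writeText(text);
    return true;
  } catch (err) {
    return false;
  }
}

// Browser wiring, skipped when there is no DOM.
if (typeof document !== "undefined") {
  const button = document.querySelector("#copy"); // hypothetical id
  const output = document.querySelector("#output"); // hypothetical id
  if (button && output) {
    button.addEventListener("click", async () => {
      const ok = await copyText(output.value, navigator.clipboard);
      button.textContent = ok ? "Copied!" : "Copy failed";
    });
  }
}
```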

Most operating system clipboards can carry multiple formats of the same copied data. That's why you can paste content from a word processor in a way that preserves formatting, but if you paste the same thing into a text editor you'll get the content with formatting stripped.

These rich copy operations are available in JavaScript paste events as well, which opens up all sorts of opportunities for HTML tools.

  • hacker-news-thread-export lets you paste in a URL to a Hacker News thread and gives you a copyable condensed version of the entire thread, suitable for pasting into an LLM to get a useful summary.
  • paste-rich-text lets you copy from a page and paste to get the HTML - particularly useful on mobile where view-source isn't available.
  • alt-text-extractor lets you paste in images and then copy out their alt text.

Build debugging tools

The key to building interesting HTML tools is understanding what's possible. Building custom debugging tools is a great way to explore these options.

clipboard-viewer is one of my most useful. You can paste anything into it (text, rich text, images, files) and it will loop through and show you every type of paste data that's available on the clipboard.

[Screenshot: Clipboard Format Viewer. Paste anywhere on the page (Ctrl+V or Cmd+V). It shows text/rtf with a bunch of weird code, text/plain with some pasted HTML diff, and a Clipboard Event Information panel that says Event type: paste, Formats available: text/rtf, text/plain, 0 files reported and 2 clipboard items reported.]

This was key to building many of my other tools, because it showed me the invisible data that I could use to bootstrap other interesting pieces of functionality.
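The core of a viewer like this can be sketched in a few lines: loop over the types exposed on a paste event's `clipboardData` and read each one. This is an illustrative sketch, not the source of clipboard-viewer; the function name and the 80-character preview length are arbitrary choices.

```javascript
// List every format available on a paste event's clipboardData object.
function listClipboardFormats(clipboardData) {
  return Array.from(clipboardData.types, (type) => ({
    type,
    // getData returns an empty string for non-text entries such as "Files".
    preview: clipboardData.getData(type).slice(0, 80),
  }));
}

// Browser wiring, skipped when there is no DOM.
if (typeof document !== "undefined") {
  document.addEventListener("paste", (event) => {
    event.preventDefault();
    console.table(listClipboardFormats(event.clipboardData));
  });
}
```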

More debugging examples:

  • keyboard-debug shows the keys (and KeyCode values) currently being held down.
  • cors-fetch reveals if a URL can be accessed via CORS.
  • exif displays EXIF data for a selected photo.

Persist state in the URL

HTML tools may not have access to server-side databases for storage but it turns out you can store a lot of state directly in the URL.

I like this for tools I may want to bookmark or share with other people.

  • icon-editor is a custom 24x24 icon editor I built to help hack on icons for the GitHub Universe badge. It persists your in-progress icon design in the URL so you can easily bookmark and share it.
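One simple way to do this is to serialize state into the URL fragment with `URLSearchParams`. This is a sketch of the general pattern, not how icon-editor encodes its data; the state keys used in the comments are made up.

```javascript
// Turn a flat object of strings into a fragment-friendly query string.
function encodeState(state) {
  return new URLSearchParams(state).toString();
}

// Parse a location.hash value (with or without the leading "#") back into
// an object. All values come back as strings.
function decodeState(hash) {
  return Object.fromEntries(new URLSearchParams(hash.replace(/^#/, "")));
}

// Browser wiring, skipped when there is no DOM.
if (typeof document !== "undefined") {
  // Restore state on load, e.g. { tool: "icon", size: "24" }.
  const restored = decodeState(window.location.hash);
  console.log("restored state", restored);

  // Call this whenever the state changes; replaceState avoids adding a
  // history entry for every edit.
  const saveState = (next) => {
    history.replaceState(null, "", "#" + encodeState(next));
  };
  saveState(restored);
}
```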

Use localStorage for secrets or larger state

The localStorage browser API lets HTML tools store data persistently on the user's device, without exposing that data to the server.

I use this for larger pieces of state that don't fit comfortably in a URL, or for secrets like API keys which I really don't want anywhere near my server - even static hosts might have server logs that are outside of my influence.

  • word-counter is a simple tool I built to help me write to specific word counts, for things like conference abstract submissions. It uses localStorage to save as you type, so your work isn't lost if you accidentally close the tab.
  • render-markdown uses the same trick - I sometimes use this one to craft blog posts and I don't want to lose them.
  • haiku is one of a number of LLM demos I've built that request an API key from the user (via the prompt() function) and then store that in localStorage. This one uses Claude Haiku to write haikus about what it can see through the user's webcam.
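Both uses can be sketched with a handful of lines. The storage keys and function names below are hypothetical, and the storage object is passed in as a parameter so the helpers work with `localStorage` in a page or a stand-in elsewhere.

```javascript
const DRAFT_KEY = "draft-text"; // hypothetical storage key

// Save-as-you-type: persist the current text on every change.
function saveDraft(storage, text) {
  storage.setItem(DRAFT_KEY, text);
}

// Restore the draft, or an empty string if nothing was saved.
function loadDraft(storage) {
  return storage.getItem(DRAFT_KEY) ?? "";
}

// Secrets: ask the user once, then reuse the stored value. The key never
// leaves the browser unless the page sends it somewhere itself.
function getApiKey(storage, ask, name = "api-key") {
  let key = storage.getItem(name);
  if (!key) {
    key = ask(`Enter your ${name}:`);
    if (key) storage.setItem(name, key);
  }
  return key;
}

// Browser wiring, skipped when there is no DOM.
if (typeof document !== "undefined") {
  const textarea = document.querySelector("textarea");
  if (textarea) {
    textarea.value = loadDraft(localStorage);
    textarea.addEventListener("input", () => saveDraft(localStorage, textarea.value));
  }
  // Usage: const key = getApiKey(localStorage, (msg) => window.prompt(msg));
}
```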

Collect CORS-enabled APIs

CORS stands for Cross-origin resource sharing. It's a relatively low-level detail which controls if JavaScript running on one site is able to fetch data from APIs hosted on other domains.

APIs that provide open CORS headers are a goldmine for HTML tools. It's worth building a collection of these over time.

Here are some I like:

  • iNaturalist for fetching sightings of animals, including URLs to photos
  • PyPI for fetching details of Python packages
  • GitHub because anything in a public repository in GitHub has a CORS-enabled anonymous API for fetching that content from the raw.githubusercontent.com domain, which is behind a caching CDN so you don't need to worry too much about rate limits or feel guilty about adding load to their infrastructure.
  • Bluesky for all sorts of operations
  • Mastodon has generous CORS policies too, as used by applications like phanpy.social

GitHub Gists are a personal favorite here, because they let you build apps that can persist state to a permanent Gist through making a cross-origin API call.

  • species-observation-map uses iNaturalist to show a map of recent sightings of a particular species.
  • zip-wheel-explorer fetches a .whl file for a Python package from PyPI, unzips it (in browser memory) and lets you navigate the files.
  • github-issue-to-markdown fetches issue details and comments from the GitHub API (including expanding any permanent code links) and turns them into copyable Markdown.
  • terminal-to-html can optionally save the user's converted terminal session to a Gist.
  • bluesky-quote-finder displays quotes of a specified Bluesky post, which can then be sorted by likes or by time.
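As a small example of calling one of these APIs, the sketch below asks PyPI's JSON endpoint for the latest version of a package. The function name is made up for this illustration, and `fetchImpl` is an injectable parameter added here so the function can be tested without a network.

```javascript
// Look up the latest released version of a package via PyPI's JSON API.
async function latestPyPIVersion(packageName, fetchImpl = fetch) {
  const url = `https://pypi.org/pypi/${encodeURIComponent(packageName)}/json`;
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`PyPI returned HTTP ${response.status} for ${packageName}`);
  }
  const data = await response.json();
  return data.info.version;
}

// Usage in a page:
//   latestPyPIVersion("some-package").then((version) => console.log(version));
```

If the API did not send permissive CORS headers, the `fetch` call would reject in the browser before any of this code could read the response.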

LLMs can be called directly via CORS

All three of OpenAI, Anthropic and Gemini offer JSON APIs that can be accessed via CORS directly from HTML tools.

Unfortunately you still need an API key, and if you bake that key into your visible HTML anyone can steal it and use it to rack up charges on your account.

I use the localStorage secrets pattern to store API keys for these services. This sucks from a user experience perspective - telling users to go and create an API key and paste it into a tool is a lot of friction - but it does work.

Some examples:

[Screenshots: haiku, openai-audio-output, gemini-bbox]

Don't be afraid of opening files

You don't need to upload a file to a server in order to make use of the <input type="file"> element. JavaScript can access the content of that file directly, which opens up a wealth of opportunities for useful functionality.

Some examples:

  • ocr is the first tool I built for my collection, described in Running OCR against PDFs and images directly in your browser. It uses PDF.js and Tesseract.js to allow users to open a PDF in their browser which it then converts to an image-per-page and runs through OCR.
  • social-media-cropper lets you open (or paste in) an existing image and then crop it to common dimensions needed for different social media platforms - 2:1 for Twitter and LinkedIn, 1.4:1 for Substack etc.
  • ffmpeg-crop lets you open and preview a video file in your browser, drag a crop box within it and then copy out the ffmpeg command needed to produce a cropped copy on your own machine.
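Reading a selected file takes very little code. In this sketch `describeFile` is a made-up helper that works on anything with the `Blob` interface, which includes the `File` objects a file input hands you.

```javascript
// Read a File (or any Blob) entirely in the browser and summarize it.
async function describeFile(file) {
  const text = await file.text();
  return {
    name: file.name ?? "(unnamed)", // plain Blobs have no name
    bytes: file.size,
    lines: text === "" ? 0 : text.split("\n").length,
  };
}

// Browser wiring, skipped when there is no DOM.
if (typeof document !== "undefined") {
  const input = document.querySelector('input[type="file"]');
  if (input) {
    input.addEventListener("change", async () => {
      const [file] = input.files;
      if (file) console.log(await describeFile(file));
    });
  }
}
```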

You can offer downloadable files too

An HTML tool can generate a file for download without needing help from a server.

The JavaScript library ecosystem has a huge range of packages for generating files in all kinds of useful formats.
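For simple formats you may not even need a library. The sketch below builds a CSV string and offers it as a download through a temporary object URL; `toCsv` and `downloadText` are illustrative names, not taken from any of the tools mentioned here.

```javascript
// Serialize rows (arrays of values) to CSV, quoting fields that contain
// commas, double quotes, or newlines.
function toCsv(rows) {
  const escapeField = (value) => {
    const text = String(value);
    return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  return rows.map((row) => row.map(escapeField).join(",")).join("\n");
}

// Browser-only: trigger a download of generated text without a server.
function downloadText(filename, text, mimeType = "text/plain") {
  const url = URL.createObjectURL(new Blob([text], { type: mimeType }));
  const link = document.createElement("a");
  link.href = url;
  link.download = filename; // suggested file name for the save dialog
  document.body.append(link);
  link.click();
  link.remove();
  // Release the object URL once the click has been dispatched.
  setTimeout(() => URL.revokeObjectURL(url), 0);
}

// Usage in a page:
//   downloadText("export.csv", toCsv([["name", "note"], ["a", "x,y"]]), "text/csv");
```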

[Screenshots: svg-render, social-media-cropper, open-sauce-2025]

Pyodide can run Python code in the browser

Pyodide is a distribution of Python that's compiled to WebAssembly and designed to run directly in browsers. It's an engineering marvel and one of the most underrated corners of the Python world.

It also cleanly loads from a CDN, which means there's no reason not to use it in HTML tools!

Even better, the Pyodide project includes micropip - a mechanism that can load extra pure-Python packages from PyPI via CORS.

[Screenshots: pyodide-bar-chart, numpy-pyodide-lab, apsw-query]

WebAssembly opens more possibilities

Pyodide is possible thanks to WebAssembly. WebAssembly means that a vast collection of software originally written in other languages can now be loaded in HTML tools as well.

Squoosh.app was the first example I saw that convinced me of the power of this pattern - it makes several best-in-class image compression libraries available directly in the browser.

I've used WebAssembly for a few of my own tools:

[Screenshots: ocr, sloccount, micropython]

Remix your previous tools

The biggest advantage of having a single public collection of 100+ tools is that it's easy for my LLM assistants to recombine them in interesting ways.

Sometimes I'll copy and paste a previous tool into the context, but when I'm working with a coding agent I can reference them by name - or tell the agent to search for relevant examples before it starts work.

The source code of any working tool doubles as clear documentation of how something can be done, including patterns for using editing libraries. An LLM with one or two existing tools in its context is much more likely to produce working code.

I built pypi-changelog by telling Claude Code:

Look at the pypi package explorer tool

And then, after it had found and read the source code for zip-wheel-explorer:

Build a new tool pypi-changelog.html which uses the PyPI API to get the wheel URLs of all available versions of a package, then it displays them in a list where each pair has a "Show changes" clickable in between them - clicking on that fetches the full contents of the wheels and displays a nicely rendered diff representing the difference between the two, as close to a standard diff format as you can get with JS libraries from CDNs, and when that is displayed there is a "Copy" button which copies that diff to the clipboard

Here's the full transcript.

See Running OCR against PDFs and images directly in your browser for another detailed example of remixing tools to create something new.

Record the prompt and transcript

I like keeping (and publishing) records of everything I do with LLMs, to help me grow my skills at using them over time.

For HTML tools I built by chatting with an LLM platform directly I use the "share" feature for those platforms.

For Claude Code or Codex CLI or other coding agents I copy and paste the full transcript from the terminal into my terminal-to-html tool and share that using a Gist.

In either case I include links to those transcripts in the commit message when I save the finished tool to my repository. You can see those in my tools.simonwillison.net colophon.

Go forth and build

I've had so much fun exploring the capabilities of LLMs in this way over the past year and a half, and building tools in this way has been invaluable in helping me understand both the potential for building tools with HTML and the capabilities of the LLMs that I'm building them with.

If you're interested in starting your own collection I highly recommend it! All you need to get started is a free GitHub repository with GitHub Pages enabled (Settings -> Pages -> Source -> Deploy from a branch -> main) and you can start copying in .html pages generated in whatever manner you like.

Bonus transcript: Here's how I used Claude Code and shot-scraper to add the screenshots to this post.



All the Claude (Code) Things


Of nearly all of the potential uses for Large Language Models, perhaps the best and most defensible is using them to write code. During the recently-completed fall semester, I co-taught a Computer Science class with Bill Pugh that explored some good ways to do that. We used a lot of Claude Code for that class, and it quickly has become my favorite coding model. And then, in the middle of November, the folks at Anthropic released Claude Code on the Web and gave paid users free credits to use it. I got $250 worth, and I had a lot of ideas.

I didn’t use all of the $250 in credits, but I made it to about $25. I had Claude Code revive a few very old (10+ years) projects, make significant progress on some long-stalled codebases and put the finishing touches on complex problems. There were the now-familiar moments when the model got caught in unproductive loops, as well as over-engineered approaches to simpler tasks. But on the whole, a couple of weeks of Claude Code on the web proved very useful. Here’s some of what I had Claude Code do.

Simple Apps

One of the sweet spots for coding models is simple HTML/CSS/JS applications - often deployed on GitHub Pages - based on some data you’ve got in a repository. I have a lot of data sitting around in repositories, so I had Claude Code make some apps, because not everyone wants to look at a CSV file.

  • Maryland Women’s Basketball Data. I’ve been collecting and storing JSON files with game-level information from women’s college basketball for a few years now. But few people want to look at JSON files in a browser, so this was a great candidate for a simple web app. This one took a number of attempts at refinement, and I needed to get specific with Claude Code about some of the things I wanted, especially the details of each game. But I also asked for an assist network analysis without providing any other instructions, and what I got is pretty great. It updates every day via GitHub Actions.
  • Post Ping. I’ve been collecting SMS alerts from the Washington Post since March 2024, and here is the entirety of my instructions: “Create a slightly whimsical webpage that presents the alert data in several ways: a quick look at the alerts from the previous 24 hours, a searchable filter on title, another on target topic. Have a line chart showing the trend of alerts and title the page “Post Ping”. Use JS CSS HTML.” I think we can all agree that the font is perhaps too “whimsical”, but Claude Code did everything I asked for and produced a site I actually look at.
  • Frederick County Fire & Rescue Incidents. Working with a Maryland newspaper, I wrote a scraper to track emergency response incidents and provide them as an RSS feed reporters could subscribe to. That wasn’t really effective, but this simple searchable and browsable database that Claude Code built was more useful.
  • UMPD Logs. A former student, Rachel Marconi, originally built this app for my news apps class, and I had Claude add some visual upgrades and some topline summary data with minimal effort.
  • Hattiesburg (Miss.) Public Notices. For a presentation at a conference on local news hosted at Southern Miss, I had Claude Code build an app that allows users to comb through public notices from the Hattiesburg American, using an LLM to provide a newsworthiness value to each announcement.
  • WV Football Game Viewer. For a 2023 Mountain State Spotlight story on transfers in West Virginia high school football, I scraped game scores to measure blowouts. Claude Code took that data and made a simple website of it.
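
Apps like these usually need the repository data in a form a browser can fetch. As a rough illustration only (none of the projects above are confirmed to work this way), here is a sketch that turns CSV text into the JSON array a static page could load. The function name `csv_to_json` and the column names in the usage note are hypothetical.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    # DictReader uses the first row as keys; every value stays a string,
    # so numeric columns need converting in the page or in a later step.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)
```

Given the text `"date,home,away\n2023-09-01,A,B\n"`, the function returns a one-element array whose object has the keys `date`, `home` and `away`. A scheduled job could write that output to a file that the HTML page then reads with `fetch()`.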

Command-Line Utilities & Scrapers

Building pipelines for data is a big part of my work (and teaching), and mostly that has meant writing single Python scripts to do the work of obtaining and transforming data. This set of Claude Code work led to more robust and predictable CLI tools.

  • Testudo. I picked up the original code for this project, which scrapes Maryland’s Schedule of Classes site, from Ed Summers. It was good code, but it also had some constraints: users could only scrape all of the classes for a given term, producing JSON files and a single CSV file with the combined data. Using Claude Code, I added a bunch of features, including optional scraping for syllabi. Plus tests and better documentation. It is a much better-organized and more reliable piece of software now.
  • python-statement. Years ago, I extracted a Ruby gem that scraped press releases from the congressional news app I worked on at The New York Times. It was super useful but became a pain to maintain, and I haven’t written much Ruby in the past five years. So I had Claude Code port it to Python, where I hope to keep it going with some student help. Translating code from one language to another is very much in an LLM’s wheelhouse, although it was interesting and somewhat disappointing that given hundreds of methods, Claude literally gave up, coming up with lines such as “# … and more methods here”. These coding assistants really are like disinterested interns sometimes, but I would not have done this myself.
  • College Sports Rosters. For a few years I’ve maintained scrapers to extract player information from official collegiate roster sites for women’s basketball and volleyball. I based the latter code on the former, but that involved a lot of work and maintenance was an issue, making expanding to other sports an unappealing prospect. Claude Code changed that, not only producing very serviceable code for men’s and women’s soccer, lacrosse and women’s field hockey, but also creating common utilities shared across repositories. And it cleaned up and reorganized the original women’s basketball scraper code, making it much easier to work with.
  • Foreign Gifts to U.S. Officials. I first built this repo to show off the ability of LLMs to parse weirdly-structured PDFs, and my code was kind of a mess. Claude Code polished it up and reorganized the files, adding a CLI and a way to visualize the data. Improving or extending existing codebases is a pretty solid use case for these coding assistants.
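
To make the jump from a single script to a CLI tool concrete, here is a minimal sketch of the kind of interface that lets a user narrow a scrape and opt in to extras. It is not the actual interface of any project listed above: the command name, the term-code format and every flag are assumptions chosen for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line interface for a hypothetical course scraper."""
    parser = argparse.ArgumentParser(
        prog="scrape-classes",  # hypothetical command name
        description="Scrape course listings for one term.",
    )
    # Positional: which term to scrape (the code format is an assumption).
    parser.add_argument("term", help="term code, e.g. 202508")
    # Repeatable filter so a run does not have to cover every department.
    parser.add_argument(
        "--department",
        action="append",
        help="limit the scrape to this department; may be repeated",
    )
    # Opt-in flag: syllabi are fetched only when asked for.
    parser.add_argument(
        "--syllabi", action="store_true", help="also download syllabi"
    )
    parser.add_argument(
        "--output-format",
        choices=["json", "csv"],
        default="json",
        help="file format to write",
    )
    return parser
```

Calling `build_parser().parse_args(["202508", "--syllabi"])` yields a namespace with `term="202508"` and `syllabi=True`. Keeping the parser in its own function makes the interface easy to cover with tests without running a real scrape.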

Resurrected Apps

I’ll do a follow-up post on this category, because it is the most ambitious use case I could imagine: taking a code base that is large, old and unmaintained and bringing it back to life. I pointed Claude Code at several of these, including official Senate disbursements (one of the white whales of the open government movement), a Django app to download and parse IRS campaign finance filings and even Fumblerooski, the first Django app I ever made. Oh, and a still-private repo containing the codebase of Capitol Words, a former Sunlight Foundation joint that ProPublica briefly took over and tried to resurrect.

Why a separate post on these? Because unlike the other projects mentioned above, these are substantial and complicated codebases. I’ll need some time to dig into them and assess the changes. I’m sure some of them will be incomplete or screwed up in some way - I didn’t give many instructions aside from “update the code so it runs now” and tried to simplify things where I could. The results - as with most things generated by AI - sound impressive, but the devil’s in the details.

The Takeaway

Properly guided by users who know what they are doing, these coding assistants can be super useful. They certainly can make quick work of what my friend and colleague Matt Waite calls the “throwaway news app”, a simple HTML/JS/CSS site that can make data and information accessible. That alone is a huge benefit, if newsrooms are smart about how they do it. The way I employed Claude Code during that mid-November stretch wasn’t ideal; a deadline often means things get left behind or neglected. But since then, I’ve made good use of Claude Code and a few other assistants, especially in writing tests, documentation and helping me draft plans for upgrades. Although these tools are designed for coding, you can throw other tasks at them, which sometimes is an improvement over a chat interface since they can use tools to help them understand and operate on files. I often use Claude Code to help me understand an unfamiliar code base, for example.

The other lesson I’ve learned is that these models write code quickly, and that’s often not a great thing. But you can slow them down, both through direct instruction and via what I call “constitutional” documents like CLAUDE.md that contain basic operational principles: do X this way, wait for confirmation before doing Y, etc. The “magic” of these assistants is that they seem like they can drive the bus. But you’re the one who should be behind the wheel.

How musicals use motifs to tell stories

Listen to this line from the musical Wicked.

This same melody repeats three times during the show, each at a pivotal moment for the main character, Elphaba.

🎧 Click play and see if you can hear how the melody is the same in each clip.

A chart with time on the x-axis depicting the occurrence of the “Unlimited” motif in Wicked.

This sort of thing happens in lots of art forms, from film scores to standup sets. Depending on the medium, you might call it a theme or a callback. In music, the word motif describes a short, distinctive musical idea that recurs in a salient way.

Here are all of the motifs in Wicked.

A chart with time on the x-axis depicting the occurrence of 6 total motifs in Wicked.

Musicals put motifs on display in a unique way.

Music is always telling a story, but here that is quite literal. This is especially true in musicals like Les Misérables or Hamilton where the entire story is told through song, with little to no dialogue. These musicals rely on motifs to create structure and meaning, to help tell the story.

A chart with time on the x-axis depicting the occurrence of 47 total motifs in Les Misérables.

Musicals like these are an excellent ground for observing the power and function of motifs – what exactly are they doing for the stories they are a part of? Let’s break that down, using examples from these sung-through musicals, but with patterns you’ll also spot across film, TV, and beyond.

Common Threads

Composers have been using repetition for forever. Think Beethoven’s Symphony No. 5, where the opening four-note motif repeats and reappears throughout the piece.

In the 19th century, German composers started formalizing the idea of attaching a motif to a person, place, or idea within a story (these are called leitmotifs). Think Peter and the Wolf, where different instruments and melodies represent different animals in the woods. Or the theme in Pixar’s Up, which captures the idea of Carl and Ellie’s shared life together.

So a motif doesn’t just exist, it represents something.

This creates a musical storytelling shortcut: when the audience hears a motif, that something is evoked. The audience can feel this information even if they can’t consciously perceive how it’s being delivered.

Carl and Ellie from Pixar’s Up, waving out at the crowd as they get married

In the first four minutes of Pixar’s Up, a melodic motif carries the emotional weight of the story, all without any dialogue.

This technique has been embraced in many mediums — from opera to video game music to modern musical theater.

Let’s look at some examples of story and emotional information being conveyed through musical motifs.

Representing a character

One of the most straightforward uses of a motif is to represent a character in the story. These motifs can help cue the audience that a character is present, like or someone from the in Avatar the Last Airbender. A change in the motif’s instrumentation or tone can signal a change in that character.

A chart with time on the x-axis depicting the occurrence of 4 character motifs in Les Misérables.

In Hamilton, there are often literal introductions of characters to a consistent melody or rhythm.

A chart with time on the x-axis depicting the occurrence of 5 character motifs in Hamilton.

Representing an idea

More often, motifs are a marker for something more abstract – love, heartbreak, adventure – and not always owned by a specific character. Like the Star Wars theme that embodies the concept of the Force, calling in ideas around destiny, hope, and the struggle between good and evil.

A chart with time on the x-axis depicting the occurrence of 4 idea motifs in Les Misérables.

Creating emotional layers

Why does that scene from Up make everyone cry? It establishes a simple melodic motif that comes to represent Carl & Ellie’s adventure together. But the real emotional weight comes from the fact that we hear it both in moments of joy and in moments of loss and heartbreak, each appearance carrying the previous memories with it. We feel the weight of the past layered onto the present moment, which makes it hit even harder.

The following motifs repeat, but with drastically different emotions across the show.

A chart with time on the x-axis depicting the occurrence of 4 motifs with emotional changes in Les Misérables.
A chart with time on the x-axis depicting the occurrence of 2 motifs with emotional changes in Hamilton.

Weaving everything together

Both Les Misérables and Hamilton have a song at the end of the first act where many of the motifs introduced so far all come together. The audience is reminded of everything we’ve learned and seen so far, and the most important threads of the story collide and are woven together.

A chart with time on the x-axis depicting the occurrence of 6 motifs from “One Day More” in Les Misérables.
A chart with time on the x-axis depicting the occurrence of 12 motifs from “Non-Stop” in Hamilton.

There’s something else hidden within that “Unlimited” motif from Wicked. It’s actually the same notes as “Somewhere Over The Rainbow”, a nod to the musical’s original source material.

A piano diagram showing that the first four notes of the Unlimited theme and Somewhere Over The Rainbow are the same.

From Wicked

From The Wizard of Oz

Across generations, these pieces speak to each other — the threads connect. From the subtle to the more overt, connections like these shape how we feel and what sticks with us.

Most of us don’t consciously notice this force at work in the moment. Luckily, we don’t have to understand it to feel it.

Explore all the motifs we found in Hamilton, Les Misérables, and Wicked.

A chart with time on the x-axis depicting the occurrence of 35 motifs in Hamilton.

🤷‍♂️ how does a

Sung by Aaron Burr, narrating and introducing a new part of the story.

💰 alexander hamilton

The main guy!

🤩 just you wait

Hamilton proving himself is a big theme throughout the musical.

🎩 aaron burr sir

An exchange, usually between Burr and Hamilton.

🤫 talk less

A piece of advice Burr gives Hamilton (which he ignores), that demonstrates the difference between them.

🏀 my shot

One of Hamilton's main motifs.

💭 i imagine death

Hamilton contemplating his mortality and legacy.

🍻 raise a glass

Like the motif in Les Misérables, this theme captures the camaraderie and idealism of young revolutionaries.

📜 angelica

Each Schuyler sister gets their own name motif.

🕯️ eliza

Each Schuyler sister gets their own name motif.

👩‍👧‍👦 schuyler sisters

This one comes back very subtly in Act 2.

😎 summer in the city

“Someone in a rush” changes to “someone under stress” later on.

👀 look around

This motif repeats lyrically in “Non-Stop” and melodically in “Schuyler Defeated.”

🇬🇧 you'll be back

These 3 songs, all sung by King George, are basically identical and musically distinct from the rest of the show.

😍 helpless

Eliza sings this motif as she falls head over heels for Hamilton.

⏳ flashback

“Satisfied” is a flashback of “Helpless,” and this motif helps place us in time as we experience the same events from a different angle.

💞 my sister

Angelica demonstrating her sisterly love for Eliza.

👍🏽 satisfied

This motif was borrowed from a letter from Angelica to Hamilton, where she writes: “You are happy my dear friend to find consolation in ‘words and thoughts.’ I cannot be so easily satisfied.”

💀 doesn't discriminate

Burr contemplating two forces that are beyond his control, love and death.

🛑 wait for it

Burr's main motif, representing his cautious, measured approach to life.

9️⃣ counting

This count is usually used in the context of a duel, but also in “Take a Break” for Philip's piano lesson (some sad foreshadowing).

📏 duel rules

These two songs are basically identical.

🔫 aim no higher

The final rule, which recurs for each of the three duels in the show.

🥺 that would be enough

One of Eliza's main motifs, representing her love and devotion to Hamilton and their family.

🥺 that would be enough B

The second section of Eliza’s motif.

👨‍✈️ asking me to lead

Washington's motif representing duty and leadership.

🗺️ history has its eyes

One of the core messages of the show, this motif represents the weight of legacy.

🤍 who tells your story

These are the final words we hear in the show.

⏰ running out of time

This motif is sung by other people about Hamilton and his relentless sense of urgency.

😇 phillip rap

Philip performs this rap for his parents as a 9-year-old, and then again to himself when he's grown up and feeling nervous before a duel.

📝 room where it happens

Burr feels left out, even though his passiveness has put him there.

😏 on your side

Hamilton seems to be Washington’s favorite, and Burr, Jefferson, and Madison jealously sing about it.

⚖️ equal opposite

One of Jefferson’s only motifs!

✌🏽 say goodbye

Washington’s goodbye echoes in Hamilton’s final words.

💔 quiet uptown

Hamilton calls back to the previous song about grieving the loss of his son while everyone around him in “The Election of 1800” is focused on politics.

Notes

This piece focuses on melodic motifs that are sung, leaving out those occurring just in the orchestra (which are plentiful, just harder to capture reliably). I drew the line there because these are the easiest to hear and recognize. To qualify, a motif must recur at least twice across multiple songs. Each chart shows one instance of each motif per song, though many reappear several times within a single song.

To detect the motifs, I listened to these musicals a bunch of times, and noted occurrences by hand, while consulting some outside sources.
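
The motifs here were tallied by ear, not by program, but the inclusion rule itself is easy to state in code. The sketch below is one reading of that rule under two assumptions: "recur across multiple songs" is taken to mean the motif is heard in at least two distinct songs, and repeats within a song collapse to a single instance, as in the charts. The function name and the placeholder data in the usage note are hypothetical.

```python
from collections import defaultdict


def qualifying_motifs(occurrences, min_songs=2):
    """Keep motifs heard in at least `min_songs` distinct songs.

    `occurrences` is an iterable of (motif, song) pairs noted by hand.
    Returns {motif: [songs in first-heard order]}, one entry per song.
    """
    songs_by_motif = defaultdict(list)
    for motif, song in occurrences:
        # Collapse repeats within a song to a single instance.
        if song not in songs_by_motif[motif]:
            songs_by_motif[motif].append(song)
    return {
        motif: songs
        for motif, songs in songs_by_motif.items()
        if len(songs) >= min_songs
    }
```

With placeholder notes such as `[("motif a", "Song 1"), ("motif a", "Song 1"), ("motif a", "Song 2"), ("motif b", "Song 3")]`, only "motif a" qualifies, and it is listed once each for "Song 1" and "Song 2".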

Hear a motif that we missed? Reach out at michelle@pudding.cool.

Researchers Are Hunting America for Hidden Datacenters

The nonprofit research group Epoch AI is tracking the physical imprint of the technology that’s changing the world.

Google Earth image highlighted by Epoch AI.