Journalist/developer. Storytelling developer @ USA Today Network. Builder of @HomicideWatch. Sinophile for fun. Past: @frontlinepbs @WBUR, @NPR, @NewsHour.
2161 stories
·
45 followers

I think "agent" may finally have a widely enough agreed upon definition to be useful jargon now

1 Share

I've noticed something interesting over the past few weeks: I've started using the term "agent" in conversations where I don't feel the need to then define it, roll my eyes or wrap it in scare quotes.

This is a big piece of personal character development for me!

Moving forward, when I talk about agents I'm going to use this:

An LLM agent runs tools in a loop to achieve a goal.

I've been very hesitant to use the term "agent" for meaningful communication over the last couple of years. It felt to me like the ultimate in buzzword bingo - everyone was talking about agents, but if you quizzed them everyone seemed to hold a different mental model of what they actually were.

I even started collecting definitions in my agent-definitions tag, including crowdsourcing 211 definitions on Twitter and attempting to summarize and group them with Gemini (I got 13 groups).

Jargon terms are only useful if you can be confident that the people you are talking to share the same definition! If they don't then communication becomes less effective - you can waste time passionately discussing entirely different concepts.

It turns out this is not a new problem. In 1994's Intelligent Agents: Theory and Practice Michael Wooldridge wrote:

Carl Hewitt recently remarked that the question what is an agent? is embarrassing for the agent-based computing community in just the same way that the question what is intelligence? is embarrassing for the mainstream AI community. The problem is that although the term is widely used, by many people working in closely related areas, it defies attempts to produce a single universally accepted definition.

So long as agents lack a commonly shared definition, using the term reduces rather than increases the clarity of a conversation.

In the AI engineering space I think we may finally have settled on a widely enough accepted definition that we can now have productive conversations about them.

Tools in a loop to achieve a goal

An LLM agent runs tools in a loop to achieve a goal. Let's break that down.

The "tools in a loop" definition has been popular for a while - Anthropic in particular have settled on that one. This is the pattern baked into many LLM APIs as tools or function calls - the LLM is given the ability to request actions to be executed by its harness, and the outcome of those tools is fed back into the model so it can continue to reason through and solve the given problem.

"To achieve a goal" reflects that these are not infinite loops - there is a stopping condition.

I debated whether to specify "... a goal set by a user". I decided that's not a necessary part of this definition: we already have sub-agent patterns where another LLM sets the goal (see Claude Code and Claude Research).

There remains an almost unlimited set of alternative definitions: if you talk to people outside of the technical field of building with LLMs you're still likely to encounter travel agent analogies or employee replacements or excitable use of the word "autonomous". In those contexts it's important to clarify the definition they are using in order to have a productive conversation.

But from now on, if a technical implementer tells me they are building an "agent" I'm going to assume they mean they are wiring up tools to an LLM in order to achieve goals using those tools in a bounded loop.

Some people might insist that agents have a memory. The "tools in a loop" model has a fundamental form of memory baked in: those tool calls are constructed as part of a conversation with the model, and the previous steps in that conversation provide short-term memory that's essential for achieving the current specified goal.

If you want long-term memory the most promising way to implement it is with an extra set of tools!

Agents as human replacements is my least favorite definition

If you talk to non-technical business folk you may encounter a depressingly common alternative definition: agents as replacements for human staff. This often takes the form of "customer support agents", but you'll also see cases where people assume that there should be marketing agents, sales agents, accounting agents and more.

If someone surveys Fortune 500s about their "agent strategy" there's a good chance that's what is being implied. Good luck getting a clear, distinct answer from them to the question "what is an agent?" though!

This category of agent remains science fiction. If your agent strategy is to replace your human staff with some fuzzily defined AI system (most likely a system prompt and a collection of tools under the hood) you're going to end up sorely disappointed.

That's because there's one key feature that remains unique to human staff: accountability. A human can take responsibility for its action and learn from its mistakes. Putting an AI agent on a performance improvement plan makes no sense at all!

Amusingly enough, humans also have agency. They can form their own goals and intentions and act autonomously to achieve them - while taking accountability for those decisions. Despite the name, AI agents can do nothing of the sort.

This legendary 1979 IBM training slide says everything we need to know:

A computer can never be held accountable. Therefore a computer must never make a management decision

OpenAI need to get their story straight

The single biggest source of agent definition confusion I'm aware of is OpenAI themselves.

OpenAI CEO Sam Altman is fond of calling agents "AI systems that can do work for you independently".

Back in July OpenAI launched a product feature called "ChatGPT agent" which is actually a browser automation system - toggle that option on in ChatGPT and it can launch a real web browser and use it to interact with web pages directly.

And in March OpenAI launched an Agents SDK with libraries in Python (openai-agents) and JavaScript (@openai/agents). This one is a much closer fit to the "tools in a loop" idea.

It may be too late for OpenAI to unify their definitions at this point. I'm going to ignore their various other definitions and stick with tools in a loop!

There's already a meme for this

Josh Bickett tweeted this in November 2023:

What is an AI agent?

Meme showing a normal distribution curve with IQ scores from 55 to 145 on x-axis, featuring cartoon characters at different points: a calm face at low end labeled "An LLM in a loop with an objective", a stressed face with glasses and tears in the middle peak with a complex flowchart showing "AGENT Performance Standard" with boxes for Critic, feedback, Learning element, Problem Generator, Sensors, Performance element, Experiments, Effectors, Percepts, Environment, and actions connected by arrows.... and a hooded figure at high end also labeled "An LLM in a loop with an objective".

I guess I've climbed my way from the left side of that curve to the right.

You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options.

Read the whole story
chrisamico
1 hour ago
reply
Boston, MA
Share this story
Delete

Slack is extorting us with a $195k/yr bill increase

1 Share

An open letter, or something

For nearly 11 years, Hack Club - a nonprofit that provides coding education and community to teenagers worldwide - has used Slack as the tool for communication. We weren’t freeloaders. A few years ago, when Slack transitioned us from their free nonprofit plan to a $5,000/year arrangement, we happily paid. It was reasonable, and we valued the service they provided to our community.

However, two days ago, Slack reached out to us and said that if we don’t agree to pay an extra $50k this week and $200k a year, they’ll deactivate our Slack workspace and delete all of our message history.

One could argue that Slack is free to stop providing us the nonprofit offer at any time, but in my opinion, a six month grace period is the bare minimum for a massive hike like this, if not more. Essentially, Salesforce (a $230 billion company) is strong-arming a small nonprofit for teens, by providing less than a week to pony up a pretty massive sum of money, or risk cutting off all our communications. That’s absurd.

The impact

The small amount of notice has also been catastrophic for the programs that we run. Dozens of our staff and volunteers are now scrambling to update systems, rebuild integrations and migrate years of institutional knowledge. The opportunity cost of this forced migration is simply staggering.

image image image image

Anyway, we’re moving to Mattermost. This experience has taught us that owning your data is incredibly important, and if you’re a small business especially, then I’d advise you move away too.


This post was rushed out because, well, this has been a shock! If you’d like any additional details then feel free to send me an email.

Read the whole story
chrisamico
11 hours ago
reply
Boston, MA
Share this story
Delete

Is your mayor using ChatGPT? Here’s how to FOIA around and find out - Poynter

1 Share

Last year, Nate Sanford filed a “silly story” for Spokane’s alt-weekly Inlander about a state senator getting into a Twitter argument with an AI porn spambot. The bot was eventually suspended after Spokane’s mayor reported the account.

But a city employee mentioned to Sanford, now a reporter at KNKX and Cascade PBS, that they’d been testing AI tools at work. That offhand comment sparked Sanford’s curiosity about how local governments were actually using generative artificial intelligence and led to a series of investigations that revealed how chatbots are quietly embedding into the machinery of local government.

Sanford used extensive public records of ChatGPT and Microsoft Copilot logs from city employees to show, among other things, the city of Bellingham’s draft AI policy was written with the help of ChatGPT.

I get excited when I see an intriguing use of FOIA to demystify local government. And leading Poynter’s AI work, you can imagine how I geeked out when I saw Sanford’s investigations.

Here is how they start:

When the Lummi Nation applied for funding to hire a crime victims coordinator last year, Bellingham Mayor Kim Lund sent a letter encouraging the Washington Department of Commerce to award the nation a state grant.

“The Lummi Nation has a strong history of community leadership and a deep commitment to the well-being of its members,” the letter read. “The addition of a Coordinator will enhance the Lummi Nation’s capacity to address violence and support victims in a meaningful and culturally appropriate manner.”

But the mayor didn’t write those words herself. ChatGPT did.

Records show Lund’s assistant fed the Commerce Department’s request for proposals into the artificial intelligence chatbot and asked it to write the letter for her. “Please include some facts about violence in native communities in the United States or Washington state in particular,” she added in her prompt.

The stories highlight the need for AI literacy as the technology embeds deeper into our lives, even in ways we may never see. So, I reached out to Sanford in an email conversation to find out why and how he used the Freedom of Information Act to obtain  ChatGPT logs, and what they say about where we’re heading.

This conversation has been edited for length and clarity.

Alex Mahadevan: So, why did you think to FOIA for chatbot logs? Did you get a tip? Just curious?

Nate Sanford: After the porn spambot story, I did some research online to see if it was possible to use records requests to get more info on local government AI use. I found a post from someone on MuckRock who had tried requesting AI records from their local police department. I was inspired to try something similar to see what would turn up with Washington city government leaders.

I ended up filing records requests seeking chatbot records from almost a dozen cities in Washington. I was mainly just testing the system to see if it was even possible.

Mahadevan: What was the custodian’s reaction? Did it take a lot of back and forth to get what you wanted?

Sanford: It varied by city. Many have required a fair amount of back and forth. It was clear that most jurisdictions had never dealt with this type of request before.

I got a call from one records officer who wanted to know more about what I was looking for and how they could help. They said it was the first time they’d dealt with a request of that type, and they weren’t really sure how to process it. I’ve had similar questions from several records officers.

Mahadevan: Were you surprised they complied? Surprised they even kept the records? 

Sanford: I really wasn’t sure what to expect.

The story I published ended up focusing on two Washington cities: Bellingham and Everett. We ended up focusing on those cities because they were the fastest and most responsive to my records request. They aren’t necessarily outliers in their use of AI.

Bellingham and Everett both deserve a lot of credit for acting in good faith and doing their best to provide a comprehensive response to my (very time-consuming) records request. Some cities haven’t been as cooperative or transparent. I’m aware that this type of request is expansive and a big lift for records officers and respondents. But I also think it’s important for transparency. Citizens have a right to know how their representatives are using these tools.

Mahadevan: Were you surprised by the widespread use of ChatGPT you found?

Sanford: I knew the technology was widespread in the private sector. I expected that it would also be present in government, but I really didn’t expect it to be this widespread. I hadn’t heard any public communication from governments about how or if they’d be using it.

Mahadevan: What are some tips you’d give to another local reporter looking to do the same thing?

Sanford: Request records from CoPilot as well as ChatGPT: When I first tried filing these records requests, I tried asking for chat logs from every AI chatbot city staff have used. Records officers told me that was too vague and expansive. For simplicity, I ended up limiting the requests to ChatGPT, the world’s most popular chatbot.

Requesting ChatGPT logs was fruitful, but going forward, I think requesting Copilot chatlogs will be even more valuable. Microsoft made its chatbot available to government clients earlier this year, and many jurisdictions are now instructing staff to only use Copilot. I’d recommend that reporters look for records from Copilot as well as ChatGPT. (Depending on how much time you have, it could also be worth filing additional requests for records from Claude, Grok, etc.)

Provide detailed instructions: When I first started filing requests, some city employees responded by simply taking screenshots of every ChatGPT conversation they’d had — sometimes on their mobile phone. This was incredibly chaotic and difficult to sort through. It also meant that I couldn’t see the date the messages were sent or the order they were supposed to be in.

To make things easier, I started asking records officers to send city staff instructions for how to export their ChatGPT histories into a zipped folder. The .ZIP file format is ideal because it gives you:

  • An easily readable HTML file of the chats in chronological order.
  • A JSON file of the email the user signed up for ChatGPT with.
  • A JSON file of the chat history that includes timestamps.
  • Copies of any files the user uploaded into ChatGPT, and copies of any images ChatGPT generated in response to their requests.

The datestamps are in Unix time, so you’ll need to use a free online converter to decipher them.

Call records officers: If the request is taking a while, I would absolutely recommend calling records officers to explain what you’re looking for and ask how you can help make their job easier. The cities that have been most responsive to my request so far — Bellingham and Everett — responded by sending an email to literally every single city employee asking them to turn over their ChatGPT history. It took about five months for them to close out the request.

Figure out a good file management system: The volume of records returned in response to my requests was massive. I’d recommend that reporters figure out a file management system that works for them early on so they don’t lose track of documents. I organized things by taking a screenshot of every interesting message I came across and saving those screenshots to a group of desktop folders organized by city/topic. Most of the chat logs came back as HTML files that let you search to find keywords.

Mahadevan: What did your requests look like?

Sanford: Here’s a template. I’d recommend narrowing the scope a bit if you’re looking for something specific and hoping to get a faster response.

Pursuant to the Washington Public Records Act, I am requesting the following records:

Chat histories of all ChatGPT sessions conducted by city employees on city-owned devices or used in job-related functions in the following departments: City Council, Mayor’s Office, Police, City Attorney, Public Works, Information Technology, TKTKTK and TKTKT.

The timeframe for this request is 1/6/2023 to the date this request is processed. The requested documents will be made available to the general public and this request is not being made for commercial purposes. Please make records available in installments as they are ready to release.

If it’s helpful, please share with respondents the following instructions for exporting ChatGPT histories:

  • **Click on your name or profile icon** (bottom-left corner of the ChatGPT interface).
  • Select **”Settings”**.
  • Go to the **”Data Controls”** tab.
  • Click **”Export data”**.
  • A pop-up will appear — click **”Confirm export”**.
  • OpenAI will email you a download link with a `.zip` file containing your chat history in JSON format (and HTML for easy viewing).

Mahadevan: Any interesting chat logs that didn’t make it into the story?

Sanford: There were so many!

I think the original draft I turned in was almost 10,000 words. I’m thankful to my editors for helping me trim it.

There was lots of small, silly stuff. There were also lots of really interesting examples that shed light on how city leaders are thinking about various policy questions. It was illuminating to see which topics popped up most frequently. (Washington has a huge housing crisis, and there were numerous examples of officials asking ChatGPT for advice on how to increase housing affordability.)

A lot of the chats had sensitive personal information that was really interesting, but not necessarily newsworthy enough for us to publish.

There are a few chat logs that we’re holding on to because they raise legal questions and require more reporting before we can publish.

Mahadevan: What kind of reception have you gotten from the community?

Sanford: The reception has been really positive! It’s clear that most people had no idea that their local government leaders were using AI this way. The story prompted newspaper editorials in both Bellingham and Everett calling for city leaders to approach AI with more caution.

Generative AI is such a new technology that there’s no real consensus on what the norms should be. Does it matter that the mayor’s assistant used ChatGPT to write a letter to a congressman? Or that communications staff used it to respond to emails from constituents? We’ve heard from a lot of readers who are upset about that, but we’ve also heard from people who say they don’t care. I think both perspectives are valid. It’s really interesting seeing people grappling with where the line should be.

It’s clear that local governments have been experimenting with this technology for a while, but there hasn’t been much public discussion about it. I’m glad to see that the story has sparked a really robust debate.

I’ve also heard from lots of reporters in newsrooms across the country who are planning to copy the records request in their respective jurisdictions.

Mahadevan: Got any other follow-ups planned?

Sanford: I have several follow-ups planned. I’m continuing to regularly receive new installments from other Washington jurisdictions. There are a few specific chat records we’ve obtained that require more reporting before we can publish.

Mahadevan: Do you personally use generative AI for anything?

Sanford: It isn’t technically generative AI, but I use Otter.ai every day for transcribing interviews. It’s incredibly helpful.

I’ve experimented with ChatGPT for generating headline ideas, but I haven’t been super impressed with any of its suggestions. I’ve found it helpful for a few computer/coding related questions, but I don’t feel comfortable using it for writing.

I think there probably are ways that generative AI can be helpful for newsrooms, but I’m still pretty wary of it. I’m worried about accuracy, public trust and plagiarism.

Read the whole story
chrisamico
1 day ago
reply
Boston, MA
Share this story
Delete

The climate of fear is self-imposed

1 Share
Read the whole story
chrisamico
1 day ago
reply
Boston, MA
Share this story
Delete

Mapterhorn - Terrain for Web Mapping

1 Share

The Protomaps project is the PMTiles format, its tooling, and a 120GB basemap vector cartographic tileset created from OpenStreetMap and other open data sources. PMTiles is not limited to storing vector data - it’s also used for raster data, like scans of historical paper maps.

Mapping apps often don’t just need to show vectors of buildings, boundaries and places. Some apps need elevation data, since interesting places on Earth aren’t flat! The Mapterhorn project fulfills this with a new independent open data product - it’s Protomaps for Terrain.

Project Inspiration

Mapterhorn’s inspiration is the Mapzen Joerd project. Joerd is available as tiles from AWS Open Data, and was originally created for the Tangram map renderer, but works with MapLibre GL as well. It’s built from a collection of digital elevation models (DEMs) processed into a single tileset using batch jobs on Amazon Web Services.

{
 type: 'raster-dem',
 tiles: ["https://s3.amazonaws.com/elevation-tiles-prod/terrarium/{z}/{x}/{y}.png"],
 maxzoom: 13,
 encoding: 'terrarium',
 attribution: "<a href='https://github.com/tilezen/joerd/tree/master'>Joerd</a>"
}

Mapterhorn’s goals are similar to Joerd - create a global, easy-to-use terrain tileset, with an initial focus on European DEMs.

Key project differences

Mapterhorn’s design differs from Joerd and other open data projects in these ways:

  • Focus on interactive web visualization - The end product is sliced into tiles at conventional sizes like 512x512 pixels for direct usage in 2D and 3D web maps. Tiles are stored in the terrarium encoding which MapLibre GL supports.

  • Ease of tileset recreation - Instead of being tied to AWS, Mapterhorn can be reproduced from scratch using a single powerful machine, either a desktop or a rented server. This means the pipeline can be customized with different data if your project requires more detail in certain countries. The full pipeline is open source on GitHub.

  • Ease of deployment - Mapterhorn distributes the end product as static PMTiles archives, which can be directly read from cloud storage to map libraries in browsers. This is used to visualize the Mapterhorn tileset on mapterhorn.com as well as the pmtiles.io viewer.

Using the PMTiles format for distribution means you can use the pmtiles extract CLI on a planet archive. To extract only the area surrounding the Matterhorn, try this:

pmtiles extract \
 --bbox=7.510659,45.897669,7.799642,46.04662 \
 https://download.mapterhorn.com/planet.pmtiles \
 planet.pmtiles

Project Future

Oliver Wipfli, a former official coordinator of the MapLibre project is leading the development of Mapterhorn. The initial phases of the project are supported by an NLnet grant. If you’re a company or organization that needs high resolution terrain for web visualization, start a discussion on GitHub!

Read the whole story
chrisamico
16 days ago
reply
Boston, MA
Share this story
Delete

AJC to shift to digital only publication, phase out printed newspaper

1 Share

J. Scott Trubey is the senior editor over business, climate and environment coverage at The Atlanta Journal-Constitution. He previously served as a business reporter for the AJC covering banking, real estate and economic development. He joined the AJC in 2010.

J. Scott Trubey is the senior editor over business, climate and environment coverage at The Atlanta Journal-Constitution. He previously served as a business reporter for the AJC covering banking, real estate and economic development. He joined the AJC in 2010.

Read the whole story
chrisamico
20 days ago
reply
Boston, MA
Share this story
Delete
Next Page of Stories