Can AI “Design” Something?

At Lone Star, we’ve studied the use of AI in design applications. One of our founders has a ranch (The Barwise) so we decided to see how an AI might compare with human performance in designing a branding iron. The Barwise brand was registered in 1878 and appears on the web in a few places. It seemed like a fair test. This is the actual Barwise Brand, for reference:

There is a belief that mainstream AI can be creative and do design work. In some cases, that may be true. Ask a Large Language Model (LLM) to draft a poem or a short story, and the result can be impressive. We asked ChatGPT to “Write 50 words in the style of Ernest Hemingway comparing a shotgun and a shark.” The response was not awful:

The shotgun and the shark, both deadly in their own way. One with its steel barrels and buckshot, the other with its rows of razor-sharp teeth. Each demands respect, for one can take a life with a blast, while the other can drag you down to the depths.

Beyond generating word strings, can a mainstream AI design something? We know these systems fail at engineering challenges and at many scientific questions. So, we asked both Gemini and ChatGPT to design something easier. We asked both, with the same initial prompt, to “Design a brand for Barwise ranch.”

Both failed badly. This is interesting because LLMs are language models, and cattle brand iconography is also a language. An illiterate 19th-century cowboy could “read” nearly any brand. LLMs in the 21st century cannot. This is worth thinking about when we consider the limitations of mainstream AI and why Lone Star’s Evolved AI™ is sometimes a better choice.

There are several reasons why ChatGPT and Gemini failed.

First, they suffered from context and semantic confusion. Both failed to understand that a ranch “brand” is associated with a branding iron. Gemini thought it was designing a logo; it created something with a longhorn skull, but still not a “brand” for branding animals. ChatGPT provided a written description of a proposed logo, making the same mistake of treating a “brand” as something other than a branding iron.

Even with additional prompts both failed badly. When specifically prompted for a “branding iron design,” ChatGPT simply put a logo on the end of the iron and placed it in a stylized western setting:

Both AIs failed to understand the contextual definition of “brand.” This problem with context is a significant challenge for mainstream AI. Asking about cancer biopsy error rates is a good example of a context challenge. Some AI systems will provide a summary of errors across all kinds of cancers, but error rates depend on the type of cancer and its location. So, the AI is likely to provide information that doesn’t match the prompt.

Second, understanding the “language” of brands requires training examples, and this math is brutal. English grammar has over 1,000 rules (by some counts, over 3,000), though only something like a dozen major ones. For example, all English sentences have a noun and a verb; this is usually grammar rule #1. The internet has over a billion English users, and the websites they visit hold something like 30 trillion words (no one really knows). So, we have an extraordinarily rich training set for LLMs. The ratio of English words to grammar rules is something like 10 billion to 1. There are about 5,000 words in common use, so on average we’d expect to see billions of examples of usage for each of these frequently used words in context.
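The back-of-the-envelope arithmetic above can be sketched in a few lines. All of the inputs are the rough estimates quoted in the text, not measured values:

```python
# Rough estimate of English training data available per grammar rule.
# Every figure here is a ballpark number from the discussion above.
words_on_web = 30e12   # ~30 trillion English words online (no one really knows)
grammar_rules = 3_000  # high-end count of English grammar rules
common_words = 5_000   # words in common everyday use

words_per_rule = words_on_web / grammar_rules          # examples per rule
examples_per_common_word = words_on_web / common_words # usages per common word

print(f"words per grammar rule:   {words_per_rule:,.0f}")          # ~10 billion
print(f"examples per common word: {examples_per_common_word:,.0f}") # ~6 billion
```

Even with generous rule counts, every rule and every common word is backed by billions of examples, which is what makes brute-force statistical learning of English feasible.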

Cattle brand language has only about 10 rules. Rule #1 is to read the brand from left to right; Rule #2 is to read from top to bottom. To train an AI to learn the rules, we need a training set. Can we get one?

Roughly 500 counties in the US use (or used) cattle branding. Only a few had as many as 1,000 registered brands, so we’d expect a listing of all US brands to contain fewer than 500,000 entries, a rule-to-word ratio of about 50,000 to 1. This sounds workable until you note that each brand is unique. There are not enough labeled images to train an LLM to read a brand, much less understand the rules or design one:
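Running the same arithmetic for brand language makes the data gap concrete. Again, the inputs are the article’s rough estimates:

```python
# Same back-of-the-envelope estimate for cattle-brand "language".
counties = 500             # US counties that use (or used) brand registration
brands_per_county = 1_000  # upper bound; only a few counties reach this
brand_rules = 10           # rules of brand grammar

total_brands = counties * brands_per_county   # at most ~500,000 entries
brands_per_rule = total_brands / brand_rules  # ~50,000 to 1

# Contrast with English: ~30 trillion words over ~3,000 grammar rules.
english_words_per_rule = 30e12 / 3_000
shortfall = english_words_per_rule / brands_per_rule

print(f"brand entries per rule: {brands_per_rule:,.0f}")  # 50,000
print(f"English has ~{shortfall:,.0f}x more data per rule")
```

And this comparison is still too kind to brand language: each of those 500,000 entries is unique, so unlike English words, no brand ever repeats across the corpus.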

To make matters worse, most registrations don’t explain a brand’s meaning in English. The registration on the right assumes a reader knows the Preston ranch brand means “Bar I.” In the example below, the registration assumes the reader can distinguish between a “Swinging C” and a “Rocking C”:

All of this explains why Google Images can’t “read” a brand either. We gave several cattle brands to Google Images, and it failed to “read” any of them correctly. Only one of our test cases was even identified as a brand. It was the incredibly famous “Two Lazy Two P,” but even here, Google failed to “read” the brand. Instead, Google hallucinated; in the case of the Barwise brand, it suggested the brand was the Yahoo logo.

So, what does all this mean for AI aided design?

It means we can’t expect an AI designer to rely on brute-force machine learning. If we know animal branding syntax is based on rules, we need to teach the computer those rules.

This is the principle behind Lone Star’s First Order AI Designer (FAID). A corpus of knowledge, such as a body of engineering design rules, builds the framework of the FAID, which can include rules of thumb and if/then alternatives (“I before E except after C”).

FAID is impressive and in use today in challenging applications. We’d love to build one for you.