Discussion about this post

Dominic Bristow

Wonderfully written and nourishing as always — particularly enjoyable somehow when I happen to be doing emails and I can read while the ink is still wet, justifying the distraction as one step closer to the near-mythical inbox zero.

However.

"here in the real world, humans can’t memorize all the modern-day classics, or even one classic, for that matter. We remember themes and ideas from books, not the order of each and every word. This is just fundamentally different from how LLMs operate"

Surely not so! Do you contest that, having read every Murakami back to back over a year, the style of my own writing might fundamentally change? By choice, perhaps, but certainly not by producing the words by rote. I might change my punctuation usage and mood; employ passive voice and increasingly abstract oddities in my turns of phrase. How is this different? The signal that drives these changes comes from assimilating Murakami's semantic maps through his writing, in a very similar fashion to how language models do, actually. I can't replicate them verbatim, as is often the argument in the 'reproduction half' of this IP debate, and our brains are so much more than just the 'semantic assimilation function' that these LLMs represent. No? What am I missing here?

(I'm not drawing a line in the sand with the IP infringement btw, that feels deeply thorny and I can see both sides. Unlike breaking the book bindings and burning the content. That feels deeply revulsive with little to nothing on the other side of the coin.)

Roman's Attic

“If the premise were true, perhaps Alsup’s conclusion would follow—but here in the real world, humans can’t memorize all the modern-day classics, or even one classic, for that matter. We remember themes and ideas from books, not the order of each and every word. This is just fundamentally different from how LLMs operate, and I hope lawyers that represent authors and other creatives hammer this point home in future litigation.”

Herman Goldstine once wrote of the mathematician and scientist John von Neumann, “One of his remarkable abilities was his power of absolute recall. As far as I could tell, von Neumann was able on once reading a book or article to quote it back verbatim; moreover, he could do it years later without hesitation. He could also translate it at no diminution in speed from its original language into English. On one occasion I tested his ability by asking him to tell me how A Tale of Two Cities started. Whereupon, without any pause, he immediately began to recite the first chapter and continued until asked to stop after about ten or fifteen minutes.”

If the reason that LLMs are violating fair use laws is that they remember everything perfectly, does that mean that John von Neumann (or anyone else with an eidetic memory) should not be allowed to read books and use that information to get better at things, like an LLM might?
