Venkatesh Rao (@contraptions)

I've been wondering about the intersection of archival selves and archival internets. The data footprint of the information your identity rests on is probably not huge and could be stored on a chonky personal server, I think. Wikipedia, books you might consider formative, etc. Sachin’s notion of archival time, from which I derived my notion of archival selves, argues that LLMs are archival public internets. But there's no reason we should be limited to that. I myself now have a Claude-organized corner of L-space with ~700 PDFs. I'm considering adding Wikipedia and Gutenberg texts where I can. I think the “basic” internet is about 40 exabytes, of which perhaps 0.5-1 EB is usable training data. And the size of the largest foundation models is about 20-40 TB, I think. You could easily have a low-precision smaller model of, say, 80B parameters, augmented by 3-4 TB of “local” RAG, to make yourself a reasonable offline, bespoke, archival internet. A simulated escapist digital reality of permanent nostalgia, basically. A personal theme park/sandbox version of the internet. And you could invite, say, a small cult to share it.
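To make the "chonky personal server" claim concrete, here is a rough back-of-envelope sketch of the storage involved. The 80B parameter count, the 3-4 TB RAG corpus, and the 40 EB and 0.5-1 EB internet estimates are the figures from the note above. The bit widths (4, 8, 16) are my assumption for what "low-precision" might mean, and the calculation ignores overheads such as the retrieval index and runtime memory.

```python
# Back-of-envelope sizing for a personal archival internet.
# Figures marked "from the note" are the author's rough estimates;
# the bit widths are an assumption, since the note only says "low-precision".

TB = 10**12  # bytes per terabyte (decimal)

PARAMS = 80e9              # from the note: ~80B parameters
RAG_RANGE_TB = (3, 4)      # from the note: 3-4 TB of local RAG data
INTERNET_EB = 40           # from the note: size of the "basic" internet
USABLE_EB = (0.5, 1.0)     # from the note: usable training data


def model_bytes(params: float, bits_per_param: int) -> float:
    """Approximate weight storage: parameters times bits, in bytes."""
    return params * bits_per_param / 8


def footprint_tb(bits_per_param: int, rag_tb: float) -> float:
    """Model weights plus RAG corpus, in terabytes (index overhead ignored)."""
    return (model_bytes(PARAMS, bits_per_param) + rag_tb * TB) / TB


def usable_fraction(usable_eb: float) -> float:
    """Share of the 'basic' internet that counts as usable training data."""
    return usable_eb / INTERNET_EB


if __name__ == "__main__":
    for bits in (4, 8, 16):  # assumed precisions, not from the note
        low = footprint_tb(bits, RAG_RANGE_TB[0])
        high = footprint_tb(bits, RAG_RANGE_TB[1])
        print(f"{bits:>2}-bit weights: {low:.2f} to {high:.2f} TB total")
    lo_frac, hi_frac = (usable_fraction(x) for x in USABLE_EB)
    print(f"usable share of the internet: {lo_frac:.2%} to {hi_frac:.2%}")
```

Under these assumptions the model weights are a small slice of the total (tens to low hundreds of gigabytes), so the footprint is dominated by the RAG corpus and stays within reach of a few consumer drives.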

The interesting thing is that, with distillation and fine-tuning, that personal archival internet need not stay frozen in time. It could evolve to digest new local data from your reality tunnel. So you could kinda fork off your own live digital parallel universe.

contraptions.venkateshr…

Archival Time

In Archival Time, I wrote about the carnivalesque nature of the internet vs. the archival-slice nature of LLMs.

Summer Lightning

Mar 9

3:31 PM
