I have some open questions about AI and data.
Many experts and advisors tell me that an organisation needs to fix its data before trying to undertake any major AI project. On the face of it, this makes sense. Clean, organise, structure and collate your data to make things easy for the new system.
But… I have heard this advice in the early days of every computing or digital advance. I have picked over the graveyards of more data-cleaning and data warehouse projects than I care to remember.
There is a terrible track record of failure, and even that may be understated. Even so-called success stories rely on a lot of heavy lifting by spreadsheets and BI packages to make them work.
So what happens if it turns out that most companies and many big institutions can’t fix their data?
Then I think about the way AI works. LLMs open up access to vast pools of data that have been largely untouched by technology up to now. Text, images, video and audio can now be considered as data to be analysed.
One thing is for sure. These new data sets are not amenable to being “fixed.”
In fact, these data sets are being opened up because AI is a tool that can interpret and analyse huge masses of unstructured and often entirely unrelated data.
So maybe AI is the opportunity to find better information without needing the hard labour of fixing the underlying data?
As I say, these are entirely open questions. I don’t know. I would love to hear what others think.