Cameron R. Wolfe, Ph.D. (@cwolferesearch)

Nov 27, 2024, 7:36 PM

Code generation is (by far) the most prolific application of AI right now, but this isn't because it's the most useful / impactful...

Code generation gets such a heavy focus because (nearly) everyone building AI systems knows how to code. So, it's arguably the "easiest" application to work on: the people building the model(s) are already domain experts! We can easily spot check outputs for correctness and interpret the model's output, which makes the model development process both faster and easier.

Let's consider other domains; e.g., medicine, science, law, etc. AI undoubtedly has a ton of potential to be helpful in these domains. However, those building the model cannot easily tell whether a model is helpful or not. To evaluate an AI system in some niche domain, we have to partner with domain experts who can provide accurate input / feedback to the system.

The need to partner with domain experts during model development is what is holding adoption of AI back in domains beyond code generation. There are many reasons for this:

- There is an external dependence on people beyond those developing the model, which naturally slows down model development.

- Model developers want to follow their own intuition during model development rather than truly listening to feedback from domain experts.

- Domain experts do not trust models that they were not involved in developing, as these models tend to disagree with their expertise / opinions.

- Human evaluations are very noisy / difficult. This can complicate model evaluation and development, especially when model developers cannot easily interpret / understand the results of evaluation.

- Systematic evaluation of LLMs is difficult in general - manually inspecting model outputs is way easier than building a dependable evaluation framework with humans in the loop.
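One way to make the "noisy human evaluations" point concrete is to measure how often two domain experts agree when they rate the same model outputs. The sketch below computes Cohen's kappa (agreement corrected for chance) in plain Python. The function name, the "good" / "bad" label set, and the ratings are all made up for illustration; they are not from any real evaluation.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both raters must label the same non-empty set of items.")

    n = len(labels_a)

    # Fraction of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, based on each rater's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of eight model outputs by two domain experts.
expert_1 = ["good", "good", "bad", "good", "bad", "bad", "good", "bad"]
expert_2 = ["good", "bad", "bad", "good", "bad", "good", "good", "bad"]

kappa = cohens_kappa(expert_1, expert_2)
print(f"Cohen's kappa: {kappa:.2f}")  # prints "Cohen's kappa: 0.50"
```

Here the experts agree on 6 of 8 outputs (75%), but half of that agreement would be expected by chance, so kappa is only 0.5. A low value like this suggests the rating guidelines need work before the scores can be used to compare models.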

Put simply, unlocking higher-ROI applications of AI will require that model developers get much better at collaborating with domain experts. Interestingly, the major roadblocks for AI adoption (at least in my opinion) have more to do with developing soft skills and fostering closer collaboration than with overcoming major technical hurdles.

