Why Is It Hard To Evaluate GenAI Applications?

TL;DR

If you don’t have time to read the whole article, the following four takeaways are a concise version. You can navigate to the corresponding section in the article for details.

Lack of framework: A GenAI application is not a GenAI foundation model, and the two require different evaluation frameworks. The difference between the two tasks is not always clearly understood.

Unstructured data: The unstructured output of a GenAI application makes it harder to evaluate than a traditional ML system.

Foundation model unpredictability: GenAI foundation models usually introduce extra unpredictability into the evaluation process.

Longer and more costly iteration: Evaluating a GenAI application is expensive and time-consuming, because building an evaluation dataset and running tests on the application both require more resources.

Introduction

I have spent the last two and a half years listening to what businesses want from GenAI, building GenAI applications, and delivering value from them. It has been an interesting journey, as I realized that the advent of ChatGPT constituted a paradigm shift for ML/AI practitioners like me. I started to believe that GenAI would change our lives, much as personal computers did in the 90s or the modern search engine did in the 2000s. ...

A daytime scene of Hong Kong in pixel art

Rebuild My Website With GenAI's Assistance

Introduction

When I was in graduate school, I set up a personal blog to showcase my projects and share my thoughts. I planned to keep developing the site, but it took a back seat once I became busy with work. I have had more time lately and decided to pick the project back up. It turned out to be so much fun that I want to share how I rebuilt this website with GenAI tools. If you are only interested in the part related to GenAI, check out the GenAI coding tools section. ...

Random Walk Visualization

A Random Walk

I chose “Random Walk” as the name of my website because it represents both my work and my personal interests. Randomness is prominent in modern statistical theory, machine learning, and AI (Deep Learning). In recent years, the unpredictability of Large Language Models (LLMs) has presented a major challenge for companies and individuals looking to adopt the latest AI in their business and daily life. However, I think randomness brings a lot of opportunities and beauty to the world, too. ...